SECTION-SOURCE-HUB-REST-OPENAPI-SPEC.md
# SectionSource Hub REST/OpenAPI Implementation Spec

## Simple Summary

Phase 1M specifies a future Hub REST/OpenAPI surface for body-free SectionSource metadata.

This phase is planning only. It does not add a Hub REST endpoint, OpenAPI schema, Hub UI,
canister route, search mode, persistence, Scooling runtime behavior, section body, snippet,
summary, PageIndex, OCR, LLM call, provider route, MCP resource, or write-back behavior.

## Technical Summary

The future Hub REST surface must mirror the accepted hosted MCP `get_section_source`
runtime, but through a browser/API route instead of MCP. The route must read exactly one
authorized note, derive body-free `knowtation.section_source/v0` metadata in memory, and
return only the SectionSource v0 allowlist.

The future OpenAPI surface may document that route only after the Hub REST implementation is
added and tested.

## Planning Decision

Phase 1M accepts the Hub REST/OpenAPI implementation specification only.

It does not approve:

- adding `GET /api/v1/section-source`
- adding any other Hub REST SectionSource endpoint
- adding OpenAPI paths, tags, schemas, examples, or components for SectionSource
- adding Hub UI calls or display components
- adding canister routes
- adding search, vectors, indexes, persistence, sidecars, summaries, or memory events
- adding Scooling runtime behavior
- returning note body text
- returning section body text
- returning snippets or source excerpts
- returning full frontmatter
- returning line ranges, byte offsets, or section body lengths
- returning absolute paths, raw canister payloads, provider payloads, or MCP resource URIs
- calling PageIndex, OCR, LLMs, or external providers
- adding provider routing or write-back behavior

## Future REST Endpoint

A later runtime phase may propose:

```text
GET /api/v1/section-source?path=<vault-relative-note-path>
```

The endpoint uses a query parameter instead of a path parameter so vault-relative paths with
slashes do not require greedy route matching.

The route must be read-only and must not accept a request body.

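As a rough illustration of the query-parameter choice, the sketch below builds the proposed request URL on the client side. It is written in Python purely for illustration (the Hub's own implementation language may differ), and the helper name `build_section_source_url` and the `https://hub.example` base URL are hypothetical.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_section_source_url(base_url, note_path):
    """Build the proposed GET URL with the note path as a single query parameter.

    Hypothetical helper: only the route and the `path` parameter come from the spec.
    """
    # urlencode percent-encodes "/" in the value, so nested paths never
    # become extra URL path segments that a router would have to match.
    query = urlencode({"path": note_path})
    return f"{base_url.rstrip('/')}/api/v1/section-source?{query}"


url = build_section_source_url("https://hub.example", "inbox/projects/example.md")
# The server side recovers the original vault-relative path from the query string.
recovered = parse_qs(urlsplit(url).query)["path"][0]
```

Because the slashes travel inside the encoded query value, the route itself stays a fixed string and needs no wildcard segment.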
## REST Auth Requirements

The future endpoint must require the same Hub API authentication as adjacent note-read
routes:

- `Authorization: Bearer <access_token>` is required.
- Missing, expired, malformed, or invalid JWTs return `401`.
- The caller must have read access to the active vault.
- Viewer-equivalent read roles may call the route only after the route is implemented.
- Write, admin, proposal, evaluator, billing, import, export, and operator permissions must
  not grant additional SectionSource fields.
- The request must not accept client-supplied user id, role, canister user id, vault access,
  provider, resource, body, snippet, line range, byte offset, or search options.

## Active Vault Boundary

The future endpoint must use the active Hub vault boundary:

- Hosted gateway: the active vault comes from `X-Vault-Id` or the session default already
  accepted by Hub gateway note-read routes.
- Self-hosted Hub: the active vault comes from the authenticated Hub context and configured
  local vault access.
- The client cannot read from a different vault by placing a vault id in `path`.
- The returned `path` must be the normalized request path, not an upstream absolute path or
  raw canister/storage key.
- Missing and unauthorized responses must not reveal whether the note exists in another
  vault.

## Effective Canister User Boundary

Hosted Hub REST must use the same effective canister user boundary as adjacent hosted note
routes:

- The gateway resolves the effective canister user from hosted context.
- The canister read sends `X-User-Id` with the effective canister user id.
- The actor user id may be forwarded only as existing Hub audit context where adjacent routes
  already do so.
- The client cannot supply or override the effective canister user id.
- SectionSource output must not mix notes across effective canister users.

Self-hosted Hub has no canister user boundary. It must preserve the local authenticated
vault boundary instead.

## Canister Auth And Header Behavior

Hosted Hub REST must use the same canister note-read behavior as adjacent gateway note
routes:

```text
GET {canisterUrl}/api/v1/notes/{encodeURIComponent(normalizedPath)}
```

Headers:

- `Authorization: Bearer <gateway JWT or trusted upstream token>` when adjacent note-read
  proxy behavior requires it
- `X-Vault-Id: <active vault id>`
- `X-User-Id: <effective canister user id>`
- `X-Gateway-Auth: <configured canister auth secret>` when configured
- `Accept: application/json`

The endpoint must not forward SectionSource-specific options, provider options, Scooling
options, search filters, body flags, snippet flags, line ranges, byte offsets, or resource
URIs upstream.

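The upstream request described above could be assembled roughly as follows. This is a Python sketch under stated assumptions, not the gateway's actual code: the function name and parameter names are hypothetical, and `quote(..., safe="!'()*")` is used only to approximate JavaScript's `encodeURIComponent`.

```python
from urllib.parse import quote


def build_canister_note_request(canister_url, normalized_path, vault_id,
                                effective_user_id, bearer_token=None,
                                gateway_auth_secret=None):
    """Return (url, headers) for the single upstream note read.

    Hypothetical helper: only the URL shape and header names come from the spec.
    """
    # encodeURIComponent leaves A-Z a-z 0-9 - _ . ! ~ * ' ( ) unescaped;
    # quote() already keeps letters, digits, and "_.-~", so add the rest via `safe`.
    encoded_path = quote(normalized_path, safe="!'()*")
    url = f"{canister_url.rstrip('/')}/api/v1/notes/{encoded_path}"

    headers = {
        "X-Vault-Id": vault_id,
        "X-User-Id": effective_user_id,
        "Accept": "application/json",
    }
    # Optional headers are sent only when the deployment requires them.
    if bearer_token:
        headers["Authorization"] = f"Bearer {bearer_token}"
    if gateway_auth_secret:
        headers["X-Gateway-Auth"] = gateway_auth_secret
    # No SectionSource, provider, search, body, or snippet options are
    # accepted here, so none can be forwarded upstream.
    return url, headers
```

Keeping the function signature limited to the vault, user, path, and auth values is one way to make the "nothing else is forwarded" rule structural rather than a matter of review.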
## One-Note Read Behavior

The future endpoint must perform one note read per request.

Allowed behavior:

- validate auth
- resolve active vault and effective canister user
- normalize and validate one `path`
- read exactly one note
- derive body-free SectionSource metadata in memory
- return JSON

Blocked behavior:

- listing notes
- scanning the whole vault
- calling bridge search
- calling index, vector, PageIndex, OCR, LLM, provider, summary, memory, import, export, or
  write routes
- creating sidecars, indexes, vectors, summaries, memory events, Scooling records, or
  canister state

## Path Normalization And Unsafe Path Rejection

The future endpoint must reject unsafe paths before any upstream note fetch.

The normalization algorithm must:

- require `path` as a query parameter
- require a string
- trim whitespace
- replace backslashes with `/`
- reject empty paths
- reject POSIX absolute paths
- reject Windows absolute paths
- split on `/`
- remove empty segments caused by duplicate slashes
- reject any `..` segment
- join safe segments with `/`

Unsafe path errors must not echo the raw unsafe path.

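The steps above can be sketched as a single function. This is a minimal Python illustration, not the Hub's implementation: the names `normalize_note_path` and `InvalidPathError` are hypothetical, and treating any leading drive-letter prefix such as `C:` as a Windows absolute path is an assumption.

```python
import re


class InvalidPathError(ValueError):
    """Raised for every rejected path; the message never echoes the input."""

    def __init__(self):
        super().__init__("Invalid path")


# Assumption: any drive-letter prefix ("C:...") counts as Windows-absolute.
_WINDOWS_DRIVE = re.compile(r"^[A-Za-z]:")


def normalize_note_path(raw):
    """Return a safe vault-relative path or raise InvalidPathError."""
    if not isinstance(raw, str):          # require a string
        raise InvalidPathError()
    path = raw.strip().replace("\\", "/")  # trim, then unify separators
    if not path:                          # reject empty paths
        raise InvalidPathError()
    if path.startswith("/"):              # POSIX absolute (also UNC after replacement)
        raise InvalidPathError()
    if _WINDOWS_DRIVE.match(path):        # Windows absolute
        raise InvalidPathError()
    # Split and drop empty segments produced by duplicate slashes.
    segments = [segment for segment in path.split("/") if segment]
    if ".." in segments:                  # reject any parent-directory segment
        raise InvalidPathError()
    return "/".join(segments)
```

Raising one fixed-message exception for every failure keeps the raw input out of both the error envelope and any log line built from the exception.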
## Output Allowlist

The future endpoint may return only body-free `knowtation.section_source/v0` output:

```json
{
  "schema": "knowtation.section_source/v0",
  "path": "inbox/example.md",
  "title": "Example",
  "sections": [
    {
      "section_id": "inbox-example-md:h1-example-0001",
      "heading_id": "h1-example-0001",
      "level": 1,
      "heading_path": ["Example"],
      "heading_text": "Example",
      "child_section_ids": [],
      "body_available": true,
      "body_returned": false,
      "snippet_returned": false
    }
  ],
  "truncated": false
}
```

Allowed top-level fields:

- `schema`
- `path`
- `title`
- `sections`
- `truncated`

Allowed section fields:

- `section_id`
- `heading_id`
- `level`
- `heading_path`
- `heading_text`
- `child_section_ids`
- `body_available`
- `body_returned`
- `snippet_returned`

Required constants:

- `schema` must be exactly `knowtation.section_source/v0`.
- `body_returned` must be `false`.
- `snippet_returned` must be `false`.

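One way to enforce the allowlist is to copy only the permitted keys into a fresh object and then overwrite the required constants. The Python sketch below assumes the derived metadata is available as a plain dictionary; the function name `to_allowlisted_output` is hypothetical.

```python
SCHEMA = "knowtation.section_source/v0"

TOP_LEVEL_FIELDS = ("schema", "path", "title", "sections", "truncated")
SECTION_FIELDS = (
    "section_id", "heading_id", "level", "heading_path", "heading_text",
    "child_section_ids", "body_available", "body_returned", "snippet_returned",
)


def to_allowlisted_output(candidate):
    """Copy only allowlisted fields and force the required constants."""
    output = {key: candidate[key] for key in TOP_LEVEL_FIELDS if key in candidate}
    output["sections"] = [
        {
            **{key: section[key] for key in SECTION_FIELDS if key in section},
            # Required constants are set unconditionally, whatever the input says.
            "body_returned": False,
            "snippet_returned": False,
        }
        for section in candidate.get("sections", [])
    ]
    output["schema"] = SCHEMA
    return output
```

Building the response by inclusion rather than by deleting known-bad keys means a new upstream field (for example a body or line range) is dropped by default.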
## Explicitly Excluded Output

The future endpoint must not output:

- note body text
- section body text
- snippets
- source excerpts
- full frontmatter
- line ranges
- byte offsets
- section body lengths
- absolute filesystem paths
- raw canister paths
- raw canister payloads
- provider payloads
- provider keys
- rendered HTML
- summaries
- vector scores
- search results
- persistence ids
- sidecar paths
- memory events
- MCP resource URIs
- PageIndex output
- OCR text
- media metadata
- Scooling adapter state
- classroom policy state

## Error Sanitization

The future endpoint must return the existing Hub JSON error envelope:

```json
{
  "error": "Invalid path",
  "code": "INVALID_PATH"
}
```

Rules:

- Missing `Authorization` returns `401` without note details.
- Missing `path` returns `400` with a generic invalid path code.
- Non-string or empty `path` returns `400`.
- Unsafe paths return `400` before any upstream fetch.
- Missing notes return `404` without note body, frontmatter, headings, or raw upstream body.
- Unauthorized notes return `403` without revealing whether another vault or user can read
  the note.
- Upstream failures return a bounded upstream status class and no raw upstream payload.
- Runtime failures return a generic error and no private content.

Errors must not contain:

- note body text
- section body text
- snippets
- full frontmatter
- heading paths beyond an authorized successful response
- absolute paths
- requested unsafe paths
- raw canister payloads
- auth headers
- bearer tokens
- gateway secrets
- provider payloads
- MCP resource URIs

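A fixed lookup table is one simple way to satisfy these rules, since the response is then built from an outcome class alone and never from an exception message or upstream payload. In the Python sketch below, only the `INVALID_PATH` envelope and the `401`/`400`/`404`/`403` statuses come from the spec; the other codes and messages, the outcome names, and the choice of `502` for upstream failures are assumptions made for illustration.

```python
# Outcome class -> (HTTP status, message, code).
# Only "Invalid path" / "INVALID_PATH" appears in the spec; the rest are hypothetical.
_ERROR_TABLE = {
    "unauthenticated": (401, "Unauthorized", "UNAUTHORIZED"),
    "invalid_path": (400, "Invalid path", "INVALID_PATH"),
    "not_found": (404, "Not found", "NOT_FOUND"),
    "forbidden": (403, "Forbidden", "FORBIDDEN"),
    "upstream": (502, "Upstream error", "UPSTREAM_ERROR"),  # assumed status
    "runtime": (500, "Internal error", "INTERNAL_ERROR"),
}


def error_response(outcome):
    """Return (status, envelope) for an outcome class.

    The function accepts no exception, path, or payload argument, so it has
    nothing private that it could echo back to the client.
    """
    status, message, code = _ERROR_TABLE.get(outcome, _ERROR_TABLE["runtime"])
    return status, {"error": message, "code": code}
```

Unknown outcome classes fall back to the generic runtime error, so an unanticipated failure mode still produces a sanitized envelope.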
## Logging Exclusions

The future implementation must not log:

- note body text
- section body text
- snippets
- full frontmatter
- heading text
- heading paths
- raw canister payloads
- requested unsafe paths
- absolute paths
- bearer tokens
- gateway secrets
- canister auth secrets
- provider payloads
- MCP resource URIs

Bounded operational logs may include only:

- route name
- sanitized outcome class
- sanitized upstream status class
- elapsed time
- section count
- truncated flag

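The bounded log record could be built by a helper that accepts only the six permitted values, as in the Python sketch below. The field names (`route`, `outcome`, `upstream_status_class`, `elapsed_ms`, `section_count`, `truncated`) and the `"4xx"`-style status classes are hypothetical; the spec lists the permitted categories but not their names or formats.

```python
def status_class(status):
    """Collapse an HTTP status into a coarse class such as "4xx"."""
    if isinstance(status, int) and 100 <= status <= 599:
        return f"{status // 100}xx"
    return "none"  # no upstream call was made, or the status was unusable


def build_log_record(route, outcome, upstream_status, elapsed_ms,
                     section_count, truncated):
    """Return the only fields an operational log line may carry."""
    return {
        "route": route,
        "outcome": outcome,
        "upstream_status_class": status_class(upstream_status),
        "elapsed_ms": int(elapsed_ms),
        "section_count": int(section_count),
        "truncated": bool(truncated),
    }


record = build_log_record("section-source", "ok", 200, 12.7, 3, False)
```

The helper takes no path, heading, token, or payload argument, so those values cannot reach the log through it.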
## Deletion, Export, And Staleness

The future endpoint is on-demand and non-persistent.

Until a separate persistence spec is accepted:

- no hosted SectionSource sidecar is created
- no hosted SectionSource index is created
- no vector record is created
- no memory event is created
- no summary record is created
- no provider record is created
- no Scooling record is created
- export behavior remains unchanged
- deleting a note leaves no SectionSource-derived Hub artifact to delete
- editing a note leaves no stale SectionSource-derived Hub artifact to invalidate

## Prompt-Injection Handling

SectionSource text fields are private, untrusted source material:

- `title`
- `heading_text`
- `heading_path`
- future labels, snippets, or section bodies if separately accepted

Prompt-like headings that ask a model to reveal secrets, bypass review, ignore policy, call
providers, exfiltrate learner data, alter grades, or disable guardrails must remain inert
text. They must not become tool instructions, system prompts, routing decisions, provider
requests, write-back approvals, UI actions, or authorization overrides.

## Hosted MCP Parity Boundary

The future Hub REST endpoint must match hosted MCP `get_section_source` for the body-free
data contract, one-note read behavior, active vault boundary, effective canister user
boundary, path safety, and excluded fields.

Parity does not mean adding:

- MCP resource URIs
- MCP prompt behavior
- hosted MCP-only error envelopes
- search behavior
- body reads
- snippets
- write-back behavior

Hosted MCP `get_section_source` remains available in this planning phase.

## Scooling Consumption Boundary

This phase does not add Scooling runtime behavior.

Future Scooling consumption of Hub REST SectionSource may happen only after:

- the Hub REST runtime implementation is accepted and tested
- Scooling calls through a Scooling-owned adapter
- Scooling preserves the body-free `knowtation.section_source/v0` allowlist
- Scooling treats heading text and heading paths as untrusted source material

Scooling must not:

- bypass Knowtation hosted authorization
- parse Markdown as the canonical section parser
- derive canonical section ids
- store SectionSource as truth
- call PageIndex, OCR, LLMs, or external providers to recreate sections
- expose private learner section metadata outside authorized contexts
- request note bodies, section bodies, snippets, resource URIs, provider payloads, line
  ranges, byte offsets, or section body lengths through this endpoint
- use SectionSource reads as write-back approval

## Seven-Tier Test Requirements

### Unit

- The spec documents auth, vault, effective canister user, canister headers, one-note read,
  path safety, output allowlist, error, logging, lifecycle, prompt-injection, hosted MCP
  parity, and Scooling boundaries.
- The future output allowlist matches body-free `SectionSource v0`.
- `body_returned` and `snippet_returned` remain false.
- Invalid path errors do not echo unsafe paths.

### Integration

- No Hub REST route is registered in this planning phase.
- No OpenAPI path, tag, schema, component, or example is added in this planning phase.
- Hosted MCP `get_section_source` remains registered and body-free.
- Future runtime tests must prove canister reads use the active vault and effective canister
  user boundaries.

### End To End

- A REST client cannot call `GET /api/v1/section-source` in this planning phase.
- A hosted MCP client can still call `get_section_source`.
- Future REST runtime tests must prove a client can request one body-free SectionSource
  response only after route and OpenAPI updates are accepted.
- No flow returns note bodies, section bodies, snippets, full frontmatter, provider payloads,
  or resource URIs.

### Stress

- Planning checks stay bounded to SectionSource docs, Hub route files, OpenAPI docs, and
  contract tests.
- Future runtime tests must prove large notes remain capped by heading and text caps.
- Future runtime tests must prove repeated calls for unchanged notes are deterministic.
- No test scans a real vault or calls external providers.

### Data Integrity

- This planning phase writes no notes, sidecars, indexes, vectors, memory, summaries,
  provider records, Scooling records, or canister state.
- Future runtime tests must prove one REST request performs one note read and no writes.
- Export, delete, edit, backup, and restore behavior remain unchanged in this phase.

### Performance

- The future endpoint must read one note only.
- The future endpoint must not scan the whole vault.
- The future endpoint must not call bridge search.
- The future endpoint must not call external providers.
- Output size must remain bounded by accepted SectionSource caps.

### Security

- Hub REST exposure remains blocked in this phase.
- OpenAPI exposure remains blocked in this phase.
- No note body text appears in future SectionSource REST output.
- No section body text appears in future SectionSource REST output.
- No snippets appear in future SectionSource REST output.
- No full frontmatter appears in future SectionSource REST output.
- No absolute filesystem paths appear in future SectionSource REST output or errors.
- No raw canister payload appears in future SectionSource REST output or errors.
- No provider payload appears in future SectionSource REST output or errors.
- No MCP resource URI appears for SectionSource REST content.
- Search, persistence, Scooling, PageIndex, OCR, LLM, and provider exposure remain blocked.

## Contract Guards

This planning phase must add tests proving:

- this Hub REST/OpenAPI spec is complete
- no Hub REST route is registered yet
- no OpenAPI path, schema, tag, component, or example is added yet
- hosted MCP `get_section_source` remains available
- no search, persistence, Scooling, body, snippet, provider, or resource surface is added

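The OpenAPI guard in particular might look something like the Python sketch below. It assumes the Hub's OpenAPI document has already been parsed into a dictionary, and it uses a simple name-matching heuristic that is not defined by this spec. The function names are hypothetical, and the check covers only paths, component schemas, and tags; examples and other components would need their own checks.

```python
def _mentions_section_source(name):
    """Heuristic match that ignores case, hyphens, underscores, and spaces."""
    flattened = str(name).lower().replace("-", "").replace("_", "").replace(" ", "")
    return "sectionsource" in flattened


def find_section_source_exposure(openapi_doc):
    """List OpenAPI entries that would expose SectionSource during planning."""
    hits = []
    for path in openapi_doc.get("paths", {}):
        if _mentions_section_source(path):
            hits.append(f"paths:{path}")
    schemas = openapi_doc.get("components", {}).get("schemas", {})
    for schema_name in schemas:
        if _mentions_section_source(schema_name):
            hits.append(f"schemas:{schema_name}")
    for tag in openapi_doc.get("tags", []):
        tag_name = tag.get("name", "")
        if _mentions_section_source(tag_name):
            hits.append(f"tags:{tag_name}")
    return hits
```

A planning-phase contract test would then assert that the returned list is empty for the current OpenAPI document.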
## Stop Conditions

Stop and re-plan if Hub REST/OpenAPI work requires:

- returning note body text
- returning section body text
- returning snippets
- returning full frontmatter
- returning exact line ranges
- returning byte offsets
- returning section body lengths
- returning absolute paths
- returning raw canister payloads
- returning provider payloads
- returning MCP resource URIs
- adding Hub UI
- adding canister routes
- adding search, vectors, indexes, persistence, sidecars, summaries, or memory events
- adding Scooling runtime behavior
- calling PageIndex, OCR, LLMs, or external providers
- weakening hosted role ACL, active vault, effective canister user, REST auth, or path
  safety behavior
- logging note content, section content, headings, raw upstream payloads, auth headers,
  gateway secrets, bearer tokens, or provider payloads

## Acceptance Criteria

Phase 1M is accepted when:

- The Hub REST/OpenAPI behavior is specified before runtime exposure.
- The future endpoint is limited to one vault-relative note path.
- The future route is read-only and auth-gated.
- The future canister request uses active vault and effective canister user boundaries.
- The future output is limited to body-free `knowtation.section_source/v0` metadata.
- Errors and logs are sanitized.
- Deletion, export, and staleness behavior remain non-persistent.
- Prompt-injection text remains untrusted source material.
- Hosted MCP parity is documented.
- Scooling remains a downstream consumer behind its adapter boundary.
- Contract tests prove Hub REST and OpenAPI exposure remain absent in this planning phase.
- Contract tests prove no search, persistence, Scooling, body, snippet, provider, or resource
  surface was added.

## Recommendation

Phase 1M is the accepted planning and contract-test phase.

Phase 1N implements the Hub gateway REST/OpenAPI runtime that follows this spec. It adds
the route, OpenAPI schema, Hub API documentation, and seven-tier runtime tests together. It
does not add Hub UI, canister routes, search, persistence, Scooling runtime behavior, body
reads, snippets, summaries, PageIndex, OCR, LLM calls, provider routing, or write-back
behavior.