SECTION-SOURCE-HUB-REST-OPENAPI-SPEC.md file-level

sha256:4 fix(security): pin patched transitive deps to clear Dependabot moderate… · aaronrene · Jun 11, 2026
# SectionSource Hub REST/OpenAPI Implementation Spec

## Simple Summary

Phase 1M specifies a future Hub REST/OpenAPI surface for body-free SectionSource metadata.

This phase is planning only. It does not add a Hub REST endpoint, OpenAPI schema, Hub UI,
canister route, search mode, persistence, Scooling runtime behavior, section body, snippet,
summary, PageIndex, OCR, LLM call, provider route, MCP resource, or write-back behavior.

## Technical Summary

The future Hub REST surface must mirror the accepted hosted MCP `get_section_source`
runtime, but through a browser/API route instead of MCP. The route must read exactly one
authorized note, derive body-free `knowtation.section_source/v0` metadata in memory, and
return only the SectionSource v0 allowlist.

The future OpenAPI surface may document that route only after the Hub REST implementation is
added and tested.

## Planning Decision

Phase 1M accepts the Hub REST/OpenAPI implementation specification only.

It does not approve:

- adding `GET /api/v1/section-source`
- adding any other Hub REST SectionSource endpoint
- adding OpenAPI paths, tags, schemas, examples, or components for SectionSource
- adding Hub UI calls or display components
- adding canister routes
- adding search, vectors, indexes, persistence, sidecars, summaries, or memory events
- adding Scooling runtime behavior
- returning note body text
- returning section body text
- returning snippets or source excerpts
- returning full frontmatter
- returning line ranges, byte offsets, or section body lengths
- returning absolute paths, raw canister payloads, provider payloads, or MCP resource URIs
- calling PageIndex, OCR, LLMs, or external providers
- adding provider routing or write-back behavior

## Future REST Endpoint

A later runtime phase may propose:

```text
GET /api/v1/section-source?path=<vault-relative-note-path>
```

The endpoint uses a query parameter instead of a path parameter so vault-relative paths with
slashes do not require greedy route matching.

The route must be read-only and must not accept a request body.

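A minimal sketch of how a client might encode a nested path for the proposed route, assuming the query-parameter shape above. `buildSectionSourceUrl` and the `hub.example.invalid` host are hypothetical, and the route itself is not registered in this planning phase.

```javascript
// Hypothetical client helper; the route is only proposed, not implemented.
function buildSectionSourceUrl(hubBaseUrl, notePath) {
  const url = new URL("/api/v1/section-source", hubBaseUrl);
  // searchParams percent-encodes "/" so the whole path travels as one value.
  url.searchParams.set("path", notePath);
  return url.toString();
}

const exampleUrl = buildSectionSourceUrl(
  "https://hub.example.invalid",
  "inbox/projects/example.md"
);
console.log(exampleUrl);
// https://hub.example.invalid/api/v1/section-source?path=inbox%2Fprojects%2Fexample.md
```

Because the path arrives as one decoded query value, a server-side router would not need a wildcard segment to capture it.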
## REST Auth Requirements

The future endpoint must require the same Hub API authentication as adjacent note-read
routes:

- `Authorization: Bearer <access_token>` is required.
- Missing, expired, malformed, or invalid JWTs return `401`.
- The caller must have read access to the active vault.
- Viewer-equivalent read roles may call the route only after the route is implemented.
- Write, admin, proposal, evaluator, billing, import, export, and operator permissions must
  not grant additional SectionSource fields.
- The request must not accept client-supplied user id, role, canister user id, vault access,
  provider, resource, body, snippet, line range, byte offset, or search options.

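The first two auth bullets can be sketched as a header check, assuming nothing about the Hub's existing middleware. `extractBearerToken` is a hypothetical helper. It covers only the missing and malformed cases; JWT signature, expiry, and vault-role checks are left to whatever the adjacent note-read routes already use.

```javascript
// Hypothetical helper: returns the bearer token, or null when the header is
// missing or malformed. A null result would map to a 401 response.
function extractBearerToken(authorizationHeader) {
  if (typeof authorizationHeader !== "string") return null;
  // The auth scheme name is matched case-insensitively; the token must be one
  // run of non-whitespace characters.
  const match = /^Bearer\s+(\S+)$/i.exec(authorizationHeader.trim());
  return match ? match[1] : null;
}

console.log(extractBearerToken("Bearer abc.def.ghi")); // abc.def.ghi
console.log(extractBearerToken("Basic abc"));          // null
```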
## Active Vault Boundary

The future endpoint must use the active Hub vault boundary:

- Hosted gateway: the active vault comes from `X-Vault-Id` or the session default already
  accepted by Hub gateway note-read routes.
- Self-hosted Hub: the active vault comes from the authenticated Hub context and configured
  local vault access.
- The client cannot read from a different vault by placing a vault id in `path`.
- The returned `path` must be the normalized request path, not an upstream absolute path or
  raw canister/storage key.
- Missing and unauthorized responses must not reveal whether the note exists in another
  vault.

## Effective Canister User Boundary

Hosted Hub REST must use the same effective canister user boundary as adjacent hosted note
routes:

- The gateway resolves the effective canister user from hosted context.
- The canister read sends `X-User-Id` with the effective canister user id.
- The actor user id may be forwarded only as existing Hub audit context where adjacent routes
  already do so.
- The client cannot supply or override the effective canister user id.
- SectionSource output must not mix notes across effective canister users.

Self-hosted Hub has no canister user boundary. It must preserve the local authenticated
vault boundary instead.

## Canister Auth And Header Behavior

Hosted Hub REST must use the same canister note-read behavior as adjacent gateway note
routes:

```text
GET {canisterUrl}/api/v1/notes/{encodeURIComponent(normalizedPath)}
```

Headers:

- `Authorization: Bearer <gateway JWT or trusted upstream token>` when adjacent note-read
  proxy behavior requires it
- `X-Vault-Id: <active vault id>`
- `X-User-Id: <effective canister user id>`
- `X-Gateway-Auth: <configured canister auth secret>` when configured
- `Accept: application/json`

The endpoint must not forward SectionSource-specific options, provider options, Scooling
options, search filters, body flags, snippet flags, line ranges, byte offsets, or resource
URIs upstream.

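A minimal sketch of assembling the upstream read described above. `buildCanisterNoteRead` and its parameter names (`upstreamToken`, `gatewayAuthSecret`, and so on) are assumptions for illustration, not the gateway's existing proxy code.

```javascript
// Hypothetical helper: describes the single upstream canister note read.
// It returns a plain request description and performs no network call.
function buildCanisterNoteRead(options) {
  const { canisterUrl, normalizedPath, vaultId, effectiveUserId } = options;
  const headers = {
    "Accept": "application/json",
    "X-Vault-Id": vaultId,
    "X-User-Id": effectiveUserId,
  };
  // Conditional headers are attached only when a value is configured.
  if (options.upstreamToken) {
    headers["Authorization"] = `Bearer ${options.upstreamToken}`;
  }
  if (options.gatewayAuthSecret) {
    headers["X-Gateway-Auth"] = options.gatewayAuthSecret;
  }
  return {
    method: "GET",
    // encodeURIComponent turns "/" into "%2F", so the path is one URL segment.
    url: `${canisterUrl}/api/v1/notes/${encodeURIComponent(normalizedPath)}`,
    headers,
  };
}
```

The helper takes no options beyond those named in this section, so nothing SectionSource-specific can be forwarded upstream by accident.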
## One-Note Read Behavior

The future endpoint must perform one note read per request.

Allowed behavior:

- validate auth
- resolve active vault and effective canister user
- normalize and validate one `path`
- read exactly one note
- derive body-free SectionSource metadata in memory
- return JSON

Blocked behavior:

- listing notes
- scanning the whole vault
- calling bridge search
- calling index, vector, PageIndex, OCR, LLM, provider, summary, memory, import, export, or
  write routes
- creating sidecars, indexes, vectors, summaries, memory events, Scooling records, or
  canister state

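One way a future runtime test might check the "exactly one note" rule is to wrap the note reader in a counting guard. This is a sketch only; `createSingleReadGuard` and the injected `readNote` function are hypothetical names.

```javascript
// Hypothetical test helper: wraps a note reader so that a second read within
// the same request throws instead of silently succeeding.
function createSingleReadGuard(readNote) {
  let reads = 0;
  return {
    read(notePath) {
      reads += 1;
      if (reads > 1) {
        throw new Error("more than one note read in a single request");
      }
      return readNote(notePath);
    },
    count() {
      return reads;
    },
  };
}
```

A handler under test would receive `guard.read` in place of its real reader, and the test would assert `guard.count() === 1` after the response is produced.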
## Path Normalization And Unsafe Path Rejection

The future endpoint must reject unsafe paths before any upstream note fetch.

The normalization algorithm must:

- require `path` as a query parameter
- require a string
- trim whitespace
- replace backslashes with `/`
- reject empty paths
- reject POSIX absolute paths
- reject Windows absolute paths
- split on `/`
- remove empty segments caused by duplicate slashes
- reject any `..` segment
- join safe segments with `/`

Unsafe path errors must not echo the raw unsafe path.

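The steps above can be sketched as one function, applied in the listed order. `normalizeNotePath` and its result shape are hypothetical. Treating any drive-letter prefix such as `C:` as a Windows absolute path is an assumption of this sketch, not something the spec states.

```javascript
// Hypothetical helper following the listed normalization steps in order.
// Returns { ok: true, path } or { ok: false, code: "INVALID_PATH" }.
// The failure result never contains the raw input, so it cannot be echoed.
function normalizeNotePath(rawPath) {
  const invalid = { ok: false, code: "INVALID_PATH" };
  // A missing or repeated query parameter is not a string.
  if (typeof rawPath !== "string") return invalid;
  // Trim whitespace, then replace backslashes with "/".
  const unified = rawPath.trim().replace(/\\/g, "/");
  if (unified === "") return invalid;
  // POSIX absolute paths; UNC paths also start with "/" after unification.
  if (unified.startsWith("/")) return invalid;
  // Assumed Windows check: any drive-letter prefix is rejected.
  if (/^[A-Za-z]:/.test(unified)) return invalid;
  // Split on "/" and drop empty segments left by duplicate slashes.
  const segments = unified.split("/").filter((segment) => segment !== "");
  if (segments.includes("..")) return invalid;
  return { ok: true, path: segments.join("/") };
}

console.log(normalizeNotePath("inbox//notes\\example.md"));
// { ok: true, path: 'inbox/notes/example.md' }
console.log(normalizeNotePath("../secret.md"));
// { ok: false, code: 'INVALID_PATH' }
```

The sketch rejects rather than repairs: a `..` segment is never resolved against its parent, so no input can normalize its way out of the vault root.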
## Output Allowlist

The future endpoint may return only body-free `knowtation.section_source/v0` output:

```json
{
  "schema": "knowtation.section_source/v0",
  "path": "inbox/example.md",
  "title": "Example",
  "sections": [
    {
      "section_id": "inbox-example-md:h1-example-0001",
      "heading_id": "h1-example-0001",
      "level": 1,
      "heading_path": ["Example"],
      "heading_text": "Example",
      "child_section_ids": [],
      "body_available": true,
      "body_returned": false,
      "snippet_returned": false
    }
  ],
  "truncated": false
}
```

Allowed top-level fields:

- `schema`
- `path`
- `title`
- `sections`
- `truncated`

Allowed section fields:

- `section_id`
- `heading_id`
- `level`
- `heading_path`
- `heading_text`
- `child_section_ids`
- `body_available`
- `body_returned`
- `snippet_returned`

Required constants:

- `schema` must be exactly `knowtation.section_source/v0`.
- `body_returned` must be `false`.
- `snippet_returned` must be `false`.

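A minimal sketch of enforcing the allowlist and required constants as a final projection step. `pickAllowed` and `toAllowlistedSectionSource` are hypothetical names. The sketch forces the constants rather than trusting upstream values; a real implementation might instead reject a candidate that violates them.

```javascript
const TOP_LEVEL_FIELDS = ["schema", "path", "title", "sections", "truncated"];
const SECTION_FIELDS = [
  "section_id", "heading_id", "level", "heading_path", "heading_text",
  "child_section_ids", "body_available", "body_returned", "snippet_returned",
];

// Copies only the allowlisted own properties of `source`.
function pickAllowed(source, allowedKeys) {
  const out = {};
  for (const key of allowedKeys) {
    if (Object.prototype.hasOwnProperty.call(source, key)) out[key] = source[key];
  }
  return out;
}

// Hypothetical projection: drops every field outside the v0 allowlist and
// pins the three required constants.
function toAllowlistedSectionSource(candidate) {
  const out = pickAllowed(candidate, TOP_LEVEL_FIELDS);
  out.schema = "knowtation.section_source/v0";
  const sections = Array.isArray(candidate.sections) ? candidate.sections : [];
  out.sections = sections.map((section) => ({
    ...pickAllowed(section, SECTION_FIELDS),
    body_returned: false,
    snippet_returned: false,
  }));
  return out;
}
```

Running the projection last, immediately before serialization, means a field added by mistake earlier in the pipeline still cannot reach the response.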
## Explicitly Excluded Output

The future endpoint must not output:

- note body text
- section body text
- snippets
- source excerpts
- full frontmatter
- line ranges
- byte offsets
- section body lengths
- absolute filesystem paths
- raw canister paths
- raw canister payloads
- provider payloads
- provider keys
- rendered HTML
- summaries
- vector scores
- search results
- persistence ids
- sidecar paths
- memory events
- MCP resource URIs
- PageIndex output
- OCR text
- media metadata
- Scooling adapter state
- classroom policy state

## Error Sanitization

The future endpoint must return the existing Hub JSON error envelope:

```json
{
  "error": "Invalid path",
  "code": "INVALID_PATH"
}
```

Rules:

- Missing `Authorization` returns `401` without note details.
- Missing `path` returns `400` with a generic invalid path code.
- Non-string or empty `path` returns `400`.
- Unsafe paths return `400` before any upstream fetch.
- Missing notes return `404` without note body, frontmatter, headings, or raw upstream body.
- Unauthorized notes return `403` without revealing whether another vault or user can read
  the note.
- Upstream failures return a bounded upstream status class and no raw upstream payload.
- Runtime failures return a generic error and no private content.

Errors must not contain:

- note body text
- section body text
- snippets
- full frontmatter
- heading paths beyond an authorized successful response
- absolute paths
- requested unsafe paths
- raw canister payloads
- auth headers
- bearer tokens
- gateway secrets
- provider payloads
- MCP resource URIs

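The rules above can be sketched as a fixed lookup from an internal outcome class to a prebuilt envelope, so no request or upstream data can reach the response. Only `INVALID_PATH` comes from this spec. The outcome class names, the other codes and messages, and the `502` and `500` statuses are assumptions for illustration.

```javascript
// Hypothetical table: every envelope is a constant, never built from input.
const SANITIZED_ERRORS = {
  unauthenticated: { status: 401, body: { error: "Unauthorized", code: "UNAUTHORIZED" } },
  invalid_path: { status: 400, body: { error: "Invalid path", code: "INVALID_PATH" } },
  forbidden: { status: 403, body: { error: "Forbidden", code: "FORBIDDEN" } },
  not_found: { status: 404, body: { error: "Not found", code: "NOT_FOUND" } },
  upstream_failure: { status: 502, body: { error: "Upstream error", code: "UPSTREAM_ERROR" } },
};
const GENERIC_ERROR = { status: 500, body: { error: "Internal error", code: "INTERNAL_ERROR" } };

// Maps an outcome class to its envelope; unknown classes get the generic one.
function toSanitizedError(outcomeClass) {
  const known = Object.prototype.hasOwnProperty.call(SANITIZED_ERRORS, outcomeClass);
  const entry = known ? SANITIZED_ERRORS[outcomeClass] : GENERIC_ERROR;
  // Return a copy so callers cannot mutate the shared table.
  return { status: entry.status, body: { ...entry.body } };
}

console.log(toSanitizedError("invalid_path"));
// { status: 400, body: { error: 'Invalid path', code: 'INVALID_PATH' } }
```

The function accepts only the outcome class, not the caught error or the requested path, which keeps unsafe paths and upstream payloads out of the envelope by construction.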
## Logging Exclusions

The future implementation must not log:

- note body text
- section body text
- snippets
- full frontmatter
- heading text
- heading paths
- raw canister payloads
- requested unsafe paths
- absolute paths
- bearer tokens
- gateway secrets
- canister auth secrets
- provider payloads
- MCP resource URIs

Bounded operational logs may include only:

- route name
- sanitized outcome class
- sanitized upstream status class
- elapsed time
- section count
- truncated flag

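A minimal sketch of a bounded log record limited to the six items above. The field names (`route`, `outcome_class`, `upstream_status_class`, `elapsed_ms`, `section_count`, `truncated`) and both helper names are hypothetical.

```javascript
// Hypothetical field names for the six permitted log items.
const BOUNDED_LOG_FIELDS = [
  "route", "outcome_class", "upstream_status_class",
  "elapsed_ms", "section_count", "truncated",
];

// Reduces an HTTP status to its class, for example 502 -> "5xx".
function toStatusClass(status) {
  const valid = Number.isInteger(status) && status >= 100 && status <= 599;
  return valid ? `${Math.floor(status / 100)}xx` : "unknown";
}

// Copies only the permitted fields; anything else is silently dropped.
function toBoundedLogRecord(candidate) {
  const record = {};
  for (const key of BOUNDED_LOG_FIELDS) {
    if (Object.prototype.hasOwnProperty.call(candidate, key)) {
      record[key] = candidate[key];
    }
  }
  return record;
}
```

The filter restricts field names only. Callers would still need to pass sanitized values, for example a fixed route name rather than the request URL.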
## Deletion, Export, And Staleness

The future endpoint is on-demand and non-persistent.

Until a separate persistence spec is accepted:

- no hosted SectionSource sidecar is created
- no hosted SectionSource index is created
- no vector record is created
- no memory event is created
- no summary record is created
- no provider record is created
- no Scooling record is created
- export behavior remains unchanged
- deleting a note leaves no SectionSource-derived Hub artifact to delete
- editing a note leaves no stale SectionSource-derived Hub artifact to invalidate

## Prompt-Injection Handling

SectionSource text fields are private, untrusted source material:

- `title`
- `heading_text`
- `heading_path`
- future labels, snippets, or section bodies if separately accepted

Prompt-like headings that ask a model to reveal secrets, bypass review, ignore policy, call
providers, exfiltrate learner data, alter grades, or disable guardrails must remain inert
text. They must not become tool instructions, system prompts, routing decisions, provider
requests, write-back approvals, UI actions, or authorization overrides.

## Hosted MCP Parity Boundary

The future Hub REST endpoint must match hosted MCP `get_section_source` for the body-free
data contract, one-note read behavior, active vault boundary, effective canister user
boundary, path safety, and excluded fields.

Parity does not mean adding:

- MCP resource URIs
- MCP prompt behavior
- hosted MCP-only error envelopes
- search behavior
- body reads
- snippets
- write-back behavior

Hosted MCP `get_section_source` remains available in this planning phase.

## Scooling Consumption Boundary

This phase does not add Scooling runtime behavior.

Future Scooling consumption of Hub REST SectionSource may happen only after:

- the Hub REST runtime implementation is accepted and tested
- Scooling calls through a Scooling-owned adapter
- Scooling preserves the body-free `knowtation.section_source/v0` allowlist
- Scooling treats heading text and heading paths as untrusted source material

Scooling must not:

- bypass Knowtation hosted authorization
- parse Markdown as the canonical section parser
- derive canonical section ids
- store SectionSource as truth
- call PageIndex, OCR, LLMs, or external providers to recreate sections
- expose private learner section metadata outside authorized contexts
- request note bodies, section bodies, snippets, resource URIs, provider payloads, line
  ranges, byte offsets, or section body lengths through this endpoint
- use SectionSource reads as write-back approval

## Seven-Tier Test Requirements

### Unit

- The spec documents auth, vault, effective canister user, canister headers, one-note read,
  path safety, output allowlist, error, logging, lifecycle, prompt-injection, hosted MCP
  parity, and Scooling boundaries.
- The future output allowlist matches body-free `SectionSource v0`.
- `body_returned` and `snippet_returned` remain false.
- Invalid path errors do not echo unsafe paths.

### Integration

- No Hub REST route is registered in this planning phase.
- No OpenAPI path, tag, schema, component, or example is added in this planning phase.
- Hosted MCP `get_section_source` remains registered and body-free.
- Future runtime tests must prove canister reads use the active vault and effective canister
  user boundaries.

### End To End

- A REST client cannot call `GET /api/v1/section-source` in this planning phase.
- A hosted MCP client can still call `get_section_source`.
- Future REST runtime tests must prove a client can request one body-free SectionSource
  response only after route and OpenAPI updates are accepted.
- No flow returns note bodies, section bodies, snippets, full frontmatter, provider payloads,
  or resource URIs.

### Stress

- Planning checks stay bounded to SectionSource docs, Hub route files, OpenAPI docs, and
  contract tests.
- Future runtime tests must prove large notes remain capped by heading and text caps.
- Future runtime tests must prove repeated calls for unchanged notes are deterministic.
- No test scans a real vault or calls external providers.

### Data Integrity

- This planning phase writes no notes, sidecars, indexes, vectors, memory, summaries,
  provider records, Scooling records, or canister state.
- Future runtime tests must prove one REST request performs one note read and no writes.
- Export, delete, edit, backup, and restore behavior remain unchanged in this phase.

### Performance

- The future endpoint must read one note only.
- The future endpoint must not scan the whole vault.
- The future endpoint must not call bridge search.
- The future endpoint must not call external providers.
- Output size must remain bounded by accepted SectionSource caps.

### Security

- Hub REST exposure remains blocked in this phase.
- OpenAPI exposure remains blocked in this phase.
- No note body text appears in future SectionSource REST output.
- No section body text appears in future SectionSource REST output.
- No snippets appear in future SectionSource REST output.
- No full frontmatter appears in future SectionSource REST output.
- No absolute filesystem paths appear in future SectionSource REST output or errors.
- No raw canister payload appears in future SectionSource REST output or errors.
- No provider payload appears in future SectionSource REST output or errors.
- No MCP resource URI appears for SectionSource REST content.
- Search, persistence, Scooling, PageIndex, OCR, LLM, and provider exposure remain blocked.

## Contract Guards

This planning phase must add tests proving:

- this Hub REST/OpenAPI spec is complete
- no Hub REST route is registered yet
- no OpenAPI path, schema, tag, component, or example is added yet
- hosted MCP `get_section_source` remains available
- no search, persistence, Scooling, body, snippet, provider, or resource surface is added

## Stop Conditions

Stop and re-plan if Hub REST/OpenAPI work requires:

- returning note body text
- returning section body text
- returning snippets
- returning full frontmatter
- returning exact line ranges
- returning byte offsets
- returning section body lengths
- returning absolute paths
- returning raw canister payloads
- returning provider payloads
- returning MCP resource URIs
- adding Hub UI
- adding canister routes
- adding search, vectors, indexes, persistence, sidecars, summaries, or memory events
- adding Scooling runtime behavior
- calling PageIndex, OCR, LLMs, or external providers
- weakening hosted role ACL, active vault, effective canister user, REST auth, or path
  safety behavior
- logging note content, section content, headings, raw upstream payloads, auth headers,
  gateway secrets, bearer tokens, or provider payloads

## Acceptance Criteria

Phase 1M is accepted when:

- The Hub REST/OpenAPI behavior is specified before runtime exposure.
- The future endpoint is limited to one vault-relative note path.
- The future route is read-only and auth-gated.
- The future canister request uses active vault and effective canister user boundaries.
- The future output is limited to body-free `knowtation.section_source/v0` metadata.
- Errors and logs are sanitized.
- Deletion, export, and staleness behavior remain non-persistent.
- Prompt-injection text remains untrusted source material.
- Hosted MCP parity is documented.
- Scooling remains a downstream consumer behind its adapter boundary.
- Contract tests prove Hub REST and OpenAPI exposure remain absent in this planning phase.
- Contract tests prove no search, persistence, Scooling, body, snippet, provider, or resource
  surface was added.

## Recommendation

Phase 1M is the accepted planning and contract-test phase.

Phase 1N implements the Hub gateway REST/OpenAPI runtime that follows this spec. It adds
the route, OpenAPI schema, Hub API documentation, and seven-tier runtime tests together. It
does not add Hub UI, canister routes, search, persistence, Scooling runtime behavior, body
reads, snippets, summaries, PageIndex, OCR, LLM calls, provider routing, or write-back
behavior.