IMPORT-NORMALIZE-PIPELINE.md markdown

Optional post-import normalization (agent JSON, LLM)

Today, import is implemented as format-specific parsers in lib/importers/. LLM usage in that path is limited to Whisper transcription for audio/video (lib/transcribe.mjs).

Problem

Agents or external tools may emit JSON or Markdown whose metadata does not match the SPEC.md frontmatter (title, tags, and the optional intention fields in §2.3). MCP write currently accepts only string key/value frontmatter (lib/write.mjs), so arrays must be encoded in a form the writer can serialize correctly.
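To make the array constraint concrete, here is a minimal sketch of flattening frontmatter values to strings before handing them to a string-only writer. The helper name toStringFrontmatter and the choice of a YAML flow sequence with JSON-quoted items are assumptions for illustration; the encoding that lib/write.mjs actually expects is not stated here and should be checked against the writer.

```javascript
// Hypothetical helper (not part of lib/write.mjs): flatten frontmatter values
// to strings for a writer that accepts only string key/value pairs.
// Assumption: arrays are encoded as a YAML flow sequence of quoted items.
function toStringFrontmatter(frontmatter) {
  const out = {};
  for (const [key, value] of Object.entries(frontmatter)) {
    if (value === null || value === undefined) {
      // Drop empty fields instead of writing the strings "null" or "undefined".
      continue;
    }
    if (Array.isArray(value)) {
      // JSON.stringify quotes each item, so commas or colons inside an item survive.
      const items = value.map((item) => JSON.stringify(String(item)));
      out[key] = `[${items.join(", ")}]`;
    } else {
      out[key] = String(value);
    }
  }
  return out;
}

const flat = toStringFrontmatter({ title: "Standup notes", tags: ["work", "q3 planning"] });
console.log(flat.tags);
```

Quoting every item keeps the encoding unambiguous at the cost of slightly noisier frontmatter; a writer that parses the string back as YAML would recover the original list.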

Add an explicit normalization stage after import, split into three steps, so that the deterministic importers stay testable:

  1. Normalize (rules): Map known vendor keys to SPEC fields (no model).
  2. Normalize (LLM): Optional. A one-shot prompt (“produce YAML frontmatter + body only”) with a fixed schema and validation; reject on parse failure.
  3. Validate: Reject or quarantine notes that lack the fields your policy requires (e.g. the inbox contract in §2.2).
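The two deterministic steps (1 and 3) can be sketched as plain functions, which is what keeps them easy to unit-test. Everything named below is hypothetical: the vendor keys in VENDOR_KEY_MAP, the REQUIRED_FIELDS list, and the function names are illustrative assumptions, not the contents of SPEC.md or lib/importers/. The optional LLM step would sit between the two calls and is left out of this sketch.

```javascript
// Hypothetical vendor-key table; a real one would be derived from the tools you import from.
const VENDOR_KEY_MAP = new Map([
  ["name", "title"],
  ["subject", "title"],
  ["labels", "tags"],
  ["keywords", "tags"],
]);

// Assumed required fields; the actual list depends on your policy and SPEC.md.
const REQUIRED_FIELDS = ["title"];

// Step 1: rename known vendor keys to SPEC fields without calling a model.
function normalizeByRules(raw) {
  const note = {};
  const conflicts = [];
  for (const [key, value] of Object.entries(raw)) {
    const target = VENDOR_KEY_MAP.get(key) ?? key;
    if (Object.prototype.hasOwnProperty.call(note, target)) {
      // Two source keys map to the same field: keep the first, record the second.
      conflicts.push(key);
      continue;
    }
    note[target] = value;
  }
  return { note, conflicts };
}

// Step 3: report which required fields are missing so the caller can reject or quarantine.
function validateNote(note, required = REQUIRED_FIELDS) {
  const missing = required.filter((field) => {
    const value = note[field];
    return (
      value === undefined ||
      value === null ||
      value === "" ||
      (Array.isArray(value) && value.length === 0)
    );
  });
  return { ok: missing.length === 0, missing };
}

const { note, conflicts } = normalizeByRules({ name: "Standup", labels: ["work"] });
const verdict = validateNote(note);
console.log(note, conflicts, verdict);
```

Returning the conflicts and missing lists, instead of throwing, leaves the reject-or-quarantine decision to the caller, in line with step 3.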

Implement this as a separate subcommand or Hub action (e.g. knowtation import normalize <path>, or “Normalize note” in the UI) rather than hiding it inside every importer.

Contract

  • Input: Path to a note or a JSON file + target vault path.
  • Output: Updated note, or a new note under inbox/ or staging, with machine-readable normalization_provenance (or equivalent) in frontmatter for audit.
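
As an illustration of the output side of the contract, the sketch below attaches a normalization_provenance value to a note's frontmatter. The record's shape (source, stages, normalized_at) and the helper name withProvenance are assumptions made for this example. The value is stored as a JSON string only because the writer is described above as accepting string values.

```javascript
// Hypothetical helper: attach an audit record to frontmatter.
// The field layout inside normalization_provenance is an assumption, not a SPEC.md definition.
function withProvenance(frontmatter, { source, stages, normalizedAt }) {
  const record = {
    source, // path or identifier of the input that was normalized
    stages, // which steps ran, e.g. ["rules", "validate"]
    normalized_at: normalizedAt, // ISO 8601 timestamp supplied by the caller
  };
  return {
    ...frontmatter,
    // Serialized to a string so a string-only writer can store it unchanged.
    normalization_provenance: JSON.stringify(record),
  };
}

const stamped = withProvenance(
  { title: "Standup" },
  { source: "exports/note.json", stages: ["rules", "validate"], normalizedAt: new Date().toISOString() }
);
console.log(stamped.normalization_provenance);
```

Passing the timestamp in, instead of reading the clock inside the helper, keeps the function deterministic under test.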