IMPORT-NORMALIZE-PIPELINE.md
Optional post-import normalization (agent JSON, LLM)
Today, import is implemented as format-specific parsers in lib/importers/. LLM usage in that path is limited to Whisper transcription for audio/video (lib/transcribe.mjs).
Problem
Agents or external tools may emit JSON or Markdown that does not match SPEC.md frontmatter (title, tags, optional intention fields in §2.3). MCP write currently accepts string key/value frontmatter only (lib/write.mjs); arrays must be represented in a way the writer can serialize correctly.
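One way to deal with the string-only constraint is to flatten values before they reach the writer. The sketch below is illustrative only: toStringFrontmatter is a hypothetical helper, not part of lib/write.mjs, and the choice to encode arrays as a YAML flow sequence of double-quoted strings is an assumption about what the writer can pass through unchanged.

```javascript
// Hypothetical helper (not in the codebase): coerce frontmatter values to strings.
// Assumption: arrays are encoded as a YAML flow sequence such as ["a", "b"].
function toStringFrontmatter(frontmatter) {
  const out = {};
  for (const [key, value] of Object.entries(frontmatter)) {
    if (value === undefined || value === null) continue; // drop empty fields
    if (Array.isArray(value)) {
      // JSON-style double-quoted strings are also valid YAML scalars.
      out[key] = `[${value.map((v) => JSON.stringify(String(v))).join(", ")}]`;
    } else if (typeof value === "object") {
      out[key] = JSON.stringify(value); // nested objects become one JSON string
    } else {
      out[key] = String(value); // numbers and booleans become plain strings
    }
  }
  return out;
}

const flat = toStringFrontmatter({
  title: "Weekly sync",
  tags: ["meeting", "q3"],
  draft: false,
});
console.log(flat);
```

Whether the writer then emits the tags value as a real YAML list or as a quoted string depends on how lib/write.mjs serializes values, so a round-trip test against the actual writer is still needed.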
Recommended approach (when needed)
Add an explicit stage so deterministic importers stay testable:
- Normalize (rules) — Map known vendor keys to SPEC fields (no model).
- Normalize (LLM) — Optional: one shot “produce YAML frontmatter + body only” with a fixed schema and validation; reject on parse failure.
- Validate — Reject or quarantine notes missing required fields for your policy (e.g. inbox contract §2.2).
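The rules and validate stages above can be sketched as two small pure functions, which keeps them as easy to test as the deterministic importers. Everything named here is hypothetical: the vendor keys in KEY_MAP, the function names, and the "title is required" policy are placeholders, not the fields SPEC.md actually mandates.

```javascript
// Hypothetical vendor-key map; real vendor keys and SPEC field names may differ.
const KEY_MAP = new Map([
  ["name", "title"],
  ["heading", "title"],
  ["labels", "tags"],
  ["keywords", "tags"],
]);

// Rules stage: rename known vendor keys and pass unknown keys through unchanged.
function normalizeByRules(raw) {
  const note = {};
  for (const [key, value] of Object.entries(raw)) {
    const target = KEY_MAP.get(key) ?? key;
    // First occurrence wins when two vendor keys map to the same field.
    if (!Object.hasOwn(note, target)) note[target] = value;
  }
  return note;
}

// Validate stage: the assumed policy is "title must be a non-empty string".
const REQUIRED_FIELDS = ["title"];

function validate(note) {
  const missing = REQUIRED_FIELDS.filter(
    (field) => typeof note[field] !== "string" || note[field].trim() === ""
  );
  return missing.length === 0
    ? { ok: true, note }
    : { ok: false, quarantine: true, missing };
}

const accepted = validate(
  normalizeByRules({ name: "Trip notes", labels: ["travel"], source: "vendor-x" })
);
const rejected = validate(normalizeByRules({ labels: ["orphan"] }));
console.log(accepted, rejected);
```

Returning a result object instead of throwing lets the caller decide between rejecting the note and moving it to quarantine.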
Implement as a separate subcommand or Hub action (e.g. knowtation import normalize <path> or “Normalize note” in UI), not hidden inside every importer.
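Inside such a standalone action, the "reject on parse failure" rule for the optional LLM stage might look like the sketch below. It assumes the model is asked to reply with frontmatter between --- delimiters followed by the body, and it uses a deliberately simple key: value line parser in place of a real YAML library. The function name and the two-field schema are hypothetical.

```javascript
// Assumed fixed schema for the model's reply; the real SPEC fields may differ.
const ALLOWED_KEYS = new Set(["title", "tags"]);

// Hypothetical strict parser: throws on anything that is not exactly
// "---\n<key: value lines>\n---\n<body>".
function parseModelOutput(text) {
  const match = /^---\r?\n([\s\S]*?)\r?\n---\r?\n?([\s\S]*)$/.exec(text);
  if (!match) throw new Error("reject: missing frontmatter delimiters");

  const frontmatter = {};
  for (const line of match[1].split(/\r?\n/)) {
    if (line.trim() === "") continue;
    const kv = /^([A-Za-z_][\w-]*):\s*(.*)$/.exec(line);
    if (!kv) throw new Error(`reject: unparseable frontmatter line: ${line}`);
    if (!ALLOWED_KEYS.has(kv[1])) throw new Error(`reject: key not in schema: ${kv[1]}`);
    frontmatter[kv[1]] = kv[2];
  }
  if (!frontmatter.title) throw new Error("reject: title is required");
  return { frontmatter, body: match[2] };
}

const parsed = parseModelOutput("---\ntitle: Trip notes\ntags: [travel]\n---\nBody text.\n");
console.log(parsed.frontmatter, parsed.body);
```

A reply that wraps the note in conversational text, or that adds keys outside the schema, fails fast instead of being written to the vault.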
Contract
- Input: Path to a note or a JSON file + target vault path.
- Output: Updated note or new note under inbox/ or staging, with machine-readable normalization_provenance (or equivalent) in frontmatter for audit.
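As a rough illustration of the output side of the contract, provenance can be attached as a single frontmatter value. Only the normalization_provenance key comes from the contract above. The record's inner fields (stage, source, normalized_at), the helper name, and the decision to store the record as a JSON string (so it fits string-only frontmatter) are assumptions.

```javascript
// Hypothetical provenance stamp; the fields inside the record are assumptions.
function withProvenance(frontmatter, { stage, source, now = new Date() }) {
  const provenance = { stage, source, normalized_at: now.toISOString() };
  // Stored as a JSON string so it survives a string-only frontmatter writer.
  return { ...frontmatter, normalization_provenance: JSON.stringify(provenance) };
}

const stamped = withProvenance(
  { title: "Trip notes" },
  { stage: "rules", source: "exports/trip.json", now: new Date("2024-01-01T00:00:00Z") }
);
console.log(stamped.normalization_provenance);
```

Passing the timestamp in as a parameter keeps the function deterministic in tests. An auditor can recover the record with JSON.parse on the stored value.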
Related
- IMPORT-EVALS.md — import goldens vs retrieval vs proposal eval.
- IMPORT-SOURCES.md — batch import source types.