SPEC.md
| 1 | # Knowtation — Specification |
| 2 | |
| 3 | This document is the **single source of truth** for data formats, contracts, and CLI behavior. Implementors, plugin authors, and agents should rely on it. Spec version aligns with package `version` (e.g. 0.1.x); CLI output shape and frontmatter schema are stable within the same major version. |
| 4 | |
| 5 | --- |
| 6 | |
| 7 | ## 0. Data ownership and vendor independence |
| 8 | |
| 9 | - **Vault is user-owned.** All content lives in the user’s vault (Markdown + frontmatter) on their machine or their chosen storage. There is no requirement to send data to a third-party vendor to use the tool. |
| 10 | - **Context, memory, and intention stay with the user.** Import from ChatGPT, Claude, Mem0, or other platforms brings data *into* the vault; the vault remains the source of truth. Users can switch LLMs or memory providers without losing their notation. |
| 11 | - **Modular backends.** Embedding provider, vector store, and optional memory layer are configurable and replaceable. The vault format does not depend on any specific vendor. Knowtation is designed to plug into any LLM or service via CLI or MCP — including OpenClaw, DeerFlow, Cursor, Claude Code, and any other agent runtime that can invoke a CLI or speak MCP. |
| 12 | - **Portability.** The vault directory is the portable backup. Export produces standard Markdown (and optional formats). Users own their data and can move or replicate it without lock-in. |
| 13 | |
| 14 | --- |
| 15 | |
| 16 | ## 1. Vault format and layout |
| 17 | |
| 18 | - **Format:** Markdown files with optional YAML frontmatter. UTF-8. Line endings: LF preferred; CRLF accepted. |
| 19 | - **Root:** One vault root directory (config: `vault_path` or env `KNOWTATION_VAULT_PATH`). |
| 20 | - **Layout (canonical folders):** |
| 21 |   - `inbox/` — Raw captures from message interfaces. All inbox notes MUST conform to the [Inbox note frontmatter](#22-inbox-note-frontmatter-message-interface-output) contract (§2.2). |
| 22 |   - `captures/` — Processed or moved captures (optional frontmatter). |
| 23 |   - `projects/<project-slug>/` — Per-project notes; may contain `inbox/` for project-specific capture. |
| 24 |   - `areas/` — Evergreen themes. |
| 25 |   - `archive/`, `media/audio/`, `media/video/`, `templates/`, `meta/` — Optional; semantics are user-defined. |
| 26 | - **Project slug and tag normalization:** Lowercase; only `a-z0-9` and hyphen `-`; no leading/trailing hyphen. Examples: `born-free`, `dreambolt-network`. Tags in frontmatter use the same normalization when used for filtering. |
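The normalization rule above can be sketched as follows. This is a minimal illustration, not the project's implementation: the function names are hypothetical, and replacing each run of disallowed characters with a single hyphen is an assumption (the spec only states which characters are allowed).

```python
import re


def normalize_slug(raw: str) -> str:
    """Return a slug using only a-z, 0-9 and '-', with no leading/trailing hyphen."""
    lowered = raw.strip().lower()
    # Assumption (not in the spec): each run of disallowed characters becomes one hyphen.
    hyphenated = re.sub(r"[^a-z0-9]+", "-", lowered)
    return hyphenated.strip("-")


def is_valid_slug(slug: str) -> bool:
    """Check the rule as written: allowed characters only, no hyphen at either end."""
    return re.fullmatch(r"[a-z0-9](?:[a-z0-9-]*[a-z0-9])?", slug) is not None
```

Under these assumptions, `normalize_slug("Born Free")` yields `born-free`, matching the example slug in the rule.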
| 27 | |
| 28 | --- |
| 29 | |
| 30 | ## 2. Frontmatter schema |
| 31 | |
| 32 | ### 2.1 Common (any note) |
| 33 | |
| 34 | | Field | Type | Required | Description | |
| 35 | |----------|----------------|----------|-------------| |
| 36 | | `title` | string | No | Display title. | |
| 37 | | `project`| string | No | Project slug (normalized). Inferred from the path when the note is under `vault/projects/<slug>/`. |
| 38 | | `tags` | string[] or string | No | Tags (normalized). Can be YAML list or comma-separated string. | |
| 39 | | `date` | ISO 8601 or YYYY-MM-DD | No | Creation or capture date. | |
| 40 | | `updated`| ISO 8601 or YYYY-MM-DD | No | Last update. | |
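As an illustration of the `tags` and `project` rows, the sketch below accepts tags as either a YAML list or a comma-separated string and infers the project from a vault-relative path. The helper names are hypothetical. Dropping duplicate tags and letting an explicit `project` field take precedence over the path are assumptions, not requirements of this spec.

```python
import re
from pathlib import PurePosixPath


def parse_tags(value) -> list:
    """Accept a list or a comma-separated string and return normalized tags."""
    if value is None:
        return []
    items = value.split(",") if isinstance(value, str) else list(value)
    tags = []
    for item in items:
        slug = re.sub(r"[^a-z0-9]+", "-", str(item).strip().lower()).strip("-")
        # Assumption: empty entries and duplicates are dropped.
        if slug and slug not in tags:
            tags.append(slug)
    return tags


def infer_project(vault_relative_path: str, frontmatter: dict):
    """Return the project slug from frontmatter, else from projects/<slug>/..."""
    # Assumption: an explicit frontmatter value wins over the path.
    if frontmatter.get("project"):
        return frontmatter["project"]
    parts = PurePosixPath(vault_relative_path).parts
    if len(parts) >= 3 and parts[0] == "projects":
        return parts[1]
    return None
```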
| 41 | |
| 42 | ### 2.2 Inbox note frontmatter (message-interface output) |
| 43 | |
| 44 | Notes written by a message-interface plugin into `vault/inbox/` or `vault/projects/<project>/inbox/` MUST include: |
| 45 | |
| 46 | | Field | Type | Required | Description | |
| 47 | |------------|--------|----------|-------------| |
| 48 | | `source` | string | Yes | Identifier of the interface (e.g. `telegram`, `slack`, `jira`). | |
| 49 | | `date` | string | Yes | ISO 8601 or YYYY-MM-DD. | |
| 50 | | `source_id`| string | Recommended | External id (e.g. message id, ticket key) for deduplication. If present, plugins may skip or update when the same source_id is seen again. | |
| 51 | | `project` | string | No | Project slug when writing to global inbox. | |
| 52 | | `tags` | string[] or string | No | Tags. | |
| 53 | |
| 54 | All other common frontmatter fields are optional for inbox notes. |
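A plugin author might check a note against the inbox contract before writing it. The sketch below is one possible validator, assuming the frontmatter has already been parsed into a dictionary; the function names and message strings are hypothetical.

```python
from datetime import date, datetime


def _is_valid_date(value) -> bool:
    """Accept date objects (some YAML parsers produce them) or ISO 8601 strings."""
    if isinstance(value, (date, datetime)):
        return True
    if not isinstance(value, str):
        return False
    try:
        # A trailing "Z" is rewritten because older Python versions reject it.
        datetime.fromisoformat(value.strip().replace("Z", "+00:00"))
        return True
    except ValueError:
        return False


def validate_inbox_frontmatter(frontmatter: dict) -> dict:
    """Return errors for required fields and warnings for recommended ones."""
    errors, warnings = [], []
    source = frontmatter.get("source")
    if not isinstance(source, str) or not source.strip():
        errors.append("source: required non-empty string")
    if not _is_valid_date(frontmatter.get("date")):
        errors.append("date: required, ISO 8601 or YYYY-MM-DD")
    if not frontmatter.get("source_id"):
        warnings.append("source_id: recommended for deduplication")
    return {"errors": errors, "warnings": warnings}
```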
| 55 | |
| 56 | ### 2.3 Optional frontmatter for intention and temporal (see docs/INTENTION-AND-TEMPORAL.md) |
| 57 | |
| 58 | For temporal sequence, causation, hierarchical memory, and state compression the following are **optional**; notes are valid without them. |
| 59 | |
| 60 | | Field | Type | Description | |
| 61 | |-------|------|-------------| |
| 62 | | `follows` | string or string[] | Vault-relative path(s) of note(s) this one follows (causal or sequential). | |
| 63 | | `causal_chain_id` | string | Id grouping notes in the same causal chain. | |
| 64 | | `entity` | string or string[] | Entity labels (person, project, concept) for relational queries. | |
| 65 | | `episode_id` | string | Id grouping notes into an episode/session (hierarchical memory). | |
| 66 | | `summarizes` | string or string[] | Path(s) of note(s) this note summarizes (state compression). | |
| 67 | | `summarizes_range` | string | e.g. `2025-01/2025-03` — this note summarizes that range. | |
| 68 | | `state_snapshot` | boolean | If true, this note is a state snapshot at its `date`. | |
| 69 | |
| 70 | `causal_chain_id` and `entity` use the same slug normalization as project and tag. The CLI may support `--since`, `--until`, `--chain`, `--entity`, `--episode`, and `--order` when these fields are present; see docs/INTENTION-AND-TEMPORAL.md. |
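To show how `follows` could be consumed, the sketch below orders a set of notes so that each note appears after the notes it follows. It is an illustration only: the function names are hypothetical, the input is assumed to be a mapping from vault-relative path to parsed frontmatter, and referenced paths that are not in the input are simply skipped.

```python
from graphlib import TopologicalSorter


def as_list(value) -> list:
    """Normalize a 'string or string[]' field to a list."""
    if value is None:
        return []
    return [value] if isinstance(value, str) else list(value)


def order_by_follows(notes: dict) -> list:
    """Return note paths so that every note comes after the notes it follows.

    Raises graphlib.CycleError if the `follows` links form a cycle.
    """
    graph = {path: set(as_list(fm.get("follows"))) for path, fm in notes.items()}
    ordered = TopologicalSorter(graph).static_order()
    # Assumption: paths referenced by `follows` but absent from `notes` are ignored.
    return [path for path in ordered if path in notes]
```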
| 71 | |
| 72 | ### 2.4 Mist attachment IDs (optional) |
| 73 | |
| 74 | Import pipelines and note-creation tools MAY record the original source blob in Mist (Muse's content-addressed object store) and stamp the returned ID in the note's frontmatter. This enables `muse code impact` to trace which notes depend on a given binary blob (PDF, audio, screenshot, etc.). |
| 75 | |
| 76 | | Field | Type | Description | |
| 77 | |-------|------|-------------| |
| 78 | | `attachments` | string[] | List of mist blob IDs. Each ID is exactly 12 characters from the [base58](https://en.bitcoin.it/wiki/Base58Check_encoding) alphabet (`123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz`), derived from the first 12 characters of the base58-encoded SHA-256 hash of the blob content. | |
| 79 | |
| 80 | **Validation:** parsers MUST NOT require `attachments`; notes without it are fully valid. When present, each entry MUST match the 12-character base58 pattern. |
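The validation rule and the ID description can be sketched as below. The 12-character pattern follows the alphabet given in the table. The derivation is an assumption: this sketch uses Bitcoin-style base58 (leading zero bytes become `1`), which may differ from what Mist actually does, so treat `mist_id_for` as illustrative and rely on `muse mist push` for real IDs. All function names are hypothetical.

```python
import hashlib
import re

BASE58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"
MIST_ID_RE = re.compile("[" + BASE58_ALPHABET + "]{12}")


def base58_encode(data: bytes) -> str:
    """Encode bytes as base58 (assumption: Bitcoin-style leading-zero handling)."""
    number = int.from_bytes(data, "big")
    encoded = ""
    while number > 0:
        number, remainder = divmod(number, 58)
        encoded = BASE58_ALPHABET[remainder] + encoded
    leading_zeros = len(data) - len(data.lstrip(b"\0"))
    return "1" * leading_zeros + encoded


def mist_id_for(blob: bytes) -> str:
    """First 12 characters of the base58-encoded SHA-256 digest of the blob."""
    return base58_encode(hashlib.sha256(blob).digest())[:12]


def invalid_attachments(frontmatter: dict) -> list:
    """Return entries that violate the pattern; a missing field is valid."""
    attachments = frontmatter.get("attachments")
    if attachments is None:
        return []
    if not isinstance(attachments, list):
        return [attachments]
    return [a for a in attachments
            if not (isinstance(a, str) and MIST_ID_RE.fullmatch(a))]
```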
| 81 | |
| 82 | **Mist push workflow (importers):** |
| 83 | 1. `muse mist push <blob>` → returns a mist ID (e.g. `4a7Jz2Xn9Kqw`). |
| 84 | 2. The importer stamps the ID into the note's `attachments` list. |
| 85 | 3. `muse code impact <mist-id>` returns all notes that reference the blob. |
| 86 | |
| 87 | **Schema version:** this field is part of frontmatter schema version 2. Legacy notes (schema v1) are automatically migrated by `migrate_frontmatter`. |
| 88 | |
| 89 | ### 2.5 Reserved for Phase 12 (blockchain and agent payments) |
| 90 | |
| 91 | The following frontmatter fields are **reserved** for a future phase. Notes remain valid without them; parsers and indexers MUST NOT require them. When implemented, they will be optional and used for payment attribution and on-chain provenance. |
| 92 | |
| 93 | | Field | Type | Description (when implemented) | |
| 94 | |-------|------|--------------------------------| |
| 95 | | `network` | string | Blockchain or network identifier. | |
| 96 | | `wallet_address` | string | Address used for payment or attribution. | |
| 97 | | `tx_hash` | string | Transaction hash (e.g. payment or attestation). | |
| 98 | | `payment_status` | string | Status of payment (e.g. pending, completed). | |
| 99 | |
| 100 | See docs/BLOCKCHAIN-AND-AGENT-PAYMENTS.md for the Phase 12 scope. No CLI or Hub behavior depends on these fields until that phase. |
| 101 | |
| 102 | --- |
| 103 | |
| 104 | ## 3. Message-interface (capture plugin) contract |
| 105 | |
| 106 | - **Purpose:** Any adapter (Telegram, WhatsApp, Discord, JIRA, Slack, Teams, email, webhooks) that ingests messages or events into the vault. |
| 107 | - **Output location:** One of: |
| 108 |   - `vault/inbox/<filename>.md` |
| 109 |   - `vault/projects/<project-slug>/inbox/<filename>.md` |
| 110 | - **Filename:** Safe for the filesystem; a date-based name or `{source}_{source_id}.md` is recommended for dedup. Uniqueness is the plugin's responsibility. |
| 111 | - **Content:** Valid Markdown; frontmatter MUST satisfy [Inbox note frontmatter](#22-inbox-note-frontmatter-message-interface-output) (§2.2). Body = message content or transcript. |
| 112 | - **Idempotency:** If a plugin supports dedup, it should use `source_id` in frontmatter and overwrite or skip when a note with the same `source_id` (and same `source`) already exists. Not required by the spec, but recommended. |
| 113 | - **Discovery:** No built-in plugin discovery. User runs plugins via cron, scheduler, or manual invocation; config can list which capture scripts or services run. Plugins are standalone scripts or services that write files into the vault per this contract. |
| 114 | - **Webhooks:** Message interfaces may expose HTTP endpoints (e.g. Slack/Discord webhooks) that receive events and write notes; contract is the same. |
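A capture plugin that follows this contract can be quite small. The sketch below writes one note using the `{source}_{source_id}.md` naming and skips the write when that file already exists. It is a minimal illustration under stated assumptions: the function name is hypothetical, unsafe filename characters are replaced with hyphens, and frontmatter values are emitted as JSON-quoted strings (which YAML parsers generally accept as double-quoted scalars).

```python
import json
from pathlib import Path
from typing import Optional


def write_inbox_note(vault_root, source, source_id, date, body,
                     project=None) -> Optional[Path]:
    """Write one inbox note; return its path, or None if it was a duplicate."""
    root = Path(vault_root)
    inbox = (root / "projects" / project / "inbox") if project else (root / "inbox")
    inbox.mkdir(parents=True, exist_ok=True)

    # Assumption: anything outside letters, digits, '-' and '_' becomes '-'.
    safe_id = "".join(c if c.isalnum() or c in "-_" else "-" for c in source_id)
    target = inbox / f"{source}_{safe_id}.md"
    if target.exists():
        return None  # same source + source_id already captured: skip

    lines = [
        "---",
        f"source: {json.dumps(source)}",
        f"source_id: {json.dumps(source_id)}",
        f"date: {json.dumps(date)}",
        "---",
        "",
        body.rstrip("\n"),
        "",
    ]
    # UTF-8 with LF line endings, as preferred by the vault format.
    with open(target, "w", encoding="utf-8", newline="\n") as handle:
        handle.write("\n".join(lines))
    return target
```

Skipping on an existing file is only one of the two options the contract allows; a plugin could equally overwrite the note to pick up edits from the source.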
| 115 | |
| 116 | --- |
| 117 | |
| 118 | ## 4. CLI |
| 119 | |
| 120 | All commands support global `--json` for machine-readable output. Paths are vault-relative unless stated otherwise. |
| 121 | |
| 122 | ### 4.1 Commands and flags |
| 123 | |
| 124 | | Command | Description | Flags (command-specific) | Notes | |
| 125 | |--------|-------------|---------------------------|-------| |
| 126 | | `search <query>` | **Semantic** search over the indexed vault (default), or **keyword** search (`--keyword`: case-insensitive match in path, body, and selected frontmatter strings). | `--folder <path>`, `--project <slug>`, `--tag <tag>`, `--limit <n>` (default 10), `--since <date>`, `--until <date>`, `--chain <id>`, `--entity <id>`, `--episode <id>`, `--content-scope all\|notes\|approval_logs`, `--order date\|date-asc`, `--fields path\|path+snippet\|full` (default path+snippet), `--snippet-chars <n>`, `--count-only`, `--keyword`, `--match phrase\|all-terms` (with `--keyword`), `--json` | Semantic returns ranked chunks by embedding similarity; keyword returns substring / all-terms matches. Time, causal, entity, episode, and content-scope filters apply to both where implemented. See docs/INTENTION-AND-TEMPORAL.md; token levers: docs/RETRIEVAL-AND-CLI-REFERENCE.md. | |
| 127 | | `get-note <path>` | Return full content of one note (frontmatter + body), or a subset. | `--body-only`, `--frontmatter-only`, `--json` | Path vault-relative. Omit both body/frontmatter flags for full content. | |
| 128 | | `get-note-outline <path>` | Return a derived Markdown heading outline for one note. | `--json` required | Path vault-relative. Returns heading metadata only; never returns body, snippets, full frontmatter, or absolute filesystem paths. | |
| 129 | | `get-document-tree <path>` | Return a derived Markdown heading tree for one note. | `--json` required | Path vault-relative. Returns nested heading metadata only; never returns body, snippets, full frontmatter, summaries, vectors, labels, line ranges, or absolute filesystem paths. | |
| 130 | | `get-metadata-facets <path>` | Return bounded body-free metadata facets for one note. | `--json` required | Path vault-relative. Returns canonical frontmatter facets and deterministic path-inferred fields only; never returns body, snippets, full frontmatter, absolute filesystem paths, label text, OCR text, PageIndex output, summaries, vectors, media metadata, or memory events. | |
| 131 | | `list-notes` | List notes with optional filters. | `--folder <path>`, `--project <slug>`, `--tag <tag>`, `--limit <n>`, `--offset <n>`, `--since <date>`, `--until <date>`, `--chain <id>`, `--entity <id>`, `--episode <id>`, `--order date\|date-asc`, `--fields path\|path+metadata\|full` (default path+metadata), `--count-only`, `--json` | Order: by date (newest first) or by path; time and causal filters optional. Token levers: see docs/RETRIEVAL-AND-CLI-REFERENCE.md. | |
| 132 | | `index` | Re-run indexer: vault → chunk → embed → vector store. | (none) | Reads vault and config; writes to vector store and optional sidecar (e.g. docid → path map). | |
| 133 | | `write <path> [content]` | Create or overwrite a note. | `--stdin`, `--frontmatter k=v [k2=v2 ...]`, `--append`, `--json` | If `--stdin`, body from stdin. Frontmatter merged with existing or created. Inbox writes allowed; for non-inbox, AIR may be required (see Memory and AIR). | |
| 134 | | `export <path-or-query> <output-dir-or-file>` | Export note(s) to a format (e.g. Markdown, HTML) or directory. | `--format <md\|html\|...>`, `--project <slug>`, `--json` | Provenance (source_notes) recorded; AIR required when enabled. |
| 135 | | `import <source-type> <input>` | Ingest from external platform or file into vault. | `--project <slug>`, `--output-dir <path>`, `--tags t1,t2`, `--dry-run`, `--json` | See **docs/IMPORT-SOURCES.md** and **docs/IMPORT-MANUAL-CHECKLIST.md**. Allowed `source_type` strings are defined in **lib/import-source-types.mjs** (CLI, Hub, MCP must stay aligned). | |
| 136 | |
| 137 | ### 4.2 JSON output shape (stable) |
| 138 | |
| 139 | - **search (--json):** |
| 140 |   - Default or `--fields path+snippet`: `{ "results": [ { "path": "...", "snippet": "...", "score": number, "project": "...", "tags": [] } ], "query": "...", "mode": "semantic" | "keyword" }`. Implementations may omit `"mode"` for semantic-only CLIs; Hub and current repo include `mode` for both paths. Snippet length may be capped by `--snippet-chars <n>`. |
| 141 |   - `--fields path`: same but each result has only `path`, `score`, and optionally `project`/`tags`; no `snippet`. |
| 142 |   - `--fields full`: each result includes full note content (frontmatter + body) for that hit. |
| 143 |   - `--count-only`: `{ "count": number, "query": "..." }`; no `results` array (or empty). Implementations may optionally include `"paths": [ ... ]` for first N paths when useful. |
| 144 | - **get-note (--json):** |
| 145 |   - Default: `{ "path": "...", "frontmatter": { ... }, "body": "..." }`. |
| 146 |   - `--body-only`: `{ "path": "...", "body": "..." }` (no frontmatter). |
| 147 |   - `--frontmatter-only`: `{ "path": "...", "frontmatter": { ... } }` (no body). |
| 148 | - **get-note-outline (--json):** |
| 149 |   - `{ "schema": "knowtation.note_outline/v1", "path": "...", "title": "..." | null, "headings": [ { "level": 1, "text": "...", "id": "h1-example-0001" } ], "truncated": false }`. |
| 150 |   - This response MUST NOT include note body, snippets, full frontmatter, source excerpts, provider keys, absolute filesystem paths, raw HTML rendering, byte offsets, exact line ranges, section body lengths, LLM summaries, vector scores, or memory events. |
| 151 | - **get-document-tree (--json):** |
| 152 |   - `{ "schema": "knowtation.document_tree/v0", "path": "...", "title": "..." | null, "root": { "children": [ { "id": "h1-example-0001", "level": 1, "text": "...", "children": [] } ] }, "truncated": false }`. |
| 153 |   - This response MUST NOT include note body, snippets, full frontmatter, source excerpts, provider keys, absolute filesystem paths, raw HTML rendering, byte offsets, exact line ranges, section body lengths, LLM summaries, vector scores, labels, metadata facets, or memory events. |
| 154 | - **get-metadata-facets (--json):** |
| 155 |   - `{ "schema": "knowtation.metadata_facets/v0", "path": "...", "facets": { "project": "..." | null, "tags": [], "date": "..." | null, "updated": "..." | null, "causal_chain_id": "..." | null, "entity": [], "episode_id": "..." | null }, "inferred": { "folder": "..." | null, "source_type": null }, "truncated": false }`. |
| 156 |   - This response MUST NOT include note body, snippets, full frontmatter, source excerpts, provider keys, absolute filesystem paths, raw HTML rendering, byte offsets, exact line ranges, section body lengths, LLM summaries, vector scores, labels, OCR text, PageIndex output, media metadata, memory events, or MCP resource URIs. |
| 157 | - **list-notes (--json):** |
| 158 |   - Default or `--fields path+metadata`: `{ "notes": [ { "path": "...", "project": "...", "tags": [], "date": "..." } ], "total": number }`. |
| 159 |   - `--fields path`: notes array has only `path` per entry (and `total`). |
| 160 |   - `--fields full`: each note includes full frontmatter and body. |
| 161 |   - `--count-only`: `{ "total": number }`; no `notes` array (or empty). |
| 162 | - **write (--json):** `{ "path": "...", "written": true }` |
| 163 | - **export (--json):** `{ "exported": [ { "path": "...", "output": "..." } ], "provenance": "..." }` |
| 164 | - **import (--json):** `{ "imported": [ { "path": "...", "source_id": "..." } ], "count": number }` |
| 165 | |
| 166 | On error, JSON output (when `--json` was passed): `{ "error": "message", "code": "ERROR_CODE" }`. |
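A consumer of `search --json` might decode the output along these lines. This is a sketch against the shapes listed above, not a client library: the `SearchHit` and `KnowtationError` names are hypothetical, and falling back to `"UNKNOWN"` when an error payload has no `code` is an assumption.

```python
import json
from dataclasses import dataclass, field
from typing import List, Optional


class KnowtationError(Exception):
    """Raised when the CLI output carries the documented error shape."""

    def __init__(self, message: str, code: str):
        super().__init__(message)
        self.code = code


@dataclass
class SearchHit:
    path: str
    score: Optional[float] = None
    snippet: Optional[str] = None   # absent with --fields path
    project: Optional[str] = None
    tags: List[str] = field(default_factory=list)


def parse_search_output(stdout: str) -> List[SearchHit]:
    """Turn `search --json` output into SearchHit objects."""
    payload = json.loads(stdout)
    if "error" in payload:
        # Assumption: a missing "code" is reported as "UNKNOWN".
        raise KnowtationError(payload["error"], payload.get("code", "UNKNOWN"))
    # --count-only output has no "results" array, so this yields an empty list.
    return [
        SearchHit(
            path=item["path"],
            score=item.get("score"),
            snippet=item.get("snippet"),
            project=item.get("project"),
            tags=item.get("tags", []),
        )
        for item in payload.get("results", [])
    ]
```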
| 167 | |
| 168 | ### 4.3 Exit codes |
| 169 | |
| 170 | | Code | Meaning | |
| 171 | |------|--------| |
| 172 | | 0 | Success. | |
| 173 | | 1 | Usage error (missing args, unknown command, invalid options). | |
| 174 | | 2 | Runtime error (vault not found, vector store unreachable, write failed, etc.). | |
| 175 | |
| 176 | When `--json` is used and an error occurs, the error JSON is written to stdout or stderr (the implementation may choose which), and the exit code is 1 or 2 as above. |
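Because the error JSON may arrive on either stream, a caller has to check both. The sketch below shows one way to turn an exit code plus captured output into a readable message; the function names and the message format are hypothetical, and the `VAULT_NOT_FOUND` code in the usage note is an invented example, not a code defined by this spec.

```python
import json

EXIT_MEANINGS = {0: "success", 1: "usage error", 2: "runtime error"}


def find_error_payload(stdout: str, stderr: str):
    """Return the first stream content that parses as the documented error shape."""
    for stream in (stdout, stderr):
        try:
            payload = json.loads(stream)
        except (json.JSONDecodeError, TypeError):
            continue
        if isinstance(payload, dict) and "error" in payload:
            return payload
    return None


def describe_failure(exit_code: int, stdout: str, stderr: str) -> str:
    """Summarize a failed CLI call using the exit code and any error JSON."""
    kind = EXIT_MEANINGS.get(exit_code, "unknown exit code")
    payload = find_error_payload(stdout, stderr)
    if payload is not None:
        return f"{kind}: {payload['error']} [{payload.get('code')}]"
    return f"{kind}: {stderr.strip() or 'no details'}"
```

For example, exit code 2 with `{"error": "vault not found", "code": "VAULT_NOT_FOUND"}` on stderr would be summarized as `runtime error: vault not found [VAULT_NOT_FOUND]`.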
| 177 | |
| 178 | ### 4.4 Config and environment |
| 179 | |
| 180 | The CLI and indexer read environment variables first, then `config/local.yaml`; an env value overrides the matching config key. |
| 181 | |
| 182 | | Key / Env | Type | Description | |
| 183 | |-----------|------|-------------| |
| 184 | | `vault_path` / `KNOWTATION_VAULT_PATH` | string | Absolute path to vault root. Required. | |
| 185 | | `qdrant_url` / `QDRANT_URL` | string | Qdrant base URL (e.g. http://localhost:6333). Optional if using sqlite-vec. | |
| 186 | | `vector_store` | `qdrant` \| `sqlite-vec` | Backend. Default implementation-defined. | |
| 187 | | `data_dir` / `KNOWTATION_DATA_DIR` | string | Directory for sqlite-vec DB, sidecar index files. Default: `data/` under project root. | |
| 188 | | `embedding.provider` | string | e.g. `ollama`, `openai`. | |
| 189 | | `embedding.model` | string | Model name. | |
| 190 | | `memory.enabled` | boolean | Enable memory layer. | |
| 191 | | `memory.provider` | string | `file` (default), `vector`, or `mem0`. | |
| 192 | | `memory.url` / `KNOWTATION_MEMORY_URL` | string | Optional endpoint for memory service. | |
| 193 | | `air.enabled` | boolean | Require AIR attestation for protected operations. | |
| 194 | | `air.endpoint` / `KNOWTATION_AIR_ENDPOINT` | string | Optional AIR service URL. | |
| 195 | |
| 196 | No secrets in config; use env for API keys (e.g. `OPENAI_API_KEY`). Do not commit `config/local.yaml`. |
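The precedence of environment variables over `config/local.yaml` can be sketched as below. For brevity the file config is assumed to be already loaded into a flat dictionary with dotted keys (for example `"memory.url"`); how the real loader represents nested YAML is not stated here. The function name is hypothetical.

```python
# Keys from the table above that have an environment variable counterpart.
ENV_OVERRIDES = {
    "vault_path": "KNOWTATION_VAULT_PATH",
    "qdrant_url": "QDRANT_URL",
    "data_dir": "KNOWTATION_DATA_DIR",
    "memory.url": "KNOWTATION_MEMORY_URL",
    "air.endpoint": "KNOWTATION_AIR_ENDPOINT",
}


def resolve_config(file_config: dict, env: dict) -> dict:
    """Merge file config with env overrides; env wins when set and non-empty."""
    resolved = dict(file_config)  # assumption: flat dict with dotted keys
    for key, env_name in ENV_OVERRIDES.items():
        value = env.get(env_name)
        if value:
            resolved[key] = value
    if not resolved.get("vault_path"):
        raise ValueError("vault_path is required (config or KNOWTATION_VAULT_PATH)")
    return resolved
```

In practice `env` would be `os.environ`; passing it in explicitly keeps the function easy to test.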
| 197 | |
| 198 | --- |
| 199 | |
| 200 | ## 5. Indexer and chunk metadata |
| 201 | |
| 202 | - **Input:** All Markdown under vault root (respecting optional ignore patterns, e.g. `templates/`, `meta/`). **Approval audit notes** written by the Hub on proposal approve live under vault-relative `approvals/` (frontmatter `kind: approval_log`) and are indexed like other notes unless a deployment adds `approvals` to `ignore`. |
| 203 | - **Chunking:** Size and overlap are implementation-defined; typical 256–512 tokens with overlap. Each chunk MUST carry metadata: `path` (vault-relative), `project` (from path or frontmatter), `tags` (array from frontmatter). Optional: `date`, `source`. |
| 204 | - **Embedding:** Per config (`embedding.provider`, `embedding.model`). Vectors stored in Qdrant or sqlite-vec with the same metadata so that `search --project` and `--tag` can filter at retrieval time (metadata filter or post-filter). |
| 205 | - **Idempotency:** Indexer should upsert by stable chunk id (e.g. path + chunk index or content hash) so re-runs do not duplicate points. |
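The chunking and idempotency bullets above can be illustrated with a small sketch. It makes several simplifying assumptions: chunks are measured in whitespace-separated words rather than tokens, the stable id is a UUIDv5 of `path#index` (one of the options the bullet mentions, path + chunk index), and `project` is taken from frontmatter only. All names are hypothetical.

```python
import uuid


def chunk_words(text: str, size: int = 300, overlap: int = 50) -> list:
    """Split text into overlapping chunks of up to `size` words."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be >= 0 and smaller than size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks


def chunk_id(path: str, index: int) -> str:
    """Deterministic id, so re-indexing upserts instead of duplicating."""
    return str(uuid.uuid5(uuid.NAMESPACE_URL, f"{path}#{index}"))


def build_points(path: str, body: str, frontmatter: dict,
                 size: int = 300, overlap: int = 50) -> list:
    """Attach the required metadata (path, project, tags) to every chunk."""
    tags = frontmatter.get("tags") or []
    if isinstance(tags, str):
        tags = [t.strip() for t in tags.split(",") if t.strip()]
    metadata = {"path": path, "project": frontmatter.get("project"), "tags": list(tags)}
    return [
        {"id": chunk_id(path, i), "text": chunk, "metadata": dict(metadata)}
        for i, chunk in enumerate(chunk_words(body, size, overlap))
    ]
```

An id based on path and index stays stable when content changes, so an edited note overwrites its old points; a content-hash id would instead leave stale points to clean up separately.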
| 206 | |
| 207 | --- |
| 208 | |
| 209 | ## 6. MCP server (optional) |
| 210 | |
| 211 | When an MCP server is provided, it MUST expose the same operations and semantics as the CLI for tools implemented on that transport: search, get-note, get-note-outline, get-document-tree, list-notes, index, write, export, import. `get-document-tree` / `get_document_tree` is implemented for local CLI, self-hosted MCP, and hosted MCP as a body-free derived heading tree. Same filters (folder, project, tag), same JSON shapes, same error behavior. MCP is a transport only; the spec is the CLI. |
| 212 | |
| 213 | --- |
| 214 | |
| 215 | ## 7. Memory and AIR integration points |
| 216 | |
| 217 | - **Memory:** Optional. When `memory.enabled` is true, a multi-tier memory layer captures events across CLI, MCP, and hosted surfaces: |
| 218 |   - **Providers:** `file` (default, JSONL append-only log + latest-value state overlay), `vector` (extends file with embedding-based semantic search), `mem0` (delegates to external Mem0 API). |
| 219 |   - **Storage:** Per-vault at `{data_dir}/memory/{vault_id}/`. Hosted: per-user + per-vault at `DATA_DIR/memory/{userId}/{vaultId}/`. |
| 220 |   - **Events captured:** `search`, `export`, `write`, `import`, `index`, `propose` (default). `agent_interaction`, `capture`, `error`, `session_summary` are opt-in via `memory.capture` config. `user` type is always available for manual/agent stores. |
| 221 |   - **CLI commands:** `memory query <key>`, `memory list`, `memory store`, `memory search`, `memory clear`, `memory export`, `memory stats`. |
| 222 |   - **MCP tools:** `memory_query`, `memory_store`, `memory_list`, `memory_search`, `memory_clear`. |
| 223 |   - **MCP resources:** `knowtation://memory/` (summary), `knowtation://memory/events`, `knowtation://memory/last_search`, `knowtation://memory/last_export`. |
| 224 |   - **Privacy:** Secret detection rejects data with sensitive key patterns. Configurable capture types. Retention limits via `memory.retention_days`. `memory clear` requires `--confirm`. |
| 225 | - **AIR:** Optional. If enabled, the following operations MUST obtain an attestation before proceeding: `write` (when path is outside inbox), `export`. Inbox writes are exempt. The attestation id (AIR id) MUST be logged or stored with the action (e.g. in a log file or in note frontmatter). Implementation may call `air.endpoint` or a local AIR flow. Memory writes can optionally carry an `air_id` for attested memory. |
| 226 | |
| 227 | - **When memory is used:** Typical points include storing the “last query + result set” for cross-session context, storing “provenance: these notes → this export” after an export, and on-demand access via a dedicated subcommand (e.g. `knowtation memory query "last export"`). The implementation chooses when to read and write memory; the spec only requires that, when `memory.enabled` is true, a memory backend is configured and used for these purposes. |
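The AIR rule in this section reduces to a small decision function. The sketch below is illustrative only: the function names are hypothetical, and treating `projects/<slug>/inbox/` as an inbox path (and therefore exempt) is an assumption based on the output locations listed in §3.

```python
from pathlib import PurePosixPath


def is_inbox_path(vault_relative_path: str) -> bool:
    """True for inbox/... and, by assumption, projects/<slug>/inbox/..."""
    parts = PurePosixPath(vault_relative_path).parts
    if len(parts) >= 2 and parts[0] == "inbox":
        return True
    return len(parts) >= 4 and parts[0] == "projects" and parts[2] == "inbox"


def requires_attestation(operation: str, path: str, air_enabled: bool) -> bool:
    """Decide whether an operation must obtain an AIR attestation first."""
    if not air_enabled:
        return False
    if operation == "export":
        return True
    if operation == "write":
        return not is_inbox_path(path)  # inbox writes are exempt
    return False
```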
| 229 | |
| 230 | --- |
| 231 | |
| 232 | ## 8. Backup and portability |
| 233 | |
| 234 | - **Vault:** The vault directory is the primary portable backup. Copy or sync the vault folder to back it up or move it to another machine. |
| 235 | - **Full backup:** Optionally include `data/` (vector store and sidecar files) and a copy of `config/local.yaml` (with secrets redacted if needed). No standard backup command is required; users may use git (recommended for vault) or filesystem backup. |
| 236 | - **Provenance vs Git (clarification):** **Provenance** = recording which notes were used for an export and, when AIR is enabled, which attestation authorized a write (traceability of outputs). **Vault under git** = storing the vault folder in a Git repo so you have version history and audit trail of note changes; the inbox remains a folder inside the vault (file-based), not "Git as the inbox." |
| 237 | |
| 238 | --- |
| 239 | |
| 240 | ## 9. Versioning and compatibility |
| 241 | |
| 242 | - **Spec version:** Tied to package version (e.g. in `package.json`). This SPEC applies to that version. |
| 243 | - **Stability:** Within a major version, frontmatter schema, CLI command set, and JSON output shapes are stable. New optional fields may be added; required fields are not removed. Minor versions may add optional flags or commands. |
| 244 | |
| 245 | --- |
| 246 | |
| 247 | ## 10. Use cases covered by this spec |
| 248 | |
| 249 | - Single vault, multiple projects (folders + project/tags); project- and tag-scoped search and list. |
| 250 | - Capture from many message interfaces (Telegram, WhatsApp, Discord, JIRA, Slack, webhooks, etc.) via a single inbox contract. |
| 251 | - Transcription → vault notes; indexer picks them up with metadata. |
| 252 | - Agents (Cursor, Claude Code, Windsurf, GNO, custom) run the CLI or MCP; SKILL.md describes when to use Knowtation. |
| 253 | - **Agent orchestration:** Multi-agent orchestration systems (e.g. [AgentCeption](https://github.com/cgcardona/agentception)) use Knowtation as a knowledge backend: agents read the vault (search, list-notes, get-note) for context and optionally write back plans or summaries. Both CLI (for agents in containers/worktrees) and MCP (for runtimes that speak MCP) are supported. See **docs/AGENT-ORCHESTRATION.md**. |
| 254 | - Sync across devices: vault on cloud drive (Dropbox, iCloud); no change to spec. |
| 255 | - Scheduled capture: user runs capture plugins via cron/scheduler; no built-in scheduler in spec. |
| 256 | - Provenance and governance: export and write (non-inbox) can record source_notes and AIR id; vault under git gives history. |
| 257 | - **Import from other platforms:** ChatGPT, Claude, Mem0, NotebookLM, Google Drive, MIF, generic Markdown, audio/video (see **docs/IMPORT-SOURCES.md**). Any external knowledge base or LLM memory can be brought into the vault and used like native content. |
| 258 | - **Any audio:** Smart glasses, wearables, past blogs/videos, recordings → transcribe and store as vault notes with `source` and `source_id`. |
| 259 | |
| 260 | ## 11. Import and ingestion from external sources |
| 261 | |
| 262 | - **Command:** `knowtation import <source-type> <input> [options]`. All importers write vault notes that satisfy §1–2 (frontmatter, project, tags). Origin is always traceable (`source`, `source_id`, `date`). |
| 263 | - **Source types:** `chatgpt-export`, `claude-export`, `mem0-export`, `notebooklm`, `gdrive`, `mif`, `markdown`, `audio`, `video`. Input is path (file/folder) or URI where applicable. Options: `--project`, `--output-dir`, `--tags`, `--dry-run`, `--json`. |
| 264 | - **Full definitions:** Input formats, output location, and idempotency per source type are in **docs/IMPORT-SOURCES.md**. Audio/video import uses the same transcription pipeline as capture; other LLM and KB imports map platform exports to one or more vault notes. |
| 265 | - **MIF:** Memory Interchange Format (`.memory.md` / `.memory.json`) is Obsidian-native; importer can copy as-is or normalize frontmatter for interop with other memory providers. |
| 266 | |
| 267 | --- |
| 268 | |
| 269 | ## 12. Extension points (without breaking the spec) |
| 270 | |
| 271 | The following can be added later as new subcommands or config options without changing existing contracts: |
| 272 | |
| 273 | - Bulk export or bulk tag. |
| 274 | - Template expansion (e.g. `write` from a vault template). |
| 275 | - Optional auth layer for shared vaults. |
| 276 | - Additional vector-store or embedding providers. |
| 277 | - New import source types. |
| 278 | - **Muse-style variation/review/commit:** optional layer where proposed vault changes (variations) are reviewed before being applied to the canonical vault, preserving context and intention. |
| 279 | - **Intention and temporal:** optional frontmatter and filters for temporal sequence, causation, hierarchical memory, state compression, and evals (§2.3). |
| 280 | - **Evals:** optional `knowtation eval` command and eval set format (TBD). |
| 281 | - **Retrieval and token cost:** specified in §4.1–4.2 and documented in **docs/RETRIEVAL-AND-CLI-REFERENCE.md** (`--fields`, `--snippet-chars`, `--count-only`, `--body-only`, `--frontmatter-only`). |
| 282 | - **Blockchain, wallets, and agent payments:** optional frontmatter (`network`, `wallet_address`, `tx_hash`, `payment_status`), CLI filters (`--network`, `--wallet`), and capture/import for on-chain activity; reserved so agents with wallet access can be supported without backtracking. See **docs/BLOCKCHAIN-AND-AGENT-PAYMENTS.md** and Phase 12 in IMPLEMENTATION-PLAN. |