Foundations — Muse Developer Docs

Object store

Every byte stored by Muse lives in the object store — a content-addressed key-value store keyed by sha256:<64-hex>. The algorithm prefix is part of the ID (not decoration), making the format self-describing and algorithm-agnostic. When a new hash function is needed, old and new IDs coexist without collision.

IDs are produced by content_hash() for structured data and blob_id() for raw bytes:

python muse.core.types

def content_hash(obj: JsonValue) -> str:
    """Canonical SHA-256 ID for any JSON-serializable value.

    Canonical form: json.dumps with sort_keys=True, separators=(",",":"),
    ensure_ascii=True, UTF-8 encoded — then SHA-256.
    Returns "sha256:<64-hex>" (always 71 chars).
    """

def blob_id(data: bytes) -> str:
    """SHA-256 of raw bytes. Same "sha256:<64-hex>" format."""

The canonical JSON form — sorted keys, no whitespace — means two writers constructing the same logical object always produce the same ID, regardless of insertion order or formatting. This is what makes content-addressed merges safe: identical content is identical, provably.
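The docstrings above describe the canonical form precisely enough to sketch it with the standard library. The following is a minimal illustration of that description, not the actual muse.core.types source; the `_sketch` names are hypothetical and exist only to keep the example separate from the real functions.

```python
import hashlib
import json


def content_hash_sketch(obj) -> str:
    """Hash a JSON-serializable value using the canonical form described above."""
    canonical = json.dumps(
        obj,
        sort_keys=True,          # key order never affects the ID
        separators=(",", ":"),   # no whitespace between tokens
        ensure_ascii=True,       # non-ASCII characters become \uXXXX escapes
    )
    return "sha256:" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def blob_id_sketch(data: bytes) -> str:
    """Hash raw bytes directly, with the same self-describing prefix."""
    return "sha256:" + hashlib.sha256(data).hexdigest()


# Two writers building the same logical object in a different key order
# arrive at the same ID.
first = content_hash_sketch({"b": 1, "a": [1, 2]})
second = content_hash_sketch({"a": [1, 2], "b": 1})
assert first == second
assert len(first) == 71
```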

Object types

Three distinct record types live in the object store:

Type	ID source	Storage path	Notes
Blob	SHA-256 of raw bytes	`objects/<algo>/<2-hex>/<62-hex>`	File contents; `blob <size>\0<bytes>` on disk
Snapshot	`content_hash(snapshot_dict)`	`objects/<algo>/<2-hex>/<62-hex>`	Path → blob-ID manifest; `snapshot <size>\0<json>` on disk
Commit	`content_hash(commit_dict)`	`objects/<algo>/<2-hex>/<62-hex>`	Snapshot ID + provenance + signature; `commit <size>\0<json>` on disk
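All three record types share the storage path pattern `objects/<algo>/<2-hex>/<62-hex>`, so the path can be derived from the ID alone. A minimal sketch of that mapping; `object_path` is a hypothetical helper name, and the only validation shown is the presence of the algorithm prefix.

```python
from pathlib import PurePosixPath


def object_path(object_id: str, root: str = "objects") -> PurePosixPath:
    """Map "<algo>:<hex>" to <root>/<algo>/<first 2 hex>/<remaining hex>."""
    algo, sep, digest = object_id.partition(":")
    if not sep or not algo or len(digest) <= 2:
        raise ValueError(f"not a prefixed object ID: {object_id!r}")
    return PurePosixPath(root) / algo / digest[:2] / digest[2:]


# Placeholder digest, 64 hex characters long, used only for illustration.
example_id = "sha256:" + "ab" + "c" * 62
print(object_path(example_id))
```

Because the algorithm prefix becomes a directory name, IDs from a different hash function would land in a sibling directory rather than colliding with existing objects.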

Object graph

The three types form a strict DAG. Arrows point from referencing object to referenced object — blobs are always leaves:

  refs/heads/dev  ──►  CommitRecord (sha256:9e21b8...)
                            │  parent_commit_id
                            ▼
                       CommitRecord (sha256:3f8a1c...)
                            │  snapshot_id
                            ▼
                       SnapshotRecord (sha256:04ee5e...)
                            │  manifest
                       ┌────┼────────────┐
                       ▼    ▼            ▼
                     Blob  Blob  ...  Blob
                    (sha256:a1b2c3...)  (sha256:d4e5f6...)

Objects are written atomically: mkstemp → write → fsync → os.replace. A partial write can never produce a readable but corrupt object. Objects are immutable once stored: substituting different content under an existing ID would require a second-preimage attack on SHA-256.
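The mkstemp → write → fsync → os.replace sequence can be sketched with the standard library as follows. This is an illustration of the pattern under the assumption that the temporary file is created in the destination directory (so the final rename stays on one filesystem); `write_object_atomically` is a hypothetical name, not Muse's own function.

```python
import os
import tempfile


def write_object_atomically(path: str, data: bytes) -> None:
    """Write data to path so readers see either nothing or the full content."""
    directory = os.path.dirname(path) or "."
    os.makedirs(directory, exist_ok=True)
    # Temp file lives next to the target so os.replace is a same-filesystem rename.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # force the bytes to stable storage first
        os.replace(tmp_path, path)  # atomic rename over the final name
    except BaseException:
        # On any failure, remove the temp file so no partial object is left behind.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

A crash before `os.replace` leaves only a temporary file under a different name, which is why a partially written object is never visible at its final path.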

CommitRecord

A commit is the top-level record in the Muse DAG. It points to exactly one snapshot and zero, one, or two parent commits (zero for the genesis commit, two for a merge).

python muse.core.commits — CommitRecord

@dataclass
class CommitRecord:
    # Core identity
    commit_id:          str           # sha256:<64-hex>
    branch:             str
    snapshot_id:        str           # sha256:<64-hex>
    message:            str
    committed_at:       datetime.datetime

    # Graph edges
    parent_commit_id:   str | None    # None for genesis commits
    parent2_commit_id:  str | None    # Set for merge commits (two-parent)

    # Authorship
    author:             str           # handle
    metadata:           dict

    # Structured delta (populated by domain plugin's diff())
    structured_delta:   StructuredDelta | None

    # Semantic versioning
    sem_ver_bump:       Literal["none", "patch", "minor", "major"]
    breaking_changes:   list[str]

    # Agent provenance — all empty strings for human commits
    agent_id:           str           # e.g. "claude-code"
    model_id:           str           # e.g. "claude-sonnet-4-6"
    toolchain_id:       str
    prompt_hash:        str           # sha256 of system prompt

    # Ed25519 signature
    signature:          str           # "ed25519:<base64url>"
    signer_public_key:  str           # "ed25519:<base64url>"
    signer_key_id:      str           # fingerprint of signing key

    # Labels, review metadata
    reviewed_by:        list[str]
    test_runs:          int
    labels:             list[str]
    status:             str
    notes:              list[str]
    score:              float | None

Agent commits

When an agent commits, it populates agent_id and model_id, and signs with its derived Ed25519 key. Empty provenance fields are themselves a signal that a human committed directly: the format encodes the distinction structurally rather than by convention.

bash

# Agent commit — full provenance chain
muse commit -m "feat: add rate limiting" \
  --agent-id claude-code \
  --model-id claude-sonnet-4-6 \
  --sign

# Human commit — no provenance flags
muse commit -m "chore: update config"

json muse read --json

{
  "commit_id":         "sha256:6ab243df7bdb...",
  "branch":            "dev",
  "snapshot_id":       "sha256:04ee5ecd07ec...",
  "message":           "feat: add rate limiting",
  "committed_at":      "2026-04-21T23:00:00Z",
  "parent_commit_id":  "sha256:4c9e7959...",
  "parent2_commit_id": null,
  "author":            "gabriel",
  "agent_id":          "claude-code",
  "model_id":          "claude-sonnet-4-6",
  "signature":         "ed25519:AAAA...",
  "signer_public_key": "ed25519:BBBB...",
  "sem_ver_bump":      "minor",
  "labels":            ["reviewed"],
  "score":             0.94
}
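Since human commits leave the provenance fields as empty strings, a consumer can classify a commit from the record alone. A minimal sketch, assuming a decoded commit dict shaped like the JSON above; `is_agent_commit` is a hypothetical helper, not part of Muse.

```python
# Provenance fields listed in CommitRecord; all are empty strings for human commits.
PROVENANCE_FIELDS = ("agent_id", "model_id", "toolchain_id", "prompt_hash")


def is_agent_commit(commit: dict) -> bool:
    """Return True when any agent provenance field is non-empty."""
    return any(commit.get(field) for field in PROVENANCE_FIELDS)


agent_commit = {"author": "gabriel", "agent_id": "claude-code", "model_id": "claude-sonnet-4-6"}
human_commit = {"author": "gabriel", "agent_id": "", "model_id": ""}
assert is_agent_commit(agent_commit)
assert not is_agent_commit(human_commit)
```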

SnapshotManifest

A snapshot is an immutable mapping of every tracked path to its blob ID at that point in time. It is the complete working tree — not a delta, not a patch. Diffing two snapshots is an O(|paths|) set operation with no chain of deltas to traverse.

python muse.core.snapshots — SnapshotRecord

@dataclass
class SnapshotRecord:
    snapshot_id:    str               # sha256:<64-hex>
    manifest:       dict[str, str]    # path → "sha256:<64-hex>"
    directories:    list[str]         # explicit empty directories
    created_at:     datetime.datetime
    note:           str
    schema_version: int

json muse read --json --manifest (excerpt)

{
  "schema_version": 1,
  "snapshot_id": "sha256:04ee5ecd...",
  "manifest": {
    "musehub/main.py":                  "sha256:a1b2c3...",
    "musehub/graph/dag.py":             "sha256:d4e5f6...",
    "musehub/graph/push_validator.py":  "sha256:7890ab..."
  },
  "directories": [],
  "created_at": "2026-04-21T23:00:00Z",
  "note": ""
}

Because a snapshot is just a flat map, any two snapshots can be diffed in a single pass over their paths: paths present in both with the same blob ID are unchanged; paths with different IDs are modified; paths only in one are added or removed. Domain plugins receive this pair as their base and target in diff(), then interpret the semantic meaning.
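The file-level comparison described here reduces to set operations on the two manifests. A minimal sketch, assuming each manifest is a plain path → blob-ID dict; `diff_manifests` is a hypothetical name, and the result keys simply reuse the files_added / files_modified / files_removed names from the muse read --json output.

```python
def diff_manifests(base: dict, target: dict) -> dict:
    """Classify every path as added, removed, or modified between two manifests."""
    base_paths, target_paths = set(base), set(target)
    return {
        "files_added": sorted(target_paths - base_paths),
        "files_removed": sorted(base_paths - target_paths),
        # Present on both sides, but pointing at a different blob ID.
        "files_modified": sorted(
            path for path in base_paths & target_paths if base[path] != target[path]
        ),
    }


# Short placeholder IDs are used for readability.
base = {"src/main.py": "sha256:aaa", "src/config.py": "sha256:bbb"}
target = {"src/main.py": "sha256:aaa", "src/config.py": "sha256:ccc", "src/new.py": "sha256:ddd"}
print(diff_manifests(base, target))
```

Paths with equal blob IDs on both sides fall out of every list, which is why unchanged files cost nothing beyond the dictionary lookup.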

muse read --json returns commit metadata and file-level changes (files_added, files_modified, files_removed) but not the full manifest. Add --manifest to get the complete path → object_id map for every tracked file.

Branch DAG

Branches are mutable named pointers to commit IDs, stored in .muse/refs/heads/<branch>. The commit graph is an immutable DAG; the branch pointer advances atomically when you commit or merge.

Merge commits have two parents: parent_commit_id (the branch being merged into) and parent2_commit_id (the branch being merged from). The three-way merge base is the lowest common ancestor of the two branch tips, found by walking both parent chains backward.
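One way to find a lowest common ancestor in a commit DAG is to intersect the ancestor sets of the two tips and then discard any common ancestor that is itself an ancestor of another one. The sketch below illustrates that idea on an in-memory graph (commit ID → list of parent IDs); it is not Muse's merge-base implementation, and the function names are hypothetical. In criss-cross histories the result can contain more than one commit.

```python
def ancestors(graph: dict, start: str) -> set:
    """Return start plus every commit reachable through parent edges."""
    seen, stack = set(), [start]
    while stack:
        commit_id = stack.pop()
        if commit_id in seen:
            continue
        seen.add(commit_id)
        stack.extend(graph.get(commit_id, ()))
    return seen


def merge_bases(graph: dict, tip_a: str, tip_b: str) -> set:
    """Common ancestors of both tips that are not ancestors of another common ancestor."""
    common = ancestors(graph, tip_a) & ancestors(graph, tip_b)
    dominated = set()
    for commit_id in common:
        for parent in graph.get(commit_id, ()):
            dominated |= ancestors(graph, parent)
    return common - dominated


# root <- x <- a1 <- a   and   x <- b1 <- b, where b also has root as a second parent.
graph = {"root": [], "x": ["root"], "a1": ["x"], "b1": ["x"], "a": ["a1"], "b": ["b1", "root"]}
assert merge_bases(graph, "a", "b") == {"x"}
```

A plain breadth-first search from one tip would return root here, because b reaches it in one hop; filtering out dominated ancestors is what selects x instead.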

bash branch lifecycle

# Create and switch in one command, with intent metadata
muse checkout -b task/rate-limiting \
  --intent "implement token bucket rate limiter" \
  --resumable

# Work, stage, commit
muse code add src/rate_limiter.py
muse commit -m "feat: token bucket rate limiter" \
  --agent-id claude-code --model-id claude-sonnet-4-6 --sign

# Merge back
muse checkout dev
muse merge task/rate-limiting     # three-way, harmony auto-resolves known conflicts
muse branch -d task/rate-limiting
muse push local dev

Branch flow

Branch	Role	Rule
main	Production	Tagged releases only; never direct-pushed
dev	Integration	Latest deliverable state
feat/*	Feature	Short-lived; one atomic task; hours not days
task/*	Agent task	Same as feat/*; carries --intent and --resumable
bugfix/*	Bug fix	From dev; merges into dev
hotfix/*	Hot fix	From main; merges into main AND dev

Resumable branches

Branches carry --intent (a free-text description of the task) and --resumable (a boolean signal that another agent may safely pick this up mid-flight). Both are stored in branch metadata, readable via muse branch --json, and surfaced in the coordination bus so orchestrators can assign in-progress work to idle agents.

json muse branch --json (excerpt)

[
  {
    "name":       "task/rate-limiting",
    "current":    false,
    "intent":     "implement token bucket rate limiter",
    "resumable":  true,
    "created_by": "claude-code",
    "commit_id":  "sha256:6ab243..."
  }
]
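An orchestrator looking for in-progress work could filter this output for resumable branches. A minimal sketch, assuming the JSON is an array of objects shaped like the excerpt above; `resumable_branches` is a hypothetical helper and the selection rule is only an illustration.

```python
import json


def resumable_branches(branch_json: str) -> list:
    """Return (name, intent) pairs for branches flagged as resumable."""
    branches = json.loads(branch_json)
    return [
        (branch["name"], branch.get("intent", ""))
        for branch in branches
        if branch.get("resumable")
    ]


sample = """
[
  {"name": "task/rate-limiting", "current": false,
   "intent": "implement token bucket rate limiter", "resumable": true},
  {"name": "dev", "current": true, "intent": "", "resumable": false}
]
"""
print(resumable_branches(sample))
```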

Round-trip walkthrough

This walkthrough traces a single file change from working tree all the way to a remote and back, showing the sha256: IDs at each step so you can see how objects, snapshots, commits, and branch refs compose.

Step 1 — stage

muse code add hashes each file into the object store and records the path → blob-ID mapping in the staging area.

bash

muse code add src/rate_limiter.py

staged  src/rate_limiter.py  sha256:f3a7b92c4de1…

The blob sha256:f3a7b92c4de1… is written to .muse/objects/sha256/f3/a7b92c4de1… immediately — it exists in the object store whether or not you ever commit.

Step 2 — commit

muse commit builds a SnapshotRecord from the staging area, hashes it to get a snapshot ID, then builds a CommitRecord referencing that snapshot and the current branch tip as parent.

bash

muse commit -m "feat: token bucket rate limiter" \
  --agent-id claude-code --model-id claude-sonnet-4-6 --sign

committed  sha256:9e21b8a4f273…
  snapshot   sha256:c8d5e1f09ab3…
  parent     sha256:4c9e7959beef…
  branch     task/rate-limiting
  signed     ed25519:AAAA…

Three objects from this change are now on disk (the blob was already written at stage time; the snapshot and commit are new):

ID	Type	Contents
`sha256:f3a7b92c…`	Blob	Raw bytes of `src/rate_limiter.py`
`sha256:c8d5e1f0…`	Snapshot	Full path → blob-ID map for the entire working tree
`sha256:9e21b8a4…`	Commit	Message, author, agent provenance, snapshot_id, parent_commit_id, signature

The branch ref .muse/refs/heads/task/rate-limiting is updated atomically to sha256:9e21b8a4f273….

Step 3 — push

muse push computes the set of objects the remote does not have, packs them into an MPack, and POSTs to the hub.

bash

muse push local task/rate-limiting

Pushing task/rate-limiting → local
  3 objects  (1 blob, 1 snapshot, 1 commit)
  ✓ sha256:f3a7b92c…  blob       4.2 kB
  ✓ sha256:c8d5e1f0…  snapshot   1.1 kB
  ✓ sha256:9e21b8a4…  commit     0.8 kB
  branch task/rate-limiting → sha256:9e21b8a4…
✔ pushed in 142 ms

The hub stores all three objects, verifies the Ed25519 signature, and advances the branch pointer in a single transaction.
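Computing "the set of objects the remote does not have" can be pictured as a graph walk from the branch tip that stops wherever the remote already holds an object. The sketch below uses in-memory dicts and short placeholder IDs; the function name, the data layout, and the pruning rule (a commit or snapshot the remote has is assumed to come with everything beneath it) are assumptions for illustration, not Muse's negotiation protocol.

```python
def objects_to_push(tip_commit_id: str, commits: dict, snapshots: dict, remote_has: set) -> set:
    """Collect commit, snapshot, and blob IDs reachable from the tip that the remote lacks."""
    missing = set()
    stack = [tip_commit_id]
    while stack:
        commit_id = stack.pop()
        if commit_id in remote_has or commit_id in missing:
            continue  # assumed: the remote has this commit's full history already
        missing.add(commit_id)
        commit = commits[commit_id]
        snapshot_id = commit["snapshot_id"]
        if snapshot_id not in remote_has:
            missing.add(snapshot_id)
            manifest = snapshots[snapshot_id]["manifest"]
            missing.update(blob for blob in manifest.values() if blob not in remote_has)
        for key in ("parent_commit_id", "parent2_commit_id"):
            parent = commit.get(key)
            if parent:
                stack.append(parent)
    return missing


commits = {"c2": {"snapshot_id": "s2", "parent_commit_id": "c1", "parent2_commit_id": None}}
snapshots = {"s2": {"manifest": {"src/rate_limiter.py": "b3", "src/main.py": "b1", "src/config.py": "b2"}}}
remote_has = {"c1", "s1", "b1", "b2"}
assert objects_to_push("c2", commits, snapshots, remote_has) == {"c2", "s2", "b3"}
```

With one new file on top of an already-pushed parent, the walk yields one blob, one snapshot, and one commit, matching the shape of the push output above.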

Step 4 — pull

A second agent (or a second machine) pulls the branch. Muse fetches only the objects it doesn't already have — content-addressability makes deduplication trivial.

bash

muse pull local task/rate-limiting

Fetching task/rate-limiting from local
  3 new objects
  ✓ sha256:f3a7b92c…  blob
  ✓ sha256:c8d5e1f0…  snapshot
  ✓ sha256:9e21b8a4…  commit
  branch task/rate-limiting → sha256:9e21b8a4…
✔ fast-forward, working tree updated

Because every ID is derived from content, the pull output IDs are byte-for-byte identical to the push output. There is no translation, no rebase, no rewriting — the same objects are present on both sides.

Inspecting the result

bash

muse log --json

{
  "truncated": false,
  "commits": [
    {
      "commit_id":        "sha256:9e21b8a4f273…",
      "message":         "feat: token bucket rate limiter",
      "committed_at":    "2026-04-30T18:22:04Z",
      "author":          "gabriel",
      "agent_id":        "claude-code",
      "model_id":        "claude-sonnet-4-6",
      "parent_commit_id":"sha256:4c9e7959…",
      "snapshot_id":     "sha256:c8d5e1f0…"
    },
    …earlier commits…
  ]
}

bash

muse read --json --manifest

{
  "commit_id":     "sha256:9e21b8a4f273…",
  "snapshot_id":   "sha256:c8d5e1f09ab3…",
  "files_added":   ["src/rate_limiter.py"],
  "files_modified":[],
  "files_removed": [],
  "manifest": {
    "src/rate_limiter.py":   "sha256:f3a7b92c4de1…",
    "src/main.py":           "sha256:a1b2c3d4e5f6…",
    "src/config.py":         "sha256:789abc012def…"
  }
}

bash

muse diff --staged

+++ src/rate_limiter.py  (new file, sha256:f3a7b92c…)
+ class TokenBucket:
+     def __init__(self, rate: float, burst: int) -> None:
+         self.rate  = rate
+         self.burst = burst
+         self._tokens = burst
+         self._last   = time.monotonic()
+
+     def consume(self, n: int = 1) -> bool:
+         …

.museignore

.museignore is a TOML file that tells Muse which files to exclude from tracking. It has three section types:

Section	Applied when
`[global]`	All domains, always
`[domain.<name>]`	Only when the active domain is `<name>`
`[force_track]`	Whitelist — exact paths that bypass all ignore rules

toml .museignore

[global]
patterns = [
    ".DS_Store",
    "Thumbs.db",
    "*.tmp",
    "*.swp",
]

[domain.code]
patterns = [
    "__pycache__/",
    "*.pyc",
    ".venv/",
    "dist/",
    "build/",
    "*.egg-info/",
    "node_modules/",
]

# [force_track]
# Exact repo-relative paths to track even if they match a secrets pattern.
# paths = [
#     "deploy/local-tls/localhost.key",
# ]

Pattern syntax

Pattern	Matches
`*.pyc`	Any `.pyc` file at any depth
`__pycache__/`	Any directory named `__pycache__` (trailing `/` = directory)
`/dist/`	Only `dist/` at the repo root (leading `/` = anchored)
`!important.tmp`	Un-ignore a previously matched path (leading `!` = negate)
`src/*.min.js`	Minified JS files directly inside `src/` (`*` excludes `/`)
`tests/fixtures/**`	All contents of `tests/fixtures/` recursively (`**` includes `/`)

Patterns are evaluated in order — global first, then domain-specific. The last matching rule wins, mirroring gitignore semantics.
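The pattern table and the last-match-wins rule can be illustrated with a small regex-based matcher. This is a simplified sketch of gitignore-style matching covering only the forms listed above (`*`, `**`, leading `/`, trailing `/`, leading `!`); it is not Muse's matcher, the function names are hypothetical, and paths are assumed to be repo-relative with `/` separators.

```python
import re


def _glob_to_regex(glob: str) -> str:
    """Translate a glob body: ** crosses directories, * and ? do not."""
    out, i = [], 0
    while i < len(glob):
        if glob.startswith("**", i):
            out.append(".*")
            i += 2
        elif glob[i] == "*":
            out.append("[^/]*")
            i += 1
        elif glob[i] == "?":
            out.append("[^/]")
            i += 1
        else:
            out.append(re.escape(glob[i]))
            i += 1
    return "".join(out)


def compile_rule(pattern: str):
    """Return (negated, compiled_regex) for one ignore pattern."""
    negated = pattern.startswith("!")
    if negated:
        pattern = pattern[1:]
    dir_only = pattern.endswith("/")
    body = pattern.strip("/")
    # A leading slash or an inner slash anchors the pattern at the repo root.
    anchored = pattern.startswith("/") or "/" in body
    prefix = "^" if anchored else "^(?:.*/)?"
    suffix = "/.*$" if dir_only else "(?:/.*)?$"
    return negated, re.compile(prefix + _glob_to_regex(body) + suffix)


def is_ignored(path: str, patterns: list) -> bool:
    """Evaluate patterns in order; the last matching rule decides."""
    ignored = False
    for pattern in patterns:
        negated, regex = compile_rule(pattern)
        if regex.match(path):
            ignored = not negated
    return ignored


rules = ["*.tmp", "!important.tmp", "__pycache__/", "/dist/", "src/*.min.js", "tests/fixtures/**"]
assert is_ignored("scratch.tmp", rules)
assert not is_ignored("notes/important.tmp", rules)   # negation comes later, so it wins
assert is_ignored("dist/pkg.whl", rules) and not is_ignored("src/dist/pkg.whl", rules)
```

Passing the global patterns followed by the domain patterns as one list reproduces the documented evaluation order.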

[force_track] — override the secrets blocklist

Muse automatically blocks certain file types from tracking (e.g. *.key, *.pem, .env). The [force_track] section lists exact repo-relative paths (no globs) that must be tracked regardless. Use it for dev infrastructure that would otherwise be blocked.

toml .museignore — force_track

[force_track]
paths = [
    "deploy/local-tls/localhost.key",
    "deploy/local-tls/localhost.crt",
]

muse check-ignore <path> --json tells you whether a given path is ignored and which rule matched. Use it when muse status shows a file as untracked and you want to understand why.

Serialization

Commits and snapshots are serialized with msgpack, not JSON. On real-world code repositories, msgpack is 3–6× faster to encode and decode, and produces smaller files. The coordination bus uses JSON (infrequent, small payloads) and the MCP wire uses JSON for tool calls, but the core object store is msgpack throughout.

Data	Format	Reason
Commits / snapshots	msgpack	3–6× faster; binary-safe for blob content
Objects (blobs)	raw bytes	No encoding overhead
Coordination records	JSON	Infrequent; human-readable debugging
Harmony patterns	JSON	Infrequent; inspectable
Wire push (HTTP)	msgpack	`application/x-msgpack`
MCP tool calls	JSON	MCP protocol requirement

msgpack CommitRecord — decoded

Commits are stored as msgpack binary files. The JSON below is the decoded equivalent: the CommitRecord fields appear under the same names, alongside repo_id and format_version. The sha256: prefix is stored as a plain string; it is never stripped.

bash

python3 -c "import msgpack,json,sys; d=msgpack.unpackb(open(sys.argv[1],'rb').read(),raw=False); print(json.dumps(d,indent=2))" \
  .muse/objects/sha256/9e/21b8a4f273…

{
  "commit_id":          "sha256:9e21b8a4f273…",
  "repo_id":             "sha256:0000genesis…",
  "branch":              "task/rate-limiting",
  "snapshot_id":         "sha256:c8d5e1f09ab3…",
  "message":             "feat: token bucket rate limiter",
  "committed_at":        "2026-04-30T18:22:04Z",
  "parent_commit_id":    "sha256:4c9e7959…",
  "parent2_commit_id":   null,
  "author":              "gabriel",
  "agent_id":            "claude-code",
  "model_id":            "claude-sonnet-4-6",
  "toolchain_id":        "",
  "prompt_hash":         "",
  "signature":           "ed25519:AAAA…",
  "signer_public_key":   "ed25519:BBBB…",
  "signer_key_id":       "sha256:CCCC…",
  "sem_ver_bump":         "minor",
  "breaking_changes":    [],
  "structured_delta":    null,
  "reviewed_by":         [],
  "test_runs":           0,
  "labels":              [],
  "status":              "",
  "notes":               [],
  "score":               null,
  "format_version":      8
}

Size limits

Limit	Value	Notes
Max commits per push	10,000	Rejected at wire layer
Max objects per push	1,000	Larger batches use presigned URLs
Max object size (inline)	38 MB	Above this: presigned upload
Max msgpack file	64 MiB	Per commit or snapshot file
Max blob in mpack	256 MiB	MPack format limit
Max string (msgpack)	1 MiB	Any single string value
Max collection entries	1M	Array or map
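A client could check the push-related limits before sending anything. The sketch below encodes the first three rows of the table as constants; `check_push_limits` is a hypothetical helper, and treating "38 MB" as 38,000,000 bytes is an assumption, since the table does not say whether the unit is decimal or binary.

```python
MAX_COMMITS_PER_PUSH = 10_000
MAX_OBJECTS_PER_PUSH = 1_000
MAX_INLINE_OBJECT_BYTES = 38 * 1_000_000  # assumption: decimal megabytes


def check_push_limits(commit_count: int, object_sizes: list) -> list:
    """Return a description of every limit this push would exceed."""
    problems = []
    if commit_count > MAX_COMMITS_PER_PUSH:
        problems.append("too many commits for one push")
    if len(object_sizes) > MAX_OBJECTS_PER_PUSH:
        problems.append("too many objects for one inline batch")
    if any(size > MAX_INLINE_OBJECT_BYTES for size in object_sizes):
        problems.append("object too large for inline upload")
    return problems


assert check_push_limits(3, [4_200, 1_100, 800]) == []
assert check_push_limits(10_001, []) == ["too many commits for one push"]
```

Per the table, only the commit limit is a hard rejection; the other two indicate that the presigned upload path applies instead.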

On-disk layout

Every Muse repo is a directory containing a .muse/ subdirectory. There is no index file, no packed-refs, no reflog by default — just the flat object store and a handful of ref files.

text .muse/ directory tree

.muse/
├── repo.json                       # repo_id, domain, owner, created_at
├── HEAD                            # "refs/heads/dev" (symbolic ref)
├── refs/
│   └── heads/
│       ├── main                    # "sha256:<64-hex>\n"
│       └── dev                     # "sha256:<64-hex>\n"
├── objects/
│   └── sha256/
│       └── ab/                     # first 2 hex chars (sharding)
│           └── <62-hex>           # commits, snapshots, and blobs — unified store
├── coordination/                   # multi-agent symbol reservations
│   ├── reservations/
│   ├── intents/
│   ├── releases/
│   └── heartbeats/
├── harmony/                        # conflict resolution memory
│   ├── patterns/
│   ├── policies/
│   └── audit/
└── agent.md                        # repo-specific agent rules

The two-level sharding on objects (sha256/ab/<62-hex>) keeps directory sizes bounded: at one million objects, each shard directory holds ~3,900 files on average — well within filesystem limits on all major platforms.
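The ~3,900 figure follows directly from the shard width: two hex characters give 16² = 256 shard directories, and one million objects spread evenly over them average about 3,906 files each. A short worked example, assuming a uniform hash distribution; the names are illustrative.

```python
SHARD_HEX_CHARS = 2
SHARD_COUNT = 16 ** SHARD_HEX_CHARS  # 256 possible two-character hex prefixes


def average_files_per_shard(total_objects: int) -> float:
    """Expected files per shard directory when hashes are uniformly distributed."""
    return total_objects / SHARD_COUNT


assert SHARD_COUNT == 256
assert average_files_per_shard(1_000_000) == 3906.25
```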

MuseHub server store

The MuseHub server uses the same on-disk format per repo under /data/repos/. Object IDs, shard directories, and ref files are byte-for-byte compatible with the client store — a blob stored by a push and a blob stored locally are indistinguishable at the byte level. The database holds only metadata caches and the collaboration layer; authoritative repo state is always on disk.

text Server-side per-repo tree

/data/repos/<owner>/<slug>/
├── objects/
│   └── sha256/                     # algorithm namespace (mldsa65/ slots in here)
│       └── ab/                     # 2-char hex shard
│           └── <62-hex>           # raw blob — same layout as .muse/objects/
└── refs/
    └── heads/
        ├── main                    # "sha256:<64-hex>" — same format as client
        └── dev

DB table	Role
`musehub_commits`	Cache — fast graph queries, search, API listing
`musehub_snapshots`	Cache — fast manifest lookups
`musehub_branches`	Cache — fast branch listing; disk ref is authoritative
`musehub_repos`	Canonical — repo metadata, visibility, owner
`musehub_identities` / `musehub_auth_keys`	Canonical — identity and auth
`musehub_issues` / `musehub_proposals`	Canonical — collaboration layer
`musehub_objects`	Canonical — `storage_uri` + `size_bytes` index for fetch path resolution

Push negotiation checks object existence directly on disk — never in the DB. This means force-resign, migration, or partial push failures cannot corrupt the have/want walk. If the DB cache drifts from disk, GET /repos/{owner}/{repo}/branches/{name}/repair heals it.

CLI reference

Every command accepts --json. The --json output is the stable machine contract; the default terminal output is for humans and is not versioned. Use muse -C ~/path/to/repo <cmd> when your working directory differs from the target repo.

Core workflow

Task	Command
Initialise repo	`muse init [--domain code\|midi\|identity]`
Working-tree status	`muse status --json`
Stage files	`muse code add <path>` / `muse code add .`
Unstage	`muse code reset <path>`
Delete + stage deletion	`muse rm <path>`
Commit	`muse commit -m "msg" [--agent-id X --model-id Y --sign]`
History	`muse log --json`
Inspect commit	`muse read --json [--manifest]`
Diff working tree	`muse diff` / `muse diff --staged`
Diff two refs	`muse diff HEAD~3 HEAD --json`
List branches	`muse branch --json`
Switch / create branch	`muse checkout [-b] <branch> [--intent "..." --resumable]`
Three-way merge	`muse merge <branch>`
Dry-run merge	`muse merge --dry-run <branch> --json`
Shelf (stash)	`muse shelf save [-m "msg"]` / `muse shelf pop`
Tag	`muse tag add "label" [<ref>]`
Release	`muse release add <semver>`

muse code add . stages new files, modifications, and deletions of already-tracked files (the equivalent of git add -u && git add .). To remove a file from tracking without deleting it from disk, use muse rm --cached <path>.

muse status --json shape

The status JSON schema is always identical regardless of domain or staging state. All keys are always present — no dict.get guards needed.

json

{
  "branch":                    "dev",
  "head_commit":               "sha256:abc...",
  "upstream":                  null,          // tracking remote name, or null
  "ahead":                     null,          // commits ahead of remote; null when no upstream
  "behind":                    null,          // commits behind remote; null when no upstream
  "clean":                     true,          // true only when no staged, unstaged, or untracked files
  "dirty":                     false,         // always NOT clean
  "total_changes":             0,             // tracked-file changes (added+modified+deleted+renamed)
  "untracked_count":           0,             // len(untracked); nonzero when dirty but total_changes==0
  "added":                     [],            // flat union of staged + unstaged
  "modified":                  [],
  "deleted":                   [],
  "renamed":                   {},            // old_path → new_path map
  "staged": {
    "added": [], "modified": [], "deleted": []
  },
  "unstaged": {
    "added": [], "modified": [], "deleted": [], "renamed": {}
  },
  "untracked":                 [],            // on-disk but not tracked; presence makes clean=false
  "conflict_paths":            [],
  "merge_in_progress":         false,
  "merge_from":                null,          // branch being merged; null when no merge
  "conflict_count":            0,
  "checkout_interrupted":      false,
  "checkout_target":           null
}
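Because every key is always present, a consumer can index the status dict directly and rely on the relationships stated in the comments above. The sketch below checks three of them on an already-parsed status dict; `check_status_invariants` is a hypothetical helper written for illustration.

```python
def check_status_invariants(status: dict) -> list:
    """Return a message for each documented relationship that does not hold."""
    problems = []
    if status["dirty"] != (not status["clean"]):
        problems.append("dirty must be the negation of clean")
    if status["untracked_count"] != len(status["untracked"]):
        problems.append("untracked_count must equal len(untracked)")
    if status["untracked"] and status["clean"]:
        problems.append("untracked files must make clean false")
    return problems


status = {
    "clean": False,
    "dirty": True,
    "total_changes": 0,
    "untracked_count": 1,
    "untracked": ["notes.txt"],
}
assert check_status_invariants(status) == []
```

The example also shows the case called out in the schema: a tree can be dirty with total_changes at zero when the only difference is an untracked file.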

Push / pull

Push sends a WireMPack — a compact envelope containing every commit, snapshot, and object the remote doesn't already have — over application/x-msgpack to POST /{owner}/{slug}/push. The hub validates the mpack, stores objects atomically, and advances the branch pointer in a single transaction. Pull is the reverse: the client fetches an mpack from the hub and integrates it locally.

python WireMPack shape (musehub.models.wire)

class WireMPack(BaseModel):
    commits:      list[WireCommit]    # CommitRecord dicts
    snapshots:    list[WireSnapshot]  # SnapshotRecord dicts
    objects:      list[WireObject]   # raw blob bytes
    branch_heads: dict[str, str]      # branch → commit_id

class WireObject(BaseModel):
    object_id: str
    content:   bytes                  # raw; no base64
    path:      str = ""
    encoding:  str = "raw"           # "raw" | "zlib" | "delta+zlib"
    base_id:   str | None             # set for delta-encoded objects

Push flow

The hub processes a push in this order; steps 1–6 are validation and complete before anything is persisted:

text

1.  Verify MSign Authorization header (Ed25519, ±30s replay window)
2.  Resolve repo — owner + slug → repo_id + repo_root (/data/repos/<owner>/<slug>/)
3.  Confirm pusher has write access
4.  Negotiate have/want — hub checks object existence on disk, not in DB
5.  Validate mpack schema + ID format (sha256:<hex>, ≥32 chars)
6.  Enforce push limits (max 10k commits, 1k objects, 38 MB/object)
7.  Persist objects → /data/repos/<owner>/<slug>/objects/sha256/<2-hex>/<62-hex> (atomic)
8.  Persist snapshots → musehub_snapshots (DB cache)
9.  Persist commits → musehub_commits (DB cache)
10. Advance branch pointer → refs/heads/<branch> on disk (atomic rename), then musehub_branches (cache)
11. Update repo.pushed_at timestamp
12. Upsert reachability index (musehub_object_refs)
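The ±30s replay window in step 1 amounts to comparing a timestamp carried by the request with the server clock. The sketch below shows only that comparison; how MSign encodes the timestamp is not described here, so the `signed_at` parameter (Unix seconds) and the helper name are assumptions, and signature verification itself is out of scope.

```python
import time
from typing import Optional

REPLAY_WINDOW_SECONDS = 30


def within_replay_window(signed_at: float, now: Optional[float] = None) -> bool:
    """Accept a request only if its timestamp is within 30 s of the server clock."""
    current = time.time() if now is None else now
    return abs(current - signed_at) <= REPLAY_WINDOW_SECONDS


assert within_replay_window(signed_at=1_000.0, now=1_020.0)
assert not within_replay_window(signed_at=1_000.0, now=1_031.0)
assert within_replay_window(signed_at=1_000.0, now=975.0)  # small clock skew into the future
```

Using an absolute difference tolerates clock skew in both directions, which is what a symmetric ± window implies.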

bash

# Push dev branch to the local hub
muse push local dev

# Push to staging
muse push staging dev

# Pull from remote
muse pull local dev

# Check configured remotes
muse remote --json

Push returns 404 ("Repository not found on remote") if the repo hasn't been created on the hub yet. Create it first via muse hub repo create --name <name> --json, then retry.