Foundations
Everything in Muse is content-addressed. Every object — blob, snapshot, commit — has a deterministic ID derived from its content. There is no mutable global state to coordinate: if two writers produce identical content, they produce identical IDs, and only one copy is stored. This section explains the three core record types, how the commit DAG is built, how objects are serialized and stored on disk, and how push/pull moves state between repositories.
Object store
Every byte stored by Muse lives in the object store — a
content-addressed key-value store keyed by
sha256:<64-hex>. The algorithm prefix is part of the ID
(not decoration), making the format self-describing and algorithm-agnostic.
When a new hash function is needed, old and new IDs coexist without collision.
IDs are produced by content_hash() for structured data and
blob_id() for raw bytes:
def content_hash(obj: JsonValue) -> str:
    """Canonical SHA-256 ID for any JSON-serializable value.

    Canonical form: json.dumps with sort_keys=True, separators=(",",":"),
    ensure_ascii=True, UTF-8 encoded, then SHA-256.
    Returns "sha256:<64-hex>" (always 71 chars).
    """

def blob_id(data: bytes) -> str:
    """SHA-256 of raw bytes. Same "sha256:<64-hex>" format."""
The canonical JSON form (sorted keys, no whitespace) means two writers constructing the same logical object always produce the same ID, regardless of insertion order or formatting. This is what makes content-addressed merges safe: identical content provably maps to the same ID.
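As an illustration of the canonical form described above, here is a minimal sketch of both functions built only on the standard library. It follows the docstrings literally (sorted keys, compact separators, ASCII escaping, UTF-8, then SHA-256); it is an assumption about how the real implementation behaves, not a copy of it.

```python
import hashlib
import json


def content_hash(obj) -> str:
    """Hash the canonical JSON form of obj and return 'sha256:<64-hex>'."""
    canonical = json.dumps(
        obj,
        sort_keys=True,          # key order no longer depends on insertion order
        separators=(",", ":"),   # no whitespace between tokens
        ensure_ascii=True,       # non-ASCII characters become \uXXXX escapes
    ).encode("utf-8")
    return "sha256:" + hashlib.sha256(canonical).hexdigest()


def blob_id(data: bytes) -> str:
    """Hash raw bytes directly, using the same ID format."""
    return "sha256:" + hashlib.sha256(data).hexdigest()


# Two dicts with the same content but different insertion order hash identically.
a = content_hash({"path": "src/main.py", "size": 42})
b = content_hash({"size": 42, "path": "src/main.py"})
assert a == b
assert len(a) == 71  # 7 characters of prefix plus 64 hex digits
```

Because the prefix is part of the returned string, callers never need to track which algorithm produced an ID separately.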
Object types
Three distinct record types live in the object store:
| Type | ID source | Storage path | Notes |
|---|---|---|---|
| Blob | SHA-256 of raw bytes | objects/<algo>/<2-hex>/<62-hex> | File contents; blob <size>\0<bytes> on disk |
| Snapshot | content_hash(snapshot_dict) | objects/<algo>/<2-hex>/<62-hex> | Path → blob-ID manifest; snapshot <size>\0<json> on disk |
| Commit | content_hash(commit_dict) | objects/<algo>/<2-hex>/<62-hex> | Snapshot ID + provenance + signature; commit <size>\0<json> on disk |
Object graph
The three types form a strict DAG. Arrows point from referencing object to referenced object — blobs are always leaves:
refs/heads/dev ──► CommitRecord (sha256:9e21b8...)
                        │ parent_commit_id
                        ▼
                   CommitRecord (sha256:3f8a1c...)
                        │ snapshot_id
                        ▼
                   SnapshotRecord (sha256:04ee5e...)
                        │ manifest
                   ┌────┼────────────┐
                   ▼    ▼            ▼
                 Blob  Blob   ...   Blob
        (sha256:a1b2c3...)   (sha256:d4e5f6...)
Objects are written atomically: mkstemp → write → fsync → os.replace.
A partial write can never produce a readable but corrupt object.
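The write sequence above (mkstemp → write → fsync → os.replace) can be sketched with the standard library as follows. The function name atomic_write is hypothetical, and the cleanup behaviour on failure is an assumption rather than something the text specifies.

```python
import os
import tempfile


def atomic_write(path: str, data: bytes) -> None:
    """Write data so readers see either no file or the complete file."""
    directory = os.path.dirname(path) or "."
    os.makedirs(directory, exist_ok=True)
    # Create the temp file in the target directory so the final rename
    # stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())  # force the bytes to disk before renaming
        os.replace(tmp_path, path)  # atomic swap into the final name
    except BaseException:
        # Remove the temp file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

A crash before os.replace leaves only a stray temp file, never a truncated object under its final name.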
Objects are immutable once stored; substituting different content under an
existing ID would require a SHA-256 second-preimage attack.
CommitRecord
A commit is the top-level record in the Muse DAG. It points to exactly one snapshot and zero, one, or two parent commits (zero for the genesis commit, two for a merge).
from __future__ import annotations  # lets StructuredDelta stay a forward reference

import datetime
from dataclasses import dataclass
from typing import Literal


@dataclass
class CommitRecord:
    # Core identity
    commit_id: str                      # sha256:<64-hex>
    branch: str
    snapshot_id: str                    # sha256:<64-hex>
    message: str
    committed_at: datetime.datetime
    # Graph edges
    parent_commit_id: str | None        # None for genesis commits
    parent2_commit_id: str | None       # Set for merge commits (two-parent)
    # Authorship
    author: str                         # handle
    metadata: dict
    # Structured delta (populated by domain plugin's diff())
    structured_delta: StructuredDelta | None
    # Semantic versioning
    sem_ver_bump: Literal["none", "patch", "minor", "major"]
    breaking_changes: list[str]
    # Agent provenance: all empty strings for human commits
    agent_id: str                       # e.g. "claude-code"
    model_id: str                       # e.g. "claude-sonnet-4-6"
    toolchain_id: str
    prompt_hash: str                    # sha256 of system prompt
    # Ed25519 signature
    signature: str                      # "ed25519:<base64url>"
    signer_public_key: str              # "ed25519:<base64url>"
    signer_key_id: str                  # fingerprint of signing key
    # Labels, review metadata
    reviewed_by: list[str]
    test_runs: int
    labels: list[str]
    status: str
    notes: list[str]
    score: float | None
Agent commits
When an agent commits, it populates agent_id, model_id,
and signs with its derived Ed25519 key. The absence of these fields is itself
a signal that a human committed directly — the format encodes the distinction
structurally rather than by convention.
# Agent commit — full provenance chain
muse commit -m "feat: add rate limiting" \
--agent-id claude-code \
--model-id claude-sonnet-4-6 \
--sign
# Human commit — no provenance flags
muse commit -m "chore: update config"
{
"commit_id": "sha256:6ab243df7bdb...",
"branch": "dev",
"snapshot_id": "sha256:04ee5ecd07ec...",
"message": "feat: add rate limiting",
"committed_at": "2026-04-21T23:00:00Z",
"parent_commit_id": "sha256:4c9e7959...",
"parent2_commit_id": null,
"author": "gabriel",
"agent_id": "claude-code",
"model_id": "claude-sonnet-4-6",
"signature": "ed25519:AAAA...",
"signer_public_key": "ed25519:BBBB...",
"sem_ver_bump": "minor",
"labels": ["reviewed"],
"score": 0.94
}
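Since the agent-versus-human distinction is carried by the record itself, a consumer can classify a decoded commit without any side channel. The sketch below is illustrative only: the helper names are hypothetical, and it assumes that absent keys and empty strings both mean "not set".

```python
def is_agent_commit(commit: dict) -> bool:
    """True when the commit carries a non-empty agent_id."""
    return bool(commit.get("agent_id"))


def is_signed(commit: dict) -> bool:
    """True when both the signature and the signer's public key are present."""
    return bool(commit.get("signature")) and bool(commit.get("signer_public_key"))


agent = {"agent_id": "claude-code", "model_id": "claude-sonnet-4-6",
         "signature": "ed25519:AAAA...", "signer_public_key": "ed25519:BBBB..."}
human = {"agent_id": "", "model_id": "", "signature": "", "signer_public_key": ""}

assert is_agent_commit(agent) and is_signed(agent)
assert not is_agent_commit(human) and not is_signed(human)
```

Note that this only checks presence; verifying the Ed25519 signature itself is a separate step not shown here.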
SnapshotManifest
A snapshot is an immutable mapping of every tracked path to its blob ID at that point in time. It is the complete working tree — not a delta, not a patch. Diffing two snapshots is an O(|paths|) set operation with no chain of deltas to traverse.
@dataclass
class SnapshotRecord:
    snapshot_id: str                # sha256:<64-hex>
    manifest: dict[str, str]        # path → "sha256:<64-hex>"
    directories: list[str]          # explicit empty directories
    created_at: datetime.datetime
    note: str
    schema_version: int
{
"schema_version": 1,
"snapshot_id": "sha256:04ee5ecd...",
"manifest": {
"musehub/main.py": "sha256:a1b2c3...",
"musehub/graph/dag.py": "sha256:d4e5f6...",
"musehub/graph/push_validator.py": "sha256:7890ab..."
},
"directories": [],
"created_at": "2026-04-21T23:00:00Z",
"note": ""
}
Because a snapshot is just a flat map, any two snapshots can be diffed
instantly: paths present in both with the same blob ID are unchanged; paths
with different IDs are modified; paths only in one are added or removed.
Domain plugins receive this pair as their base and target
in diff(), then interpret the semantic meaning.
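The file-level classification described above reduces to a few set operations over the two manifests. This is a minimal sketch; the function name diff_manifests and the shape of its return value are assumptions chosen for illustration.

```python
def diff_manifests(base: dict, target: dict) -> dict:
    """Classify paths as added, removed, or modified between two manifests."""
    base_paths = set(base)
    target_paths = set(target)
    common = base_paths & target_paths
    return {
        "added": sorted(target_paths - base_paths),
        "removed": sorted(base_paths - target_paths),
        # Same path, different blob ID: the content changed.
        "modified": sorted(p for p in common if base[p] != target[p]),
    }


base = {"src/main.py": "sha256:a1", "src/config.py": "sha256:b2"}
target = {"src/main.py": "sha256:a9", "src/rate_limiter.py": "sha256:c3"}
print(diff_manifests(base, target))
```

Each path is visited a constant number of times, which is where the O(|paths|) cost comes from.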
muse read --json returns commit metadata and file-level changes
(files_added, files_modified, files_removed)
but not the full manifest. Add --manifest to get the complete
path → object_id map for every tracked file.
Branch DAG
Branches are mutable named pointers to commit IDs, stored in
.muse/refs/heads/<branch>. The commit graph is an immutable DAG;
the branch pointer advances atomically when you commit or merge.
Merge commits have two parents: parent_commit_id (the branch
being merged into) and parent2_commit_id (the branch being merged
from). The three-way merge uses as its base the lowest common ancestor of the
two branch tips, found by walking both parent chains backward.
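The common-ancestor search can be sketched over an in-memory map of commit IDs to commit dicts. This is one simple strategy (collect all ancestors of one tip, then search breadth-first from the other), offered as an assumption; the text does not say which traversal Muse actually uses, and the helper names are hypothetical.

```python
from collections import deque


def parents_of(commit: dict) -> list:
    """Return the non-null parent IDs of a commit dict."""
    candidates = (commit.get("parent_commit_id"), commit.get("parent2_commit_id"))
    return [p for p in candidates if p]


def ancestors(commits: dict, start: str) -> set:
    """All commit IDs reachable from start, including start itself."""
    seen = set()
    stack = [start]
    while stack:
        cid = stack.pop()
        if cid in seen:
            continue
        seen.add(cid)
        stack.extend(parents_of(commits[cid]))
    return seen


def merge_base(commits: dict, ours: str, theirs: str):
    """First commit reachable from theirs that is also an ancestor of ours."""
    ours_ancestors = ancestors(commits, ours)
    queue = deque([theirs])
    visited = set()
    while queue:
        cid = queue.popleft()
        if cid in ours_ancestors:
            return cid
        if cid in visited:
            continue
        visited.add(cid)
        queue.extend(parents_of(commits[cid]))
    return None  # unrelated histories
```

In histories with criss-cross merges there can be more than one candidate base; this sketch returns the one closest to theirs by hop count.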
# Create and switch in one command, with intent metadata
muse checkout -b task/rate-limiting \
--intent "implement token bucket rate limiter" \
--resumable
# Work, stage, commit
muse code add src/rate_limiter.py
muse commit -m "feat: token bucket rate limiter" \
--agent-id claude-code --model-id claude-sonnet-4-6 --sign
# Merge back
muse checkout dev
muse merge task/rate-limiting # three-way, harmony auto-resolves known conflicts
muse branch -d task/rate-limiting
muse push local dev
Branch flow
| Branch | Role | Rule |
|---|---|---|
| main | Production | Tagged releases only; never direct-pushed |
| dev | Integration | Latest deliverable state |
| feat/* | Feature | Short-lived; one atomic task; hours not days |
| task/* | Agent task | Same as feat/*; carries --intent and --resumable |
| bugfix/* | Bug fix | From dev; merges into dev |
| hotfix/* | Hot fix | From main; merges into main AND dev |
Resumable branches
Branches carry --intent (a free-text description of the task) and
--resumable (a boolean signal that another agent may safely pick this
up mid-flight). Both are stored in branch metadata, readable via
muse branch --json, and surfaced in the coordination bus so
orchestrators can assign in-progress work to idle agents.
[
{
"name": "task/rate-limiting",
"current": false,
"intent": "implement token bucket rate limiter",
"resumable": true,
"created_by": "claude-code",
"commit_id": "sha256:6ab243..."
}
]
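An orchestrator could select work to hand off by filtering this output. The sketch below parses the JSON shape shown above; the function name is hypothetical, and treating a missing resumable key as false is an assumption.

```python
import json


def resumable_branches(branch_json: str) -> list:
    """Return the branch entries flagged as safe to pick up mid-flight."""
    branches = json.loads(branch_json)
    return [b for b in branches if b.get("resumable")]


sample = """
[
  {"name": "task/rate-limiting", "current": false,
   "intent": "implement token bucket rate limiter", "resumable": true,
   "created_by": "claude-code", "commit_id": "sha256:6ab243..."},
  {"name": "dev", "current": true, "intent": "", "resumable": false,
   "created_by": "gabriel", "commit_id": "sha256:4c9e79..."}
]
"""
for branch in resumable_branches(sample):
    print(branch["name"], "->", branch["intent"])
```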
Round-trip walkthrough
This walkthrough traces a single file change from working tree all the way
to a remote and back, showing the sha256: IDs at each step so
you can see how objects, snapshots, commits, and branch refs compose.
Step 1 — stage
muse code add hashes each file into the object store and records
the path → blob-ID mapping in the staging area.
muse code add src/rate_limiter.py
staged src/rate_limiter.py sha256:f3a7b92c4de1…
The blob sha256:f3a7b92c4de1… is written to
.muse/objects/sha256/f3/a7b92c4de1… immediately — it
exists in the object store whether or not you ever commit.
Step 2 — commit
muse commit builds a SnapshotRecord from the staging area, hashes
it to get a snapshot ID, then builds a CommitRecord referencing that snapshot
and the current branch tip as parent.
muse commit -m "feat: token bucket rate limiter" \
--agent-id claude-code --model-id claude-sonnet-4-6 --sign
committed sha256:9e21b8a4f273…
snapshot sha256:c8d5e1f09ab3…
parent sha256:4c9e7959beef…
branch task/rate-limiting
signed ed25519:AAAA…
Three new objects hit disk:
| ID | Type | Contents |
|---|---|---|
| sha256:f3a7b92c… | Blob | Raw bytes of src/rate_limiter.py |
| sha256:c8d5e1f0… | Snapshot | Full path → blob-ID map for the entire working tree |
| sha256:9e21b8a4… | Commit | Message, author, agent provenance, snapshot_id, parent_commit_id, signature |
The branch ref .muse/refs/heads/task/rate-limiting is updated
atomically to sha256:9e21b8a4f273….
Step 3 — push
muse push computes the set of objects the remote does not have,
packs them into an MPack, and POSTs to the hub.
muse push local task/rate-limiting
Pushing task/rate-limiting → local
3 objects (1 blob, 1 snapshot, 1 commit)
✓ sha256:f3a7b92c… blob 4.2 kB
✓ sha256:c8d5e1f0… snapshot 1.1 kB
✓ sha256:9e21b8a4… commit 0.8 kB
branch task/rate-limiting → sha256:9e21b8a4…
✔ pushed in 142 ms
The hub stores all three objects, verifies the Ed25519 signature, and advances the branch pointer in a single transaction.
Step 4 — pull
A second agent (or a second machine) pulls the branch. Muse fetches only the objects it doesn't already have — content-addressability makes deduplication trivial.
muse pull local task/rate-limiting
Fetching task/rate-limiting from local
3 new objects
✓ sha256:f3a7b92c… blob
✓ sha256:c8d5e1f0… snapshot
✓ sha256:9e21b8a4… commit
branch task/rate-limiting → sha256:9e21b8a4…
✔ fast-forward, working tree updated
Because every ID is derived from content, the pull output IDs are byte-for-byte identical to the push output. There is no translation, no rebase, no rewriting — the same objects are present on both sides.
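Deciding what to transfer is then a membership test on IDs, in either direction. A minimal sketch, assuming both sides can list the object IDs they hold (the function name is hypothetical):

```python
def objects_to_fetch(remote_ids: list, local_ids: list) -> list:
    """IDs present remotely but missing locally, in the remote's order."""
    local = set(local_ids)
    return [oid for oid in remote_ids if oid not in local]


remote = ["sha256:f3a7b92c", "sha256:c8d5e1f0", "sha256:9e21b8a4"]
local = ["sha256:c8d5e1f0"]
print(objects_to_fetch(remote, local))
```

No content comparison is needed: if the ID is already present, the bytes are already present.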
Inspecting the result
muse log --json
{
"truncated": false,
"commits": [
{
"commit_id": "sha256:9e21b8a4f273…",
"message": "feat: token bucket rate limiter",
"committed_at": "2026-04-30T18:22:04Z",
"author": "gabriel",
"agent_id": "claude-code",
"model_id": "claude-sonnet-4-6",
"parent_commit_id":"sha256:4c9e7959…",
"snapshot_id": "sha256:c8d5e1f0…"
},
…earlier commits…
]
}
muse read --json --manifest
{
"commit_id": "sha256:9e21b8a4f273…",
"snapshot_id": "sha256:c8d5e1f09ab3…",
"files_added": ["src/rate_limiter.py"],
"files_modified":[],
"files_removed": [],
"manifest": {
"src/rate_limiter.py": "sha256:f3a7b92c4de1…",
"src/main.py": "sha256:a1b2c3d4e5f6…",
"src/config.py": "sha256:789abc012def…"
}
}
muse diff --staged
+++ src/rate_limiter.py (new file, sha256:f3a7b92c…)
+ class TokenBucket:
+     def __init__(self, rate: float, burst: int) -> None:
+         self.rate = rate
+         self.burst = burst
+         self._tokens = burst
+         self._last = time.monotonic()
+
+     def consume(self, n: int = 1) -> bool:
+         …
.museignore
.museignore is a TOML file that tells Muse which
files to exclude from tracking. It has three section types:
| Section | Applied when |
|---|---|
| [global] | All domains, always |
| [domain.<name>] | Only when the active domain is <name> |
| [force_track] | Whitelist: exact paths that bypass all ignore rules |
[global]
patterns = [
".DS_Store",
"Thumbs.db",
"*.tmp",
"*.swp",
]
[domain.code]
patterns = [
"__pycache__/",
"*.pyc",
".venv/",
"dist/",
"build/",
"*.egg-info/",
"node_modules/",
]
# [force_track]
# Exact repo-relative paths to track even if they match a secrets pattern.
# paths = [
# "deploy/local-tls/localhost.key",
# ]
Pattern syntax
| Pattern | Matches |
|---|---|
| *.pyc | Any .pyc file at any depth |
| __pycache__/ | Any directory named __pycache__ (trailing / = directory) |
| /dist/ | Only dist/ at the repo root (leading / = anchored) |
| !important.tmp | Un-ignore a previously matched path (leading ! = negate) |
| src/*.min.js | Minified JS files directly inside src/ (* excludes /) |
| tests/fixtures/** | All contents of tests/fixtures/ recursively (** includes /) |
Patterns are evaluated in order — global first, then domain-specific. The last matching rule wins, mirroring gitignore semantics.
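The ordering rule (global first, then domain, last match wins, with ! negating) can be sketched as below. This is a deliberately simplified matcher built on fnmatch: it handles bare names, trailing-slash directory patterns, and negation, but it does not implement leading-slash anchoring or the exact * versus ** distinction from the table. The helper names are hypothetical.

```python
from fnmatch import fnmatchcase


def matches(pattern: str, path: str) -> bool:
    """Simplified pattern test; see the caveats in the lead-in."""
    parts = path.split("/")
    if pattern.endswith("/"):
        # Directory pattern: match any parent directory component by name.
        name = pattern.rstrip("/")
        return any(fnmatchcase(part, name) for part in parts[:-1])
    if "/" not in pattern:
        # Bare pattern: match the file name at any depth.
        return fnmatchcase(parts[-1], pattern)
    # Pattern containing a slash: compare against the whole path.
    # fnmatch lets * cross "/", which is looser than the documented rule.
    return fnmatchcase(path, pattern)


def is_ignored(path: str, global_patterns: list, domain_patterns: list = ()) -> bool:
    """Apply patterns in order; the last matching rule decides."""
    ignored = False
    for raw in [*global_patterns, *domain_patterns]:
        negate = raw.startswith("!")
        pattern = raw[1:] if negate else raw
        if matches(pattern, path):
            ignored = not negate
    return ignored


assert is_ignored("notes/scratch.tmp", ["*.tmp"])
assert not is_ignored("important.tmp", ["*.tmp", "!important.tmp"])
assert is_ignored("src/__pycache__/mod.pyc", [], ["__pycache__/"])
```

Swapping the order of the two rules in the second assertion would leave important.tmp ignored, which is what "last match wins" implies.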
[force_track] — override the secrets blocklist
Muse automatically blocks certain file types from tracking (e.g. *.key,
*.pem, .env). The [force_track] section
lists exact repo-relative paths (no globs) that must be tracked
regardless. Use it for dev infrastructure that would otherwise be blocked.
[force_track]
paths = [
"deploy/local-tls/localhost.key",
"deploy/local-tls/localhost.crt",
]
muse check-ignore <path> --json tells you whether a given path
is ignored and which rule matched. Use it when muse status shows
a file as untracked and you want to understand why.
Serialization
Commits and snapshots are serialized with msgpack, not JSON. On real-world code repositories, msgpack is 3–6× faster to encode and decode, and produces smaller files. The coordination bus uses JSON (infrequent, small payloads) and the MCP wire uses JSON for tool calls, but the core object store is msgpack throughout.
| Data | Format | Reason |
|---|---|---|
| Commits / snapshots | msgpack | 3–6× faster; binary-safe for blob content |
| Objects (blobs) | raw bytes | No encoding overhead |
| Coordination records | JSON | Infrequent; human-readable debugging |
| Harmony patterns | JSON | Infrequent; inspectable |
| Wire push (HTTP) | msgpack | application/x-msgpack |
| MCP tool calls | JSON | MCP protocol requirement |
msgpack CommitRecord — decoded
Commits are stored as msgpack binary files. The JSON below is the decoded
equivalent — every field in the msgpack maps 1:1 to the CommitRecord dataclass.
The sha256: prefix is stored as a plain string; it is never
stripped.
python3 -c "import msgpack,json,sys; d=msgpack.unpackb(open(sys.argv[1],'rb').read(),raw=False); print(json.dumps(d,indent=2))" \
.muse/objects/sha256/9e/21b8a4f273…
{
"commit_id": "sha256:9e21b8a4f273…",
"repo_id": "sha256:0000genesis…",
"branch": "task/rate-limiting",
"snapshot_id": "sha256:c8d5e1f09ab3…",
"message": "feat: token bucket rate limiter",
"committed_at": "2026-04-30T18:22:04Z",
"parent_commit_id": "sha256:4c9e7959…",
"parent2_commit_id": null,
"author": "gabriel",
"agent_id": "claude-code",
"model_id": "claude-sonnet-4-6",
"toolchain_id": "",
"prompt_hash": "",
"signature": "ed25519:AAAA…",
"signer_public_key": "ed25519:BBBB…",
"signer_key_id": "sha256:CCCC…",
"sem_ver_bump": "minor",
"breaking_changes": [],
"structured_delta": null,
"reviewed_by": [],
"test_runs": 0,
"labels": [],
"status": "",
"notes": [],
"score": null,
"format_version": 8
}
Size limits
| Limit | Value | Notes |
|---|---|---|
| Max commits per push | 10,000 | Rejected at wire layer |
| Max objects per push | 1,000 | Larger batches use presigned URLs |
| Max object size (inline) | 38 MB | Above this: presigned upload |
| Max msgpack file | 64 MiB | Per commit or snapshot file |
| Max blob in mpack | 256 MiB | MPack format limit |
| Max string (msgpack) | 1 MiB | Any single string value |
| Max collection entries | 1M | Array or map |
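A client could check the first three limits before building a push. The sketch below is an assumption-laden illustration: the function name plan_push is hypothetical, "38 MB" is read as decimal megabytes, and the returned structure is invented for the example.

```python
MAX_COMMITS_PER_PUSH = 10_000
MAX_OBJECTS_PER_PUSH = 1_000
MAX_INLINE_OBJECT_BYTES = 38 * 1000 * 1000  # assumes "38 MB" is decimal


def plan_push(num_commits: int, object_sizes: list) -> dict:
    """Reject over-limit commit counts; flag what needs a presigned upload."""
    if num_commits > MAX_COMMITS_PER_PUSH:
        raise ValueError(f"{num_commits} commits exceeds {MAX_COMMITS_PER_PUSH}")
    oversized = [i for i, size in enumerate(object_sizes)
                 if size > MAX_INLINE_OBJECT_BYTES]
    return {
        # More objects than fit in one push: the table says presigned URLs apply.
        "batch_needs_presigned": len(object_sizes) > MAX_OBJECTS_PER_PUSH,
        # Indexes of individual objects too large to send inline.
        "oversized_object_indexes": oversized,
    }


print(plan_push(3, [4_200, 1_100, 800]))
```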
On-disk layout
Every Muse repo is a directory containing a .muse/ subdirectory.
There is no index file, no packed-refs, no reflog by default — just the
flat object store and a handful of ref files.
.muse/
├── repo.json             # repo_id, domain, owner, created_at
├── HEAD                  # "refs/heads/dev" (symbolic ref)
├── refs/
│   └── heads/
│       ├── main          # "sha256:<64-hex>\n"
│       └── dev           # "sha256:<64-hex>\n"
├── objects/
│   └── sha256/
│       └── ab/           # first 2 hex chars (sharding)
│           └── <62-hex>  # commits, snapshots, and blobs (unified store)
├── coordination/         # multi-agent symbol reservations
│   ├── reservations/
│   ├── intents/
│   ├── releases/
│   └── heartbeats/
├── harmony/              # conflict resolution memory
│   ├── patterns/
│   ├── policies/
│   └── audit/
└── agent.md              # repo-specific agent rules
The two-level sharding on objects (sha256/ab/<62-hex>) keeps
directory sizes bounded: at one million objects, each shard directory
holds ~3,900 files on average — well within filesystem limits on all
major platforms.
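The mapping from an ID to its shard path, and the arithmetic behind the average quoted above, can be sketched as follows. The function name object_path and its default root are assumptions for illustration.

```python
from pathlib import PurePosixPath


def object_path(object_id: str, root: str = ".muse/objects") -> PurePosixPath:
    """Map 'sha256:<hex>' to <root>/sha256/<first 2 hex>/<remaining hex>."""
    algo, sep, digest = object_id.partition(":")
    if not sep or not algo or len(digest) < 3:
        raise ValueError(f"malformed object ID: {object_id!r}")
    return PurePosixPath(root) / algo / digest[:2] / digest[2:]


# Two hex characters give 16 * 16 = 256 shard directories.
shard_count = 16 ** 2
average_per_shard = 1_000_000 / shard_count  # 3906.25, the "~3,900" figure

print(object_path("sha256:" + "ab" + "0" * 62))
print(shard_count, average_per_shard)
```

Because SHA-256 output is effectively uniform, real shard sizes should stay close to that average.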
MuseHub server store
The MuseHub server uses the same on-disk format per repo under
/data/repos/. Object IDs, shard directories, and ref files are
byte-for-byte compatible with the client store — a blob stored by a push and
a blob stored locally are indistinguishable at the byte level.
The database holds only metadata caches and the collaboration layer;
authoritative repo state is always on disk.
/data/repos/<owner>/<slug>/
├── objects/
│   └── sha256/           # algorithm namespace (mldsa65/ slots in here)
│       └── ab/           # 2-char hex shard
│           └── <62-hex>  # raw blob; same layout as .muse/objects/
└── refs/
    └── heads/
        ├── main          # "sha256:<64-hex>", same format as client
        └── dev
| DB table | Role |
|---|---|
| musehub_commits | Cache: fast graph queries, search, API listing |
| musehub_snapshots | Cache: fast manifest lookups |
| musehub_branches | Cache: fast branch listing; disk ref is authoritative |
| musehub_repos | Canonical: repo metadata, visibility, owner |
| musehub_identities / musehub_auth_keys | Canonical: identity and auth |
| musehub_issues / musehub_proposals | Canonical: collaboration layer |
| musehub_objects | Canonical: storage_uri + size_bytes index for fetch path resolution |
If a branch's cached row in musehub_branches drifts from its on-disk ref, GET /repos/{owner}/{repo}/branches/{name}/repair heals it.
CLI reference
Every command accepts --json. The --json output is the
stable machine contract; the default terminal output is for humans and is not
versioned. Use muse -C ~/path/to/repo <cmd> when your working
directory differs from the target repo.
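Since --json is the stable contract, scripts can wrap the CLI in a small helper. The sketch below uses only the -C and --json flags mentioned above; the helper names are hypothetical, and it assumes each command prints a single JSON document to stdout and exits non-zero on failure.

```python
import json
import subprocess
from typing import Optional


def muse_argv(command: list, repo: Optional[str] = None) -> list:
    """Build the argument list, adding -C <repo> and --json when needed."""
    argv = ["muse"]
    if repo is not None:
        argv += ["-C", repo]
    argv += command
    if "--json" not in argv:
        argv.append("--json")
    return argv


def run_muse(command: list, repo: Optional[str] = None):
    """Run a muse command and parse its JSON output (assumed to be on stdout)."""
    result = subprocess.run(
        muse_argv(command, repo), capture_output=True, text=True, check=True
    )
    return json.loads(result.stdout)


print(muse_argv(["status"], repo="/path/to/repo"))
```

Keeping argument construction separate from execution makes the wrapper easy to test without a Muse installation.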
Core workflow
| Task | Command |
|---|---|
| Initialise repo | muse init [--domain code|midi|identity] |
| Working-tree status | muse status --json |
| Stage files | muse code add <path> / muse code add . |
| Unstage | muse code reset <path> |
| Delete + stage deletion | muse rm <path> |
| Commit | muse commit -m "msg" [--agent-id X --model-id Y --sign] |
| History | muse log --json |
| Inspect commit | muse read --json [--manifest] |
| Diff working tree | muse diff / muse diff --staged |
| Diff two refs | muse diff HEAD~3 HEAD --json |
| List branches | muse branch --json |
| Switch / create branch | muse checkout [-b] <branch> [--intent "..." --resumable] |
| Three-way merge | muse merge <branch> |
| Dry-run merge | muse merge --dry-run <branch> --json |
| Shelf (stash) | muse shelf save [-m "msg"] / muse shelf pop |
| Tag | muse tag add "label" [<ref>] |
| Release | muse release add <semver> |
muse code add . stages new files, modifications, and
deletions of already-tracked files — equivalent to git add -u && git add .
combined. To remove a file from tracking without deleting it from disk,
use muse rm --cached <path>.
muse status --json shape
The status JSON schema is always identical regardless of domain or staging state.
All keys are always present — no dict.get guards needed.
{
"branch": "dev",
"head_commit": "sha256:abc...",
"upstream": null, // tracking remote name, or null
"ahead": null, // commits ahead of remote; null when no upstream
"behind": null, // commits behind remote; null when no upstream
"clean": true, // true only when no staged, unstaged, or untracked files
"dirty": false, // always NOT clean
"total_changes": 0, // tracked-file changes (added+modified+deleted+renamed)
"untracked_count": 0, // len(untracked); nonzero when dirty but total_changes==0
"added": [], // flat union of staged + unstaged
"modified": [],
"deleted": [],
"renamed": {}, // old_path → new_path map
"staged": {
"added": [], "modified": [], "deleted": []
},
"unstaged": {
"added": [], "modified": [], "deleted": [], "renamed": {}
},
"untracked": [], // on-disk but not tracked; presence makes clean=false
"conflict_paths": [],
"merge_in_progress": false,
"merge_from": null, // branch being merged; null when no merge
"conflict_count": 0,
"checkout_interrupted": false,
"checkout_target": null
}
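Because every key is always present, a consumer can index the decoded status directly. A small sketch, with a hypothetical helper name and summary wording of its own:

```python
def describe_status(status: dict) -> str:
    """One-line summary built from keys the schema guarantees."""
    if status["merge_in_progress"]:
        return (f"merging {status['merge_from']} "
                f"with {status['conflict_count']} conflict(s)")
    if status["clean"]:
        return f"{status['branch']} is clean"
    return (f"{status['branch']}: {status['total_changes']} tracked change(s), "
            f"{status['untracked_count']} untracked")


# Only the keys this helper reads are included in the sample.
sample = {
    "branch": "dev", "clean": False, "total_changes": 0, "untracked_count": 2,
    "merge_in_progress": False, "merge_from": None, "conflict_count": 0,
}
print(describe_status(sample))
```

The sample shows the case called out in the comments above: a dirty tree where total_changes is zero and only untracked files are present.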
Push / pull
Push sends a WireMPack — a compact envelope containing every
commit, snapshot, and object the remote doesn't already have — over
application/x-msgpack to POST /{owner}/{slug}/push.
The hub validates the mpack, stores objects atomically, and advances the
branch pointer in a single transaction. Pull is the reverse: the client
fetches an mpack from the hub and integrates it locally.
class WireMPack(BaseModel):
    commits: list[WireCommit]        # CommitRecord dicts
    snapshots: list[WireSnapshot]    # SnapshotRecord dicts
    objects: list[WireObject]        # raw blob bytes
    branch_heads: dict[str, str]     # branch → commit_id

class WireObject(BaseModel):
    object_id: str
    content: bytes                   # raw; no base64
    path: str = ""
    encoding: str = "raw"            # "raw" | "zlib" | "delta+zlib"
    base_id: str | None              # set for delta-encoded objects
Push flow
The hub validates a push in this order before persisting anything:
1. Verify MSign Authorization header (Ed25519, ±30s replay window)
2. Resolve repo — owner + slug → repo_id + repo_root (/data/repos/<owner>/<slug>/)
3. Confirm pusher has write access
4. Negotiate have/want — hub checks object existence on disk, not in DB
5. Validate mpack schema + ID format (sha256:<hex>, ≥32 chars)
6. Enforce push limits (max 10k commits, 1k objects, 38 MB/object)
7. Persist objects → /data/repos/<owner>/<slug>/objects/sha256/<2-hex>/<62-hex> (atomic)
8. Persist snapshots → musehub_snapshots (DB cache)
9. Persist commits → musehub_commits (DB cache)
10. Advance branch pointer → refs/heads/<branch> on disk (atomic rename), then musehub_branches (cache)
11. Update repo.pushed_at timestamp
12. Upsert reachability index (musehub_object_refs)
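Step 1's replay window amounts to a bounded clock-skew check on the signed timestamp. The sketch below shows only that comparison; the function name is hypothetical, treating the boundary as inclusive is an assumption, and the Ed25519 verification itself is not shown.

```python
from datetime import datetime, timedelta, timezone

REPLAY_WINDOW = timedelta(seconds=30)


def within_replay_window(signed_at: datetime, now: datetime) -> bool:
    """Accept timestamps at most 30 seconds before or after the server clock."""
    return abs(now - signed_at) <= REPLAY_WINDOW


now = datetime(2026, 4, 30, 18, 22, 4, tzinfo=timezone.utc)
print(within_replay_window(now - timedelta(seconds=12), now))  # inside the window
print(within_replay_window(now - timedelta(seconds=45), now))  # too old
```

Running this check first means a stale or replayed request is rejected before any repo lookup or disk access happens.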
# Push dev branch to the local hub
muse push local dev
# Push to staging
muse push staging dev
# Pull from remote
muse pull local dev
# Check configured remotes
muse remote --json
If a push fails because the repository does not yet exist on the hub, create it with muse hub repo create --name <name> --json, then retry.