PHASE 01

Foundations

Everything in Muse is content-addressed. Every object — blob, snapshot, commit — has a deterministic ID derived from its content. There is no mutable global state to coordinate: if two writers produce identical content, they produce identical IDs, and only one copy is stored. This section explains the three core record types, how the commit DAG is built, how objects are serialized and stored on disk, and how push/pull moves state between repositories.

Object store

Every byte stored by Muse lives in the object store — a content-addressed key-value store keyed by sha256:<64-hex>. The algorithm prefix is part of the ID (not decoration), making the format self-describing and algorithm-agnostic. When a new hash function is needed, old and new IDs coexist without collision.

IDs are produced by content_hash() for structured data and blob_id() for raw bytes:

python muse.core.types
def content_hash(obj: JsonValue) -> str:
    """Canonical SHA-256 ID for any JSON-serializable value.

    Canonical form: json.dumps with sort_keys=True, separators=(",",":"),
    ensure_ascii=True, UTF-8 encoded — then SHA-256.
    Returns "sha256:<64-hex>" (always 71 chars).
    """

def blob_id(data: bytes) -> str:
    """SHA-256 of raw bytes. Same "sha256:<64-hex>" format."""

The canonical JSON form — sorted keys, no whitespace — means two writers constructing the same logical object always produce the same ID, regardless of insertion order or formatting. This is what makes content-addressed merges safe: identical content is identical, provably.
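The docstrings above pin down the canonical form precisely enough to sketch both helpers. The following is a minimal illustration built only from that description; the real implementation in muse.core.types may differ in validation and typing.

```python
import hashlib
import json


def content_hash(obj) -> str:
    """Sketch: canonical SHA-256 ID for a JSON-serializable value."""
    canonical = json.dumps(
        obj,
        sort_keys=True,          # key order never affects the ID
        separators=(",", ":"),   # no insignificant whitespace
        ensure_ascii=True,       # non-ASCII is escaped, so bytes are stable
    ).encode("utf-8")
    return "sha256:" + hashlib.sha256(canonical).hexdigest()


def blob_id(data: bytes) -> str:
    """Sketch: SHA-256 ID of raw bytes, same prefixed format."""
    return "sha256:" + hashlib.sha256(data).hexdigest()


# Two writers building the same logical object in a different key order.
id_one = content_hash({"path": "src/main.py", "mode": 1})
id_two = content_hash({"mode": 1, "path": "src/main.py"})
print(id_one == id_two, len(id_one))
```

Because the algorithm name is part of the returned string, a caller can compare IDs as plain strings without knowing which hash produced them.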

Object types

Three distinct record types live in the object store:

| Type | ID source | Storage path | Notes |
| --- | --- | --- | --- |
| Blob | SHA-256 of raw bytes | objects/<algo>/<2-hex>/<62-hex> | File contents; blob <size>\0<bytes> on disk |
| Snapshot | content_hash(snapshot_dict) | objects/<algo>/<2-hex>/<62-hex> | Path → blob-ID manifest; snapshot <size>\0<json> on disk |
| Commit | content_hash(commit_dict) | objects/<algo>/<2-hex>/<62-hex> | Snapshot ID + provenance + signature; commit <size>\0<json> on disk |

Object graph

The three types form a strict DAG. Arrows point from referencing object to referenced object — blobs are always leaves:

  refs/heads/dev  ──►  CommitRecord (sha256:9e21b8...)
                            │  parent_commit_id
                            ▼
                       CommitRecord (sha256:3f8a1c...)
                            │  snapshot_id
                            ▼
                       SnapshotRecord (sha256:04ee5e...)
                            │  manifest
                       ┌────┼────────────┐
                       ▼    ▼            ▼
                     Blob  Blob  ...  Blob
                    (sha256:a1b2c3...)  (sha256:d4e5f6...)

Objects are written atomically: mkstemp → write → fsync → os.replace. A partial write can never produce a readable but corrupt object. Objects are immutable once stored: substituting different content under an existing ID would require a SHA-256 second-preimage attack.
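The mkstemp → write → fsync → os.replace sequence can be sketched with the standard library alone. The function name atomic_write and the ".tmp-" prefix are illustrative assumptions, not names from the Muse codebase, and the on-disk header framing is left out.

```python
import os
import tempfile
from pathlib import Path


def atomic_write(dest: Path, payload: bytes) -> None:
    """Write payload to dest so readers see either nothing or the full file."""
    dest.parent.mkdir(parents=True, exist_ok=True)
    # The temp file lives in the destination directory so that os.replace
    # is a same-filesystem rename.
    fd, tmp_name = tempfile.mkstemp(dir=dest.parent, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(payload)
            handle.flush()
            os.fsync(handle.fileno())  # force the bytes to stable storage
        os.replace(tmp_name, dest)     # atomic rename over the final name
    except BaseException:
        # On any failure, remove the temp file so no partial object remains.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
```

A crash before os.replace leaves at most a stray temp file; the final object name only ever points at fully written content.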

CommitRecord

A commit is the top-level record in the Muse DAG. It points to exactly one snapshot and zero, one, or two parent commits (zero for the genesis commit, two for a merge).

python muse.core.commits — CommitRecord
@dataclass
class CommitRecord:
    # Core identity
    commit_id:          str           # sha256:<64-hex>
    branch:             str
    snapshot_id:        str           # sha256:<64-hex>
    message:            str
    committed_at:       datetime.datetime

    # Graph edges
    parent_commit_id:   str | None    # None for genesis commits
    parent2_commit_id:  str | None    # Set for merge commits (two-parent)

    # Authorship
    author:             str           # handle
    metadata:           dict

    # Structured delta (populated by domain plugin's diff())
    structured_delta:   StructuredDelta | None

    # Semantic versioning
    sem_ver_bump:       Literal["none", "patch", "minor", "major"]
    breaking_changes:   list[str]

    # Agent provenance — all empty strings for human commits
    agent_id:           str           # e.g. "claude-code"
    model_id:           str           # e.g. "claude-sonnet-4-6"
    toolchain_id:       str
    prompt_hash:        str           # sha256 of system prompt

    # Ed25519 signature
    signature:          str           # "ed25519:<base64url>"
    signer_public_key:  str           # "ed25519:<base64url>"
    signer_key_id:      str           # fingerprint of signing key

    # Labels, review metadata
    reviewed_by:        list[str]
    test_runs:          int
    labels:             list[str]
    status:             str
    notes:              list[str]
    score:              float | None

Agent commits

When an agent commits, it populates agent_id and model_id and signs with its derived Ed25519 key. Empty provenance fields are themselves a signal that a human committed directly: the format encodes the distinction structurally rather than by convention.

bash
# Agent commit — full provenance chain
muse commit -m "feat: add rate limiting" \
  --agent-id claude-code \
  --model-id claude-sonnet-4-6 \
  --sign

# Human commit — no provenance flags
muse commit -m "chore: update config"

json muse read --json
{
  "commit_id":         "sha256:6ab243df7bdb...",
  "branch":            "dev",
  "snapshot_id":       "sha256:04ee5ecd07ec...",
  "message":           "feat: add rate limiting",
  "committed_at":      "2026-04-21T23:00:00Z",
  "parent_commit_id":  "sha256:4c9e7959...",
  "parent2_commit_id": null,
  "author":            "gabriel",
  "agent_id":          "claude-code",
  "model_id":          "claude-sonnet-4-6",
  "signature":         "ed25519:AAAA...",
  "signer_public_key": "ed25519:BBBB...",
  "sem_ver_bump":      "minor",
  "labels":            ["reviewed"],
  "score":             0.94
}

SnapshotManifest

A snapshot is an immutable mapping of every tracked path to its blob ID at that point in time. It is the complete working tree — not a delta, not a patch. Diffing two snapshots is an O(|paths|) set operation with no chain of deltas to traverse.

python muse.core.snapshots — SnapshotRecord
@dataclass
class SnapshotRecord:
    snapshot_id:    str               # sha256:<64-hex>
    manifest:       dict[str, str]    # path → "sha256:<64-hex>"
    directories:    list[str]         # explicit empty directories
    created_at:     datetime.datetime
    note:           str
    schema_version: int

json muse read --json --manifest (excerpt)
{
  "schema_version": 1,
  "snapshot_id": "sha256:04ee5ecd...",
  "manifest": {
    "musehub/main.py":                  "sha256:a1b2c3...",
    "musehub/graph/dag.py":             "sha256:d4e5f6...",
    "musehub/graph/push_validator.py":  "sha256:7890ab..."
  },
  "directories": [],
  "created_at": "2026-04-21T23:00:00Z",
  "note": ""
}

Because a snapshot is just a flat map, any two snapshots can be diffed instantly: paths present in both with the same blob ID are unchanged; paths with different IDs are modified; paths only in one are added or removed. Domain plugins receive this pair as their base and target in diff(), then interpret the semantic meaning.
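The three-way classification described above reduces to set operations on the two manifests. This is a minimal sketch; diff_manifests is a hypothetical helper name and the shortened IDs are placeholders, not real object IDs.

```python
def diff_manifests(base: dict[str, str], target: dict[str, str]) -> dict[str, list[str]]:
    """Classify paths by comparing two path -> blob-ID manifests."""
    base_paths, target_paths = set(base), set(target)
    return {
        "added": sorted(target_paths - base_paths),
        "removed": sorted(base_paths - target_paths),
        "modified": sorted(
            path for path in base_paths & target_paths if base[path] != target[path]
        ),
    }


# Placeholder IDs for illustration only.
base = {"src/main.py": "sha256:aaa", "src/config.py": "sha256:bbb", "old.txt": "sha256:ccc"}
target = {"src/main.py": "sha256:aaa", "src/config.py": "sha256:ddd", "src/rate_limiter.py": "sha256:eee"}
changes = diff_manifests(base, target)
print(changes)
```

No blob content is read at any point: equal IDs mean equal bytes, so the comparison cost depends only on the number of paths.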

muse read --json returns commit metadata and file-level changes (files_added, files_modified, files_removed) but not the full manifest. Add --manifest to get the complete path → object_id map for every tracked file.

Branch DAG

Branches are mutable named pointers to commit IDs, stored in .muse/refs/heads/<branch>. The commit graph is an immutable DAG; the branch pointer advances atomically when you commit or merge.

Merge commits have two parents: parent_commit_id (the tip of the branch being merged into) and parent2_commit_id (the tip of the branch being merged from). The three-way merge base is the lowest common ancestor of the two tips, found by walking both parent chains backward.

bash branch lifecycle
# Create and switch in one command, with intent metadata
muse checkout -b task/rate-limiting \
  --intent "implement token bucket rate limiter" \
  --resumable

# Work, stage, commit
muse code add src/rate_limiter.py
muse commit -m "feat: token bucket rate limiter" \
  --agent-id claude-code --model-id claude-sonnet-4-6 --sign

# Merge back
muse checkout dev
muse merge task/rate-limiting     # three-way, harmony auto-resolves known conflicts
muse branch -d task/rate-limiting
muse push local dev

Branch flow

| Branch | Role | Rule |
| --- | --- | --- |
| main | Production | Tagged releases only; never direct-pushed |
| dev | Integration | Latest deliverable state |
| feat/* | Feature | Short-lived; one atomic task; hours not days |
| task/* | Agent task | Same as feat/*; carries --intent and --resumable |
| bugfix/* | Bug fix | From dev; merges into dev |
| hotfix/* | Hot fix | From main; merges into main AND dev |

Resumable branches

Branches carry --intent (a free-text description of the task) and --resumable (a boolean signal that another agent may safely pick this up mid-flight). Both are stored in branch metadata, readable via muse branch --json, and surfaced in the coordination bus so orchestrators can assign in-progress work to idle agents.

json muse branch --json (excerpt)
[
  {
    "name":       "task/rate-limiting",
    "current":    false,
    "intent":     "implement token bucket rate limiter",
    "resumable":  true,
    "created_by": "claude-code",
    "commit_id":  "sha256:6ab243..."
  }
]
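An orchestrator could pick up in-progress work by filtering this output. The sketch below parses a JSON string shaped like the excerpt above; resumable_branches is a hypothetical helper, and how the string is obtained (for example by running muse branch --json) is left to the caller.

```python
import json


def resumable_branches(branch_json: str) -> list[dict]:
    """Return the branch entries flagged as safe for another agent to resume."""
    branches = json.loads(branch_json)
    return [entry for entry in branches if entry.get("resumable") is True]


# Sample shaped like the excerpt above, plus one non-resumable branch.
sample = json.dumps([
    {"name": "task/rate-limiting", "current": False,
     "intent": "implement token bucket rate limiter", "resumable": True},
    {"name": "dev", "current": True, "intent": "", "resumable": False},
])
for entry in resumable_branches(sample):
    print(entry["name"], "-", entry["intent"])
```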

Round-trip walkthrough

This walkthrough traces a single file change from working tree all the way to a remote and back, showing the sha256: IDs at each step so you can see how objects, snapshots, commits, and branch refs compose.

Step 1 — stage

muse code add hashes each file into the object store and records the path → blob-ID mapping in the staging area.

bash
muse code add src/rate_limiter.py
staged  src/rate_limiter.py  sha256:f3a7b92c4de1…

The blob sha256:f3a7b92c4de1… is written to .muse/objects/sha256/f3/a7b92c4de1… immediately — it exists in the object store whether or not you ever commit.

Step 2 — commit

muse commit builds a SnapshotRecord from the staging area, hashes it to get a snapshot ID, then builds a CommitRecord referencing that snapshot and the current branch tip as parent.

bash
muse commit -m "feat: token bucket rate limiter" \
  --agent-id claude-code --model-id claude-sonnet-4-6 --sign
committed  sha256:9e21b8a4f273…
  snapshot   sha256:c8d5e1f09ab3…
  parent     sha256:4c9e7959beef…
  branch     task/rate-limiting
  signed     ed25519:AAAA…

Three objects now back this commit on disk. The blob was already written in Step 1; the snapshot and commit are new:

| ID | Type | Contents |
| --- | --- | --- |
| sha256:f3a7b92c… | Blob | Raw bytes of src/rate_limiter.py |
| sha256:c8d5e1f0… | Snapshot | Full path → blob-ID map for the entire working tree |
| sha256:9e21b8a4… | Commit | Message, author, agent provenance, snapshot_id, parent_commit_id, signature |

The branch ref .muse/refs/heads/task/rate-limiting is updated atomically to sha256:9e21b8a4f273….

Step 3 — push

muse push computes the set of objects the remote does not have, packs them into an MPack, and POSTs to the hub.

bash
muse push local task/rate-limiting
Pushing task/rate-limiting → local
  3 objects  (1 blob, 1 snapshot, 1 commit)
   sha256:f3a7b92c…  blob       4.2 kB
   sha256:c8d5e1f0…  snapshot   1.1 kB
   sha256:9e21b8a4…  commit     0.8 kB
  branch task/rate-limiting → sha256:9e21b8a4…
 pushed in 142 ms

The hub stores all three objects, verifies the Ed25519 signature, and advances the branch pointer in a single transaction.

Step 4 — pull

A second agent (or a second machine) pulls the branch. Muse fetches only the objects it doesn't already have — content-addressability makes deduplication trivial.

bash
muse pull local task/rate-limiting
Fetching task/rate-limiting from local
  3 new objects
   sha256:f3a7b92c…  blob
   sha256:c8d5e1f0…  snapshot
   sha256:9e21b8a4…  commit
  branch task/rate-limiting → sha256:9e21b8a4…
 fast-forward, working tree updated

Because every ID is derived from content, the pull output IDs are byte-for-byte identical to the push output. There is no translation, no rebase, no rewriting — the same objects are present on both sides.

Inspecting the result

bash
muse log --json
{
  "truncated": false,
  "commits": [
    {
      "commit_id":        "sha256:9e21b8a4f273…",
      "message":         "feat: token bucket rate limiter",
      "committed_at":    "2026-04-30T18:22:04Z",
      "author":          "gabriel",
      "agent_id":        "claude-code",
      "model_id":        "claude-sonnet-4-6",
      "parent_commit_id":"sha256:4c9e7959…",
      "snapshot_id":     "sha256:c8d5e1f0…"
    },
    …earlier commits…
  ]
}

bash
muse read --json --manifest
{
  "commit_id":     "sha256:9e21b8a4f273…",
  "snapshot_id":   "sha256:c8d5e1f09ab3…",
  "files_added":   ["src/rate_limiter.py"],
  "files_modified":[],
  "files_removed": [],
  "manifest": {
    "src/rate_limiter.py":   "sha256:f3a7b92c4de1…",
    "src/main.py":           "sha256:a1b2c3d4e5f6…",
    "src/config.py":         "sha256:789abc012def…"
  }
}

bash
muse diff --staged
+++ src/rate_limiter.py  (new file, sha256:f3a7b92c…)
+ class TokenBucket:
+     def __init__(self, rate: float, burst: int) -> None:
+         self.rate  = rate
+         self.burst = burst
+         self._tokens = burst
+         self._last   = time.monotonic()
+
+     def consume(self, n: int = 1) -> bool:
+

.museignore

.museignore is a TOML file that tells Muse which files to exclude from tracking. It has three section types:

| Section | Applied when |
| --- | --- |
| [global] | All domains, always |
| [domain.<name>] | Only when the active domain is <name> |
| [force_track] | Whitelist: exact paths that bypass all ignore rules |

toml .museignore
[global]
patterns = [
    ".DS_Store",
    "Thumbs.db",
    "*.tmp",
    "*.swp",
]

[domain.code]
patterns = [
    "__pycache__/",
    "*.pyc",
    ".venv/",
    "dist/",
    "build/",
    "*.egg-info/",
    "node_modules/",
]

# [force_track]
# Exact repo-relative paths to track even if they match a secrets pattern.
# paths = [
#     "deploy/local-tls/localhost.key",
# ]

Pattern syntax

| Pattern | Matches |
| --- | --- |
| *.pyc | Any .pyc file at any depth |
| __pycache__/ | Any directory named __pycache__ (trailing / = directory) |
| /dist/ | Only dist/ at the repo root (leading / = anchored) |
| !important.tmp | Un-ignore a previously matched path (leading ! = negate) |
| src/*.min.js | Minified JS files directly inside src/ (* excludes /) |
| tests/fixtures/** | All contents of tests/fixtures/ recursively (** includes /) |

Patterns are evaluated in order — global first, then domain-specific. The last matching rule wins, mirroring gitignore semantics.
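To make the last-match-wins rule concrete, here is a simplified matcher covering only the pattern forms listed in the table above. It is an illustrative sketch, not Muse's matcher: pattern_to_regex and is_ignored are hypothetical names, patterns containing an interior slash are assumed to be root-anchored as in gitignore, and neither [force_track] nor the secrets blocklist is modelled.

```python
import re


def pattern_to_regex(pattern: str) -> re.Pattern[str]:
    """Translate one ignore pattern (without a leading '!') to a regex."""
    anchored = pattern.startswith("/")
    dir_only = pattern.endswith("/")
    core = pattern.strip("/")
    if "/" in core:
        anchored = True  # assumption: interior slash anchors at the repo root
    parts = []
    i = 0
    while i < len(core):
        if core.startswith("**", i):
            parts.append(".*")      # ** may cross directory separators
            i += 2
        elif core[i] == "*":
            parts.append("[^/]*")   # * stays within one path segment
            i += 1
        else:
            parts.append(re.escape(core[i]))
            i += 1
    prefix = "^" if anchored else "^(?:.*/)?"
    # Directory patterns match files beneath them; others match the path
    # itself or anything below it.
    suffix = "/.*$" if dir_only else "(?:/.*)?$"
    return re.compile(prefix + "".join(parts) + suffix)


def is_ignored(path: str, patterns: list[str]) -> bool:
    """Evaluate patterns in order; the last one that matches decides."""
    ignored = False
    for raw in patterns:
        negate = raw.startswith("!")
        if pattern_to_regex(raw[1:] if negate else raw).match(path):
            ignored = not negate
    return ignored


rules = ["*.tmp", "__pycache__/", "/dist/", "src/*.min.js",
         "tests/fixtures/**", "!important.tmp"]
print(is_ignored("build/cache.tmp", rules), is_ignored("important.tmp", rules))
```

Moving "!important.tmp" ahead of "*.tmp" would flip the second result, which is the practical consequence of ordered evaluation.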

[force_track] — override the secrets blocklist

Muse automatically blocks certain file types from tracking (e.g. *.key, *.pem, .env). The [force_track] section lists exact repo-relative paths (no globs) that must be tracked regardless. Use it for dev infrastructure that would otherwise be blocked.

toml .museignore — force_track
[force_track]
paths = [
    "deploy/local-tls/localhost.key",
    "deploy/local-tls/localhost.crt",
]

muse check-ignore <path> --json tells you whether a given path is ignored and which rule matched. Use it when muse status shows a file as untracked and you want to understand why.

Serialization

Commits and snapshots are serialized with msgpack, not JSON. On real-world code repositories, msgpack encodes and decodes 3–6× faster than JSON and produces smaller files. The coordination bus uses JSON (infrequent, small payloads) and the MCP wire uses JSON for tool calls, but the core object store is msgpack throughout.

| Data | Format | Reason |
| --- | --- | --- |
| Commits / snapshots | msgpack | 3–6× faster; binary-safe for blob content |
| Objects (blobs) | raw bytes | No encoding overhead |
| Coordination records | JSON | Infrequent; human-readable debugging |
| Harmony patterns | JSON | Infrequent; inspectable |
| Wire push (HTTP) | msgpack | application/x-msgpack |
| MCP tool calls | JSON | MCP protocol requirement |

msgpack CommitRecord — decoded

Commits are stored as msgpack binary files. The JSON below is the decoded equivalent — every field in the msgpack maps 1:1 to the CommitRecord dataclass. The sha256: prefix is stored as a plain string; it is never stripped.

bash
python3 -c "import msgpack,json,sys; d=msgpack.unpackb(open(sys.argv[1],'rb').read(),raw=False); print(json.dumps(d,indent=2))" \
  .muse/objects/sha256/9e/21b8a4f273…
{
  "commit_id":          "sha256:9e21b8a4f273…",
  "repo_id":             "sha256:0000genesis…",
  "branch":              "task/rate-limiting",
  "snapshot_id":         "sha256:c8d5e1f09ab3…",
  "message":             "feat: token bucket rate limiter",
  "committed_at":        "2026-04-30T18:22:04Z",
  "parent_commit_id":    "sha256:4c9e7959…",
  "parent2_commit_id":   null,
  "author":              "gabriel",
  "agent_id":            "claude-code",
  "model_id":            "claude-sonnet-4-6",
  "toolchain_id":        "",
  "prompt_hash":         "",
  "signature":           "ed25519:AAAA…",
  "signer_public_key":   "ed25519:BBBB…",
  "signer_key_id":       "sha256:CCCC…",
  "sem_ver_bump":         "minor",
  "breaking_changes":    [],
  "structured_delta":    null,
  "reviewed_by":         [],
  "test_runs":           0,
  "labels":              [],
  "status":              "",
  "notes":               [],
  "score":               null,
  "format_version":      8
}

Size limits

| Limit | Value | Notes |
| --- | --- | --- |
| Max commits per push | 10,000 | Rejected at wire layer |
| Max objects per push | 1,000 | Larger batches use presigned URLs |
| Max object size (inline) | 38 MB | Above this: presigned upload |
| Max msgpack file | 64 MiB | Per commit or snapshot file |
| Max blob in mpack | 256 MiB | MPack format limit |
| Max string (msgpack) | 1 MiB | Any single string value |
| Max collection entries | 1M | Array or map |
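The first three limits imply a routing decision before a push is sent. The sketch below shows one possible client-side pre-check; plan_push and its return values are hypothetical, and "38 MB" is assumed here to mean 38 × 10^6 bytes, which the table does not specify.

```python
MAX_COMMITS_PER_PUSH = 10_000
MAX_OBJECTS_PER_PUSH = 1_000
# Assumption: MB is read as 10**6 bytes; the source does not say MB vs MiB.
MAX_INLINE_OBJECT_BYTES = 38 * 1000 * 1000


def plan_push(commit_count: int, object_sizes: list[int]) -> str:
    """Return 'reject', 'presigned', or 'inline' for a prospective push."""
    if commit_count > MAX_COMMITS_PER_PUSH:
        return "reject"      # too many commits is refused at the wire layer
    if len(object_sizes) > MAX_OBJECTS_PER_PUSH:
        return "presigned"   # large batches go through presigned URLs
    if any(size > MAX_INLINE_OBJECT_BYTES for size in object_sizes):
        return "presigned"   # oversized objects use presigned upload
    return "inline"


# Sizes in bytes, roughly matching the walkthrough push (4.2 kB, 1.1 kB, 0.8 kB).
print(plan_push(1, [4200, 1100, 800]))
```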

On-disk layout

Every Muse repo is a directory containing a .muse/ subdirectory. There is no index file, no packed-refs, no reflog by default — just the flat object store and a handful of ref files.

text .muse/ directory tree
.muse/
├── repo.json                       # repo_id, domain, owner, created_at
├── HEAD                            # "refs/heads/dev" (symbolic ref)
├── refs/
│   └── heads/
│       ├── main                    # "sha256:<64-hex>\n"
│       └── dev                     # "sha256:<64-hex>\n"
├── objects/
│   └── sha256/
│       └── ab/                     # first 2 hex chars (sharding)
│           └── <62-hex>           # commits, snapshots, and blobs — unified store
├── coordination/                   # multi-agent symbol reservations
│   ├── reservations/
│   ├── intents/
│   ├── releases/
│   └── heartbeats/
├── harmony/                        # conflict resolution memory
│   ├── patterns/
│   ├── policies/
│   └── audit/
└── agent.md                        # repo-specific agent rules

The two-level sharding on objects (sha256/ab/<62-hex>) keeps directory sizes manageable: the 2-hex prefix yields 256 shard directories, so at one million objects each holds ~3,900 files on average, well within filesystem limits on all major platforms.
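The mapping from an object ID to its shard path, and the per-shard average quoted above, can be worked through in a few lines. object_path is a hypothetical helper name; the layout follows the directory tree shown earlier.

```python
from pathlib import Path


def object_path(store_root: Path, object_id: str) -> Path:
    """Map 'algo:<hex>' to <root>/objects/<algo>/<first 2 hex>/<remaining hex>."""
    algo, sep, digest = object_id.partition(":")
    if not sep or not algo or len(digest) < 3:
        raise ValueError(f"not a prefixed object ID: {object_id!r}")
    return store_root / "objects" / algo / digest[:2] / digest[2:]


# Two hex characters give 16 * 16 = 256 possible shard directories.
SHARD_COUNT = 16 ** 2
average_per_shard = 1_000_000 / SHARD_COUNT

example_id = "sha256:" + "ab" + "c" * 62  # synthetic ID for illustration
print(object_path(Path(".muse"), example_id))
print(average_per_shard)
```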

MuseHub server store

The MuseHub server uses the same on-disk format per repo under /data/repos/. Object IDs, shard directories, and ref files are byte-for-byte compatible with the client store — a blob stored by a push and a blob stored locally are indistinguishable at the byte level. The database holds only metadata caches and the collaboration layer; authoritative repo state is always on disk.

text Server-side per-repo tree
/data/repos/<owner>/<slug>/
├── objects/
│   └── sha256/                     # algorithm namespace (mldsa65/ slots in here)
│       └── ab/                     # 2-char hex shard
│           └── <62-hex>           # raw blob — same layout as .muse/objects/
└── refs/
    └── heads/
        ├── main                    # "sha256:<64-hex>" — same format as client
        └── dev

| DB table | Role |
| --- | --- |
| musehub_commits | Cache: fast graph queries, search, API listing |
| musehub_snapshots | Cache: fast manifest lookups |
| musehub_branches | Cache: fast branch listing; disk ref is authoritative |
| musehub_repos | Canonical: repo metadata, visibility, owner |
| musehub_identities / musehub_auth_keys | Canonical: identity and auth |
| musehub_issues / musehub_proposals | Canonical: collaboration layer |
| musehub_objects | Canonical: storage_uri + size_bytes index for fetch path resolution |

Push negotiation checks object existence directly on disk, never in the DB. This means force-resign, migration, or partial push failures cannot corrupt the have/want walk. If the DB cache drifts from disk, GET /repos/{owner}/{repo}/branches/{name}/repair heals it.
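A disk-only have/want check can be as small as a file-existence test per offered ID. The sketch below assumes the per-repo layout shown in the server-side tree; missing_objects is a hypothetical name, and the hub's real negotiation may batch or cache these lookups differently.

```python
from pathlib import Path


def missing_objects(repo_root: Path, offered_ids: list[str]) -> list[str]:
    """Return the offered IDs that have no object file on disk (the 'want' set)."""
    wanted = []
    for object_id in offered_ids:
        algo, _, digest = object_id.partition(":")
        candidate = repo_root / "objects" / algo / digest[:2] / digest[2:]
        if not candidate.is_file():
            wanted.append(object_id)
    return wanted
```

Because the answer comes from the filesystem, a stale or partially updated database row cannot make the hub claim an object it does not hold.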

CLI reference

Every command accepts --json. The --json output is the stable machine contract; the default terminal output is for humans and is not versioned. Use muse -C ~/path/to/repo <cmd> when your working directory differs from the target repo.

Core workflow

| Task | Command |
| --- | --- |
| Initialise repo | muse init [--domain code\|midi\|identity] |
| Working-tree status | muse status --json |
| Stage files | muse code add <path> / muse code add . |
| Unstage | muse code reset <path> |
| Delete + stage deletion | muse rm <path> |
| Commit | muse commit -m "msg" [--agent-id X --model-id Y --sign] |
| History | muse log --json |
| Inspect commit | muse read --json [--manifest] |
| Diff working tree | muse diff / muse diff --staged |
| Diff two refs | muse diff HEAD~3 HEAD --json |
| List branches | muse branch --json |
| Switch / create branch | muse checkout [-b] <branch> [--intent "..." --resumable] |
| Three-way merge | muse merge <branch> |
| Dry-run merge | muse merge --dry-run <branch> --json |
| Shelf (stash) | muse shelf save [-m "msg"] / muse shelf pop |
| Tag | muse tag add "label" [<ref>] |
| Release | muse release add <semver> |

muse code add . stages new files, modifications, and deletions of already-tracked files, equivalent to running git add -u and git add . together. To remove a file from tracking without deleting it from disk, use muse rm --cached <path>.

muse status --json shape

The status JSON schema is always identical regardless of domain or staging state. All keys are always present — no dict.get guards needed.

json
{
  "branch":                    "dev",
  "head_commit":               "sha256:abc...",
  "upstream":                  null,          // tracking remote name, or null
  "ahead":                     null,          // commits ahead of remote; null when no upstream
  "behind":                    null,          // commits behind remote; null when no upstream
  "clean":                     true,          // true only when no staged, unstaged, or untracked files
  "dirty":                     false,         // always NOT clean
  "total_changes":             0,             // tracked-file changes (added+modified+deleted+renamed)
  "untracked_count":           0,             // len(untracked); nonzero when dirty but total_changes==0
  "added":                     [],            // flat union of staged + unstaged
  "modified":                  [],
  "deleted":                   [],
  "renamed":                   {},            // old_path → new_path map
  "staged": {
    "added": [], "modified": [], "deleted": []
  },
  "unstaged": {
    "added": [], "modified": [], "deleted": [], "renamed": {}
  },
  "untracked":                 [],            // on-disk but not tracked; presence makes clean=false
  "conflict_paths":            [],
  "merge_in_progress":         false,
  "merge_from":                null,          // branch being merged; null when no merge
  "conflict_count":            0,
  "checkout_interrupted":      false,
  "checkout_target":           null
}
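The inline comments above state a few invariants that a consumer can check before trusting a status payload. The sketch below encodes three of them; status_invariant_violations is a hypothetical helper, and it indexes keys directly because the schema guarantees they are always present.

```python
def status_invariant_violations(status: dict) -> list[str]:
    """Check a parsed status payload against the documented invariants."""
    problems = []
    if status["dirty"] == status["clean"]:
        problems.append("dirty must be the negation of clean")
    if status["untracked_count"] != len(status["untracked"]):
        problems.append("untracked_count must equal len(untracked)")
    has_work = (
        any(status["staged"].values())
        or any(status["unstaged"].values())
        or bool(status["untracked"])
    )
    if status["clean"] and has_work:
        problems.append("clean is true but staged, unstaged, or untracked files exist")
    return problems


# Only the keys the checker reads are shown here.
clean_status = {
    "clean": True, "dirty": False,
    "untracked_count": 0, "untracked": [],
    "staged": {"added": [], "modified": [], "deleted": []},
    "unstaged": {"added": [], "modified": [], "deleted": [], "renamed": {}},
}
print(status_invariant_violations(clean_status))
```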

Push / pull

Push sends a WireMPack — a compact envelope containing every commit, snapshot, and object the remote doesn't already have — over application/x-msgpack to POST /{owner}/{slug}/push. The hub validates the mpack, stores objects atomically, and advances the branch pointer in a single transaction. Pull is the reverse: the client fetches an mpack from the hub and integrates it locally.

python WireMPack shape (musehub.models.wire)
class WireMPack(BaseModel):
    commits:      list[WireCommit]    # CommitRecord dicts
    snapshots:    list[WireSnapshot]  # SnapshotRecord dicts
    objects:      list[WireObject]   # raw blob bytes
    branch_heads: dict[str, str]      # branch → commit_id

class WireObject(BaseModel):
    object_id: str
    content:   bytes                  # raw; no base64
    path:      str = ""
    encoding:  str = "raw"           # "raw" | "zlib" | "delta+zlib"
    base_id:   str | None = None      # set only for delta-encoded objects

Push flow

The hub processes a push in this order. Steps 1–6 validate the request; nothing is persisted until they pass:

text
1.  Verify MSign Authorization header (Ed25519, ±30s replay window)
2.  Resolve repo — owner + slug → repo_id + repo_root (/data/repos/<owner>/<slug>/)
3.  Confirm pusher has write access
4.  Negotiate have/want — hub checks object existence on disk, not in DB
5.  Validate mpack schema + ID format (sha256:<hex>, ≥32 chars)
6.  Enforce push limits (max 10k commits, 1k objects, 38 MB/object)
7.  Persist objects → /data/repos/<owner>/<slug>/objects/sha256/<2-hex>/<62-hex> (atomic)
8.  Persist snapshots → musehub_snapshots (DB cache)
9.  Persist commits → musehub_commits (DB cache)
10. Advance branch pointer → refs/heads/<branch> on disk (atomic rename), then musehub_branches (cache)
11. Update repo.pushed_at timestamp
12. Upsert reachability index (musehub_object_refs)

bash
# Push dev branch to the local hub
muse push local dev

# Push to staging
muse push staging dev

# Pull from remote
muse pull local dev

# Check configured remotes
muse remote --json

Push returns 404 ("Repository not found on remote") if the repo hasn't been created on the hub yet. Create it first via muse hub repo create --name <name> --json, then retry.