Wire fetch pseudocode — client and server (first principles)
Overview
`fetch` downloads commits, snapshots, and blobs from a remote branch into the local repo without touching the working tree or advancing the local branch pointer. It is the shared primitive that both `clone` and `pull` call into.
Client
```
fetch <remote> <branch>:

  discover remote state
    GET /refs   (no filter: returns all branch heads)
      → { branch_heads: { <name>: <commit_id>, ... } }
    remote_tip = branch_heads[<branch>] or null
    if remote_tip is null:
      exit 0   (branch doesn't exist on remote: nothing to fetch)

  discover local state
    have = all commit_ids reachable from ALL local branch heads
           (what we already have locally: dedup anchors)
    local_tip = local branch head for <branch> or null

  if local_tip == remote_tip:
    exit 0   (already up-to-date)

  compute want set
    want = { remote_tip }     (MVP: single branch tip)
    missing = want - have     (commits we don't have yet)
    if missing is empty:
      exit 0   (we already have everything the remote has)

  POST /fetch/plan { want: [remote_tip], have: [...have] }
    → {
        commits_to_send: [commit_id, ...],    # topo sorted, parents first
        snapshots_to_send: [snapshot_id, ...],
        blobs_to_send: [blob_id, ...],
        mpack_key: "sha256:...",
        mpack_size_bytes: N,
      }
    Server computes the minimal set of objects the client is missing.
    Client trusts the list but re-verifies everything on receipt.

  GET /fetch/mpack?mpack_key=<mpack_key>
    → presigned GET URL for the mpack binary in object store
      (URL may be a direct S3/MinIO presigned URL)

  GET <presigned_url>
    download mpack_bytes

  verify integrity before any writes:
    actual_key = blob_id(mpack_bytes)   # sha256("blob <size>\0" + bytes)
    if actual_key != mpack_key:
      abort: "mpack integrity check failed: fetch corrupt or tampered"

  parse mpack → { commits, snapshots, blobs }
    verify b"MUSE" magic → abort if not MUSE binary format

  write blobs to local object store:
    for each blob in mpack["blobs"]:
      actual_id = blob_id(blob["content"])
      if actual_id != blob["object_id"]:
        warn "skipped corrupt blob <blob.object_id>"
        skipped_blobs += 1
        continue
      if not has_object(local_root, blob["object_id"]):
        write_object(local_root, blob["object_id"], blob["content"])
        blobs_written += 1

  write snapshots to local snapshot store:
    for each snapshot in mpack["snapshots"] (topo order, parents first):
      if has_snapshot(local_root, snapshot["snapshot_id"]):
        continue   # idempotent
      write_snapshot(local_root, SnapshotRecord(**snapshot))
      snapshots_written += 1

  write commits to local commit store:
    for each commit in mpack["commits"] (topo order, parents first):
      if has_commit(local_root, commit["commit_id"]):
        continue   # idempotent
      write_commit(local_root, CommitRecord(**commit))
      commits_written += 1

  advance FETCH_HEAD (not the branch pointer):
    write local_root/.muse/FETCH_HEAD = remote_tip
    NOTE: fetch never touches refs/heads/<branch>;
          that is pull's job (fetch + merge)

  return {
    commits_written, snapshots_written, blobs_written,
    skipped_blobs, remote_tip, already_up_to_date: False
  }
```
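The integrity step above hinges on `blob_id`, which the pseudocode describes as `sha256("blob <size>\0" + bytes)`. A minimal Python sketch of that hash and of the pre-write checks follows. It assumes the ID is rendered as `sha256:` followed by lowercase hex (suggested by the `"sha256:..."` placeholder, but not confirmed) and that the `MUSE` magic sits at offset 0 of the mpack; `verify_mpack` is a hypothetical helper name.

```python
import hashlib


def blob_id(data: bytes) -> str:
    # Hash a "blob <size>\0" header followed by the payload.
    # Assumption: the ID is "sha256:" + lowercase hex digest.
    header = f"blob {len(data)}\0".encode("ascii")
    return "sha256:" + hashlib.sha256(header + data).hexdigest()


def verify_mpack(mpack_bytes: bytes, mpack_key: str) -> None:
    # Hypothetical helper: run both checks before anything is written locally.
    if blob_id(mpack_bytes) != mpack_key:
        raise ValueError("mpack integrity check failed")
    # Assumption: the magic bytes are the first four bytes of the mpack.
    if not mpack_bytes.startswith(b"MUSE"):
        raise ValueError("not MUSE binary format")
```

The same `blob_id` function would be reused per blob in the write loop, where a mismatch is a warning and a skip rather than an abort.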
Server
POST /fetch/plan
```
receive { want: [commit_id, ...], have: [commit_id, ...] }:

  validate auth (MSign header)

  resolve want set:
    for each want_id:
      if not exists in commits table → 404 "commit not found: <id>"

  walk commit DAG from each want_id:
    commits_to_send = BFS/DFS from want set,
                      stopping at any id in have set
                      (topo sorted: parents before children)
    NOTE: "have" set may be empty (clone case)

  collect snapshots:
    snapshots_to_send = { c.snapshot_id for c in commits_to_send }
                        minus any snapshot_id already in have's snapshot closure
    topo sort (parents before children)

  collect blobs:
    blobs_to_send = union of all object_ids referenced in each
                    snapshot_to_send's manifest
                    minus any blob_id reachable from have set
    NOTE: compute manifest by applying deltas from root
          or loading stored manifest blob

  build mpack binary:
    mpack = build_wire_mpack({
      commits:   [commit_record for c in commits_to_send],
      snapshots: [snapshot_record for s in snapshots_to_send],
      blobs:     [BlobPayload(object_id=b, content=read_object(b))
                  for b in blobs_to_send],
    })
    mpack_key = blob_id(mpack)   # sha256("blob <size>\0" + bytes)
    size_bytes = len(mpack)

  store mpack in object store (MinIO/S3):
    PUT <s3_bucket>/<mpack_key> = mpack bytes
    (reuse if already stored: mpack_key is content-addressed)

  return {
    commits_to_send:   [c.commit_id for c in commits_to_send],
    snapshots_to_send: [s.snapshot_id for s in snapshots_to_send],
    blobs_to_send:     [b for b in blobs_to_send],
    mpack_key:         mpack_key,
    mpack_size_bytes:  size_bytes,
  }
```
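The DAG walk in the plan step can be sketched as follows. This is an illustration under stated assumptions, not the server's actual code: the commits table is modelled as a plain dict mapping each commit ID to its parent IDs, and `commits_to_send` is written as a standalone function. The topological order comes from the standard library's `graphlib` (Python 3.9+).

```python
from graphlib import TopologicalSorter


def commits_to_send(parents: dict[str, list[str]],
                    want: list[str],
                    have: set[str]) -> list[str]:
    # Depth-first walk from the want set, never descending past a commit
    # the client already has.
    missing: dict[str, list[str]] = {}
    stack = [w for w in want if w not in have]
    while stack:
        cid = stack.pop()
        if cid in missing:
            continue
        # Keep only edges to parents that will also be sent.
        needed = [p for p in parents[cid] if p not in have]
        missing[cid] = needed
        stack.extend(needed)
    # TopologicalSorter takes node -> predecessors, so parents come first.
    return list(TopologicalSorter(missing).static_order())
```

With an empty `have` set (the clone case) the walk returns the full history reachable from `want`. Stopping at `have` is only minimal because the client sends the full closure of its local branch heads, as the client pseudocode specifies.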
GET /fetch/mpack
```
receive { mpack_key: "sha256:..." } (query param)

  validate auth

  verify mpack_key exists in object store
    → 404 if missing (client must call /fetch/plan first)

  generate presigned GET URL (expiry=15 min)

  return { presigned_url }
```
FETCH_HEAD semantics
```
.muse/FETCH_HEAD   (written by fetch, read by pull)

format:
  <commit_id>   (single line, the remote tip that was fetched)

pull reads FETCH_HEAD to know what to merge:
  fetched_tip = read(.muse/FETCH_HEAD)
  muse merge fetched_tip
```
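The single-line format above is simple enough to sketch directly. The helper names below are hypothetical, and writing through a temporary file followed by a rename is an added assumption (so that a reader never sees a half-written file), not something the pseudocode requires.

```python
from pathlib import Path
from typing import Optional


def write_fetch_head(local_root: Path, remote_tip: str) -> None:
    # Hypothetical helper: store the fetched tip as a single line.
    path = local_root / ".muse" / "FETCH_HEAD"
    path.parent.mkdir(parents=True, exist_ok=True)
    tmp = path.with_name("FETCH_HEAD.tmp")
    tmp.write_text(remote_tip + "\n", encoding="utf-8")
    tmp.replace(path)  # assumption: rename for an all-or-nothing update


def read_fetch_head(local_root: Path) -> Optional[str]:
    # Returns None when no fetch has been recorded (or the file is empty).
    path = local_root / ".muse" / "FETCH_HEAD"
    if not path.exists():
        return None
    return path.read_text(encoding="utf-8").strip() or None
```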
Error table
| Condition | Exit code | Message |
|---|---|---|
| Branch not on remote | 0 | "nothing to fetch" |
| Already up-to-date | 0 | "already up-to-date" |
| Integrity failure | 1 | "mpack integrity check failed" |
| Corrupt blob (skipped) | 3 (PARTIAL) | "skipped N corrupt blobs" |
| Remote commit not found | 1 | "commit not found: <id>" |
| Network error | 1 | "fetch failed: <reason>" |
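
One way to keep the CLI consistent with the table is to derive the exit code from the dict that `fetch` returns. The sketch below assumes that shape; `PARTIAL = 3` comes from the table, while the `OK` and `ERROR` names and the `exit_code_for` helper are illustrative only. Aborting failures (integrity, missing commit, network) are assumed to surface as exceptions that the caller maps to `ERROR`.

```python
from enum import IntEnum


class ExitCode(IntEnum):
    OK = 0       # includes "nothing to fetch" and "already up-to-date"
    ERROR = 1    # integrity failure, commit not found, network error
    PARTIAL = 3  # fetch finished but some blobs were skipped


def exit_code_for(result: dict) -> ExitCode:
    # Hypothetical helper: a completed fetch is PARTIAL only if blobs were skipped.
    if result.get("skipped_blobs", 0) > 0:
        return ExitCode.PARTIAL
    return ExitCode.OK
```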
Pseudocode: clone (fetch + init)
```
clone <url> <dest>:

  - mkdir <dest>
  - muse init <dest>
  - muse remote add local <url>   (or origin; TBD)
  - fetch local <default_branch>   ← calls the fetch logic above
  - checkout -b <default_branch>
  - restore working tree from FETCH_HEAD snapshot manifest:
      for each (path, object_id) in manifest:
        content = read_object(local_root, object_id)
        write file to <dest>/<path>
  - update refs/heads/<default_branch> = FETCH_HEAD
  - clear FETCH_HEAD
```
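The working-tree restore step in clone can be sketched as below. The manifest is assumed to be a mapping from relative path to object ID, and `read_object` is passed in as a callable so the example stays self-contained. The check that rejects paths escaping `<dest>` is an added safety assumption, not part of the pseudocode.

```python
from pathlib import Path
from typing import Callable, Mapping


def restore_working_tree(dest: Path,
                         manifest: Mapping[str, str],
                         read_object: Callable[[str], bytes]) -> int:
    # Write every manifest entry under dest and return the number of files written.
    root = dest.resolve()
    written = 0
    for rel_path, object_id in manifest.items():
        target = (root / rel_path).resolve()
        # Assumption: refuse entries such as "../x" that would land outside dest.
        if not target.is_relative_to(root):
            raise ValueError(f"manifest path escapes destination: {rel_path}")
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(read_object(object_id))
        written += 1
    return written
```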
Pseudocode: pull (fetch + merge)
```
pull <remote> <branch>:

  fetch <remote> <branch>   ← calls the fetch logic above
  if already_up_to_date: exit 0

  fetched_tip = read(.muse/FETCH_HEAD)

  muse merge fetched_tip   ← existing merge logic

  clear FETCH_HEAD
```
Out of scope (post-MVP)
- Shallow fetch (`--depth N`)
- Partial clone / blob filters (`--filter blob:none`)
- Fetch by tag or arbitrary ref
- Multi-remote fan-out fetch
- Delta compression in the mpack wire format
- Byte-range GET read path (`mpack://` URI + mpack_index table)
- Fetch pack splitting for packs > 512 MiB
Implementation complete — closing
All pseudocode steps are implemented and validated end-to-end.
Deviations from pseudocode (intentional improvements):
- `POST /fetch/plan` + `GET /fetch/mpack?key=` were collapsed into a single `POST /fetch/mpack` that returns the presigned GET URL inline. One less round-trip.
- `.muse/remotes/<remote>/<branch>` tracking refs are used rather than a single global `FETCH_HEAD` file. This correctly tracks per-remote/per-branch state and is what `muse pull` reads.
Bugs fixed during implementation (all shipped):
- `signer_public_key` was hardcoded to `""` in push unpack; commit hash verification failed on clone
- `directories` not read in delta format path of `_apply_snapshot_deltas`; snapshot hash mismatch for repos with tracked directories
- `build_wire_mpack`: blob content arrived garbled; fixed by decompressing before packing
- `commit_exists` filter on have anchors caused full re-push when remote had commits not in local store; removed
- `directories` not included in snapshot deltas sent in push mpack; server stored `[]`; fixed
Validated: wire-hello (16 commits), muse-zsh (17 commits). Clone, fetch, and pull all confirmed working.