Remove repo_id from CommitRecord and resolve_commit_ref — client/server parity
Context
musehub removed repo_id from musehub_commits and musehub_snapshots in May 2026 — commits and snapshots are globally content-addressed objects; per-repo membership is tracked via MusehubCommitRef and MusehubSnapshotRef. The muse CLI still carries repo_id in CommitRecord and passes it as a dead parameter through ~50 commands. This ticket closes that gap.
Key invariants confirmed before planning:
- compute_commit_id does NOT include repo_id: no existing commit hashes are invalidated
- resolve_commit_ref takes repo_id but never uses it: it is a dead parameter
- musehub already does d.get("repo_id", "") on wire commits, which handles absence today
- TagRecord and ReleaseRecord keep their repo_id: tag IDs hash with it and filesystem sharding uses it; those are correctly repo-scoped
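To illustrate the first invariant, here is a minimal sketch of a content-addressed commit ID. The function body and the field set (parent_ids, snapshot_id, message, committed_at) are assumptions made for illustration; the ticket only states that the real compute_commit_id leaves repo_id out of the hash.

```python
import hashlib
import json


def compute_commit_id(parent_ids, snapshot_id, message, committed_at):
    """Hypothetical stand-in: hashes commit content only, never repo_id."""
    payload = json.dumps(
        {
            "parent_ids": list(parent_ids),
            "snapshot_id": snapshot_id,
            "message": message,
            "committed_at": committed_at,
        },
        sort_keys=True,
        separators=(",", ":"),
    ).encode("utf-8")
    return "sha256:" + hashlib.sha256(payload).hexdigest()


# The same content yields the same ID no matter which repo holds the commit.
id_in_repo_a = compute_commit_id([], "snap-1", "initial", "2026-05-01T00:00:00Z")
id_in_repo_b = compute_commit_id([], "snap-1", "initial", "2026-05-01T00:00:00Z")
```

Because repo_id never enters the hashed payload, dropping the field from CommitRecord cannot change any existing commit ID.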
Phase 1 — Decouple repo_id from CommitRecord (forward/backward compat)
Scope: muse/core/store.py — CommitRecord, CommitDict, from_msgpack, from_dict, to_dict
- Give repo_id a default of "" on CommitRecord so it is optional at construction
- to_dict(): stop emitting repo_id; musehub already handles its absence
- from_msgpack(): keep reading it (silently discard) so old on-disk files parse without error
- No on-disk rewrite. No wire breakage.
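A rough sketch of the Phase 1 shape, under stated assumptions: commit_id and message are illustrative field names, and from_dict stands in for from_msgpack so the example needs no third-party decoder. It is not the actual store.py code.

```python
from dataclasses import dataclass


@dataclass
class CommitRecord:
    # commit_id and message are illustrative; only repo_id and
    # committed_at are named in the ticket.
    commit_id: str
    message: str
    committed_at: str
    repo_id: str = ""  # Phase 1: optional, so callers may omit it

    def to_dict(self) -> dict:
        # repo_id is deliberately not emitted.
        return {
            "commit_id": self.commit_id,
            "message": self.message,
            "committed_at": self.committed_at,
        }

    @classmethod
    def from_dict(cls, d: dict) -> "CommitRecord":
        # A stored repo_id is read and dropped, so old records still parse.
        _ = d.get("repo_id", "")
        return cls(
            commit_id=d["commit_id"],
            message=d["message"],
            committed_at=d["committed_at"],
        )


old_record = {
    "commit_id": "c1",
    "message": "initial",
    "committed_at": "2026-05-01T00:00:00Z",
    "repo_id": "repo-a",
}
round_tripped = CommitRecord.from_dict(old_record).to_dict()
```

An old record that still carries repo_id loads cleanly, and writing it back produces a dict without the key.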
Phase 2 — Remove repo_id from resolve_commit_ref
Scope: muse/core/store.py::resolve_commit_ref + all ~50 command callers
- Drop repo_id: str parameter from resolve_commit_ref; unused in the function body
- Remove repo_id argument from every call site in muse/cli/commands/
- Commands affected: age, agent_map, annotate, api_surface, apply_patch, archive, bisect, blame, blast_risk, branch, breakage, bundle, cadence, cat, check, checkout, checkout_symbol, cherry_pick, clones, codemap, compare, content_grep, contour, contract, core_blame, core_cat, coupling, and ~25 more
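The following sketch shows why the parameter is safe to drop: ref resolution, including tilde notation, needs only the branch and the ref. PARENTS and BRANCH_HEADS are hypothetical in-memory stand-ins for the on-disk store, and the real function also takes repo_root, which is omitted here.

```python
from typing import Optional

# Hypothetical in-memory stand-ins for the on-disk store.
PARENTS = {"c3": "c2", "c2": "c1", "c1": None}
BRANCH_HEADS = {"main": "c3"}


def resolve_commit_ref(branch: str, ref: Optional[str]) -> Optional[str]:
    """Resolve a ref to a commit ID; no repo_id is needed at any step."""
    if ref is None or ref == "HEAD":
        return BRANCH_HEADS.get(branch)
    if "~" in ref:
        base, _, steps = ref.partition("~")
        # Recursive call for the base ref. In the old signature this call
        # was the only place repo_id appeared: it was merely forwarded.
        commit_id = resolve_commit_ref(branch, base or None)
        for _ in range(int(steps or "1")):
            if commit_id is None:
                return None
            commit_id = PARENTS.get(commit_id)
        return commit_id
    return ref if ref in PARENTS else None
```

With this stand-in, resolve_commit_ref("main", "HEAD~2") walks two parents back from the branch head.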
Phase 3 — Remove repo_id from commit creation sites
Scope: muse/cli/commands/commit.py, commit_tree.py, bridge.py, test helpers
- commit.py: stop reading repo_id = read_repo_id(root) when its only use was CommitRecord(repo_id=repo_id, ...)
- commit_tree.py: same
- bridge.py: cr.repo_id used in two places; replace with read_repo_id(root) directly where the git bridge still needs it
- muse/core/mpack.py: repo_id param on build_mpack is for tag lookup only, not embedded in commits; keep that param, just stop reading it from CommitRecord
Phase 4 — Delete repo_id from CommitRecord and CommitDict entirely
Scope: muse/core/store.py
- Remove repo_id: str field from CommitRecord dataclass
- Remove repo_id: str from CommitDict TypedDict
- Remove repo_id=self.repo_id from to_dict()
- Keep from_msgpack silently ignoring repo_id if present; this compat shim stays permanently
Gate: Phases 1-3 must be complete and all tests green before this runs.
Phase 5 — Prune orphaned read_repo_id imports
Scope: ~20 command files
- After Phases 2 and 3, commands whose only use of read_repo_id was for commit resolution no longer need it
- Commands still using read_repo_id for tags and releases keep it
- Grep each file post-Phase-3 for remaining repo_id usage to determine which imports to drop
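As an alternative to a plain grep, the orphan check could be sketched with the standard ast module. The helper name import_is_orphaned and the two sample sources are hypothetical; only the import path muse.core.repo and the name read_repo_id come from this ticket.

```python
import ast


def import_is_orphaned(source: str, name: str = "read_repo_id") -> bool:
    """Return True if `name` is imported but never referenced afterwards."""
    tree = ast.parse(source)
    imported = any(
        isinstance(node, ast.ImportFrom)
        and any((alias.asname or alias.name) == name for alias in node.names)
        for node in ast.walk(tree)
    )
    # Import aliases are not ast.Name nodes, so only real references count.
    used = any(
        isinstance(node, ast.Name) and node.id == name
        for node in ast.walk(tree)
    )
    return imported and not used


orphan_src = (
    "from muse.core.repo import read_repo_id\n"
    "\n"
    "def run(root):\n"
    "    return root\n"
)
keep_src = (
    "from muse.core.repo import read_repo_id\n"
    "\n"
    "def run(root):\n"
    "    return read_repo_id(root)\n"
)
```

Note that a dead assignment such as repo_id = read_repo_id(root) still counts as a use here, so files with that pattern need a separate check on whether repo_id is consumed downstream.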
Phase 6 — Docs, tests, wire verification
- Update docs/reference/type-contracts.md: remove repo_id rows from CommitRecord and CommitDict tables
- Update docs/commit-id-v2-rewrite.md: remove the Step 2 Load repo_id section
- Update docs/reference/plumbing.md, cli-tiers.md, remotes.md: remove repo_id from example commit JSON
- Tests: find all CommitRecord(repo_id=...) constructors and remove the kwarg
- Wire smoke test: push a commit from patched CLI to staging, confirm musehub accepts it without repo_id in the payload
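The wire check can be approximated offline before touching staging. In this sketch, parse_wire_commit and payload_is_clean are hypothetical helpers, and the payload shape is assumed; the only detail taken from the ticket is the d.get("repo_id", "") pattern on the server side.

```python
def parse_wire_commit(d: dict) -> dict:
    """Hypothetical server-side parser that tolerates a missing repo_id."""
    return {
        "commit_id": d["commit_id"],
        "message": d["message"],
        # Same pattern the ticket attributes to musehub: "" when absent.
        "repo_id": d.get("repo_id", ""),
    }


def payload_is_clean(commits: list) -> bool:
    """True when no commit dict in the payload carries a repo_id key."""
    return all("repo_id" not in c for c in commits)


patched_payload = [{"commit_id": "c1", "message": "initial"}]
legacy_payload = [{"commit_id": "c1", "message": "initial", "repo_id": "repo-a"}]

parsed_new = [parse_wire_commit(c) for c in patched_payload]
parsed_old = [parse_wire_commit(c) for c in legacy_payload]
```

Both payload shapes parse, which is the property the staging smoke test is meant to confirm against the real server.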
Execution order
The phase order is strict. Phase 1 is the safety net: it makes repo_id optional so Phases 2-4 can be done file-by-file without breaking anything mid-flight. Phase 4 (hard delete) only lands after all call sites are clean. Phases 5 and 6 are cleanup.
Phase 2 complete ✅
Branch: task/issue-9-phase2-resolve-commit-ref → merged to dev → pushed to local/dev
What shipped
- resolve_commit_ref(repo_root, repo_id, branch, ref) → resolve_commit_ref(repo_root, branch, ref)
- repo_id was a dead parameter: never read in the function body, only threaded through the recursive tilde-notation call
- 91 call sites updated across 87 command files, 2 core files (store.py, doc_history.py), 4 test files
- Caught and fixed an aliased call in narrative.py (_rcr local import also used the old 4-arg form)
- 1215 tests pass
Up next: Phase 3
Clean all commit-creation sites — stop passing repo_id when constructing CommitRecord objects.
Phase 3 complete — merged to dev (sha256:58267095dfd3).
Removed repo_id= from every CommitRecord(...) constructor call across the entire codebase:
- Source files: muse/cli/commands/bridge.py, muse/cli/commands/rebase.py (plus all commands that pass through write_commit)
- Test files: ~190 files cleaned
- Preserved intact: TagRecord, ReleaseRecord, RemoteInfo, CommitDict, _init_repo helpers, compute_release_id calls, TypedDict instantiations; all have their own repo scoping semantics
196 files changed in one atomic commit.
Phase 4 complete — merged to dev (sha256:e3248448b953).
Hard-deleted repo_id: str = "" from CommitRecord dataclass. The CommitDict TypedDict was already clean (no repo_id field), to_dict() was already not emitting it, and from_msgpack() already silently ignores it when present in old on-disk files — the compat shim remains permanently by omission.
Phase 4 sweep complete — merged to dev (sha256:fded69da598f).
Three additional items caught on re-sweep:
- store.py module docstring commit schema: removed stale "repo_id" key from the example JSON block
- _verify_commit_id docstring: removed repo_id from the compute_commit_id coverage list (it was never in the formula)
- get_commits_for_branch: removed dead repo_id: str parameter that was never read in the function body, plus its docstring entry
Updated 21 call sites across 9 files (log.py, shortlog.py, and 7 test files). 512 tests pass.
Phase 5 complete — orphaned read_repo_id imports pruned
Commit: sha256:4d399952496819361713d9828d1978f4978a517d2dfd234d05c7c76c1c5fa707
What was done
Removed all dead read_repo_id import + dead-assignment patterns from ~70 command files that were left behind after Phases 2–4 stripped the legitimate repo_id usages.
DROP files (~70 command files): Removed the import line "from muse.core.repo import read_repo_id" and the assignment line "repo_id = read_repo_id(root)" from every file where the result was never consumed downstream.
Dead-param helper functions cleaned (6 files):
- branch.py: _resolve_start_point lost its repo_id: str param
- checkout.py: _checkout_with_merge lost its repo_id: str param
- bundle.py: _resolve_refs lost its repo_id: str param
- lineage.py: _gather_commits lost its repo_id: str param
- midi_compare.py: _load lost its repo_id: str param
- type_cmd.py: _get_manifest lost its repo_id: str param; entire read_repo_id import dropped
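Dead parameters like these can be spotted mechanically. The sketch below uses the standard ast module; the helper name dead_params is hypothetical, and the sample function body is invented for illustration (only the function name _resolve_start_point comes from this ticket).

```python
import ast


def dead_params(source: str, func_name: str) -> list:
    """List parameters of `func_name` that its body never references."""
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef) and node.name == func_name:
            params = [a.arg for a in node.args.args + node.args.kwonlyargs]
            used = {
                n.id
                for stmt in node.body
                for n in ast.walk(stmt)
                if isinstance(n, ast.Name)
            }
            return [p for p in params if p not in used]
    return []


# Invented body: repo_id is accepted but never read.
sample_src = (
    "def _resolve_start_point(root, repo_id, branch):\n"
    "    return (root, branch)\n"
)
unused = dead_params(sample_src, "_resolve_start_point")
```

A parameter that is only forwarded to a recursive call, as repo_id was in resolve_commit_ref, would count as used by this check, so that case still needs a manual read.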
TypedDict cleaned: _ReadCommitJson in read_commit.py lost its repo_id: str field.
Test fixtures updated:
- test_cmd_branch.py: updated _resolve_start_point call sites
- test_cmd_checkout.py: removed dead assignment
- test_cmd_plan_merge.py: removed 10 stale patch('...plan_merge.read_repo_id') calls
- test_cmd_pull_hardening.py: removed 1 stale patch('...pull.read_repo_id') call
- test_rebase_missing_snapshot_guard.py: removed dead import + assignment
- test_directories_feature.py: removed dead import + assignment from 3 locations
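One likely reason the stale patch(...) calls had to go: unittest.mock refuses to patch an attribute that no longer exists on the target unless create=True is passed. The sketch below demonstrates that behaviour with patch.object on a SimpleNamespace, which is a hypothetical stand-in for a command module after its read_repo_id import was dropped.

```python
from types import SimpleNamespace
from unittest.mock import patch

# Hypothetical stand-in for a command module that no longer imports
# read_repo_id.
plan_merge = SimpleNamespace(run=lambda root: "ok")


def stale_patch_fails() -> bool:
    """Return True if patching the removed name raises AttributeError."""
    try:
        with patch.object(plan_merge, "read_repo_id", return_value="repo-a"):
            plan_merge.run(".")
    except AttributeError:
        return True
    return False
```

Under that behaviour, a test that still patches the removed name errors out before the code under test runs, so the patch has to be deleted along with the import.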
KEEP files (untouched): bridge.py, describe.py, docs_cmd.py, log.py, release.py, shortlog.py, snapshot_cmd.py, tag.py, verify.py — all legitimately use read_repo_id for tag/release/identity purposes.
Tests passing
- test_phase5_store_linear_walks.py: 7 passed
- test_core_store.py + test_core_coverage_gaps.py: 70 passed
- test_cmd_branch.py + test_cmd_checkout.py: 273 passed
- test_core_repo.py + test_cmd_tag_hardening.py + test_rebase_missing_snapshot_guard.py + test_directories_feature.py: 303 passed
- test_cmd_plan_merge.py + test_cmd_pull_hardening.py: 121 passed (8 pre-existing failures in pull tests unrelated to Phase 5)
Next: Phase 6 — update documentation files (type-contracts.md, commit-id-v2-rewrite.md, plumbing.md, cli-tiers.md, remotes.md) to remove all repo_id references from CommitRecord docs.
Phase 6 complete — documentation updated
Commit: sha256:9ee4bb8346c8f20e0aade6c5462a69a6328cc2102b1890bd6c09b27f528507b7
Changes
docs/reference/type-contracts.md: Removed repo_id row from the CommitDict field table. The field was gone from both CommitRecord and CommitDict by the end of Phase 4; the doc entry was stale.
docs/reference/plumbing.md: Removed repo_id and format_version from the read-commit example JSON output. Dropped the stale sentence referencing format_version in the command description.
docs/reference/cli-tiers.md: Removed repo_id from the read-commit key fields JSON example.
docs/reference/remotes.md: Removed repo_id from the MPack wire format commits array example (CommitRecord fields on the wire no longer include repo_id).
docs/commit-id-v2-rewrite.md: Deleted. This was a planning document tracking the v2 commit ID rewrite. Phase 11 of that document explicitly said 'Delete this doc; commit' — all 11 phases were already marked done.
What was NOT changed
All other repo_id references in docs remain — they describe types that still legitimately carry repo_id:
- RemoteInfo, WireTag (mpack.py): remote/wire types that identify the repo
- TagDict, ReleaseDict (store.py): tags and releases still use repo_id for content addressing
- Hub API response types (_HubApiResponse, _RepoJson, etc.): server-side repo identity
- Command output types for KEEP files (describe, shortlog, snapshot, verify): all legitimately call read_repo_id
Issue #9 work is now complete across all 6 phases. All orphaned repo_id usages have been removed from CommitRecord, its TypedDicts, all command files, all test fixtures, and all documentation.
All 6 phases verified complete. Reviewed code state in muse/core/store.py:
- CommitRecord dataclass has no repo_id field (confirmed at line 939)
- resolve_commit_ref signature is clean (3-arg: repo_root, branch, ref)
- All CommitRecord(...) construction sites no longer pass repo_id=
- ~70 command files pruned of dead read_repo_id imports
- Docs updated: type-contracts.md, plumbing.md, cli-tiers.md, remotes.md; commit-id-v2-rewrite.md deleted
- Remaining repo_id references in store.py are all TagRecord/ReleaseRecord/path functions: correctly repo-scoped, untouched per spec
Closing.
Phase 1 complete ✅
Branch: task/issue-9-phase1-commit-record-repo-id → merged to dev → pushed to local/dev
What shipped
- CommitRecord.repo_id demoted to optional field (str = "") positioned after committed_at, so old on-disk records that serialised the field deserialise without error
- CommitDict.repo_id removed: the field is no longer emitted in to_dict() and is not passed in from_msgpack() or from_dict()
- from_msgpack() silently ignores any repo_id present in existing on-disk files (backward-compat read, forward-clean write)
- repo_id dropped, constructor call cleaned up
Up next: Phase 2
Remove repo_id as a dead parameter from resolve_commit_ref and all its call sites.