snapshot.directories: list[str] → dict[str, str] — make empty dirs fully first-class in the snapshot schema
Background
Empty directories are now content-addressed at the stage level (EMPTY_DIR_OID = blob_id(b'')). However, snapshot.directories is still list[str]: a list of path strings with no object_id. This is inconsistent with the file manifest, which maps path → sha256 object_id.
The goal of this issue is to make the snapshot schema fully symmetric:
# Current
snapshot.manifest = {"src/main.py": "sha256:abc..."}
snapshot.directories = ["mydir", "emptydir"]
# Target
snapshot.manifest = {"src/main.py": "sha256:abc..."}
snapshot.directories = {"mydir": "sha256:473a...", "emptydir": "sha256:473a..."}
All empty directories share the same object_id (EMPTY_DIR_OID = sha256:473a0f4c3be8a93681a267e3b1e9a7dcda1185436fe141f7749120a303721813) since they have identical content (zero bytes). This is correct and efficient — one object in the store, referenced by all dir entries.
Why this matters
- Consistency: snapshot is the unit of content-addressing; a path list with no object_ids is structurally unlike every other part of the system
- Push/pull completeness: with list[str], push cannot verify dir objects are present on the remote; with dict, dirs participate in the same object-transfer logic as files
- Future metadata: if dirs ever gain content (permissions, ACLs, .musekeep metadata), the dict schema already supports it — list[str] does not
- Rust port alignment: the Rust port should not need to special-case directories differently from files in the snapshot layer
Implementation plan (TDD, phased)
Phase 1 — Schema definition and read/write migration
Tests first:
- snapshot_record_accepts_directories_as_dict
- read_snapshot_with_list_dirs_migrates_to_dict (backward compat)
- hash_snapshot_with_dict_dirs_is_stable (deterministic hash)
- write_then_read_snapshot_preserves_dir_dict
Implementation:
- Change SnapshotRecord.directories: list[str] → dict[str, str]
- Update hash_snapshot(manifest, directories) to accept dict
- Update write_snapshot / read_snapshot: reading the list format migrates it transparently to dict (all values become EMPTY_DIR_OID)
- Update SnapshotManifest TypedDict to match
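A minimal sketch of the read-time migration described in this phase, assuming the stored directories value is either the legacy list or the new dict. The helper name normalize_directories is hypothetical and not part of the existing codebase; only EMPTY_DIR_OID and its value come from this issue.

```python
from typing import Optional, Union

# Value quoted from this issue; shared by every empty directory.
EMPTY_DIR_OID = "sha256:473a0f4c3be8a93681a267e3b1e9a7dcda1185436fe141f7749120a303721813"


def normalize_directories(
    raw: Optional[Union[list[str], dict[str, str]]],
) -> dict[str, str]:
    """Hypothetical helper: return dict-format directories for either stored format."""
    if raw is None:
        # Assumption: a snapshot with no directories field has no empty dirs.
        return {}
    if isinstance(raw, dict):
        # Already migrated; return a copy so callers cannot mutate the source.
        return dict(raw)
    # Legacy list[str]: every entry is an empty dir, so all values share one oid.
    return {path: EMPTY_DIR_OID for path in raw}
```

Calling something like this inside read_snapshot would keep the conversion in one place, so the rest of the code only ever sees the dict form.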
Phase 2 — Plugin and stage layer
Tests first:
- plugin_snapshot_returns_directories_as_dict
- stage_status_reads_dir_oid_from_snapshot_dict
- workdir_snapshot_directories_is_dict
Implementation:
- CodePlugin.snapshot() returns SnapshotManifest with directories as dict
- workdir_snapshot() does the same
- _head_snapshot_dirs_for() returns dict, not list
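- Callers that iterate directories update accordingly. Iterating a dict yields its keys, so set(dirs) keeps working and set(dirs.keys()) only makes the intent explicit; callers that index, concatenate, or append to the list are the ones that need code changes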
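The caller updates in this phase might look roughly like the sketch below. The three helper names are hypothetical and only illustrate which list idioms survive the switch to dict and which do not.

```python
from typing import Optional

# Value quoted from this issue.
EMPTY_DIR_OID = "sha256:473a0f4c3be8a93681a267e3b1e9a7dcda1185436fe141f7749120a303721813"


def dir_paths(directories: dict[str, str]) -> set[str]:
    # Iterating a dict yields keys, so this is the same as set(directories.keys()).
    return set(directories)


def dir_oid(directories: dict[str, str], path: str) -> Optional[str]:
    # New capability: look up the object_id for a directory path.
    return directories.get(path)


def merge_dirs(base: dict[str, str], extra: dict[str, str]) -> dict[str, str]:
    # The list version would have been `base + extra`, which raises
    # TypeError for dicts; dict unpacking replaces it.
    return {**base, **extra}
```

Membership tests (`path in directories`) and sorted(directories) also behave the same for both types, so many read-only callers may need no change at all.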
Phase 3 — Diff engine
Tests first:
- diff_detects_dir_added_from_dict_snapshots
- diff_detects_dir_removed_from_dict_snapshots
- diff_detects_dir_renamed_via_content_match (all empty dirs share one oid, so the oid alone cannot pair them; match by elimination)
Implementation:
- plugin.diff() consumes dict directories
- directory rename detection uses dict keys
- AddressedInsertOp / DeleteOp / RenameOp emission unchanged
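One possible reading of "match by elimination" is sketched below: a rename is reported only when exactly one directory with a given oid disappeared and exactly one appeared. This pairing rule is an assumption, not the confirmed behaviour of plugin.diff(), and the plain tuples stand in for the real AddressedInsertOp / DeleteOp / RenameOp types, whose fields are not described in this issue.

```python
from collections import defaultdict

# Value quoted from this issue.
EMPTY_DIR_OID = "sha256:473a0f4c3be8a93681a267e3b1e9a7dcda1185436fe141f7749120a303721813"


def diff_directories(old: dict[str, str], new: dict[str, str]) -> list[tuple]:
    """Hypothetical sketch: emit rename/delete/insert tuples for directory changes."""
    removed = sorted(old.keys() - new.keys())
    added = sorted(new.keys() - old.keys())

    # Group the changed paths by object_id so candidates can be paired.
    removed_by_oid: dict[str, list[str]] = defaultdict(list)
    added_by_oid: dict[str, list[str]] = defaultdict(list)
    for path in removed:
        removed_by_oid[old[path]].append(path)
    for path in added:
        added_by_oid[new[path]].append(path)

    ops: list[tuple] = []
    renamed_old: set[str] = set()
    renamed_new: set[str] = set()
    for oid, gone in removed_by_oid.items():
        appeared = added_by_oid.get(oid, [])
        # Elimination: only an unambiguous one-to-one pair counts as a rename.
        if len(gone) == 1 and len(appeared) == 1:
            ops.append(("rename", gone[0], appeared[0], oid))
            renamed_old.add(gone[0])
            renamed_new.add(appeared[0])

    ops.extend(("delete", p, old[p]) for p in removed if p not in renamed_old)
    ops.extend(("insert", p, new[p]) for p in added if p not in renamed_new)
    return ops
```

Because every empty directory currently has the same oid, two removals and one addition are left as separate delete and insert ops rather than guessed at.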
Phase 4 — Push/pull wire format
Tests first:
- push_includes_empty_dir_object_in_mpack
- pull_receives_and_stores_empty_dir_object
- bundle_inspect_shows_dir_objects
Implementation:
- Push mpack builder: include the EMPTY_DIR_OID object once when the snapshot contains any directory entry
- Pull/unbundle: no change needed (object store handles zero-byte blobs naturally)
- bundle diff/inspect: show dir objects
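With dict-format directories, the set of objects a push must transfer can be computed the same way for files and directories. The sketch below assumes the remote's known objects are available as a set of ids; the function name and that parameter are hypothetical, and the real mpack builder may be structured differently.

```python
# Value quoted from this issue.
EMPTY_DIR_OID = "sha256:473a0f4c3be8a93681a267e3b1e9a7dcda1185436fe141f7749120a303721813"


def objects_to_push(
    manifest: dict[str, str],
    directories: dict[str, str],
    remote_has: set[str],
) -> set[str]:
    """Hypothetical sketch: object ids the remote is still missing."""
    # Files and directories contribute object ids through the same path.
    wanted = set(manifest.values()) | set(directories.values())
    # Set semantics mean EMPTY_DIR_OID appears at most once, however many
    # directories reference it.
    return wanted - remote_has
```

A snapshot with no directory entries never adds EMPTY_DIR_OID to the result, which matches the "when any dir is in snapshot" condition above.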
Phase 5 — CLI surface (status, diff, read)
Tests first:
- status_json_staged_added_uses_dir_oid_not_sentinel_string
- read_manifest_includes_directories_dict
- diff_json_includes_dir_object_id_in_ops
Implementation:
- status --json: staged.added dir entries show object_id = sha256:473a...
- read --manifest: directories field is dict not list
- Any display that previously showed the 'dir:' sentinel string now shows the sha256 object_id
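As a rough illustration of the status --json change, the sketch below builds staged.added entries for directories. Only the staged.added location and the object_id field come from this issue; the "path" key, the entry layout, and the function name are assumptions.

```python
import json

# Value quoted from this issue.
EMPTY_DIR_OID = "sha256:473a0f4c3be8a93681a267e3b1e9a7dcda1185436fe141f7749120a303721813"


def staged_added_entries(added_dirs: dict[str, str]) -> list[dict[str, str]]:
    """Hypothetical sketch: one JSON-ready entry per newly staged directory."""
    # The object_id is read from the dict instead of a placeholder string.
    return [
        {"path": path, "object_id": oid}
        for path, oid in sorted(added_dirs.items())
    ]


payload = json.dumps(
    {"staged": {"added": staged_added_entries({"emptydir": EMPTY_DIR_OID})}}
)
```

A test such as status_json_staged_added_uses_dir_oid_not_sentinel_string could then parse the output and check that each object_id starts with "sha256:".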
Phase 6 — Snapshot migration tool
Tests first:
- migrate_existing_repo_converts_list_dirs_to_dict
- migration_is_idempotent (run twice = no change)
- migration_preserves_commit_ids (commits are not rewritten, only snapshots)
Implementation:
- muse migrate --dirs, or automatic migration on first access (the invariants below require that no manual step be mandatory)
- Walk all snapshots, rewrite those with list-format directories
- Log count of migrated snapshots
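The walk-and-rewrite step could be sketched as follows, using an in-memory mapping of snapshot id to record in place of the real snapshot store. The function name, the record layout, and the logger name are hypothetical.

```python
import logging

# Value quoted from this issue.
EMPTY_DIR_OID = "sha256:473a0f4c3be8a93681a267e3b1e9a7dcda1185436fe141f7749120a303721813"

logger = logging.getLogger("migrate_dirs_sketch")  # hypothetical logger name


def migrate_snapshots(snapshots: dict[str, dict]) -> int:
    """Hypothetical sketch: rewrite list-format directories in place, return the count."""
    migrated = 0
    for record in snapshots.values():
        dirs = record.get("directories")
        # Only legacy records are touched, which is what makes a second
        # run a no-op (idempotence).
        if isinstance(dirs, list):
            record["directories"] = {path: EMPTY_DIR_OID for path in dirs}
            migrated += 1
    logger.info("migrated %d snapshot(s) to dict-format directories", migrated)
    return migrated
```

Snapshot ids (the mapping keys here) are left untouched, in line with the requirement that commits are not rewritten.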
Invariants across all phases
- EMPTY_DIR_OID = sha256:473a0f4c3be8a93681a267e3b1e9a7dcda1185436fe141f7749120a303721813 (never changes)
- All existing repos with list-format directories must migrate transparently — no manual user action
- hash_snapshot output must be stable: same files + dirs → same hash regardless of list vs dict input format
- All 107 existing directory tests must remain green throughout
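One way to satisfy the hash-stability invariant is to normalize both input formats to the same canonical form before hashing. The sketch below is an assumption about how that could work. It does not reproduce the project's actual hash_snapshot encoding, so its digests will not match existing snapshot ids.

```python
import hashlib
import json
from typing import Union

# Value quoted from this issue.
EMPTY_DIR_OID = "sha256:473a0f4c3be8a93681a267e3b1e9a7dcda1185436fe141f7749120a303721813"


def hash_snapshot_sketch(
    manifest: dict[str, str],
    directories: Union[list[str], dict[str, str]],
) -> str:
    """Hypothetical sketch: same digest for list-format and dict-format directories."""
    if not isinstance(directories, dict):
        # Legacy list input: expand to the dict form first.
        directories = {path: EMPTY_DIR_OID for path in directories}
    # sort_keys makes the encoding independent of insertion order,
    # including inside the nested dicts.
    canonical = json.dumps(
        {"manifest": manifest, "directories": directories},
        sort_keys=True,
        separators=(",", ":"),
    )
    return "sha256:" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

If existing snapshot ids were computed from the path list alone, the real implementation would need to keep hashing only the paths (or otherwise reproduce the legacy bytes) for migrated snapshots to keep their ids.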
Out of scope
- Non-empty directory objects (tree objects) — future work
- Changing EMPTY_DIR_OID itself — the zero-byte blob is correct and stable
- Renaming snapshot fields — keep backward compat naming