fix: globally content-addressed storage — drop repo_id from object paths
Objects are content-addressed globally (same hash = same bytes = one object, period), matching how Git/GitHub/S3 work at web scale. The DB already had this right: object_id is the global PK. Storage used a per-repo path prefix that contradicted it, causing the recurring 'Empty file' bug on blob_page.
Root cause: wire_push_objects bulk existence check saw an object already in DB (stored for repo A) and skipped the upload for repo B, leaving no bytes at repo B's per-repo storage path. blob_page called storage.get(repo_B_id, oid) and got None → 'Empty file'.
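A toy model of the failure described above, for illustration only. PerRepoStore, push_objects, and the in-memory db_oids set are hypothetical stand-ins, not the real wire_push_objects or storage code; the sketch assumes only that the existence check is keyed on object_id while bytes are keyed on (repo_id, object_id).

```python
from typing import Dict, Optional, Set, Tuple


class PerRepoStore:
    """Hypothetical stand-in for the old layout: bytes keyed by (repo_id, object_id)."""

    def __init__(self) -> None:
        self._blobs: Dict[Tuple[str, str], bytes] = {}

    def put(self, repo_id: str, oid: str, data: bytes) -> None:
        self._blobs[(repo_id, oid)] = data

    def get(self, repo_id: str, oid: str) -> Optional[bytes]:
        return self._blobs.get((repo_id, oid))


def push_objects(db_oids: Set[str], store: PerRepoStore, repo_id: str,
                 objects: Dict[str, bytes]) -> None:
    """Mimics the buggy push: skip any object whose id is already known to the DB."""
    for oid, data in objects.items():
        if oid in db_oids:
            # The DB says the object exists, so the upload is skipped,
            # even though this repo's storage path holds no bytes.
            continue
        store.put(repo_id, oid, data)
        db_oids.add(oid)


db_oids: Set[str] = set()
store = PerRepoStore()
payload = {"sha256:abc": b"hello"}

push_objects(db_oids, store, "repo_a", payload)  # stores bytes under repo_a
push_objects(db_oids, store, "repo_b", payload)  # skipped: oid already in DB

repo_a_bytes = store.get("repo_a", "sha256:abc")
repo_b_bytes = store.get("repo_b", "sha256:abc")
```

In this model repo_a_bytes holds the payload while repo_b_bytes is None, which is the condition blob_page rendered as 'Empty file'.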
Fix:
- LocalBackend: objects now at objects/{object_id} (global, not per-repo)
- S3Backend: key is now objects/{object_id} (global, not per-repo)
- Both backends fall back to the old per-repo path on read misses, so existing data on disk/R2 stays readable without a migration script
- wire_push_objects: global DB existence check (no repo_id filter)
- wire_filter_objects: same global check; objects deduplicated across repos
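A minimal sketch of the read path after the fix, assuming a filesystem backend. LocalBackendSketch and its method names are hypothetical, and the legacy layout ({repo_id}/objects/{object_id}) is an assumption made for illustration; the real per-repo prefix may differ.

```python
import tempfile
from pathlib import Path
from typing import Optional


class LocalBackendSketch:
    """Hypothetical local backend: global object paths plus a legacy read fallback."""

    def __init__(self, root: Path) -> None:
        self.root = root

    def _global_path(self, oid: str) -> Path:
        # New layout: one path per object, shared by every repo.
        return self.root / "objects" / oid

    def _legacy_path(self, repo_id: str, oid: str) -> Path:
        # Assumed old layout; the actual per-repo prefix is not shown here.
        return self.root / repo_id / "objects" / oid

    def put(self, oid: str, data: bytes) -> None:
        # Writes always go to the global path.
        path = self._global_path(oid)
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(data)

    def get(self, repo_id: str, oid: str) -> Optional[bytes]:
        # Try the global path first, then the old per-repo path on a miss.
        for path in (self._global_path(oid), self._legacy_path(repo_id, oid)):
            if path.is_file():
                return path.read_bytes()
        return None


with tempfile.TemporaryDirectory() as tmp:
    backend = LocalBackendSketch(Path(tmp))
    backend.put("abc123", b"new bytes")

    # Simulate an object written before the fix, under repo_a's prefix only.
    legacy = backend._legacy_path("repo_a", "def456")
    legacy.parent.mkdir(parents=True)
    legacy.write_bytes(b"old bytes")

    shared = backend.get("repo_b", "abc123")    # global hit, any repo
    migrated = backend.get("repo_a", "def456")  # legacy fallback hit
    missing = backend.get("repo_b", "def456")   # no bytes at either path
```

Writes only ever target the global path, so the fallback matters only for data stored before the change. In this sketch a pre-fix object stays readable through the repo it was stored under, not through other repos.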
Also:
- muse code cat error message now tells agents exactly what to do when they pass a bare file path (a common AX failure point)
- AGENTS.md / docs/agent-guide.md: anti-pattern entry for muse code cat bare file path
- snapshots.py: optional_token auth dependency on all snapshot routes
- Regression test: test_push_objects_cross_repo_object_globally_deduped