gabriel / musehub public
fix patch task/mpack-index-backfill #2 / 5
AI Agent gabriel · 6 days ago · Jun 1, 2026 · Diff

fix: blobs only in S3/mpack — remove commit/snapshot individual S3 writes

Architecture correction: S3/MinIO is for blobs only.
- Commits: DB canonical (musehub_commits), no individual S3 writes
- Snapshots: DB canonical (musehub_snapshots + manifest_blob), no S3 writes
- Blobs: mpack:// in MinIO, byte-range indexed via musehub_mpack_index
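To illustrate what "byte-range indexed" could mean for blobs, here is a minimal in-memory sketch. It assumes each musehub_mpack_index row records a pack key, an offset, and a length; the `MpackIndexEntry` fields and the `build_pack` and `read_blob` helpers are hypothetical and not taken from the musehub codebase. Against real S3/MinIO the slice would presumably be an HTTP range read rather than a Python slice.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class MpackIndexEntry:
    # Hypothetical shape of one index row: where a blob lives inside a pack.
    blob_id: str
    mpack_key: str
    offset: int
    length: int


def build_pack(blobs: Dict[str, bytes], mpack_key: str) -> Tuple[bytes, Dict[str, MpackIndexEntry]]:
    """Concatenate blobs into one pack and record each blob's byte range."""
    buf = bytearray()
    index: Dict[str, MpackIndexEntry] = {}
    for blob_id, data in blobs.items():
        index[blob_id] = MpackIndexEntry(blob_id, mpack_key, len(buf), len(data))
        buf += data
    return bytes(buf), index


def read_blob(packs: Dict[str, bytes], index: Dict[str, MpackIndexEntry], blob_id: str) -> bytes:
    """Look up the byte range in the index, then read only that slice of the pack."""
    entry = index[blob_id]
    pack = packs[entry.mpack_key]
    return pack[entry.offset : entry.offset + entry.length]
```

The point of the design is that one pack object can hold many blobs while a single blob can still be fetched without downloading the whole pack.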

Removed from all commit write paths (proposals, repository, sync, wire_push): commit_to_bytes + backend.put() + storage_uri population

Removed from all snapshot write paths (snapshot, wire_push repair+bulk): snapshot_to_bytes + backend.put() + storage_uri population

Simplified _snap_row_to_wire_s3 and _commit_to_wire_s3 to serve from DB (S3 reads are slower than DB and the data is already there).
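A rough sketch of what "serve from DB" might look like for a commit: the wire payload is built from row columns alone and the object store is never consulted. The `CommitRow` fields and the `commit_to_wire` name are assumptions for illustration; the real `_commit_to_wire_s3` signature and wire schema are not shown in this commit page.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List, Optional


@dataclass
class CommitRow:
    # Hypothetical subset of musehub_commits columns.
    commit_id: str
    parent_ids: List[str]
    message: str
    committed_at: datetime
    storage_uri: Optional[str] = None  # stays NULL under the blob-only architecture


def commit_to_wire(row: CommitRow) -> dict:
    """Build the wire payload from DB columns only; no backend read is involved."""
    return {
        "commit_id": row.commit_id,
        "parent_ids": list(row.parent_ids),
        "message": row.message,
        "committed_at": row.committed_at.isoformat(),
    }
```

Because the function takes no backend argument at all, an accidental S3 round trip on the read path is not possible in this shape.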

New tests in test_blob_only_object_store.py cover the correct architecture. Deleted phase1/4/5 tests that tested the now-incorrect S3 writes for commits/snapshots.
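The invariant these tests target can be sketched with `unittest.mock.AsyncMock`: a commit write must not await `backend.put`, while a blob write must. The `write_commit` and `write_blob` functions, the dict-based rows, and the exact `mpack://` URI layout below are hypothetical stand-ins, not the actual musehub write paths or test bodies.

```python
import asyncio
from unittest.mock import AsyncMock


async def write_commit(db_rows, backend, commit_id, message):
    """Hypothetical commit write path: one DB row, no object-store call."""
    row = {"commit_id": commit_id, "message": message, "storage_uri": None}
    db_rows.append(row)
    return row


async def write_blob(db_rows, backend, blob_id, data, mpack_key):
    """Hypothetical blob write path: bytes go to the backend, the URI goes to the DB."""
    await backend.put(mpack_key, data)
    row = {"blob_id": blob_id, "storage_uri": f"mpack://{mpack_key}"}
    db_rows.append(row)
    return row


async def check_blob_only_invariant():
    backend = AsyncMock()  # every attribute is an awaitable mock
    rows = []

    commit = await write_commit(rows, backend, "c1", "init")
    backend.put.assert_not_awaited()  # commits never reach the object store
    assert commit["storage_uri"] is None

    blob = await write_blob(rows, backend, "b1", b"data", "packs/p1")
    backend.put.assert_awaited_once_with("packs/p1", b"data")
    assert blob["storage_uri"].startswith("mpack://")
    return commit, blob
```

Running `asyncio.run(check_blob_only_invariant())` exercises both paths against the same mock backend.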

138 tests passing.

sha: sha256:f3995ec2c05c9c34b0e4d6e96349a811d0117a1c51d78096d757998ccb3c0520
snapshot: sha256:8b5e52788fe758ddd20195aad01eb72da3dd7cef966cbe725835adea37837092
+36 symbols added · ~13 symbols modified · −98 symbols removed · 0 dead code introduced
Semantic Changes 147 symbols
~ tests/test_blob_only_object_store.py .py 29 symbols added
+ _make_repo function async_function _make_repo L45–55 ← moved from tests/test_mpack_index_stale_cleanup.py
+ _mock_backend function function _mock_backend L58–65
+ _uid function function _uid L40–42
+ AsyncMock import import AsyncMock L30–30
+ AsyncSession import import AsyncSession L29–29 ← moved from tests/test_object_store_invariant_phase5.py
+ MagicMock import import MagicMock L30–30
+ MusehubBranch import import MusehubBranch L34–34
+ MusehubCommit import import MusehubCommit L34–34
+ MusehubCommitGraph import import MusehubCommitGraph L34–34
+ MusehubCommitRef import import MusehubCommitRef L34–34
+ MusehubObject import import MusehubObject L34–34
+ MusehubRepo import import MusehubRepo L34–34
+ MusehubSnapshot import import MusehubSnapshot L34–34
+ MusehubSnapshotRef import import MusehubSnapshotRef L34–34
+ annotations import import annotations L24–24 ← moved from tests/test_object_store_invariant_phase5.py
+ blob_id import import blob_id L32–32
+ call import import call L30–30
+ compute_identity_id import import compute_identity_id L33–33
+ compute_repo_id import import compute_repo_id L33–33 ← moved from tests/test_object_store_invariant_phase5.py
+ datetime import import datetime L26–26 ← moved from tests/test_object_store_invariant_phase5.py
+ fake_id import import fake_id L32–32 ← moved from tests/test_object_store_invariant_phase1.py
+ patch import import patch L30–30
+ pytest import import pytest L27–27 ← moved from tests/test_object_store_invariant_phase5.py
+ select import import select L28–28
+ test_BOS1_commit_storage_uri_is_null_after_push function async_function test_BOS1_commit_storage_uri_is_null_after_push L73–100
+ test_BOS2_snapshot_storage_uri_is_null_after_push function async_function test_BOS2_snapshot_storage_uri_is_null_after_push L108–131
+ test_BOS3_blob_storage_uri_is_mpack_after_push function async_function test_BOS3_blob_storage_uri_is_mpack_after_push L139–168
+ test_BOS4_commits_served_from_db_not_s3 function async_function test_BOS4_commits_served_from_db_not_s3 L176–217
+ test_BOS6_merge_proposal_commit_storage_uri_is_null function async_function test_BOS6_merge_proposal_commit_storage_uri_is_null L225–269
TestCommitFilesToRepoObjectStore class class TestCommitFilesToRepoObjectStore L317–407
test_commit_is_muse_binary_format method async_method test_commit_is_muse_binary_format L363–383
test_commit_written_to_object_store method async_method test_commit_written_to_object_store L318–337
test_snapshot_is_muse_binary_format method async_method test_snapshot_is_muse_binary_format L385–407
test_snapshot_written_to_object_store method async_method test_snapshot_written_to_object_store L339–361
TestMergeProposalObjectStore class class TestMergeProposalObjectStore L105–248
test_merge_commit_object_is_muse_binary_format method async_method test_merge_commit_object_is_muse_binary_format L129–150
test_merge_commit_storage_uri_populated method async_method test_merge_commit_storage_uri_populated L205–225
test_merge_commit_written_to_object_store method async_method test_merge_commit_written_to_object_store L106–127
test_merged_snapshot_is_muse_binary_format method async_method test_merged_snapshot_is_muse_binary_format L179–203
test_merged_snapshot_storage_uri_populated method async_method test_merged_snapshot_storage_uri_populated L227–248
test_merged_snapshot_written_to_object_store method async_method test_merged_snapshot_written_to_object_store L152–177
TestRepoInitObjectStore class class TestRepoInitObjectStore L255–310
test_init_commit_is_muse_binary_format method async_method test_init_commit_is_muse_binary_format L284–310
test_init_commit_written_to_object_store method async_method test_init_commit_written_to_object_store L256–282
TestUpsertSnapshotObjectStore class class TestUpsertSnapshotObjectStore L414–452
test_snapshot_is_muse_binary_format method async_method test_snapshot_is_muse_binary_format L433–452
test_snapshot_written_to_object_store method async_method test_snapshot_written_to_object_store L415–431
_backend function function _backend L88–90
_decode_header function function _decode_header L93–98
_make_branch_with_commit function async_function _make_branch_with_commit L64–85
_make_repo function async_function _make_repo L46–61
_uid function function _uid L41–43
AsyncSession import import AsyncSession L23–23
MusehubBranch import import MusehubBranch L27–27
MusehubCommit import import MusehubCommit L27–27
MusehubCommitRef import import MusehubCommitRef L27–27
MusehubRepo import import MusehubRepo L27–27
MusehubSnapshot import import MusehubSnapshot L27–27
MusehubSnapshotRef import import MusehubSnapshotRef L27–27
annotations import import annotations L18–18
blob_id import import blob_id L25–25
compute_identity_id import import compute_identity_id L26–26
compute_repo_id import import compute_repo_id L26–26
datetime import import datetime L21–21
fake_id import import fake_id L25–25 → moved to tests/test_blob_only_object_store.py
json import import json L20–20
pytest import import pytest L22–22
TestBackfillCommits class class TestBackfillCommits L91–149
test_backfill_skips_commits_already_in_s3 method async_method test_backfill_skips_commits_already_in_s3 L132–149
test_commit_s3_bytes_are_muse_binary_format method async_method test_commit_s3_bytes_are_muse_binary_format L110–130
test_commit_storage_uri_populated_after_backfill method async_method test_commit_storage_uri_populated_after_backfill L92–108
TestBackfillSnapshots class class TestBackfillSnapshots L156–278
test_backfill_skips_snapshots_already_in_s3 method async_method test_backfill_skips_snapshots_already_in_s3 L260–278
test_snapshot_delta_only_reconstructed_and_backfilled method async_method test_snapshot_delta_only_reconstructed_and_backfilled L217–258
test_snapshot_manifest_correct_after_backfill method async_method test_snapshot_manifest_correct_after_backfill L198–215
test_snapshot_s3_bytes_are_muse_binary_format method async_method test_snapshot_s3_bytes_are_muse_binary_format L175–196
test_snapshot_storage_uri_populated_after_backfill method async_method test_snapshot_storage_uri_populated_after_backfill L157–173
TestBackfillStats class class TestBackfillStats L285–315
test_idempotent_second_run_writes_nothing method async_method test_idempotent_second_run_writes_nothing L295–315
test_stats_zero_when_nothing_to_backfill method async_method test_stats_zero_when_nothing_to_backfill L286–293
_backend function function _backend L39–41
_make_legacy_commit function function _make_legacy_commit L58–70
_make_legacy_snapshot function function _make_legacy_snapshot L73–84
_make_repo function async_function _make_repo L44–55
_uid function function _uid L35–36
AsyncSession import import AsyncSession L21–21 → moved to tests/test_blob_only_object_store.py
MusehubCommit import import MusehubCommit L26–26
MusehubCommitRef import import MusehubCommitRef L26–26
MusehubRepo import import MusehubRepo L26–26
MusehubSnapshot import import MusehubSnapshot L26–26
MusehubSnapshotRef import import MusehubSnapshotRef L26–26
StrDict import import StrDict L25–25
annotations import import annotations L14–14 → moved to tests/test_blob_only_object_store.py
compute_identity_id import import compute_identity_id L24–24
compute_repo_id import import compute_repo_id L24–24 → moved to tests/test_blob_only_object_store.py
datetime import import datetime L17–17 → moved to tests/test_blob_only_object_store.py
fake_id import import fake_id L23–23
json import import json L16–16
msgpack import import msgpack L19–19
pytest import import pytest L20–20 → moved to tests/test_blob_only_object_store.py
secrets import import secrets L18–18
~ musehub/services/musehub_wire_push.py .py 1 symbol added, 2 symbols modified
+ purge_stale_mpack_index_entries function async_function purge_stale_mpack_index_entries L927–968
~ musehub/storage/backends.py .py 1 symbol added, 1 symbol modified
+ exists_mpack method async_method exists_mpack L427–440
+ test_MIE5_mpack_index_payload_contains_mpack_key function async_function test_MIE5_mpack_index_payload_contains_mpack_key L105–143
~ tests/test_mpack_index_stale_cleanup.py .py moved from tests/test_object_store_invariant_phase4.py; 4 added, 26 removed, 1 modified
TestWireFetchCommitFromS3 class class TestWireFetchCommitFromS3 L236–294
test_commit_s3_bytes_decode_correctly method async_method test_commit_s3_bytes_decode_correctly L251–275
test_commit_storage_uri_is_set method async_method test_commit_storage_uri_is_set L237–249
test_wire_fetch_returns_commit method async_method test_wire_fetch_returns_commit L277–294
TestWireFetchFallback class class TestWireFetchFallback L301–349
test_snapshot_fallback_when_no_storage_uri method async_method test_snapshot_fallback_when_no_storage_uri L302–349
TestWireFetchSnapshotFromS3 class class TestWireFetchSnapshotFromS3 L126–229
test_snapshot_manifest_served_from_s3 method async_method test_snapshot_manifest_served_from_s3 L127–158
test_snapshot_storage_uri_is_set method async_method test_snapshot_storage_uri_is_set L217–229
test_snapshot_wire_manifest_correct_even_when_manifest_blob_zeroed method async_method test_snapshot_wire_manifest_correct_even_when_manifest_blob_zeroed L160–215
_backend function function _backend L43–45
_make_commit_with_snapshot function async_function _make_commit_with_snapshot L61–119
_make_repo function async_function _make_repo L48–58 → moved to tests/test_blob_only_object_store.py
_uid function function _uid L39–40
MusehubBranch import import MusehubBranch L27–27
MusehubCommit import import MusehubCommit L27–27
MusehubCommitGraph import import MusehubCommitGraph L27–27
MusehubCommitRef import import MusehubCommitRef L27–27
MusehubRepo import import MusehubRepo L27–27
MusehubSnapshot import import MusehubSnapshot L27–27
MusehubSnapshotRef import import MusehubSnapshotRef L27–27
compute_branch_id import import compute_branch_id L26–26
compute_identity_id import import compute_identity_id L26–26
compute_repo_id import import compute_repo_id L26–26
json import import json L19–19
secrets import import secrets L21–21
+ select import import select L22–22
+ test_SC1_purge_removes_stale_entries function async_function test_SC1_purge_removes_stale_entries L33–63
+ test_SC2_purge_keeps_live_entries function async_function test_SC2_purge_keeps_live_entries L71–98
+ test_SC3_purge_returns_accurate_counts function async_function test_SC3_purge_returns_accurate_counts L106–135
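
The symbol list above adds `purge_stale_mpack_index_entries` and `exists_mpack`, with tests for removing stale entries, keeping live ones, and returning accurate counts. A minimal sketch of that idea follows, assuming the index maps blob ids to pack keys and that `exists_mpack` takes a pack key and returns a bool. The real signatures, the DB-backed index, and the shape of the returned stats are not shown on this page, so everything below (`FakeBackend`, `purge_stale_entries`, the stats keys) is illustrative only.

```python
import asyncio
from typing import Dict, Set


class FakeBackend:
    """In-memory stand-in for the storage backend; only exists_mpack is modeled."""

    def __init__(self, live_keys: Set[str]) -> None:
        self._live = live_keys

    async def exists_mpack(self, mpack_key: str) -> bool:
        return mpack_key in self._live


async def purge_stale_entries(index: Dict[str, str], backend) -> Dict[str, int]:
    """Drop index rows whose pack no longer exists. index maps blob_id -> mpack_key."""
    exists_cache: Dict[str, bool] = {}  # ask the backend once per distinct pack
    removed = 0
    for blob_id, mpack_key in list(index.items()):
        if mpack_key not in exists_cache:
            exists_cache[mpack_key] = await backend.exists_mpack(mpack_key)
        if not exists_cache[mpack_key]:
            del index[blob_id]
            removed += 1
    return {"removed": removed, "kept": len(index)}
```

Caching the existence check per pack key keeps the number of backend calls proportional to the number of packs rather than the number of blobs.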

To add a comment, use the Muse CLI: muse hub commit comment sha256:f3995ec2c05c9c34b0e4d6e96349a811d0117a1c51d78096d757998ccb3c0520 --body "your comment"