Data model lockdown: full audit, consolidation, and zero-drift enforcement
Why this exists
MuseHub has been built at high velocity over two months. The data model has accumulated significant sprawl: 70 ORM tables across 7 files, a 2236-line monolith model file, 55 migrations including 7 repair migrations (0028, 0035, 0036, 0037, 0038, 0039, 0040), and live schema drift that is currently blocking staging deployment. This issue closes that debt permanently.
The goal: after this issue is closed, alembic check produces zero meaningful differences, assert_schema_matches_orm passes every deploy, and we have a documented policy that prevents future drift from accumulating silently.
Scope of known problems (pre-audit)
Live drift blocking staging deploy right now
musehub_issue_labels— table exists in staging DB, dropped from ORM in migration 0026, still present (migration did not run cleanly on staging)musehub_auth_challenges.challenge_id— column removed from ORM, no migration writtenuq_musehub_auth_challenges_nonce_hex— constraint removed from ORM, no migration writtenmuse_commits.commit_metadata— ORM says JSONB, DB has JSON, no migration writtenmuse_snapshots.manifest— ORM says JSONB, DB has JSON, no migration written- 30+
server_defaultmodify_defaultitems — likely benign noise from Alembic's comparison of equivalent expressions, but need verification
Structural debt
musehub_models.pyis 2236 lines and contains 56 tables — a maintenance hazardmodels.pyexists but appears empty — dead file- 7 repair migrations in a 55-migration history — symptom of recurring schema instability
- No index audit has ever been done — high-traffic query paths may be missing indexes
assert_schema_matches_ormonly checks table/column presence and nullability — does not catch type drift or index drift
Phases
Phase 1 — Fix the live drift, unblock staging deploy
TDD first: write a test that runs alembic check against a fresh local DB and asserts zero meaningful differences (excluding known-benign server_default noise). The test must fail before the fix and pass after.
Then:
- Audit each item in the known drift list above against the actual staging DB
- Write migration 0056 that resolves all real differences in one shot
- Verify
alembic checkoutput is clean against both local and staging DB - Redeploy staging — health check must pass
Gate: alembic check exits 0 (or our filtered equivalent exits 0) against a fresh DB after alembic upgrade head.
Phase 2 — Full table audit: remove the dead, consolidate the sprawl
For every table in the ORM (all 70):
- Is it queried anywhere in the application? (
muse code impact) - Is it written anywhere? If neither — candidate for deletion
- Does it have data on staging that matters? If so, migrate it out first
- Does it duplicate another table's responsibility?
Known candidates to investigate:
musehub_snapshot_entries— snapshots now usemanifest_blob(msgpack), entries table may be orphanedmusehub_hash_occurrence_entries— intel subsystem, check if actively populatedmusehub_intel_codemap_modules/musehub_intel_codemap_meta— codemap feature status?musehub_wire_tags— what is this? Check usagemusehub_bridge_mirrors— git bridge feature, is it live?musehub_mpay_claims— MPay feature status?musehub_attestations/musehub_attestation_claim_types— attestation feature status?musehub_profile_snapshots— is this actively populated or orphaned?musehub_sessions— vsmusehub_auth_challenges/musehub_auth_keys— are all three needed?
Gate: every surviving table has at least one active read path in the application code. Deleted tables have a DROP migration. No orphaned ORM classes.
Phase 3 — Index audit: every hot query path has the right index
For every table that survives Phase 2, audit:
- All
WHEREclause columns in hot queries (push, fetch, wire, issue list, proposal list) - All
JOINcolumns - All
ORDER BYcolumns used in pagination - Foreign keys that lack a corresponding index (PostgreSQL does not auto-index FKs)
Known missing/suspect indexes to check:
musehub_object_refs(repo_id)— added in 0052, verify coveragemusehub_commits(repo_id, branch)— push and fetch walk this constantlymusehub_snapshots(repo_id)— samemusehub_issues(repo_id, state)— issue list querymusehub_proposals(repo_id, state)— proposal list querymusehub_symbol_intel(repo_id, commit_id)— intel indexing hot path- All
created_atcolumns used in time-range queries
Gate: EXPLAIN ANALYZE on the top-10 hot queries shows index scans, not seq scans, on tables with >1000 rows.
Phase 4 — Consolidate musehub_models.py
The 2236-line monolith makes the model hard to navigate and review. Split by domain:
Proposed split:
musehub_repo_models.py— repo, branch, commit, snapshot, object, object_ref, snapshot_entriesmusehub_social_models.py— issue, issue_comment, issue_event, proposal, proposal_review, proposal_comment, proposal_dependency, proposal_simulation, proposal_labelsmusehub_release_models.py— release, release_assetmusehub_webhook_models.py— webhook, webhook_deliverymusehub_intel_models.py— all musehub_intel_* and musehub_symbol_* tablesmusehub_identity_models.py— identity (moved from musehub_auth_models.py or kept separate)musehub_abuse_models.py— blocked_hashes, daily_push_bytes, push_anomalies, stream_rejectionsmusehub_jobs_models.py— background_job- Keep existing:
musehub_auth_models.py,musehub_collaborator_models.py,musehub_domain_models.py,musehub_label_models.py,coord_models.py,muse_cli_models.py
No behavior changes — pure file reorganization. All imports updated. Monolith deleted.
Gate: all existing tests pass. No import of musehub_models.py anywhere in the codebase.
Phase 5 — Harden assert_schema_matches_orm + deploy gate
Current assert_schema_matches_orm only checks table/column presence and nullability. Extend it:
- Type checking for critical columns (JSONB vs JSON, VARCHAR length, BOOLEAN vs INTEGER)
- Index existence check for declared
Index(...)objects in the ORM - Foreign key existence check
Also harden the deploy gate:
- Replace the
alembic checkwarning-only step indeploy.shwith a hard fail using our custom filter (exclude known-benignserver_defaultnoise, fail on everything else) - Add a pre-deploy test that runs
alembic upgrade head+alembic checkagainst a fresh local DB as part of CI
Gate: assert_schema_matches_orm catches JSON→JSONB drift, missing indexes, and FK mismatches. deploy.sh hard-fails on any meaningful schema difference before the new container ever starts.
Definition of done
alembic check(filtered) exits 0 on staging afteralembic upgrade headassert_schema_matches_ormpasses on staging startup- No repair migrations added after this issue closes (new drift → new migration, same-day)
- Every ORM table has a documented read path
- Every hot query path has an index
musehub_models.pydeleted — replaced by domain-split files- Deploy hard-fails on schema drift before container start
- Zero manual
alembic stampinterventions required on staging ever again
Phase 2 complete
Audit scope: all 69 ORM tables checked for active read/write paths using muse code impact + muse content-grep.
Dead tables found and dropped (2)
| Table | ORM class | Reason |
|---|---|---|
muse_objects |
MuseCliObject |
CLI blob cache, never implemented — roadmap E-01 confirms "all currently zero" |
muse_tags |
MuseCliTag |
Music-semantic commit tags, superseded by musehub_repos.tags JSON. Docstring in ui.py was stale; actual query never touched the table |
Migration 0056: drops both tables (idempotent downgrade restores them).
Dead file deleted: musehub/db/models.py — contained only a tombstone comment, imported nowhere meaningful.
Other cleanup: dead re-exports removed from musehub/muse_cli/models.py, stale docstring in ui.py corrected, all from musehub.db import models imports removed from env.py, database.py, and test_schema_check.py.
Notable non-candidates (agent flagged, verified active):
musehub_attestations/musehub_mpay_claims— use rawtext()SQL, invisible tomuse code impact, confirmed active via content-grepmusehub_snapshot_entries/musehub_hash_occurrence_entries— same issue with inline imports; both actively written and readmusehub_sessions— 87 callers, fully active (MCP, session service)
Phase 2 gate: PASSED — every surviving table (67) has at least one active read path. Migration 0056 drops the 2 dead tables.
Phase 3 complete
Scope: 69 surviving tables audited for index coverage against hot query patterns in push, fetch, wire, issue, proposal, and symbol indexer paths.
Finding: one real gap.
| Table | Gap | Query | Fix |
|---|---|---|---|
musehub_branches |
Only single-column repo_id index |
WHERE repo_id = ? AND name = ? (point lookup on every push/fetch) and ORDER BY name (list branches) |
Composite (repo_id, name) index |
Migration 0057: creates ix_musehub_branches_repo_name (repo_id, name), drops the now-redundant single-column ix_musehub_branches_repo_id (covered by the leftmost prefix of the composite).
Everything else confirmed covered:
musehub_commits: composite(repo_id, branch)and(repo_id, timestamp)— push/fetch HEAD lookups coveredmusehub_object_refs: composite PK(repo_id, object_id)+ix_musehub_object_refs_object_id— bitmap IN-list scans covered (added 0052)musehub_issues/musehub_proposals: composite(repo_id, state),(repo_id, number)indexes presentmusehub_symbol_history_entries: composite(repo_id, address)and(repo_id, address, committed_at)present- All FK columns: every FK has
index=Trueor is part of a composite PK — no unindexed FK gaps
Phase 3 gate: PASSED — all confirmed hot query paths covered by indexes. No seq scans on branch name lookups after migration 0057.
Phase 4 — musehub_models.py split ✅
Completed. The 2240-line monolith has been split into 8 domain-specific files:
| File | Domain | Classes |
|---|---|---|
musehub_repo_models.py |
Repos, VCS objects | 11 |
musehub_social_models.py |
Issues, proposals, forks | 9 |
musehub_release_models.py |
Releases, assets | 2 |
musehub_webhook_models.py |
Webhooks, deliveries | 2 |
musehub_intel_models.py |
All intel_* and symbol_* tables | 23 |
musehub_identity_models.py |
Identities, attestations, mpay, profiles | 5 |
musehub_abuse_models.py |
Content safety, rate limits | 4 |
musehub_jobs_models.py |
Background jobs | 1 |
musehub_models.py is now a 95-line pure re-export shim — all 56 classes remain importable from the original namespace so the 314 existing call sites continue to work unchanged during incremental migration.
Gate: 79 model/schema/DB tests pass, 18 alembic/schema-check tests pass.
Phase 1 complete
What was done
Audited staging DB state directly via SSM: alembic is at revision
0019— 36 migrations behind head, not partially-applied drift. The schema divergence flagged byalembic checkon staging is entirely explained by the migration gap. A fresh local DB with all 55 migrations applied is clean.Tests landed on dev
test_s3_alembic_check_exits_clean) — callsalembic.command.check()on a fresh migration-applied DB, the exact gatedeploy.shuses. Passes now, guards against future ORM↔migration drift._EXPECTED_HEADand_EXPECTED_MIGRATION_COUNTupdated from0051/51 to49036ecac3d6/55. These two tests were failing before this change.All 9 schema tests green: S1, S2, S3 + 6 structural alembic tests.
Migration 0056 status
Not written. The original plan assumed partially-applied migrations that needed idempotent repair. The actual situation is simpler: staging is cleanly behind and will be nuked and rebuilt fresh. A repair migration would be a no-op on every correctly-deployed instance.
Staging deploy plan
alembic upgrade headon empty schema, deploy, re-import issues/proposals from backupPhase 1 gate: PASSED —
alembic checkexits 0 against a fresh DB afteralembic upgrade head.