Implement true soft-delete for repos with cascade tombstoning
Overview
There is a critical discrepancy between the API contract and the implementation for repo deletion.
The route docstring claims:
"Marks the repo as deleted by recording a
deleted_attimestamp; all data is retained in the database for audit purposes."
The actual service implementation in musehub/services/musehub_repository.py::delete_repo (line 293):
async def delete_repo(session: AsyncSession, repo_id: str) -> bool:
"""Hard-delete a repo and all its cascade-deleted dependents."""
row = await session.get(MusehubRepo, repo_id)
if row is None:
return False
await session.delete(row) # ← hard delete, not soft delete
await session.flush()
return True
The MusehubRepo model does have a deleted_at column (added in the consolidated schema), but it is never set by this function. Calling DELETE /repos/{repo_id} today hard-deletes the repo row and cascades to all dependents — issues, proposals, comments, collaborators, webhooks, objects — permanently and immediately.
This must be fixed before any repo deletion is exposed in the UI or invoked during the rc11 deploy cleanup, as it is irreversible.
Phase 1 — Audit the full cascade surface
Before writing any fix, map exactly what a hard delete currently destroys via DB cascade. Check the ORM model relationships and any ON DELETE CASCADE constraints in the schema:
muse -C ~/ecosystem/musehub content-grep "CASCADE|ondelete" --json
muse -C ~/ecosystem/musehub code symbols --file musehub/db/musehub_repo_models.py --json
Produce a complete list of every table/model that would be affected by deleting a MusehubRepo row. Expected candidates:
musehub_issues+musehub_issue_comments+musehub_issue_labelsmusehub_proposals+musehub_proposal_commentsmusehub_collaboratorsmusehub_webhooks+musehub_webhook_deliveriesmusehub_objects(already has its owndeleted_at)musehub_branches,musehub_commits,musehub_snapshots- Fork relationships (
fork_parent_idforeign keys)
Phase 2 — Fix delete_repo to set deleted_at instead of hard-deleting
Replace the hard delete with a tombstone:
async def delete_repo(session: AsyncSession, repo_id: str) -> bool:
"""Soft-delete a repo by recording a deleted_at timestamp.
All associated data is retained in the database for audit purposes.
Subsequent reads return 404. Returns True if the repo existed, False if not found.
The caller is responsible for committing the session.
"""
row = await session.get(MusehubRepo, repo_id)
if row is None:
return False
if row.deleted_at is not None:
return True # idempotent — already deleted
row.deleted_at = datetime.now(timezone.utc)
await session.flush()
logger.info("✅ Soft-deleted MuseHub repo %s", repo_id)
return True
Verify that deleted_at is already on the MusehubRepo ORM model. If not, add a migration to add the column (check musehub/db/musehub_repo_models.py line 276 — it appears to be there already).
Phase 3 — Cascade tombstone to child resources
A soft-deleted repo must hide all its children. Dependents must be tombstoned in the same transaction so they are invisible in list queries without requiring a JOIN to the parent repo on every query.
Resources to tombstone when a repo is soft-deleted:
musehub_issues→ setdeleted_at = now()whererepo_id = :repo_idanddeleted_at IS NULLmusehub_issue_comments→ setis_deleted = truewhereissue_id IN (SELECT issue_id FROM musehub_issues WHERE repo_id = :repo_id)musehub_proposals→ setdeleted_at = now()(confirm column exists; add migration if not)musehub_proposal_comments→ setis_deleted = truevia proposal subquery
Resources that do NOT need tombstoning (filtered by repo lookup upstream):
musehub_objects,musehub_branches,musehub_commits— these are already hidden once the repo row returns 404; no independent list endpoint exposes them without a repo context
Implement the cascade as explicit bulk UPDATE statements within delete_repo (not ORM-level cascade), so the tombstone is atomic with the repo's own deleted_at set in a single flush.
Phase 4 — Verify all read paths filter on deleted_at
After the tombstone is set, every read path that queries a repo must filter it out. Audit these surfaces:
muse -C ~/ecosystem/musehub content-grep "get_repo|list_repos|MusehubRepo" --json
For each query that touches MusehubRepo, confirm it includes:
MusehubRepo.deleted_at.is_(None)
Known surfaces to check:
musehub_repository.get_repo— isdeleted_at IS NULLalready in the WHERE clause?musehub_repository.list_repos_for_usermusehub_repository.list_repo_forks/list_repo_forks_flat- Sitemap generator (
musehub/api/routes/musehub/sitemap.py) - MCP write tools (
musehub/mcp/write_tools/repos.py) - Push / fetch / clone handlers — a deleted repo must reject incoming writes
Any path that does not filter deleted_at is a data-leak bug — fix it in this phase.
Phase 5 — muse hub repo delete CLI — verify end-to-end
Once the service is fixed, verify the full CLI path:
# Create a throwaway repo
muse -C ~/ecosystem/musehub hub repo create --name test-soft-delete-$(date +%s) --visibility private --json
# Delete it
muse -C ~/ecosystem/musehub hub repo delete --json
# Confirm 404 on subsequent read
muse -C ~/ecosystem/musehub hub repo read --json # must return 404
# Confirm row still exists in DB with deleted_at set (via SSM):
# SELECT repo_id, slug, deleted_at FROM musehub_repos WHERE repo_id = '<id>';
Idempotency check: calling delete twice must return success, not 404 on second call.
Phase 6 — Migration: add deleted_at to musehub_proposals if missing
Phase 3 assumes musehub_proposals has a deleted_at column. If it does not, write a migration:
op.add_column('musehub_proposals',
sa.Column('deleted_at', sa.DateTime(timezone=True), nullable=True))
op.create_index('ix_musehub_proposals_deleted_at', 'musehub_proposals', ['deleted_at'])
Add the corresponding ORM column to the MusehubProposal model. Without it, proposals belonging to a soft-deleted repo remain visible in any proposal list query that does not join to musehub_repos.
Acceptance criteria
[ ] DELETE /repos/{repo_id} sets deleted_at on the repo row — no hard delete
[ ] Deleted repo returns 404 on all read paths (GET, ls-remote, push, fetch, clone)
[ ] Issues belonging to the repo are tombstoned (deleted_at set) in the same transaction
[ ] Issue comments belonging to those issues are tombstoned (is_deleted = true)
[ ] Proposals belonging to the repo are tombstoned
[ ] Proposal comments belonging to those proposals are tombstoned
[ ] All MusehubRepo list/get queries filter deleted_at IS NULL
[ ] Deleting an already-deleted repo is idempotent (returns success, not 404)
[ ] DB row is preserved — a hard delete never occurs via this path
[ ] Migration added for musehub_proposals.deleted_at if column was missing
[ ] End-to-end CLI test passes (create → delete → read returns 404 → DB row survives)