gabriel / musehub public
Closed #1
filed by gabriel human · 48 days ago

Per-repo object store: eliminate three-store sync (flat file + musehub_objects + commit graph)

Problem

MuseHub currently maintains three separate stores that must stay perfectly in sync on every push:

  • Flat file store at /data/musehub/<sha256_hex>: 34,699 files in a single directory, globally namespaced across all repos
  • musehub_objects — DB mirror of the flat files (34,569 rows)
  • musehub_commits / musehub_snapshots / musehub_branches — parsed commit graph in the DB

Push negotiation (wire_negotiate) queries musehub_commits via SQL to determine what the remote already has. The object store is written separately. Any operation that rewrites commit IDs (migration, force-resign, partial push failure) leaves the three stores inconsistent, and the push negotiation then makes incorrect decisions — either under-sending (the bug we hit today) or over-sending.

This is the root cause of a whole class of bugs. The immediate symptom: after muse code migrate --force-resign rewrote all musehub commit IDs, the remote's DB had old IDs and the flat file store had a mix. The push walk stopped at a commit whose objects happened to be in the flat store but whose record wasn't in musehub_commits. Force push couldn't fix it.

What git forges do: GitHub/GitLab store bare repos on disk (repo.git/ with standard pack files). The DB holds only user metadata, access control, issues, and a search index. The repo data itself is never in the DB — the forge's API reads the on-disk git object store directly. A push is just writing objects to disk and advancing a ref file. There is no DB-object sync to break.

Target architecture

/data/repos/<owner>/<slug>/
  objects/
    sha256/                 ← algorithm namespace (mldsa65/ slots in here when PQ lands)
      ab/                   ← 2-char hex shard
        <62-hex>            ← object blob  (sha256 = 64 hex total; 2 consumed by shard dir)
  refs/
    heads/
      main                  ← contains: sha256:<64-hex>
      dev
  HEAD                      ← contains: ref: refs/heads/main

This mirrors .muse/objects/sha256/<shard>/<rest> exactly — the on-disk format is identical between the local client store and the server store.

DB tables that become caches only (rebuilt from disk, never canonical):

  • musehub_commits — fast graph queries, search
  • musehub_snapshots — fast manifest lookups
  • musehub_branches — fast branch listing

DB tables that remain canonical (not derivable from objects):

  • musehub_repos — repo metadata, visibility, owner
  • musehub_identities / musehub_auth_keys — identity/auth
  • musehub_issues / musehub_proposals / musehub_reviews — collaboration layer
  • musehub_objects — reduced to just storage_uri + size_bytes index for fetch path resolution

Implementation phases (load-bearing order)

Phase 1 — Per-repo directory isolation (unblocks everything else)

Goal: each repo gets its own object directory. Objects are no longer globally namespaced.

Changes:

LocalBackend._path(object_id) currently maps everything to /data/musehub/<safe_id>. Change to accept a repo_root parameter:

# Before
def _path(self, object_id: str) -> Path:
    return self._root / self._safe_id(object_id)

# After — Phase 1 (per-repo root, still flat; Phase 2 adds algo/shard)
def _path(self, object_id: str, repo_root: Path | None = None) -> Path:
    base = repo_root / "objects" if repo_root else self._root
    safe = self._safe_id(object_id)   # removed in Phase 2
    return base / safe

  • musehub_config.musehub_objects_dir becomes musehub_repos_dir (/data/repos)
  • Repo root = /data/repos/<owner>/<slug>/
  • All wire push/fetch/presign paths pass repo_root through to the backend

Migration:

# One-time job: for each object in musehub_objects:
# 1. Read old path from storage_uri
# 2. Compute new path: /data/repos/<owner>/<slug>/objects/<safe_id>  (still flat here; Phase 2 reshards)
# 3. hardlink (same filesystem) or copy, then update storage_uri
# Note: one object may be referenced by multiple repos (shared blobs) —
# hardlinks are correct here; copies are safe but use more space.
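Step 3 of the job could look roughly like the sketch below, which tries a hardlink first and falls back to a copy. The function name link_or_copy and its return values are illustrative only; updating storage_uri is left to the caller.

```python
import os
import shutil
from pathlib import Path


def link_or_copy(old_path: Path, new_path: Path) -> str:
    """Place old_path's content at new_path; return 'exists', 'hardlink' or 'copy'."""
    if new_path.exists():
        return "exists"  # idempotent: safe to re-run the job
    new_path.parent.mkdir(parents=True, exist_ok=True)
    try:
        os.link(old_path, new_path)  # shares the inode, so no extra disk space
        return "hardlink"
    except OSError:
        # Cross-device link or a filesystem without hardlinks: copy instead,
        # writing to a temp name first so readers never see a partial file.
        tmp = new_path.with_name(new_path.name + ".tmp")
        shutil.copy2(old_path, tmp)
        os.replace(tmp, new_path)
        return "copy"
```

Because objects are content-addressed, a blob shared by several repos can simply be linked into each repo's objects/ directory by calling this once per repo.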

Why first: all subsequent phases build on per-repo roots. Can't shard, can't add refs, can't make DB a cache until objects are per-repo.


Phase 2 — Object sharding (algo-namespaced + 2-char prefix)

Goal: eliminate flat-directory inode hell AND add algorithm namespacing. 34K files in one dir is already slow on some filesystems, and at 1M entries listing and lookup degrade badly and can run into filesystem-specific directory size limits. The algo level (sha256/, mldsa65/, …) mirrors the local .muse/objects/ layout exactly and future-proofs for post-quantum object IDs with zero layout changes.

Changes:

def _path(self, object_id: str, repo_root: Path) -> Path:
    # "sha256:abcdef0123..." → objects/sha256/ab/cdef0123...
    algo, hex_part = object_id.split(":", 1)
    return repo_root / "objects" / algo / hex_part[:2] / hex_part[2:]

_safe_id can be deleted — the algo/shard/rest structure never produces paths with reserved characters; the colon never appears on disk

No DB schema changes: storage_uri already stores the resolved path, so only its values are updated by the migration

Migration: rename files to <algo>/<shard>/<rest> paths, update storage_uri
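With _safe_id gone, the path function becomes the only guard between a client-supplied ID and the filesystem, so a stricter variant may be worth considering. The sketch below is one way to do that. The ID grammar (lowercase algo name, colon, lowercase hex) is an assumption, since this issue does not define it, and sharded_path and reshard are hypothetical names.

```python
import re
from pathlib import Path

# Assumption: object IDs look like "<algo>:<lowercase hex>".
_OBJECT_ID = re.compile(r"([a-z0-9]+):([0-9a-f]{3,})")


def sharded_path(repo_root: Path, object_id: str) -> Path:
    """Map "sha256:abcdef..." to objects/sha256/ab/cdef..., rejecting anything else."""
    match = _OBJECT_ID.fullmatch(object_id)
    if match is None:
        raise ValueError(f"malformed object ID: {object_id!r}")
    algo, hex_part = match.groups()
    return repo_root / "objects" / algo / hex_part[:2] / hex_part[2:]


def reshard(repo_root: Path, object_id: str, flat_path: Path) -> Path:
    """Move one flat-layout object file to its sharded location."""
    new_path = sharded_path(repo_root, object_id)
    new_path.parent.mkdir(parents=True, exist_ok=True)
    flat_path.replace(new_path)  # a rename when both paths share a filesystem
    return new_path
```

Rejecting IDs such as "sha256:../../etc" up front means a malformed ID can never resolve outside objects/.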

Why second: the shard layout is a prerequisite for pack file support in Phase 5. Doing it before Phase 3 means the cache-rebuild logic in Phase 3 never sees the flat layout.


Phase 3 — On-disk refs as canonical branch pointers

Goal: branch heads live in refs/heads/<name> files on disk. DB musehub_branches.head_commit_id becomes a cache column.

Changes:

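# On push: after objects are written, write refs/heads/<branch> atomically (rename-into-place)
ref_path = repo_root / "refs" / "heads" / branch_name
ref_path.parent.mkdir(parents=True, exist_ok=True)   # branch names may contain "/"
# Append ".tmp" instead of using with_suffix(), which would turn "v1.0" into "v1.tmp"
tmp = ref_path.with_name(ref_path.name + ".tmp")
tmp.write_text(f"{new_head_commit_id}\n")
tmp.replace(ref_path)   # atomic rename on POSIX; overwrites an existing ref
# Then update musehub_branches in the DB (cache write, not authoritative)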

  • Add GET /repos/{owner}/{repo}/branches/{name}/repair: reads disk, heals DB if diverged
  • Startup health-check: compare DB branch heads vs disk refs; log divergence

Why third: once refs are on disk, the DB can be treated as a cache for the first time. Push negotiation can fall back to disk when the DB is stale (fixing our immediate bug class).


Phase 4 — Push negotiation reads disk, not DB

Goal: wire_negotiate reads the on-disk commit graph, eliminating DB-drift bugs.

Current:

# wire_negotiate — queries musehub_commits via SQL
ack_q = await session.execute(
    select(db.MusehubCommit.commit_id).where(
        db.MusehubCommit.commit_id.in_(have_set), ...))

Target: read commit objects directly from the per-repo object store:

async def _commit_exists_on_disk(repo_root: Path, commit_id: str) -> bool:
    path = object_path(repo_root, commit_id)
    return path.exists()

# wire_negotiate — no DB query for have/want negotiation
ack = [cid for cid in have_set if await _commit_exists_on_disk(repo_root, cid)]

  • DB musehub_commits is still written on push (for fast graph queries and API search)
  • But push negotiation never trusts it as the source of truth

Result: force-resign, migration, partial push — none can corrupt the negotiation
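For reference, a self-contained version of the disk-based ack computation could look like this. It is a sketch, not the shipped code: object_path is written out here using the Phase 2 layout, ack_from_disk is a hypothetical name, and the filesystem checks are pushed to a worker thread so they do not block the event loop.

```python
import asyncio
from pathlib import Path


def object_path(repo_root: Path, object_id: str) -> Path:
    # Phase 2 layout: "sha256:abcd..." -> objects/sha256/ab/cd...
    algo, hex_part = object_id.split(":", 1)
    return repo_root / "objects" / algo / hex_part[:2] / hex_part[2:]


async def ack_from_disk(repo_root: Path, have_set: set[str]) -> list[str]:
    """Return the subset of have_set whose objects exist in this repo's store."""

    def _scan() -> list[str]:
        return sorted(
            cid for cid in have_set if object_path(repo_root, cid).is_file()
        )

    # Path.is_file() is blocking I/O, so run the scan off the event loop.
    return await asyncio.to_thread(_scan)
```

This only sees loose objects. Once Phase 5 lands, the existence check would also need to consult pack files, in the same way LocalBackend.get() does.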

Why fourth: depends on per-repo roots (Phase 1) to know where to look. This phase directly fixes the class of bug that motivated this ticket.


Phase 5 — Pack file support + GC

Goal: periodically pack loose objects into pack files (like git pack-objects). Reduces inode count from one per object to a constant number per pack (an index file plus a data file).

Changes:

  • Pack format: a sorted index file + a data file, mirroring the msgpack wire format already used in bundles
  • muse maintenance run --pack on the server-side repo triggers packing
  • Loose objects written by push; background job packs them (like git gc --auto)
  • LocalBackend.get() checks pack files when loose object not found
  • Pack files are immutable once written; GC deletes packs whose objects are all present in newer packs

Why fifth: purely a performance/scalability concern. No earlier phase depends on it for correctness, but it does require the sharded layout from Phase 2.


Phase 6 — Storage tier formalisation

Goal: formalise the hot/warm/cold tiering that get_backend() already hints at.

Hot  — local disk (per-repo objects, recently pushed, < 30 days)
Warm — S3/R2 (objects older than 30 days, large blobs > 10 MB)
Cold — Glacier / archival (objects > 1 year, unpopular repos)

Changes:

  • StorageBackend protocol gains tier() -> Literal['hot','warm','cold']
  • get_backend() returns a TieredBackend that falls through hot → warm → cold
  • Background job promotes/demotes objects between tiers based on access time + age
  • LocalBackend = hot; S3Backend = warm (already implemented); new GlacierBackend = cold

Why last: pure operational concern, no correctness impact. Can be done incrementally per-repo after the storage layout is stable.

Acceptance criteria

  • Each repo has an isolated object directory under /data/repos/<owner>/<slug>/objects/
  • Objects are algo-namespaced and sharded: objects/sha256/ab/<62-hex>
  • Branch heads written atomically to refs/heads/<name> on disk on every push
  • wire_negotiate does not query musehub_commits for have/want resolution
  • Force-resign + re-push works without DB surgery
  • GET /repos/{owner}/{repo}/branches/{name}/repair heals DB from disk
  • Migration job moves all existing objects to per-repo sharded paths
  • Pack file GC runs as a background maintenance job
  • muse push local dev --force after a full force-resign succeeds in one command with no manual DB edits
Activity
gabriel opened this issue 48 days ago
gabriel 48 days ago

Implemented across six phases. All phases shipped and live on staging.

  • Phase 1 (per-repo isolation): sha256:af840c37d9e3f
  • Phase 3 (on-disk refs): sha256:4d14406aff879
  • Phase 4 (disk-based push negotiation): sha256:b68da9c205ac4
  • Phase 5 (pack file + GC): sha256:699c1fcf16db4
  • Phase 6 (storage tier formalisation): sha256:386278accc3dd
  • Wire-up (repo_root threading through push/fetch): sha256:627a1cfe4efcd sha256:cde72cee233cf sha256:5edef517efe89
  • Migration script (flat → per-repo, dry-run / migrate / verify / prune): sha256:f8d3be6b8476c

Architecture correction (algo-namespaced paths, _safe_id deletion) captured in the issue body.