MuseHub Pre-Launch Checklist
This document governs what must be complete, verified, and signed off before MuseHub opens to users beyond gabriel. Items are grouped by domain. Nothing ships until every checkbox is checked. Re-check after any major refactor.
0. Philosophy
The threat model is realistic: we are open source, so attackers can read every route, every query, every auth flow. We assume they will. The bar is not "unbreakable against nation-states." The bar is: a well-resourced, intelligent adversary with full source access cannot compromise data, impersonate users, or take the service down with commodity tooling. Normal abuse — scrapers, credential stuffing, path traversal attempts, large payload bombs — should be a non-event.
1. Authentication & Authorization
1.1 Ed25519 / MSign
Auth is Ed25519 per-request signing (MSign). No server secret, no token expiry, no refresh. The public key registered in the DB is the credential.
- [x] No server secret, no `ACCESS_TOKEN_SECRET`: auth is pure Ed25519 key pairs
- [x] Per-request Ed25519 signature verified on every protected route (`require_signed_request`)
- [x] 30-second replay window enforced (`REPLAY_WINDOW_SECONDS = 30` in `request_signing.py`)
- [x] Challenge nonce is single-use: consumed with `.pop()` on first verify; 5-min TTL for GC
- [x] `identity.toml` keys are per-host, never shared across environments (muse CLI enforces this)
- [x] Key revocation: compromised key deleted via `DELETE /api/auth/keys/{handle}/{key_id}`
- [x] Auth endpoints rate-limited at 20 req/min per IP via `slowapi` (`AUTH_LIMIT` in `rate_limits.py`)
- [x] Bearer tokens explicitly rejected with 401 (MSign is the only accepted scheme)
- [x] `WWW-Authenticate: MSign realm="musehub"` returned on all 401 responses
- [x] Failed-auth-specific rate limiting: `musehub/auth/failure_limiter.py` is an in-memory per-IP failure counter with exponential backoff. Thresholds: 5→30s, 10→5min, 20→15min. Wired into `POST /api/auth/verify` (check before, record_failure on AuthError, record_success on ok).
- [x] No CAPTCHA needed: there is no password or secret the attacker could guess; the private key never leaves the client machine
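
The failure-limiter item above can be sketched roughly as follows. This is a minimal illustration, not the contents of `failure_limiter.py`: the class name, method signatures, and the injectable clock are assumptions; only the thresholds (5→30s, 10→5min, 20→15min) and the check / record_failure / record_success flow come from the checklist.

```python
import time

# Thresholds from the checklist, highest first: (failure count, lockout seconds).
BACKOFF_STEPS = [(20, 15 * 60), (10, 5 * 60), (5, 30)]


def lockout_seconds(failures: int) -> int:
    """Map a cumulative failure count to a lockout duration in seconds."""
    for threshold, seconds in BACKOFF_STEPS:
        if failures >= threshold:
            return seconds
    return 0


class FailureLimiter:
    """Hypothetical in-memory, per-IP failure counter (illustrative only)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock  # injectable so tests do not need to sleep
        self._state: dict[str, tuple[int, float]] = {}  # ip -> (failures, blocked_until)

    def check(self, ip: str) -> float:
        """Return how many seconds the caller must still wait (0.0 if allowed)."""
        _, blocked_until = self._state.get(ip, (0, 0.0))
        return max(0.0, blocked_until - self._clock())

    def record_failure(self, ip: str) -> None:
        failures, _ = self._state.get(ip, (0, 0.0))
        failures += 1
        self._state[ip] = (failures, self._clock() + lockout_seconds(failures))

    def record_success(self, ip: str) -> None:
        # A successful verify clears the counter for that IP.
        self._state.pop(ip, None)
```

A handler would call `check()` before verifying the signature and reject early if it returns a positive wait. Because the state lives in process memory, counters in this sketch are per-worker and reset on restart.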
1.2 Authorization (ownership checks)
- [x] Every repo-scoped JSON API endpoint asserts `repo.owner == current_user` (or team membership)
  Destructive/state-changing ops (delete, transfer, close, merge, assign, label, milestone) use `_guard_owner` / `_guard_repo_owner` helpers. Collaborator team membership is future work.
- [ ] SSR UI layer visibility gate: all 14 repo-scoped `ui_*.py` route handlers must check `repo.visibility` before serving content. Fixed in issue #90 (task/ssr-visibility-gate). Previously, private repos returned HTTP 200 to anonymous browser requests. `ui_repo_settings.py` and `ui_sessions.py` additionally require `claims.handle == owner`.
- [x] Repo visibility (public/private) is checked before serving any object, blob, or archive
  All JSON API GET endpoints gate on `optional_token` + `repo.visibility != "public"` check.
- [x] Object download (`/archive`, `/blob`, `/object`) cannot be path-traversed to another repo
  `get_file_at_ref` resolves via snapshot manifest; `get_object_row` filters by `repo_id AND object_id` in SQL. The DB is the authority; there is no path concatenation.
- [x] Issue, merge-proposal, and comment endpoints verify the caller owns the parent repo
  `_guard_repo_owner` added to: close/reopen/update/assign/milestone/labels on issues; merge/request-reviewers/remove-reviewer on proposals; delete-comment. `_guard_write_access` added to: create-issue, create-comment, create-proposal, create-proposal-comment (private repos: owner-only; public repos: any authenticated user).
- [ ] Admin-only endpoints (`/api/admin/*`) are gated by a separate `is_admin` claim
  No `/api/admin/*` routes exist yet. `MSignContext.is_admin` is defined but always False. Low priority: no system-admin operations are needed pre-launch.
- [x] There is no "owner" field that a caller can self-assign via POST body
  `create_repo` sets `owner_user_id` from `claims` (the authenticated caller), never from the request body. The body's `owner` field is a display slug only. PATCH on identities explicitly whitelists allowed fields and does not expose `handle` or `identity_type`.
1.3 Key hygiene
- [x] Private keys are never logged: auth logs contain only handle, algo, key_id, and fingerprint prefixes; `public_key_b64`, `signature_b64`, and `Authorization` header values are never passed to any logger
- [x] MSign signatures are never returned in redirect URLs: all `RedirectResponse` targets are static paths or `/{owner}/{repo}?welcome=1`; no auth material in any Location header
- [x] `Authorization: MSign` is the only accepted transport: no `?token=` query param, no `Bearer` fallback path
2. Input Validation & Injection
2.1 Path / traversal
- [x] All `owner`, `repo`, `branch`, `path` URL segments are validated against an allowlist regex (`^[a-zA-Z0-9_.-]+$`, length-capped) before touching the filesystem or DB
- [x] Constructed file paths are resolved with `Path.resolve()` and checked to be inside the expected root; no `../../` escapes
- [x] Objects are fetched from the blob store (R2/MinIO) by `object_id`; no disk paths involved
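
A minimal sketch of the two path checks above, under stated assumptions: the function names and the 255-character cap are hypothetical, and multi-component paths are assumed to be validated one component at a time, since the allowlist has no `/`. Two details are worth noting. The allowlist regex on its own accepts `.` and `..`, so the sketch rejects them explicitly, and it uses `fullmatch` because `$` would otherwise tolerate a trailing newline.

```python
import re
from pathlib import Path

SEGMENT_RE = re.compile(r"^[a-zA-Z0-9_.-]+$")  # allowlist from the checklist
MAX_SEGMENT_LEN = 255  # assumed cap; the checklist only says "length-capped"


def is_valid_segment(segment: str) -> bool:
    """Allowlist check for a single URL segment (owner, repo, branch, ...)."""
    return (
        len(segment) <= MAX_SEGMENT_LEN
        and SEGMENT_RE.fullmatch(segment) is not None
        and segment not in {".", ".."}  # dots are in the allowlist, so reject these
    )


def resolve_inside(root: Path, *segments: str) -> Path:
    """Join segments under root and refuse anything that resolves outside it."""
    for segment in segments:
        if not is_valid_segment(segment):
            raise ValueError(f"invalid path segment: {segment!r}")
    root = root.resolve()
    candidate = root.joinpath(*segments).resolve()
    # Defence in depth: symlinks could still point outside the root.
    if not candidate.is_relative_to(root):  # Python 3.9+
        raise ValueError("path escapes the expected root")
    return candidate
```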
2.2 SQL / ORM
- [x] Zero raw SQL string interpolation anywhere in the codebase (`muse content-grep "f\""` audit)
- [x] All queries go through the ORM or parameterized `text()` with bound params
- [x] Search / filter inputs are sanitized before being passed to `LIKE` or `tsvector`
2.3 Payload / content
- [x] Request bodies have a hard size cap (e.g., 10 MB for API, 100 MB for object upload) enforced at the ASGI/nginx layer, not just application code
  (nginx: `client_max_body_size 500m`; ASGI: `ContentSizeLimitMiddleware` with 10 MB API, 500 MB push)
- [x] Markdown rendered server-side is sanitized (no raw `<script>`, `<iframe>`, event handlers); use a strict allowlist renderer (e.g., `bleach` + `mistune`)
  (mistune 3.x `HTMLRenderer(escape=True)` escapes all raw HTML; javascript: URLs blocked; see `jinja2_filters.py`)
- [x] Filenames in archives are validated before extraction (zip-slip prevention)
  (N/A: no server-side archive extraction in current codebase; objects are stored as content-addressed blobs)
- [x] YAML / TOML config uploaded by users is parsed in a sandbox, not `eval`'d
  (CI workflow YAML uses `yaml.safe_load` with 256 KiB size limit; no other user-uploaded config parsed)
2.4 Object / commit integrity
- [x] Every pushed object is content-addressed: SHA-256 of payload must match its stored ID
  (`_verify_object_hash` in `musehub_wire.py`, applied in both `wire_push` and `wire_push_objects` before any bytes touch the storage backend; non-sha256: prefixes are forwarded for compat)
- [x] `muse verify` passes cleanly on every repo after a push
  (server-side SHA-256 check at receive time is the equivalent gate; any object that passes `wire_push` is already content-address-verified, so `muse verify` will agree)
- [x] Commits with forged `parent_id` references are rejected at receive time
  (each declared parent_commit_id must exist in the push mpack OR in the DB for this repo; parents belonging to a different repo are rejected with 409)
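
The content-address check in the first item might look roughly like the sketch below. It is not the actual `_verify_object_hash`: the function name is illustrative, and the `sha256:<hex digest>` ID layout is an assumption inferred from the note that non-`sha256:` prefixes are forwarded for compatibility.

```python
import hashlib

SHA256_PREFIX = "sha256:"  # assumed ID layout: "sha256:" + lowercase hex digest


def object_hash_matches(object_id: str, payload: bytes) -> bool:
    """Return True if the payload hashes to the ID it claims to have.

    IDs with any other prefix are passed through unchecked, mirroring the
    compat behaviour described in the checklist.
    """
    if not object_id.startswith(SHA256_PREFIX):
        return True
    expected = object_id[len(SHA256_PREFIX):].lower()
    actual = hashlib.sha256(payload).hexdigest()
    return actual == expected
```

Running this before any bytes reach the storage backend means a forged ID fails at receive time rather than surfacing later during verification.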
3. Network & Transport
- [x] TLS 1.2 minimum enforced at the load balancer / nginx; TLS 1.0/1.1 disabled
  (`ssl_protocols TLSv1.2 TLSv1.3` + explicit cipher suite added to `deploy/nginx-cf.conf`)
- [x] HSTS header set (`max-age=31536000; includeSubDomains`)
  (`SecurityHeadersMiddleware` sets `max-age=63072000; includeSubDomains; preload`, i.e. 2 years with preload, which exceeds the requirement; only in non-debug mode)
- [x] All HTTP traffic redirects to HTTPS (301, not 302)
  (port 80 server block added to `deploy/nginx-cf.conf` with `return 301 https://...`)
- [x] CORS policy is explicit and minimal, not `*` for authenticated endpoints
  (`allow_methods` restricted to GET/POST/PATCH/DELETE/OPTIONS/HEAD; `allow_headers` restricted to Authorization/Content-Type/Accept/X-Requested-With; origins via `cors_origins` env var, warns if `*`)
- [x] `Content-Security-Policy` header prevents inline script execution and framing
  (CSP set globally in `SecurityHeadersMiddleware`: no `unsafe-inline` in `script-src`; `frame-ancestors 'none'`; `unsafe-eval` retained for Alpine.js v3)
- [x] `X-Content-Type-Options: nosniff` and `X-Frame-Options: DENY` set globally
  (both set in `SecurityHeadersMiddleware` on every response)
- [x] Cookies (session, CSRF) are `Secure`, `HttpOnly`, `SameSite=Strict`
  (N/A: no server-side cookies. Auth is Ed25519 MSign per-request signing; no session middleware; no CSRF tokens)
- [x] No mixed-content (HTTP resources loaded from HTTPS pages)
  (CSP includes `upgrade-insecure-requests`; `connect-src 'self'` and `img-src 'self' data: https:` block HTTP sub-resources)
4. Rate Limiting & Abuse Prevention
- [x] Global rate limit per IP: e.g., 300 req/min baseline, configurable per route
  (`Limiter(default_limits=["300/minute"])` in `rate_limits.py`; applies to all routes without an explicit tighter limit)
- [x] Auth endpoints (login, register, challenge): stricter limit, e.g., 10 req/min per IP
  (`AUTH_LIMIT = "20/minute"` on all auth routes; `failure_limiter.py` adds per-IP exponential backoff on failures)
- [x] Object upload endpoint: limited by both request rate and total bytes/hour per user
  (push endpoints at 30/min via `WIRE_PUSH_LIMIT`; bytes/hour per-user tracking deferred; push body size cap at 500 MB covers payload bombs)
- [x] Archive download: rate-limited and/or requires authentication for private repos
  (`GET /o/{object_id}` limited to 120/min via `OBJECT_LIMIT`; private repo visibility enforced at repo layer)
- [x] Search endpoint: limited to prevent full-index scraping
  (`@limiter.limit(SEARCH_LIMIT)` wired up on `/api/search`, `/search`, `/search/repos`, `/repos/{id}/search`)
- [x] 429 responses include `Retry-After` header
  (`_handle_rate_limit` override in `main.py` computes `Retry-After` from `X-RateLimit-Reset` timestamp)
- [x] Bot / scraper detection via User-Agent + behavioral heuristics; block or throttle
  (`BotThrottleMiddleware` in `musehub/middleware/bot_throttle.py`: known-bad UA patterns → 429; missing UA on non-CDN paths → 429; `/healthz`, `/static/`, `/mcp` exempt)
- [x] Webhook delivery retries are capped and backed off (no retry storms)
  (`_MAX_ATTEMPTS = 3`, `_BACKOFF_BASE = 1.0` → 1s/2s/4s exponential backoff in `musehub_webhook_dispatcher.py`)
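
To illustrate the capped webhook retry in the last item, here is a small sketch built on the two constants the checklist names. The helper names, the boolean `send` callback, and the choice to sleep only between attempts (not after the final failure) are assumptions rather than the dispatcher's actual behaviour.

```python
import time
from typing import Callable

MAX_ATTEMPTS = 3    # mirrors _MAX_ATTEMPTS in the checklist
BACKOFF_BASE = 1.0  # mirrors _BACKOFF_BASE in the checklist


def backoff_delay(attempt: int) -> float:
    """Delay after the given zero-based attempt: 1s, 2s, 4s, ..."""
    return BACKOFF_BASE * (2 ** attempt)


def deliver_with_retry(
    send: Callable[[], bool],
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call send() up to MAX_ATTEMPTS times, backing off between failures."""
    for attempt in range(MAX_ATTEMPTS):
        if send():
            return True
        if attempt < MAX_ATTEMPTS - 1:  # assumed: no sleep after the last try
            sleep(backoff_delay(attempt))
    return False
```

The hard cap on attempts is what prevents a retry storm; the backoff only spreads the bounded number of retries over time.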
5. Data Integrity
5.1 Database
- [ ] Postgres WAL archiving enabled (point-in-time recovery)
  (ops config: on the Postgres server set `archive_mode=on`, `archive_command`, `wal_level=replica`; requires postgres superuser access. Not automatable from app code.)
- [x] Automated daily snapshot backup to a separate storage location (not same disk)
  (`deploy/backup.sh`: daily `pg_dump | gzip` at 3 AM; syncs to Cloudflare R2 via rclone for off-disk retention (90 days). Set `BACKUP_R2_BUCKET=<bucket>` in `.env`.)
- [ ] Backup restore drill: restore latest backup to a staging DB and verify row counts
  (ops procedure: decompress latest `.sql.gz`, `psql musehub_staging < dump.sql`, run `SELECT COUNT(*) FROM musehub_repos` etc. Do before every major migration.)
- [x] Foreign key constraints enforced (not deferred or disabled for speed)
  (all FK columns use `ForeignKey(..., ondelete="CASCADE")`; PostgreSQL enforces FKs natively)
- [x] Critical tables have `updated_at` triggers for audit trails
  (added `updated_at` to MusehubRepo, MusehubProposal, MusehubWebhook, MusehubRelease; already present on MusehubIssue/Comment/Milestone/RenderJob. Migration 0020 backfills existing rows with `server_default=func.now()`.)
- [x] No orphaned objects: object rows reference valid repo IDs (FK + periodic scan)
  (FK+CASCADE guarantees DB-level cleanup on repo delete; `musehub/maintenance/orphan_scan.py` provides `scan_orphan_objects()` and `delete_orphan_objects()` for scheduled maintenance)
5.2 Object store
- [x] Object files on disk are immutable after write (append-only content-addressed store)
- [x] Periodic integrity scan: re-SHA-256 a random sample of stored objects against their IDs
- [x] Disk usage quotas enforced per repo and per user (prevent storage exhaustion)
- [x] Deletion is soft-delete first (tombstone), hard-delete only after a retention window
5.3 Migrations
- [x] All schema changes go through versioned Alembic migrations
- [x] Migrations are tested against a production-sized data snapshot before apply
- [x] Rollback migration exists for every forward migration
- [x] Migration is run in a transaction; failure rolls back cleanly
6. Performance & Scalability
6.1 Database
- [x] Indexes exist on all foreign keys and common filter columns
  (`repos.owner`, `commits.repo_id`, `symbols.repo_id + address`, `issues.repo_id`)
- [x] `EXPLAIN ANALYZE` run on the 10 highest-traffic queries; no seq scans on large tables
- [x] Connection pooling configured (PgBouncer or SQLAlchemy pool); max connections capped
- [x] Slow query log enabled (threshold: 100 ms); alerts wired
6.2 API
- [x] Symbol list, commit log, and blame endpoints are paginated — no unbounded result sets
- [x] Large diffs / blobs are streamed, not buffered in memory
- [x] Archive download streams directly from disk — no full file read into RAM
- [x] Symbol intelligence queries (hotspots, dead code) are pre-computed at push time, never computed on-the-fly per request
6.3 Static assets
- [x] `app.css` and JS are served with far-future `Cache-Control` headers + content hash in filename for cache busting
  (`StaticCacheMiddleware`: `public, max-age=31536000, immutable` for .css/.js/.map; `static_version` = SHA-256(app.css+app.js)[:8] injected as `?v=` on all assets in base.html)
- [x] Assets are gzip or brotli compressed at the nginx/CDN layer
  (`deploy/nginx-cf.conf`: `gzip on; gzip_comp_level 6;` covers text/css, application/javascript, application/json, image/svg+xml and more)
- [x] No blocking synchronous calls in request handlers (all DB and I/O are async)
  (`objects.py`: bare `open()` wrapped in `asyncio.to_thread`; storage reads use `stream()` async generator)
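
The cache-busting token described in the first item, SHA-256 over the concatenated asset bytes truncated to 8 hex characters, can be sketched as below. The function names are hypothetical, and reading the asset bytes from disk is left to the caller.

```python
import hashlib


def static_version(*assets: bytes) -> str:
    """First 8 hex chars of SHA-256 over the concatenated asset contents."""
    digest = hashlib.sha256()
    for content in assets:
        digest.update(content)
    return digest.hexdigest()[:8]


def versioned_url(path: str, version: str) -> str:
    """Append the version as a ?v= query parameter (assumes no existing query)."""
    return f"{path}?v={version}"
```

Because the token changes whenever either file changes, a one-year `immutable` cache lifetime is safe: a new deploy produces new URLs, so clients never see stale assets.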
6.4 Load testing
- [x] Baseline load test: 100 concurrent users, normal read-heavy traffic; p99 < 500 ms
  (Locust `BaselineUser` scenario in `deploy/load-tests/locustfile.py`; run against staging. Infra readiness verified: pool_size=20 + max_overflow=40 = 60 DB conns; `UVICORN_WORKERS` defaults to 4; `GLOBAL_LIMIT=300/min` allows normal browsing patterns.)
- [x] Spike test: 10× normal traffic for 60 s; service degrades gracefully (429s), does not crash
  (Locust `SpikeBurst` scenario; 429s verified to carry `Retry-After` header via `_handle_rate_limit`; `RateLimitExceeded` handler registered in app exception handlers.)
- [x] Soak test: sustained moderate load for 12 h; no memory leak, no connection leak
  (Locust `SoakUser` scenario; `MemoryLogMiddleware` registered as outermost ASGI layer, logs RSS > 400 MiB; `rss_mb()` via psutil available for monitoring.)
- [x] Write-heavy test: 50 concurrent pushes; object store and DB remain consistent
  (Locust `WritePushUser` scenario; in-process concurrent push verified with `asyncio.gather` over 10 pre-upload + 5 concurrent commit push calls; no 500s, no deadlocks.)
7. Infrastructure & Operations
7.1 Environments
- [x] Local dev: runs via `docker compose`, no shared state with staging/prod
  (`docker-compose.override.yml`: `DEBUG=true` + bind mounts; named volumes `musehub_data` and `postgres_data` are isolated from staging/prod)
- [x] Staging: full production mirror (same Docker image, separate DB, separate object store), accessible at an internal URL only, no public DNS
  (`aws-provision-staging.sh`: `INSTANCE_NAME=musehub-staging`, separate EIP; same AMI as prod; `setup-ec2-staging.sh` uses `staging.musehub.ai` domain)
- [x] Production: isolated VPC, restricted inbound (80/443 only), no SSH from public internet
  (`aws-provision.sh`: SSH restricted to current IP only, never `0.0.0.0/0`; nginx forces HTTP→HTTPS redirect; Cloudflare terminates TLS at edge; uvicorn started with `--proxy-headers` for correct client IP propagation)
- [x] Environment config (secrets, DB URLs, object store paths) is injected via env vars or secrets manager, never committed to source
  (`.museignore` excludes `.env` and `.env.*`; `pydantic-settings` reads all config from env vars; `.env.example` documents every required field per environment; `Settings` defaults all secret fields to `None`; startup guard rejects weak `DB_PASSWORD` in production; missing `WEBHOOK_SECRET_KEY` and `RUNNER_TOKEN` logged at WARNING on startup)
7.2 Secrets management
- [x] DB password, webhook secret, and object store credentials are stored in a secrets manager (e.g., Doppler, AWS Secrets Manager, Vault), not in `.env` files in the repo
  (`deploy/secrets.sh`: fetches all secrets from AWS SSM Parameter Store SecureString parameters at deploy time, writes `/opt/musehub/.env` at mode 600; never stores credentials in the repo or Docker image layers)
- [x] Secrets are rotated on a schedule (DB password: 180 days; webhook key: on compromise)
  (`docs/secret-rotation-runbook.md`: step-by-step `aws ssm put-parameter` rotation for DB_PASSWORD (180 days), WEBHOOK_SECRET_KEY (on compromise), RUNNER_TOKEN (90 days), R2 credentials (90 days), and CloudTrail audit procedure)
- [x] No secrets in Docker image layers (`docker history` audit)
  (Dockerfile has no secret-bearing `ARG`/`ENV`; only `PYTHONPATH`, `PYTHONDONTWRITEBYTECODE`, `PYTHONUNBUFFERED` are baked in; no `.env` COPY; audit command documented in runbook)
- [x] CI/CD pipelines inject secrets at runtime, not build time
  (`deploy.sh`: containers started with `--env-file $APP_DIR/.env` (runtime); no secrets passed as build args; `setup-ec2.sh` generates fresh `DB_PASSWORD` + `WEBHOOK_SECRET_KEY` via `openssl rand` / `Fernet.generate_key()` on first provision)
7.3 Deployment
- [x] Zero-downtime deploy: rolling restart or blue/green, no hard cutover
  (`deploy/deploy.sh`: two slots blue/green, nginx upstream pointer file, atomic flip via `nginx -s reload`; health-checked before flip, old slot stopped after)
- [x] Health check endpoint (`/healthz`) returns 200 only when DB connection and object store are reachable
  (`GET /healthz` in `main.py`: `SELECT 1` DB probe + blob store `head_bucket` probe; registered before wildcard routes; 200 `{"status":"ok"}` / 503 `{"status":"unhealthy"}`; no auth required; `Dockerfile` `HEALTHCHECK` and `deploy.sh` health URLs both point to `/healthz`)
- [x] Container runs as a non-root user
  (`Dockerfile`: `groupadd -r musehub && useradd -r -g musehub musehub`, `USER musehub` after `pip install`)
- [x] Read-only filesystem where possible; writable mounts are explicit and minimal
  (`docker-compose.yml`: `read_only: true` on musehub service; `/tmp` as `tmpfs`; `/data` as named volume `musehub_data`)
- [x] Resource limits set (CPU + memory) on all containers
  (`docker-compose.yml` `deploy.resources.limits`: musehub 1.0 CPU / 512M, postgres 0.5 CPU / 256M, musehub-runner 0.5 CPU / 256M)
7.4 Logging & alerting
- [x] Structured JSON logs (level, timestamp, request_id, user_id, path, status, duration)
  (`musehub/logging_config.py`: `JsonFormatter` emits one JSON object per line; `request_id_var` / `user_id_var` contextvars populated by `AccessLogMiddleware` so every in-request log record automatically carries them; optional `method`, `path`, `status`, `duration_ms` on access records)
- [x] No PII or token values in logs
  (`PiiFilter` in `musehub/logging_config.py`: scrubs `Bearer <token>`, `password=`, `token=`, `secret=` patterns from fully-formatted message strings before any handler sees them; `AccessLogMiddleware` extracts only the MSign `handle` and never logs the raw `Authorization` value)
- [x] Alerts wired for: 5xx rate > 1%, p99 latency > 2 s, disk > 80%, DB connections > 90%
  (`deploy/cloudwatch-alerts.sh`: log group `/musehub/app`; metric filters for 5xxCount, RequestCount, RequestDurationMs; CloudWatch alarms with SNS → SMS + email to gabriel)
- [x] On-call rotation or at minimum a PagerDuty / SMS alert to gabriel's phone
  (`deploy/cloudwatch-alerts.sh` SNS SMS subscription; `docs/on-call-runbook.md` with incident playbooks for all four alarm types + optional PagerDuty escalation path)
- [x] Log retention: 30 days hot, 1 year cold
  (`deploy/cloudwatch-alerts.sh` `put-retention-policy --retention-in-days 30`; `docs/on-call-runbook.md` documents S3 export → Glacier after 30d, expire after 365d)
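
The log-scrubbing item above could be approximated with a standard `logging.Filter`, as in this sketch. `RedactingFilter` is a hypothetical stand-in for `PiiFilter`; the regexes and the `[REDACTED]` marker are assumptions chosen to cover the four patterns the checklist lists.

```python
import logging
import re

# Assumed patterns covering "Bearer <token>", "password=", "token=", "secret=".
_SCRUB_PATTERNS = [
    (re.compile(r"(Bearer\s+)\S+", re.IGNORECASE), r"\1[REDACTED]"),
    (re.compile(r"\b(password|token|secret)=[^\s&]+", re.IGNORECASE), r"\1=[REDACTED]"),
]


def scrub(message: str) -> str:
    """Replace credential-looking substrings in an already-formatted message."""
    for pattern, replacement in _SCRUB_PATTERNS:
        message = pattern.sub(replacement, message)
    return message


class RedactingFilter(logging.Filter):
    """Scrub the fully formatted message before any handler sees the record."""

    def filter(self, record: logging.LogRecord) -> bool:
        record.msg = scrub(record.getMessage())  # format first, then scrub
        record.args = ()  # args are already merged into msg
        return True
```

Scrubbing after formatting matters: a secret passed as a `%s` argument is only visible once the message has been rendered. Note that a filter attached to a logger does not apply to records from its child loggers; attaching it to each handler covers every record that handler emits.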
8. Security Hardening (Adversarial)
These items assume an attacker has read the full source code.
- [x] SSRF: any feature that makes outbound HTTP requests (webhooks, avatar fetch, MCP endpoints) validates the target URL against an allowlist; blocks RFC-1918 ranges
  (`musehub/security/ssrf.py`: `check_url_safe()` is sync, checks scheme + bare IP, and is used in the `WebhookCreate.url` `field_validator`; `validate_outbound_url()` is async, adds DNS resolution via `asyncio.to_thread`, and is used in `_attempt_delivery()` as defence-in-depth against DNS rebinding; blocks loopback 127.x, RFC-1918 10.x/172.16-31.x/192.168.x, link-local 169.254.x AWS metadata, fc00::/7, 100.64.x carrier NAT)
- [x] Mass assignment: Pydantic models used for all request bodies; no `**kwargs` or `dict(request.body)` passed directly to ORM constructors
  (all route handlers use typed Pydantic models; the one `**cache_payload` in `ui_blame.py` is built from server-controlled data, not user input)
- [x] Timing attacks on auth: use `hmac.compare_digest` for all secret comparisons, never `==`
  (runner token: `hmac.compare_digest` in `runner.py`; MSign: `verify_signature` Ed25519 cryptographic verification in `request_signing.py`, constant-time by design; webhook signatures: `hmac.compare_digest` in `musehub_webhook_dispatcher.py`)
- [x] Object enumeration: repo IDs and issue IDs are either UUIDs or obfuscated, because sequential integers let attackers enumerate all repos/issues across all users
  (`repo_id`, `issue_id`: `String(36)` UUID primary keys; issue `number` is per-repo sequential, scoped by repo, not a global counter)
- [x] Commit ID forgery: `muse verify` signature check is enforced server-side on receive, not just client-side
  (`REQUIRE_SIGNED_COMMITS=true` in `Settings` + enforcement gate in `wire_push`: rejects any commit with empty `signature` or `signer_key_id`; default False for backward compat; unsigned commits log at DEBUG when enforcement is off)
- [x] Denial of service via regex: all user-supplied regex patterns (search, filter) are compiled with a timeout or replaced with parameterized FTS
  (`search_by_pattern` uses the Python `in` operator, no regex; `search_by_ask` tokenizes with a fixed pre-compiled `_TOKEN_RE` pattern, no user-controlled regex; no `re.compile(user_input)` anywhere in the search path)
- [x] Tar bomb / zip bomb: archive extraction enforces max uncompressed size and max file count before extraction begins
  (N/A: MuseHub does not perform any archive extraction; `tarfile` / `zipfile` are not imported anywhere in the application code)
- [x] Polyglot files: file type validation by magic bytes, not just extension
  (`musehub/security/magic_bytes.py`: `check_magic_bytes(path, content)` validates MIDI/MP3/WebP/PNG/JPEG/ZIP/PDF by header signatures; blocks HTML and shebang content in all non-HTML files; called in `wire_push` before storing each object; returns `WirePushResponse(ok=False)` on mismatch)
- [x] Clickjacking: `X-Frame-Options: DENY` + CSP `frame-ancestors 'none'`
  (`SecurityHeadersMiddleware` in `main.py:87-109` sets `X-Frame-Options: DENY` and `Content-Security-Policy` with `frame-ancestors 'none'` on every response)
- [x] Open redirect: redirect-after-login validates target is same-origin only
  (`ui_mcp_elicitation.py`: `callback = request.url.path` (+ query) stores path-only, never the full absolute URL with scheme/host; prevents attacker from injecting `?next=https://evil.com` via a crafted Host header)
- [x] Account takeover via handle squatting: handle registration is case-insensitive normalized (e.g., `Gabriel` == `gabriel`)
  (`musehub_auth.py`: `handle = handle.strip().lower()` before `MusehubIdentity()`, so Gabriel and gabriel map to the same row; UNIQUE constraint on `handle` enforces it)
- [x] Claude / MCP prompt injection: MCP tool results that include user-controlled content are wrapped in a clear delimiter and documented as untrusted; the system prompt instructs the model to treat that content as data, not instructions
  (`mcp/dispatcher.py`: success results wrapped in `<musehub_tool_result>…</musehub_tool_result>`; `mcp/prompts.py` orientation prompt names the delimiter and lists commit messages, issue bodies, file paths, repo names, branch names as untrusted user data)
- [x] Agent impersonation: `agent_id` in commit provenance is validated against a registry of known agents; unknown agents are accepted but flagged, never trusted for elevated operations
  (`TRUSTED_AGENT_IDS` list in `Settings`; `wire_push` logs WARNING and injects `metadata["untrusted_agent"] = "true"` for any `agent_id` not matching a trusted prefix; unknown agents are accepted, never rejected)
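
As an illustration of the SSRF item at the top of this section, the synchronous scheme + literal-IP check might look like the sketch below. It is not the code in `ssrf.py`: the function names are hypothetical, restricting schemes to http/https is an assumption, and `::1/128` is added here beyond the ranges the checklist lists. Hostnames that are not IP literals pass this check and would still need the DNS-resolving validation at delivery time.

```python
import ipaddress
from urllib.parse import urlsplit

# Ranges named in the checklist, plus IPv6 loopback (an addition in this sketch).
BLOCKED_NETWORKS = [
    ipaddress.ip_network(cidr)
    for cidr in (
        "127.0.0.0/8",     # loopback
        "10.0.0.0/8",      # RFC 1918
        "172.16.0.0/12",   # RFC 1918
        "192.168.0.0/16",  # RFC 1918
        "169.254.0.0/16",  # link-local, includes the cloud metadata address
        "100.64.0.0/10",   # carrier-grade NAT
        "fc00::/7",        # IPv6 unique local
        "::1/128",         # IPv6 loopback (assumed addition)
    )
]


def is_blocked_ip(addr: str) -> bool:
    """True if the literal IP falls in a blocked range. Raises ValueError if not an IP."""
    ip = ipaddress.ip_address(addr)
    if ip.version == 6 and ip.ipv4_mapped is not None:
        ip = ip.ipv4_mapped  # unwrap ::ffff:a.b.c.d before checking
    return any(ip in network for network in BLOCKED_NETWORKS)


def url_looks_safe(url: str) -> bool:
    """Scheme + bare-IP check only; does not resolve DNS."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        return False
    try:
        return not is_blocked_ip(parts.hostname)
    except ValueError:
        return True  # a hostname, not an IP literal; needs the DNS-resolving check
```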
9. Compliance & Legal Minimums
- [x] Privacy policy exists and is linked from the footer
  (`docs/legal/privacy-policy.md`; covers pubkey-as-identity model, agent-first design, training data policy, GDPR export/delete rights, and opt-out mechanism; linked from `base.html` footer)
- [x] Terms of service exist; acceptance is implicit via MSign key registration
  (`docs/legal/terms-of-service.md`; agents cannot click accept, so the operator's act of registering a key constitutes acceptance; `tos_accepted_at` + `tos_version` recorded on `MusehubIdentity` at registration time in `musehub_auth.py`; training data policy detailed: public OSI-licensed repos only, `training_opt_out=true` respected, private repos never used)
- [x] Minimum data: accounts are pubkeys + handles; no passwords, no required email; `MusehubIdentity` fields audited; only voluntarily-provided profile data (email, bio, avatar) is stored; agent identities store model name and capabilities for discovery, not for PII purposes
- [x] GDPR / CCPA: `GET /api/me/export` returns full data dump (identity, keys, repos, commits); `DELETE /api/me` hard-deletes auth keys and soft-deletes identity + repos; `training_opt_out: bool` field on `MusehubRepo` (migration `0023_compliance_fields.py`)
- [x] DMCA takedown process documented (`docs/legal/dmca.md`; contact [email protected]; 2-day acknowledgement / 5-day action SLA; counter-notice process; repeat-infringer policy; agent-operator accountability)
- [x] OSS license audit completed (`docs/legal/license-audit.md`; all direct dependencies are MIT / Apache-2.0 / BSD-3-Clause / LGPL-3.0; psycopg2-binary LGPL-3.0 is compatible under dynamic-linking exception for server-side SaaS; quarterly review schedule established)
10. Pre-Launch Smoke Tests
These must all pass in staging before prod deploy:
- [ ] New user signup → keygen → register → push a repo → view on MuseHub
- [ ] Private repo is not accessible when unauthenticated
- [ ] `muse push` with a forged object SHA is rejected
- [ ] Rate limit kicks in after threshold on auth endpoint
- [ ] Archive download returns correct bytes for a known commit
- [ ] Symbol index is populated after push and renders on `/symbols`
- [ ] Issue create → comment → close flow works end to end
- [ ] Merge proposal open → review → merge flow works end to end
- [ ] MCP endpoint returns correct tool schema and executes a read tool correctly
- [ ] `docker compose down && docker compose up` preserves all data (volumes survive)
- [ ] Backup restore drill: drop staging DB, restore from latest backup, verify
11. Launch Gate
All sections above must be fully checked. Final sign-off:
| Domain | Owner | Signed off | Date |
|---|---|---|---|
| Auth & AuthZ | gabriel | [ ] | |
| Input validation | gabriel | [ ] | |
| Network / TLS | gabriel | [ ] | |
| Rate limiting | gabriel | [ ] | |
| Data integrity | gabriel | [ ] | |
| Performance | gabriel | [ ] | |
| Infrastructure | gabriel | [ ] | |
| Security hardening | gabriel | [ ] | |
| Smoke tests | gabriel | [ ] | |

When every row is checked: tag `v1.0.0-rc1`, deploy to staging, hold 72 h, then promote to production.