gabriel / musehub public
Open #8 Enhancement
filed by gabriel human · 43 days ago

Supercharge intel indexing: full muse code GUI — multiphase implementation plan


Context

MuseHub's Intel Hub currently surfaces a fraction of what muse code can produce. After a deep-dive on every command and its JSON output shape, this issue captures the full plan to:

  1. Index all repo-wide intel into structured DB tables on every push
  2. Build a GUI sub-page for every muse code command
  3. Add a per-symbol detail panel usable from any page
  4. Fix the symbols page N+1 / full-history-load performance regression

Current state

What the worker already indexes per push:

| Table | Source | What it stores |
| --- | --- | --- |
| musehub_symbol_intel | muse code hotspots/gravity/blast | Per-symbol churn, blast, gravity, last_changed |
| musehub_symbol_history_entries | push delta ops | Raw per-symbol op history |
| musehub_hash_occurrence_entries | push delta ops | Clone-detection (content_id → addresses) |
| musehub_intel_results | intel.structural | Velocity + contributor JSON blobs |

What muse code can produce but is NOT indexed:

| Command | Output shape | GUI potential |
| --- | --- | --- |
| coupling | pairs[]{file_a, file_b, co_changes} | Co-change heatmap |
| entangle | pairs[]{symbol_a, symbol_b, co_change_rate, structurally_linked} | Symbol entanglement graph |
| dead | candidates[]{address, kind, confidence, reason} | Dead-code hit list |
| blast-risk | symbols[]{address, risk, impact_score, churn_score, test_gap_score, coupling_score} | Pre-release risk dashboard |
| stable | symbols[]{address, days_stable, last_changed_commit} | Stability leaderboard |
| velocity | modules[]{module, current{added,net,modified}, acceleration} | Module growth chart |
| clones | clusters[]{tier, hash, count, members[]} | Duplicate code browser |
| type | {coverage_fraction, fully_typed, untyped} + symbols[]{address, type_score, params_with_any} | Type health dashboard |
| api-surface | symbols[]{address, kind, signature, visibility} | Public API browser |
| languages | languages[]{language, symbol_count, file_count, pct} | Language breakdown |
| codemap | modules[]{module, symbols[], imports[], exports[]} | Module topology |
| detect-refactor | events[]{kind, from_address, to_address, commit_id} | Refactoring timeline |
| breakage | issues[]{address, kind, detail} | Working-tree health gate |

Per-symbol commands (on-demand, not pre-indexed):

| Command | Trigger | GUI location |
| --- | --- | --- |
| impact "file::Sym" | Click symbol | Detail panel → Blast radius tab |
| deps "file.py" | Click file | File detail → Imports/calls graph |
| blame "file::Sym" | Click symbol | Detail panel → Blame tab |
| lineage "file::Sym" | Click symbol | Detail panel → Lineage tab |
| narrative "file::Sym" | Click symbol | Detail panel → Story tab |
| symbol-log "file::Sym" | Click symbol | Detail panel → History tab |
| age | Per symbol row | Stability badge on symbol list |
| coverage "file::Class" | Click class | Detail panel → Coverage tab |
| contract "file::Sym" | Click symbol | Detail panel → Contract tab |
| predict | Repo landing | "What changes next" sidebar widget |
| semantic-test-coverage | Symbols list | Coverage column |

Phase 1 — DB schema additions

Add normalized tables for each missing intel type. All have (repo_id, …) primary keys and CASCADE deletes.

-- Co-changing file pairs
musehub_intel_coupling(repo_id, file_a, file_b, co_changes, from_ref, to_ref)

-- Symbol entanglement pairs
musehub_intel_entangle(repo_id, symbol_a, symbol_b, co_change_rate, co_changes,
                       structurally_linked, a_in_test, b_in_test)

-- Dead-code candidates
musehub_intel_dead(repo_id, address, kind, confidence, reason, ref)

-- Composite pre-release risk
musehub_intel_blast_risk(repo_id, address, kind, risk, impact_score,
                         churn_score, test_gap_score, coupling_score, ref)

-- Long-stable symbols
musehub_intel_stable(repo_id, address, days_stable, last_changed_commit,
                     last_changed_at, from_ref, to_ref)

-- Module velocity
musehub_intel_velocity(repo_id, module, added, removed, net, modified,
                       active_commits, prior_added, prior_net, acceleration,
                       stagnant_commits, ref)

-- Clone clusters
musehub_intel_clones(repo_id, cluster_hash, tier, member_count,
                     members_json, ref)

-- Per-symbol type health
musehub_intel_type(repo_id, address, kind, return_is_any, params_total,
                   params_annotated, params_with_any, type_score, ref)

-- API surface entries
musehub_intel_api_surface(repo_id, address, kind, signature, visibility, ref)

-- Language breakdown (one row per language per push)
musehub_intel_languages(repo_id, language, symbol_count, file_count,
                        pct, ref)

-- Refactoring events
musehub_intel_refactor_events(repo_id, event_id, kind, from_address,
                              to_address, commit_id, committed_at)

Also extend musehub_symbol_intel with:

  • last_commit_id VARCHAR(128) — enables symbol list links without loading history
  • op VARCHAR(16) — latest op (add/modify/delete) — replaces full history load on list page

Phase 2 — Worker intel providers

Add one IntelProvider per new job type in musehub_intel_providers.py:

| Job type | Runs muse code command | Stores to |
| --- | --- | --- |
| intel.coupling | coupling --json | musehub_intel_coupling |
| intel.entangle | entangle --json | musehub_intel_entangle |
| intel.dead | dead --json | musehub_intel_dead |
| intel.blast_risk | blast-risk --json | musehub_intel_blast_risk |
| intel.stable | stable --json | musehub_intel_stable |
| intel.velocity | velocity --json | musehub_intel_velocity |
| intel.clones | clones --json | musehub_intel_clones |
| intel.type | type --json | musehub_intel_type |
| intel.api_surface | api-surface --json | musehub_intel_api_surface |
| intel.languages | languages --json | musehub_intel_languages |
| intel.refactor | detect-refactor --json | musehub_intel_refactor_events |

All are enqueued by job_types_for_push for code-domain repos. Each job calls muse -C <repo_root> code <command> --json, parses the output, and batch-upserts rows in chunks that stay under asyncpg's 32,767-parameter limit (the same fix already applied to history entries).
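A minimal sketch of the chunking arithmetic behind that limit, assuming one bind parameter per column per row. The helper names (rows_per_batch, chunk_rows) and the sample payload are hypothetical and are not taken from the existing history-entries fix.

```python
from typing import Iterator, Sequence

# asyncpg rejects statements with more than 32,767 bind parameters (limit cited in this issue).
MAX_QUERY_PARAMS = 32_767


def rows_per_batch(columns_per_row: int, max_params: int = MAX_QUERY_PARAMS) -> int:
    """Largest row count whose total bind parameters stay within max_params."""
    if columns_per_row <= 0:
        raise ValueError("columns_per_row must be positive")
    if columns_per_row > max_params:
        raise ValueError("a single row exceeds the parameter limit")
    return max_params // columns_per_row


def chunk_rows(rows: Sequence[dict], columns_per_row: int) -> Iterator[Sequence[dict]]:
    """Yield consecutive slices of rows, each small enough for one upsert statement."""
    size = rows_per_batch(columns_per_row)
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


# Hypothetical payload: 10,000 coupling rows with 6 columns each
# (repo_id, file_a, file_b, co_changes, from_ref, to_ref).
rows = [
    {"repo_id": "r1", "file_a": f"a{i}.py", "file_b": f"b{i}.py",
     "co_changes": i, "from_ref": "abc", "to_ref": "def"}
    for i in range(10_000)
]
batches = list(chunk_rows(rows, columns_per_row=6))
print([len(b) for b in batches])  # [5461, 4539]
```

Deriving the batch size from the column count, rather than hard-coding a row count, keeps wide tables such as musehub_intel_velocity safe without shrinking batches for narrow ones.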


Phase 3 — Intel Hub GUI sub-pages

Extend /intel hub with sub-pages, each reading from the new tables:

| URL | Command | Primary UI pattern |
| --- | --- | --- |
| /intel/hotspots | hotspots | Ranked bar list |
| /intel/gravity | gravity | Ranked list + depth distribution sparkline |
| /intel/dead | dead | Confidence-grouped list with delete affordance |
| /intel/blast-risk | blast-risk | Multi-score table (impact / churn / test-gap / coupling) |
| /intel/coupling | coupling | File-pair heat list |
| /intel/entangle | entangle | Symbol-pair list, badge if structurally unlinked |
| /intel/stable | stable | Reverse-leaderboard: longest unchanged wins |
| /intel/velocity | velocity | Module table with acceleration indicator arrows |
| /intel/clones | clones | Cluster browser grouped by tier (exact / near) |
| /intel/type | type | Donut health summary + untyped symbol list |
| /intel/api-surface | api-surface | Public API browser, filterable by kind |
| /intel/languages | languages | Language breakdown bars |
| /intel/refactor | detect-refactor | Timeline of rename / extract / inline events |

Each sub-page is a read from a single narrow DB table — sub-millisecond queries, no muse subprocess at request time.


Phase 4 — Per-symbol detail panel

Symbol clicks anywhere (symbols list, hotspots, gravity, etc.) open a sliding panel or dedicated route (/symbol/{address}) with tabs:

| Tab | Source | Notes |
| --- | --- | --- |
| Overview | musehub_symbol_intel | Churn, gravity, blast, age (all pre-computed) |
| History | musehub_symbol_history_entries | Commit-by-commit ops, paginated |
| Blast radius | muse code impact on demand | Lazy-loaded via HTMX |
| Deps | muse code deps on demand | Import/call graph, lazy |
| Lineage | muse code lineage on demand | Lazy |
| Narrative | muse code narrative on demand | AI story, cached after first load |
| Coverage | musehub_intel_type + semantic-test-coverage | Static type + test signal |
| Contract | muse code contract on demand | Lazy, expensive |

On-demand tabs call a /{owner}/{repo}/intel/symbol-panel?address=…&tab=… HTMX endpoint that shells out to muse code and returns an HTML fragment. Results cached in musehub_intel_results with a TTL so repeated loads are free.
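The TTL behaviour described above could look roughly like the sketch below. It assumes an in-memory dict as a stand-in for rows in musehub_intel_results; PanelCache, get_or_compute, and render_blast_tab are hypothetical names used only for illustration, and the real endpoint would shell out to muse code where the fake renderer is.

```python
import time
from typing import Callable, Dict, Tuple

CacheKey = Tuple[str, str, str]  # (repo_id, symbol address, tab name)


class PanelCache:
    """In-memory stand-in for TTL-cached fragments (hypothetical, not the real table)."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[CacheKey, Tuple[float, str]] = {}

    def get_or_compute(self, key: CacheKey, compute: Callable[[], str]) -> str:
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]  # still fresh: skip the expensive call
        html = compute()
        self._entries[key] = (now, html)
        return html


# Demo with a fake clock and a fake renderer that records each invocation.
fake_now = [0.0]
calls = []


def render_blast_tab() -> str:
    calls.append(1)
    return "<div>blast radius fragment</div>"


cache = PanelCache(ttl_seconds=300, clock=lambda: fake_now[0])
key = ("repo-1", "file.py::Sym", "impact")
first = cache.get_or_compute(key, render_blast_tab)
second = cache.get_or_compute(key, render_blast_tab)  # served from cache
fake_now[0] = 301.0
third = cache.get_or_compute(key, render_blast_tab)   # expired, recomputed
print(len(calls))  # 2
```

Keying on (repo, address, tab) means a new push can invalidate one repo's entries without touching the others; whether invalidation happens on push or only by TTL is left open here.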


Phase 5 — Symbols page performance fix

Current bug: symbol_list_page calls load_symbol_history which fetches every row in musehub_symbol_history_entries into Python memory — O(commits × symbols) rows just to display a paginated list.

Fix: query musehub_symbol_intel directly (one row per symbol). After Phase 1 adds last_commit_id and op columns, the page makes one indexed query, applies cursor pagination in SQL, and never touches the history table.
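A sketch of the cursor (keyset) pagination this fix relies on, using sqlite3 so it runs standalone. The table here is a simplified stand-in for musehub_symbol_intel, and symbol_page, the address-ordered cursor, and the sample rows are assumptions rather than the actual route code.

```python
import sqlite3
from typing import List, Optional, Tuple

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE symbol_intel (
        repo_id TEXT NOT NULL,
        address TEXT NOT NULL,
        op TEXT,
        last_commit_id TEXT,
        PRIMARY KEY (repo_id, address)
    )
    """
)
conn.executemany(
    "INSERT INTO symbol_intel VALUES (?, ?, ?, ?)",
    [("r1", f"pkg/mod.py::sym_{i:03d}", "modify", f"c{i}") for i in range(25)],
)


def symbol_page(repo_id: str, after: Optional[str], limit: int) -> Tuple[List[tuple], Optional[str]]:
    """Return one page of symbols plus the cursor for the next page (None at the end)."""
    rows = conn.execute(
        """
        SELECT address, op, last_commit_id
        FROM symbol_intel
        WHERE repo_id = ? AND address > ?
        ORDER BY address
        LIMIT ?
        """,
        (repo_id, after or "", limit + 1),  # fetch one extra row to detect a next page
    ).fetchall()
    has_more = len(rows) > limit
    page = rows[:limit]
    next_cursor = page[-1][0] if has_more else None
    return page, next_cursor


page1, cursor = symbol_page("r1", None, 10)
page2, cursor2 = symbol_page("r1", cursor, 10)
page3, cursor3 = symbol_page("r1", cursor2, 10)
print(len(page1), len(page2), len(page3), cursor3)  # 10 10 5 None
```

Because the WHERE clause seeks directly into the (repo_id, address) primary key, each page costs the same regardless of how deep the user has scrolled, which OFFSET-based paging does not guarantee.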


Acceptance criteria

  • Phase 1: Alembic migration with all new tables, no data loss
  • Phase 2: Worker enqueues and completes all new job types on push; intel.code job extends to populate extended symbol_intel columns (last_commit_id, op)
  • Phase 3: All 13 Intel Hub sub-pages render from DB — median page load < 200ms
  • Phase 4: Symbol detail panel opens from any symbol link; Overview and History tabs load from DB; on-demand tabs use HTMX lazy loading
  • Phase 5: Symbols list page load < 300ms for any repo size; no full history load
  • No muse code subprocess runs in the web request path except on-demand panel tabs

Notes

  • Batch size limits: asyncpg caps at 32,767 params. Already fixed for history entries. Apply same pattern to all new upsert helpers.
  • muse code clones is expensive on large repos — run with --top 50 and store only the top clusters.
  • muse code narrative and muse code contract are LLM-backed — never index; always on-demand with response caching.
  • detect-refactor needs --from <prev_head> --to <new_head> per push so only new refactoring events are appended.
Activity (15)
gabriel opened this issue 43 days ago
gabriel 43 days ago

Phase 1 — Complete ✅

Schema & migration (both repos committed and pushed to local):

  • Migration 0004_phase1_intel_tables.py — 11 new normalized intel tables + 2 extended columns on musehub_symbol_intel
  • All 5 schema fields corrected to match actual CLI output shapes:
    • dead.confidence: Float → String(16) (CLI emits "high"/"medium")
    • blast_risk: added risk_score Integer (0–100), kept risk String(16) as label
    • stable: dropped last_changed_commit/last_changed_at, added since_start Boolean
    • api_surface.signature → signature_id String(128) (CLI returns object ID, not text)
    • refactor_events: from_address/to_address → address + detail (matches CLI event shape)

CLI output normalization (muse repo, 7 commands, all tests GREEN):

  • dead: results → candidates
  • stable: stable → symbols, unchanged_for → days_stable
  • languages: files → file_count, symbols → symbol_count
  • clones: hash → cluster_hash, count → member_count
  • api_surface: results → symbols
  • blast_risk: added risk_label field (high/medium/low)
  • detect_refactor: from/to → from_ref/to_ref; class-style TypedDict

Tests: 40/40 Phase 1 schema tests passing. 858/858 muse tests passing.

Not yet deployed to staging (holding until Phase 2 is ready for an integrated deploy).


Next: Phase 2 — Worker intel providers (11 new IntelProvider subclasses, TDD).

gabriel 43 days ago

Phase 1 + Phase 2 complete ✅

Phase 1 — Schema (migration 0004)

11 new normalized intel tables live on staging:

| Table | Contents |
| --- | --- |
| musehub_intel_coupling | file co-change pairs |
| musehub_intel_entangle | symbol entanglement pairs |
| musehub_intel_dead | dead-code candidates |
| musehub_intel_blast_risk | composite pre-release risk per symbol |
| musehub_intel_stable | long-stable symbols |
| musehub_intel_velocity | module growth velocity |
| musehub_intel_clones | duplicate code clusters |
| musehub_intel_type | per-symbol type health |
| musehub_intel_api_surface | public API surface entries |
| musehub_intel_languages | language breakdown per push |
| musehub_intel_refactor_events | detected refactoring events |

musehub_symbol_intel extended with last_commit_id and op columns.

Phase 2 — Providers (11 new IntelProvider subclasses)

Each provider runs on push via the existing intel dispatch pipeline:

  • CouplingProvider → muse code coupling
  • EntangleProvider → muse code entangle
  • DeadProvider → muse code dead --high-confidence-only
  • BlastRiskProvider → muse code blast-risk
  • StableProvider → muse code stable
  • VelocityProvider → muse code velocity
  • ClonesProvider → muse code clones
  • TypeProvider → muse code type
  • ApiSurfaceProvider → muse code api-surface
  • LanguagesProvider → muse code languages
  • DetectRefactorProvider → muse code detect-refactor

All providers use pg_insert().on_conflict_do_update() upsert semantics — safe to re-run on every push.
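To illustrate why upsert semantics make a re-run safe, here is a sketch that swaps in sqlite3's INSERT ... ON CONFLICT DO UPDATE for the SQLAlchemy/PostgreSQL call so it runs without a database server. The table is a trimmed stand-in for musehub_intel_coupling, and the composite key (repo_id, file_a, file_b) and the store_pairs helper are assumptions, not the shipped schema or provider code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE intel_coupling (
        repo_id TEXT NOT NULL,
        file_a TEXT NOT NULL,
        file_b TEXT NOT NULL,
        co_changes INTEGER NOT NULL,
        PRIMARY KEY (repo_id, file_a, file_b)  -- assumed conflict target
    )
    """
)

UPSERT = """
INSERT INTO intel_coupling (repo_id, file_a, file_b, co_changes)
VALUES (?, ?, ?, ?)
ON CONFLICT (repo_id, file_a, file_b)
DO UPDATE SET co_changes = excluded.co_changes
"""


def store_pairs(repo_id, pairs):
    """Insert new pairs and overwrite co_changes for pairs already stored."""
    with conn:  # commit on success, roll back on error
        conn.executemany(UPSERT, [(repo_id, a, b, n) for a, b, n in pairs])


store_pairs("r1", [("a.py", "b.py", 3), ("a.py", "c.py", 1)])
store_pairs("r1", [("a.py", "b.py", 5), ("a.py", "c.py", 1)])  # second push, same pairs
rows = conn.execute(
    "SELECT file_a, file_b, co_changes FROM intel_coupling ORDER BY file_b"
).fetchall()
print(rows)  # [('a.py', 'b.py', 5), ('a.py', 'c.py', 1)]
```

The second call updates the existing rows in place instead of failing on the primary key or duplicating pairs, which is the property that lets every push re-run the provider.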

Test coverage

  • test_phase1_intel_schema.py — 40/40 passing
  • test_phase2_intel_providers.py — 40/40 passing (80 total across both phases)

Deployed

Image 63b502a6-20260502185102 live on staging. Migration 0004 applied.

gabriel 42 days ago

Phase 6 — Intel landing page revamp (deferred)

The /intel hub page currently shows the old layout. Phase 6 would replace it with a redesigned landing that surfaces all the new intel types added in Phases 1–5.

Proposed work:

  • Replace the current intel hub page with a proper dashboard layout
  • Surface gravity top-N preview (top 5 symbols by gravity_pct with links to detail)
  • Surface structural velocity sparkline and contributor count
  • Surface coupling/entangle highlights if data is present
  • Edge-to-edge layout consistent with the gravity pages
  • Navigation tiles to all intel sub-pages (gravity, hotspots, etc.)

This was scoped but not implemented; it will be picked up separately when the full intel hub redesign is prioritized.

gabriel 42 days ago

/intel/blast-risk is now live (issue #11 closed). The blast-risk page is backed by BlastRiskProvider — a pure SQL provider that derives risk scores from musehub_symbol_intel using the formula: impact×40 + churn×25 + test_gap×20 + coupling×15. List page, per-symbol detail page, explosion icon, and clickable rows all shipped.
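The weighted formula can be sketched as follows, assuming each of the four signals is already normalized to the range 0 to 1 so that the result lands on the 0–100 risk_score scale mentioned in the Phase 1 notes. The SymbolSignals type, the normalization, and the rounding are assumptions; the shipped provider computes this in SQL.

```python
from dataclasses import dataclass

# Weights quoted in this comment; they sum to 100.
WEIGHTS = {"impact": 40, "churn": 25, "test_gap": 20, "coupling": 15}


@dataclass(frozen=True)
class SymbolSignals:
    """Hypothetical container for the four normalized signals of one symbol."""
    impact: float
    churn: float
    test_gap: float
    coupling: float


def risk_score(s: SymbolSignals) -> int:
    """Weighted sum of the four signals, rounded to an integer in 0..100."""
    for name in WEIGHTS:
        value = getattr(s, name)
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")
    total = sum(WEIGHTS[name] * getattr(s, name) for name in WEIGHTS)
    return round(total)


example = SymbolSignals(impact=0.5, churn=0.2, test_gap=1.0, coupling=0.0)
print(risk_score(example))  # 45  (20 + 5 + 20 + 0)
```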

gabriel 42 days ago

Issue #12 (stable symbols leaderboard) is now complete and closed. The stable symbols page is live at /intel/stable as part of the Intelligence Hub.

gabriel 42 days ago

Issue #13 tracks the full entangle GUI implementation — EntangleProvider, /intel/entangle pages, dashboard card, migration, and eight tiers of tests.

gabriel 42 days ago

Cross-ref: issue #13 (entangle GUI) closed.

muse code entangle is now fully surfaced in the Intelligence Hub:

  • /intel/entangle — ranked co-change pair leaderboard with rate/co-change filters
  • /intel/entangle/symbol?address=... — per-symbol focus view
  • Dashboard card on /intel showing top 3 pairs

Data parity verified against CLI output. Algorithm: Jaccard-min rate, import-symbol and mass-commit exclusion, BFS commit walk — all matching muse code entangle exactly.

gabriel 42 days ago

Issue #15 created: feat(intel): coupling GUI — file co-change heatmap

Covers: pure-SQL CouplingProvider rewrite, migration 0011 (indexes), /intel/coupling list page with heat intensity bars, dashboard 6th card, 49-case test suite (CP_01–CP_49, 7 tiers).

gabriel 42 days ago

Issue #15 (Coupling GUI) is complete. All phases (0–5) shipped:

  • Phase 0: CouplingProvider (pure-SQL BFS, no subprocess dependency)
  • Phase 1: migration 0011 + model indexes
  • Phase 2: SCSS components/_coupling.scss + pages/_coupling.scss
  • Phase 3: intel_coupling.html subpage + intel_dashboard.html card (6th panel)
  • Phase 4: ui_intel.py route + dashboard queries
  • Phase 5: 53-case test suite (7 tiers, all green)

Coupling pairs with co-change heatmap are live on local. Awaiting staging deploy.

gabriel 42 days ago

Intel Dashboard — session update (2026-05-03)

Shipped and deployed to staging (8eef4aef):

Issue #15, Coupling GUI (Phases 0–5):

  • CouplingProvider: pure-SQL BFS replacing subprocess dependency
  • Migration 0011: coupling indexes
  • SCSS: components/_coupling.scss + pages/_coupling.scss
  • Template: intel_coupling.html subpage + dashboard card
  • Route: intel_coupling_page + dashboard queries
  • 53-case test suite (7 tiers, all green)

Dashboard UX fixes:

  • Layout: 2-card top row (health + alerts), 4-card panel grid
  • Dead code card: now queries musehub_intel_dead directly (was always showing 0)
  • Coupling card: fixed worker container (old subprocess provider → new pure-SQL)
  • Gravity card: new panel showing top symbols by gravity_pct with bar chart
  • Overflow fix: min-width:0 on .intel-card stops grid blowout
  • Removed 'Intelligence Hub' header (redundant with breadcrumb)
  • Colored icons on every card: flame/hotspots, ban/dead, explosion/blast, snowflake/stable, zap/entangle, activity/coupling, target/gravity

Velocity sparkline fix:

  • compute_intel was reading 'timestamp'/'ts' but history entries use 'committed_at' — weeks were all zero; now populated with real data
gabriel 42 days ago

Issue #16 created: feat(intel): Velocity subpage — module growth rates, acceleration, dual-window comparison. Full 5-phase plan with ASCII wireframes, Spectral theme tokens, pure-SQL VelocityProvider rewrite, migration 0012, SCSS, route/template, and 50-case TDD suite (7 tiers). Assigned to gabriel.

gabriel 42 days ago

Issue #17 — Clone Browser: closed ✅

Clone Browser (feat(intel): Clone Browser — visual GUI for muse code clones) is fully shipped and closed.

All phases complete:

  • ClonesProvider (worker, DB-backed — no subprocess in request path) ✅
  • List page (/intel/clones) with tier/top filters ✅
  • Detail page (/intel/clones/{hash}) ✅
  • Dashboard card ✅
  • SCSS (layout + visual) ✅
  • Full 7-tier test suite (TDD, unit, integration, E2E, stress, state integrity, performance + security) ✅
  • Provider fidelity fixes committed alongside (days_stable calendar days, Jaccard co_change_rate, hotspots off legacy snapshot path) ✅

Next up: Type Health dashboard (new issue created)

The Type Health sub-page (/intel/type) is being tracked in a dedicated issue on this repo. It follows the same DB-backed, no-subprocess-in-request-path architecture as all prior intel pages.

gabriel 42 days ago

Phase 3 sub-pages status — 2026-05-03

11 of 13 Intel Hub sub-pages are live on staging.

| URL | Status |
| --- | --- |
| /intel/hotspots | ✅ live |
| /intel/gravity | ✅ live |
| /intel/dead | ✅ live |
| /intel/blast-risk | ✅ live |
| /intel/coupling | ✅ live |
| /intel/entangle | ✅ live |
| /intel/stable | ✅ live |
| /intel/velocity | ✅ live |
| /intel/clones | ✅ live |
| /intel/type | ✅ live (ticket #18) |
| /intel/api-surface | ✅ live (ticket #19); 8,372 symbols indexed on musehub |
| /intel/languages | ❌ not started |
| /intel/refactor | ❌ not started |

Phases 1 & 2 are complete for all 11 live sub-pages. All providers use pure-Python / DB reads — no muse code subprocess in the web request path.

Phases 4 & 5 (per-symbol detail panel, symbols page performance fix) not yet started.

gabriel 42 days ago

Status update — 2026-05-03

Completed since last update

Languages intel (issue #20) — fully shipped to staging:

  • LanguagesProvider: pure language_of() + parse_symbols() on stored snapshot objects — zero subprocesses, zero on-disk repo required
  • Migration 0014: kinds_json JSONB column on musehub_intel_languages
  • Route /{owner}/{slug}/intel/languages: sort by pct/files/symbols, top 20/50/100
  • Template: stat chips (Languages / Total Files / Total Symbols), spectral bars per language, per-kind chips
  • Dashboard card: top-5 languages by file count with spectral bars
  • 30/30 tests GREEN across 7 tiers
  • Live: https://staging.musehub.ai/gabriel/musehub/intel/languages

API surface intel (issue #19) — fully shipped to staging:

  • ApiSurfaceProvider: same no-subprocess pattern, stored objects only
  • Route, template, SCSS, dashboard card with kind breakdown bars
  • Live: https://staging.musehub.ai/gabriel/musehub/intel/api-surface

Up next

Code map intel — implementation plan being authored now as a new issue. Will follow the same no-subprocess architecture: extract at push time via stored snapshot objects, store in Postgres, serve from SQL.

gabriel 42 days ago

Code Map intel deployed to staging

Commit sha256:82ee0f34b3c3 — image 82ee0f34-20260503144856 — deployed to staging blue slot (port 1337).

What shipped

Phase 1 — DB Migration 0015: created musehub_intel_codemap_modules (one row per repo_id + file_path; fan_in, fan_out, symbol_count, language, ref) and musehub_intel_codemap_meta (aggregate row: total_modules, total_edges, cycle_count, cycles_json).

Phase 2 — CodemapProvider: pure Python, zero subprocess. Walks snapshot manifest via get_backend(owner, slug).get(object_id), calls parse_symbols() per file to extract import records, resolves qualified_name = import::<dotted.module>::<sym> to tracked file paths, computes fan_in/fan_out, runs Tarjan's SCC for cycle detection, batch-upserts in 1,000-row chunks. Registered as intel.code.codemap in _PROVIDER_REGISTRY and job_types_for_push.
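The fan-in/fan-out and cycle-detection steps can be sketched on a toy import graph as below. This is an illustrative recursive Tarjan implementation under the assumption that imports have already been resolved to tracked file paths; fan_counts, import_cycles, and the sample graph are hypothetical and do not reproduce CodemapProvider.

```python
from typing import Dict, List, Set


def fan_counts(graph: Dict[str, Set[str]]) -> Dict[str, Dict[str, int]]:
    """fan_out = files this file imports; fan_in = files importing this file."""
    counts = {node: {"fan_in": 0, "fan_out": len(deps)} for node, deps in graph.items()}
    for node, deps in graph.items():
        for dep in deps:
            counts.setdefault(dep, {"fan_in": 0, "fan_out": 0})["fan_in"] += 1
    return counts


def import_cycles(graph: Dict[str, Set[str]]) -> List[List[str]]:
    """Tarjan's SCC; returns components with more than one node or a self-import."""
    index: Dict[str, int] = {}
    low: Dict[str, int] = {}
    stack: List[str] = []
    on_stack: Set[str] = set()
    cycles: List[List[str]] = []
    counter = 0

    def visit(node: str) -> None:
        nonlocal counter
        index[node] = low[node] = counter
        counter += 1
        stack.append(node)
        on_stack.add(node)
        for dep in sorted(graph.get(node, ())):
            if dep not in index:
                visit(dep)
                low[node] = min(low[node], low[dep])
            elif dep in on_stack:
                low[node] = min(low[node], index[dep])
        if low[node] == index[node]:  # node is the root of a component
            component = []
            while True:
                member = stack.pop()
                on_stack.discard(member)
                component.append(member)
                if member == node:
                    break
            if len(component) > 1 or node in graph.get(node, ()):
                cycles.append(sorted(component))

    for node in sorted(graph):
        if node not in index:
            visit(node)
    return cycles


imports = {
    "app/api.py": {"app/models.py", "app/util.py"},
    "app/models.py": {"app/db.py"},
    "app/db.py": {"app/models.py"},  # models <-> db form a cycle
    "app/util.py": set(),
}
print(import_cycles(imports))               # [['app/db.py', 'app/models.py']]
print(fan_counts(imports)["app/models.py"])  # {'fan_in': 2, 'fan_out': 1}
```

The recursive form is the easiest to read but is bounded by Python's recursion limit; a very deep import chain would call for an iterative variant.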

Phase 3–7 — TDD, route, template, SCSS, dashboard: 32/32 tests GREEN. Page live at /gabriel/musehub/intel/codemap with stat chips (Modules / Edges / Cycles), fan-in spectral bars, sort + top filter bar, cycle panel (✓ green for musehub — 0 cycles confirmed), and a dashboard card on the Intel Hub.

Acceptance criteria check

  • 32/32 tests GREEN
  • /gabriel/musehub/intel/codemap returns 200 with real data after next push
  • Modules stat chip uses DB aggregate (not page-length)
  • All numbers use | fmtnum filter
  • No _run_muse / subprocess in CodemapProvider
  • Cycles panel green ✓ (0 cycles in musehub)
  • Fan-in bars use var(--gradient-spectral)
  • Dashboard card links to /intel/codemap
  • Deployed to staging