feat(intel): coupling GUI — file co-change heatmap
Overview
Surface muse code coupling in the Intelligence Hub as a ranked file co-change heatmap. The CLI already produces the data, and the worker already has a CouplingProvider, but it shells out through a _run_muse subprocess. This issue replaces the subprocess with a pure-SQL BFS algorithm (same pattern as EntangleProvider), adds indexes, builds the /intel/coupling list page with heat-intensity bars, wires a dashboard card, and delivers a 7-tier test suite.
CLI output shape (verified against muse code coupling --json on this repo):
{
"pairs": [
{ "file_a": "musehub/api/routes/wire.py",
"file_b": "musehub/services/musehub_wire.py",
"co_changes": 33 },
{ "file_a": "musehub/models/musehub.py",
"file_b": "musehub/services/musehub_repository.py",
"co_changes": 19 },
...
]
}
Top pairs for this repo: wire.py ↔ musehub_wire.py (33), models ↔ repository (19), models/wire ↔ musehub_wire (16). Only 20 pairs total — a tight, readable signal.
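For anyone consuming the CLI payload directly, a minimal sketch of parsing and ranking it follows. It relies only on the pairs / file_a / file_b / co_changes shape shown above; top_pairs is a hypothetical helper name, not part of muse.

```python
import json

# Sample payload in the shape shown above (two of the pairs from this repo,
# deliberately listed out of order to show the ranking step).
SAMPLE = """
{"pairs": [
  {"file_a": "musehub/models/musehub.py",
   "file_b": "musehub/services/musehub_repository.py",
   "co_changes": 19},
  {"file_a": "musehub/api/routes/wire.py",
   "file_b": "musehub/services/musehub_wire.py",
   "co_changes": 33}
]}
"""


def top_pairs(raw: str, limit: int = 3) -> list[tuple[str, str, int]]:
    """Return up to `limit` (file_a, file_b, co_changes) tuples, highest first."""
    pairs = json.loads(raw).get("pairs", [])
    ranked = sorted(pairs, key=lambda p: p["co_changes"], reverse=True)
    return [(p["file_a"], p["file_b"], p["co_changes"]) for p in ranked[:limit]]


for file_a, file_b, count in top_pairs(SAMPLE):
    print(f"{count:>3}  {file_a} <-> {file_b}")
```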
Web UI Wireframe
┌─────────────────────────────────────────────────────────────────────────┐
│ ⚡ COUPLING gabriel/musehub │
│ File pairs that co-change most frequently — structural coupling signal │
├─────────────────────────────────────────────────────────────────────────┤
│ PAIRS 20 REF sha256:cedbb6f8 BUILT 2026-05-03 │
├─────────────────────────────────────────────────────────────────────────┤
│ MIN CO-CHANGES ≥ [ 2 ] SHOW [ 50 ▾] [ Apply ] │
├─────────────────────────────────────────────────────────────────────────┤
│ │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ musehub/api/routes/wire.py ↔ musehub/services/musehub_wire.py │ │
│ │ ████████████████████████████████████████████████████ 33 │ │
│ ├─────────────────────────────────────────────────────────────────────┤ │
│ │ musehub/models/musehub.py ↔ musehub/services/musehub_repository │ │
│ │ ████████████████████████████████████████████ 19 │ │
│ ├─────────────────────────────────────────────────────────────────────┤ │
│ │ musehub/models/wire.py ↔ musehub/services/musehub_wire.py │ │
│ │ ██████████████████████████████████████ 16 │ │
│ ├─────────────────────────────────────────────────────────────────────┤ │
│ │ musehub/mcp/dispatcher.py ↔ musehub/mcp/tools/musehub.py │ │
│ │ █████████████████████████████████████ 15 │ │
│ └─────────────────────────────────────────────────────────────────────┘ │
│ │
│ Heat key: ░░ low ▒▒ medium ▓▓ high ██ critical │
└─────────────────────────────────────────────────────────────────────────┘
Theme elements used:
- --bg-surface / --bg-elevated / --bg-hover: row surfaces
- --border-default / --border-subtle: list dividers
- --color-accent → low coupling bar fill
- --color-warning → medium coupling (≥ 10 co-changes)
- --color-danger → high coupling (≥ 20 co-changes)
- --font-mono: file paths, counts
- --gradient-spectral: optional heat-key decorative strip
- intel-page-header / intel-meta-bar / intel-meta-pill: standard Intel Hub header pattern
Current state
| What | Status |
|---|---|
| musehub_intel_coupling table | ✅ exists (repo_id, file_a, file_b, co_changes, ref) |
| CouplingProvider | ⚠️ exists but uses a _run_muse subprocess; breaks in environments without a local repo |
| DB indexes | ❌ only ix_intel_coupling_repo; missing co_changes/file indexes |
| /intel/coupling route + template | ❌ not implemented |
| Dashboard card | ❌ not wired |
| Tests | ❌ none |
Phase 0 — Rewrite CouplingProvider to pure SQL
Replace the _run_muse subprocess call with a BFS commit walk over musehub_symbol_history_entries, identical in structure to EntangleProvider.
Algorithm (mirrors muse code coupling exactly):
1. Fetch all commits for repo → commit_parents dict
2. BFS from HEAD, cap at _MAX_WALK = 10,000 commits
3. Bulk-fetch history entries → (commit_id, address) pairs
4. For each commit in walk:
- Derive file = address.split("::")[0] (or bare address if no "::")
- Skip entries where file is empty or starts with special prefixes
- If len(distinct files in commit) > _MAX_FILES_PER_COMMIT (200) → skip (mass commit)
- For each pair (file_a, file_b) where file_a < file_b → pair_co_changes[(a,b)] += 1
5. Filter: co_changes >= _MIN_CO_CHANGES (2), file_a != file_b (guaranteed by sort)
6. Sort by co_changes DESC
7. Truncate to _MAX_PAIRS (200)
8. DELETE stale rows for repo, upsert fresh set
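Steps 1 and 2 can be sketched as a plain in-memory BFS. This is an illustration under assumptions, not the provider's code: commit_parents is taken to be a dict mapping a commit id to a list of parent ids, and bfs_walk is a hypothetical name.

```python
from collections import deque

_MAX_WALK = 10_000  # cap taken from the algorithm description


def bfs_walk(head: str, commit_parents: dict[str, list[str]],
             max_walk: int = _MAX_WALK) -> list[str]:
    """Return commit ids reachable from `head`, breadth-first, at most `max_walk`."""
    seen = {head}
    order: list[str] = []
    queue = deque([head])
    while queue and len(order) < max_walk:
        commit = queue.popleft()
        order.append(commit)
        for parent in commit_parents.get(commit, []):
            if parent not in seen:  # merges can reach one ancestor twice
                seen.add(parent)
                queue.append(parent)
    return order


# A small history with a merge: c4 has parents c3 and c2, both descend from c1.
parents = {"c4": ["c3", "c2"], "c3": ["c1"], "c2": ["c1"], "c1": []}
print(bfs_walk("c4", parents))              # ['c4', 'c3', 'c2', 'c1']
print(bfs_walk("c4", parents, max_walk=2))  # ['c4', 'c3']
```

The seen set matters for merge-heavy histories: without it, a shared ancestor would be visited once per path and its co-changes double-counted.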
Key differences from EntangleProvider:
- File-level, not symbol-level: file = address.split("::")[0] (bare paths are valid here)
- No import-symbol filter needed (working at file level)
- No Jaccard rate: the raw co_changes count is the signal
- _MAX_FILES_PER_COMMIT = 200 (tighter than the symbol-level 500; a commit touching 200+ files is a mass-import or scaffolding, not signal)
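The file derivation and pair counting (steps 3 to 7) can be sketched in memory as below. Assumptions are labelled: count_pairs and file_of are hypothetical names, the "special prefixes" skip is omitted because the prefixes are not listed in this issue, and ties are broken by path, which the issue does not specify.

```python
from collections import Counter, defaultdict
from itertools import combinations

_MAX_FILES_PER_COMMIT = 200
_MIN_CO_CHANGES = 2
_MAX_PAIRS = 200


def file_of(address: str) -> str:
    """'src/a.py::fn' -> 'src/a.py'; a bare path is returned unchanged."""
    return address.split("::")[0]


def count_pairs(walk, entries):
    """walk: commit ids in scope; entries: iterable of (commit_id, address)."""
    in_walk = set(walk)
    files_by_commit = defaultdict(set)
    for commit_id, address in entries:
        path = file_of(address)
        if commit_id in in_walk and path:
            files_by_commit[commit_id].add(path)

    counts = Counter()
    for files in files_by_commit.values():
        if len(files) > _MAX_FILES_PER_COMMIT:
            continue  # mass commit: skip entirely
        # combinations() over a sorted list yields each pair once, with a < b
        counts.update(combinations(sorted(files), 2))

    kept = [(a, b, n) for (a, b), n in counts.items() if n >= _MIN_CO_CHANGES]
    kept.sort(key=lambda row: (-row[2], row[0], row[1]))  # tie-break is an assumption
    return kept[:_MAX_PAIRS]


entries = [
    ("c1", "src/a.py::f"), ("c1", "src/b.py::g"), ("c1", "src/a.py::h"),
    ("c2", "src/b.py::g"), ("c2", "src/a.py::f"), ("c2", "README"),
    ("c3", "src/a.py::f"), ("c3", "README"),
]
print(count_pairs(["c3", "c2", "c1"], entries))
# [('README', 'src/a.py', 2), ('src/a.py', 'src/b.py', 2)]
```

Collecting files into a set per commit is what excludes same-file pairs: two symbols in src/a.py collapse to one file before any pair is formed.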
Docstring (load-bearing):
class CouplingProvider:
    """Persist co-changing file pairs by mining musehub_symbol_history_entries.

    Mirrors ``muse code coupling`` exactly: same BFS commit walk, same
    mass-commit exclusion, same minimum co-change threshold.

    Algorithm
    ---------
    1. BFS-walk commits from HEAD (cap _MAX_WALK).
    2. Bulk-fetch all history entries for this repo.
    3. For each commit, derive the touched file set by splitting each address
       on ``::`` and taking the left part. Bare-path entries (no ``::``) are
       treated as file paths directly. EntangleProvider filters them out, but
       at the file level they are valid.
    4. Skip commits where the distinct file count exceeds _MAX_FILES_PER_COMMIT
       (mass scaffolding / import commits produce O(N²) noise).
    5. For each qualifying commit, accumulate pair_co_changes[(a, b)] for every
       unordered file pair (a < b lexicographically).
    6. Filter: co_changes >= _MIN_CO_CHANGES, then sort DESC, truncate to
       _MAX_PAIRS.
    7. DELETE stale rows, upsert fresh set.

    Constants
    ---------
    _MAX_WALK = 10_000            cap on BFS commit depth
    _MAX_FILES_PER_COMMIT = 200   mass-commit guard
    _MAX_PAIRS = 200              stored leaderboard size
    _MIN_CO_CHANGES = 2           noise floor
    """
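The final persistence step (DELETE stale rows, then write the fresh set) might look like the following sketch. It uses the standard-library sqlite3 module purely for illustration; the real database, the column types, and the primary key are assumptions, and persist_pairs is a hypothetical name. Only the column list comes from this issue.

```python
import sqlite3


def persist_pairs(conn, repo_id, ref, pairs):
    """Replace every stored pair for `repo_id` with `pairs` in one transaction."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute(
            "DELETE FROM musehub_intel_coupling WHERE repo_id = ?", (repo_id,)
        )
        conn.executemany(
            "INSERT INTO musehub_intel_coupling "
            "(repo_id, file_a, file_b, co_changes, ref) VALUES (?, ?, ?, ?, ?)",
            [(repo_id, a, b, n, ref) for a, b, n in pairs],
        )


conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE musehub_intel_coupling (
           repo_id TEXT NOT NULL, file_a TEXT NOT NULL, file_b TEXT NOT NULL,
           co_changes INTEGER NOT NULL, ref TEXT NOT NULL,
           PRIMARY KEY (repo_id, file_a, file_b))"""
)

# First run stores two pairs; the second run for a new ref stores only one.
persist_pairs(conn, "r1", "ref-1", [("a.py", "b.py", 3), ("a.py", "c.py", 2)])
persist_pairs(conn, "r1", "ref-2", [("a.py", "b.py", 4)])

rows = conn.execute(
    "SELECT file_a, file_b, co_changes, ref FROM musehub_intel_coupling"
).fetchall()
print(rows)  # [('a.py', 'b.py', 4, 'ref-2')]
```

Running the delete and the inserts in one transaction keeps readers from ever seeing an empty leaderboard between the two statements, and it makes re-runs idempotent.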
Phase 1 — Migration 0011: add indexes
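# alembic/versions/0011_coupling_indexes.py
from alembic import op

revision = "0011"
down_revision = "0010"

def upgrade():
    op.create_index("ix_intel_coupling_repo_co", "musehub_intel_coupling",
                    ["repo_id", "co_changes"])
    op.create_index("ix_intel_coupling_repo_file_a", "musehub_intel_coupling",
                    ["repo_id", "file_a"])

def downgrade():
    # Drop in reverse order of creation.
    op.drop_index("ix_intel_coupling_repo_file_a", table_name="musehub_intel_coupling")
    op.drop_index("ix_intel_coupling_repo_co", table_name="musehub_intel_coupling")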
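The (repo_id, co_changes) index is presumably there to serve the leaderboard read: filter by repo and minimum count, order by count, limit. The sketch below reproduces that query shape against an in-memory SQLite table so it can be run anywhere; the production database, the column types, and the fetch_pairs name are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE musehub_intel_coupling "
    "(repo_id TEXT, file_a TEXT, file_b TEXT, co_changes INTEGER, ref TEXT)"
)
conn.execute(
    "CREATE INDEX ix_intel_coupling_repo_co "
    "ON musehub_intel_coupling (repo_id, co_changes)"
)
conn.executemany(
    "INSERT INTO musehub_intel_coupling VALUES (?, ?, ?, ?, ?)",
    [
        ("r1", "a.py", "b.py", 33, "ref-1"),
        ("r1", "a.py", "c.py", 19, "ref-1"),
        ("r1", "b.py", "c.py", 1, "ref-1"),
        ("r2", "x.py", "y.py", 40, "ref-9"),
    ],
)

LEADERBOARD_SQL = (
    "SELECT file_a, file_b, co_changes FROM musehub_intel_coupling "
    "WHERE repo_id = ? AND co_changes >= ? "
    "ORDER BY co_changes DESC LIMIT ?"
)


def fetch_pairs(conn, repo_id, min_co=2, top=50):
    """Leaderboard read: one repo, noise floor applied in SQL, highest first."""
    return conn.execute(LEADERBOARD_SQL, (repo_id, min_co, top)).fetchall()


print(fetch_pairs(conn, "r1"))  # [('a.py', 'b.py', 33), ('a.py', 'c.py', 19)]

# Inspect the plan to check whether the planner picks up the new index.
for step in conn.execute("EXPLAIN QUERY PLAN " + LEADERBOARD_SQL, ("r1", 2, 50)):
    print(step)
```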
Phase 2 — SCSS
Two-file split (structural/visual) following the established pattern:
src/scss/components/_coupling.scss — visual only
.cp-list border + radius surface
.cp-pair-row divider, hover tint
.cp-file-a muted — left file path (font-mono)
.cp-file-b accent-link — right file path (font-mono)
.cp-arrow muted ↔ separator
.cp-count bold mono right-aligned
.cp-bar-track bg-elevated rail
.cp-bar-fill accent base fill
&--medium warning fill (co_changes >= 10)
&--high danger fill (co_changes >= 20)
.cp-filter-label uppercase muted label
.cp-empty-state centered muted with icon
src/scss/pages/_coupling.scss — layout only
.cp-wrap padding:0
.intel-page-header margin-bottom (same rule as stable/dead)
.cp-filter-bar flex row, gap, margin-bottom
.cp-filter-group flex align-center, gap
.cp-list flex-col
.cp-pair-row grid 1fr auto / auto auto; padding 0.75rem 1rem
.cp-files grid-col 1, row 1; flex row, gap, min-width 0, overflow hidden
.cp-stats grid-col 2, row 1; flex-col, align-end
.cp-bar-wrap grid-col 1/-1, row 2; height 3px
Wire into app.scss:
@use "components/coupling";
@use "pages/coupling" as page-coupling;
Phase 3 — Route + template
Route: GET /{owner}/{repo_slug}/intel/coupling
async def intel_coupling_page(request, owner, repo_slug, db,
                              min_co: int = 2,
                              top: int = 50):
    """
    Render the file co-change coupling leaderboard.

    Reads from musehub_intel_coupling ordered by co_changes DESC.
    Applies the min_co filter in SQL. Computes bar widths server-side
    (no client-side script) by normalising against the top pair's co_changes.

    Parameters
    ----------
    min_co : int
        Minimum co-change count to include (default 2, noise floor).
    top : int
        Maximum pairs to display (choices: 25, 50, 100, 200).

    Context variables
    -----------------
    pairs          list of dicts: file_a, file_b, short_a, short_b,
                   co_changes, bar_pct, heat_modifier
    total_count    int: total stored pairs before filter
    min_co         int: current filter value
    selected_top   int: current page size
    valid_tops     list[int]: [25, 50, 100, 200]
    index_meta     IndexMeta | None
    """
Heat modifier logic:
def _cp_heat(co_changes: int) -> str:
    if co_changes >= 20:
        return "high"
    if co_changes >= 10:
        return "medium"
    return ""
Template: intel_coupling.html
{% extends "musehub/base.html" %}
breadcrumb: owner / repo / intel / coupling
<header class="intel-page-header">
{{ icon("zap", 16) }} Coupling
<p>File pairs that co-change most frequently — structural coupling signal.</p>
</header>
intel-meta-bar: pairs | ref | built
<form> min_co input + top select + Apply button </form>
<div class="cp-list">
{% for p in pairs %}
<div class="cp-pair-row">
<div class="cp-files">
<span class="cp-file-a font-mono">{{ p.short_a }}</span>
<span class="cp-arrow">↔</span>
<span class="cp-file-b font-mono">{{ p.short_b }}</span>
</div>
<span class="cp-count font-mono">{{ p.co_changes | fmtnum }}</span>
<div class="cp-bar-wrap">
<div class="cp-bar-track">
<div class="cp-bar-fill{% if p.heat_modifier %} cp-bar-fill--{{ p.heat_modifier }}{% endif %}"
style="width:{{ p.bar_pct }}%"></div>
</div>
</div>
</div>
{% endfor %}
</div>
File paths are truncated to their last two components for display:
musehub/services/musehub_wire.py → services/musehub_wire.py
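A minimal sketch of that truncation, assuming forward-slash repo-relative paths; short_path is a hypothetical name for whatever produces short_a and short_b.

```python
def short_path(path: str, keep: int = 2) -> str:
    """Keep only the last `keep` path components for display."""
    parts = [segment for segment in path.split("/") if segment]
    return "/".join(parts[-keep:])


print(short_path("musehub/services/musehub_wire.py"))  # services/musehub_wire.py
print(short_path("README"))                            # README
```

Paths with fewer than two components pass through unchanged, which covers the bare-path entries the provider accepts.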
Phase 4 — Dashboard card
Add a 6th card to .intel-cards on the dashboard (after entangle):
┌─────────────────────┐
│ ⚡ COUPLING │ View all →
├─────────────────────┤
│ 20 pairs │
│ │
│ routes/wire ↔ │
│ services/wire 33 │
│ │
│ models/musehub ↔ │
│ services/repo 19 │
│ │
│ models/wire ↔ │
│ services/wire 16 │
└─────────────────────┘
Update .intel-cards grid: repeat(5, 1fr) → repeat(6, 1fr).
New breakpoints: 1400px → 3col, 960px → 2col, 540px → 1col.
Route adds coupling_count + coupling_preview (top 3 non-test pairs) to dashboard context.
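One way the dashboard context could be derived is sketched below. The issue does not define "non-test", so the is_test_path heuristic is an assumption, as are the function names, the choice to count all stored pairs in coupling_count, and the tests/test_wire.py row in the sample data.

```python
def is_test_path(path: str) -> bool:
    """Heuristic (an assumption): under a 'tests' directory or named test_*."""
    parts = path.split("/")
    return "tests" in parts[:-1] or parts[-1].startswith("test_")


def dashboard_coupling_context(pairs, limit=3):
    """pairs: (file_a, file_b, co_changes) tuples sorted by co_changes DESC."""
    non_test = [
        p for p in pairs
        if not (is_test_path(p[0]) or is_test_path(p[1]))
    ]
    return {
        "coupling_count": len(pairs),          # assumed to include test pairs
        "coupling_preview": non_test[:limit],  # top non-test pairs only
    }


pairs = [
    ("musehub/api/routes/wire.py", "musehub/services/musehub_wire.py", 33),
    ("musehub/services/musehub_wire.py", "tests/test_wire.py", 25),  # hypothetical row
    ("musehub/models/musehub.py", "musehub/services/musehub_repository.py", 19),
]
ctx = dashboard_coupling_context(pairs)
print(ctx["coupling_count"], [p[2] for p in ctx["coupling_preview"]])  # 3 [33, 19]
```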
Phase 5 — Test suite (CP_01–CP_49)
Tier 1 — Unit (CP_01–CP_08)
CP_01 file extraction from symbol address "src/a.py::fn" → "src/a.py"
CP_02 bare path treated as file "cloudflare" → "cloudflare"
CP_03 pair key canonical a < b ("z.py", "a.py") → ("a.py", "z.py")
CP_04 same-file pair excluded "src/a.py::fn1" + "src/a.py::fn2" → no pair
CP_05 heat modifier "" for co < 10
CP_06 heat modifier "medium" for co = 10..19
CP_07 heat modifier "high" for co >= 20
CP_08 _MIN_CO_CHANGES constant == 2
Tier 2 — Integration (CP_09–CP_18)
CP_09 empty repo → no pairs
CP_10 no history entries → no pairs
CP_11 single co-change commit → co_changes=1 → below threshold, no row
CP_12 two co-change commits → co_changes=2 → one pair stored
CP_13 three files in commit → 3 pairs (A↔B, A↔C, B↔C)
CP_14 same-file symbols (two fns in same file) → no pair
CP_15 pair key stored canonical (a < b)
CP_16 ref column populated correctly
CP_17 co_changes count exact
CP_18 bar_pct = 100 for top pair
Tier 3 — E2E (CP_19–CP_25)
CP_19 three files across 5 commits → correct ranking
CP_20 top pair has bar_pct = 100, second pair proportional
CP_21 result metadata: key="intel.code.coupling", count matches stored rows
CP_22 truncated=True when over MAX_PAIRS
CP_23 min_co filter removes low-signal pairs from route response
CP_24 top=25 returns at most 25 rows
CP_25 heat_modifier "high" on pairs with co_changes >= 20
Tier 4 — Performance (CP_26–CP_32)
CP_26 10 commits × 10 files → completes < 500ms
CP_27 100 commits × 20 files → completes < 2s
CP_28 empty repo fast-path → < 50ms
CP_29 second run not > 5× slower than first
CP_30 point lookup (fetch pairs for repo) < 10ms after provider run
CP_31 200-pair leaderboard rendered in route < 200ms
CP_32 dashboard preview query < 20ms
Tier 5 — State integrity (CP_33–CP_38)
CP_33 idempotent: two runs produce identical rows
CP_34 stale rows purged on re-run (DELETE before upsert)
CP_35 incremental: new commits add new pairs on re-run
CP_36 no duplicate (file_a, file_b) rows after 3 runs
CP_37 co_changes increases when more co-change commits added
CP_38 truncated flag False when pairs ≤ MAX_PAIRS
Tier 6 — Security (CP_39–CP_44)
CP_39 SQL injection in file path stored verbatim, table survives
CP_40 XSS payload in file path stored safely
CP_41 repo A pairs never visible in repo B query
CP_42 two repos each get independent pair sets
CP_43 re-run for new ref updates ref column on all rows
CP_44 unicode in file path handled without crash
Tier 7 — Stress (CP_45–CP_49)
CP_45 MAX_PAIRS cap: 50 files × 3 commits → stored ≤ MAX_PAIRS
CP_46 mass-commit exclusion: commit with >200 files skipped
CP_47 500 commits × 5 files → completes without error
CP_48 result count matches stored rows
CP_49 BFS walk cap: commits_analysed ≤ MAX_WALK
Acceptance criteria
- CouplingProvider uses pure-SQL BFS: no _run_muse, no local repo required
- Migration 0011 adds ix_intel_coupling_repo_co + ix_intel_coupling_repo_file_a
- /intel/coupling page renders from DB, median load < 200ms
- Heat intensity bars: accent (low) / warning (medium ≥10) / danger (high ≥20)
- File paths truncated to last 2 components in display
- Dashboard 6th card wired; grid updated to 6-col
- 49 tests across 7 tiers, all green on python3 -m pytest tests/test_coupling_provider.py
- Data parity: GUI co_changes values match muse code coupling --json output exactly
- No regressions on existing intel pages
Duplicate of #15. Closing.