gabriel / musehub public
Closed #22 Enhancement
filed by gabriel human · 46 days ago

feat(intel): Detect Refactor page — pure Python provider, route, template, SCSS, dashboard card

Multi-Dimensional Code Intelligence Findings

Before designing this feature, `muse code` was run across seven axes.

`muse code detect-refactor` — current output shape

total_events:     6,123   (truncated at 500 commits scanned)
kinds:
  implementation  3,810   body_hash changed between consecutive commits
  signature       1,816   signature_id changed (API surface mutation)
  move              485   symbol relocated to different file
  rename             12   symbol name changed in same file
event keys:  kind, address, detail, commit_id, commit_message, committed_at

Root cause of empty data today

The worker logs `muse code detect-refactor exited 2` on every run — `_run_muse` spawns a subprocess that fails because no on-disk muse repo is mounted in the container. The `musehub_intel_refactor_events` table has always had zero rows.

`muse code hotspots` — top churners (highest blast radius for this feature)

musehub/services/musehub_wire.py::wire_push_stream       40 changes
musehub/mcp/dispatcher.py::_call_tool                    22 changes
musehub/services/musehub_wire.py::wire_push_object_pack  18 changes
musehub/api/routes/wire.py::push_stream                  16 changes
musehub/services/musehub_intel_providers.py              (hotspot — touches every push)

Existing DB table

`musehub_intel_refactor_events` exists with columns: `event_id`, `repo_id`, `kind`, `address`, `detail`, `commit_id`, `committed_at`. Missing: `commit_message` (needed for the template's commit context chip).


Pure-Python Detection Algorithm (no subprocess)

For each push (HEAD commit vs its parent):

  1. Fetch HEAD snapshot manifest  →  map{ file_path: object_id }
  2. Fetch parent snapshot manifest  →  map{ file_path: object_id }

  3. For each Python file in the union of both manifests:
     a. parse_symbols(src_head, path)    →  head_tree
     b. parse_symbols(src_parent, path)  →  parent_tree

  4. Build lookup maps:
     head_by_address    = { addr: rec }
     parent_by_address  = { addr: rec }
     head_by_content    = { content_id: addr }   ← for move detection
     parent_by_content  = { content_id: addr }

  5. For each address in parent_tree:
     - If NOT in head_tree:
         head_by_content[rec.content_id] in a different file?  → move
         same body_hash under a new name in the same file?     → rename
     - Else (address present in both trees):
         body_hash changed AND signature_id changed  → implementation + signature
         body_hash changed only                      → implementation
         signature_id changed only (body_hash same)  → signature

  6. Batch-upsert detected events into musehub_intel_refactor_events
     (idempotent via event_id = blob_id(repo_id:commit_id:address:kind))

All data flows from `get_backend(owner, slug).get(object_id)` — no subprocess, no on-disk repo checkout, no `_run_muse` call anywhere.
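
The classification in step 5 can be sketched in plain Python. This is a minimal illustration, not the provider itself: `SymbolRecord` and `detect_events` are hypothetical names, the record fields are assumed from the hash names used above, and fetching and parsing the files (steps 1 to 3) is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SymbolRecord:
    """Hypothetical stand-in for one parse_symbols() record."""
    path: str          # file that contains the symbol
    name: str          # symbol name within that file
    body_hash: str
    signature_id: str
    content_id: str

    @property
    def address(self) -> str:
        return f"{self.path}::{self.name}"


def detect_events(parent: list, head: list) -> list:
    """Return (kind, address, detail) tuples for one HEAD-vs-parent diff."""
    head_by_address = {rec.address: rec for rec in head}
    parent_addresses = {rec.address for rec in parent}
    # Only symbols that are new in HEAD can be the target of a move or rename.
    new_in_head = [rec for rec in head if rec.address not in parent_addresses]
    head_by_content = {rec.content_id: rec for rec in new_in_head}
    head_by_body = {(rec.path, rec.body_hash): rec for rec in new_in_head}

    events = []
    for old in parent:
        new = head_by_address.get(old.address)
        if new is None:
            moved = head_by_content.get(old.content_id)
            renamed = head_by_body.get((old.path, old.body_hash))
            if moved is not None and moved.path != old.path:
                events.append(("move", old.address, f"{old.path} -> {moved.path}"))
            elif renamed is not None and renamed.name != old.name:
                events.append(("rename", old.address, f"{old.name} -> {renamed.name}"))
            continue
        # Address exists on both sides: compare hashes independently,
        # so a symbol can yield both an implementation and a signature event.
        if new.body_hash != old.body_hash:
            events.append(("implementation", old.address, "body_hash changed"))
        if new.signature_id != old.signature_id:
            events.append(("signature", old.address, "signature_id changed"))
    return events
```

Restricting move and rename targets to symbols that are new in HEAD is a design choice of this sketch; it keeps an unchanged duplicate elsewhere in the tree from being reported as the destination of a deleted symbol.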


Page Wireframe

╔══════════════════════════════════════════════════════════════════════════════╗
║  ← Intel Hub                                                                ║
║                                                                             ║
║    ◈ DETECT REFACTOR                                                       ║
║    Symbol-level refactoring events detected at push time from HEAD→parent  ║
║    diff — implementation changes, signature mutations, moves, renames.     ║
║                                                                             ║
╠══════════════════════════════════════════════════════════════════════════════╣
║                                                                             ║
║  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐      ║
║  │    6,123    │  │    3,810    │  │    1,816    │  │     497     │      ║
║  │   EVENTS    │  │    IMPL     │  │     SIG     │  │ MOVE+RENAME │      ║
║  └─────────────┘  └─────────────┘  └─────────────┘  └─────────────┘      ║
║   (spectral txt)   (accent border)  (orange border)  (purple border)       ║
║                                                                             ║
║  FILTER  ┌──────────────┐ ┌───────────┐ ┌──────┐ ┌────────┐ ┌─────┐     ║
║          │implementation│ │ signature │ │ move │ │ rename │ │ all │     ║
║          └──────────────┘ └───────────┘ └──────┘ └────────┘ └─────┘     ║
║                                                                             ║
║  SORT    [ recent ▾ ]  [ address ]     SHOW  [ 20 ]  [ 50 ]  [ 100 ]     ║
║                                                                             ║
║  ┌──────────────────────────────────────────────────────────────────────┐  ║
║  │ ADDRESS                              KIND    COMMIT      WHEN        │  ║
║  ├──────────────────────────────────────────────────────────────────────┤  ║
║  │ musehub/services/musehub_wire.py::   ██impl  abc123…    2 days ago  │  ║
║  │   wire_push_stream                                                   │  ║
║  │   "feat(wire): streaming push refactor"                              │  ║
║  ├──────────────────────────────────────────────────────────────────────┤  ║
║  │ musehub/mcp/dispatcher.py::          ░░sig   def456…    3 days ago  │  ║
║  │   _call_tool                                                         │  ║
║  ├──────────────────────────────────────────────────────────────────────┤  ║
║  │ musehub/storage/backends.py →        ▒▒move  ghi789…    5 days ago  │  ║
║  │   musehub/storage/local.py::                                         │  ║
║  │   LocalBackend                                                       │  ║
║  ├──────────────────────────────────────────────────────────────────────┤  ║
║  │ tests/test_wire.py::               ✦rename  jkl012…    6 days ago  │  ║
║  │   test_fp6_counts_are_accurate                                       │  ║
║  │   → test_fp6_counts_accurate                                         │  ║
║  └──────────────────────────────────────────────────────────────────────┘  ║
║                                                                             ║
╠══════════════════════════════════════════════════════════════════════════════╣
║  ← Pagination: showing 20 of 6,123 events                                  ║
╚══════════════════════════════════════════════════════════════════════════════╝

Spectral Theme Tokens

| Element | Token | Notes |
|---|---|---|
| Stat chip values | `var(--gradient-spectral)` | background-clip text |
| impl kind chip | `var(--color-accent)` | most common, blue |
| sig kind chip | `var(--color-orange)` | API surface mutation |
| move kind chip | `var(--color-purple)` | structural relocation |
| rename kind chip | `var(--color-rose)` | symbol identity change |
| Stat chip border (events) | `var(--color-teal) 30%` | base count |
| Stat chip border (impl) | `var(--color-accent) 30%` | |
| Stat chip border (sig) | `var(--color-orange) 30%` | |
| Stat chip border (move+rename) | `var(--color-purple) 30%` | |
| Address text | `var(--font-mono)` | monospace |
| Column headers | `var(--text-muted)` | 0.65rem, uppercase |
| Commit SHA chip | `var(--bg-surface)` + mono | short hex |
| Row hover | `var(--bg-surface)` | 0.12s ease |
| Kind bar fill | per-kind color | solid, 4px radius |

DB Schema Changes (Migration 0016)

Add `commit_message` column to `musehub_intel_refactor_events`:

ALTER TABLE musehub_intel_refactor_events
  ADD COLUMN commit_message TEXT;

No data migration is needed: the table currently has zero rows, and any row written before the migration would get NULL, which the template renders as an empty string.

Also add a composite index on `(repo_id, kind)` for the filter-by-kind query that powers the filter pills.
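
The schema change can be tried out against an in-memory SQLite database. This is only a sketch: the real migration is the Alembic revision named in Phase 1, and the column types, nullability and primary key in the `CREATE TABLE` below are assumptions (the issue lists the column names only).

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Pre-0016 shape of the table; column names from "Existing DB table",
# types and constraints assumed for illustration.
conn.execute(
    """
    CREATE TABLE musehub_intel_refactor_events (
        event_id     TEXT PRIMARY KEY,
        repo_id      TEXT NOT NULL,
        kind         TEXT NOT NULL,
        address      TEXT NOT NULL,
        detail       TEXT,
        commit_id    TEXT NOT NULL,
        committed_at TEXT
    )
    """
)

# Migration 0016: nullable column plus the composite index for the kind filter.
conn.execute("ALTER TABLE musehub_intel_refactor_events ADD COLUMN commit_message TEXT")
conn.execute(
    "CREATE INDEX ix_intel_refactor_events_repo_kind "
    "ON musehub_intel_refactor_events (repo_id, kind)"
)

# Inspect the result: column names and index names after the migration.
columns = [row[1] for row in conn.execute("PRAGMA table_info(musehub_intel_refactor_events)")]
indexes = [row[1] for row in conn.execute("PRAGMA index_list(musehub_intel_refactor_events)")]
print(columns)
print(indexes)
```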


Implementation Phases (load-bearing order)

Phase 1 — Migration 0016 + ORM model update

Files: `alembic/versions/0016_refactor_commit_message.py`, `musehub/db/musehub_models.py`

  • Add `commit_message TEXT nullable` to `musehub_intel_refactor_events`
  • Add composite index `ix_intel_refactor_events_repo_kind` on `(repo_id, kind)`
  • Update `MusehubIntelRefactorEvent` ORM model with `commit_message` mapped column
  • Full NumPy-style docstring on the model

Phase 2 — Rewrite DetectRefactorProvider (pure Python, zero subprocess)

File: `musehub/services/musehub_intel_providers.py`

  • Delete `_run_muse` call and `_resolve_repo_root` dependency
  • Implement HEAD vs parent snapshot diff using `parse_symbols()` per file
  • Module-level imports: `get_backend`, `parse_symbols`, `language_of` (test patchability)
  • Detect all four kinds: implementation / signature / move / rename
  • Batch-upsert with `commit_message` populated
  • Return `[("intel.code.detect_refactor", {"count": N, "impl": I, "sig": S, "move": M, "rename": R})]`
  • Full NumPy-style class docstring

Phase 3 — TDD: 34 Tests RED

File: `tests/test_intel_detect_refactor.py`

All 34 tests written and confirmed failing before Phase 4.

Tier breakdown:

  • T01–T05 DB model (columns, nullable, cascade, index, pk)
  • T06–T13 Provider (no subprocess, impl detection, sig detection, move detection, rename detection, no-parent, empty manifest, idempotent)
  • T14–T21 Route (200, empty, 404, filter by kind, sort, top, stat chips, pagination count)
  • T22–T25 E2E HTML (kind chips, commit sha, detail text, dashboard link)
  • T26–T28 Data integrity (upsert idempotent, cross-repo, kind counts accurate)
  • T29–T31 Performance (provider under 15s for 50-file diff, route under 500ms, index exists)
  • T32–T34 Security (XSS in address, SQL injection in kind filter, invalid top param)
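
The idempotency checks (T13, T26) rest on the deterministic `event_id`. The sketch below shows the idea with stdlib pieces only: SHA-256 is an assumed stand-in for `blob_id()`, SQLite's `INSERT OR REPLACE` stands in for the real batch upsert, and `event_id` / `upsert_events` are hypothetical helper names.

```python
import hashlib
import sqlite3


def event_id(repo_id: str, commit_id: str, address: str, kind: str) -> str:
    """Deterministic ID built from repo_id:commit_id:address:kind.

    SHA-256 is an assumption; the real blob_id() may hash differently.
    """
    key = f"{repo_id}:{commit_id}:{address}:{kind}"
    return hashlib.sha256(key.encode("utf-8")).hexdigest()


def upsert_events(conn: sqlite3.Connection, repo_id: str, commit_id: str,
                  commit_message: str, events: list) -> int:
    """Write (kind, address, detail) events; re-running replaces, never duplicates."""
    rows = [
        (event_id(repo_id, commit_id, address, kind),
         repo_id, kind, address, detail, commit_id, commit_message)
        for kind, address, detail in events
    ]
    conn.executemany(
        "INSERT OR REPLACE INTO musehub_intel_refactor_events "
        "(event_id, repo_id, kind, address, detail, commit_id, commit_message) "
        "VALUES (?, ?, ?, ?, ?, ?, ?)",
        rows,
    )
    return len(rows)
```

Because `kind` is part of the key, a symbol that changes both body and signature in one commit yields two distinct rows, while a second run over the same commit rewrites the same rows.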

Phase 4 — Route Handler

File: `musehub/api/routes/musehub/ui_intel.py`

  • `intel_detect_refactor_page` at `GET /{owner}/{repo_slug}/intel/refactoring`
  • Query params: `kind` (all/implementation/signature/move/rename), `sort` (recent/address), `top` (20/50/100 as str, coerced)
  • Aggregate query: `COUNT(*)` total + per-kind breakdown (never uses `len(page)`)
  • Context: `events`, `total_events`, `impl_count`, `sig_count`, `move_rename_count`, `selected_kind`, `selected_sort`, `selected_top`, `valid_tops`, `valid_kinds`
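
A possible shape for the query-parameter handling is sketched below. The allowed values come from the list above; the function name `coerce_params` and the fall-back-to-default behaviour for invalid input are assumptions (the real route could just as well reject bad values).

```python
from typing import Optional, Tuple

VALID_KINDS = ("all", "implementation", "signature", "move", "rename")
VALID_SORTS = ("recent", "address")
VALID_TOPS = (20, 50, 100)


def coerce_params(kind: Optional[str] = None,
                  sort: Optional[str] = None,
                  top: Optional[str] = None) -> Tuple[str, str, int]:
    """Map raw query strings onto whitelisted values.

    Anything outside the whitelist falls back to the first option, so
    untrusted text never reaches the SQL layer.
    """
    selected_kind = kind if kind in VALID_KINDS else VALID_KINDS[0]
    selected_sort = sort if sort in VALID_SORTS else VALID_SORTS[0]
    try:
        selected_top = int(top)  # `top` arrives as a string
    except (TypeError, ValueError):
        selected_top = VALID_TOPS[0]
    if selected_top not in VALID_TOPS:
        selected_top = VALID_TOPS[0]
    return selected_kind, selected_sort, selected_top
```

Checking `kind` against a fixed tuple, rather than interpolating it into a query, is what makes an injection attempt in the kind filter harmless in this sketch.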

Phase 5 — Template

File: `musehub/templates/musehub/pages/intel_detect_refactor.html`

  • Extends `musehub/base.html`, uses `intel-wrap`
  • 4 stat chips: Events / Impl / Sig / Move+Rename (each with per-kind border tint)
  • Filter pills for kind + sort + top
  • Event rows: address (split at `::`) | kind chip | short commit SHA | relative time | detail line
  • Move events show `old_path → new_path` arrow
  • Rename events show `old_name → new_name` arrow
  • All numeric values use `| fmtnum`
  • All user content uses `| e` for XSS safety
  • Empty state with refactor icon
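
The row formatting rules above can be mimicked outside the template with the standard library. This is an illustration only: `html.escape` stands in for the template's `| e` filter, the `fmtnum` body is an assumed behaviour (thousands separators, as in the wireframe's `6,123`), and `split_address` / `address_cells` are hypothetical helpers.

```python
from html import escape
from typing import Tuple


def split_address(address: str) -> Tuple[str, str]:
    """Split 'path/to/file.py::symbol' into (file, symbol).

    An address without '::' yields an empty symbol part.
    """
    file_part, _, symbol = address.partition("::")
    return file_part, symbol


def fmtnum(value: int) -> str:
    """Assumed behaviour of the fmtnum filter: thousands separators."""
    return f"{value:,}"


def address_cells(address: str) -> Tuple[str, str]:
    """Escape both halves of the address before they reach the HTML."""
    file_part, symbol = split_address(address)
    return escape(file_part), escape(symbol)
```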

Phase 6 — SCSS (`.rf-*` namespace)

Files: `src/scss/components/_detect_refactor.scss`, `src/scss/pages/_detect_refactor.scss`

`components/_detect_refactor.scss` (visual):

  • `.rf-stat-card` per-variant borders (teal/accent/orange/purple)
  • `.rf-stat-val` with `var(--gradient-spectral)` background-clip text
  • `.rf-kind-chip` — 4 variants (impl=accent, sig=orange, move=purple, rename=rose)
  • `.rf-commit-sha` — mono pill, bg-surface, border-default
  • `.rf-address-file` — muted, 0.68rem
  • `.rf-address-sym` — primary, 0.82rem, mono
  • `.rf-detail` — secondary, 0.7rem, italic
  • `.rf-arrow` — muted `→` glyph between old/new for move+rename rows

`pages/_detect_refactor.scss` (layout):

  • `.rf-stats-row`: flex, gap, wrap
  • `.rf-list-hd` + `.rf-row`: `grid-template-columns: 1fr 8rem 7rem 8rem` (address | kind | commit | when)
  • Responsive: collapse commit + when at 700px

Wire into `app.scss` after `@use "components/codemap"` and `@use "pages/codemap"`.

Phase 7 — Dashboard Card + Wire-Up

Files: `musehub/api/routes/musehub/ui_intel.py`, `musehub/templates/musehub/pages/intel_dashboard.html`

  • Dashboard queries: total event count + per-kind breakdown + preview (top 5 most recent)
  • Context keys: `refactor_total`, `refactor_impl`, `refactor_sig`, `refactor_move_rename`, `refactor_preview`
  • Dashboard card: ◈ DETECT REFACTOR icon (rose), total count, kind breakdown bar, top-3 recent events
  • Links to `/intel/refactoring`
  • Deploy to staging, update issue #8
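
The dashboard queries could look roughly like the following, shown here against SQLite for brevity. The context key names are the ones listed above; the function names, the raw SQL and the `committed_at DESC` ordering for "most recent" are assumptions, since the real route presumably goes through the ORM session.

```python
import sqlite3


def dashboard_counts(conn: sqlite3.Connection, repo_id: str) -> dict:
    """Aggregate per-kind counts in the database, not from a fetched page."""
    rows = conn.execute(
        "SELECT kind, COUNT(*) FROM musehub_intel_refactor_events "
        "WHERE repo_id = ? GROUP BY kind",
        (repo_id,),
    ).fetchall()
    by_kind = dict(rows)
    return {
        "refactor_total": sum(by_kind.values()),
        "refactor_impl": by_kind.get("implementation", 0),
        "refactor_sig": by_kind.get("signature", 0),
        "refactor_move_rename": by_kind.get("move", 0) + by_kind.get("rename", 0),
    }


def dashboard_preview(conn: sqlite3.Connection, repo_id: str, limit: int = 5) -> list:
    """Most recent events for the card preview, newest first."""
    return conn.execute(
        "SELECT address, kind, commit_id FROM musehub_intel_refactor_events "
        "WHERE repo_id = ? ORDER BY committed_at DESC LIMIT ?",
        (repo_id, limit),
    ).fetchall()
```

With the counts from the `detect-refactor` output above, `refactor_move_rename` would come out as 485 + 12 = 497, matching the fourth stat chip in the wireframe.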

Docstring Standard

class DetectRefactorProvider:
    """Persist symbol-level refactoring events by diffing HEAD vs parent snapshots.

    Compares ``parse_symbols()`` output between the HEAD commit's snapshot and its
    parent's snapshot for every Python file in the manifest.  Detects four event
    kinds from symbol hash changes:

    - ``implementation`` — ``body_hash`` changed (function body rewritten)
    - ``signature``      — ``signature_id`` changed (parameter/return type mutated)
    - ``move``           — symbol disappeared from one file; same ``content_id``
                           found at a different file path in HEAD
    - ``rename``         — symbol name changed within the same file; ``body_hash``
                           matches across old and new name

    No subprocess is spawned.  All data flows from objects stored at push time
    via ``get_backend(owner, slug).get(object_id)``.  Events are batch-upserted
    into ``musehub_intel_refactor_events``; the ``event_id`` is
    ``blob_id(repo_id:commit_id:address:kind)`` so re-running the provider on the
    same commit is fully idempotent.

    Parameters
    ----------
    session : AsyncSession
    repo_id : str
    ref : str
    payload : JSONObject

    Returns
    -------
    IntelResults
        ``[("intel.code.detect_refactor", {"count": N, "impl": I, "sig": S,
        "move": M, "rename": R})]`` on success.  ``[]`` when HEAD has no parent
        (initial commit) or the snapshot manifest cannot be resolved.

    Notes
    -----
    Only the diff between HEAD and its immediate parent is processed per push.
    Events accumulate across pushes so the page shows the full project history.
    A symbol that is both renamed AND its body changed produces two events:
    one ``rename`` keyed on the old address and one ``implementation`` keyed on
    the new address.
    """

Acceptance Criteria

  • 34/34 tests GREEN
  • `/gabriel/musehub/intel/refactoring` returns 200 with real data
  • Worker log shows `intel.code.detect_refactor done` with count > 0 (no `exited 2`)
  • No `_run_muse` / `_resolve_repo_root` / `create_subprocess` in provider
  • All four kind filter pills work correctly
  • Stat chips show total counts (DB aggregates, not `len(page)`)
  • All numbers use `| fmtnum`, all user content uses `| e`
  • Dashboard card links to `/intel/refactoring`
  • Deployed to staging, issue #8 updated
Activity (1)
gabriel opened this issue 46 days ago
gabriel 46 days ago

Completed

All acceptance criteria met:

  • 34/34 tests GREEN across 7 tiers (DB, provider, route, HTML, integrity, perf, security)
  • /gabriel/musehub/intel/refactoring live with real data on staging and local
  • DetectRefactorProvider fully rewritten — pure Python HEAD-vs-parent snapshot diff, zero _run_muse / subprocess
  • Detects: implementation (body_hash), signature (signature_id), move (cross-file body match), rename (same-file body match)
  • Migration 0016: commit_message column + ix_intel_refactor_events_repo_kind composite index
  • All four kind filter pills, top 20/50/100, spectral stat chips, .rf-* SCSS namespace
  • Dashboard card wired up on intel hub
  • Deployed to staging