Comprehensive merge matrix: strategy × history × granularity parity between muse merge and proposal merge
Background
Muse tracks merges at three granularities git cannot: directory, file, and symbol. It has Harmony (conflict-resolution memory) instead of rerere. In theory this gives Muse a dramatically smoother dev UX than git. In practice we are introducing phantom conflicts and regressions during merges — which means the merge layer is not fully dialed in yet.
The proximate trigger was issue #85: `muse hub proposal merge --strategy overlay` reporting phantom conflicts on files with identical object IDs on both sides. The deeper problem is that:
- `muse merge` (local) and `muse hub proposal merge` (server) expose different strategy and history options — the server has capabilities the local command lacks.
- The full strategy × history × granularity matrix has never been systematically tested.
- Harmony (conflict-resolution memory) interacts differently with each combination and those interactions are undocumented and untested.
This ticket makes Muse's merge layer genuinely class-apart from git.
Goal
- Parity: every strategy and history mode available in `muse hub proposal merge` must also be available in `muse merge` locally. A proposal merge is a server-side automated `muse merge` — they must be identical in behavior.
- Correctness: every cell in the strategy × history × granularity matrix must have a passing integration test.
- Harmony: every merge path must correctly teach Harmony what it learned, or explicitly document why it cannot (e.g. squash destroys per-commit provenance).
- No phantom conflicts: convergent edits and untouched files must never produce conflicts under any strategy or history mode.
Phase 1 — Canonical strategy vocabulary
Conceptual model
Every merge strategy answers exactly two questions:
1. What unit do you diff at? (diff_unit)
three_way: combine(delta(base→ours), delta(base→theirs)) # uses the common ancestor
snapshot: compare(ours_state, theirs_state) # no base, final states only
replay: apply(delta(base→ours), onto=theirs_state) # one side's delta onto the other's state
apply(delta(base→theirs), onto=ours_state) # (direction is a flag)
2. When two units conflict, what do you do? (resolution_policy)
escalate: surface the conflict — human or Harmony resolves
prefer_ours: auto-resolve by keeping ours
prefer_theirs: auto-resolve by keeping theirs
These two axes produce every strategy. Named strategies are aliases for specific combinations — not independent code paths.
Internal architecture
class MergeEngine:
diff_unit: Literal["three_way", "snapshot", "replay_ours", "replay_theirs"]
resolution: Literal["escalate", "prefer_ours", "prefer_theirs"]
Strategy table
| Strategy | diff_unit | resolution | Notes |
|---|---|---|---|
| `recursive` | three_way | escalate | Default for `muse merge` |
| `overlay` | snapshot | prefer_theirs | Default for proposals |
| `snapshot` | snapshot | escalate | Snapshot-level with conflict surfacing |
| `replay` | replay_ours | escalate | Ours' delta applied onto theirs' state |
Convenience aliases (decompose internally to the above):
| Alias | Expands to |
|---|---|
| `--strategy ours` | `--strategy recursive --on-conflict ours` |
| `--strategy theirs` | `--strategy recursive --on-conflict theirs` |
CLI design — two flags, both optional
`--strategy` answers "how do I compare the two sides?" Defaults to `recursive`.
`--on-conflict` answers "what do I do when they disagree?" Defaults to `escalate`.
Both flags are optional, both default to the right thing, and they compose freely:
# Normal merge — no flags needed
muse merge feat/x
# Auto-resolve all conflicts keeping ours
muse merge feat/x --on-conflict ours
# Overlay, surface conflicts
muse merge feat/x --strategy overlay
# Overlay, auto-resolve keeping theirs
muse merge feat/x --strategy overlay --on-conflict theirs
# Convenience alias (muscle-memory friendly)
muse merge feat/x --strategy ours
Why three_way + domain plugin = automatic 3-level granularity
The three levels of conflict granularity (directory, file, symbol) are not separate strategies. They fall out naturally from `three_way` when the domain plugin knows how to decompose a file into diffable units:
- Symbol-level: domain plugin decomposes file → symbols; delta computed per symbol. Conflict only where the same symbol diverged. Untouched symbols auto-merge.
- File-level: fallback when no domain plugin; delta computed on whole file bytes.
- Directory-level: emergent — a directory conflict is just "all files under this prefix conflicted."
A file that looks like a conflict at the file level is often zero conflicts at the symbol level. This is Muse's core advantage over git. It is delivered by `three_way` + an active domain plugin — not by a separate strategy.
Deliverables
- [✅] `muse merge` gains `--on-conflict escalate|ours|theirs` flag.
- [✅] `muse merge` gains `--history merge|squash|rebase` flag (today it has no `--history` flag).
- [✅] `muse merge` `--strategy` choices: `recursive`, `overlay`, `snapshot`, `replay`, `ours` (alias), `theirs` (alias).
- [✅] One `MergeEngine` class shared by `muse merge` (local) and proposal merge (server).
- [✅] JSON output shape identical for both paths.
Phase 2 — Conflict granularity specification
Muse detects conflicts at three levels. Each must be exercised by the test matrix.
| Level | What it covers | Example |
|---|---|---|
| Directory | A directory deleted on one side, files added inside it on the other | `src/` deleted vs `src/new_module.py` added |
| File | Same file modified on both sides, object IDs differ | `config.py` modified differently on A and B |
| Symbol | Same symbol modified on both sides within a file | `config.py::MAX_CONNECTIONS` changed on both |
Convergent-edit sub-cases (must never produce a conflict)
| Sub-case | Description |
|---|---|
| Untouched file | File unchanged from base on both sides — object IDs identical |
| Convergent edit | Both branches independently arrive at the same content (same object ID) |
| Convergent symbol | Both branches modify the same symbol to the same `new_content_id` |
| Both deleted | File deleted on both sides — clean, no conflict |
Divergent-edit sub-cases (must produce a conflict)
| Sub-case | Description |
|---|---|
| File divergence | Same file, different content, no symbol info |
| Symbol divergence | Same symbol address, different `new_content_id` |
| Add/add collision | Both sides add a file at the same path with different content |
| Delete/modify | One side deletes a file, the other modifies it |
Deliverables
One new test file: `tests/test_phase2_conflict_granularity.py`.
Convergent sub-cases — all must produce `conflicts == []` and a clean status:
- [✅] CE_01 — Untouched file: both branches leave a file's object ID unchanged from base → no conflict detected.
- [✅] CE_02 — Convergent edit: both branches independently write the same bytes to a file → no conflict, file present in merged snapshot.
- [✅] CE_03 — Convergent symbol: both branches update the same Python symbol to identical source → no conflict detected (requires code plugin).
- [✅] CE_04 — Both deleted: both branches delete the same file from base → no conflict, file absent from merged snapshot.
Divergent sub-cases — all must produce the conflicting path in `conflicts`:
- [✅] DE_01 — File divergence: same path modified to different bytes on each side → path appears in `conflicts`.
- [✅] DE_02 — Symbol divergence: same symbol modified to different source on each side → conflict detected (requires code plugin).
- [✅] DE_03 — Add/add collision: both sides add the same path with different content → path appears in `conflicts`.
- [✅] DE_04 — Delete/modify: one side removes a file, the other modifies it → path appears in `conflicts`.
Directory-level:
- [✅] DIR_01 — Directory conflict: all files under `src/` deleted on one side; a new file added under `src/` on the other → conflict detected for the new file path.
[✅] All tests green with no regressions to PHANTOM_01–05 or Phase 1 tests.
Phase 3 — Full test matrix
Every cell is one integration test. Tests run identically whether executed via `muse merge` locally or simulated as a proposal merge.
Strategy resolution cheat sheet
| Strategy | diff_unit | resolution | Divergent conflict behaviour |
|---|---|---|---|
| `recursive` | three_way | escalate | surfaces conflict |
| `overlay` | snapshot | prefer_theirs | auto-resolves — theirs always wins, no conflict |
| `snapshot` | snapshot | escalate | surfaces conflict |
| `replay` | replay_ours | escalate | surfaces conflict |
Minimum required test assertions
Each test must assert:
- Correct `conflicts` list (empty for convergent/overlay; populated for escalating divergent)
- Correct merged snapshot content (non-conflicting changes from both branches present)
- Correct commit graph shape (one parent for squash, two for merge, linear for rebase)
Deliverables
One new test file: `tests/test_phase3_strategy_matrix.py`.
Group 1 — Convergent cases under every strategy (all must produce `conflicts == []`):
- [✅] SM_01 — `recursive` + untouched file → no conflict, untouched file in merged snapshot.
- [✅] SM_02 — `overlay` + untouched file → no conflict.
- [✅] SM_03 — `snapshot` + untouched file → no conflict.
- [✅] SM_04 — `replay` + untouched file → no conflict.
- [✅] SM_05 — `recursive` + convergent edit → no conflict, convergent content in merged snapshot.
- [✅] SM_06 — `overlay` + convergent edit → no conflict, convergent content in merged snapshot.
- [✅] SM_07 — `snapshot` + convergent edit → no conflict, convergent content in merged snapshot.
- [✅] SM_08 — `replay` + convergent edit → no conflict, convergent content in merged snapshot.
Group 2 — Divergent cases: conflict surfacing vs auto-resolution:
- [✅] SM_09 — `recursive` + file divergence → `conflicts` non-empty.
- [✅] SM_10 — `overlay` + file divergence → `conflicts == []`, theirs wins in merged snapshot.
- [✅] SM_11 — `snapshot` + file divergence → `conflicts` non-empty.
- [✅] SM_12 — `replay` + file divergence → `conflicts` non-empty.
- [✅] SM_13 — `recursive` + add/add collision → `conflicts` non-empty.
- [✅] SM_14 — `overlay` + add/add collision → `conflicts == []`, theirs content wins.
- [✅] SM_15 — `snapshot` + add/add collision → `conflicts` non-empty.
- [✅] SM_16 — `recursive` + delete/modify → `conflicts` non-empty.
- [✅] SM_17 — `overlay` + delete/modify → `conflicts == []`, modified version survives.
- [✅] SM_18 — `snapshot` + delete/modify → `conflicts` non-empty.
Group 3 — History mode produces correct commit graph shape:
- [✅] SM_19 — `--history merge` after a clean merge → commit has two parents (`parent2_commit_id` set).
- [✅] SM_20 — `--history squash` after a clean merge → commit has one parent (`parent2_commit_id` is None).
- [✅] SM_21 — `--history rebase` after a clean merge → commits are linear; no merge commit created.
Group 4 — PHANTOM regressions confirmed under all 4 strategies:
- [✅] SM_22 — PHANTOM_01 (untouched file) passes under `recursive`, `overlay`, `snapshot`, `replay`.
- [✅] SM_23 — PHANTOM_02 (convergent edit) passes under all 4 strategies; merged snapshot contains the convergent content.
- [✅] SM_24 — PHANTOM_05 (clean merge snapshot has non-conflicting changes from both branches) under all 4 strategies.
All 24 tests green with no regressions to Phase 1, Phase 2, or PHANTOM_01–05.
Named regression tests (carry forward from #85)
- PHANTOM_01: untouched file, strategy=overlay → no conflict
- PHANTOM_02: convergent edit, strategy=overlay → no conflict, correct snapshot
- PHANTOM_03: untouched file, strategy=snapshot → no conflict
- PHANTOM_04: real divergence, strategy=snapshot → conflict correctly detected
- PHANTOM_05: clean merge → snapshot has changes from both branches
Phase 4 — Harmony integration per history mode
Harmony learns conflict resolutions so future identical conflicts auto-resolve. Each history mode affects what Harmony can learn.
| History mode | Harmony behavior | Issue |
|---|---|---|
| `merge` | Full learning: per-file, per-symbol resolutions recorded against ours/theirs commit IDs | None — ideal case |
| `squash` | Partial learning: the squash commit has no `parent2_commit_id` — Harmony cannot key resolutions to the incoming branch's commit ID. It can still record file-level outcome blobs. | Must be explicitly handled: record resolutions against the squash snapshot, not a commit ID |
| `rebase` | No learning possible by default: rebased commits have new IDs; the original conflict context is destroyed. Harmony must be called during the rebase loop, once per replayed commit, not at the final result. | Requires Harmony to be called inside the rebase engine, not just at commit time |
Deliverables
- [✅] `muse commit` Harmony recording path handles the squash case correctly (no `parent2_commit_id` present — fall back to snapshot-keyed recording). (HA_04)
- `muse rebase` calls Harmony inside the replay loop. Deferred — `--history rebase` runs squash-equivalent in Phase 3; per-commit Harmony during replay not yet implemented.
- [✅] Integration test: after a `--history squash` merge that had a conflict, re-running the same merge with the same conflict auto-resolves via Harmony. (HA_02)
- [✅] Integration test: after a `--history rebase` merge with a conflict, the same conflict on a subsequent rebase auto-resolves. (HA_03 — squash-equivalent path; passes)
Phase 5 — Phantom conflict guard (universal)
The fix from #85 (`ops_commute` convergent-edit check) must hold under every strategy. Add a universal pre-conflict guard:
# Before recording any conflict for a path:
if ours_object_id == theirs_object_id:
continue # identical content — no conflict possible under any strategy
This guard must be applied:
- In `CodePlugin.merge()` (file-level)
- In `CodePlugin.merge_ops()` Step 1 (symbol-level, shared patch files)
- In the server-side proposal merge path
- Under ALL strategies — `ours_object_id == theirs_object_id` is a physical fact, not a strategy decision
Deliverables
- [✅] Guard present and tested in all four locations above.
tests/test_phase5_phantom_guard.py— 15 tests (PG_01–08). - [✅] PHANTOM_01–05 pass under every strategy in the matrix. (SM_22 covers PHANTOM_01/03 under all 4; SM_23 covers PHANTOM_02; SM_24 covers PHANTOM_05; SM_09/11/12 cover PHANTOM_04 under escalating strategies; SM_10 confirms overlay correctly auto-resolves — not a phantom. All 5 original tests green.)
Phase 6 — muse merge and proposal merge use one code path
Today the server-side proposal merge logic lives in musehub/services/proposal_merge_strategies.py and the local merge logic lives in muse/cli/commands/merge.py. They can drift.
- [✅] Extract
run_merge()(see Phase 1 architecture) inmuse/core/merge_engine.py— pure function shared by both paths; handles all fourdiff_unitvalues and all threeresolutionpolicies. - [✅]
run_merge()acceptsbase/ours/theirsmanifests +MergeEngine+ optional domain plugin and returnsmuse.domain.MergeResult. CLI and server each handle their own I/O wrapping. - [✅]
muse/cli/commands/merge.pyCases 3 and 4 now delegate torun_merge().musehub/services/proposal_merge_strategies.pycanonical strategy routing delegates torun_merge()+_to_local_result()conversion. Also addedours/theirstoSTRATEGY_MAP(were missing, causingreported_strategyto showrecursiveinstead ofours). - [✅]
tests/test_phase6_unified_merge_engine.py— 12 tests (ME_01–ME_12) covering all diff_units, resolution policies, clean merges, and musehub canonical routes (overlay, recursive, replay).
Phase 7 — Step 1.5 independence merge correctness + Harmony confidence gating
Two correctness gaps identified during Phase 3/4 implementation that are not covered by Phases 5 or 6.
Gap 1 — Step 1.5 silent union-merge on unparseable files
What happens today: In CodePlugin.merge_ops, Step 1.5 checks whether all symbol-level changes commute before attempting _independence_merge_blob. The check is:
file_has_symbol_conflict = any(
not ops_commute(oc, tc)
for oc in our_patch["child_ops"]
for tc in their_patch["child_ops"]
)
When a file cannot be parsed as Python (SyntaxError), _semantic_ops still returns a PatchOp with child_ops = []. The loop above is vacuously False (no pairs to compare), so _independence_merge_blob fires. That function runs a three-way text merge; if both sides changed the same line differently, it union-resolves them — silently writing both lines into the output (e.g. config = ours version\nconfig = theirs version\n) without surfacing a conflict. This is data corruption on real divergence in unparseable files.
This is distinct from the phantom conflict guard in Phase 5 (which handles the convergent case ours_id == theirs_id). Phase 5 does not address this path.
The fix: In Step 1.5's candidate loop, skip the independence path when both sides have empty child_ops AND neither patch was derived from a successful parse. Condition to add:
# Don't union-merge when neither side produced any symbols — the file couldn't
# be parsed, so "no symbol conflicts" means "we have no information", not "safe to merge".
if not our_patch["child_ops"] and not their_patch["child_ops"]:
continue
Gap 2 — Harmony auto-apply ignores confidence score
What happens today: harmony_auto_apply calls best_resolution(root, pattern_id) which ranks by (human_verified, confidence, applied_count) and returns the top result — then applies it unconditionally regardless of confidence score. A pattern learned from muse checkout --ours (confidence=0.8, human_verified=False) replays forever at the same strength as a carefully hand-edited resolution (confidence=1.0, human_verified=True).
The fix: Add a configurable minimum confidence threshold to harmony_auto_apply. Patterns below the threshold escalate (surface as conflicts) rather than auto-applying silently. The threshold is a repo-level config key with a sensible default.
# In harmony_auto_apply, after best_resolution():
min_confidence = get_config_float("harmony.min_auto_apply_confidence", root, def qqqqqqault=0.85)
if best.confidence < min_confidence:
continue # escalate — let the user resolve and improve confidence
The default of 0.85 means --ours/--theirs resolutions (0.8) do not auto-apply by default; only manually-verified resolutions (1.0) do.
Deliverables
One new test file: tests/test_phase7_merge_correctness.py.
Step 1.5 fix:
- [✅] IC_01 — Divergent unparseable Python (SyntaxError on both sides) → conflict surfaces correctly; no silent union-merge.
- [✅] IC_02 — Divergent valid Python (symbol changed on both sides, same file) → independence merge fires correctly and produces a clean merge when symbols don't overlap.
- [✅] IC_03 — Divergent valid Python (same symbol changed on both sides) → conflict surfaces; independence merge does NOT fire.
- [✅] IC_04 —
CodePlugin.merge_opsStep 1.5 skipped whenchild_opsempty on both sides (unit test on the skip condition).
Harmony confidence gating:
- [✅] HC_01 — Resolution with confidence >= 0.85 (default threshold) auto-applies on re-merge.
- [✅] HC_02 — Resolution with confidence < 0.85 (e.g. from
--ourscheckout, confidence=0.8) does NOT auto-apply; conflict surfaces for manual resolution. - [✅] HC_03 —
harmony.min_auto_apply_confidenceconfig key overrides the default; setting it to 0.75 causes the 0.8-confidence pattern to auto-apply. - [✅] HC_04 — When no resolution meets the confidence threshold, Harmony escalates cleanly (returns the conflict unresolved, does not error).
All 8 tests green with no regressions to Phases 1–4 or PHANTOM_01–05.
Phase 8 — muse merge --explain
The merge pipeline makes several invisible decisions — strategy routing, per-path case selection, Harmony pattern lookup, .museattributes rule firing — that are currently only visible in debug logs. When a merge produces a surprising result (phantom conflict, unexpected auto-resolution, wrong content in the merged snapshot), there is no way for a user or agent to see why without reading source code.
--explain makes the decision path first-class output.
Design
--explain is a flag on muse merge. It augments the existing output rather than replacing it:
muse merge feat --explain— human-readable decision trace to stdout alongside normal merge output.muse merge feat --explain --json— structured machine-readable trace embedded in the JSON envelope; agents can parse it to understand exactly what fired and why.muse merge feat --explain --dry-run— full explanation of what would happen without writing anything.
--explain does not change merge behavior — it only adds observation.
JSON shape (explain key in merge envelope)
"explain": {
"strategy_routing": {
"requested_strategy": "recursive",
"on_conflict": null,
"history": "merge",
"resolved_diff_unit": "three_way",
"resolved_resolution": "escalate",
"case": 4
},
"merge_base_commit_id": "sha256:...",
"per_path": [
{
"path": "config.py",
"ours_changed": true,
"theirs_changed": true,
"ours_id": "sha256:...",
"theirs_id": "sha256:...",
"base_id": "sha256:...",
"decision": "conflict",
"reason": "both sides modified; content diverged",
"harmony_checked": true,
"harmony_pattern_id": null,
"harmony_result": "no_pattern",
"attributes_rule": null,
"independence_merge_attempted": false
},
{
"path": "README.md",
"ours_changed": false,
"theirs_changed": true,
"decision": "take_theirs",
"reason": "only theirs changed",
"harmony_checked": false,
"harmony_pattern_id": null,
"harmony_result": "not_applicable",
"attributes_rule": null,
"independence_merge_attempted": false
}
],
"summary": {
"total_paths": 5,
"clean_no_change": 2,
"take_ours_only": 0,
"take_theirs_only": 1,
"convergent": 1,
"harmony_auto_resolved": 0,
"attributes_auto_resolved": 0,
"independence_merged": 0,
"conflicts": 1
}
}
decision values: "no_change" | "take_ours_only" | "take_theirs_only" | "convergent" | "harmony_auto_resolved" | "attributes_auto_resolved" | "independence_merged" | "conflict" | "overlay_auto_resolved".
harmony_result values: "not_applicable" | "no_pattern" | "below_confidence_threshold" | "auto_resolved" | "escalated".
Human-readable format
muse merge feat --explain
Merge base: sha256:cc521... (2 commits behind)
Strategy: recursive (three_way + escalate) → Case 4
config.py CONFLICT both sides modified; diverged
harmony: no pattern found
README.md take-theirs only theirs changed
shared.py no-change identical on both sides
utils.py convergent both sides independently arrived at same content
billing.py harmony auto-resolved via pattern a3f2c9 (confidence 1.0)
1 conflict · 1 harmony auto-resolved · 3 clean
Deliverables
One new test file: tests/test_phase8_explain.py.
- [✅] EX_01 —
--explainflag accepted onmuse merge; no error, no change to exit code or merge behavior. - [✅] EX_02 —
--explain --dry-run --jsonproduces anexplainkey in the JSON envelope without writing any commits or MERGE_STATE. - [✅] EX_03 —
explain.strategy_routingcontainsrequested_strategy,resolved_diff_unit,resolved_resolution,case. - [✅] EX_04 —
explain.per_pathentry for a conflicting file hasdecision == "conflict",ours_changed == true,theirs_changed == true,ours_id != theirs_id. - [✅] EX_05 —
explain.per_pathentry for a theirs-only change hasdecision == "take_theirs_only",ours_changed == false,theirs_changed == true. - [✅] EX_06 —
explain.per_pathentry for a convergent edit hasdecision == "convergent",ours_changed == true,theirs_changed == true,ours_id == theirs_id. - [✅] EX_07 —
explain.per_pathentry for an untouched file hasdecision == "no_change",harmony_checked == false. - [✅] EX_08 — After Harmony has a pattern for a conflict, re-merging with
--explain --jsonshowsharmony_result == "auto_resolved"andharmony_pattern_idis set for the resolved path. - [✅] EX_09 —
explain.summarycounts are correct:conflicts,harmony_auto_resolved,convergent,clean_no_changeall match the actualper_pathentries. - [✅] EX_10 —
--explainwithout--jsonproduces human-readable output containing the merge base commit ID, strategy line, and one line per changed path.
All 10 tests green with no regressions to Phases 1–4, 7, or PHANTOM_01–05.
Acceptance criteria
- [✅] `muse merge --strategy X --on-conflict Y --history Z` works for every valid combination. (Phase 1 + Phase 3 SM_01–SM_24)
- [✅] Every cell in the Phase 3 matrix has a passing test. (SM_01–SM_24)
- [✅] PHANTOM_01–05 pass under every strategy. (Phase 5 + SM_22–SM_24)
- [✅] Harmony learns correctly from `merge`, `squash`, and `rebase` history modes. (HA_02–HA_04; per-commit rebase loop explicitly deferred in Phase 4)
- [✅] `muse merge` and `muse hub proposal merge` produce bit-identical merged snapshots given the same inputs. (Phase 6 ME_01–ME_12)
- [✅] No phantom conflicts under any strategy or history mode. (Phase 5 + SM_22–SM_24)
- [✅] No silent union-merge on unparseable files (Phase 7 Step 1.5 fix). (IC_01–IC_04)
- [✅] Harmony auto-apply respects confidence threshold; low-confidence patterns escalate. (HC_01–HC_04)
- [✅] `muse merge --explain --json` exposes full per-path decision trace for every path touched by the merge. (EX_01–EX_10)
- [✅] Existing merge tests unaffected. (test_cmd_merge.py passes throughout all phases)
Out of scope
- Muse wire protocol changes (push validation, pack format, object store).
- Changes to Harmony's storage format or pattern-matching algorithms.
- New domain plugins.