
Key Material Security Audit — Muse + MuseHub

Scope: muse and musehub repos only. Agentception, Stori, Maestro deferred.
Status: Phases 1–8 complete.


The Target Architecture

OS Keychain  ←→  one master BIP39 mnemonic per machine (not per hub)
                  │
                  ▼  mnemonic_to_seed()   [in memory, never logged]
                  64-byte seed
                  │
                  ▼  derive_identity_key(seed, ...)   SLIP-0010 Ed25519
                  DerivedKey  →  Ed25519PrivateKey   [in memory]
                  │              │
                  ▼              ▼
               dk.zero()     sign(canonical_message)   [in memory]
                              │
                              ▼
                           sig bytes  →  base64url  →  Authorization header

NO PEM EVER WRITTEN TO DISK.
NO INTERMEDIATE KEY BYTES OUTSIDE MEMORY.
ONE MNEMONIC. MANY HUBS.

Current State — What Exists Today

muse/core/keychain.py

  • Stores mnemonic in OS Keychain via keyring library. ✅
  • Key: service="muse", username="mnemonic" (single global entry — not per-hub). ✅ (Phase 1)
  • Legacy per-hub "{hostname}/mnemonic" entries migrated transparently on first load(). ✅
  • MUSE_KEYCHAIN_BACKEND=disabled for CI is correct. ✅
  • load() / store() / delete() API is clean. ✅

muse/core/identity.py

  • Mnemonic never written to TOML — stripped before _save_all(). ✅
  • Injected at load time via kc_load() in load_identity(). ✅
  • resolve_signing_identity() derives key from keychain mnemonic + hd_path — no PEM read. ✅ (Phase 2)
  • TOML written atomically with fchmod(0o600) before data. ✅
  • Symlink guard on write. ✅
  • Advisory fcntl.flock on read-modify-write. ✅
  • key_path: str still in IdentityEntry TypedDict and _dump_identity() serialiser. ⚠️ (Post-Phase-8 finding — see audit section below)
  • _load_all() still parses key_path from TOML — intentional forward-compat. ✅
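The write-side guarantees listed above (symlink guard, fchmod(0o600) before any data, atomic replace, advisory flock) can be sketched as follows. This is a minimal POSIX-only illustration, not the code in identity.py: the name `write_private_file` and the sidecar `.lock` file are assumptions made for the example, and the lock here covers only the write rather than a full read-modify-write cycle.

```python
import fcntl
import os
import tempfile
from pathlib import Path


def write_private_file(path: Path, data: bytes) -> None:
    """Atomically replace `path` with `data`, readable by the owner only.

    Hypothetical helper: illustrates the pattern, not muse's real code.
    """
    # Symlink guard: refuse to write through a link planted at the target.
    if path.is_symlink():
        raise OSError(f"refusing to write through symlink: {path}")

    # Advisory lock on a sidecar file (assumed layout) serialises writers
    # that cooperate by taking the same lock.
    lock_path = path.with_name(path.name + ".lock")
    with open(lock_path, "w") as lock:
        fcntl.flock(lock, fcntl.LOCK_EX)
        fd, tmp_name = tempfile.mkstemp(dir=str(path.parent), prefix=path.name + ".")
        try:
            with os.fdopen(fd, "wb") as tmp:
                # Restrict permissions before the first byte is written.
                os.fchmod(tmp.fileno(), 0o600)
                tmp.write(data)
                tmp.flush()
                os.fsync(tmp.fileno())
            # Rename is atomic when source and target share a filesystem.
            os.replace(tmp_name, path)
        except BaseException:
            if os.path.exists(tmp_name):
                os.unlink(tmp_name)
            raise
```

Creating the temporary file in the target's own directory keeps the final rename on one filesystem, which is what makes the replace atomic.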

muse/core/keypair.py

  • generate_hd_keypair, load_private_key, load_private_key_from_pem, _write_private_key_pem, _key_path, _hostname_key, _SAFE_AGENT_ID, _ensure_keys_dir — all deleted. ✅ (Post-Phase-8 cleanup)
  • derive_hd_public_info() is the only key derivation path — no file writes. ✅ (Phase 4)
  • _KEYS_DIR sentinel kept — used by cleanup-keys and security-check commands. ✅
  • DerivedKey.zero() wrapped in try/finally at all call sites. ✅ (Phase 6)
  • ~/.muse/keys/ contains no PEM files — all orphans destroyed by Phase 5. ✅

muse/core/hdkeys.py

  • Derivation path structure is correct and well-documented. ✅
  • derive_agent_sub_seed() returns SecretByteArray (auto-zeroes on GC). ✅ (Phase 6)
  • dk.zero() called in derive_agent_sub_seed() inside try/finally. ✅ (Phase 6)
  • public_bytes_from_seed() calls dk.zero() inside try/finally. ✅ (Phase 6)
  • All intermediate DerivedKey objects passed through child_key() loop are zeroed. ✅

muse/core/msign.py

  • build_msign_header() takes a SigningIdentity (handle + private key). ✅
  • Signing happens in memory. ✅
  • No key material logged or included in exceptions. ✅
  • DEFAULT_SIGN_ALGO used correctly (not KeyAlgorithm enum). ✅ (fixed in previous session)

muse/core/transport.py::SigningIdentity

  • Holds handle: str and private_key: Ed25519PrivateKey. ✅
  • private_key now derived from keychain mnemonic via resolve_signing_identity() — no disk read. ✅ (Phase 2)

muse/cli/commands/auth.py

  • run_keygen (human): generates mnemonic → keychain → derives in memory → writes pubkey only. No PEM write. ✅ (Phase 4)
  • run_register: derives key from keychain mnemonic + hd_path, writes no key_path to entry. ✅ (Phase 4)
  • run_recover: derives in memory from stdin mnemonic → stores in keychain; no PEM write. ✅ (Phase 8)
  • run_rotate: reads mnemonic from keychain; derives in memory; no PEM write. ✅ (Phase 8)
  • Mnemonic passed via --mnemonic-fd N (fd 0/1/2 reserved). ✅
  • Mnemonic never echoed to stdout/stderr. ✅
  • Keychain storage is now machine-global ("mnemonic" key, not per-hub). ✅ (Phase 1)
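Passing the mnemonic on a dedicated file descriptor keeps it out of argv, the environment, and the terminal. The sketch below shows one way the "fd 0/1/2 reserved" rule could be enforced. `read_secret_from_fd` and its size limit are hypothetical and do not describe the actual auth.py implementation.

```python
import os


def read_secret_from_fd(fd: int, max_bytes: int = 4096) -> bytearray:
    """Read a secret from an inherited descriptor into a wipeable buffer.

    Hypothetical helper; muse's real --mnemonic-fd handling may differ.
    """
    if fd in (0, 1, 2):
        raise ValueError("fd 0/1/2 are reserved; pass a dedicated descriptor")
    buf = bytearray()
    try:
        while len(buf) <= max_bytes:
            chunk = os.read(fd, 1024)  # note: each chunk is an immutable bytes copy
            if not chunk:
                break
            buf.extend(chunk)
    finally:
        os.close(fd)
    if len(buf) > max_bytes:
        raise ValueError("secret exceeds max_bytes")
    # Strip trailing newline characters in place.
    while buf and buf[-1] in b"\r\n":
        del buf[-1]
    return buf
```

Returning a bytearray lets the caller zero the buffer after use. The intermediate chunks returned by os.read are immutable and remain in the heap until collected, the same limitation the audit notes for other bytes objects.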

musehub/auth/request_signing.py

  • Server-side verification: reads public key bytes from DB, calls verify_signature(). ✅
  • No private key material on the server at all. ✅
  • Streaming push (application/x-muse-wire) signs with b"" body. ✅ (fixed in previous session)
  • DEFAULT_SIGN_ALGO used for canonical message default. ✅ (fixed in previous session)

Identified Issues (Ranked by Severity)

CRITICAL

| # | File | Issue | Status |
|---|---|---|---|
| C1 | keypair.py::generate_hd_keypair | Writes derived Ed25519 private key to ~/.muse/keys/*.pem (unencrypted, on disk) | ✅ No longer called; deletion deferred to Phase 2 cleanup |
| C2 | identity.py::resolve_signing_identity | Loads private key from PEM file on disk instead of deriving in memory | ✅ Fixed (Phase 2) |
| C3 | keychain.py::_username | Mnemonic stored per-hub instead of once per machine | ✅ Fixed (Phase 1) |
| C4 | identity.toml schema | key_path field encodes PEM dependency into stored format | ✅ Fixed (Phase 3) |

HIGH

| # | File | Issue | Status |
|---|---|---|---|
| H1 | keypair.py | load_private_key() / load_private_key_from_pem() exist only to serve the PEM architecture | ⚠️ Still present, unused by hot paths; blocked on run_recover/run_rotate migration |
| H2 | auth.py::run_keygen | Human keygen stored mnemonic per-hub instead of globally | ✅ Fixed (Phase 1 + 4) |
| H3 | auth.py::run_recover | Still writes PEM after re-deriving from mnemonic | ✅ Fixed (Phase 8) |
| H4 | auth.py::run_rotate | Still writes PEM at new rotation index | ✅ Fixed (Phase 8) |
| H5 | ~/.muse/keys/ | Existing PEM files on disk are live key material | ✅ Fixed (Phase 5) |

MEDIUM

| # | File | Issue | Status |
|---|---|---|---|
| M1 | hdkeys.py::derive_agent_sub_seed | Returns bytearray; callers must zero it after use, no enforcement | ✅ Fixed in Phase 6 (SecretByteArray auto-zeroes on GC) |
| M2 | keypair.py::generate_hd_keypair | Ed25519PrivateKey from cryptography holds key bytes in C heap; Python cannot zero them | ⚠️ Known limitation, inherent to the cryptography library |
| M3 | identity.py::_load_private_key_from_path | Reads entire PEM as bytes into Python heap (immutable, cannot be zeroed) | ⚠️ Still present; only path is run_recover/run_rotate (deferred) |
| M4 | slip010.py::DerivedKey | zero() method exists, but OS may swap bytearray to disk before zero | ⚠️ Known limitation, inherent to Python's memory model; mitigated by __del__ (Phase 6) |

LOW

| # | File | Issue |
|---|---|---|
| L1 | keychain.py | No test that MUSE_KEYCHAIN_BACKEND=disabled mode warns when mnemonic would be lost |
| L2 | identity.toml | provisioned_by_fingerprint field not validated against actual operator key on agent provisioning |
| L3 | keypair.py::_write_private_key_pem | PEM bytes created as bytes (immutable; stays in Python heap until GC) |

Implementation Plan

Phase 1 — Fix the Mnemonic Keychain Key (C3) ✅ COMPLETE

Goal: One master mnemonic per machine, not one per hub.

  • [x] Change _username() to return "mnemonic" (constant, no hub in key)
  • [x] Write migration: if "{hostname}/mnemonic" exists and "mnemonic" does not, copy it over and delete the old entry
  • [x] Run migration in load() transparently (one-time, idempotent)
  • [x] Update store() and delete(): hub_url param removed
  • [x] Update all callers of kc_store / kc_load in auth.py
  • [x] Update tests in tests/test_core_keychain.py
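The one-time, idempotent migration described in this phase can be sketched with an in-memory stand-in for the OS keychain. Everything here is illustrative: `InMemoryKeychain`, the `legacy_hostnames` argument, and this `load()` signature are assumptions, and the real keychain.py goes through the keyring library.

```python
from typing import Dict, Iterable, Optional, Tuple

SERVICE = "muse"
GLOBAL_USERNAME = "mnemonic"


class InMemoryKeychain:
    """Stand-in for the OS keychain (hypothetical; for illustration only)."""

    def __init__(self) -> None:
        self._items: Dict[Tuple[str, str], str] = {}

    def get(self, service: str, username: str) -> Optional[str]:
        return self._items.get((service, username))

    def set(self, service: str, username: str, secret: str) -> None:
        self._items[(service, username)] = secret

    def delete(self, service: str, username: str) -> None:
        self._items.pop((service, username), None)


def load(kc: InMemoryKeychain, legacy_hostnames: Iterable[str]) -> Optional[str]:
    """Return the machine-global mnemonic, migrating a legacy entry if needed."""
    current = kc.get(SERVICE, GLOBAL_USERNAME)
    if current is not None:
        return current  # already migrated: nothing to do on later calls
    for host in legacy_hostnames:
        legacy_user = f"{host}/mnemonic"
        legacy = kc.get(SERVICE, legacy_user)
        if legacy is not None:
            # Copy first, delete second, so a crash never loses the secret.
            kc.set(SERVICE, GLOBAL_USERNAME, legacy)
            kc.delete(SERVICE, legacy_user)
            return legacy
    return None
```

Because the global entry is checked first, a second call returns immediately and leaves the keychain unchanged, which is what makes the migration safe to run on every load().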

Phase 2 — Derive and Sign in Memory (C1, C2, H1) ✅ COMPLETE

Goal: resolve_signing_identity() derives the Ed25519 key from the mnemonic at call time. No PEM file read. No PEM file write.

Call chain (implemented):

resolve_signing_identity(hub_url)
  → load_identity(hub_url)            # reads identity.toml (handle, hd_path, fingerprint)
  → kc_load()                         # mnemonic from OS keychain
  → mnemonic_to_seed(mnemonic)        # BIP39 PBKDF2
  → derive_path(seed, hd_path)        # SLIP-0010 Ed25519
  → to_ed25519_private_key(dk)        # materialise key
  → dk.zero()                         # zero DerivedKey immediately
  → Ed25519PrivateKey                 # sign, then let GC handle it
  • [x] Rewrite resolve_signing_identity() to derive from keychain — no PEM read
  • [x] Delete generate_hd_keypair() from keypair.py — done in post-Phase-8 cleanup
  • [x] Delete load_private_key() / load_private_key_from_pem() / _write_private_key_pem() from keypair.py — done in post-Phase-8 cleanup
  • [x] Delete _check_key_file_permissions() and _load_private_key_from_path() from identity.py — done in post-Phase-8 cleanup
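The first three derivation steps of the call chain (BIP39 seed, SLIP-0010 Ed25519 path derivation, zeroing of intermediates) can be reproduced with the standard library alone. The functions below are standalone stand-ins that reuse the names from the call chain for readability; they are not the muse implementations. The path format `m/0'/1'` is an assumed example, and `mnemonic_to_seed` here does not validate the BIP39 checksum.

```python
import hashlib
import hmac
import unicodedata
from typing import List

HARDENED = 0x80000000


def mnemonic_to_seed(mnemonic: str, passphrase: str = "") -> bytes:
    """BIP39 seed: PBKDF2-HMAC-SHA512, 2048 rounds, salt "mnemonic" + passphrase."""
    password = unicodedata.normalize("NFKD", mnemonic).encode("utf-8")
    salt = unicodedata.normalize("NFKD", "mnemonic" + passphrase).encode("utf-8")
    return hashlib.pbkdf2_hmac("sha512", password, salt, 2048, dklen=64)


def parse_hd_path(hd_path: str) -> List[int]:
    """Turn "m/0'/1'" into hardened child indices."""
    parts = hd_path.split("/")
    if parts[0] != "m":
        raise ValueError("path must start with 'm'")
    indices = []
    for part in parts[1:]:
        if not part.endswith("'"):
            raise ValueError("SLIP-0010 Ed25519 allows hardened children only")
        indices.append(int(part[:-1]) | HARDENED)
    return indices


def wipe(buf: bytearray) -> None:
    """Overwrite a mutable buffer with zeros in place."""
    for i in range(len(buf)):
        buf[i] = 0


def derive_path(seed: bytes, hd_path: str) -> bytearray:
    """Return the 32-byte Ed25519 private seed at hd_path (caller must wipe it)."""
    indices = parse_hd_path(hd_path)
    digest = hmac.new(b"ed25519 seed", seed, hashlib.sha512).digest()
    key, chain = bytearray(digest[:32]), bytearray(digest[32:])
    for index in indices:
        data = b"\x00" + bytes(key) + index.to_bytes(4, "big")
        digest = hmac.new(bytes(chain), data, hashlib.sha512).digest()
        wipe(key)    # zero the parent key before dropping the reference
        wipe(chain)
        key, chain = bytearray(digest[:32]), bytearray(digest[32:])
    wipe(chain)
    return key
```

A caller would wrap use of the returned key in try/finally and call `wipe(key)` in the finally clause. The immutable `bytes` temporaries (`digest`, `bytes(key)`) cannot be zeroed from Python, so this remains best-effort, in line with the known limitations recorded under MEDIUM.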

Phase 3 — Remove key_path from identity.toml Schema (C4) ⚠️ PARTIAL

Goal: The key_path field is meaningless once Phase 2 lands. Remove it from the type, the serialiser, and the parser.

  • [ ] Remove key_path from IdentityEntry TypedDict — ⚠️ still present (post-Phase-8 audit finding)
  • [x] _load_all() parser still reads key_path from TOML for forward-compat — acceptable
  • [ ] Remove key_path from _dump_identity() serialiser — ⚠️ still writes it if present in entry
  • [x] save_identity() silently drops key_path if present in a loaded file — ⚠️ actually NOT dropped; _dump_identity re-serialises it
  • [ ] Update all tests that set or assert on key_path. ⚠️ Many tests still set key_path in fixture entries; 4 tests fail because key_set now uses hd_path, not key_path
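The intended end state of this phase is a tolerant reader paired with a strict writer: old files that contain key_path still load, but the field is never written back. A minimal sketch, assuming entries are flat string dictionaries; `load_entry`, `dump_entry`, and `LEGACY_FIELDS` are hypothetical names, not the identity.py API.

```python
from typing import Dict, Mapping

# Fields accepted when reading old files but never written back.
LEGACY_FIELDS = frozenset({"key_path"})


def load_entry(raw: Mapping[str, str]) -> Dict[str, str]:
    """Parser side: accept legacy fields so old identity files still load."""
    return dict(raw)


def dump_entry(entry: Mapping[str, str]) -> Dict[str, str]:
    """Serialiser side: drop legacy fields so they disappear on the next save."""
    return {k: v for k, v in entry.items() if k not in LEGACY_FIELDS}
```

With this split, a load followed by a save is enough to migrate an old file, and no separate migration step is needed.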

Phase 4 — Update auth.py CLI Commands (H2, H4) ✅ COMPLETE

Goal: keygen and register no longer write PEM files or read them.

  • [x] run_keygen: generate mnemonic → kc_store() → derive in memory → store pubkey/fingerprint/hd_path. No PEM write.
  • [x] run_register: derive key from keychain mnemonic + hd_path — no load_private_key, no key_path in entry.
  • [x] run_keygen reuses existing mnemonic from keychain unless --force
  • [x] Data-integrity invariants tested: hd_path preserved, no key_path, fingerprint matches mnemonic, round-trip works (test_auth_register_integrity.py)
  • [x] run_recover: derives in memory from stdin mnemonic → stores mnemonic in keychain; no PEM write (Phase 8)
  • [x] run_rotate: reads mnemonic from keychain; derives in memory; no PEM write (Phase 8)

Phase 5 — Orphan PEM Cleanup ✅ COMPLETE

Goal: Remove existing PEM files from disk. They are now vestigial and represent unprotected key material.

  • [x] muse auth cleanup-keys: overwrites each *.pem with os.urandom bytes, fsyncs, unlinks; JSON output
  • [x] muse auth security-check: four invariant checks; exits 1 on any failure; JSON output
  • [x] 6 stale PEM files destroyed from real ~/.muse/keys/ on first run
  • [x] Tests: C1–C5 (cleanup), S1–S5 (security-check) — all green (test_cmd_auth_phase5.py)
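The overwrite, fsync, unlink sequence used by cleanup-keys can be sketched as below. `shred_pem_files` is a hypothetical name and the real command's JSON output is not reproduced. Overwriting in place is best-effort only: on SSDs and copy-on-write filesystems the old blocks may survive.

```python
import os
from pathlib import Path
from typing import List


def shred_pem_files(keys_dir: Path) -> List[str]:
    """Overwrite each regular *.pem file with random bytes, then delete it.

    Hypothetical helper illustrating the cleanup sequence; returns the
    names of the files that were destroyed.
    """
    destroyed: List[str] = []
    for pem in sorted(keys_dir.glob("*.pem")):
        # Never follow symlinks: only regular files are overwritten.
        if pem.is_symlink() or not pem.is_file():
            continue
        size = pem.stat().st_size
        with open(pem, "r+b") as fh:
            fh.write(os.urandom(size))
            fh.flush()
            os.fsync(fh.fileno())  # push the overwrite to disk before unlinking
        pem.unlink()
        destroyed.append(pem.name)
    return destroyed
```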

Phase 6 — DerivedKey Zeroing Hardening (M1, M2) ✅ COMPLETE

Goal: Best-effort zeroing of sensitive bytes in Python's heap. We cannot zero cryptography's C-heap allocations, but we can minimize the exposure window.

Gaps identified (all code paths, as of Phase 5):

| Site | Gap |
|---|---|
| slip010.py::DerivedKey | No __del__ fallback; forgotten zero() calls leave key material until GC |
| identity.py::resolve_signing_identity._derive | dk.zero() after to_ed25519_private_key(dk) with no try/finally; an exception skips zeroing |
| keypair.py::derive_hd_public_info | Same: no try/finally around dk.zero() |
| keypair.py::generate_hd_keypair | Same |
| hdkeys.py::public_bytes_from_seed | Same |
| auth.py::run_register inline derivation | Same |
| hdkeys.py::derive_agent_sub_seed | Returns raw bytearray; no auto-zero if caller forgets |

Implementation targets:

  • [x] DerivedKey.__del__ added — calls self.zero() as GC safety net
  • [x] try/finally wrapping all dk.zero() sites:
    • identity.py::resolve_signing_identity._derive — exception now returns None, dk always zeroed
    • keypair.py::derive_hd_public_info
    • keypair.py::generate_hd_keypair
    • hdkeys.py::public_bytes_from_seed
    • auth.py::run_register inline derivation
  • [x] SecretByteArray added to slip010.py: a bytearray subclass with a zero() method and __del__ auto-zero
  • [x] derive_agent_sub_seed() return type changed from bytearray to SecretByteArray
  • [x] Tests: Z1–Z7 all green (test_security_zeroing.py)
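The two mechanisms in this phase, explicit zeroing inside try/finally and a `__del__` safety net, can be sketched together. This is an illustrative reimplementation, not the code in slip010.py; `with_secret` is a hypothetical helper added only to show the call-site pattern.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


class SecretByteArray(bytearray):
    """bytearray that can be wiped explicitly and wipes itself when finalised."""

    def zero(self) -> None:
        # Overwrite in place; the buffer keeps its length.
        for i in range(len(self)):
            self[i] = 0

    def __del__(self) -> None:
        # Safety net only: runs when the object is finalised, which may be late.
        self.zero()


def with_secret(secret: SecretByteArray, fn: Callable[[SecretByteArray], T]) -> T:
    """Run fn(secret) and zero the secret even if fn raises."""
    try:
        return fn(secret)
    finally:
        secret.zero()
```

The try/finally is the primary guarantee because it bounds the exposure window to the call itself. `__del__` only covers call sites that forget to zero, and its timing depends on the garbage collector.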

Phase 7 — MuseHub Server Audit ✅ COMPLETE

Goal: Confirm server never touches private key material.

  • [x] Verify musehub/auth/request_signing.py only reads public keys from DB — confirmed
  • [x] Verify no private key material in musehub/crypto/keys.py — confirmed
  • [x] Verify MusehubAuthKey DB model stores only public_key_b64 and fingerprint — confirmed
  • [x] Fixed stale ~/.muse/keys/{{hostname}}.pem reference in musehub/mcp/prompts.py
  • [x] Fixed key_path / ~/.muse/keys/ references in docs_muse_identity.html template
  • [x] 10 tests P7-1 through P7-6 all green (tests/test_security_server_audit.py)

Phase 8 — Migrate run_recover and run_rotate off PEM files ✅ COMPLETE

Goal: The last two CLI commands that wrote PEM files now derive keys in memory only. No PEM files written anywhere in the CLI.

  • [x] run_recover: replaced generate_hd_keypair with derive_hd_public_info; removed key_path from entry and JSON; stores mnemonic in OS keychain; guards on identity entry (not PEM file) for --force check; removed key_path from _RecoverJson TypedDict
  • [x] run_rotate: reads mnemonic from keychain (kc_load()) — no _read_mnemonic_securely stdin call; replaced generate_hd_keypair with derive_hd_public_info; removed key_path from entry and JSON; removed key_path from _RotateJson TypedDict
  • [x] test_cmd_auth_phase8.py — 12 tests (REC-1 through REC-7, ROT-1 through ROT-4) all green; written TDD-first
  • [x] test_auth_rotate.py — keychain patching added to isolated fixture; _rotate() no longer passes stdin; test_III2_new_pem_is_valid_ed25519 replaced with test_III2_rotate_writes_no_pem; all 10 green
  • [x] test_hd_keygen_unified.py::TestRunRecover: test_recover_writes_pem replaced with test_recover_writes_no_pem; test_recover_pem_mode_600 deleted; keychain patching added to _do_recover; all green

Test Checklist

  • [x] test_core_keychain.py — stale key_path fields removed from test entries
  • [x] test_hd_keygen_unified.py — all PEM assertion tests replaced; keychain patching added; no-PEM + fingerprint-based tests green
  • [x] test_cmd_auth_keygen_hd.py — PEM write tests replaced with derive_hd_public_info tests; stale imports removed
  • [x] test_agent_signing.py: TestLoadPrivateKeyFromPem deleted; stale key_path removed; Phase 2 in-memory signing tests green
  • [x] test_security_key_permissions.py — deleted; replaced by test_security_no_pem_on_disk.py (NP-1 through NP-5)
  • [x] test_auth_rotate.py — PEM assertions replaced with no-PEM assertions; keychain patching added; all 10 green (Phase 8)
  • [x] test_resolve_signing_identity_keychain_path.py — new; full chain KC-1 through KC-7 all green

Migration Path (Zero-Downtime)

  1. Phase 1 lands first: mnemonic consolidation. The existing staging.musehub.ai/mnemonic entry is migrated to mnemonic transparently on first load(). The localhost entry (missing) is created via muse auth recover using the staging mnemonic.
  2. Phases 2 and 3 land together: resolve_signing_identity reads from the keychain. PEM files are still on disk but no longer read. identity.toml drops key_path.
  3. Phase 4 lands: CLI commands stop writing PEMs.
  4. Phase 5 lands: PEM files are actively overwritten and deleted.
  5. Phase 6: ongoing hardening, no user-visible change.

Post-Phase-8 Completeness Audit

Findings recorded as discovered. Each item is either clean ✅, needs a doc/comment fix 📝, or is a real code issue ⚠️.

Deleted functions — still referenced

| Location | Reference | Finding |
|---|---|---|
| docs/key-material-security-audit.md (this file) | "Current State" section + Phase 2 + Phase 6 + "Files to Delete" table | 📝 Still describes generate_hd_keypair, load_private_key, load_private_key_from_pem, _write_private_key_pem, _check_key_file_permissions, _load_private_key_from_path as present/deferred. All are now deleted. |
| docs/agent-provenance.md:195 | load_private_key_from_pem in keypair.py description | 📝 Stale: function deleted. |
| tests/test_cmd_auth_keygen_hd.py module docstring (lines 9–12, 54) | Lists generate_hd_keypair tests as coverage | 📝 Stale docstring/comment: tests migrated to derive_hd_public_info. |
| tests/test_cmd_auth_keygen_hd.py:194,199 | "# Unit" generate_hd_keypair comment + class docstring | 📝 Stale comment/docstring in TestGenerateHdKeypair class. |
| tests/test_derived_key_zeroing.py:102 | # III generate_hd_keypair zeroes the final DerivedKey | 📝 Stale comment: now derive_hd_public_info. |
| tests/test_auth_hd_persistence.py:88 | ``mnemonic_to_seed`` and ``generate_hd_keypair`` run for real | 📝 Stale docstring. |

key_path still in production code

| Location | Finding |
|---|---|
| muse/core/identity.py:120 | Fixed. key_path: str removed from IdentityEntry TypedDict. |
| muse/core/identity.py:222–224 | Fixed. _dump_identity() no longer serialises key_path. TDD tests P3-1 and P3-2 added. |
| muse/core/identity.py:284–286 | _load_all() parses key_path from TOML: intentional forward-compat read of old files. Acceptable. |
| muse/core/identity.py module docstring (lines 33, 40, 102, 111) | 📝 File-format docstring examples still show key_path = "..." lines. |
| muse/cli/commands/hub/_core.py:572–579 | Fixed. PEM-load primary path deleted; get_signing_identity(remote_url=hub_url) is now the only signing path. TDD test H1 added. |
| muse/cli/commands/mist.py:184–190, 435–440, 1183–1187 | Fixed. Three PEM-load sites migrated to get_signing_identity(remote_url=...) + build_msign_header / sign_bytes. Also fixed: _require_hub and _get_hub_url called load_identity() without hub_url (TypeError). TDD tests M1–M3 added. |
| muse/cli/commands/sign.py:376, 453, 519 | Fixed. Stale getattr(args, "key_path", None) positional arg removed from run_request, run_curl, run_payment call sites. TDD tests CS-1 through CS-4 added. |

Test files calling deleted _key_path function

| Location | Finding |
|---|---|
| tests/test_agent_id_traversal.py (entire file) | Deleted. Tests the keypair._key_path path-traversal guard; _key_path was deleted and all 8 tests errored with AttributeError. The traversal risk was PEM filename injection, which is gone since no PEM is written. muse rm executed and committed. |

key_set logic broken — 4 test failures

_display_entry was updated to use key_set = bool(hd_path) (not bool(key_path)). Test fixtures that set key_path but not hd_path now get key_set = false, breaking tests that expected key_set = true.

| Location | Finding |
|---|---|
| tests/test_cli_auth.py::TestAuthWhoami::test_whoami_key_set_is_bool | Fixed. _store_entry() fixture updated: key_path removed, hd_path added. |
| tests/test_cmd_auth_hardening.py::TestDisplayEntry::test_json_key_set_true | Fixed. _make_entry() updated: key_path removed, hd_path added. |
| tests/test_cmd_auth_hardening.py::TestWhoamiHardening::test_whoami_json_schema | Fixed. _store() fixture updated: key_path removed, hd_path added. |
| tests/test_cmd_auth_hardening.py::TestWhoamiHardening::test_whoami_key_set_is_bool | Fixed. Same fixture fix. |
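The failure mode behind these four tests is easy to reproduce in isolation. The sketch below assumes entries are plain dictionaries; the function name `key_set` and the fixture values are placeholders rather than the real _display_entry code.

```python
from typing import Mapping


def key_set(entry: Mapping[str, str]) -> bool:
    """True when the entry carries a non-empty hd_path."""
    return bool(entry.get("hd_path"))


# Hypothetical fixture entries; field values are placeholders.
legacy_fixture = {"handle": "alice", "key_path": "/tmp/alice.pem"}
fixed_fixture = {"handle": "alice", "hd_path": "m/0'/0'"}
```

A fixture that only sets key_path evaluates to key_set = False under the new rule, which is why each fix swaps key_path for hd_path in the fixture instead of changing the assertion.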

Test files using key_path in identity entries (stale fixture data)

| Location | Finding |
|---|---|
| tests/test_cli_auth.py:76, 112, 133, 150–151, 179, 225, 259, 274 | 📝 save_identity calls with key_path in the entry dict. Tests still pass because the field is still in IdentityEntry TypedDict, but semantically stale. |
| tests/test_cli_hub.py:188, 232, 249, 283 | 📝 IdentityEntry dicts with key_path field. |
| tests/test_cli_hub.py:393, 431, 477, 610, 646, 682, 722, 896, 1157, 1268 | ⚠️ These pass key_path=str(...) to some constructor; closer inspection is needed to determine whether they construct IdentityEntry or another type (possibly SigningIdentity). |
| tests/test_cmd_auth_hardening.py:254, 317, 346–347, 364–365, 400, 426–427, 477, 500, 510, 522, 591, 621, 660, 708 | 📝 Many save_identity calls and IdentityEntry dicts with key_path. Tests for whoami/show/list display; key_path is used as dummy data to populate entries. |
| tests/test_cmd_hub_hardening.py:162 | 📝 Entry dict with key_path. |
| tests/test_auth_show_migrate.py:67, 92, 118, 140, 165, 184, 205, 293 | ✅ Intentionally tests display of old-format entries that contain key_path. Migration/compatibility tests: keep these. |
| tests/test_cmd_auth_phase5.py:190, 193, 214–234 | ✅ Intentionally tests security-check detecting key_path in identity entries (no_key_path_in_identity invariant). Keep these. |
| tests/test_auth_register_integrity.py:128–143 | ✅ Intentionally asserts key_path NOT in entry after register. Keep this. |

Docs and comments with stale key_path / PEM content

| Location | Finding |
|---|---|
| docs/guide/getting-started.md:37 | Fixed. Updated to show hd_path in example; prose updated to describe keychain-derived key. |
| docs/reference/auth.md:38, 45, 57 | Fixed. key_path replaced with hd_path in TOML examples and type table. |
| docs/reference/type-contracts.md:251, 3681 | Fixed. key_path replaced with hd_path in both type tables. |
| docs/agent-provenance.md:68, 78 | Fixed. Example identity entries updated to use hd_path; PEM key path removed. |
| docs/agent-provenance.md:195 | Fixed. load_private_key_from_pem → derive_hd_public_info + sign_bytes. |
| muse/core/provenance.py module docstring | Fixed. Signing model updated to describe keychain-derived key, no PEM file. |
| EXTREME_STRESS_PLAN.md:1039 | Fixed. Updated to describe keychain-derived key instead of PEM path. |
| tests/test_cmd_auth_keygen_hd.py module + class docstrings | Fixed. References to generate_hd_keypair and PEM updated to derive_hd_public_info. |
| tests/test_derived_key_zeroing.py:102 | Fixed. Section comment updated: generate_hd_keypair → derive_hd_public_info. |
| tests/test_auth_hd_persistence.py:88 | Fixed. Docstring updated: removed PEM reference, updated function name. |
| muse/core/snapshot.py:104 | "*.pem" in secret-pattern exclusion list: correct, keep. |
| EXTREME_STRESS_PLAN.md:1346 | *.pem listed as a secret pattern: correct, keep. |

Files to Delete (deferred cleanup)

All deferred deletions are now complete as of the post-Phase-8 cleanup commit.

| File / Symbol | Reason | Status |
|---|---|---|
| muse/core/keypair.py::generate_hd_keypair | PEM write | ✅ Deleted |
| muse/core/keypair.py::load_private_key | PEM read | ✅ Deleted |
| muse/core/keypair.py::load_private_key_from_pem | PEM read | ✅ Deleted |
| muse/core/keypair.py::_write_private_key_pem | PEM write helper | ✅ Deleted |
| muse/core/keypair.py::_key_path | PEM filename builder | ✅ Deleted |
| muse/core/keypair.py::_hostname_key | PEM filename sanitiser | ✅ Deleted |
| muse/core/keypair.py::_SAFE_AGENT_ID | PEM filename regex | ✅ Deleted |
| muse/core/keypair.py::_ensure_keys_dir | PEM directory creator | ✅ Deleted |
| muse/core/identity.py::_check_key_file_permissions | Gating PEM reads | ✅ Deleted |
| muse/core/identity.py::_load_private_key_from_path | PEM read path | ✅ Deleted |
| ~/.muse/keys/*.pem | Live key material | ✅ Done: Phase 5 destroyed all orphans |