# Companion App - Phase 2: Loopback Endpoint Security Core

**Status:** accepted design + implementation (pure request-guard; **no socket bound**).

**Branch:** `feat/companion-app` (Muse-canonical; not a docs-only PR to `main`).

**Phase table ref:** Gate §12, Phase 2 (🧠 Thinking). "DNS-rebinding and cross-origin abuse are adversarial; the defense must be argued against an attacker model, not pattern-matched."

**Depends on:** Phase 0 Decision Record (gate §13, D1–D3 accepted) and Phase 1 adapter seam ([`COMPANION-APP-PHASE-1-ADAPTER-SEAM.md`](COMPANION-APP-PHASE-1-ADAPTER-SEAM.md)).

**Upstream:** [`COMPANION-APP-DESIGN-AND-AUTHORIZATION-GATE.md`](COMPANION-APP-DESIGN-AND-AUTHORIZATION-GATE.md) §4 (the 8 loopback controls), §10 (7-tier test obligations); [`COMPANION-APP-MODEL-ROUTING-AND-ENRICHMENT-ARCHITECTURE.md`](COMPANION-APP-MODEL-ROUTING-AND-ENRICHMENT-ARCHITECTURE.md) §3 (client-side constraint), §8.1 (localhost security), §8.3 (prompt injection).

---

## Simple summary

The companion app runs a tiny AI server **on your own laptop** so your private notes can be processed locally and never leave the device. The most dangerous moment in that whole feature is when that little server **opens its door** to the network: every web page open in your browser can knock on `http://127.0.0.1:`, and a trick called **DNS-rebinding** can make a stranger's website *look* like it's coming from your own machine.

This phase builds the **bouncer** that stands at the door and decides, for every single knock, whether to let it in. The bouncer checks: *do you carry the right one-time pass (token)? are you actually knocking on the loopback door and not a disguised one (Host)? are you a page from this same local app and not some random website (Origin)? have there been too many knocks too fast (rate-limit)?* If anything is missing or even slightly off, the answer is **no**: the bouncer fails safe.
Crucially, we built and exhaustively tested the **bouncer by itself, before installing the actual door.** The door (the real listening socket) is deliberately **not** opened in this phase: that is the single most security-critical action and it stays behind a separate explicit approval (Phase 5). The bouncer is a pure function: same inputs always give the same answer, and it touches no files, no network, and no settings, so its decisions can be tested exhaustively in isolation.

## Technical summary

Phase 2 delivers **`lib/companion-loopback-guard.mjs`**, a pure, I/O-free request-decision core (`verifyLoopbackRequest`) enforcing gate §4 controls **1, 2, 3, 5, 6, 8** at the request-decision level (control 7 is satisfied structurally, because the guard never reads a body; see §5), plus the rate-state helpers (`createLoopbackRateState`, `recordLoopbackRequest`, `evaluateRateLimit`, `shouldCountTowardRateLimit`) and a constant-time comparator (`constantTimeStringEqual`). It binds **no socket** and reads **no environment**, mirroring how Phase 1 shipped pure decision logic.

This scope is deliberate and gate-compliant. The gate's ["DOES NOT approve (no code)"](COMPANION-APP-DESIGN-AND-AUTHORIZATION-GATE.md) list forbids *"opening any new local HTTP listener / loopback model endpoint in any repo,"* and §13.2 restates that this prohibition is **unchanged** by Phase 0 acceptance. Building the request-guard as a pure, fully-tested function, and deferring the actual `server.listen()` bind to Phase 5 behind an explicit gate, keeps the most security-critical surface closed while letting the adversarial decision logic be proven now. §4's controls 4 (non-predictable port) and 5 (loopback bind) are **binding-time** properties; this doc specifies exactly what Phase 5 must do to satisfy them (see [§6](#6-what-phase-5-must-do-to-bind-the-socket-safely)).

The guard is **fail-closed everywhere**: any missing, malformed, ambiguous, or unrecognised input denies.
It never throws (a catch-all converts internal errors to a fixed-reason 403), never logs, and never copies a token, JWT, or note body into a reason string, a return value, or an error, satisfying gate §4.8 "never log token, JWT, or note bodies."

---

## 1. Adversarial threat model

The loopback endpoint is the GitHub-analogue of a service bound to `127.0.0.1`: reachable by anything already running on the machine. We model four attacker capabilities and, for each, the **exact control** that stops it. The defense is argued against the attacker, not pattern-matched.

### Attacker A: malicious web page in the user's browser (cross-origin)

**Capability.** The user visits `https://evil.example`. That page's JavaScript can issue `fetch()` / `XMLHttpRequest` to `http://127.0.0.1:` (the browser will connect to loopback). The attacker controls the request method, the URL path, and most request headers, but **cannot** set the forbidden request headers `Origin` and `Sec-Fetch-*` (the browser sets these from the page's real context), and **cannot** read a cross-origin response unless the server emits permissive CORS.

**Stops it:**

- **`Sec-Fetch-Site: cross-site`** is attached automatically by the browser → guard returns **403** (`cross_site_forbidden`). The attacker cannot forge or strip this header.
- **`Origin: https://evil.example`** is attached on the cross-origin request and is **not** the loopback origin → guard returns **403** even if `Sec-Fetch-Site` were somehow absent.
- **No wildcard CORS / no Origin reflection** (control §4.3): Phase 5 must never emit `Access-Control-Allow-Origin: *` nor reflect an arbitrary `Origin`, so even a response the attacker provokes is unreadable cross-origin. The guard models this by accepting **only** the loopback origin; a foreign origin is denied and never echoed back.
- Even if the attacker has somehow learned the per-session token, the cross-site Origin check rejects the request **before** the token is consulted (evaluation order, [§3](#3-evaluation-order-and-why)).

### Attacker B: DNS-rebinding (make a remote origin appear to target loopback)

**Capability.** `evil.example` initially resolves to the attacker's server, then re-resolves to `127.0.0.1`. The victim's browser, still treating the page as same-origin to `evil.example`, sends requests that physically reach the local endpoint. The defining signature: the **`Host` header carries the attacker's domain** (`evil.example:`), because the browser fills `Host` from the URL the page fetched, not from the resolved IP.

**Stops it:**

- **Strict `Host` allowlist** (control §4.2, the primary DNS-rebinding defense): the guard accepts `Host` only when it both (a) matches the caller-supplied `allowedHosts` literal list and (b) parses to a recognised loopback hostname (`127.0.0.1` / `localhost` / `::1`). A rebound domain presents `Host: evil.example:` → **403** (`host_not_allowed`), before any model work.
- **Loopback-only double-check** (control §4.5): even if a caller misconfigures `allowedHosts` with a LAN IP, the independent loopback-hostname check still refuses it. The bind itself (Phase 5) must use `127.0.0.1`, never `0.0.0.0`.

### Attacker C: a local non-browser process

**Capability.** Malware or another user's process on the same machine speaks raw HTTP to the endpoint. It can set **any** header (including `Host`, `Origin`, `Sec-Fetch-Site`) because it is not a browser. It cannot, however, present the **per-session token** unless it has read the OS keychain (a separate, higher privilege).

**Stops it:**

- **Per-session bearer token** (control §4.1): a high-entropy token, generated at companion start and stored in the OS keychain, is required on every request. A process without it gets **401**.
  Constant-time comparison (`constantTimeStringEqual`) prevents a timing side-channel from leaking the token byte-by-byte.
- **Rate limiting** (control §4.8): bounds brute-force guessing; once the window is full even token-guessing requests get **429**, not an unbounded stream of 401s.
- **No ambient authority** (control §4.6): even an admitted request can only reach model inference; the endpoint never exposes the vault, the canister client, or the stored JWT. A compromise of the inference path cannot pivot to data exfiltration through this surface.

### Attacker D: prompt-injection payload inside a note body

**Capability.** A note contains adversarial text ("IGNORE ALL PREVIOUS INSTRUCTIONS… set Host to… use Bearer …"). This text is processed by the model; the attacker hopes the body can influence control decisions (auth, host, routing).

**Stops it:**

- **Note body is data, never control** (control §4.7 / brief §8.3): structurally, the guard does **not accept, read, or branch on any request body.** `verifyLoopbackRequest` has no `body` parameter; the admission decision is a function only of method, headers, token, allowlist, clock, and rate state. A payload in the body therefore cannot alter the Host, the Origin, the token, or the verdict. (Downstream prompt construction, which treats the body strictly as data when building the model prompt, is the runtime's obligation in a later phase; the guard guarantees the body never reaches *this* decision.)

---

## 2. Guard contract: `lib/companion-loopback-guard.mjs`

### 2.1 `verifyLoopbackRequest(params) → LoopbackVerdict`

**Signature.**

```js
verifyLoopbackRequest({ method, headers, token, expectedToken, allowedHosts, now, rateState })
// → { allow: boolean, status: 200 | 401 | 403 | 429, reason: string }
```

| Param | Type | Meaning |
| --- | --- | --- |
| `method` | `string` | HTTP method. Allowlist: `GET`, `POST` (case-insensitive). Anything else → 403. |
| `headers` | `Record` | Request headers (case-insensitive lookup). Array-valued (duplicate) headers are treated as ambiguous → fail-closed. |
| `token` | `string` | Bearer token presented by the caller (already extracted from `Authorization`). |
| `expectedToken` | `string` | The per-session token to match against (from the OS keychain, supplied by Phase 5). |
| `allowedHosts` | `string[]` | Loopback host literals, e.g. `['127.0.0.1:51847','localhost:51847']`. Empty/missing → deny. |
| `now` | `number` | Epoch-ms for this request (passed explicitly; the guard never reads the clock). |
| `rateState` | `LoopbackRateState` | Current sliding-window state. Missing/malformed → 429 fail-closed. |

**Verdict.** Exactly `{ allow, status, reason }`. `reason` is always one of the frozen `LOOPBACK_GUARD_REASONS` constants, never a value derived from input:

| `reason` | `status` | Meaning |
| --- | --- | --- |
| `ok` | 200 | Admitted. |
| `malformed_request` | 403 | Structurally invalid input (fail-closed). |
| `method_not_allowed` | 403 | Method not in `{GET, POST}`. |
| `host_not_allowed` | 403 | Missing/foreign/non-loopback `Host` (DNS-rebinding defense). |
| `cross_site_forbidden` | 403 | Cross-site `Sec-Fetch-Site` or foreign `Origin`. |
| `rate_state_unavailable` | 429 | Rate state missing/malformed; cannot prove the rate is bounded. |
| `rate_limited` | 429 | Window full. |
| `missing_token` | 401 | No token presented. |
| `invalid_token` | 401 | Token mismatch, or no `expectedToken` configured. |

**Guarantees (all under test):**

- **Pure:** no I/O, no `process.env`, no network, no logging, no clock read. Deterministic.
- **Fail-closed:** anything missing/malformed/ambiguous denies. No fail-open branch exists.
- **Never throws:** a catch-all converts any internal error to `403 malformed_request`, so no exception can carry input data outward.
- **No ambient authority:** the verdict is the only output. No vault, canister, or JWT handle.
- **No secret in output:** the presented token, expected token, JWT, and any note body never appear in a reason, a return value, or an error.

### 2.2 Rate-limit helpers

- `createLoopbackRateState({ windowMs = 60_000, maxRequests = 60 })` → fresh `{ windowMs, maxRequests, timestamps: [] }`.
- `evaluateRateLimit(rateState, now)` → `{ ok: true }` or `{ ok: false, reason }`. Pure; counts in-window timestamps; ≥ `maxRequests` → `rate_limited`.
- `recordLoopbackRequest(rateState, now)` → **new** state with `now` appended and out-of-window timestamps pruned (pure; input not mutated). The array is bounded by `maxRequests`.
- `shouldCountTowardRateLimit(verdict)` → `true` **only** for verdicts that reached the token stage (`ok` / `missing_token` / `invalid_token`). See [§4](#4-the-rate-limit-recording-contract).

### 2.3 Why the Origin allowlist is the loopback origin only

The signature intentionally has **no `allowedOrigins`** parameter. The guard derives the permitted browser origins from `allowedHosts` (i.e. `http(s)://`), so the **only** browser origin that may call the endpoint is its **own loopback origin** (same-origin). A remote origin, including the hosted Knowtation web app (`https://knowtation.store`), is cross-origin and is **rejected**.

This is the strictest reading of control §4.3 ("no reflecting arbitrary Origin") and it cleanly resolves the DNS-rebinding + cross-origin story: the loopback endpoint trusts only same-origin loopback browser context and non-browser local clients (which send no `Origin`/`Sec-Fetch-Site` and still must present a valid token).

If a future product decision requires the hosted web tab to *drive* the local companion, that is a **deliberate, documented allowlist extension** decided at the Phase 5 bind gate, not a silent default of this guard.
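
The origin derivation described in this section can be sketched as follows. This is a minimal illustration under stated assumptions, not the module's actual source: the helper names `deriveLoopbackOrigins` and `isOriginPermitted` are hypothetical, and each `allowedHosts` entry is assumed to be a `host:port` literal like the examples in §2.1.

```js
// Hypothetical helpers for illustration; they are not part of the documented
// module API. Assumption: each allowedHosts entry is a "host:port" literal.

// Build the set of browser origins implied by the Host allowlist.
function deriveLoopbackOrigins(allowedHosts) {
  const origins = new Set();
  if (!Array.isArray(allowedHosts)) return origins; // fail closed: nothing is allowed
  for (const host of allowedHosts) {
    if (typeof host !== 'string' || host.length === 0) continue;
    const normalized = host.toLowerCase();
    origins.add(`http://${normalized}`);
    origins.add(`https://${normalized}`);
  }
  return origins;
}

// Decide whether an Origin header value is acceptable.
// An absent Origin (non-browser local client) passes this check only;
// such a caller must still present a valid token in a later check.
function isOriginPermitted(originHeader, allowedHosts) {
  if (originHeader === undefined) return true;
  if (typeof originHeader !== 'string') return false; // e.g. duplicate headers as an array
  return deriveLoopbackOrigins(allowedHosts).has(originHeader.toLowerCase());
}

const exampleHosts = ['127.0.0.1:51847', 'localhost:51847'];
isOriginPermitted('http://127.0.0.1:51847', exampleHosts);   // true
isOriginPermitted('https://knowtation.store', exampleHosts); // false
```

Because the permitted set is computed from `allowedHosts` alone, there is no separate origin list that could drift out of sync with the Host allowlist, and an empty or malformed allowlist yields an empty set, so every browser origin is denied.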
Per brief §3/§2, in-browser inference today runs **in the tab via WebGPU** (reusing the web session), not through the loopback endpoint, so the same-origin-only default is correct for Phase 2.

---

## 3. Evaluation order (and why)

The order of checks is itself a security decision:

```
1. Structural validity      → 403 malformed_request    (fail-closed on bad input)
2. Method allowlist         → 403 method_not_allowed
3. Host allowlist+loopback  → 403 host_not_allowed     (DNS-rebinding; cheap, rejects most abuse)
4. Origin / Sec-Fetch-Site  → 403 cross_site_forbidden
5. Rate limit               → 429 rate_limited / rate_state_unavailable
6. Token (constant-time)    → 401 missing_token / invalid_token
7. Admit                    → 200 ok
```

- **Host/Origin before rate-limit.** A cross-origin or DNS-rebinding flood is rejected at steps 3–4 and is **not** recorded against the rate window (see §4). If those checks came *after* rate-limit, an attacker could exhaust the shared budget with cheap 403'd probes and **deny the legitimate client** (a budget-exhaustion DoS). Rejecting them first, without consuming budget, prevents that.
- **Rate-limit before token.** Placing the rate check *before* the token check is what **bounds token brute-force**: once the window is full, even token-guessing requests receive **429** rather than an unbounded stream of `401`s. If token came first, the function would short-circuit at the token check and never reach the 429 gate, leaving guessing unbounded.

---

## 4. The rate-limit recording contract

`verifyLoopbackRequest` is pure and does **not** mutate `rateState`. The caller (Phase 5 listener) advances the window:

```js
const verdict = verifyLoopbackRequest({ ...req, expectedToken, allowedHosts, now, rateState });
if (shouldCountTowardRateLimit(verdict)) {
  rateState = recordLoopbackRequest(rateState, now);
}
```

`shouldCountTowardRateLimit` returns `true` **only** for verdicts that reached the token stage (`ok`, `missing_token`, `invalid_token`).
This is the precise contract that makes two properties hold simultaneously:

- **Brute-force is bounded**: failed-auth (`401`) requests consume a slot, so a guessing flood fills the window and trips `429`.
- **No budget-exhaustion DoS, and the array stays bounded**: pre-rate rejections (`malformed`/`method`/`host`/`cross_site`) and rate rejections (`rate_limited`/`rate_state_unavailable`) are **not** recorded, so cross-origin/rebinding floods cannot drain the budget, and the `timestamps` array can never grow past `maxRequests`.

---

## 5. Mapping: gate §4 controls → Phase 2 enforcement

| Gate §4 control | Where enforced | Status |
| --- | --- | --- |
| **1. Bearer token on every request** | `verifyLoopbackRequest` token stage; `constantTimeStringEqual` | ✅ request-decision |
| **2. Strict `Host` allowlist (DNS-rebinding)** | `allowedHosts` match + `isLoopbackHost` | ✅ request-decision |
| **3. Strict `Origin`/`Sec-Fetch-Site`, no wildcard CORS** | Sec-Fetch-Site allowlist + loopback-origin-only check | ✅ request-decision |
| **4. Non-predictable ephemeral port** | - | ⏭ **Phase 5 (bind-time)**; see §6 |
| **5. Loopback bind only (`127.0.0.1`)** | `isLoopbackHost` double-check at decision level | ✅ partial (decision); bind ⏭ Phase 5 |
| **6. No ambient authority** | Narrow verdict shape; no vault/canister/JWT reachable | ✅ structural |
| **7. Untrusted input (note body as data)** | Guard never reads a body; structurally outside the decision | ✅ structural |
| **8. Rate limiting + minimal logging** | Sliding-window rate gate; guard never logs; no secret in output | ✅ request-decision |

> Gate §4: *"A future implementation that omits any of items 1–3, 5, or 6 fails this gate."* Items
> 1, 2, 3, 6 are fully enforced at the request-decision level; item 5's loopback **assertion** is
> enforced at the decision level and its **bind** is specified for Phase 5 below. No required item
> is omitted.

---

## 6. What Phase 5 must do to bind the socket safely

The pure guard is the bouncer; Phase 5 (companion shell) installs the door. Binding the listener is the single most security-critical action and **requires an explicit gate**. When Phase 5 binds, it MUST:

1. **Bind loopback only.** `server.listen(port, '127.0.0.1')`: never `0.0.0.0`, never a public interface (control §4.5). Do not bind `::`; if IPv6 loopback is offered, bind `::1` explicitly.
2. **Allocate a non-predictable ephemeral port.** Let the OS assign an ephemeral port (`listen(0, '127.0.0.1')`) and treat the chosen port as a secret-ish capability; do not use a fixed well-known port (control §4.4). Persist it only for the local session.
3. **Generate the per-session token with a CSPRNG.** `crypto.randomBytes(32)` (≥ 256-bit), base64url-encoded, stored in the **OS keychain** (Keychain / DPAPI / libsecret), regenerated each companion start. Pass it to the guard as `expectedToken`. Never log it; never place it in a URL.
4. **Build `allowedHosts` from the actual bound port** (`['127.0.0.1:', 'localhost:']`) and pass it to every `verifyLoopbackRequest` call.
5. **Extract the presented token** from `Authorization: Bearer ` and pass it as `token`. Pass `Date.now()` as `now` and maintain `rateState` per the §4 recording contract.
6. **Call the guard before any model work.** On `allow === false`, return `verdict.status` with a generic body and **no** secret; do not proceed to the runtime. On `allow === true`, proceed.
7. **Emit no permissive CORS.** Never `Access-Control-Allow-Origin: *`; if any CORS header is emitted at all, set `Access-Control-Allow-Origin` to the **validated loopback origin only** and never reflect an arbitrary `Origin`. (Contrast `hub/bridge/server.mjs`, which defaults to `Access-Control-Allow-Origin: *`; that pattern MUST NOT be copied to the loopback endpoint.)
8. **Minimal logging.** Log admission decisions by `reason` code only; never log the token, JWT, `Authorization` header, `Origin`, or any note body (control §4.8).
9. **Ship its own 7-tier suite** for the bind/lifecycle layer (socket bind assertion, ephemeral-port randomness, keychain read/write, concurrent-connection handling) per gate §10; the pure guard's suite does not absolve the listener of its own tests.

Until that explicit Phase 5 gate is approved, **no socket is bound** and the gate's no-listener prohibition remains in force.

---

## 7. Test obligations satisfied (gate §10, 7 tiers)

All under `test/companion-loopback-guard-*.test.mjs` (102 cases, all green):

| Tier | File | Focus |
| --- | --- | --- |
| Unit | `…-unit.test.mjs` | Each control in isolation; helpers (`parseHostHeader`, `constantTimeStringEqual`, rate helpers). |
| Integration | `…-integration.test.mjs` | Evaluation order under combined faults; rate-state lifecycle; brute-force bounding; budget-DoS prevention. |
| End-to-end | `…-e2e.test.mjs` | Realistic callers: companion UI, local CLI, cross-origin page, DNS-rebinding, stolen-token-still-blocked, full interleaved session. |
| Stress | `…-stress.test.mjs` | 100k wrong-token attempts (zero accidental allows); bounded window under 50k load; 10k-entry allowlist; pathological header bags. |
| Data-integrity | `…-data-integrity.test.mjs` | Determinism (10k identical calls); no input mutation; verdict shape; reason domain; env-independence. |
| Performance | `…-performance.test.mjs` | Sub-ms mean per-decision; 100k decisions < 2s; no super-linear blowup with window size. |
| **Security** | `…-security.test.mjs` | **Centerpiece:** missing/wrong token (constant-time, no length oracle); DNS-rebinding 403; cross-site 403; no wildcard CORS / no Origin reflection; rate-limit 429; no ambient authority; note-body-as-data; no secret in any output/reason/error; global fail-closed posture. |

---

## 8. Deferred (explicitly not Phase 2)

- The real listening socket, ephemeral-port allocation, and loopback bind: **Phase 5** behind an explicit gate (§6).
- OS-keychain read/write of the per-session token: Phase 3 (OAuth/keychain) / Phase 5.
- Downstream prompt construction that treats the note body strictly as data when building the model prompt: runtime phase (the guard guarantees the body never reaches the admission decision).
- Any change to OAuth client registration or scopes (gate "DOES NOT approve" list, unchanged).
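
---

## 9. Appendix: non-normative sketch of the rate-state helpers

As an illustration of the §2.2 helpers and the §4 recording contract, the following is one way the sliding window could be written. It is a sketch under assumptions, not the shipped module: the function names mirror §2.2, but the bodies, the `isValidRateState` helper, and the choice to return an invalid state unchanged from `recordLoopbackRequest` are hypothetical.

```js
// Non-normative sketch; the names mirror §2.2 but this is not the module's source.
const COUNTED_REASONS = new Set(['ok', 'missing_token', 'invalid_token']);

function createLoopbackRateState({ windowMs = 60_000, maxRequests = 60 } = {}) {
  return { windowMs, maxRequests, timestamps: [] };
}

// Hypothetical validity check used to fail closed on a broken state.
function isValidRateState(s) {
  return s !== null && typeof s === 'object' &&
    Number.isFinite(s.windowMs) && s.windowMs > 0 &&
    Number.isInteger(s.maxRequests) && s.maxRequests > 0 &&
    Array.isArray(s.timestamps);
}

// Timestamps newer than (now - windowMs). Timestamps ahead of `now` are kept,
// which errs on the side of denying if the caller's clock moves backwards.
function inWindow(rateState, now) {
  return rateState.timestamps.filter((t) => t > now - rateState.windowMs);
}

function evaluateRateLimit(rateState, now) {
  if (!isValidRateState(rateState) || !Number.isFinite(now)) {
    return { ok: false, reason: 'rate_state_unavailable' };
  }
  return inWindow(rateState, now).length >= rateState.maxRequests
    ? { ok: false, reason: 'rate_limited' }
    : { ok: true };
}

// Returns a new state; the input object and its array are never mutated.
function recordLoopbackRequest(rateState, now) {
  if (!isValidRateState(rateState) || !Number.isFinite(now)) return rateState;
  const timestamps = [...inWindow(rateState, now), now].slice(-rateState.maxRequests);
  return { ...rateState, timestamps };
}

// Only verdicts that reached the token stage consume a slot.
function shouldCountTowardRateLimit(verdict) {
  return Boolean(verdict) && COUNTED_REASONS.has(verdict.reason);
}

// Usage: three failed-auth verdicts fill a window of three.
let demoState = createLoopbackRateState({ windowMs: 1000, maxRequests: 3 });
for (let i = 0; i < 3; i++) {
  const verdict = { allow: false, status: 401, reason: 'invalid_token' };
  if (shouldCountTowardRateLimit(verdict)) {
    demoState = recordLoopbackRequest(demoState, 5000 + i);
  }
}
evaluateRateLimit(demoState, 5003); // { ok: false, reason: 'rate_limited' }
```

Keeping `recordLoopbackRequest` separate from the decision is what lets `verifyLoopbackRequest` stay pure: the caller owns the only mutable reference to the window and replaces it after each counted verdict.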