# TOTP-seeded SPAKE2 node pairing

## Status

accepted (2026-05-29)

## Context

"Zero-config WAN that just works" still requires a first-contact ceremony by which a user's new node is authorized into their existing subnet (key exchange + mutual trust), with no account and no central operator. The research brief vetted Ed25519 identity, Noise, Magic-Wormhole-style PAKE, and TOFU. The product question is the day-one pairing UX for a single user's own machines.

A natural-seeming option is TOTP: the user provisions a seed into an authenticator app and pairs future nodes with the rotating 6-digit code. Used the obvious way (a new node sends the code to a trusted node for verification), this is insecure: the code is a ~20-bit bearer token (10^6 possible values), replayable within its 30s window, and not bound to the key being exchanged, so a real-time MITM can replay the code with their own pubkey and join the subnet. It also performs no key exchange, so a separate, unbound handshake would still be required.

## Decision

Adopt **TOTP-seeded SPAKE2 pairing**: keep the familiar TOTP UX but use the 6-digit code as the **password for a SPAKE2 (PAKE) handshake**, not as a verified bearer token.

- The TOTP **seed** is the durable subnet secret: generated on the first node, shown as a QR for the user's authenticator app, and replicated to every already-trusted node.
- To pair a new node, a trusted node (online and unattended, since it holds the seed) computes the current code and uses it as its SPAKE2 password; the user reads the same code off their phone and types it into the new node, which uses it as its password. If the codes match, the PAKE succeeds: the pubkeys are exchanged and bound, and the result is a MITM-resistant encrypted channel.
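
A minimal sketch of how both sides could arrive at the same SPAKE2 password, assuming the standard RFC 6238 parameters (HMAC-SHA1, 30-second step, 6 digits). The function name `totp_code` is illustrative, the seed is the RFC test seed rather than a real subnet seed, and the SPAKE2 handshake itself is not shown.

```python
import hashlib
import hmac
import struct


def totp_code(seed: bytes, unix_time: int, step: int = 30, digits: int = 6) -> str:
    """RFC 6238 TOTP over HMAC-SHA1; the result is the PAKE password, not a bearer token."""
    counter = unix_time // step
    mac = hmac.new(seed, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F  # dynamic truncation from RFC 4226
    value = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(value % 10**digits).zfill(digits)


# RFC test seed for illustration; a real subnet seed would be random.
seed = b"12345678901234567890"

# The trusted node derives the password from the seed it holds.
trusted_node_password = totp_code(seed, unix_time=59)

# The user reads the same code from the authenticator app and types it in.
new_node_password = "287082"

# Equal passwords are the precondition for the SPAKE2 handshake to succeed.
print(trusted_node_password == new_node_password)
```

Neither side ever sends the code over the wire in this design; it only feeds the handshake.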

PAKE is the correct primitive for a low-entropy human-transcribed secret: it resists offline brute force (an attacker gets only one online guess per handshake attempt) and binds the code to the established key. Cryptographic strength comes from the PAKE construction, not from the 6 digits being secret.

Rejected alternatives:
- **TOTP-as-bearer-token** — replayable, unbound, MITM-vulnerable (above).
- **Plain Magic Wormhole / fresh-code-each-time** — secure, but needs a human/relay at *both* ends each pairing. TOTP-seeded PAKE keeps the persistent seed on online trusted nodes so the user interacts only with the new node.
- **Paste-the-key** — too error-prone for "just works."
- **Account-anchored auto-trust** — reintroduces the account dependency the product is shedding.

Trust model: local per-node trust store of authorized peer pubkeys; TOFU after pairing (accept on first post-ceremony connect, warn on key change). Cross-subnet (two different users) is out of scope for v1.
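
One possible reading of the trust model, sketched as an in-memory store. The class name `TrustStore`, the node-name keys, and the returned status strings are all hypothetical; the ADR does not specify the store's layout or its persistence.

```python
from dataclasses import dataclass, field


@dataclass
class TrustStore:
    """Hypothetical per-node trust store: peer name -> pinned public key."""
    authorized: set[str] = field(default_factory=set)      # admitted by a pairing ceremony
    pinned: dict[str, bytes] = field(default_factory=dict)  # key seen on first connect

    def authorize(self, node: str) -> None:
        self.authorized.add(node)

    def check(self, node: str, pubkey: bytes) -> str:
        if node not in self.authorized:
            return "reject"              # never paired: no trust at all
        known = self.pinned.get(node)
        if known is None:
            self.pinned[node] = pubkey   # trust on first post-ceremony connect
            return "accept-first-use"
        return "accept" if known == pubkey else "warn-key-change"


store = TrustStore()
store.authorize("laptop")
print(store.check("laptop", b"key-A"))   # first connect after the ceremony
print(store.check("laptop", b"key-A"))   # same key again
print(store.check("laptop", b"key-B"))   # key changed
print(store.check("stranger", b"key-C")) # never paired
```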

## Consequences

- A seed-holder node must be online and reachable by the untrusted new node at pairing time; the relay/discovery layer must permit pre-trust contact on a dedicated pairing ALPN.
- Operational details to implement: ±1 TOTP window tolerance, handshake-attempt rate limiting, QR re-provisioning for recovery from any still-trusted node.
- Losing all nodes *and* the authenticator seed makes the subnet unrecoverable — must be documented.
- Revocation of a paired node is a separate trust-store delete; TOTP gates joining only, not continued membership.
- Adds a SPAKE2 + TOTP dependency to the core crypto surface (both have mature Rust crates per the research brief).
- Because every paired node holds the seed, the current code is fetchable from any trusted node (`spt pair --code` or via an agent) — no phone needed if a trusted node is handy. The user is prompted to **name the subnet** on first pairing; the name is shown on every subsequent pairing for confidence as the fleet grows.
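
To make the ±1 window tolerance concrete, the sketch below lists the codes a seed-holder would treat as valid at a given moment, assuming RFC 6238 defaults. The helper names `hotp` and `acceptable_codes` are illustrative, and how a responder would try several candidate passwords inside a PAKE handshake is left open here.

```python
import hashlib
import hmac
import struct


def hotp(seed: bytes, counter: int, digits: int = 6) -> str:
    """RFC 4226 HOTP; TOTP is HOTP with counter = unix_time // step."""
    mac = hmac.new(seed, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F
    value = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(value % 10**digits).zfill(digits)


def acceptable_codes(seed: bytes, unix_time: int, window: int = 1, step: int = 30) -> set[str]:
    """Codes for the current time step plus `window` steps on either side."""
    current = unix_time // step
    steps = range(max(0, current - window), current + window + 1)
    return {hotp(seed, c) for c in steps}


seed = b"12345678901234567890"  # RFC 4226 test seed, not a real subnet seed
codes = acceptable_codes(seed, unix_time=59)
print(sorted(codes))
print(f"chance that one blind guess is valid: {len(codes)}/{10**6}")
```

With `window=0` only the current code is valid; each extra step of tolerance adds up to two more valid passwords for a guesser to hit.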

## Stage A red-team amendments (2026-05-31)

Codex (`docs/reviews/STAGE-A-codex-redteam.md`) found three issues; resolve before M4 builds pairing:

- **FATAL #10 — trust-store delete is NOT revocation.** The "revocation = trust-store delete; TOTP gates joining only" consequence above is false in the threat model: the seed is replicated to every trusted node, so a compromised node/seed can re-pair as a *new* node after deletion. **Required:** a seed **epoch + rotation** protocol — removing a compromised node rotates the subnet seed so the old node/old seed cannot rejoin. Pairing transcript must bind the seed epoch.
- **SERIOUS #11 — SPAKE2's one-online-guess bound needs subnet-global rate limiting.** A public pre-trust relay + multiple seed-holding nodes enables *distributed* guessing, and a ±1 TOTP acceptance window triples the valid-password space. **Required:** one active pairing ceremony per subnet, shared attempt counter, exponential backoff; justify or drop the ±1 window.
- **SERIOUS #12 — "PAKE binds pubkeys" only if the transcript does.** Must explicitly bind roles, both node pubkeys, subnet ID, TOTP time-step, and confirmation MACs, or unknown-key-share / reflection / wrong-subnet pairing remain. **Required:** full pairing-transcript spec + negative tests (MITM, replay, reflection, wrong subnet, stale time-step).
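
A hedged sketch of what the #12 binding could look like: a hash over length-prefixed transcript fields, and role-tagged confirmation MACs keyed by the PAKE output. The field labels, the `pairing-sketch-v0` tag, SHA-256, and the placeholder `session_key` are assumptions for illustration; the actual pairing-transcript spec is still to be written.

```python
import hashlib
import hmac
import struct


def _field(label: bytes, value: bytes) -> bytes:
    # Length-prefix label and value so field boundaries cannot be shifted.
    return struct.pack(">H", len(label)) + label + struct.pack(">I", len(value)) + value


def transcript_hash(initiator_pk: bytes, responder_pk: bytes, subnet_id: bytes,
                    seed_epoch: int, time_step: int) -> bytes:
    parts = [
        _field(b"proto", b"pairing-sketch-v0"),
        _field(b"initiator_pk", initiator_pk),
        _field(b"responder_pk", responder_pk),
        _field(b"subnet_id", subnet_id),
        _field(b"seed_epoch", struct.pack(">Q", seed_epoch)),
        _field(b"time_step", struct.pack(">Q", time_step)),
    ]
    return hashlib.sha256(b"".join(parts)).digest()


def confirmation_mac(session_key: bytes, role: bytes, transcript: bytes) -> bytes:
    # The role is part of the MAC input, so a reflected message does not verify.
    return hmac.new(session_key, _field(b"role", role) + transcript, hashlib.sha256).digest()


session_key = b"\x01" * 32  # placeholder for the key a SPAKE2 handshake would output
t = transcript_hash(b"pk-new", b"pk-trusted", b"home-net", seed_epoch=3, time_step=56_000_000)

mac_from_initiator = confirmation_mac(session_key, b"initiator", t)
mac_from_responder = confirmation_mac(session_key, b"responder", t)

# Reflection: echoing the initiator's MAC back does not pass as the responder's.
print(hmac.compare_digest(mac_from_initiator, mac_from_responder))

# Wrong subnet or stale time-step: any changed field changes the transcript hash.
print(t == transcript_hash(b"pk-new", b"pk-trusted", b"other-net", 3, 56_000_000))
print(t == transcript_hash(b"pk-new", b"pk-trusted", b"home-net", 3, 55_999_999))
```

Each of the required negative tests maps to one field here: MITM to the pubkeys, reflection to the role, wrong subnet to the subnet ID, and stale time-step or replay to the time-step and seed epoch.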

## Multi-subnet amendment (2026-06-01 — see ADR-0006)

ADR-0006 makes a node a member of *multiple* subnets. The pairing ceremony is refined (not reversed) accordingly:

- **Discovery takes two inputs: TOTP code + subnet-name.** The code is also the relay rendezvous selector, but a 6-digit code is a weak global selector, so the subnet-name namespaces the rendezvous. This is the **unified rule for all cases** — there is no node-name mode.
- **"Create-new subnet" names the subnet at creation.** One machine generates the new seed *and* runs the R-PAIR-4 naming prompt up front (becoming the sole seed-holder); the joiner then uses that subnet-name + the new seed's code. Naming simply moves to link-start.
- **Rendezvous-token hashing.** The relay routes the pre-trust ceremony by `H(subnet-name ‖ TOTP-epoch)`, not the plaintext label — the user/agent enters the raw name + code; the hash is internal. (Payload is already E2E-encrypted; this protects the relay-visible routing token from leaking the human subnet label.) Composes with the #12 transcript binding (the subnet ID / seed epoch already in the transcript).
- **Per-subnet code fetch is gated behind OS elevation** (Windows UAC / Linux root-or-equivalent) or an elevated agent endpoint; otherwise the user falls back to their authenticator app. This replaces the old unqualified "fetchable from any trusted node" with an elevation-gated, *per-subnet* fetch (a node now holds several subnets' seeds).
- **Code fetch is per-subnet** (a node holds *N* seeds): `spt pair show-totp` prompts which subnet (or `--subnet <name>` / `--create-new` to bypass), offered optionally as a QR / `otpauth://` URI.
- **Forward (cross-user, ADR-0006 seam):** the seed key becomes per-(subnet, user); the prompt becomes 2-stage. The #10 seed epoch+rotation and #11 subnet-global rate-limit requirements apply **per subnet**.
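
The rendezvous-token rule `H(subnet-name ‖ TOTP-epoch)` could be sketched as follows. This assumes `H` is SHA-256, that "TOTP-epoch" means the 30-second time-step counter, and that the name is length-prefixed before concatenation; none of these choices is fixed by this ADR, and the function name is illustrative.

```python
import hashlib
import struct


def rendezvous_token(subnet_name: str, totp_epoch: int) -> str:
    """Relay-visible routing token; the plaintext subnet label never reaches the relay."""
    name = subnet_name.encode("utf-8")
    # Length-prefix the name so (name, epoch) pairs cannot collide by concatenation.
    data = struct.pack(">I", len(name)) + name + struct.pack(">Q", totp_epoch)
    return hashlib.sha256(data).hexdigest()


unix_time = 1_780_000_000           # example timestamp
epoch = unix_time // 30             # assumed meaning of "TOTP-epoch"

token_now = rendezvous_token("home-net", epoch)
token_next = rendezvous_token("home-net", epoch + 1)
token_other = rendezvous_token("work-net", epoch)

print(token_now != token_next)   # the token rotates with the time step
print(token_now != token_other)  # different subnets meet at different tokens
```

Both the joiner and the seed-holder can compute the token locally from the name and the clock, so the user still only enters the raw name and code.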

## Product-surface amendment (2026-06-05 — M7 D3)

<!-- [doc->REQ-SUBNET-2] [doc->REQ-SUBNET-4] -->

M7 gives the ceremony its product surface (`M7-PLAN.md` decisions 8–10); two
postures are ratified as part of this ADR:

- **The responder is always-on in every member daemon.** The broker's
  node-identity endpoint speaks the pre-trust pairing ALPN beside the trusted
  surface (one endpoint, ALPN-separated — the transcript binds/pins the REAL
  node pubkey, and a relay routes one endpoint per key), and a per-subnet
  rendezvous **meet** listener rotates with the TOTP step. There is no arming
  step: this completes the one-sided UX this ADR intended ("a seed-holder
  node must be online and reachable"). The standing-listener exposure is
  guarded by the #11 subnet-global rate limiter, a one-ceremony-at-a-time
  shed at the accept door, and a ceremony timeout aligned with the limiter's
  stale-slot reclaim — an attacker reaching the listener still gets one
  online guess per admitted ceremony, serialized and backed off.
- **Membership mutations are elevation-gated** (the REQ-PAIR-6 machinery,
  exit 3, gate-before-everything): `subnet create` reveals a fresh subnet's
  joining secret, and `subnet join` enrolls the whole machine into a subnet's
  trust fabric — an unprivileged process must do neither (an attacker-run
  joiner would enroll the box into the attacker's subnet). `subnet status`
  stays ungated: read-only, membership shape only, never secrets.
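
The guards on the standing listener (one ceremony at a time, exponential backoff, stale-slot reclaim tied to the ceremony timeout) can be sketched as a small admission gate. The class name `PairingGate` and every numeric value are assumptions, and the sketch covers a single process only; the subnet-global shared counter that #11 requires would need state shared across seed-holding nodes.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PairingGate:
    """Hypothetical per-subnet admission gate for the always-on pairing listener."""
    ceremony_timeout: float = 30.0   # assumed; doubles as the stale-slot reclaim age
    base_backoff: float = 1.0        # assumed, seconds
    max_backoff: float = 3600.0      # assumed cap, seconds
    failures: int = 0
    active_since: Optional[float] = None
    next_allowed: float = 0.0

    def try_admit(self, now: float) -> bool:
        # Reclaim a slot whose ceremony outlived the timeout; count it as a failure.
        if self.active_since is not None and now - self.active_since >= self.ceremony_timeout:
            self.finish(success=False, now=now)
        if self.active_since is not None:
            return False                 # shed: a ceremony is already in progress
        if now < self.next_allowed:
            return False                 # still backing off after failed attempts
        self.active_since = now
        return True

    def finish(self, success: bool, now: float) -> None:
        self.active_since = None
        if success:
            self.failures = 0
            self.next_allowed = now
        else:
            self.failures += 1
            delay = min(self.base_backoff * 2 ** (self.failures - 1), self.max_backoff)
            self.next_allowed = now + delay


gate = PairingGate()
print(gate.try_admit(0.0))     # first ceremony is admitted
print(gate.try_admit(1.0))     # a second concurrent ceremony is shed
gate.finish(success=False, now=2.0)
print(gate.try_admit(2.5))     # inside the backoff delay
print(gate.try_admit(3.0))     # backoff elapsed
```

Because each admitted ceremony allows one password guess, the attacker's guess rate is bounded by how often `try_admit` returns true.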

Surface rename, same semantics: `spt pair show-totp [--subnet|--create-new]`
became three verbs: `spt subnet show-code [NAME]` (named `show-totp` for five
days), `spt subnet create <NAME>`, and `spt subnet join [NAME] [--code]`
(M7 decision 3: verbs, not flags). The "user is prompted to name the subnet"
consequence above is delivered by the guided join/create prompts.
