# M8 acceptance — status & rig checklist

Tracks M8-PLAN §Acceptance (criteria 1–10). M8 code is shipped through
`1123e38` (D1 `bcc58f5` · D2 `2d888fa` · D3 `cb3ac5d` · D4 `fb329ec` ·
D5 `1123e38`); CI green on that head. This file records what is
**bench-verified** (single-host, automatable) vs what needs the
**real-hardware ceremony** (fresh installs, reboots, a second/third machine,
enlyzeam's skewed clock).

## Real-hardware results (2026-06-07, kitsubito Linux + enlyzeam Windows)

Driven over SSH (`reavus@kitsubito` w/ sudo, `decid@enlyzeam` elevated). D4
built on each from source; throwaway subnets cleaned up after.

- **Criterion 2 — fresh Windows install (enlyzeam).** ✅ `install.ps1`
  (local asset base) **upgraded enlyzeam from the public v0.1.0 → D4** and
  fired every leg: sha256 verified, inbound **UDP firewall rule registered**
  (REQ-INSTALL-7), **at-logon daemon scheduled task registered**
  (REQ-INSTALL-8), user PATH set. `daemon status` then rendered
  `peer pump: live (last tick 3s ago)` — the D4 heartbeat writer→reader on
  real hardware (criterion 10 healthy half).

- **Criterion 9 — skewed-clock pairing (enlyzeam, clock +206s ≈ 7 TOTP
  steps off).** ✅ **both arms:**
  - NTP **on** (default): `spt subnet join` from enlyzeam → `JOINED` — the
    in-process NTP offset carried the rendezvous token *and* the ceremony
    step. The exact enlyzeam scenario decision 18 targeted.
  - NTP **off** (`SPT_NTP_SERVER=off` on the daemon, +206s uncorrected) →
    `NO_SEED_HOLDER` ("the rendezvous derives from the time") — proves the
    offset is the carrier, not luck.

- **Criterion 6 — convergence (live pair).** ✅ accept9 peers saw each other:
  `status --nodes` on kitsubito showed KITSUBITO + enlyzeam both `online`.
  (Precise restart-resync *timing* still a rig follow-up.)

- **Criterion 1 — Linux install legs (kitsubito).** ◐ partial. The installer
  legs all passed: unelevated install → ~/.local/bin + PATH + systemd **user
  service started** + **graceful printed sudo hints** for the symlink/linger
  (REQ-INSTALL-6/8); the `sudo ln` symlink made `sudo spt` resolve;
  `loginctl enable-linger` set. NOT a clean criterion-1 pass: kitsubito is a
  *live SPT_DEV node*, so the fresh-account election + reboot-reachable legs
  need a clean user/box (election ladder + de-elevation are unit-covered;
  `/root` confirmed never contaminated). Bonus: `subnet leave` ran
  elevation-gated on both nodes (criterion 5's leave verb).
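
The "≈ 7 TOTP steps" figure under criterion 9 is plain arithmetic, and the sketch below works it through. It assumes the RFC 6238 default step of 30 seconds, which this file does not state, and every name in it (`STEP_SECS`, `totp_counter`, `within_window`) is hypothetical rather than taken from the spt code.

```rust
/// Assumed TOTP step length in seconds (RFC 6238 default; not confirmed by this file).
const STEP_SECS: i64 = 30;

/// TOTP counter for a Unix timestamp: whole steps elapsed since the epoch.
fn totp_counter(unix_secs: i64) -> i64 {
    unix_secs.div_euclid(STEP_SECS)
}

/// True if a peer whose clock runs `skew_secs` ahead still lands within
/// `window` steps of the local counter (a window of 1 is the uncorrected case).
fn within_window(local_unix: i64, skew_secs: i64, window: i64) -> bool {
    let diff = totp_counter(local_unix + skew_secs) - totp_counter(local_unix);
    diff.abs() <= window
}
```

With an arbitrary timestamp such as `1_780_000_000`, `within_window(t, 206, 1)` is false while `within_window(t, 0, 1)` is true. That mirrors the two arms above: an uncorrected +206s skew falls outside a ±1 window, and a clock corrected to zero residual skew lands inside it.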

### Bug found by acceptance (FIXED) — notif_id UNIQUE collision

The `[twohost]` rig run (`e10def4`) passed 11 cross-node rungs, then the
**presence-redirect rung** panicked: `UNIQUE constraint failed:
notifs.notif_id`. Root cause (pre-existing, D4 only shifted the timing that
exposed it): a `notif_id` is `node_hex:epoch`, but the pump's **consent-notif
producer minted from `pump-ops.json`** (the stream-open op counter) while the
CLI notify front door + presence producer mint from the **canonical** per-node
epoch (`identity/epoch.json`). Two independent counters, one id namespace → on
a node where both sat low, the consent notif and a presence/notify notif
collided on `node:1`. Long-lived nodes (gravity, where it last passed) had
diverged counters that masked it. **Fix:** the consent notif now mints from
the canonical epoch shared with the registry leg (`RegistryHost::with_epoch`),
so every daemon-side per-node-id producer uses the one counter. Regression
unit: `registryhost::tests::with_epoch_is_the_one_canonical_counter`.
Re-run of the `[twohost]` ladder pending (quiet rig).
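
The collision can be reproduced in miniature. The sketch below is illustrative only: `Counter`, `mint`, `first_collision`, and the node hex are hypothetical stand-ins, not the project's types. It shows why two independent counters feeding one `node_hex:epoch` namespace collide when both sit low, and why a single shared counter cannot.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for a persisted per-node counter
/// (such as `pump-ops.json` or `identity/epoch.json`).
struct Counter(u64);

impl Counter {
    /// Advance the counter and return the new value.
    fn next(&mut self) -> u64 {
        self.0 += 1;
        self.0
    }
}

/// Mint an id in the `node_hex:epoch` shape described above.
fn mint(node_hex: &str, counter: &mut Counter) -> String {
    format!("{}:{}", node_hex, counter.next())
}

/// Model the UNIQUE constraint with a set; return the first id that
/// would violate it, if any.
fn first_collision(ids: &[String]) -> Option<String> {
    let mut seen = HashSet::new();
    for id in ids {
        if !seen.insert(id.as_str()) {
            return Some(id.clone());
        }
    }
    None
}

/// Pre-fix shape: the consent producer and the presence producer
/// each advance their own counter.
fn two_counters(node: &str) -> Vec<String> {
    let mut pump_ops = Counter(0);
    let mut epoch = Counter(0);
    vec![mint(node, &mut pump_ops), mint(node, &mut epoch)]
}

/// Post-fix shape: every producer mints from the one canonical counter.
fn one_counter(node: &str) -> Vec<String> {
    let mut epoch = Counter(0);
    vec![mint(node, &mut epoch), mint(node, &mut epoch)]
}
```

Feeding `two_counters("ab12")` to `first_collision` reports `ab12:1`, the same `node:1` shape as the panic; `one_counter("ab12")` reports no collision. Counters that had already diverged (the long-lived-node case) would also pass the first check, which is how the bug stayed hidden.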

## Bench-verified on HFENDULEAM (2026-06-07)

Isolated `SPT_HOME` throwaway homes — the live owl listener was never touched.

- **Criterion 4 — old verbs gone / hot path intact.** ✅
  `spt fork|suspend|wake|rename|digest|resources|access|shutdown|stop` →
  *unrecognized subcommand*. `spt endpoint fork`, `spt subnet notify`,
  `spt endpoint description`, `spt endpoint list --detail` exist. Hot path
  (`send`/`ring`/`ready`/`whoami`/`how-to`) parses unchanged.
  `spt daemon` → `run|stop|status`. Docs drift gate (`xtask check`) green;
  D5 sweep verified no page references a moved verb's old home.

- **Criterion 8 — status honesty.** ✅
  `spt subnet status`, daemon **down**, zero subnets →
  *"No subnets registered — this node is standalone."* + *"The daemon is NOT
  running…"*; never implies messaging works. Hint footer appears on bare
  `spt subnet` only; `spt subnet status` carries none. (Unit:
  `subnet_status_renders_rows_and_hints_never_secrets`.)

- **Criterion 6 tooling — `xtask debug-converge`.** ✅ (shape)
  Missing `--version` → exit **2** (usage); `--version` with no trust rows →
  exit **2** (no expected nodes). Status-only wire + classifier proven
  end-to-end on loopback
  (`tests/propagate.rs::status_query_drives_the_convergence_table_end_to_end`:
  Pending → StagedAwaitingConsent → Applied, plus NotPinned / Rejected /
  untrusted-all-None). The *seconds-not-60s on the rig* half is criterion 6
  proper (below).

- **Criterion 10 mechanism — pump liveness.** ✅ (unit) / ⏳ (live)
  Heartbeat write + advance, supervised restart after injected panic, and the
  capped-backoff ladder are unit-covered (`peerloop` tests). `spt daemon
  status` renders *live / STALLED / no heartbeat recorded* (the D1 reader over
  the D4 writer). The **live** half — kill/stall the pump task on a running
  daemon and observe the STALLED render + `PEER_PUMP_RESTART` log line — is a
  rig step (can't induce a half-dead daemon on the bench without disturbing
  the live networked endpoint).

- **Cross-node regression — `[twohost]` ladder.** ⏳ running
  Pushed `e10def4` ([twohost]) on the D4 binary; CI ladder (pair → register →
  message → drive → sync → update → notif) over the real HFENDULEAM↔kitsubito
  path proves D4 did not regress cross-node. (Static `ip:port`, so it is a
  regression guard, not the criterion-6 convergence-timing proof.)

## Needs the real-hardware ceremony (user-driven)

These require fresh installs / reboots / elevation prompts / a second or
third machine / enlyzeam's skewed clock — not performable from this bench.
Deploy a **D4 binary** to each rig node first (any new binary also self-heals
hfenduleam's v0.1.0-dead pump — D4 makes that class supervised + visible).

1. **Fresh kitsubito Linux install** (clean user, real one-liner):
   `sudo spt subnet …` resolves (symlink leg) and prompts the default-account
   election **exactly once**; daemon + state land under the elected account
   (never root); node is reachable **after a reboot** before any manual `spt`
   (systemd user unit + linger). → criteria 1, 3 (Linux de-elevation live).

2. **Fresh Windows install** (elevated leg): inbound works with zero
   accumulated dev firewall rules — a join succeeds where M7's
   NO_SEED_HOLDER failed. Then the **unelevated-install** path: `spt subnet
   status` renders the blocked-inbound state honestly. → criteria 2, 3.

5. **attach/detach + prune** on a live pair: `spt subnet detach <n>` makes a
   peer observe this node go offline for that subnet; `attach` flips it back;
   `spt subnet prune <node>` removes BIGNET's two `09ef831e` trust rows and
   the dead dials stop.

6. **Convergence in seconds** (the criterion proper): on the rig, a daemon
   restart on one node → the peer reappears in `status --nodes` in seconds
   (post-restart addr-seed reload, not the 60s id-only cold start); an
   endpoint flipping online/offline is visible in the peer's `status --nodes`
   within seconds (event-driven advertisement). Evidence via
   `cargo run -p xtask -- debug-converge --version <N>` against a staged debug
   rollout, or by timing `status --nodes` directly.

7. **Re-pair proof**: regenerate a rig node's identity, re-join with the same
   label → the subnet ends with **one** trust row + **one** registry identity
   for that machine (auto-evicted at ceremony, no manual prune). Sub-check
   (hazard 4.11): the re-paired node's fresh epochs are **not** dropped as
   stale (the evicted row took the peer-side epoch memory with it).

9. **Skewed-clock pairing** (the NTP trigger): pair from **enlyzeam** (clock
   ≥1 min off) — the ceremony completes via the NTP TOTP offset. With NTP
   blocked (`SPT_NTP_SERVER=off`), behavior matches today's (system clock,
   ±1 window) — i.e. a >1-min skew fails closed, proving the offset is what
   carries enlyzeam.

10. **Pump-death honesty** (live half): kill/stall the pump on a running rig
    daemon → `spt daemon status` + `spt subnet status` render the stall
    (never implied-healthy); a panicking pump restarts with capped backoff
    (the `PEER_PUMP_RESTART: supervised restart in <n>s` log line).
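
For step 10, it helps to know what a capped-backoff ladder looks like before reading the restart log on the rig. The sketch below is one possible shape, not the `peerloop` implementation: the doubling schedule, the 1s base, and the 60s cap are all assumptions, and `restart_delay` / `restart_log_line` are hypothetical names. Only the log-line text is taken from this file.

```rust
use std::time::Duration;

/// Hypothetical base delay and cap; the real ladder's values are not stated in this file.
const BASE: Duration = Duration::from_secs(1);
const CAP: Duration = Duration::from_secs(60);

/// Delay before restart attempt `attempt` (0-based): doubles on each
/// attempt and never exceeds CAP.
fn restart_delay(attempt: u32) -> Duration {
    // checked_shl yields None once the shift reaches 64 bits;
    // saturating_mul guards the multiplication against overflow.
    let factor = 1u64.checked_shl(attempt).unwrap_or(u64::MAX);
    let secs = BASE.as_secs().saturating_mul(factor);
    Duration::from_secs(secs).min(CAP)
}

/// Render a log line in the shape quoted in step 10.
fn restart_log_line(attempt: u32) -> String {
    format!(
        "PEER_PUMP_RESTART: supervised restart in {}s",
        restart_delay(attempt).as_secs()
    )
}
```

Under these assumed values, successive panics would log delays of 1s, 2s, 4s, and so on until the cap holds every later restart at 60s. On the rig, the thing to confirm is that the logged `<n>` grows and then plateaus rather than growing without bound.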

## On close

After the rig criteria pass: add the **M8 delivered row to ROADMAP.md**
(the delivered-only pattern — M6/M7 precedent), then milestone close. Next:
`spt-claude-code` scoping (separate repo — the first real adapter; M8 was the
last cheap breaking-rename window).
