# M3 Plan — Terminal wrapper + daemon + self-update

> **Just-in-time, lightweight** — same pattern as `M0/M1/M2a/M2b-PLAN.md`. The
> ordered task layer between `ROADMAP.md` (milestone sequence) and
> `traceable-reqs.toml` (requirement checklist). Honors ROADMAP §"lightweight but
> structured, GSD too heavy". Branch: `dev-freeform`.

> **M3 is split** (decision 2026-06-01): the milestone carries three large,
> separable deliverables plus a mandatory spike-gate. Building them as one
> undifferentiated task list would bury the broker/brain boundary the whole
> architecture rests on. The split mirrors M2a/M2b and ADR-0004's own layering:
> - **Phase 0 — spike-gate** (BLOCKING): close ADR-0004 §E before building the daemon.
> - **M3a — `spt-term`**: the broker's PTY/session-surface half (REQ-TERM-*).
> - **M3b — `spt-daemon`**: the broker/brain split + consolidation; re-hosts the
>   M2b lifecycle (REQ-DAEMON-*, REQ-START-3).
> - **M3c — self-update**: the gated, signed, ripple update engine (REQ-UPD-*,
>   REQ-SEAM-UPDATE). Peer-propagation *transport* defers to M4 (needs P2P).

## Goal

Replace the interim no-daemon model with the **real architecture** (ADR-0004):
one per-machine `spt-daemon` (broker/brain split) that owns all per-machine state
and re-hosts the M2b lifecycle as in-process loops with daemon-authoritative
liveness; a `spt-term` broker hosting PTYs behind a versioned local IPC; and a
**seamless self-update** that swaps the brain without any endpoint noticing. This
is where the two surviving FATALs land (ADR-0004 daemon, the broker↔brain
boundary) — the milestone that earns every interim seam its keep.

**M3 done =** the daemon hosts the LiveAgent lifecycle (Psyche-as-loop, no
`api listen` process, no seed file); `spt-term` hosts a real PTY with the
ConPTY-DSR hazard closed; a brain-only self-update swaps the brain with **zero**
endpoint interruption (the no-terminate invariant, proven E2E); adapters
ripple-update behind it; `cargo test --workspace` + clippy `-D warnings` green;
all M3-phase reqs activated; CI matrix (ubuntu + windows) green.

## Why now (what M0–M2b deferred here)

Every interim mechanism M1/M2a/M2b shipped is a documented swap point — M3 swaps
the *backing*, not the *shape*:

| Interim (M1/M2a/M2b) | M3 swap | Req / hazard |
|---|---|---|
| `api listen` process holds perch + relays | daemon-hosted loop + thin killable relay (topology 1) | REQ-DAEMON-1, ADR-0004 |
| seed-file bridge (`spt-store::seed`) | in-memory seed; spt-hosted spawn→bind (no file) | REQ-START-3 |
| interim pulse in the listen runner (`spt-live::pulse`) | daemon Psyche-loop with durable, configurable period | REQ-DAEMON-1 |
| per-pid Psyche liveness (`spt-live::psyche`) | daemon-authoritative `status` field | REQ-HAZARD-DAEMON-HOSTED-LIVENESS (2.5) |
| graceful-only signoff (`spt-live::signoff`) | always-on orphan-watch reusing the *same* grace/echo ordering | hazards 1.1 / 3.3 (supervised path) |
| `[inject]` `pty` method dropped (M2a) | real PTY injection (send-keys / send-line) | REQ-TERM-2 |

## Scope

### Phase 0 — spike-gate (✅ **COMPLETE 2026-06-01**; `spt-spikes/`, not in-repo crates)

> All four ADR-0004 §E gaps closed **PASS**: Spike #3 (QUIC survival), #4 (forkpty
> parity), #5 (100× restart + resize), #6 (idempotent boundary). Docs in
> `docs/spikes/SPIKE-0{3,4,5,6}-*.md`; ADR-0004 §E updated. M3a + M3b unblocked.

ADR-0004 §E: four gaps **must close before M3 builds the daemon**. Spikes live in
`spt-spikes/` (the sandbox, per Spike #1/#2), not the spt-core workspace.

- **Spike #3 — QUIC + file-transfer stream survival across brain restart**
  (FATAL #2; *highest remaining risk in the whole roadmap*). Spike #1 proved
  PTY + plain TCP; QUIC carries process-local crypto/stream state. Stand up an
  iroh endpoint + an active QUIC stream + an in-flight file transfer in the
  broker, restart the brain, prove the stream survives. **De-risks the broker's
  ownership table (ADR-0004 §B) before the daemon is built around it.** Uses iroh
  in the sandbox only — M3 ships no networking; the *shape* is what's validated.
- **Spike #4 — Linux `forkpty` parity.** Spike #1 was Windows-only (the high-risk
  ConPTY path). Rerun the broker handoff on Linux `forkpty`.
- **Spike #5 — 100× restart + resize-under-load stress.** Spike #1 did one
  restart; harden before trusting the broker as the stable kernel.
- **Spike #6 — idempotent delivery across the broker↔brain boundary** (#14):
  prototype durable IDs + replay rules at each side-effect boundary (spool, PTY
  write, transfer, registry) — the design input for `REQ-HAZARD-RESTART-IDEMPOTENT`.

**Gate rule:** M3b does not start until #3 and #6 return a verdict. #4/#5 may run
in parallel with M3a. A spike that *falsifies* ADR-0004 §B re-opens the ADR
before any daemon code (cheaper than discovering it in the build).

### M3a — `spt-term` (the broker's PTY / session-surface half)

> **Detailed JIT plan: [`M3a-PLAN.md`](./M3a-PLAN.md).** Folds in **REQ-TERM-4**
> (live activity buffer / PTY digest, ADR-0008 — the parser primitive is M3a; its
> manifest-sourcing + `spt digest` + delta-stream + persistence are M3b).

**In:** the `spt-term` crate (sibling of `spt-runtime`, above `spt-msg`):
- the **process-supervisor terminal wrapper** hosting broker PTYs (`REQ-TERM-1`)
  — ConPTY on Windows / `forkpty` on Unix behind one `SessionSurface` trait;
- the **session-surface abstraction** with `send-keys` + `send-line` injection
  (`REQ-TERM-2`) — the real PTY channel the M2a `[inject]` `pty` method (dropped
  then) now resolves to;
- **byte-stream remote terminal streaming** for v1 (`REQ-TERM-3`) — the local
  stream shape (the off-node hop is M4 transport);
- the **ConPTY-DSR auto-answer** (`REQ-HAZARD-CONPTY-DSR` / 5.5) — every ConPTY
  reader answers `ESC[6n` → cursor report; reader drains on a thread, never gates
  exit on a blocking `read()`. (Pattern already proven in Spike #1.)

**Out:** the broker *process* + IPC (that's M3b — `spt-term` provides the PTY
*mechanism* the broker hosts); off-node terminal transport (M4).
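
The DSR auto-answer described above could look roughly like the following. It is a sketch, not the Spike #1 code: `DsrScanner`, `cursor_report`, and `drain` are hypothetical names, and answering with a fixed cursor position of row 1, column 1 is an assumption made to keep the example small. The query `ESC[6n` and the reply form `ESC[row;colR` are the standard Device Status Report / Cursor Position Report sequences.

```rust
use std::io::{Read, Write};

/// Device Status Report query: "report cursor position" (ESC [ 6 n).
const DSR_QUERY: &[u8] = b"\x1b[6n";

/// Scans PTY output for DSR queries, even when one is split across reads.
#[derive(Default)]
pub struct DsrScanner {
    matched: usize, // how many bytes of DSR_QUERY have matched so far
}

impl DsrScanner {
    /// Feed one chunk of PTY output; returns how many complete queries ended in it.
    pub fn feed(&mut self, chunk: &[u8]) -> usize {
        let mut hits = 0;
        for &b in chunk {
            if b == DSR_QUERY[self.matched] {
                self.matched += 1;
                if self.matched == DSR_QUERY.len() {
                    hits += 1;
                    self.matched = 0;
                }
            } else {
                // Restart; ESC only occurs at position 0, so this byte can
                // at most begin a new query.
                self.matched = if b == DSR_QUERY[0] { 1 } else { 0 };
            }
        }
        hits
    }
}

/// Cursor Position Report (ESC [ row ; col R), 1-based.
pub fn cursor_report(row: u16, col: u16) -> Vec<u8> {
    format!("\x1b[{};{}R", row, col).into_bytes()
}

/// Drain loop intended to run on its own thread: copies output into `sink`
/// and answers each DSR query on the PTY input side. Returns at end of stream.
pub fn drain<R: Read, W: Write>(
    mut output: R,
    mut input: W,
    sink: &mut Vec<u8>,
) -> std::io::Result<()> {
    let mut scanner = DsrScanner::default();
    let mut buf = [0u8; 4096];
    loop {
        let n = output.read(&mut buf)?;
        if n == 0 {
            return Ok(());
        }
        sink.extend_from_slice(&buf[..n]);
        for _ in 0..scanner.feed(&buf[..n]) {
            // Assumption: a fixed position is enough to unblock the child.
            input.write_all(&cursor_report(1, 1))?;
        }
    }
}
```

Running `drain` on a dedicated thread keeps the blocking `read()` off the exit path, which is the second half of the hazard.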

### M3b — `spt-daemon` (broker/brain split + consolidation)

**In:** the `spt-daemon` crate (above `spt-live`, below `spt`):
- **broker** (stable kernel) — holds only un-transferable resources per the
  ADR-0004 §B ownership table: PTY master handles (`spt-term`), spawned harness
  child processes, accepted local client sockets, listening sockets. Minimal
  **versioned local IPC** (`REQ-HAZARD-HANDOFF-ARGV-COMPAT` / 2.3 — newer brain ↔
  older broker).
- **brain** (userspace) — all logic: routing, registry, **the re-hosted M2b
  lifecycle** (Psyche-as-loop + the durable pulse period replacing the interim
  glue), manifest parse. Restarts freely; rehydrates from disk and re-attaches to
  broker-held handles. `gen_start = now()` on cold-start **and** handoff
  (`REQ-HAZARD-GEN-START-NOW` / 2.4).
- **consolidation** (`REQ-DAEMON-1`) — poll-listener + Psyche/pulse loops move
  into the daemon; the only residue is the thin **stateless in-session relay**
  for harness-hosted sessions (topology 1), freely killable.
- **daemon-authoritative liveness** (`REQ-HAZARD-DAEMON-HOSTED-LIVENESS` / 2.5) —
  the daemon's `status` field supersedes `is_process_alive(info.pid)`; the M2b
  per-pid checks, already gated behind the interim seam, are the code that gets swapped.
- **idempotent delivery** (`REQ-HAZARD-RESTART-IDEMPOTENT` / 7.2) — durable IDs +
  replay rules at every broker↔brain side-effect boundary (from Spike #6).
- **auto-start** (`REQ-DAEMON-3`) — any `api` invocation auto-starts the daemon
  if absent.
- **spt-hosted startup** (`REQ-START-3`) — spawn-session then `api bind`, in-memory
  seed, no seed file (swaps the M2a interim `spt-store::seed`).
- **orphan-watch** — carry the M2b graceful ordering (grace 1.1 / echo 3.3 /
  stale-sweep 3.2) into the **supervised crash** path the always-on daemon now
  enables (M2b delivered graceful-only).

**Out:** networking/P2P (M4); the QUIC-stream broker ownership *implementation*
(M4 — but its *shape* validated by Spike #3 now).
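
The idempotent-delivery bullet (durable IDs + replay rules) can be sketched as at-least-once replay plus deduplication at the side-effect boundary. This is an illustration under stated assumptions, not the Spike #6 design: `SideEffectBoundary`, `Outcome`, and `replay` are hypothetical names, and the in-memory `HashSet` stands in for whatever durable ID log the real boundary would use.

```rust
use std::collections::HashSet;

/// Result of offering one side-effect to the boundary.
#[derive(Debug, PartialEq, Eq)]
pub enum Outcome {
    Applied,
    Duplicate,
}

/// One broker-brain side-effect boundary (spool, PTY write, registry, ...).
pub struct SideEffectBoundary {
    applied_ids: HashSet<u64>, // stand-in for a durable ID log
    pub effects: Vec<String>,  // the side-effects that actually happened
}

impl SideEffectBoundary {
    pub fn new() -> Self {
        SideEffectBoundary {
            applied_ids: HashSet::new(),
            effects: Vec::new(),
        }
    }

    /// Apply `payload` at most once per durable `id`.
    pub fn deliver(&mut self, id: u64, payload: &str) -> Outcome {
        if !self.applied_ids.insert(id) {
            return Outcome::Duplicate;
        }
        self.effects.push(payload.to_string());
        Outcome::Applied
    }
}

/// Replay rule after a brain restart: re-send everything not yet acknowledged
/// and let the boundary drop what it already applied. Returns the number of
/// side-effects newly applied.
pub fn replay(boundary: &mut SideEffectBoundary, unacked: &[(u64, &str)]) -> usize {
    let mut applied = 0;
    for (id, payload) in unacked {
        if boundary.deliver(*id, payload) == Outcome::Applied {
            applied += 1;
        }
    }
    applied
}
```

The sketch records the ID and applies the effect in one step; a real boundary would also need those two to be atomic with respect to a crash, or the effect itself to be safe to repeat.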

### M3c — self-update (gated, signed, ripple)

**In:**
- the **update-class taxonomy** (ADR-0004 §A): **brain-only** (no endpoint
  terminates — `REQ-UPD-3`, absolute, the routine case, proven E2E),
  **broker-compatible** (hot-swap behind the versioned IPC), **broker-breaking**
  (planned endpoint-cycle, may suspend with consent + scheduling);
- **signature verification before handoff** (`REQ-UPD-2`) — spt-core's release
  key, on every binary regardless of source;
- **update hardening** (`REQ-HAZARD-UPDATE-ROLLBACK`, ADR-0004 §D) — monotonic
  version (rollback rejection), release-metadata expiry, channel pinning, key
  rotation/revocation list, **adapter content signing**;
- **consent gating** (`REQ-UPD-4`) — gated on user confirmation delivered to the
  most-recently-active live session; opt-in full-auto;
- **adapter ripple-update** (`REQ-UPD-5` + `REQ-SEAM-UPDATE`) — spt-core conducts
  the stack: self first, then each registered adapter via its manifest `[update]`
  avenue (file-pull / delegated command), with adapter content signing.

**Out:**
- **Peer-propagation transport over P2P** (`REQ-UPD-1`) — needs the M4 network
  layer. M3c ships the update *engine* with **self-fetch / out-of-band** delivery
  (ADR-0004: "layered on self-fetch, out-of-band still supported"); peer-propagation
  is the M4 transport plugged into the same engine. `REQ-UPD-1` activates at M4.
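
The update-class taxonomy can be read as a small classifier plus a dispatch. The sketch below is an assumption about how that might be modeled: `ReleaseDelta` and its two flags, `Action`, and the function names are hypothetical, and treating broker-compatible as endpoint-preserving is inferred from "hot-swap", not stated by ADR-0004 here.

```rust
/// The three update classes from the taxonomy.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum UpdateClass {
    BrainOnly,
    BrokerCompatible,
    BrokerBreaking,
}

/// Hypothetical summary of what a candidate release changes.
pub struct ReleaseDelta {
    pub broker_changed: bool,
    pub ipc_compatible: bool,
}

/// Classify a release. An incompatible IPC is treated as broker-breaking
/// even if the broker binary is unchanged (assumption of this sketch).
pub fn classify(delta: &ReleaseDelta) -> UpdateClass {
    if !delta.ipc_compatible {
        UpdateClass::BrokerBreaking
    } else if delta.broker_changed {
        UpdateClass::BrokerCompatible
    } else {
        UpdateClass::BrainOnly
    }
}

/// What the engine does for each class.
#[derive(Debug, PartialEq, Eq)]
pub enum Action {
    SwapBrain,      // no endpoint terminates
    HotSwapBroker,  // behind the versioned IPC
    CycleEndpoints, // planned, with consent + scheduling
}

pub fn dispatch(class: UpdateClass) -> Action {
    match class {
        UpdateClass::BrainOnly => Action::SwapBrain,
        UpdateClass::BrokerCompatible => Action::HotSwapBroker,
        UpdateClass::BrokerBreaking => Action::CycleEndpoints,
    }
}

/// Only the broker-breaking class is allowed to interrupt endpoints.
pub fn may_terminate_endpoints(class: UpdateClass) -> bool {
    matches!(class, UpdateClass::BrokerBreaking)
}
```

Keeping `may_terminate_endpoints` as its own predicate gives the brain-only E2E a single invariant to assert against.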

## Clean-room posture (validated against the sister)

- **`spt-term` is brand-new** — the sister never hosted ConPTY/PTY directly (it
  shelled out to `claude`). Clean-room the whole crate; the only copied artifact
  is the **ConPTY-DSR mechanism** proven in Spike #1 (harness-orthogonal).
- **`spt-daemon` consolidation** — copy the sister's hard-won **ordering /
  liveness invariants** (orphan grace 1.1, echo 3.3, stale-sweep 3.2, stable-PID
  anchor 5.1, gen_start 2.4) and clean-room the process model (the sister's
  separate listener + wrapper processes collapse into the one daemon — ADR-0004).
- **self-update** — clean-room (the sister used a GitHub remote + `gh`; spt-core
  self-fetches + verifies signatures, P2P transport at M4 — ADR-0002/0004).

## New requirements to register first (TRACEABILITY rule 3)

**Assessment — none.** Every M3 deliverable maps onto already-registered reqs:
- daemon → `REQ-DAEMON-1/2/3/4`; terminal → `REQ-TERM-1/2/3`; startup →
  `REQ-START-3`; update → `REQ-UPD-2/3/4/5`, `REQ-SEAM-UPDATE`; `REQ-UPD-1`
  (peer-propagation) stays `[]` until M4.
- hazards → `REQ-HAZARD-CONPTY-DSR` (5.5), `REQ-HAZARD-DAEMON-HOSTED-LIVENESS`
  (2.5), `REQ-HAZARD-RESTART-IDEMPOTENT` (7.2), `REQ-HAZARD-HANDOFF-ARGV-COMPAT`
  (2.3), `REQ-HAZARD-GEN-START-NOW` (2.4), `REQ-HAZARD-UPDATE-ROLLBACK` — all
  registered; activate per task.
- `REQ-DAEMON-4` ("honor every KNOWN-HAZARDS invariant") is the conformance
  umbrella — activate last (M3 close), evidenced by the hazard suite being green.

## Sequencing rationale

Spike-gate first (§E gaps falsify-or-confirm the broker shape *before* code).
Then `spt-term` (the PTY mechanism the broker hosts — buildable in parallel with
spikes #4/#5). Then `spt-daemon`: broker skeleton (host `spt-term` PTYs +
versioned IPC) → brain skeleton (rehydrate + re-attach) → **re-host the M2b
lifecycle** (the load-bearing consolidation) → daemon liveness → idempotent
boundary → orphan-watch → spt-hosted startup → auto-start. Self-update last (it
rests on the broker/brain split being real): update-class engine → signature +
rollback hardening → consent gating → adapter ripple. Each task tags evidence +
activates its reqs in the same commit; `traceable-reqs check` green before the next.

## Tasks — Phase 0 spike-gate (`spt-spikes/`)

| # | Task | Source | Gate | Acceptance |
|---|------|--------|------|------------|
| S3 | **Spike #3 — QUIC + file-transfer survival across brain restart** | ADR-0004 §E.1 (FATAL #2) | blocks M3b | an active QUIC stream + in-flight transfer in the broker survive a brain restart; or ADR-0004 §B re-opened |
| S4 | **Spike #4 — Linux `forkpty` parity** | ADR-0004 §E.2 | blocks M3a close | broker PTY handoff passes on Linux `forkpty` (mirror Spike #1) |
| S5 | **Spike #5 — 100× restart + resize-under-load** | ADR-0004 §E.3 | blocks M3b close | 100 brain restarts + concurrent resizes, no leak/hang/lost-byte |
| S6 | **Spike #6 — idempotent broker↔brain delivery** | ADR-0004 §E.4 | blocks M3b idempotency task | durable-ID + replay rule prototype: crash before/after each side-effect, no dup/drop |

## Tasks — M3a `spt-term`

| # | Task | Reqs / hazards | Acceptance |
|---|------|----------------|------------|
| A0 | Scaffold `crates/spt-term` (sibling of spt-runtime, above spt-msg); deps proto/store/msg | — | compiles; layering acyclic; `check` green |
| A1 | `SessionSurface` trait + ConPTY (win) / `forkpty` (unix) backends; spawn a child under a PTY | REQ-TERM-1 | a child runs under a real PTY on both OSes; output captured |
| A2 | **ConPTY-DSR auto-answer**: reader answers `ESC[6n`→cursor report on a drain thread | REQ-HAZARD-CONPTY-DSR (5.5) | a ConPTY child's stdout is NOT withheld; DSR query auto-answered (regression test) |
| A3 | `send-keys` + `send-line` injection over the surface | REQ-TERM-2 | injected keys/lines reach the child; ordering preserved |
| A4 | Byte-stream local terminal streaming (v1 shape) | REQ-TERM-3 | a consumer streams the child's byte output; backpressure bounded |
| A5 | Activation sweep (REQ-TERM-1/2/3 + CONPTY-DSR) | — | `check` green; CI matrix green |
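
For tasks A1 and A3, one possible shape for the injection half of the surface is sketched below. Everything here is illustrative: the real `SessionSurface` trait may carry more methods (spawn, resize, read), `MemorySurface` and `inject_lines` are hypothetical helpers, and terminating a line with a carriage return is an assumption about what "Enter" means for the hosted child.

```rust
use std::io;

/// Injection half of a session surface (sketch; the real trait may differ).
pub trait SessionSurface {
    /// Write raw key bytes to the child's terminal input, in order.
    fn send_keys(&mut self, keys: &[u8]) -> io::Result<()>;

    /// Send a line followed by Enter. Using a carriage return as the
    /// terminator is an assumption of this sketch.
    fn send_line(&mut self, line: &str) -> io::Result<()> {
        self.send_keys(line.as_bytes())?;
        self.send_keys(b"\r")
    }
}

/// In-memory backend that records what a child would have received.
/// A ConPTY or forkpty backend would write to the PTY master instead.
#[derive(Default)]
pub struct MemorySurface {
    pub received: Vec<u8>,
}

impl SessionSurface for MemorySurface {
    fn send_keys(&mut self, keys: &[u8]) -> io::Result<()> {
        self.received.extend_from_slice(keys);
        Ok(())
    }
}

/// Inject several lines through any backend, preserving order.
pub fn inject_lines(surface: &mut dyn SessionSurface, lines: &[&str]) -> io::Result<()> {
    for line in lines {
        surface.send_line(line)?;
    }
    Ok(())
}
```

Building `send_line` on top of `send_keys` means each backend implements one write path, so the ordering guarantee in A3 only has to be proven once per OS.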

## Tasks — M3b `spt-daemon`

| # | Task | Reqs / hazards | Acceptance |
|---|------|----------------|------------|
| B0 | Scaffold `crates/spt-daemon` (above spt-live, below spt) | — | compiles; layering acyclic |
| B1 | **Broker** skeleton: host `spt-term` PTYs + spawned children + sockets; minimal **versioned** local IPC | REQ-DAEMON-2; REQ-HAZARD-HANDOFF-ARGV-COMPAT (2.3) | newer brain talks to older broker across an IPC version bump |
| B2 | **Brain** skeleton: rehydrate from disk + re-attach to broker handles; `gen_start=now()` cold-start & handoff | REQ-DAEMON-2; REQ-HAZARD-GEN-START-NOW (2.4) | brain restart re-attaches; fresh gen_start each start |
| B3 | **Re-host the M2b lifecycle** into the brain: Psyche-as-loop + durable pulse period (replace `spt-live::pulse` interim glue + the `api listen` host) | REQ-DAEMON-1 | the daemon hosts the lifecycle; the interim `api listen` host path retired |
| B4 | **Daemon-authoritative liveness**: `status` field supersedes per-pid checks | REQ-HAZARD-DAEMON-HOSTED-LIVENESS (2.5) | a daemon-hosted endpoint reports live with no dedicated pid; M2b per-pid seam swapped |
| B5 | **Idempotent delivery** across the broker↔brain boundary (durable IDs + replay) | REQ-HAZARD-RESTART-IDEMPOTENT (7.2) | crash before/after spool/PTY/registry side-effects → exactly-once |
| B6 | **Orphan-watch**: supervised crash path reuses M2b grace/echo ordering | hazards 1.1 / 3.3 / 3.2 (supervised) | an orphaned Self after harness death is grace-rechecked + echo-before-signoff |
| B7 | **spt-hosted startup**: spawn-session→`api bind`, in-memory seed (no file) | REQ-START-3 | a spt-hosted session binds with no seed file |
| B8 | **Auto-start**: any `api` call auto-starts the daemon if absent | REQ-DAEMON-3 | a cold `api` call launches the daemon then proceeds |
| B9 | Daemon lifecycle E2E + activation (DAEMON-1/2/3, START-3, the 4 hazards) | — | E2E: brain restart preserves a live PTY session; `check` green |
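
The B1 acceptance (a newer brain talking to an older broker across an IPC version bump) suggests some form of version negotiation. A minimal sketch, assuming each side advertises an inclusive range of protocol versions it can speak; `IpcVersions` and `negotiate` are hypothetical names, and range-based negotiation is one option among several, not the decided design.

```rust
use std::cmp::min;

/// Inclusive range of IPC protocol versions one side can speak.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct IpcVersions {
    pub min: u16,
    pub max: u16,
}

/// Pick the highest version both sides support, or `None` if the ranges
/// do not overlap (which would make the update broker-breaking).
pub fn negotiate(brain: IpcVersions, broker: IpcVersions) -> Option<u16> {
    let candidate = min(brain.max, broker.max);
    if candidate >= brain.min && candidate >= broker.min {
        Some(candidate)
    } else {
        None
    }
}
```

For example, a brain speaking versions 1 to 3 and a broker speaking 1 to 2 would settle on version 2, while a brain that dropped support below version 3 could not attach to that broker at all.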

## Tasks — M3c self-update

| # | Task | Reqs / hazards | Acceptance |
|---|------|----------------|------------|
| C0 | Update-class engine: brain-only / broker-compatible / broker-breaking dispatch | ADR-0004 §A | each class routed to its invariant; brain-only never touches endpoints |
| C1 | **Signature verify before handoff** (release key) + monotonic version / expiry / channel-pin / revocation | REQ-UPD-2; REQ-HAZARD-UPDATE-ROLLBACK | unsigned/older/expired/revoked rejected; valid swap proceeds |
| C2 | **Brain-only swap E2E — the no-terminate invariant** | REQ-UPD-3 | a brain-only update swaps the brain with ZERO endpoint interruption (live PTY survives) |
| C3 | **Consent gating**: confirm to most-recently-active live session; opt-in full-auto | REQ-UPD-4 | update blocks on consent by default; full-auto opt-in honored |
| C4 | **Adapter ripple-update** via manifest `[update]` avenue + adapter content signing | REQ-UPD-5; REQ-SEAM-UPDATE | self updates then each registered adapter ripples; adapter payload signature-checked |
| C5 | Activation sweep: REQ-UPD-2/3/4/5 + SEAM-UPDATE + UPDATE-ROLLBACK; **REQ-DAEMON-4** (hazard-suite umbrella); amend ROADMAP (M3 delivered) + CONTEXT/M3 notes; author M4 plan stub | — | `check` green with all M3 reqs activated; CI matrix green |
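
The C1 acceptance (unsigned / older / expired / revoked rejected; valid swap proceeds) can be sketched as a single admission check run before handoff. The field names, the `Reject` variants, and the check order are assumptions; `signature_ok` is a placeholder for a real signature verification against the release key, which this sketch deliberately does not implement.

```rust
/// Why a candidate release was refused.
#[derive(Debug, PartialEq, Eq)]
pub enum Reject {
    RevokedKey,
    BadSignature,
    WrongChannel,
    Expired,
    Rollback,
}

/// Hypothetical release metadata accompanying a binary.
#[derive(Clone, Copy)]
pub struct ReleaseMeta<'a> {
    pub version: u64,
    pub expires_at: u64, // unix seconds
    pub channel: &'a str,
    pub key_id: &'a str,
    pub signature_ok: bool, // placeholder for real verification
}

/// Hypothetical local policy the daemon checks against.
pub struct Policy<'a> {
    pub installed_version: u64,
    pub now: u64, // unix seconds
    pub pinned_channel: &'a str,
    pub revoked_keys: &'a [&'a str],
}

/// Admit a release only if every hardening check passes.
pub fn admit(meta: &ReleaseMeta, policy: &Policy) -> Result<(), Reject> {
    if policy.revoked_keys.iter().any(|k| *k == meta.key_id) {
        return Err(Reject::RevokedKey);
    }
    if !meta.signature_ok {
        return Err(Reject::BadSignature);
    }
    if meta.channel != policy.pinned_channel {
        return Err(Reject::WrongChannel);
    }
    if meta.expires_at <= policy.now {
        return Err(Reject::Expired);
    }
    // Monotonic version: anything older than what is installed is a rollback.
    // Whether an equal version should also be refused is left open here.
    if meta.version < policy.installed_version {
        return Err(Reject::Rollback);
    }
    Ok(())
}
```

The same check could run for adapter payloads in C4, with the adapter's own content-signing key in place of the release key.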

## M3 requirement-activation map

Activate as each task lands (default `["impl","unit"]`; daemon/term/update E2Es
add `int`):
- **Terminal:** REQ-TERM-1 (A1), REQ-TERM-2 (A3), REQ-TERM-3 (A4); REQ-HAZARD-CONPTY-DSR (A2).
- **Daemon:** REQ-DAEMON-2 (B1/B2), REQ-DAEMON-1 (B3; +int B9), REQ-DAEMON-3 (B8),
  REQ-START-3 (B7; +int B9); hazards HANDOFF-ARGV-COMPAT (B1), GEN-START-NOW (B2),
  DAEMON-HOSTED-LIVENESS (B4), RESTART-IDEMPOTENT (B5).
- **Update:** REQ-UPD-2 (C1), REQ-UPD-3 (C2; +int), REQ-UPD-4 (C3), REQ-UPD-5 (C4),
  REQ-SEAM-UPDATE (C4); REQ-HAZARD-UPDATE-ROLLBACK (C1).
- **Umbrella:** REQ-DAEMON-4 (C5) — last, evidenced by the full hazard suite green.

**Stay `[]` (M4):** `REQ-UPD-1` (peer-propagation transport), all `R-NET`/`R-PAIR`,
`REQ-EP-4` (PresenceChannel impl), the distributed precedence/freshness
generalization of 6.5 (node identity).

## Workspace change

Add `crates/spt-term` (sibling of `spt-runtime`) and `crates/spt-daemon` (above
`spt-live`). Final layering (R-ARCH-1, acyclic) matches the day-one comment:

```
spt-proto → spt-store → spt-msg → {spt-net, spt-term, spt-runtime} → spt-live → spt-daemon → spt
```

`spt` becomes a **thin client** of `spt-daemon` (auto-starts + talks to it).
Neither `spt-term` nor `spt-daemon` is public SDK (R-ARCH-2 stays
proto/runtime/msg); both are internals of the per-machine host.
`spt-net` (M4) slots in as the remaining sibling.
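
The acyclic-layering rule can be checked mechanically by giving each crate a rank from the diagram and requiring every dependency edge to point at a strictly lower rank. A sketch only: the ranks are read off the diagram above, the edges passed in are whatever the caller supplies, and `layer` / `violations` are hypothetical helpers, not an existing check.

```rust
/// Rank of each crate in the layering diagram (lower = closer to the base).
pub fn layer(krate: &str) -> Option<u8> {
    match krate {
        "spt-proto" => Some(0),
        "spt-store" => Some(1),
        "spt-msg" => Some(2),
        "spt-net" | "spt-term" | "spt-runtime" => Some(3),
        "spt-live" => Some(4),
        "spt-daemon" => Some(5),
        "spt" => Some(6),
        _ => None,
    }
}

/// Return every `(from, to)` dependency edge that does not point at a
/// strictly lower layer. Unknown crates are reported as violations too.
pub fn violations<'a>(deps: &[(&'a str, &'a str)]) -> Vec<(&'a str, &'a str)> {
    deps.iter()
        .copied()
        .filter(|(from, to)| match (layer(from), layer(to)) {
            (Some(f), Some(t)) => t >= f,
            _ => true,
        })
        .collect()
}
```

Because siblings share a rank, an edge between `spt-term` and `spt-runtime` is flagged as well, which is stricter than plain acyclicity but matches the "sibling" wording used for both crates.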

## Risks carried into M3

- **The two FATALs live here.** ADR-0004 (daemon broker/brain) is the keystone;
  the interim model existed precisely so M3 is a controlled consolidation, not a
  leap. **Do not start M3b until the spike-gate (Phase 0) returns** — a falsified
  ownership table is cheap to fix before code, ruinous after.
- **QUIC-across-restart (Spike #3) was the single highest risk in the
  roadmap.** It validates the broker's most invasive resource ownership and
  closed PASS in Phase 0; had it failed, ADR-0004 §B would have re-opened before
  any daemon line was written.
- **No-terminate is an *absolute* only for brain-only** (ADR-0004 §A). Don't let
  the C2 E2E quietly downgrade to "endpoints usually survive" — it must be zero
  interruption, asserted against a live PTY session.
- **Re-host, not rewrite.** The M2a/M2b seams (seed-file, pulse module, per-pid
  liveness, graceful signoff) are deliberately localized — M3 swaps their
  *backing*, not their *shape*. A task that finds itself rewriting a seam's
  contract is a signal the interim boundary was wrong; surface it, don't paper
  over it.
- **Update signing is mandatory, not optional** (ADR-0004 §D): once
  peer-propagation exists, one compromised node becomes a subnet poison vector.
  Adapter content signing (not just binary signing) is the easy-to-miss half; C4
  must cover it.
- **Cross-milestone coupling.** `REQ-UPD-1` (peer-propagation) and the QUIC-stream
  broker *implementation* both need M4's P2P. M3 ships the update *engine* + the
  broker *shape* (spike-validated); M4 plugs in the transport. Keep the update
  engine and broker ownership table transport-agnostic so M4 is an integration,
  not a re-architecture.
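
One way to keep the update engine transport-agnostic, as the last risk asks, is to put delivery behind a trait so that M4's peer-propagation becomes one more implementation. This is a sketch under assumptions: `UpdateTransport`, `OutOfBand`, `UpdateEngine`, and `StageError` are hypothetical names, and `verify` stands in for real signature verification.

```rust
use std::collections::HashMap;

/// Anything that can deliver the bytes of a given release version.
pub trait UpdateTransport {
    fn fetch(&self, version: u64) -> Option<Vec<u8>>;
}

/// M3-style delivery: payloads obtained by self-fetch or placed out-of-band.
pub struct OutOfBand {
    pub staged: HashMap<u64, Vec<u8>>,
}

impl UpdateTransport for OutOfBand {
    fn fetch(&self, version: u64) -> Option<Vec<u8>> {
        self.staged.get(&version).cloned()
    }
}

#[derive(Debug, PartialEq, Eq)]
pub enum StageError {
    NotFound,
    BadSignature,
}

/// The engine never learns how the payload arrived, only whether it verifies.
pub struct UpdateEngine<'a> {
    pub transport: &'a dyn UpdateTransport,
    pub verify: fn(&[u8]) -> bool, // placeholder for signature verification
}

impl<'a> UpdateEngine<'a> {
    /// Fetch and verify a release; nothing is handed off unless it verifies.
    pub fn stage(&self, version: u64) -> Result<Vec<u8>, StageError> {
        let payload = self
            .transport
            .fetch(version)
            .ok_or(StageError::NotFound)?;
        if (self.verify)(&payload) {
            Ok(payload)
        } else {
            Err(StageError::BadSignature)
        }
    }
}
```

With this split, verification sits in the engine rather than in any transport, so a payload arriving from a peer at M4 would pass through the same check as a self-fetched one.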
