# spt-core Roadmap

> Lightweight-but-structured path from design → fruition. Not a GSD/phase-gated process — a clear sequence with de-risk gates. Planning corpus is complete (PRD + CONTEXT + 5 ADRs + 6 design docs); this is how we build it.

## End goal

A harness-independent core for an agent ecosystem — messaging, live-agent lifecycle, terminal hosting, **first-class P2P networking**, and a runtime-manifest harness contract — shipped as Rust crates + a single `spt` binary. The proof: the same agent runs natively across all of a user's machines with a synced mind, reachable and operable from anywhere, with seamless self-update and zero-config cross-machine messaging. First thing built atop it: a rebuilt `spt` plugin at parity with today's modern SPT.

## Scope decisions (locked)

- **v1 = the full PRD, networking included.** The networked element is the *reason* spt-core exists; it is not deferrable to v1.1.
- **The hard freeze on `claude_skill_owl` is not a clock.** It works today and delivers value as-is while spt-core is built — no schedule pressure from it.
- **Process: lightweight but structured.** GSD is too heavy. This roadmap + the design docs are the structure.

## Path to fruition (sequence)

**Stage A — Red-team review (before building).** Pressure-test the design adversarially. Focus on the riskiest architectural decisions, not bikeshedding. Tools: `plan-eng-review` (architecture) + `codex` (adversarial second opinion). Scope-cutting is *not* a goal (v1 is locked as full PRD), but surfacing contradictions / over-engineering / unproven assumptions is.

**Stage B — De-risk spikes.** Throwaway proofs of the load-bearing unknowns — anything that, if it fails, forces a redesign. Do these before committing to the full build. See *De-risk spikes* below.

**Stage C — Walking skeleton.** Build the killer-quickstart slice end-to-end through the real crate boundaries: **two agents exchange a message** (first local; ultimately cross-node — the real proof). Validates the whole stack early; then thicken.
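
As a rough illustration of the local half of that slice, the sketch below has two in-process "agents" exchange one message over loopback TCP using only the standard library. The 4-byte length-prefix framing, the `MAX_FRAME` cap, and the function names are assumptions made for this example; the real envelope grammar and framing belong to `spt-proto` and are not reproduced here.

```rust
use std::io::{self, Read, Write};
use std::net::{TcpListener, TcpStream};
use std::thread;

/// Illustrative cap so a bad length prefix cannot force a huge allocation.
const MAX_FRAME: usize = 64 * 1024;

/// Write one frame: 4-byte big-endian length, then the payload bytes.
/// (Hypothetical framing, not the spt-proto wire format.)
fn write_frame(stream: &mut impl Write, payload: &[u8]) -> io::Result<()> {
    if payload.len() > MAX_FRAME {
        return Err(io::Error::new(io::ErrorKind::InvalidInput, "frame too large"));
    }
    stream.write_all(&(payload.len() as u32).to_be_bytes())?;
    stream.write_all(payload)
}

/// Read one frame written by `write_frame`.
fn read_frame(stream: &mut impl Read) -> io::Result<Vec<u8>> {
    let mut len_buf = [0u8; 4];
    stream.read_exact(&mut len_buf)?;
    let len = u32::from_be_bytes(len_buf) as usize;
    if len > MAX_FRAME {
        return Err(io::Error::new(io::ErrorKind::InvalidData, "frame too large"));
    }
    let mut payload = vec![0u8; len];
    stream.read_exact(&mut payload)?;
    Ok(payload)
}

/// Agent B listens on an ephemeral loopback port; agent A connects, sends one
/// message, and waits for B's acknowledgement.
/// Returns (what B received, what A received back).
pub fn exchange_once(message: &str) -> io::Result<(String, String)> {
    let listener = TcpListener::bind("127.0.0.1:0")?;
    let addr = listener.local_addr()?;

    let agent_b = thread::spawn(move || -> io::Result<String> {
        let (mut conn, _) = listener.accept()?;
        let received = String::from_utf8_lossy(&read_frame(&mut conn)?).into_owned();
        write_frame(&mut conn, format!("ack:{}", received).as_bytes())?;
        Ok(received)
    });

    let mut conn = TcpStream::connect(addr)?;
    write_frame(&mut conn, message.as_bytes())?;
    let reply = String::from_utf8_lossy(&read_frame(&mut conn)?).into_owned();

    let received = agent_b.join().expect("agent B thread panicked")?;
    Ok((received, reply))
}
```

Calling `exchange_once("hello")` exercises bind, connect, send, and acknowledge in one pass. The spool fallback and routing that M1 describes are deliberately left out.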

**Stage D — Phased build (M0–M5).** Dependency-ordered, each milestone ships something usable.

## Milestones (all v1)

- **M0 — Workspace + commodity layer.** Cargo workspace; `spt-proto` (wire envelope grammar, endpoint types, framing, Ed25519 identity, typed+binary payloads) + `spt-store` (spool, perch layout, trust store, registry persistence). Copy-verbatim the stable formats from the sister project (ADR-0001).
- **M1 — Local messaging + binary.** `spt-msg` (delivery TCP+spool, routing, send/ring/ready) + the `spt` binary + CLI. **= the killer quickstart (local): two agents exchange a message.** Task breakdown: [`M1-PLAN.md`](./M1-PLAN.md).
- **M2 — Harness contract + lifecycle.** **Strictly harness-agnostic — zero Claude Code conventions in this repo.** spt-core defines the adapter contract; the only adapter-shaped artifact in-tree is a generic mock/test adapter (manifest fixture + fake `api` caller) that exercises the contract (PRD R-DOCS-2 dev-agent quickstart). **The rebuilt `spt` plugin (Claude Code) is a SEPARATE downstream project** built atop these crates/binary — spt-core v1's acceptance proof (PRD §"feature parity"), not an spt-core deliverable. It lives and builds outside this repo. **Split into two JIT milestones** (decision 2026-06-01):
  - **M2a — Harness contract. ✅ delivered (2026-06-01).** `spt-runtime` (AgentRuntime, ManifestRuntime, manifest schema) + the `api` subcommand surface (seed/listen/bind/state/echo-gate/poll/worker-*/boundary/session-end/presence/history-log/emit/capability) + the **harness-hosted** startup topology (seed→listen→bind, interim seed-file) + local-api-auth + the generic mock adapter + contract E2E. All M2a reqs activated + green. Task breakdown: [`M2a-PLAN.md`](./M2a-PLAN.md).
  - **M2b — Live-agent lifecycle. ✅ delivered (2026-06-01).** `spt-live` (fifth layer): history subsystem (fetcher/locate-normalize/native) + spawn-psyche (nested perch, spt-core-owned `$psyche_prompt`) + echo-commune (history brief → single-writer commune drop) + drop-file ingest (single-writer 6.4 + direct-write precedence 6.5) + interim pulse + signoff/boundary ordering invariants (echo-before-signoff 3.3, grace-before-signoff 1.1, stale-signoff sweep 3.2) + resume seam (continue-existing / fresh-with-preload). Hosted by the `api listen` runner + `api shutdown`; mock adapter gained the lifecycle templates; LiveAgent E2E proves spawn→Psyche-perch→commune-ingest→graceful-signoff harness-agnostically. All M2b reqs activated + green. Task breakdown: [`M2b-PLAN.md`](./M2b-PLAN.md).
  - **Interim, no daemon** (like M1): the `api listen` process holds the perch + relays; the seed bridges via a file. **The spt-hosted topology (PTY-launch) + the consolidated `spt-daemon` move to M3** (they need the `spt-term` broker) — this narrows the earlier "both topologies in M2" wording.
- **M3 — Terminal wrapper + daemon + self-update.** `spt-term` (session-surface, PTY, broker) + the consolidated `spt-daemon` (broker/brain split) + peer-propagated signed self-update. Split M3a/M3b/M3c. Scope + carried-forward Spike #1 gaps: [`M3-PLAN.md`](./M3-PLAN.md).
  - **M3a — `spt-term` session-surface mechanism. ✅ delivered (2026-06-02).** The sixth crate (`…→spt-msg→spt-term`, sibling of spt-runtime): the `SessionSurface` trait + native PTY backends (ConPTY/forkpty via `portable-pty`, REQ-TERM-1) · ConPTY-DSR auto-answer drain pump (REQ-HAZARD-CONPTY-DSR) · send-keys/send-line injection (REQ-TERM-2) · bounded byte-stream (REQ-TERM-3) · the PTY-digest parser primitive (REQ-TERM-4 impl/unit, ADR-0008). OS-neutral per Spike #4/#5 (no OS's stream contract baked in; ANSI-stripped, tail-wins repaint tolerance). Validated on Windows (ConPTY) + Linux (forkpty). All M3a reqs activated + green. Task breakdown: [`M3a-PLAN.md`](./M3a-PLAN.md).
  - **M3b — `spt-daemon` broker/brain process. ✅ delivered (2026-06-03).** The seventh crate (`…→spt-live→spt-daemon→spt`) hosts many `spt-term` surfaces behind a **versioned local IPC** (named-pipe/UDS behind one `DaemonTransport` trait, forward-compatible frame schema): a **broker** (stable kernel — PTY masters, harness children, sockets; no logic) + a **brain** (restartable logic) so a brain kill/restart leaves the hosted PTY child + its output stream **intact and gapless** (Spike #1 made real; `gen_start = now()` each handoff). Consolidates the interim no-daemon model — the `api listen` pulse/Psyche/drop-ingest loops fold into the brain as config-paced in-process loops (only a thin stateless relay residue) · daemon-authoritative liveness (`info.json` `status` supersedes `is_process_alive` for hosted perches) · exactly-once broker↔brain boundary (broker-owned `EffectJournal` + dedup-at-effect, Spike #6) · auto-start + in-memory seed (the seed file is dead) · supervised-crash orphan-watch (carries the M2b graceful ordering) · the **digest daemon-half** (manifest `pty_digest` seam → `DigestParser` over the broker PTY, `spt digest <id>` pull + delta-stream + opt-in Path-B persistence → REQ-TERM-4 `int`). All M3b reqs activated + green; daemon E2E (spawn→Psyche-loop→commune→brain-restart-survives→graceful-signoff) on the CI matrix (ubuntu + windows). Task breakdown: [`M3b-PLAN.md`](./M3b-PLAN.md).
  - **M3c — signed self-update. ✅ delivered (2026-06-03).** The gated, signed, ripple-capable self-update engine (no new crate — it lives in `spt-daemon`, signing primitives in `release.rs`): the **update-class taxonomy** (brain-only / broker-compatible / broker-breaking, ADR-0004 §A) with the **brain-only zero-interruption swap** that drives the M3b handoff substrate (snapshot→drop→re-attach) so a live PTY child + its output stream survive the logic swap **untouched** (REQ-UPD-3, proven E2E `tests/brain_swap.rs`) · **verify-before-handoff** (Ed25519 over signed release metadata + SHA-256 artifact binding; `plan_verified` is the front door — no unverified binary reaches a handoff, REQ-UPD-2) · **rollback hardening** (monotonic version, metadata expiry, channel pinning, key revocation — REQ-HAZARD-UPDATE-ROLLBACK) · **consent gating** (default-gated, delivered to the most-recently-active live session via the new `info.json` `last_active_ms` recency stamp; opt-in full-auto — REQ-UPD-4) · **adapter ripple-update** (self-then-adapters via each manifest `[update]` avenue, with adapter content signing against the adapter's own key for `file_pull` / a `self_verifies` attest for opaque `delegated` updaters — REQ-UPD-5, REQ-SEAM-UPDATE). **Peer-propagation transport (REQ-UPD-1) defers to M4** — the engine ships with self-fetch/out-of-band delivery into the same seam. All M3c reqs activated + green. Task breakdown: [`M3c-PLAN.md`](./M3c-PLAN.md).
  - **M3 COMPLETE** (M3a + M3b + M3c all ✅, 2026-06-03): the real ADR-0004 architecture is live — native PTY hosting, the consolidated broker/brain daemon, and seamless signed self-update. Next milestone: M4.
- **M4 — Networking + instances. ✅ delivered (2026-06-04).** The eighth crate `spt-net` (Iroh WAN endpoint bound to the node's own Ed25519 identity, mDNS LAN discovery, TOTP-SPAKE2 pairing over a dedicated pre-trust ALPN with transcript binding / rate limiting / seed rotation+transfer, subnet registry + NDJSON replication under the per-node epoch lease) · the multi-instance model (registry rows + bare-id resolution refuse-and-qualify + visibility/sync-membership gates + rename/collision + **resting-state machine** dormant/suspended with measured warm/cold policy [`docs/DORMANCY-BUDGET.md`, red-team #9 closed] + **resource advertisement** REQ-INST-14 + **immutable home subnet & `spt fork`** REQ-INST-15, ADR-0010) · **cross-node Psyche sync** (git-native bundles over broker QUIC streams, two-tier scope gates, vector-precedence merge with conflicts surfaced + Psyche-reconciled, ADR-0013; **the gh-repo interim is retired**) · WAN messaging + remote-drive attach + chunked file transfer + per-endpoint access whitelists · the subnet **notification primitive** (replicated spool, dismiss-replication, boundary resurfacing; update consent + escalations refactored as producers) · the **update peer-propagation transport** into the M3c engine seam (offer-then-fetch, verify-before-stage per node, consent-notified — REQ-UPD-1) · production trigger loops (peer pump + inbound dispatcher — the subnet self-drives). **The real proof ran on the real rig: `[twohost]` CI run `26958175812` — pair → register → cross-node message → remote-drive (A typed into B's live PTY and read the echo) → Psyche sync both directions → update self-heal → notif dismiss-replication, HFENDULEAM (Windows) ↔ gravity-linux, whole ladder ~8 s** (`docs/TWO-HOST-RUNBOOK.md` + evidence appendix; `tests/twohost.rs` carries the nine activated `int` stages). Task breakdowns: [`M4-PLAN.md`](./M4-PLAN.md) + per-task JIT plans (D1–D9).
- **M5 — Shells, presence, deferred capabilities. ✅ delivered (2026-06-04;** scope locked with user 2026-06-04, see [`M5-PLAN.md`](./M5-PLAN.md) + per-task JIT plans D0–D9). The consent framework seam (grant store + interactive escalation + pre-consent flags; remote-exec / instantiate-anywhere reserved-but-refusing) · adapter registration lifecycle (`spt adapter add/remove`, manifest-first) · **shell hosting machinery + sleep/wake** (link-token channels with per-frame MAC, broker-launched binaries, wake-watcher supervision, state-keyed wake resolution) proven by the mock **plus one real shell**: the OS-notification adapter — **standalone repo `spt-shell-notify`, zero spt-core source integration** (the manifest + binary are the whole glue; spt-core CI cross-repo hook drives its E2E) · deferred-message resting gate (REQ-INST-6) + remote `spt suspend/wake <id@node>` · **presence resolution** (the gossiped `last_active_ms` datum riding the registry rows + one most-recently-active API; notif first-fire redirects cross-node to the user's active endpoint) · the `[session.notif]` render seam generalized to shell adapters (`spt notify` → native toast) · **cross-node owner↔shell link** (relink + command drive over the wire; spawn stays same-node) · carried M4 seams closed (relay rendezvous meet routing, `spt update apply` consent-ack orchestration, self-hosted cross-runner CI). **The M5 proof ran on the real rig: `[twohost]` run `26998058816` — the full 11-rung ladder (M4's seven + remote suspend/wake, presence-shift redirect, cross-node shell relink+drive, notif→toast) green with the REAL standalone adapter rendering A's `spt notify` at B, the host the user last touched** (`docs/TWO-HOST-RUNBOOK.md` M5 evidence appendix).
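
The four rollback-hardening guards named under M3c (monotonic version, metadata expiry, channel pinning, key revocation) could look roughly like the sketch below. Every type, field, and function name is hypothetical, and the order of the checks is an assumption. Ed25519 signature verification and SHA-256 artifact binding are assumed to have passed already and are not modeled.

```rust
use std::collections::HashSet;

/// Release metadata fields relevant to rollback hardening.
/// Field names are hypothetical, chosen for illustration only.
#[derive(Debug, Clone)]
pub struct ReleaseMeta {
    pub version: (u64, u64, u64),
    pub channel: String,
    pub expires_at_unix: u64,
    pub signing_key_id: String,
}

/// Local state the guards compare against (also hypothetical).
#[derive(Debug, Clone)]
pub struct LocalPolicy {
    pub installed_version: (u64, u64, u64),
    pub pinned_channel: String,
    pub revoked_key_ids: HashSet<String>,
}

/// Why an otherwise validly signed release is refused.
#[derive(Debug, PartialEq, Eq)]
pub enum Refusal {
    RevokedKey,
    WrongChannel,
    Expired,
    NotNewer,
}

/// Apply the four guards in a fixed order and report the first failure.
pub fn check_rollback_guards(
    meta: &ReleaseMeta,
    policy: &LocalPolicy,
    now_unix: u64,
) -> Result<(), Refusal> {
    if policy.revoked_key_ids.contains(&meta.signing_key_id) {
        return Err(Refusal::RevokedKey);
    }
    if meta.channel != policy.pinned_channel {
        return Err(Refusal::WrongChannel);
    }
    if now_unix >= meta.expires_at_unix {
        return Err(Refusal::Expired);
    }
    // Tuples compare lexicographically, so (0, 3, 2) > (0, 3, 1).
    if meta.version <= policy.installed_version {
        return Err(Refusal::NotNewer);
    }
    Ok(())
}
```

The point of the monotonic check is that replaying an older release with a perfectly valid signature is still refused with `Refusal::NotNewer`.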

## Post-v1-core milestones

- **M6 — Stage-setting: `spt-releases`, dev docs, v0.1 release. ✅ delivered (2026-06-05** — planned, executed, and shipped in one day; decisions grilled `1922fa4`, task breakdown [`M6-PLAN.md`](./M6-PLAN.md)). spt-core is now **installable, documented, and released**: [`SaberMage/spt-releases`](https://github.com/SaberMage/spt-releases) is **public** with the signed **v0.1.0** release live and GitHub Pages serving the docs at the permanent canonical URL `https://sabermage.github.io/spt-releases` (ADR-0014; repo-face README/licenses truth-synced from `releases-repo/` here). Shipped: non-interactive installer scripts (sha256-verify, user PATH, env-knob CI/air-gap override; OS-service → DEFERRED) with a staged-release CI rung on both runners · manifest **JSON Schema generated from the parsing derives themselves** + drift gate · the Tier-1 mdBook docs corpus (Starlight-styled; both killer quickstarts with real captured outputs — the messaging one CI-executed step-for-step as an anti-rot rung; complete harness-contract vertical; shell getting-started via the real `spt-shell-notify`; llms.txt/llms-full; CLI reference generated from the binary's own `--help`, drift-gated; rustdoc) · `xtask` generation/publish plumbing + tag-triggered `docs-publish` and `release` pipelines (cross-repo via fine-grained PAT; **signing stays manual + local** — `release-keygen/sign/publish/verify`, seed via password-manager CLI) · the **embedded two-key trust anchor** (ADR-0015: `rel-primary-2026` + `rel-recovery-2026` compiled into every binary, keys-file as revocable overlay; ceremony run 2026-06-05, seeds custodied off-machine) · public-facing CLI help-text polish. 
**Acceptance ran against the LIVE release:** the real `irm|iex` one-liner installed the published binary (`spt 0.1.0`, sha256-verified) · the published quickstarts ran as written on the released binary · llms.txt + schema + raw-`.md` + rustdoc all serve from the canonical URL · both platforms' `SignedRelease` assets **verify against the embedded anchor** (`release_verify_e2e`, REQ-REL-2 int) · a v0.1.1-shaped next release **and a recovery-key-signed metadata** both verified with the real keys (the ADR-0015 loss story, proven). Side quest: `traceable-reqs` v0.1.2 (shell/PowerShell tag scanning) shipped to cover script evidence. **Next:** M7 (subnet & quickstart UX), then the v1 acceptance proof — `spt-claude-code`, a separate downstream repo — gets its own scoping conversation.

- **M7 — Subnet & quickstart UX. ✅ delivered (2026-06-06** — planned 2026-06-05, task breakdown [`M7-PLAN.md`](./M7-PLAN.md)). Subnet setup is now a first-class, guided, secure product surface and the public quickstart is honest about personas. Shipped: the **join ceremony's product surface** (always-on daemon pairing responder + guided `spt subnet join` CLI initiator over the rig-proven SPAKE2 wire) · the `spt subnet` noun namespace (`status [NAME] [--nodes]` / `create` / `show-code` / `join`; `spt pair` deleted) · **elevation-gated membership mutations** (ADR-0005 amendment) · **node labels** (hostname-default, gossiped, addressable in `@node` qualifiers, refuse-on-ambiguity) · `spt poll` removed (`ready` + `ready --once`) · **`spt how-to <topic>`** in-binary agent guidance · quickstart rewritten as human-pairs / agents-message prompt blocks · README/docs-site sweep + `quickstart_e2e` + the twohost ladder's product-surface join rung. **Acceptance ran on REAL hardware:** fresh two-machine setup via `subnet create` + the user's own fully-guided join (elevation prompts honored), `status --nodes` labeled+online both sides, cross-machine `SENT(WAN)` both directions (HFENDULEAM ↔ gravity-linux). The run surfaced two publish-blocking field bugs, **both fixed at closeout**: elevated daemon spawn → the de-elevation seam (KNOWN-HAZARDS 5.7) and immortal ghost registry rows → silent-peer eviction (KNOWN-HAZARDS 4.10), plus refusals now rendering copy-paste qualified targets. Harvest (Linux elevation model, NTP-first clock, firewall registration at install, post-join address seeding, trust-prune verb, M8 nounification spec) → `docs/DEFERRED.md`. **Next:** `spt-claude-code` scoping (the v1 acceptance proof, own repo).

- **🔜 NEXT milestone (planned 2026-06-09) — Broker/brain process-isolation restoration.** Design: [`docs/BROKER-BRAIN-SPLIT-RESTORATION.md`](./docs/BROKER-BRAIN-SPLIT-RESTORATION.md) (verified-with-amendments by `doyle`); decision: **ADR-0018**; build plan: [`RESTORATION-PLAN.md`](./RESTORATION-PLAN.md) (D1–D7). Corrects an unintended regression — the broker/brain split (ADR-0004) was specced + spiked as two *processes*, but the production daemon runs the broker as an in-process *thread* (`daemon.rs:165-170`), so `spt update apply` swaps the binary on disk yet never restarts the running code. Every release to date needed a manual fleet daemon bounce (paid 3× on v0.3.2; `enlyzeam` reproduced the `\r`-corruption for ~a day with the fix on disk). Restore the two-process model: broker = always-up per-machine anchor (seed-lock + liveness + brain supervisor) spawning the brain as a child; update = brain snapshot+self-exit → broker respawns from the swapped binary → re-attach; durable absolute-deadline loop timing; broker cursor-of-record with all-session resume; readiness-gated auto-rollback to last-known-good (two-phase applied record + the no-pre-ready-migration invariant); broker-owned generation custody; the §B-adjudicated direct-call→versioned-IPC-verb surface (net near-free, shellwake stays brain, new-brain × old-broker N-1 compat test). Re-points the regression-masked REQ-DAEMON-2 / REQ-UPD-3 `int` evidence to a productionized SPIKE-01/03 process-level survival E2E; mints `REQ-HAZARD-BROKER-PROCESS-ISOLATION` + `REQ-HAZARD-ROLLBACK-STATE-COMPAT` (KNOWN-HAZARDS 6.7/6.8). **Sequenced before `spt-claude-code`** (operator 2026-06-09): the *last* release needing a manual bounce, so every adapter-era release rolls seamlessly and the adapter lands on the final topology; changes daemon internals only, leaving the M8-frozen CLI/api surface untouched.
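
One possible reading of the "two-phase applied record" behind the readiness-gated auto-rollback is sketched below. It is an interpretation for illustration, not the design in `RESTORATION-PLAN.md`. The type names, the outcome enum, and the version strings are all hypothetical, and neither the no-pre-ready-migration invariant nor any persistence is modeled.

```rust
/// What the broker has recorded about the brain binary.
/// A hypothetical reading of a "two-phase applied record".
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum AppliedRecord {
    /// Phase 2 done: this version reached readiness and is last-known-good.
    Committed { version: String },
    /// Phase 1 only: the candidate was swapped in but has not proven ready.
    Pending { candidate: String, last_known_good: String },
}

/// How a freshly spawned brain turned out (illustrative).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum BrainOutcome {
    ReportedReady,
    ExitedOrTimedOutBeforeReady,
}

impl AppliedRecord {
    /// Phase 1: note the candidate while keeping the rollback target.
    pub fn begin_update(&self, candidate: &str) -> AppliedRecord {
        AppliedRecord::Pending {
            candidate: candidate.to_string(),
            last_known_good: self.last_known_good().to_string(),
        }
    }

    /// The version the broker can always fall back to.
    pub fn last_known_good(&self) -> &str {
        match self {
            AppliedRecord::Committed { version } => version.as_str(),
            AppliedRecord::Pending { last_known_good, .. } => last_known_good.as_str(),
        }
    }

    /// The version the broker should spawn next.
    pub fn version_to_spawn(&self) -> &str {
        match self {
            AppliedRecord::Committed { version } => version.as_str(),
            AppliedRecord::Pending { candidate, .. } => candidate.as_str(),
        }
    }

    /// Phase 2: commit the candidate on readiness, otherwise roll back.
    pub fn resolve(self, outcome: BrainOutcome) -> AppliedRecord {
        match (self, outcome) {
            (AppliedRecord::Pending { candidate, .. }, BrainOutcome::ReportedReady) => {
                AppliedRecord::Committed { version: candidate }
            }
            (
                AppliedRecord::Pending { last_known_good, .. },
                BrainOutcome::ExitedOrTimedOutBeforeReady,
            ) => AppliedRecord::Committed { version: last_known_good },
            // Nothing pending, so there is nothing to resolve.
            (committed, _) => committed,
        }
    }
}
```

Because the rollback target is carried inside the pending record, a brain that dies before reporting ready never leaves the broker without a version to respawn.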

> _Roadmap note: this milestone list is behind reality — M8 (CLI nounification + acceptance) and the v0.3.x release line shipped after M7 but are tracked in commits/CHANGELOG, not itemized here. A full roadmap reconciliation is a separate docs pass._

## De-risk spikes (do early — these can invalidate the architecture)

1. **Broker/brain no-terminate handoff** — prove a PTY child + open socket survive a daemon-*logic* restart (FD-passing or stable-broker shim). The hardest invariant in the design (ADR-0004).
2. **Iroh + mDNS smoke test** — two nodes, separate NATs, one message end-to-end; measure direct-vs-relay + binary size (research brief §5). Validates the whole networking premise.
3. **ConPTY + `portable-pty`** — host and drive a real session in a broker PTY on Windows + Linux.
4. **Manifest-driven lifecycle** — a manifest drives one real Claude Code session through spawn → bind → commune → signoff, proving harness-independence works in practice.
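
The invariant behind spike 1 can be modeled in memory: a stable broker keeps buffering child output, and a restartable brain resumes from a cursor without missing bytes. The sketch below is a toy model of the stable-broker idea only. It involves no real PTY, socket, or FD-passing, and all names are hypothetical.

```rust
use std::collections::VecDeque;

/// The broker side: owns the output buffer and never restarts.
/// `base` is the absolute stream offset of the oldest byte still buffered.
pub struct Broker {
    buffer: VecDeque<u8>,
    base: u64,
    capacity: usize,
}

/// Returned when a cursor points at bytes the bounded buffer already evicted.
#[derive(Debug, PartialEq, Eq)]
pub struct Gap {
    pub oldest_available: u64,
}

impl Broker {
    pub fn new(capacity: usize) -> Self {
        Broker { buffer: VecDeque::new(), base: 0, capacity }
    }

    /// Called whenever the hosted child writes output, brain attached or not.
    pub fn on_pty_output(&mut self, bytes: &[u8]) {
        self.buffer.extend(bytes.iter().copied());
        while self.buffer.len() > self.capacity {
            self.buffer.pop_front();
            self.base += 1;
        }
    }

    /// Absolute offset one past the newest byte.
    pub fn end_offset(&self) -> u64 {
        self.base + self.buffer.len() as u64
    }

    /// Everything from `cursor` to the end of the stream, plus the next cursor.
    pub fn read_from(&self, cursor: u64) -> Result<(Vec<u8>, u64), Gap> {
        if cursor < self.base {
            return Err(Gap { oldest_available: self.base });
        }
        // A cursor past the end is treated as "nothing new yet".
        let cursor = cursor.min(self.end_offset());
        let skip = (cursor - self.base) as usize;
        let bytes: Vec<u8> = self.buffer.iter().skip(skip).copied().collect();
        Ok((bytes, self.end_offset()))
    }
}

/// The brain side: restartable logic that only remembers a cursor.
pub struct Brain {
    pub cursor: u64,
}

impl Brain {
    pub fn attach(cursor: u64) -> Self {
        Brain { cursor }
    }

    /// Pull everything produced since the last drain and advance the cursor.
    pub fn drain(&mut self, broker: &Broker) -> Result<Vec<u8>, Gap> {
        let (bytes, next) = broker.read_from(self.cursor)?;
        self.cursor = next;
        Ok(bytes)
    }
}
```

Dropping a `Brain`, feeding the `Broker` more output, and attaching a new `Brain` at the saved cursor yields exactly the bytes produced in between. If the bounded buffer has already evicted them, the broker reports a `Gap` rather than silently skipping.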

## Immediate next steps

Progress (2026-05-31):
- ✅ **Stage A — codex adversarial pass** — done. 14 findings (5 FATAL, 9 SERIOUS) in `docs/reviews/STAGE-A-codex-redteam.md`.
- ✅ **Spike #1 (broker/brain handoff)** — PASS on Windows ConPTY. `docs/spikes/SPIKE-01-broker-handoff.md`. Found+fixed the ConPTY DSR-stall hazard. 4 open gaps carried to M3.
- ✅ **Spike #2 (Iroh smoke test)** — PASS (single-host). `docs/spikes/SPIKE-02-iroh-smoke.md`. iroh 0.98 API workable; 13.57 MB release. Cross-NAT/relay proof deferred to M4.
- ✅ **Findings folded** into ADR-0002/0003/0004/0005 + KNOWN-HAZARDS (+3) + traceable-reqs.toml (+4 REQ-HAZARD).

Remaining, in order (ending with M0 itself):
1. **Stage A — `plan-eng-review`** (interactive architecture pass; second half of Stage A).
2. **Spike #4 (manifest-driven lifecycle)** — a manifest drives one real session spawn→bind→commune→signoff. (Spike #3 ConPTY hosting is largely covered by Spike #1.)
3. Resolve the open user decisions logged in the ADR amendments (context-merge model, seed-rotation, R-UPD-3 rewording).
4. **M0** (workspace + `spt-proto` + `spt-store`): **started**; task breakdown in [`M0-PLAN.md`](./M0-PLAN.md). Each milestone gets its own just-in-time (JIT) plan, and M0's is the first; the roadmap stays the milestone sequence, and the JIT plan is the task layer.

## Tooling: requirement traceability (adopted)

`traceable-reqs` is adopted as the requirement-tracing gate. `traceable-reqs.toml` seeds the `REQ-*` registry from PRD `R-*` + KNOWN-HAZARDS invariants (all inactive until milestone activation). The development contract and its enforcement (CI gate, milestone activation, agent rule, quality audit) are defined in `docs/TRACEABILITY.md`; the agent rule lives in `CLAUDE.md`. **M0 task:** install the CLI, validate the seed, wire the CI workflow, activate M0 reqs.
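
The "inactive until milestone activation" idea can be pictured with a toy model. The sketch below does not describe the actual behavior or file schema of `traceable-reqs`; the struct, the functions, and the rule that an active requirement needs evidence are assumptions made for illustration.

```rust
/// One registry row (hypothetical shape, not the traceable-reqs schema).
#[derive(Debug, Clone)]
pub struct Requirement {
    pub id: String,
    pub milestone: String,
    pub active: bool,
    pub has_evidence: bool,
}

/// Flip every requirement tagged with `milestone` to active.
pub fn activate_milestone(reqs: &mut [Requirement], milestone: &str) {
    for req in reqs.iter_mut().filter(|r| r.milestone == milestone) {
        req.active = true;
    }
}

/// Assumed gate rule: only active requirements are checked, and each one
/// must have evidence. Returns the ids that would fail the gate.
pub fn gate_failures(reqs: &[Requirement]) -> Vec<String> {
    reqs.iter()
        .filter(|r| r.active && !r.has_evidence)
        .map(|r| r.id.clone())
        .collect()
}
```

Under this model a seeded but inactive requirement never fails the gate, which is what lets the whole registry exist from day one while each milestone activates only its own rows.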

## Open threads carried forward

- New deferred items logged this session (`docs/DEFERRED.md`): block-level OCC context merge; the rebalance / Memory-Health pass; the consent framework's gated capabilities.
- The dual-audience docs strategy (`docs/DOCS-STRATEGY.md`) governs the *shipped* docs, authored as code lands.
