# spt-core

**Platform scope:** Windows + Linux for v1. macOS is out (no test machine available) but kept structurally easy — `portable-pty` and Iroh both support it, so macOS is a later test/CI-budget decision, not a re-architecture.

**Legacy migration:** it should be possible — ideally *automatic* — for a user to migrate an existing `claude_skill_owl` (modern SPT) install to spt-core (identity, agents, tracked Psyche context). Exact mechanism deferred to design; the commitment is that migration is a first-class supported path, not a manual rebuild.

Harness-independent core for the SPT ecosystem. Provides inter-agent messaging, live-agent lifecycle, terminal wrapping, self-update, and networking primitives — as both a Rust library workspace and a canonical reference binary. Designed so any agent runtime (Claude Code, Codex, Cursor, headless, future harnesses) can interface with the SPT ecosystem either by shelling out to the binary or by linking the crates directly.

Successor to `claude_skill_owl` (today's "modern SPT"), which is being rebuilt as `spt-core` to untether the system from Claude Code and lift it to a general-purpose agent-ecosystem core.

## Language

**spt-core**:
The system. Canonical name. The Rust workspace and the umbrella project.

**spt.exe / spt** (canonical binary):
The reference binary built from the workspace. Replaces today's `owl.exe`. Most external integrations (plugins, hooks, scripts in other harnesses) interact with spt-core *only* through this binary — fire-and-forget subcommands, long-running listeners under a parent harness's process supervisor, etc. Unix builds use the same name without `.exe`.

**library workspace**:
The set of Rust crates that compose spt-core. Consumers that want a deeper integration than shelling out to `spt.exe` link these crates directly. The reference binary is itself a consumer of the workspace. The expected non-binary consumers are future first-party services that link Rust directly.

**spt plugin** (separate downstream project — NOT an spt-core deliverable):
A rebuilt version of today's Claude Code `spt` plugin. It is the **first consumer** built *atop* spt-core and the **acceptance proof** of spt-core v1 (it reaches feature parity with modern SPT while delegating all core functionality to spt-core, primarily via `spt.exe`, with deeper hooks where useful) — but it **lives and builds in its own repository, outside spt-core**. It is a Claude-Code-specific *adapter*: it holds the Claude Code conventions (hooks, slash-commands, skill/plugin layout, `claude` session-invocation). **spt-core itself contains zero Claude Code conventions** — only the harness-agnostic contract the plugin binds to. The only adapter-shaped artifact ever in this repo is a generic mock/test adapter exercising the manifest + `api` contract (PRD R-DOCS-2). The M2 milestone delivers that contract + the lifecycle primitives; building the plugin is downstream work, not M2.

**Pi** (disambiguation — two meanings, never conflate):
(1) **Pi, the coding agent/harness** (`badlogic/pi-mono`) — a harness example alongside Claude Code and Codex; this is the meaning in user-facing harness lists. (2) **Pi-class node** — Raspberry-Pi-class low-power hardware hosting a Shell-only or headless SPT node; an incidental hardware descriptor, never an explicit product example. Public-facing docs must disambiguate or avoid the bare word.
_Avoid_: bare "Pi node" when the harness is meant.

**spt-daemon** (per-machine supervisor):
The single always-on, one-per-machine logical supervisor. Owns the PTYs for all hosted sessions, the node's network identity + WAN endpoint, the subnet registry, all spools, **all poll-listener logic, and all Psyche/pulse loops** — everything is consolidated here (no separate poll-listener or Psyche-wrapper processes; listeners already touch sessions directly under capsule/idle, and Psyche wrappers already invoke harness binaries directly, so they belong in the one supervisor). Collapses what the sister project planned as a separate `spt-node` daemon into one process — see Networking. The `spt-node` separate-deliverable concept is retired.

Internally the logical daemon is split into two implementation layers for seamless self-update (see Self-update):
- **broker** (stable "kernel") — holds *only* the un-transferable, must-not-die resources: PTY master fds, the spawned harness child processes, and listening network sockets. Minimal, dumb, versioned local IPC. Almost never updates.
- **daemon brain** ("userspace") — all logic (routing, registry, pulse/psyche loops, manifest parsing, update orchestration). Restarts freely on update; rehydrates from disk state and re-attaches to the broker's held handles.

Logical addressing is unchanged — still one per-machine `spt-daemon`; the broker is an internal layer, not separately addressable. There is exactly **one broker per machine** (per `SPT_HOME`) — *not* one per endpoint: a single broker holds every hosted endpoint's resources, and it is present whenever the daemon runs, even with zero endpoints online (the bare-daemon case). It is therefore the always-present per-machine layer, which is why the single-daemon lock + liveness anchor belong to it.

**in-session relay**:
A thin, stateless `spt.exe` task that exists only in **harness-hosted** sessions (where the agent harness is the parent process and spt cannot reach into its process tree — today's Monitor model). It streams the daemon brain's events into the session's stdout. All *stateful* listener logic lives in the daemon; the relay is a dumb pipe, freely killable and respawnable. **spt-hosted** sessions need no separate harness-owned relay — the daemon owns the PTY and consumes the same poll feed itself. Idle delivery into an spt-hosted PTY goes through an opt-in adapter **translation binary** (`[message-idle-translation-binary]`, ADR-0022): a pure stdin→stdout filter spt-core lifecycle-manages — it reads the `<EVENT>` feed on stdin, emits keystroke-commands (`{key}`/`{delay_ms}`/`{text}`) on stdout, and spt-core applies them to the PTY **atomically** (controller input buffered during the sequence, so injection coexists with a live `spt rc`). The v0.11.0 raw `payload+\r` inject is the degenerate no-choreography case.

### Deliverable shape

spt-core ships **both** a library workspace and a canonical binary:

- **Library crates** — the deeper integration path. Used by future first-party services that link Rust directly.
- **`spt.exe` / `spt`** — the canonical binary, built from the workspace. The primary integration path for harness plugins and external tooling, which mostly fire it as a subprocess at various surfaces (one-shot commands, poll listeners under a Monitor-tool-equivalent, hook tap-ins).

Both surfaces are first-class. Wire-protocol parity between them is a versioning concern from day one (a non-Rust client speaking to `spt.exe` and a Rust client linking the crates must see the same observable behavior).

## Runtime model

spt-core is harness-independent: it does not know about Claude Code, Codex, Cursor, or any other agent runtime. All harness-specific surfaces (how to invoke an agent session, fetch conversation history for an echo commune, detect activity/idleness, etc.) are abstracted behind a runtime layer that consumers supply.

**AgentRuntime** (Rust trait, implementation detail):
The internal Rust abstraction over a harness. Anything spt-core needs to do *to* or *with* an agent goes through this trait. Most consumers never see it directly — they configure spt-core via a manifest, and spt-core's default `ManifestRuntime` implementation executes against the manifest.

**harness contract** (umbrella term):
The full surface a harness binds to in order to participate in the spt-core ecosystem. Has two equally-important halves: the **runtime manifest** (outbound — how spt-core drives the harness) and the **subcommand surface** (inbound — how the harness reports events back to spt-core). A harness implementation is one TOML/YAML manifest + a binding from the harness's own hook system into `spt.exe <subcommand>` calls.

**runtime manifest** (outbound half of the harness contract):
A declarative configuration file (TOML/YAML — schema TBD) that tells spt-core how to drive a specific harness. Declares: how to invoke an agent session, how to look up conversation history for an echo commune, how to spawn/resume a Psyche-equivalent, which binary or command implements each harness-side operation, and which endpoint types this harness supports. spt-core is the actor for each of these; the manifest tells it what to do.

Example shape (illustrative): `spt.exe --manifest spt-plugin.toml live start <id>`. A harness like the planned spt plugin wraps this invocation into the `$LIVE` / `$OWL` environment variables it injects into its sessions, so harness-internal callers continue to invoke `$LIVE` / `$OWL` unchanged.

**subcommand surface** (inbound half of the harness contract):
The stable set of `spt.exe <subcommand>` entry points that harnesses bind their own hook systems to. When the harness's runtime emits an event (subagent started, tool just invoked, user typed `/clear`, session crashed), the harness's hook fires a short-lived `spt.exe <subcommand>` invocation that mutates on-disk SPT state (perch registry, spool, etc.). spt-core publishes this surface; harnesses author the bindings.

**Naming convention:** these inbound, machinery-facing commands are prefixed **`api `/`api-`** (e.g. `spt api bind`, `spt api state`) to distinguish them from the agent-facing verbs an agent invokes directly (`send`, `ring`, `ready`, …). The `api` namespace is the harness/adapter commands-API; the unprefixed namespace is the agent surface.

Together: manifest + subcommand surface = the complete harness API. A sidecar-style long-running adapter process speaking a wire protocol is explicitly **deferred** as a possible v2 alternative for harnesses that outgrow the manifest+hooks shape (e.g. need streaming or in-memory state across events). Not built day-one.

**adapter manifest header** (`adapter_name` + version compat):
Every manifest declares a unified **`adapter_name`** (e.g. `claude-spt`), carried on every `api` invocation too. It is load-bearing: one daemon hosts endpoints from multiple adapters, so it resolves an endpoint's manifest + seams by `adapter_name`; adapter-update ripples target by it; capability/manifest lookup and telemetry key on it. The header also declares the adapter's own version and a **`min_spt_core_version`** — the minimum spt-core the adapter requires. This declaration must be **readable before an adapter update is applied** (it lives in the manifest header / a small metadata file fetched first), so spt-core can verify compatibility / expected supported features *before* committing the update. If installed spt-core < the adapter's `min_spt_core_version`, surface the incompatibility rather than silently breaking; when spt-core self-updates, re-verify adapters still satisfy (coordinate core + adapter updates when needed). This is a distinct compatibility axis from the node↔node/library wire-protocol version (see Workspace & versioning).

<!-- [doc->REQ-MANIFEST-2] -->
**adapter profile** (ratified 2026-06-11, Gateway grill; future spt-core milestone — first beneficiaries `spt-claude-code` and the usbip shell):
A named **sparse overlay** on its parent adapter manifest. Merge semantics are **leaf-replace**: a profile key replaces the whole value at that path (arrays included — never spliced or appended). The merged result is a complete manifest, and the profile behaves as a distinct adapter option everywhere: canonical addressing is the composite **`<adapter>:<profile>`** (`claude-spt:work`, `spt-usbip-driver:hid-only`) in every place a bare `adapter_name` rides today (perch `info.json`, capability resolution, `api` invocations, `spt adapter list`); the bare name = the parent unmodified. **Two sources, one semantics:** a **shipped profile** is declared inside the parent manifest by the adapter dev and updates as one unit with it; a **local profile** is a node-local overlay file registered beside the adapter — user-authored, **surviving adapter updates** (the safe space for nodewise customization; the manifest file itself is adapter-owned and overwritten by updates). A local profile may not shadow a shipped profile's name (refused at registration). Profiles are never independently versioned (fork-drift, rejected). **Consent floors are tighten-only**: a profile may demand approval where the parent does not, never loosen a parent's `require_approval` floor — a loosening overlay is invalid at registration. CLI: `spt adapter create-profile` / `delete-profile` / `set-string` operate on **local** profiles only. Use cases: per-account variants, extra hook wiring, trust-narrowed shell profiles.
_Avoid_: "manifest fork", "child adapter", per-profile versioning.

**adapter strings** (ratified 2026-06-11, Gateway grill):
<!-- [doc->REQ-MANIFEST-3] -->
A `[strings]` manifest section — an adapter-authored JSON/TOML KV tree, dot-path-readable by anything on the node via `spt adapter get-string <adapter-option> <key.path>` (e.g. a harness hook fetching per-profile `additionalContext` — one hook script serves every profile, only the data differs). Resolution rides the **same leaf-replace profile overlay** as the rest of the manifest: a shipped or local profile may override base strings; `get-string` returns the merged view for the named adapter option. **Strings are data only** — nothing in spt-core ever executes a string (command templates live in manifest sections behind registration, never in the KV). Node-local like the registration itself; no cross-node sync. `set-string` is sugar that edits a **local** profile's `[strings]` (never adapter-shipped files).
<!-- [doc->REQ-MANIFEST-5] -->
**File-backed strings** (M12-W3): a `[strings]` value MAY be a **file pointer** instead of an inline literal — a value-position table with **exactly one** key `file`: `skill = { file = "skill.md" }`. `get-string` resolves it to the file's **contents** (so large bodies — skill-instructions, hint text — stay out of the manifest). The exactly-one-key rule is the disambiguation: any other table shape stays an opaque nested strings tree (existing trees untouched), and `{ file = … }` is reserved as the pointer form (it can't double as inline data). Files live in the adapter's per-adapter aux dir **`adapters/<adapter>/strings/`** (sibling of `profiles/`), referenced by a path relative to it that **must stay inside that dir** (HAZARD-class containment: `..` traversal and absolute paths are refused at registration). A **local** profile's own file pointers resolve against the user-owned local-profile dir, not the adapter-shipped `strings/` (which adapter updates overwrite) — so a local override survives updates; equivalently a local profile may just inline a literal. Pointers are **validated at registration** (fail-fast on an escaping/missing pointer — records nothing) and **read lazily** at `get-string` (live file edits reflect without re-register); a missing/unreadable file at read time **skip-diagnoses** (emits a diagnostic, returns "not set" — never a silent drop or hard error, mirroring `[digest]`).
_Avoid_: treating strings as config knobs for spt-core itself (those are global settings); "adapter KV store" as a separate registry; putting user files in the adapter-shipped `strings/` dir (clobbered by updates — use a local profile).

**manifest substitution in `[strings]`** (ratified 2026-06-25, v0.16.0 update-arc grill):
`get-string` resolves a set of **adapter-static** substitution keys inside a returned string value at **read time** (lazily, like file-backed strings): **`{adapter_dir}`** — the registry record's precise `source_dir` (the install dir; survives updates; the same dir bare-program resolution already uses) — and **`{adapter_name}`**. Session-scoped keys (`{id}`/`{session_id}`/…) are **not** available: `get-string` carries no session context today, and a `get-string --session-id` for session-scoped substitution is a deferred, larger change. The load-bearing invariant is preserved: **spt-core still never executes a string** — it substitutes and returns; the *adapter's own wrapper* executes the result. Canonical use: a harness hook dispatcher resolves its own packed binary via `get-string` (e.g. value `"{adapter_dir}/claude-spt hook"`) **once per session into an env var**, then runs it per-hook — so hook *logic* rides `spt adapter update` (it lives in the adapter binary) and the plugin's `hooks.json` + a thin static dispatch wrapper go static-forever. This **supersedes a rejected `spt api run-hook`** (which would have made spt-core itself execute an adapter handler — a new execution surface; rejected in favor of resolve-not-execute).
_Avoid_: session-scoped substitution through bare `get-string`; reading this as spt-core executing a string (it never does — the adapter wrapper executes the resolved value).

**keyword hints** (ratified 2026-06-12 — core milestone A):
<!-- [doc->REQ-MANIFEST-4] -->
Once-per-session usage/syntax hints, a first-class adapter feature: the manifest's `[hints]` section declares entries of `{keywords (literal default, regex opt-in), text}`; the adapter's user-prompt hook pipes the **full user message** to `spt api hint --session <id>` (stdin) and receives matched hint lines (`keyword hint for SPT adapter <name>: "<kw>"-->{text}`) for its context-injection channel. The daemon keeps a per-session seen-set — each hint fires **once per session** (a `/clear` mints a new session, naturally re-arming) — and emits at most **one hint per message**. **Tiebreak when a message matches multiple hints:** scan in declaration order and emit the FIRST match whose hint this session has not yet seen; if the declaration-order-first matching hint is already seen, fall through to the next unseen matching hint (the once-per-session and ≤1-per-message invariants both still hold). Emit nothing only when every matching hint is already seen. Employing the hook is the adapter dev's choice; profiles override/extend `[hints]` by leaf-replace like any section.
_Avoid_: unconditional static context (that's the adapter's own preamble); firing per-message.

**adapter update declaration** (manifest field):
<!-- [doc->REQ-UPD-9] -->
Each adapter manifest declares how spt-core should *ripple-update the adapter itself* (see Self-update). One of: **file-pull** (a plugin-directory lookup regex + a gh repo for the adapter's latest files — spt-core fetches + swaps), **delegated command** (a binary command the adapter owns, e.g. `claude.exe plugin update` — spt-core invokes it), or **gh_release** (the adapter ships its updates from its own GitHub releases). After initial bootstrap, the plugin no longer self-manages updates; spt-core conducts them. The **gh_release** avenue (since v0.8.0) declares `repo = "user/repo"` (plus an optional release `asset`, default `adapter.spt`, and an optional Ed25519 `signing_key`): spt-core compares the repo's latest GitHub release version against the installed adapter version and, when newer, fetches the release `.spt` (the same archive primitive as `spt adapter add --release`), then re-extracts and re-registers. Trust mirrors first-acquisition — HTTPS + GitHub when no key is declared; when a `signing_key` is declared, the fetched `.spt` is verified **fail-closed** against a detached signature published as a sibling release asset `<asset>.sig` (lowercase-hex Ed25519 over the raw archive bytes), and the new `.spt` is verified against the **installed** manifest's key (key continuity). A bad or missing signature refuses the update — the staged bytes are deleted, never extracted. The gh_release update is driven by the `spt adapter update [name]` command (with no name it sweeps every registered gh_release adapter); the network fetch lives in the CLI, never the daemon.

**adapter packaging & live update** (v0.13.2; ADR-0024, ADR-0025):
<!-- [doc->REQ-ADAPTER-GH-TRANSPORT] -->
A `.spt` may be **multi-platform**: shared `manifest.toml` + `strings/` at the root, role binaries under per-target-triple subdirectories (`x86_64-pc-windows-msvc/`, …); install/update extracts the shared root plus only the current node's triple, flattened into `install_dir`, so flat `<install_dir>/<program>` resolution is unchanged. It stays one signed asset (`adapter.spt`, plain-tar or gzip); a multi-platform archive missing the recipient's triple is a typed `NoArtifactForPlatform`. Large adapters may still split per-platform. The `gh_release` fetch transport is **`auto`** by default — the pre-authorized `gh` CLI when available (the path for **private** adapter repos: `gh` honors both OAuth and `GH_TOKEN`, so spt never custodies a token), else direct HTTPS (public). An adapter update is **live and daemon-coordinated**, the adapter analog of brain self-update: for an endpoint with a running **resident adapter binary**, the daemon stops it (releasing the OS file lock that otherwise fails an overwrite on Windows), swaps **only files whose CRC changed**, **refreshes the endpoint's in-memory manifest** (binaries and manifest stay on the same page), then restarts it — the endpoint itself never restarts.
<!-- [doc->REQ-ADAPTER-UPDATE-MESSAGE] -->
An optional **`[update].message`** (avenue-agnostic) is a plain multi-line operator notice surfaced to stdout, markdown-rendered (the helpfmt prose path), **only when an update is actually applied** (the version changed) — never on a no-op. It is read from the newly-installed manifest with no `{key}` substitution; its use is to announce a post-update action (e.g. "run `/reload-plugins` in any ongoing sessions").

**composite update — `[update.post]`** (ratified 2026-06-25, v0.16.0 update-arc grill; ADR-0029):
An optional **avenue-agnostic** post-step `{ command, self_verifies }` spt-core runs **after** the primary avenue resolves, in the same `spt adapter update` — so one lever pulls the adapter `.spt` (`gh_release`) **and** runs a delegated reconcile (e.g. an adapter's `claude plugin update` cross-platform binary). It **runs unconditionally** (even when the adapter version did not change — the post-step's own idempotent check decides), receiving a one-line **stdin JSON** seam (`adapter_applied`, `adapter_name`, `profile_name`, `version`, `previous_version`, `adapter_dir`; additive keys). Its **stdout arbitrates the notice**: custom text **supersedes** `[update].message` (a dynamic notice); a reserved sentinel fires the static `[update].message`; empty prints nothing — precedence dynamic > sentinel/manifest > nothing. Exit code is orthogonal (0 ok / nonzero failed). With no `[update.post]`, the existing `adapter_applied`→`[update].message` rule is unchanged; a **failed** post-step warns loudly and falls back to that rule. **Failure-isolated**: a committed `gh_release` pull is never rolled back if the post-step fails (independent channels). Trust mirrors the `delegated` avenue (`self_verifies`).

**resident adapter binary**: an adapter-owned process spt-core keeps alive for an endpoint's lifetime (today the `[message-idle-translation-binary]`), as opposed to **ephemeral** adapter binaries — the Psyche loop (daemon-hosted, ADR-0004), the `[digest]` extractor, `[session.*]` runners, hooks — which spawn on demand and pick up an update on their next invocation. Only resident binaries are stopped/restarted on a live update; ephemerals self-heal.
_Avoid_: calling the Psyche loop or an on-demand extractor a "resident" binary; "restart the endpoint" for what is a per-binary cycle.

**session-invocation declaration** (manifest field, noted for spt-plugin parity):
How the harness spawns agent sessions, including Psyche and echo-commune sessions. For the rebuilt spt-plugin, Psyche and echo communes must migrate **off `claude -p`** (imminent Claude Code billing changes) to headless `claude` sessions (`--resume` for the Psyche). This is an adapter/manifest concern, not a core concern, but the parity milestone must carry it.

### Manifest seams (outbound contract, detailed)

Governing principle: **SPT is not a harness.** Model choice, billing shape, harness-internal env, and harness-internal context are entirely the adapter's concern, expressed inside the adapter's own command templates. spt-core owns only the template *mechanism* (substitution keys), the substitution *values* it is responsible for, and the surrounding lifecycle. Env for the *endpoint binary itself* is auto-handled by spt-core/broker; env for the *agent running inside* that binary is the adapter's config (e.g. the CC plugin config).

**spawn-session seam** — launch a new agent session on this node. Manifest provides: a command template; `cwd`/project; a `headless` flag (optional, default false — for the GUI's resume-of-compatible-adapters); a `resume` flag (optional); and the `commune` + `signoff` file directories relative to `cwd` (so the daemon knows where to watch). Substitution keys spt-core can supply: `{id}` and, optionally, a spt-core-generated valid session UUID (e.g. injected as `--session-id {uuid}`) so an adapter can skip the post-spawn seam. spt-core does **not** inject: harness-internal env (broker handles binary env; adapter handles in-session env), and **no initial-context handoff** (not needed at start — the agent is prompted for context once its session is up; the first commune populates it).
- **id resolution:** `id` is optional. With no id, spt-core reproduces today's no-id `/spt:live` behavior — run the lone live agent if that's all the project has; show a picker with proposed default IDs if the project has none; let the user choose if there are several.

**post-spawn seam** — the just-launched binary calls an spt-core command on boot (via the adapter's SessionStart-equivalent hook) to bind itself. Needed because the harness's own session id usually isn't known until after the binary runs. Payload: the harness `session_id` (when binary-generated rather than spt-core-injected); the `parent_pid` (the stable session-binding anchor — see KNOWN-HAZARDS 2.1); an endpoint identity/type confirmation; optionally a local HTTP port the binary listens on (for HTTP-mode input delivery, below); and a **boot nonce** (a generation/boot discriminator so a respawn-after-crash bind can't be confused with a stale duplicate — guards KNOWN-HAZARDS 2.4). The call flips the perch from skeleton → live.

**post-spawn is optional only under a strict commitment:** an adapter may forgo post-spawn *only* if it (a) injects the spt-core-generated session UUID at spawn AND (b) guarantees the launched-process pid IS the stable session-binding anchor (no wrapper-script / subprocess pid indirection). If either does not hold, post-spawn must fire to report `session_id` and/or `parent_pid`. UUID-injection alone suppresses only the `session_id` reporting, not the binding.

**spawn-psyche seam** — two command templates: fresh-start and resume (the resume template includes `$session_id`). Both include `$psyche_prompt` — the revival essentials spt-core feeds the Psyche (timestamp, incoming event envelope). Everything else is the adapter's: model selection (in its template), and any harness-specific instructions the Psyche needs (Write-tool usage, commune dir) supplied as a static preamble before `$psyche_prompt` or as adapter SessionStart additionalContext. spt-core owns `$psyche_prompt` content; the adapter owns the rest.

**history subsystem** (covers echo-commune source logs, resume briefs, and Shell logs) — two supported paths:
- **Path A — adapter-owned logs.** Manifest declares a locate-template (keyed by `$session_id`) + a **normalize-command the adapter owns** that emits spt-core's expected normalized format. spt-core docs must teach adapter devs how to build a conformant parser.
- **Path B — spt-core-native history store.** spt-core exposes a `history-log` command/API; the adapter writes its logs to spt-core in the native format and spt-core stores them. Rationale: spt-core needs its own log store for Shells anyway, and this simplifies integration for flexible/DIY harnesses.
- The **echo-commune seam** is then just a command template (adapter picks the model) that consumes whichever history path is configured for the session.
- **Why adapter-owned normalize over spt-core built-in parsers** (grounded in a Codex-CLI vs Claude-Code comparison): transcript formats diverge sharply and move fast. Claude Code = one flat JSONL per session, project-partitioned, locatable directly from the session id (`~/.claude/projects/<hash>/<id>.jsonl`). Codex = date-partitioned **rollout files** (`~/.codex/sessions/YYYY/MM/DD/rollout-<ts>-<id>.jsonl`) where the id is only a filename *substring* (must recursive-glob to locate), a **3-level tagged envelope** (`{timestamp,type,payload}` → tagged `ResponseItem` → tagged `ContentItem` with distinct `input_text`/`output_text`), tool calls as separate top-level items, a **second SQLite index that can desync from the files**, Limited/Extended persistence modes that change which records exist, and session **forking** (`forked_from_id`) requiring lineage-following. A built-in spt-core parser per harness would be an unbounded, version-fragile maintenance burden. Path A keeps that complexity in the adapter; Path B is the escape hatch for harnesses whose native logs are too painful to locate/parse — the adapter just pushes normalized records to spt-core's native store instead.

**activity/idle detection** — **not** PTY-quiescence (insufficient: e.g. CC's AskUserQuestion stalls the PTY while holding stdin and needing nuanced input) and **not** a manifest-declared idle signal. Instead, the adapter calls spt-core activity/idle commands at the right moments (from its hooks); those commands manage activity/idle **sentinels inside the session perch**. The idle state lives in the perch, owned by spt-core via the commands API exposed to adapter devs.

**inject-input seam** — message delivery into a running session. Configurable per activity-state (activity / idle / both); multiple methods, any combination:
- PTY injection (with or without key/submit sequences) — spt-hosted topology;
- adapter hooks calling spt-core poll commands;
- an in-adapter-session child relay (à la CC's Monitor tool);
- adapter manifest requesting HTTP POST delivery to the endpoint binary on a local port (shared via the post-spawn seam).
Note: even spt-hosted sessions default to hook injection (or the adapter's equivalent) as the non-disruptive path **during activity**; some adapters prefer the in-session relay regardless of topology.

**activity-gated delivery** — an inbound message routes by the receiver's activity sentinel (above). While the endpoint is **active**, the message spools for the receiver's own hook-poll to drain (non-disruptive — the *active window*). On **idle** (or an idle transition before a hook drains it), it delivers immediately — translation binary (spt-hosted) → relay-poll (either topology) → spool, in that fallback order (the *idle window*). The send-side axes below modulate which of these two windows a message is eligible for.

**message delivery axes** — a sent message carries independent modifiers on orthogonal axes; it is **not** a single "type". The flag on each axis defaults to the unrestricted value:
- **delivery window** (*when*) — **default** (both windows; delivers in whichever fires first) · **idle-only** (held for the idle window; delivered immediately if already idle) · **active-only** (active window only — the receiver's hook-poll; never wakes an idle agent). *active-only* is the renamed legacy **deferred** (the `deferred=1` spool column + `api poll --include-deferred` are its internal/adapter-facing names). Mutually exclusive.
- **channel restriction** (*through what*) — **unrestricted** (any configured inject method) · **prefer-native** (the translation binary if one is running, else fall back to the standard methods) · **force-native** (the translation binary and nothing else — no fallback, no spool-to-another-method). Mutually exclusive; composes with the window. *"Native"* = the `[message-idle-translation-binary]` PTY channel.
- **persistence** (*how long it waits*) — **durable** (default; spooled until delivered or TTL) · **ephemeral** (dropped if it cannot deliver in its accepted window — at the moment the window opens with no live carrier, or at TTL, whichever is first). Ephemeral is the **only** path permitted to drop silently (the REQ-HAZARD-IDLE-SILENT-NONDELIVERY carve-out); every non-ephemeral path spools and reports non-delivery. <!-- v0.15.0 PARTIAL (W3): ephemeral evaporation covers the spt-hosted-binary no-carrier-at-window leg (the idle-transition drain drops ephemeral rows the binary cannot take) + the TTL leg (purge). The harness-hosted relay "window opens with no live *listener*" leg is NOT yet delivered — it needs relay carrier-presence detection (same separate-concern shape as relay activity-gating, deferred). Until then an ephemeral message to a harness-hosted relay endpoint with no live listener spools durably rather than evaporating. -->
Window restricts *when* delivery is accepted, channel restricts *which method* carries it, persistence restricts *how long* it waits — they compose freely (e.g. `force-native` + `active-only` = the binary injects during the active window, never idle; `force-native` + `ephemeral` = binary-or-nothing).

**message metadata (`json`)** — a sender may attach an opaque JSON metadata block (`--json-payload`), carried as a single attr-escaped `json="…"` envelope attribute **alongside** (never replacing) the body. spt-core never interprets it — pure verbatim passthrough across every rail (spool / TCP / WAN / EVENT-PART), parsed only by the receiving adapter (its hooks and/or translation binary). Collision-proof by construction: the structured data lives **inside** the single `json` value, so it can never forge spt-core's control/identity attributes (`from`, `type`, …). Available to any sender — it confers no spt-core authority; what a custom field *means* is the receiving adapter's trust decision (the same posture as `from`-is-never-payload-trusted).

**resume-session seam** — two distinct forms:
- **fresh-with-preload:** resume with *cleared* context (a fresh session) + psyche-download. Accepts a `$psyche-context` key to launch the fresh session with the psyche-download preloaded — or the adapter instead pulls it via an spt-core command in its SessionStart hook. <!-- [doc->REQ-RESUME-CONTEXT-PULL] --> That command is **`spt api psyche-download <id> [--session-id <sid>]`**: it emits the durable resume brief (role → live-context → project-context, project resolved from the perch's bound cwd) to stdout for the adapter's SessionStart hook to inject as additional context, and APPENDS any **not-yet-synthesized** commune/signoff drop as a `<pending-commune>`/`<pending-signoff>` slice AFTER the durable tiers — closing the window where a just-dropped commune is invisible to a resuming agent until ingest synthesizes it. The append is **presentation-only** (it reads the drop, never writes the store — spt-core's ingest stays the sole writer, `REQ-HAZARD-DROP-FILE-SINGLE-WRITER`) and **self-clearing** (once ingest consumes the drop the slice vanishes). v0.15.0 realizes the pre-synthesis signal as watched-dir drop-file presence (Tier-1); the richer legacy payload (drift-stamp / `<current>` / memformat / pulse-log) is a deferred Tier-2 parity item.
- **continue-existing:** resume an existing harness session under the adapter (its native resume).

**capability declaration** — which endpoint types a harness/node can host (a Pi node might host only Shells, never a LiveAgent). Static manifest list, consumed by the subnet registry so a node advertises its hostable types. Exact shape is design-open.

**adapter-update seam** — file-pull or delegated-command (see Self-update). Locked.

There is no separate "model/billing" seam — those live inside the adapter's spawn/psyche/echo command templates. SPT never selects a model. The full manifest schema is `docs/MANIFEST.md`; key model-level facts from it:

- **Command templates are opaque.** spt-core never parses out a model/tool/flag — the adapter writes the whole command line; spt-core fills substitution keys and runs it.
- **A command template's program token resolves against the adapter install dir before PATH (since v0.8.0).** A `.spt` adapter ships its built binaries to the adapter's install dir (`adapters/_github/<safe>/` via `--release`/`--github`, or the record's `source_dir` under copy-mode), so a bare program name (e.g. `claude-spt-digest …`) binds to the shipped binary first and falls back to PATH when absent — a `.spt` that ships its binaries is **self-contained**, needing no PATH placement. <!-- [doc->REQ-INSTALL-11] --> Applies to the `[digest]` extractor, the `[session.psyche_init]` runner, and the `adapter digest-proof` tool; the install dir is the registry record's `source_dir` (precise) for the daemon-resolved paths — the `[digest]` extractor and the daemon-hosted `[session.psyche_init]` runner, where the brain's live-host reconcile resolves the Psyche program against the matching record's `source_dir` — and the `--manifest` file's parent dir when an explicit `--manifest` overrides on the api seam.
- **Hook output capability is declared per harness-event** (`can_inject`). CC's Stop hook cannot inject context — that single fact drives the echo-gate sentinel + relay fallback. The manifest expresses it so spt-core knows when to fall back.
- **Env injection is asymmetric** (file-bridge-only-when-not-launcher, applied to env): spt-hosted sessions inherit env from the broker that spawns them; harness-hosted sessions need the harness's declared env channel. With `spt` on PATH the env table is small.
- **Cross-adapter fallback** is a **node-wide setting**, not a manifest field: if a Psyche/echo invocation under one adapter is rate-limited, spt-core falls back to another adapter (e.g. `ccs` — its own adapter, not a binary-swap). <!-- [doc->REQ-MANIFEST-6] --> A fallback **target is addressed as `<adapter>:<profile>`** (not just a bare adapter_name) and resolves through the one composite-addressing resolver (`registry::resolve_option`), so a fallback may select a shipped or local profile (`ccs`, `ccs:<profile>`) exactly as any other adapter-option read site does. *Contract only at M12-W3 — the addressing resolves; the node-wide setting + its rate-limit invocation belong to the consuming milestone (no reader exists yet, so no config field is added).* Adapter-agnostic IDs make this safe; an endpoint's Psyche may run under a fallback adapter temporarily.
- **Config knobs** (pulse period, echo-commune window, route-guard window, daily refresh) are spt-core **global settings** with optional **per-endpoint override** — never per-adapter.
- **Event-block vocabulary and file-drop filenames are fixed spt-core constants** (documented for adapter authors), not manifest-configurable. <!-- [doc->REQ-RESUME-CONTEXT-PULL] --> This includes the **checkpoint sentinel `!!checkpoint!!`** — the agent-checkpoint trigger an adapter embeds in a commune/signoff drop body (one bare token = checkpoint with default wake; a `!!checkpoint!! <text> !!checkpoint!!` pair makes the inter-marker text a custom wake directive). It is spt-core control metadata: spt-core STRIPS every occurrence (keeping the inter-marker text) before the drop body reaches agent context, at BOTH points it can — the resume `<pending-*>` presentation (pre-synthesis) and the durable tier write (post-synthesis) — so the marker never surfaces or re-triggers.

### Inbound `api` surface (detailed)

All commands below are `api`-prefixed (machinery-facing). Every `api` invocation **and** every manifest carries a unified **`adapter_name`** string (e.g. `claude-spt`) identifying the owning adapter. This is load-bearing: one daemon hosts endpoints from multiple adapters (`claude-spt`, `spt-codex`, `spt-pi`), so the daemon resolves an endpoint's manifest + seams (history normalize-command, inject method, update avenue) by its `adapter_name`; adapter-update ripples target by it; capability lookup and telemetry key on it.

<!-- [doc->REQ-API-4] -->
**Manifest resolution from `--adapter` (since v0.8.0).** `spt api <cmd> --adapter <name[:profile]>` resolves the registered adapter's manifest, `:profile` overlay, and install dir from the registry when `--manifest` is omitted — a registered adapter's `api` calls need only `--adapter`. `--manifest <path>` becomes an optional **override** (an unregistered or local-dev manifest): when present, the manifest loads from that file and the install dir is its parent directory; when absent, both come from the registry record (the install dir is the record's precise `source_dir`). An unregistered adapter with no `--manifest` degrades to no-manifest rather than failing.

- **`api bind`** — post-spawn boot bind (payload above). Skeleton→live.
- **`api listen`** — *long-running* relay/poll listener that an adapter-owned (harness-hosted) session owns as a child process; streams the daemon's events to the session's stdout. Distinct from the short-lived `api poll`. This is the heir to today's Monitor-bound `$LIVE start` poll loop.
- **`api poll`** — short-lived drain of queued messages for a session (the hook-injection delivery path). `--include-deferred` optionally also drains deferred rows, for adapter flexibility (default excludes them — KNOWN-HAZARDS 1.4/4.4).
- **`api state <busy|idle>`** — adapter reports session activity; writes the activity/idle sentinel in the session perch. By default `api state idle` also writes the **echo-commune gate sentinel** (`.more-done`-equiv — modern spt couples them); `--no-gate` suppresses that coupling, and a standalone **`api echo-gate <set|clear>`** gives granular adapters explicit control over when echo communes may fire, independent of idle.
- **`api worker-start`** / **`api worker-stop`** — Worker (subagent) perch create/teardown under the parent (nested, registry-tracked).
- **`api worker-poll`** — a Worker (subagent) receives its queued messages (inbound from Self or sibling Workers).
- **`api boundary <clear|compact>`** — context-boundary report; **carries the new `session_id`** (it rotates on `/clear` or `/compact`), so the daemon rebinds the perch to the new session id while keeping the stable identity + `parent_pid` anchor. Authors a **Self-resume commune** (resume the Self session → commune file-drop) rather than a background echo — strong live-context signal at the boundary (see `docs/CONTEXT-MEMORY.md`).
- **`api session-end`** — session stop/crash report → soft teardown by default (preserve perch + spool + tracked history for recovery — KNOWN-HAZARDS 6.2). **`--erase`** instead hard-wipes the perch and tracked history (for ephemeral/secondary adapters that act as robust agent-spawned-agent surfaces).

**`spt endpoint purge <id>`** (CLI, not `api`) — the standalone, formal **full teardown**: wipe an endpoint and *every* record keyed on it. It is the dev/CI sibling of `api session-end --erase` (which is adapter-triggered at session end); `purge` is the explicit operator/test command for clean setup-and-reset. **Deliberately NOT consent-gated** — a local dev/test op, never a peer-visible action. **Offline-only**: it refuses a live / daemon-hosted endpoint (deleting records out from under a running host would let the daemon re-create or re-host mid-purge); **`--force`** stops it first (→ the daemon reconcile un-hosts it and reaps its Psyche) and then purges. **`--yes`** skips the interactive confirm (the CI path); purge refuses removing the **caller's own running id**. It is **node-local** — purge reaches only *this* node's records; a remote endpoint's records are unreachable and its subnet-registry rows decay on their own via the epoch-lease eviction. It removes: the perch directory **tree** recursively (incl every nested `{id}-psyche` / `{id}-w*` / shell child — info.json, ready marker, the `sessions.log` ledger, spool, the idle/echo-gate sentinels, the auth token); the registry address; the **context store** (`a-<id>` branch + worktree and the `<id>/` rows in every `p-<project>` branch — the same path `endpoint fork --delete-source` uses); and the node-local trust rows keyed on the id (access + visibility). Implementation-wise it is `fork --delete-source` generalized (recursive perch-remove + unregister + context `remove_endpoint`) plus the trust-record cleanup, sharing `endpoint rename`'s record-set enumeration and offline-only gate.
_Avoid_: consent-gating it (it is intentionally ungated, for CI); treating it as a sync/remote op (local-only); a soft variant (purge is always the hard, full wipe — soft teardown is `endpoint stop`).
<!-- [doc->REQ-ENDPOINT-PURGE] -->

- **`api history-log`** — Path B: ingest normalized records into spt-core's native history store.
- **`api presence`** — adapter reports user interaction → updates the presence datum `(last_active_node, last_active_endpoint, ts)`. In the spt-hosted topology, presence is **also** updated by the broker *detecting* (sensing, not watching/logging) user input on a held PTY — privacy-preserving (it notes that input occurred, records no content).
- **`api emit --type <sensory_type> <payload>`** — a broker-launched **Shell** binary pushes a sensory payload to its owner agent (owner known from `api bind`; REST-only, never spooled). See the Shell model.

**Not `api` commands — file-drop flow:** `commune` and `signoff` are deprecated as commands (modern SPT) in favor of file drops. The agent/adapter writes `<id>-commune.md` / `<id>-signoff.md`; the daemon watches the manifest-declared commune/signoff dirs (the spawn-session seam fields), ingests, and deletes (drop files are daemon-owned single-writer — KNOWN-HAZARDS 6.4). These stay off the `api` surface and the agent surface alike.

### Startup flows (the two topologies)

**Adapters never resolve `$SPT_HOME`.** spt-core install registers its binary directory on the system-wide PATH, so adapters call `spt api …` on any OS without path math. All harness↔daemon bridging goes through `spt api` commands (the daemon is always running, or auto-started — below), so there is **no adapter-written file** in the bind path.

**Harness-hosted (e.g. spt-plugin; the harness binary is user-launched, harness is the parent).** Key constraint: the SPT *live agent* does not exist until the agent invokes start — the `live_id` isn't chosen at session boot, and `$LIVE start` is itself invoked *behind the Monitor tool*, so it becomes the long-running relay. So binding cannot happen at SessionStart directly. A **seed record** (daemon-held, in-memory — not a file) bridges the gap:
1. The harness's SessionStart hook calls **`spt api seed --pid <parent_pid> --session-id <sid> [cwd]`**. The daemon records an ephemeral in-memory **seed entry** keyed by `parent_pid` — the session details the spt-hosted topology would share directly, minus the not-yet-chosen `live_id`. The seed is **adapter-agnostic**: it carries no `adapter_name`. <!-- [doc->REQ-START-5] --> *Which* adapter/profile a session belongs to is resolved later, at bind, as a read against the live registry (below) — so one SessionStart hook seeds correctly no matter which harness adapters are installed, and an `adapter add` after the seed is never missed. In-memory (not a file) avoids drive churn and the `$SPT_HOME` resolution nuisance; seeds are consumed within seconds, so persistence across a daemon restart is unnecessary (re-fired on the next SessionStart if needed).
2. The agent runs `/spt:live <id>` → the adapter's `$LIVE start <id>` alias = **`$SPT listen <id>`** (= `spt api listen <id>`), invoked via Monitor. It self-discovers its `parent_pid`, the daemon matches the seed entry by that pid (validated against `session_id` to defeat PID-recycling — KNOWN-HAZARDS 5.1), **resolves the owning adapter/profile** (the bind-time resolution below), creates/revives the perch binding `live_id` ↔ session details, then enters the long-running relay loop streaming events to stdout.
3. The always-on daemon holds the perch, spool, registry, and daemon-spawns the Psyche (via the spawn-psyche seam) — no separate wrapper. The relay is purely the delivery pipe.
   - Seed entry refreshed on each SessionStart (keeps `session_id` current across `/clear`, since `parent_pid` is stable while the harness process persists). If a harness has no SessionStart-equiv, `start` may carry the details directly as args — the seed is the preferred convenience, not the only path.
   - The same seed + bind-time resolution serves a **ReadyAgent** bringup (`$SPT ready`/poll), not just a LiveAgent — a harness-hosted ready agent is seeded and resolved identically (it just binds a poll listener, no Psyche).

**Bind-time adapter/profile resolution (ADR-0021).** Because the seed is adapter-agnostic, `listen`/`poll` resolve the owning adapter/profile when they bind, as a pure read — never a seed-time snapshot that could drift. `--adapter <name[:profile]>` is an **optional override** on the `api` group (an explicit choice for adapter dev/iteration); omitted, resolution runs:
1. the seed's `parent_pid` → that process's **executable basename** (case-insensitive, `.exe`-stripped);
2. **candidate adapters** = registered `kind="harness"` adapters whose **`host_binaries`** (the manifest match-key) contains that basename; <!-- [doc->REQ-MANIFEST-8] -->
3. **profile**: the durable **active-profile pointer** (`spt adapter use <adapter>[:profile]` writes it; one default per `host_binary`) wins; unset → the freshest candidate adapter by `registered_at_ms`, base profile (a specific profile is only ever chosen by the pointer), name-ascending on ties; <!-- [doc->REQ-INSTALL-12] -->
4. zero candidates → a friendly error naming the binary and the `--adapter` escape. The pointer is a standing user preference (durable on disk, never auto-written by install/update); the seed is ephemeral — see ADR-0021.

**Daemon auto-start:** the daemon is per-machine always-on (OS-service registered), but any `spt api` invocation that needs it will **start it if absent** (fresh boot, crash, never-installed-as-service). `$SPT listen` for the first SPT session on a machine thus transparently spins up the daemon. Ensure-running lives in the `api` layer generally; `listen` is the reliable anchor.

**spt-hosted (terminal wrapper / GUI launcher; the daemon launches the binary into a broker PTY):**
1. The frontend/CLI launches the agent: the daemon runs the **spawn-session** command template into a broker-held PTY.
2. The binary boots and fires **`api bind`** (or skips it under the strict UUID-injection + stable-pid commitment). **No catalyst/seed file** — the daemon is the launcher, already holds a direct channel (it spawned the process and owns the PTY), so a file round-trip would only add drive churn for no benefit.
3. The daemon delivers events; method is **manifest-configurable per activity-state** — direct PTY injection, or a relay even here (some adapters prefer a relay over PTY injection for idle delivery), or HTTP. During *activity*, delivery still defaults to the non-disruptive hook-injection path, not raw PTY writes.
4. Psyche is daemon-spawned, same as above.

So the old `$LIVE start` splits by topology: harness-hosted = SessionStart writes an adapter-agnostic seed → `$SPT listen <id>` consumes seed (by `parent_pid`) + resolves adapter/profile (ADR-0021) + binds + relays — legacy parity (`$LIVE start <id>` → `$SPT listen <id>`, no mandatory `--adapter`); spt-hosted = daemon spawn-session + `api bind` (direct, no file). The asymmetry is the file-bridge-only-when-no-direct-channel principle.

**Env-var aliases:** adapters inject clean env-var aliases for in-session invocation (heirs to today's `$OWL`/`$LIVE`), e.g. **`$SPT` = `spt api`** so a Monitor-bound call reads `$SPT listen <id>`. spt-core supplies the subcommands; the adapter supplies the env aliases (manifest philosophy).

### Endpoint types

Each perch advertises an **endpoint type** — a tag that says what shape of entity lives at that perch and what operations it accepts. The set of day-one types:

**ReadyAgent**:
Minimal SPT participant — a perch + a poll listener, no Psyche, no live-agent wrapper. Direct heir to the sister project's "ready agent".

**LiveAgent**:
A Self with a Psyche companion. Composite logical actor; addressable as one ID, but its component perches (the Self's, the Psyche's) live independently. Direct heir to the sister project's "live agent".

**Psyche**:
The Psyche companion's own perch, distinct from its paired LiveAgent's perch. First-class endpoint type so messages addressed to a LiveAgent's Psyche route directly without ambiguity. **Liveness is daemon-held, not PID-based.** Unlike the sister's separate wrapper *process*, the Psyche (under ADR-0004) is a **loop inside the daemon** — it has no dedicated liveness pid, and its summarizer subprocess is ephemeral. Its perch is alive iff the daemon's endpoint table says so, surfaced via a daemon-managed `status` field on `info.json` (the same pattern Shells use), **never** an `is_process_alive(info.pid)` check. The M1/M2a interim (no daemon) keeps it a real process, so per-pid liveness holds *interim*; M3 replaces it with daemon-authoritative liveness (KNOWN-HAZARDS 2.5).

*I/O & trust boundary (ADR-0012):* the Psyche is a **sandboxed** actor — it may read and write files but **cannot send messages or reach the network itself**. Its inbound context arrives two ways: events/messages the daemon hands it, and **commune/signoff file-drops** (Self → daemon → Psyche; the *Summarizer* authors the commune delta). Its **sole outbound** is **reply/notify intents** the daemon relays as its **outbound proxy** — emitted as `<EVENT type="reply">`/`<EVENT type="notify">` (the shared envelope grammar). A *reply* reaches **only the sender it answers**; a *notify* reaches **only the agent's own user** — the Psyche carries no target and cannot address arbitrary endpoints (the daemon strips/re-stamps `from=` before relaying).

*Psyche-host health — harness-reachable failure signal (v0.8.1, REQ-HAZARD-LIVEHOST-BOOT-RACE):* a LiveAgent's `status=online` is daemon-authoritative liveness and **stays authoritative** — but it does not by itself prove the daemon hosted a Psyche. When the brain's live-host reconcile fails to spawn the Psyche (e.g. the adapter's psyche binary is absent from its install dir, or the net-less boot-race starves the host), that failure was previously **silent** — only an `eprintln!` on the brain's invisible stderr, while a harness (and a human via `spt endpoint list` / `whoami`) reads **perch state**, never brain stderr. The Self perch's `info.json` therefore carries an additive, N-1-safe `psyche_host_error` field (`{reason, ts, attempts}`): a **current-state** stamp the reconcile **overwrites on each retry** (incrementing `attempts`) and **clears on a later successful host** — never an append log. It is **independent of `status`**: an online live agent with no Psyche reads `status=online` **and** a `psyche_host_error`, so the boot-race is diagnosable from perch state. `spt endpoint list`/`whoami` render it inline as a `psyche-host: FAILED (...)` annotation after the authoritative liveness line.

**Summarizer**:
The ephemeral, cheap model that builds a **commune delta** from a Self's recent turns and feeds it *into* the **Psyche** as inbound context. A distinct actor from the Psyche — different (cheaper) model, fire-and-forget, **no perch** (not an endpoint type). It authors *commune* deltas only, **never** *reply*/*notify*.
_Avoid_: conflating with the Psyche; "echo-commune model".

**Worker**:
A subagent's perch under a parent LiveAgent. Created on subagent start, torn down on subagent stop. Replaces today's "working perch" concept; first-class type so cross-communication between a Self and its workers (and worker↔worker) is addressable.

**SptNode**:
A machine's participation in an SPT subnet, identified by an Ed25519 public key generated on first run. First-class so networking primitives can address nodes directly as message targets, not only as transport peers. The node identity and network endpoint are hosted by the machine's `spt-daemon` (see Networking), not a separate process.

<!-- [doc->REQ-EP-6] -->
**Gateway** (concept ratified 2026-06-11; registered via the open type system, first instance downstream):
A **human-backed endpoint** — a user's specialized window into the subnet from a device or surface with no conventional-harness compatibility. Nothing LLM-shaped runs there; the intelligence at the endpoint is the **user**. Addressable like any endpoint (receives digests/messages, sends via the normal verbs) and may **own Shells** (it is an owning endpoint — see §Shell model). Distinct from a Shell: a Shell is *driven from elsewhere*; a Gateway *originates* interaction. No `tracked/` mind, no Psyche (LiveAgent affordances). First instance: the `spt-lecturn` adapter's Playdate endpoint (own repo).

<!-- [doc->REQ-MSG-5] -->
A message sent from a Gateway carries **the user's authority** — it *is* the user speaking through a device — and is delivered typed **`user-msg`** (ratified 2026-06-12) so receiving agents weight it as user instruction, not peer-agent chatter. The type is **identity-gated, never payload-trusted** (the KH 7.3/7.5 posture): the daemon permits `user-msg` only from user-backed origins (a Gateway endpoint, the local user's own CLI) and re-stamps an agent-family sender's `user-msg` down to plain `msg` — authority comes from who you are, not what you wrote.

<!-- [doc->REQ-MSG-6] -->
_Implemented posture_: the **local** user-backed origins are honored end-to-end — a locally-hosted Gateway endpoint (info.json `state="gateway"`) and the local user's CLI (M9-T4/T5). The **cross-node WAN** path is being completed (trust posture **ratified 2026-06-13**): the **subnet membership boundary is the trust boundary**. A subnet is a collection of machines the user already trusts, so a `user-msg` arriving over the subnet from a **Gateway-typed** origin is honored as the user's authority; the daemon does **not** defend against a subnet member *forging* the Gateway type — an in-subnet compromise is out of scope by construction (if the subnet is breached at all, the trust model is already void). The origin's type is read from its advertised registry **`endpoint_type`**, resolved at the receive funnel against the **QUIC-handshake-proven origin node** (never wire bytes — that keying is correctness, not an extra trust layer). Until a node advertises `endpoint_type` (N-1 rollout grace), its WAN `user-msg` re-stamps to plain `msg` — a graceful degrade, not a trust gate. Mechanism: `Instance.endpoint_type` (additive, riding the existing epoch-leased registry replication) + advertise population + `receive_wan` resolve flipping the already-plumbed `origin_user_backed`. (An earlier per-node "user-surface trust set" proposal was vetoed as overkill — the subnet boundary already is that trust.)

A Gateway endpoint binary is revived by **existing machinery only** (settled 2026-06-12, two corrections deep): while running, the bridged device's link liveness drives ordinary **instance state** (sustained device silence → dormant; device contact → active — the driver-attach rule). Across a node restart, revival rides a **co-located shell's wake-watcher** — the Gateway typically owns a shell instance on its own gateway host; that shell's offline wake-watcher (one of the two classes of third-party binary spt-core boot-launches — the other is the [[AlwaysOnEndpoint]] resident binary) holds the device-contact surface and fires the standard **wake resolution** ("owner suspended → revive the owner"). No Gateway-manifest watcher, no autostart flag, no new mechanism.
_Avoid_: calling a Gateway a Shell or an agent; "console", "remote".

**PresenceChannel** (broker endpoint — concept locked, impl deferred past v1):
A *broker* endpoint, not an interaction surface. Job: (1) **presence resolution** — track which node + endpoint the user most recently interacted with; (2) **shell brokering** — locate/instantiate the right Shell on that node and relay between the agent and the user. An agent "just knows how to reach the user" by firing at its PresenceChannel; the channel figures out the rest. Also a durable, **shell-agnostic 2-way thread**: messages persist in the channel, not in any one Shell, so the user can be sent a message via a phone messaging-Shell and surface/continue that same agent conversation later at a GameRobot Shell. Shells are interchangeable I/O windows onto the channel's thread.

Three interaction styles:
- **dispatch** — fire-and-forget: "reach the user with this payload"; channel delivers via the best available Shell.
- **bind** — sustained drive: "give me a Shell of capability X"; channel instantiates and the agent drives it directly until teardown. Supports operating a *specific* Shell regardless of where the user currently is (agent transience).
- **thread** — the persistent conversation that floats across Shells; the user can pick it up from any Shell, and 2-way payloads (text/audio/image/video, subject to the Shell's supported types) flow both directions.

Presence datum: `(last_active_node, last_active_endpoint, timestamp)`. The `last_active_endpoint` field lets an agent choose between messaging that specific endpoint vs. driving a parallel instance of itself.

**AlwaysOnEndpoint** (always-on endpoint; concept ratified 2026-06-21 — core kind, first instance downstream `spt-discord`):
A **resident, addressable endpoint that hosts no mind.** Its binary is daemon-**supervised continuously and runs independent of any agent's liveness** (up even when zero agents are online) — unlike an *agent endpoint* (a hosted mind with a Psyche + `tracked/` context) and unlike a **Shell** (single-owner, *driven*). It is **two-way addressable**: agents message it (to drive whatever external surface it fronts) and it messages out — notably it may call `endpoint wake <id>` to draw an offline agent online (wake authorization is **target-side**, so no special caller right is needed — see the wake-watcher/sleep-wake model). Declared by an adapter's **`[always-on]`** manifest section; the supervised binary is **one per adapter-option** (`<adapter>[:profile]`), and each always-on endpoint created from that option represents **another channel that one binary serves** — it binds via the existing `api bind`, so one connection fronts many `#`-endpoints. It is the **second class of third-party binary spt-core boot-launches** (the shell wake-watcher was the first); supervision **reuses the wake-watcher scaffolding** (backoff, give-up latch, one-per-instance lock, orphan-kill, brain-side reconcile) **minus the offline-only flip** — always online, never resting (no dormant/suspended states). Addressed with a mandatory leading [[`#` always-on sigil]].
_Avoid_: calling it a Shell (owner-less + not driven) or an agent (no mind); "service" alone (it is addressable, not faceless); a sleep/wake resting model (it does not rest).

The endpoint type system is **open**: harnesses and downstream projects may register additional types beyond the day-one set. Closed-vs-open semantics for capability advertisement (what operations each type accepts, how routing decides eligibility) are deferred to the design phase.

### Agent endpoints vs Shells

Endpoint types split into two families:

**agent endpoints** — *host* an agent, backed by a harness. ReadyAgent, LiveAgent, Psyche, Worker. Something intelligent runs there.

**Shell** (first-class concept — model locked, concrete types deferred past v1): a *driven surface*, not an agent. Nothing intelligent runs at a Shell; a remote agent (on another node) drives it. A Shell is a "self-documenting" endpoint that advertises a **typed capability toolset** and a 2-way interaction relationship with the user. It may live behind a node on a platform with zero conventional-harness compatibility (e.g. a gaming handheld). Examples (all deferred): `GameRobot` (a 2D-world avatar — move, gesture, alert-symbol, request-screenshot, preload-message), an OS-notification target, an in-session-inject target. The earlier "presence blueprint" idea collapses into this: a blueprint *is* a Shell type + its capabilities.

For day-one, only the **seams** that let Shells exist later without a rewrite are in scope (see Networking / Instances and `docs/DEFERRED.md`):
- the open endpoint-type system (above),
- a messaging substrate whose payloads carry **typed operation commands + arbitrary file blobs** (text/audio/image/video), not just text envelopes,
- the Shell-vs-agent-endpoint distinction present in the type model so a Shell type can be registered later.

Concrete Shells live **downstream in shell-adapter repos** (first shipped: `spt-shell-notify`, the OS-notification shell — the mechanism itself delivered in v1); the PresenceChannel implementation and presence gossip remain deferred.

**Shells differ structurally from agent endpoints** (full treatment below): a Shell has a node-local perch but **no `tracked/` context** (no mind to sync); its logs are node-local; it is **adapter/platform-bound, not adapter-agnostic**; and its lifecycle is link/teardown, not the dormant/suspended resting model of agent instances.

#### Shell model (detailed)

A Shell is a **surface one agent controls** — embodiment of an agent in a new system/environment.

**Provided by a shell adapter.** There are two kinds of adapter, both manifested, both under `adapters/`: a **harness adapter** (`kind="harness"`, e.g. `claude-spt`) hosts *agent* endpoints; a **shell adapter** (`kind="shell"`, e.g. `GameRobot`) provides a *Shell* endpoint. `Shell` is the endpoint type; the shell adapter name is the provider, and its manifest defines the Shell's capability surface. A shell binary is **only ever launched by SPT** (the broker spawns it; never user-launched).

**Three channels** over the typed/binary substrate:
- **command** (agent→shell, durable/spooled) — the manifest's capability toolset; drives the shell and shell→environment interaction.
- **text+file** (2-way, durable/spooled; per-shell) — general payloads; for a shell it is a *forwarder* (harness adapters have this channel too). File transfers are progress-queryable (below).
- **sensory** (shell→agent, **REST-only, never spooled**) — images, sounds, and arbitrary descriptive sensory payloads ("movement completed", "obstructed", "bumped", temperature, energy level). Ephemeral — delivered to the agent's live session; dropped if the agent isn't live.
- **drive** (owner→shell, **REST-only, never spooled, latest-wins**) — continuous-control payloads steering a surface in real time (scroll/crank state, stick positions, avatar movement). The owner→shell mirror of **sensory**: ephemeral, dropped if the shell is offline, a missed frame is superseded by the next — spooled replay of stale control on relink would be actively wrong, which is why this is not the command channel. Commands = discrete + durable; drive = continuous + ephemeral. (Minted 2026-06-11, Gateway grill. Standing consumer: GameRobot-class continuous control; the lecturn's HID path moved to a **shell tunnel** when its driver generalized to USB/IP.)

<!-- [doc->REQ-SHELL-4] shell tunnel: a long-lived reliable-ordered link-bound QUIC stream pair carrying opaque bytes the taxonomy never reinterprets; manifest opt-in, not enveloped/MAC-framed/spooled; link-break closes it; reliable-ordered ⇒ on-LAN posture -->
Channels carry typed, taxonomy-interpreted payloads. Distinct from them, an owner↔shell link may also hold a **shell tunnel**: a long-lived, **reliable, ordered** byte stream (a dedicated QUIC stream pair bound to the link) for protocol traffic the channel taxonomy must NOT reinterpret — opaque wire protocols spoken end-to-end (first consumer: USB/IP URB traffic to a usbip shell). Not spooled, not enveloped; the link's lifecycle governs it — a link-break closes the tunnel. Reliable-ordered is the point and the cost: tunneled protocols cannot drop frames, so congestion surfaces as lag, never loss — acceptable only where the deployment keeps tunnels on-LAN. (Minted 2026-06-11, Gateway grill.)

<!-- [doc->REQ-SHELL-5] owner-type-agnostic: control-exclusivity keys on the owner endpoint_id, never the owner's endpoint type -->
**Owner-linked, exclusive.** Spawned by an **owning endpoint**; both linked; **only that owner `endpoint_id` may control it**. Agent endpoints are the common owner, but ownership is NOT agent-exclusive — any non-Shell endpoint type may own shells (ratified 2026-06-11, the Gateway grill: e.g. a **Gateway** owns the driven surfaces it steers; a future **Resource** endpoint could too). Control-exclusivity does not mean interaction-exclusivity: a Shell stays 2-way with its *environment* (user messages on text+file, sensory payloads inbound) — exclusive is *who commands it*. An owner and its shell **can link across nodes** (the shell perch lives on the shell's node; commands ride Iroh). One owner : many shells (`GameRobot-0`, `GameRobot-1`, …).

**Command-delivery method** is the shell adapter's choice (same modes as agent inject-input): HTTP REST, stdin-via-broker, or child-relay (`api poll`). The binary is always broker-launched, so binding is direct (`api bind`, no seed-file).

**Lifecycle = online / offline / torn-down.** A **link-break always closes the broker + shell binary** (with an optional manifest-declared **pre-close instruction** to the binary + a **termination timeout** for graceful shutdown). Then: an **ephemeral** shell (a manifest property, not an agent choice) is fully torn down **and removed from the agent's shell history**; a **persistent** shell keeps its perch **offline** (re-linkable — the binary is re-spawned on relink). The binary never survives a link-break.

**Broadcast policy** (manifest): the shell advertises its availability to all agents on the subnet, only same-node agents, or not at all (agents learn of it only by user instruction). Shells run only on nodes where the shell adapter is installed/advertised.

**Instantiation scope (orthogonal to broadcast):** `broadcast` governs *discovery* only; *permission* to instantiate is **"shell adapter registered on this node" + the per-shell approval gate** (below). Any agent **instance running on a node** where the shell adapter is registered may `shell spawn` it. A **cross-node** spawn (owner on one node, the shell binary on another) is **not** special-cased — it rides the deferred **instantiate-anywhere** primitive and its consent gate; same-node spawn does not.

**`shell spawn` mints a new instance, it is not the online switch.** `spt shell spawn <adapter> [--id]` creates a *new* Shell identity (`<adapter>-<n>` + its perch + the owner link). Bringing an *existing* offline (persistent) instance back online is **`shell relink <id>`** (re-spawns the binary), or happens automatically via `persistent` (auto-online with the owner) / `wake_command` (offline wake-watcher). `shell teardown <id>` destroys it. So **`spawn`:create :: `relink`/`persistent`/`wake`:online** — identity-creation is distinct from lifecycle, mirroring endpoint **create vs wake**.

**per-shell instantiation approval (`require_approval`, manifest enum):** gates `shell spawn`, reusing the consent plumbing (grant store + interactive escalation; see Consent & security gates).
- `none` (**default** — matches the system's everything-opt-in posture) — the agent spawns freely within scope.
- `remembered` — prompt on first spawn; **allow-always writes a persistent grant** (`spawn-shell × agent × (node, shell-adapter)`) so later spawns auto-allow; allow-once does not.
- `always` — prompt on **every** spawn; allow-always is suppressed (no persistent grant).

The manifest value is the **floor**: a node/endpoint setting may **tighten** (demand approval where the manifest says `none`) but never loosen below what the shell requires.

<!-- [doc->REQ-CONSENT-3] -->
**per-capability approval gates** (ratified 2026-06-11, Gateway grill): the same `require_approval` enum may ride **individual capability entries** in a shell manifest — gating the dangerous *operation*, not just the spawn. Same grant store, same interactive escalation, same floor semantics. A capability may declare a **class key** so grants are scoped finer than the capability itself: the first consumer is the usbip shell's `attach`, granted per **(owner endpoint × device class × node)** — a remembered HID-attach grant never authorizes a storage-class attach. Spawn gates govern *existence*; capability gates govern *acts*.

**instance cap (`max_instances_per_owner` + `over_cap`, manifest):** an optional ceiling on how many instances of this shell adapter **one owner endpoint** may hold. The count is **all existing instances** — online *and* offline, every non-torn-down perch (`shell teardown` frees a slot) — so offline persistent shells cannot be stockpiled to evade it. Unset ⇒ unlimited. At the cap, **`over_cap`** decides: `reject` (**default**) refuses the spawn outright; `approve` requires per-spawn approval for each instance beyond the cap (allow-once each — it does **not** permanently raise the cap; a bigger ceiling is a config change, not a grant). Serves the consent model's **runaway** concern, and **composes with `require_approval`**: that governs spawns *under* the cap, `over_cap` governs *at/above* it.

**shell instance aliasing:** every instance has an immutable canonical id **`<adapter>-<n>`** (`GameRobot-0`, `GameRobot-1`, …) — which itself encodes the providing adapter — plus an optional **alias**, a friendly owner-unique label (`TempleKeeper`) set at spawn (`shell spawn GameRobot --alias TempleKeeper`) or later (`shell rename <ref> <alias>`). Alias and canonical id are interchangeable for addressing (`shell cmd TempleKeeper …`). The alias is a *display/address* overlay only: `adapter_name` always rides the perch, so `shell list` still shows the underlying type (`TempleKeeper → GameRobot, online`) — aliasing never obscures what a shell is.

**Agent awareness.** Available shells *and* the agent's own shell instances (with online/offline state) are injected into the agent's context (SessionStart additionalContext / `$LIVE start`-equiv + hidden-context versions) — the agent ideally just "knows." A `shell list` command also exists.

**Discovery scope (what "available shells" resolves to):** the set surfaced to an agent (injected + `shell list`) is **(1)** its **own instances** — alias · canonical `<adapter>-<n>` · status — always; plus **(2) instantiable adapters**: shell adapters **registered on the agent's own node** with `broadcast ∈ {subnet, same-node}`, and shell adapters on *other* subnet nodes with `broadcast = subnet`. `broadcast = none` shells are **never** surfaced proactively — spawnable only by explicit user/agent instruction (if registered locally / reachable). Discovering an *other-node* shell does **not** imply free instantiation: spawning it rides the deferred **instantiate-anywhere** gate. (Discovery is gated by the same reach gates as everything else — an agent only sees shells on nodes/subnets it can address.)

**Not in the subnet registry.** Shell perches are private to the agent↔shell link, not a general messaging surface — they are nested under the owner (`perches/<endpoint_id>/shells/<adapter>-<n>/`) and not registry-advertised like agent endpoints. The perch `info.json` carries only `type=Shell`, `owner`, `adapter_name`, `status` (online|offline), and an optional `alias` — the capability set is resolved from the shell adapter manifest by `adapter_name`, not duplicated on the perch.

**Shell-relevant commands:** agent/user surface — `spt shell spawn <shell_adapter> [--id]`, `shell list`, `shell relink <id>`, `shell teardown <id>`, `shell cmd <id> <op…>` (or ordinary `send` of a typed command payload), and **`spt endpoint shutdown`** (an agent gracefully shuts down *its own* endpoint → suspended; fires the transition echo-commune; cascades its persistent shells offline). Machinery `api` surface — `api bind` (type=Shell), `api poll` (relay delivery), `api emit --type <sensory_type> <payload>` (sensory back to the owner; owner known from bind), and **`api owner-shutdown`** (a shell directly suspends its linked owner, bypassing agent comms — gated by `can_shutdown` in the shell manifest; the channel is this machinery command, not the content channels).

#### Shell sleep/wake (offline ↔ online)

A shell automatically goes **offline when its owner endpoint goes offline**. Two complementary manifest options bring it (and its owner) back:

- **`persistent`** (bool): the shell is automatically brought **online whenever its owner endpoint is online**. Covers "owner already up."
- **`wake_command`** (template): a long-running **wake-watcher** process spt-core runs *while the shell is offline*. Its sole job is to fire a wake. It runs on the **shell's node** (where the platform wake-event originates — e.g. interacting with a "shut-down" avatar). Covers "owner is down, wake it from outside."

**Online/offline are mutually-exclusive processes:** online ⇒ the shell binary runs (no watcher); offline ⇒ the wake-watcher runs (no shell binary). spt-core flips between them.

**Exit-opcode supervision:** the wake-watcher exiting with the **wake opcode** → spt-core runs the wake resolution (below) and brings the shell online. Any *other* exit (crash) → respawn the watcher with **exponential backoff + eventual give-up** (until next shell activity) — crash-bug safety. One watcher per offline shell instance.

**Wake resolution (state-keyed):** find any *reachable* instance of the owner endpoint on the subnet, then —
- owner **dormant** (still running) → do nothing to the endpoint, just bring the shell online;
- owner **suspended** → revive the owner, then online the shell;
- owner **active** elsewhere → attach/online the shell;
- **no reachable instance** (home node offline) → no-op, **unless** the owner's **`shell_wake_spawn_anywhere`** settings flag is set, which pre-consents shells to **fresh-spawn the owner on any available node** to wake it (the flag *is* the consent for this otherwise-deferred instantiate-anywhere path).

The same `spt endpoint shutdown` / `api owner-shutdown` / avatar-wake machinery composes into a full sleep/wake cycle driven from either end.

**remote attach (`spt rc`)** (ratified 2026-06-12 — own milestone, rides the built R-TERM substrate):
The user-facing remote-terminal product surface: `spt rc [subnet:]<id>[@node]` attaches across the subnet to an spt-hosted session's broker surface — scrollback replay + live bytes down, keystrokes up. Address-gated like messaging, **plus a consent gate** (driving a session ≠ watching it). **Controller/viewer model:** one interactive **controller** at a time — its viewport sets the PTY size (resize fires on attach + controller window changes; ConPTY repaint cost is why resize stays controller-exclusive); any number of **`--view`** attachers (read-only, never resize, letterbox client-side). `spt rc kick <target>` displaces the incumbent controller (loud notice to them); `--take` = kick + attach in one motion. A web client (xterm.js + a local bridge speaking the same stream substrate) is the GUI's terminal pane and the future Android-gateway shape — the bridge is just another `rc`-class client.
_Avoid_: "screen share"; resize-per-viewer; silent controller displacement.

<!-- [doc->REQ-RCVIEW-1] [doc->REQ-KICK-1] [doc->REQ-VIEWER-SKIP-TO-LIVE-ON-EVICT] [doc->REQ-HAZARD-VIEWER-RING-ROLL-SNAP] -->
**BUILT (M12 W2.5).** The controller/viewer model is implemented end-to-end. Attach intent is **three-valued** (`AttachIntent = Viewer | Control | Take`, wire-default `Control`): `Control` to a FREE endpoint becomes controller; `Control` to a CONTROLLED endpoint is **refused with guidance** (`--view` to watch, `--take` to control) — never auto-viewer, never silent-displace; `Take` (`spt rc --take` / picker "Kick") kicks the incumbent with a **loud `Displaced{by}` notice** and full detach (not demote). The broker's per-session `OutputLog` is the fan-out hub: ONE authoritative **controller** (advances the brain-resume cursor `delivered_through`) plus ANY NUMBER of read-only **viewers**, each an isolated bounded queue + writer thread evicted on overflow (a wedged viewer never stalls the drain, controller, or child — `REQ-HAZARD-VIEWER-ISOLATION`). The controller is **no longer a *blocking* writer**: since b4 it is a NON-BLOCKING `try_send` that DROPS on a full channel (`CONTROLLER_CHANNEL_DEPTH`), so a slow controller can never throttle the drain and starve a concurrent viewer (`REQ-HAZARD-VIEWER-STARVE-UNDER-CONTROLLER-BACKPRESSURE`). Exactly-once for the controller is preserved by RE-FETCH, not by blocking: a controller that falls behind its own echo and drops frames hits a forward `output gap`, and the serve loop RESUMES-FROM-FLOOR — re-subscribes from the frozen `delivered_through` so the broker replays the dropped frames from the ring (`REQ-HAZARD-CONTROLLER-GAP-RESUME`, a re-fetch — NOT the viewer's snap, which would skip frames and violate the controller's exactly-once resume). This holds while the ring still retains `delivered_through`; a controller that falls behind a **ring-exceeding** flood (the dropped frames have rolled out of the ring) surfaces a clearly-marked data-loss rather than a silent skip or a hang (`REQ-HAZARD-CONTROLLER-IRRECOVERABLE-BEHIND`, deferred for full graceful handling). An evicted viewer **skips to live** instead of dying silently: the broker signals the eviction (a marker distinct from session-exit EOF) and the viewer re-subscribes from the current ring floor — rate-limited so a hopelessly-behind `--view` under a sustained output flood sees intermittent live bursts (tail -f reconnect), never a frozen viewport or an evict→resubscribe CPU spin. Viewer-only, so it never touches the authoritative resume cursor (`REQ-VIEWER-SKIP-TO-LIVE-ON-EVICT`). A viewer also tolerates a forward ring-roll gap **before** any eviction: if it falls behind the live ring under a hard flood and reads a seq past its cursor (the ring rolled the intervening frames out between reads, with no eviction marker), it **snaps to that live seq** (accept-and-advance via snap-above, armed at the initial viewer attach) instead of fataling on the gap — composing with skip-to-live, which recovers *after* eviction, and staying viewer-only so the controller keeps its strict exactly-once reject-gap (`REQ-HAZARD-VIEWER-RING-ROLL-SNAP`). Resize is **controller-exclusive** (the broker rejects a viewer's resize). The **broker is the single writer** of the perch's `driven_by` (controller node) + `viewer_count`, resolving the displaced-controller clear-race. **Controller identity is keyed on the operator node (`by`)**: a same-`by` re-subscribe (a successor re-taking the slot after a brain restart) silently re-takes — `Displaced` fires ONLY on a genuine cross-operator `Take` (the gate-#7 self-kick guard; a brain update never kicks attached operators). **Dormancy keys on the controller only** (viewer attach/detach is wake-neutral; a viewer may watch a dormant endpoint as-is). **v1: viewing is gated identically to driving** (a viewer runs the same `access_check(Unsolicited)`; the lighter distinct watch-gate is the future seam). The picker is status-conditional: a CONTROLLED endpoint offers **View + Kick** only (no plain Attach), pinned `controlled by <node> (+N viewing)`; all of View/Attach/Kick ride the SAME rc dispatch (intent is a parameter — single-bringup-path). (rc viewer letterboxing is a one-line size indicator in v1; true clip/pad needs a client grid model — deferred.)

<!-- [doc->REQ-RC-KEY-VT-TRANSLATE] -->
**rc keyboard input (Windows VT translation, v0.13.0 bug 2).** On **Windows** an interactive `spt rc` console reads crossterm **key events** and translates each to **standard xterm VT** (`translate_key_event`) — arrows / Home / End / PgUp/Dn / Insert / Delete / F-keys + modifiers all reach the harness as the universal terminal contract (**agnostic**, NOT win32-input-mode; the legacy console delivers those keys as events, not bytes, so the old byte-pump left them DEAD). **Unix passes through** (its raw-mode stream is already VT; cfg-split, zero Unix change). Detach stays the **`ctrl-b d`** prefix, event-sourced on Windows. A **non-tty** stdin (piped / tests) falls back to the raw byte path (the e2e byte-injection contract). This **supersedes** the W7 `normalize_key_byte` byte-swap (KNOWN-HAZARDS 7.13 → 7.16): Backspace and Ctrl+Backspace are emitted natively by the translator.

**spt-hosted bringup picker (`spt endpoint run`)** (M12-W2):
<!-- [doc->REQ-RUN-PICKER] -->
The user-facing bringup flow for spt-hosted endpoints. **Bare `spt endpoint run`** (no
`--adapter`/`--id`) opens an in-process **ratatui picker**; the **flagged** form is the
non-interactive bringup path (`--adapter <a[:profile]> --id <id> --create|--resume <session>
--start|--attach|--view`), untouched — a picker selection bakes exactly that path. **Layer 1**
picks the kind (*Create new* | *Pick existing*). **Create-new** chooses a registered
`kind="harness"` adapter with its shipped+local **profiles tree-nested**, then a
charset-validated id, then — **on a node that holds ≥2 member subnets** — a **home-subnet
layer** (`CreateAdapter → CreateId → CreateHome`): the member subnets **MRU-ordered**
(most-recent home first, the same ordering the CLI confirm uses), pick one and it bakes an
explicit `--subnet` into the bringup so home resolves directly with **no `Ok to proceed? Y/n`
confirm** on the picker path. A **single-subnet / unpaired** node **skips the layer** (the sole
subnet auto-homes; `--subnet` stays unset). The CLI-only path (`endpoint run --adapter X --id Y`
with no `--subnet`) keeps its post-resolution Y/n confirm and its non-interactive multi-subnet
refuse — unchanged. <!-- [doc->REQ-RUN-PICKER-HOME] --> **Pick-existing** selects a **category** (←→ over
`[<cwd-project> | Local node | Subnet]`), endpoints **grouped + alphabetically sorted** with a
**status square** (online green ■ / offline gray ▢ — the blue *attached* tri-state + *Kick* are
**W2.5**, a dedicated broker attach-presence slice), **type-to-filter** (`/`), a pinned keybind
legend, and a **two-pane** right-half description (harness `adapter:profile` · best-effort
project history newest→oldest · `endpoint description`). The **confirm** layer offers
status-dependent options — Attach/Start/View (the rc pump / bringup) · Instantiate-locally
(remote) · Change-harness-adapter (offline) · Fork · **Resume-from-history** (offline+LOCAL
only — enumerates the per-endpoint session ledger, titles `<project> @ <ts> (…id5)`, resumes
that session id). **Invariant:** the picker is a pure front-end — every terminal action routes
through the one bringup core (`spt endpoint run` / the rc pump), never a second path. A single
action enum is the source of truth so a future tap-mode (phone PTY) layers on without
re-coupling to keybinds.
_Avoid_: a second bringup path; hard-coupling interaction to physical keybinds.

**`spt endpoint run` is the spt-hosted bringup for BOTH endpoint types** (v0.12.0):
<!-- [doc->REQ-READY-AGENT-RESUME] -->
The bringup core is **type-agnostic** — the endpoint TYPE is the adapter manifest's
concern, not a separate bringup mode. A manifest declaring `[session.psyche_init]`
brings up a **LiveAgent** (the daemon reconcile hosts its Psyche); a manifest *without*
it brings up a **ReadyAgent** (a poll listener, no Psyche — see *ReadyAgent* and the
harness-hosted ready bind at the *seed + bind-time resolution* note above). No
`--adapter`/picker branch distinguishes them: the daemon live-host reconcile hosts only a
perch whose **state is `live_agent`** (a `ready_agent`-state perch is skipped at the
reconcile start-side state gate), and within the live path keys on `psyche_init` presence
— so `endpoint run` of a ready manifest naturally yields a ready endpoint with no Psyche;
`--resume <session>` carries its session into the bind for either type. Consequently a
ReadyAgent is now first-class in the **Resume-from-history** offer above: the
harness-hosted ready bind ledgers a **Boot session row** on bind — exactly as the live
`establish_perch` path does — so an *offline* ready perch carries the session rows the
offline+LOCAL Resume-from-history enumerates (previously only LiveAgents did; a ready bind
wrote `info.json` but never the ledger, so the picker never offered it).

**`spt-<id>` shortcut** (picker `s` keybind, M12-W2):
<!-- [doc->REQ-RUN-SHORTCUT] -->
From any pre-start options set, `s` writes (or updates) a **`<basename>-<id>` launcher** at the
project root that bakes the current selection's **non-interactive** flags (terminal actions
only: adapter[:profile] + id + create|resume + start|attach|view; the interactive-only branches
— Kick/Instantiate/Change-adapter/Fork — are not bakeable). The **basename is a parameter**:
harness-agnostic spt-core defaults to **`spt`** (→ `spt-<id>`, e.g. `spt-doyle`); an
adapter/flow **overrides** it (spt-claude-code → `cc`, giving `cc-<id>`) — the Claude-Code-ness
lives in the adapter, **spt-core never emits `cc`**. The basename must be a *distinct* token,
never bare `spt`: a `spt.cmd` wrapper would shadow the real `spt.exe` only under cmd.exe
(cwd-first search), silently no-op in PowerShell/Unix, and self-recurse — so `spt-<id>` is the
safe, consistent form. The launcher is the **current OS's native form**: a **`.cmd`** on
Windows (the default `PATHEXT` excludes `.ps1`, so a bare name never resolves one; `.cmd` is
PATHEXT-resolvable), a POSIX **`sh`** (`chmod +x`) on Unix. **Invocation reality** (documented
in the generated header; `<name>` = `<basename>-<id>`): cmd.exe bare `<name>` in the project dir
· PowerShell `.\<name>` · Unix `./<name>`; a *truly-bare* basename on `PATH` is a PATH-installed
launcher (`/spt:setup`'s job), not this project-root shortcut. **Overwrite is sentinel-guarded:**
the generator writes + checks a generated-by header marker — it overwrites its own prior output
freely, but refuses (with a warning) a same-named file that lacks the sentinel, never clobbering
a user file.
_Avoid_: baking `cc` (a harness name) into spt-core; a bare `spt`/`cc` shadow wrapper; `.ps1`
shortcuts (PATHEXT won't find them); clobbering a non-generated file.

## Terminal wrapper

The multiplatform terminal wrapper is the backbone for hosting agent sessions. Supersedes the sister project's planned "capsule" (which leaned on psmux). Architecturally a **process supervisor** with an abstract session surface. **The session-surface *mechanism* is delivered in M3a (`crates/spt-term`); the supervisor *process* (term-daemon) is delivered in M3b (`crates/spt-daemon`, 2026-06-03).**

**term-daemon** (process supervisor):
A long-lived, one-per-machine daemon that owns the PTYs for all hosted sessions, multiplexes them (sessions addressable by name, tmux-style), and accepts attach/detach from frontends. A hosted agent session (a LiveAgent's Self, a ReadyAgent, etc.) runs *inside* a supervised PTY. **Headed** = a frontend is attached and rendering/driving the PTY; **headless** = the session runs with no frontend attached. The daemon's persistence is what lets a session keep running while unwatched and be re-attached "on a whim." *(M3b ✅ — `spt-daemon` hosts many M3a `spt-term` surfaces behind a versioned local IPC, split into a stable **broker** (holds the PTYs/children/sockets) + a restartable **brain** (all logic) so a brain swap leaves hosted sessions gapless; `spt-term` itself is deliberately single-surface-per-handle.)*

**A view is independent from the endpoint** (invariant):
<!-- [doc->REQ-HAZARD-VIEWER-CLOSE-DETACH] -->
An spt-hosted endpoint runs in a **daemon-owned PTY, decoupled from whatever terminal launched it**. Closing the
tab/window where `spt endpoint run` was invoked detaches only the `spt rc` attach pump — the endpoint keeps running under
the daemon and stays re-attachable via `spt rc <id>`. A view is a transient frontend over a daemon-owned session, never
the session's lifeline. *Implementation:* the daemon must never live inside the launching terminal's process grouping —
on Windows the cold-started daemon is launched **job-neutral**: a job-neutral creator (**WMI `Win32_Process.Create`**,
owned by WmiPrvSE — primary; a `schtasks` one-shot — fallback) starts it OUTSIDE any terminal Job Object from birth, so a
terminal's `KILL_ON_JOB_CLOSE` job can never reap it (this is why Task-Scheduler-autostarted daemons never had the bug).
A WMI/scheduler child does not inherit the launching shell's transient environment, so `SPT_*` (notably `SPT_HOME`) is
forwarded explicitly. `CREATE_BREAKAWAY_FROM_JOB` is retained only as a fallback rung (a job *can* deny it). On Unix the
daemon's own session detachment (new session, no controlling terminal) already keeps the SIGHUP of a closing terminal off
the daemon's children. (The ConPTY/pseudoconsole isolation is correct on its own — the lifetime binding that leaks is the
Job Object, not the console.)

**session surface** (library abstraction):
The trait spt-core exposes over "anything you can write input to and read output from" — delivered in M3a as `spt_term::SessionSurface` (`write_input` + `resize`, with `send_keys`/`send_line` as first-class injection). Native PTY (`spt_term::PtySession`: ConPTY on Windows 10+, `forkpty(3)` on Unix, via the `portable-pty` crate) is the day-one implementation. The abstraction generalizes so future implementations — network-attached surfaces (for `spt-node` remote control), GUI text panes, etc. — are uniform to consumers. Pre-Windows-10 (winpty) is explicitly dropped. **OS-neutral by contract (Spike #4/#5):** the trait bakes in neither `forkpty`'s raw-ordered-pipe nor ConPTY's repaint-on-resize screen-buffer semantics; output is a plain byte stream (`spt_term::OutputStream`, bounded backpressure) with OS-independent replay.

**input injection**:
Two granularities, both first-class. `send-keys` — raw byte stream including escape sequences (e.g. `Ctrl-C`). `send-line` — cooked line-level input flushed on newline (the common case; matches how user input reaches an agent prompt). **PTY input is single-writer (v0.13.0 P0, KNOWN-HAZARDS 7.17):** broker-side, every write to a hosted session's PTY goes through ONE per-session input-writer thread — the sole caller of the blocking `write_input`, fed by a bounded FIFO. All callers (operator keystrokes, message injection, the inject-floor flush) ENQUEUE and return immediately, so a paste burst that fills the harness input buffer parks only that thread, never the broker dispatch thread. A genuinely wedged harness (queue full) DROPS excess input + stamps the perch `input_backpressure` (a visible signal, healed on resume); the daemon never wedges. This is the input-side mirror of the output-side single-writer (controller/viewer writer threads, 7.12). **rc paste is client-originated on Windows (v0.13.0 P1/P1b, KNOWN-HAZARDS 7.18):** the harness runs daemon-side with no access to the operator's local clipboard, so `spt rc` itself reads the local clipboard on a **right-click** and injects a *bracketed* paste (`ESC[200~`…`ESC[201~`) — the harness's bracketed-paste mode lands a multi-line paste intact with no per-newline submit-storm, content verbatim. (ctrl+V is NOT intercepted: Windows Terminal consumes it as its own paste accelerator and injects the clipboard as keystrokes — which the broker no longer wedges on, 7.19.) Because that capture also steals WT's native scroll, rc forwards the **scroll wheel** to the harness as SGR mouse reports when the harness has mouse reporting on (7.20). cfg(windows) only; Unix terminals paste + scroll natively through the byte pump. **An operator input flood must not deadlock the broker (v0.13.0 P1b, KNOWN-HAZARDS 7.19):** the applied-ack is opt-in (`InputReq.ack`) — the fire-and-forward rc path sends no-ack so the broker's per-conn handler never blocks writing an ack back onto the same conn it is draining a flood from; `shellchan` (which waits on the ack) keeps it. Exactly-once is unaffected (dedup at the applied-set).

**scrollback**:
In-memory ring buffer per session for v1. **On-disk spillover for long sessions is an intended, tracked feature** (deferred past v1, not forgotten) — see `docs/DEFERRED.md`.

**live activity buffer (session digest)**:
A rolling, human-glanceable view of an endpoint's **recent context** — *what the agent did* (recent user turns + the agent output between them, with **tool-usage sprints collapsed**, e.g. `<endpoint_id> used: **Write** `dir/file.txt` · **Bash** `first ~25 cmd chars` · […]`) **and** *what spt fed the agent* (**context-injection** entries — session-start psyche download, echo-commune mirror, incoming owl messages — collapsed by default, expandable). One time-ordered, glanceable timeline. **How much** is shown (window depth, arg-truncation, sprint-collapse) is the adapter's declared preference, consumer-overridable — *not* a fixed spt-core count. Distinct from both the **raw scrollback ring** (unparsed bytes) and full **transcript history** (the conversation behind echo communes/resume): its job is *"what is this agent's recent context, at a glance,"* not conversation replay.

- **Source = normalized session logs, never PTY parsing** (revised 2026-06-12, Gateway grill — supersedes the ADR-0008 source mechanism). A TUI's PTY stream is a *rendering protocol*, not a content log: scroll/resize trigger full repaints (duplicate content), and semantic turn boundaries drown in repaint soup. The split: **render surfaces read the PTY** (the R-TERM raw byte ring, unchanged); **content surfaces read the logs**. The digest reads the **same session logs as the echo-commune** but through its **own adapter-declared extractor** (the `[digest]` seam — see below), not echo's opaque normalizer: one source, two extractors (echo wants rich/opaque records; the digest wants the contract). Presentation knobs (window depth, sprint-collapse, arg truncation) are adapter-defaulted and consumer-overridable, not a fixed spt-core formula.
- **Topology-independent.** Because the source is logs, the digest works for harness-hosted endpoints too — Claude Code's per-session JSONL is Path A's worked example. Freshness = file-watch + adapter hook nudges; harnesses with no usable logs push entries directly via **`api digest-entry`** (the Path-B-style door). Near-realtime tailing needs an *incremental* normalize story (design-phase concern: Path A's normalize-command was shaped for occasional pulls).
- **Thread-spanning across session boundaries.** A live agent's identity persists across many harness *sessions* (a `/clear` or `/compact` rotates the harness `session_id` — see `api boundary` / **post-spawn seam** — while the endpoint identity and `parent_pid` anchor hold). The digest follows the **agent thread**, not a single session: its rolling window may bridge a boundary (the tail of session N and the head of session N+1 coexist until the old turns age out), so a glance right after a `/clear` still shows the agent mid-thread rather than empty. A session boundary is itself **represented distinctively** in the digest (a visible boundary marker, e.g. a `/clear` divider), so a reader sees that context reset *there* without losing the thread. (Heir to the sister project's per-agent `sessions.log` boundary ledger.)
- **Two access modes.** **Snapshot pull** — `spt endpoint digest <[subnet:]id[@node]>` returns the current structured buffer (a Shell/CLI/GUI on-demand render; feeds the frontend's "latest session output" pane; a Gateway's agent-window). **Structured-delta stream** — a subscriber (live Shell pane, GUI, Gateway) receives only *changes* (new turn, new tool entry, collapse update), keyed to log records, for resource-efficient near-realtime reflection. **Access is address-gated** — fetch/subscribe is allowed for anyone who can address the endpoint (visible + resolvable per the resolution policy), the same gate as messaging.
- **"Option (C)" logging surface:** the digest may also be *persisted* (opt-in per-endpoint) as a coarse activity log appended to the native history store (Path B), on-boundary or periodic — off by default.
- **Adapter-declared extraction (post-M9 milestone — reverses M9's "no manifest seam"):** M9 retired the PTY-parse engine and made the digest an on-demand projection over a published `{role,text,tool}` contract carried in `[history]`. That under-served real adapters: one normalizer cannot serve echo (opaque) and the digest (contract) at once, and a contract mismatch failed **silently**. So the digest gets its **own `[digest]` manifest seam** — an **imperative extractor** the adapter declares (its native log → the `{role, text, tool, ts}` contract; reads the same source files as `[history]` by default, own-source escape hatch), with **`api digest-entry`** as the always-available push fallback. **No declarative DSL** (anyone who could conform to a fixed format could as easily push or write the extractor). **`spt adapter digest-proof`** lets an author validate their extractor's output; the daemon surfaces the same skip-diagnostics at runtime. The PTY-parse engine stays retired. (Successor to ADR-0008.)
- **Two-origin merge.** The digest interleaves two sources by `ts`: the adapter's **extracted log records** (agent activity) and spt's own **context-injection entries** (psyche download / echo mirror / owl message), which spt appends to the endpoint's `digest.log` (the existing Path-B sink — spt becomes an internal producer). The projection merges both newest-relevant within the window. **Deferred (no consumer yet):** the GUI collapse/expand of context-injection entries, the echo-commune-reads-the-digest delta loop, and the *autonomous file-watch freshness* nudge land with the milestones that own those surfaces; the data model (the `[digest]` seam, `ts`, the context-injection category, thread-spanning) lands now so the contract need not break again.

_Avoid_: "PTY digest" (the superseded source mechanism); parsing agent content out of PTY bytes anywhere.

## Networking

WAN cross-machine messaging is a **first-class, day-one capability** of spt-core — not a follow-on project. Goal: minimal-to-zero config, "just works" cross-machine SPT. The transport stack (Iroh + mDNS LAN discovery + pairing) is baked into the core library crates, behind a compile-time feature flag (`net`) that is default-on in the reference binary and optional for embedded/stripped library consumers.

The persistent network presence required for "reachable even when no session is running" is hosted by the `spt-daemon` (the same one-per-machine process that supervises PTYs). There is no separate networking daemon. The node's Ed25519 identity lives at `{SPT_HOME}/owlery/node.key` (heir to the sister project's planned location).

**relay dependency (v1 stance)**:
Out of the box, WAN connection setup and the ~30% relay-fallback path route through relay servers operated by **n0.computer** (Iroh's authors) — free, accountless, the silent default that makes zero-config WAN work. This is a soft, metadata-only, escapable runtime dependency:
- It is operational, not a code dependency. n0 relays being unavailable degrades initial connection setup and the relay-fallback case; already-established direct connections are unaffected.
- Traffic is end-to-end encrypted by node keypairs; relays never see plaintext. The trust question is availability + metadata (who-talks-to-whom-when), not confidentiality.
- The relay binary is open-source and self-hostable; spt-core ships a config knob to point at a user-operated relay instead of n0's.
- v1 ships with n0 as default + the self-host escape hatch + plain-language disclosure in docs. Hiding the dependency is rejected.

First network socket bind triggers an OS firewall prompt (Windows Defender, some Linux); unavoidable without a signed binary + pre-declared exception. Framed in CLI UX, not suppressible.

### The peer pump

**peer pump**: the **one supervised daemon loop** driving all outbound cadenced peer traffic — registry advertisement, notif push, sync pull, update check — against every roster member of every attached subnet. The pump owns *when* and *toward whom*: the scheduling kernel (in-memory stagger-from-due-now cadences + wake markers — deliberately **not** deadline-grid converted, ADR-0018 `[V4]`; no per-loop timing writes), the per-round shared store loads, the per-subnet × per-peer fan-out (connection cache, the one brain IPC handle, the sole open-op `EpochSource`), the loop heartbeat, and capped-backoff supervision (REQ-DAEMON-5). The workers own *what*.

**pump worker**: a cadenced consumer module behind the pump's **worker seam** — one each for registry, notif, sync, update. A worker declares its cadence, optionally consumes its wake marker (`poll_wake` — the exactly-once filesystem take stays at the worker edge, the scheduling kernel stays pure), runs optional node-local **pre-round** work once per due round (e.g. the registry worker's eviction sweep → rotation fire → re-advertise ordering), and a per-peer **peer step** over the pump-provided connection. Workers receive per-round shared state through the pump-owned **round context** — the single-read invariant: push targets and the sync gate see the *same* roster load. A failed peer step aborts the remaining due workers for that peer and drops its connection (redial next tick). **LLM-bearing work is never a pump worker** (KNOWN-HAZARDS 7.4 — per-agent Psyche/pulse/echo runs off the shared scheduler, on its own thread).
_Avoid_: calling an individual leg a "pump" (the loop is *the* pump; the legs are workers); giving a worker its own `EpochSource` over the shared op counter (sole-writer: the pump's, KH epoch-lease class).

### Cross-node Psyche sync

**Sync scope is subnet-exclusive by default.** An endpoint's mind replicates only within **one subnet** unless told otherwise. Endpoint config carries a **subnet-membership list** for sync: the context syncs across the nodes of *every* subnet in that list, defaulting to just the endpoint's home subnet. This is distinct from registry *visibility* (*endpoint visibility*) — being addressable in a subnet does not imply replicating your mind there; mind-replication into a subnet requires that subnet to be in the sync list. The default keeps a `home`-fleet agent's mind off the `work` fleet even when a node bridges both. (The subnet scope composes with the live/project tier split below: a tier syncs to the eligible instances *that are also in an in-scope subnet*.)

The context is a local git repo with `a-<endpoint_id>` (per-agent) and `p-<project_id>` (per-project) branch views. Sync has **two transport modes** (full mechanics: `docs/STORAGE.md`):
- **P2P (default):** the data syncs between a user's nodes over the built-in P2P networking — **no `gh`/account/setup**, the no-central-operator promise. Replaces modern SPT's required GitHub-remote sync.
- **Hub (opt-in, `spt context-github-setup`):** every node pull→merge→pushes against a private GitHub remote — an always-online sync hub *and* a convenient GUI view. Uses a **shared deploy key** distributed as subnet secret material, so only the setup node needs a GitHub account; other nodes use the key over SSH, with **P2P as backstop**. Hub mode is the **context-repo sync transport only** — messaging, registry, remote-drive, pairing, and presence always ride P2P; gh-on never removes the P2P requirement (ADR-0002, amended).

**The gh-repo interim is retired** (M4-D9-6, after the two-host rig proof): during the pre-P2P era the sister project's `gh-repo-sync` skill (and the `spt-agent-storage` private repo wired by `psyche-sync-setup`) was *the* cross-machine context transport. With P2P sync proven end-to-end on real hardware (`docs/TWO-HOST-RUNBOOK.md`), that interim stands down — the sister skill is no longer part of the sync path. `psyche-sync-setup` remains only as the **hub-mode seam** (the opt-in transport above), not the default and never a requirement.

Merge never line-merges a mind file: a custom merge driver resolves context files **per file by the precedence marker**, which carries `source`, **`node`**, and a per-node **version vector** (entries from each node's monotonic epoch source — wall-clock is never the ordering authority, at most a human tiebreaker hint). A dominating write supersedes; a dominated write drops; **concurrent writes (neither dominates) surface as an explicit conflict — never silent newest-wins**: both versions persist as replicated conflict artifacts and the endpoint's **own Psyche reconciles its own mind** in one bounded turn, run by the **active instance's node only** (single reconciler; the merged write dominates both parents and clears the conflict subnet-wide; reconcile unavailable ⟹ the artifacts simply persist, nothing is lost). KNOWN-HAZARDS 6.5 generalized; mechanics: ADR-0013 + `docs/STORAGE.md`. The single-writer invariant (active-instance-authoritative) makes true conflicts rare. The per-project branch view lets any agent synthesize broader context from *other* agents' contexts in the same project and supports query-routing ("which endpoint owns this task"). The two-tier sync maps onto the views: live context (`a-`) → all instances; project context (`p-`) → same-project instances.

<!-- [doc->REQ-EP-7] -->
**live role** (`live-role.md`, ratified 2026-06-12 — core milestone A):
A durable statement of an agent's **broad purpose** — rarely modified, and only at deliberate user instruction. Lives in `tracked/` (the mind) beside `live-context.md`, so it replicates with the mind and follows the agent across nodes. At start-transition context injection it renders **first** (role, then live context, then project context). The guarantee is **mechanical**: no automated writer exists — Psyche reconcile, echo-communes, and signoff structurally never touch it; the sole writer is `spt endpoint role [--overwrite <file>]`. (A hard gate restricting writes to user-backed origins is a recorded later tightening, riding the `user-msg` identity plumbing.)
_Avoid_: "system prompt", "persona file"; any automated mutation.

### Subnet notifications

**notification (notif)**:
A first-class kind **distinct from an inter-agent message**: a **user-directed, dismissable, resurfacing** subnet event (e.g. *new node paired*, *external-subnet pairing request*, *update available*, *consent needed*). Messages route agent→agent and are consumed once; notifs route **to the user** — via whichever endpoint is active — and **persist until dismissed**. The notif is the general primitive that the previously-ad-hoc "deliver to the user's most-recently-active session" flows are special cases of: **the self-update prompt and the consent escalation are refactored to be notif producers** (one primitive, many producers — kills the duplicated delivery paths; the registry-resolution precursor to PresenceChannel is this primitive's delivery step).

- **Spool: per-subnet, replicated across that subnet's nodes** (reuses subnet-registry distribution). Each notif is tagged with its subnet; **dismiss-state replicates subnet-wide**, eventually consistent (dismiss on node A clears it on node B). "Node notif spool" = each node's local replica. *(Forward seam: dismissal — and the spool key — generalize to **per-(subnet, user)** when the cross-user model lands.)*
- **First-fire** targets the user's most-recently-active endpoint *in that subnet* via presence resolution.
- **Two states, not one:** **seen** (surfaced at least once — tracked **per-endpoint**) vs **dismissed** (explicitly acknowledged → removed from the resurface set). Surfacing alone ≠ dismissal. Dismissal is explicit: `spt notif dismiss <id>`, or **the agent marks it dismissed** once it judges the user acknowledged (the agent is the default surfacer, so it is well-placed to dismiss).
- **Resurface** re-delivers *undismissed* notifs at boundaries, gated to avoid nagging: skip if already **seen on this endpoint**, and skip if surfaced anywhere within a **global suppression timeout** (default ~1h, configurable) — the timeout is cross-endpoint so a notif can't bounce between endpoints. **Resurface boundaries** (reuse existing reported events, no new plumbing): state→active (`wake` from offline/suspended/dormant), `api boundary clear`, `api boundary compact`, and **new-session-start**. The daemon injects undismissed notifs into the activating/cleared/compacted/fresh session's context.
- **Scope:** an endpoint only ever sees/resurfaces notifs for subnets it is **visible in** (excluded endpoints never receive that subnet's notifs).
- **Delivery.** Default: a notif to an agent is **just normal SPT messaging to the agent's perch with a notif-flavored envelope** — the existing delivery/inject path handles it and the agent surfaces it conversationally (no new delivery machinery). Optional: a **`notif_command` manifest template** for endpoint-native rendering (OS toast, etc.), substitution keys from the envelope, **combinable** with the agent-surface path (blanket template, not per-notif-type). The `notif_command` seam lives on **both harness-adapter and shell-adapter manifests** — if presence resolves the user not just to an endpoint but to a **Shell attached to it**, the notif renders via *that shell's* `notif_command` (e.g. GameRobot `alert-symbol`). This is the forward generalization and v1 seam in one.
- **Producers (open set):** system events (node-paired, pairing-request, update-available, consent-needed) + agent-issued `spt subnet notify`. New producers register later.
- **Envelope `from`** — for agent-issued notifs carries the issuer's **endpoint id, node, and subnet** (the subnet is **surfaced to the receiver only if its node is on multiple subnets** — otherwise redundant). Makes notifs attributable now and targetable once the per-(subnet, user) model lands.

<!-- [doc->REQ-NOTIF-2] -->
**`spt subnet notify` (agent-issued subnet notif; bare `spt notify` moved under the subnet noun at M8-D1)**:
A command letting any agent **issue a subnet-wide notif** to the user. v1: it **reaches the user on their active endpoint from any agent** (the default broadcast-to-user). *Forward:* gains targeting — **all subnet users (default) or specific subnet users** — once the per-(subnet, user) model lands; the `from` field is what makes targeted/attributed delivery possible.

## Self-update

<!-- [doc->REQ-ADAPTER-UPDATE-MESSAGE] -->
Seamless, realtime self-updating is a day-one pillar. After a plugin performs the initial spt-core bootstrap, `spt.exe` self-updates from then on and **ripple-updates registered adapters** via each adapter's manifest update declaration. An adapter may include a `[update].message` — a plain human notice (markdown-rendered) that `spt adapter update` prints to stdout only when a new version is **actually applied** (never on a no-op). Useful for post-update operator actions, e.g. `"Run \`/reload-plugins\` in any ongoing sessions."` for spt-claude-code.

**delivery** — peer-propagated over P2P, layered on self-fetch, with out-of-band still supported. One node learns of / obtains an update (marketplace drop *or* self-fetch from a release channel), gossips availability across the subnet, and peers pull the new binary over the networking layer — the subnet self-heals to latest. **All binaries are signature-verified before handoff regardless of source** (peer-propagation otherwise lets a compromised node poison the subnet). spt-core has its own release signing key, distinct from any OS-publisher code-signing (the Windows-publisher-trust question is separate).

**handoff invariant** — **no endpoint process terminates or suspends during a self-update.** We cannot assume every endpoint can safely suspend. Satisfied by the broker/brain split: routine updates replace only the daemon brain, which rehydrates from disk state and re-attaches to the broker's held PTY/socket/child handles. Harness-hosted endpoints (topology 1) are safe by construction; spt-hosted endpoints (topology 2) are kept alive by the broker holding their PTY masters + child processes across the brain restart.

**broker update frequency** — rare by design. Triggers: broker↔brain IPC contract change (versioned so a newer brain talks to an older broker — broker stays put as the brain updates), held-resource-type change, OS PTY/socket API change, or a broker bugfix. When the broker *must* update (the one case that can disturb held endpoints), it is a small well-scoped place that can use FD-passing or a rare planned endpoint-cycle — not the whole daemon.

**debug rollout**:
A deliberately non-production update distribution across a trusted lab subnet, used to test a local build on multiple nodes before a public release. It is still an update, not a raw peer file-copy: every recipient treats it as a signed, channel-scoped candidate that must pass the normal verification and consent gates; lab nodes pinned to the debug channel may explicitly opt into full-auto apply. Trust is node-local: debug rollout keys are installed through each lab node's release-key overlay, never embedded in the production trust anchor. A debug rollout may carry a platform artifact set assembled by one fast coordinator; each recipient applies only the artifact for its own platform. The rollout driver is maintainer/dev tooling that reuses SPT's update substrate, not an end-user product surface in the production package.
_Avoid_: production release, peer proliferation, subnet binary copy

**cadence/consent** — **not fully automatic by default; gated on user confirmation.** The update prompt is delivered via spt to the user's **most-recently-active live session** (v1 registry resolution — a precursor to PresenceChannel dispatch, implemented on the v1 registry, not the deferred channel) and offers an "enable full-auto" choice. Full-auto is the opt-in seamless path.

## Instances

The multi-instance identity model. An agent is no longer "a perch on one machine"; identity and materialization split into two layers:

**endpoint ID**:
The logical identity (`ling`). Shared subnet-wide. The thing you address. **Adapter-agnostic**: the same ID can run under different adapters over its life (start on `claude-spt`, later revive on `spt-codex` or `spt-pi`) — adapters expose the same SPT-relevant capability subset, so the ID is not bound to an adapter. The currently-running `adapter_name` is a property of the live instance, not part of the identity. **At most one instance of an ID per node** is allowed; a second attempt on the same node (any adapter) is rejected. **Identity is node-global, advertised per-subnet** (see *subnet membership*): one `ling` exists on a node (one mind, one `tracked/` context) and is advertised into each subnet the node belongs to, subject to per-subnet visibility (see *endpoint visibility*). The same bare name may exist as *distinct* endpoints in different subnets; the resolution policy forces qualification when both are visible. Within a single subnet a bare id is unique — enforced by a **join-time collision check**: when a node carrying endpoint `X` joins/advertises into a subnet that already has a different `X`, the clash is surfaced and resolved (rename one — below) before advertisement. An endpoint keeps an **ordered adapter history** (in `tracked/agents/<id>/meta.json`, synced) driving the resume UX: latest adapter is the default resume offer, then next-latest, … oldest, then "choose a different adapter". (Adapter-agnosticism is an *agent-endpoint* property; Shells are adapter/platform-bound — see Shells.)

**adapter selection (creation & change)**:
At **creation**, the adapter is chosen from the node's **registered `kind="harness"` adapters whose `hostable_types` includes the endpoint's type** — **auto if exactly one qualifies, else chosen** (`--adapter` / picker): the same *auto-if-one-ask-if-many* rule as *home subnet*. The choice seeds the **head of the ordered adapter history** and the live `adapter_name`. **Changing** an endpoint's adapter is **not** a standalone operation — adapter is a *live-instance* property, so you change it by **launching/resuming the endpoint under a different registered adapter** (the resume UX surfaces the adapter history: head = default, then prior adapters, then "choose a different adapter"). For a currently-active instance this is a **resume-under-the-new-adapter** carrying context via the *fresh-with-preload* psyche-download (a session cycle, not a hot swap — the mind is adapter-agnostic). Same gate both times: the target adapter must be **registered on the node** and its `hostable_types` must include the endpoint's type.

**instance**:
The *same* endpoint (same harness + adapter) running **natively on a machine** — local files, local compute. `ling@desktop` and `ling@laptop` are both real local `ling`s, not remote views of one. What is per-node vs synced:
- **Per-node (anchored, cannot teleport):** the project working directory / files, and the harness session history. A git repo can't teleport; files are obtained locally (e.g. `git pull`).
- **Synced across instances:** the **Psyche context** (the agent's mind). See the two-tier sync below.

This is the core meaning of "the same agent on multiple machines": same identity, native local materialization on each, with the mind kept in sync. The ID is the shared identity for addressing; files are local; the mind follows.

**dormant / suspended instance:**
At most one instance of an endpoint is typically the *actively-driven* one; the others rest. Two resting states:
- **dormant** (default, *warm*): a live, state-preserved seat whose harness session stays running → instant re-activation. The default resting state.
- **suspended** (opt-in, *cold*): the harness session is closed and resumed-on-wake → frees RAM/compute (and, for billable harnesses, footprint), at the cost of a resume on wake. Reached via per-endpoint opt-in *auto-suspend*, or on demand: a subcommand can **suspend an endpoint from anywhere** (any node).

**State transitions (no idle timer):**
- **active** = the instance is the **most-recently-interacted instance for that ID** *and* **has a driver attached**. Both conditions. There is **no idle-grace timer** — a session stays active as long as it holds those two properties. A **linked Shell counts as a driver** (an agent with a live shell stays active even when the user isn't typing).
- **active → dormant** — the driver detaches, *or* another instance of the same ID becomes the most-recently-interacted one (attention shifts). No explicit stop needed to switch; driving `ling@laptop` makes `ling@desktop` dormant.
- **dormant → suspended** — manual (`spt endpoint suspend` / `spt endpoint shutdown` / shell `api owner-shutdown`), *or* opt-in auto-suspend after an `auto-suspend-after` threshold **counted from the moment the instance went dormant**.
- **wake (→ active)** — `spt endpoint wake`, a driver attaches, or a shell wake-watcher fires.

A resting instance retains its files + last context and is re-activatable in place — a lightweight **wake** (state's already there), distinct from instantiate-anywhere's fresh spawn. Resting instances remain addressable by node (`ling@desktop`). Registry status: **active / dormant / suspended / offline** (offline = node unreachable).

**Default policy:** warm (dormant) is the default when undriven. **Auto-suspend is opt-in, default OFF globally**, but **node-overridable** (a handheld / Pi / resource-constrained node may default it ON) and **per-endpoint overridable** (e.g. a billable harness whose warm session holds a cost footprint). Thresholds (`auto-suspend-after`) are config knobs: global default → node override → endpoint override. *Confirmed by measurement (M4-D9-3, `docs/DORMANCY-BUDGET.md`): an idle warm seat burns zero CPU — the cost is RSS only (~8 MiB shell-class, ~300 MiB LLM-harness-class); suspended residual is just the on-disk record.*

**Commands** (user/agent-initiated lifecycle, under the `endpoint` noun since M8-D1): `spt endpoint suspend <id[@node]>` (suspend any of your instances from any node), `spt endpoint wake <id[@node]>` (explicitly wake a dormant/suspended instance), `spt endpoint shutdown` (an agent suspends its *own* endpoint — graceful, fires the suspend boundary signoff, cascades shells offline), `spt refresh` (spt-hosted only — an agent clears + resumes itself without stalling: `/clear` → commune capture → guaranteed post-clear resume signal so its turn restarts from immediate next-steps; see `docs/CONTEXT-MEMORY.md`).

**`spt endpoint rename <id> <new_id>`** (endpoint rename, **rippled**): change an endpoint's logical ID and propagate the change to **all its instances** subnet-wide — the registry entries, every node's perch, and the synced context branches (`a-<id>` → `a-<new_id>`, and project-view references). The rename is one logical operation against the eventually-consistent registry; it must collision-check `new_id` against every subnet the endpoint is advertised into (the join-time check above), and reconcile concurrent renames by the same precedence marker the context merge uses (node + newest-wins, KNOWN-HAZARDS 6.5). The adapter-agnostic, node-anchored identity (ADR-0003) is what makes a clean rename possible; without it the ID would be entangled with per-adapter or per-session state.

**`spt endpoint fork <id> [--into <subnet>] [--as <new_id>] [--delete-source]`** (cross-subnet clone):
Clone an endpoint into a **new, identity-distinct endpoint** on another subnet — the sanctioned way to place an agent's mind in a different subnet (home subnet being immutable). The fork's **home = the target subnet**; it is **seeded with a one-time copy** of the source's mind (the live + project *Cross-node Psyche sync* tiers), after which the two **diverge** — there is **no ongoing sync**. A fork is therefore **not an instance**: an *instance* is the **same** identity on another node with a **synced** mind; a *fork* is a **new** identity with a **copied-then-independent** mind. The new id defaults to the source's bare name if free in the target subnet (the join-time collision check; else prompt/rename); the adapter follows the normal creation rule (*adapter selection*). The **source is untouched**, optionally deleted afterward (`--delete-source`, or a later delete). Distinct from **rename** (same identity, new label, rippled) and **instantiate-anywhere** (same identity, new *node*, consent-gated).

**transition echo commune:** any active → (dormant | suspended) transition fires an **echo commune** capturing the outgoing instance's final context delta; that commune syncs to whichever instance becomes active next. This is the finer mechanism behind catch-up-on-activation (and mirrors the sister project's "echo commune before signoff" pattern — KNOWN-HAZARDS 3.3).

**deferred-message gate:** deferred (spool-only, hook-consumer) messages are **not** delivered to a dormant or suspended instance — they hold until it is active again. (Extends the deferred-row semantics, KNOWN-HAZARDS 1.4/4.4, with an instance-state gate.)

**Remote-control vs local operation (two distinct modes — not the same as instances):**
- **Operate locally:** drive the native instance on *your* machine (its local files, its synced mind). The normal case.
- **Remote-control (Shell-like):** attach a control/view surface to an instance *running on another node* — compute + files stay remote; you are a viewport (the byte-stream terminal attach, daemon-to-daemon over Iroh). Used when you specifically want *that machine's* environment. This is effectively a Shell (a driven surface, user→agent direction), separate from the instance concept itself.
- **Smooth handoff (remote-control → local):** no teardown required. `git pull` brings the files; the **fresh-with-preload resume seam** preloads the other instance's latest synced context (psyche-download); the remote instance simply goes dormant. The mind follows; the files are local.

### Pieces the Instances model requires

**subnet registry**:
Distributed, eventually-consistent map `endpoint_id → [instances: (node, perch, type, status)]`, **per subnet**. Every `spt-daemon` participates in the registry of each subnet it belongs to. This is the "agent name resolution" the sister project deferred — now first-class and mandatory.

**endpoint visibility (per-(endpoint, subnet))**:
Whether an endpoint is exposed to a given subnet. **Hidden means *excluded*** — not advertised in that subnet's registry **and not routable from it** (addressing it from that subnet fails until explicitly revealed), not merely unlisted. This makes hiding a real boundary, ready for the cross-user (b) seam, not a silent non-boundary. Computed as: endpoint `E` is **hidden in subnet `S` iff** `S.hide_new_endpoints` (a per-subnet policy captured at the node's **join time**) **OR** `E.default_hide_from_new_subnets` (a per-endpoint setting) — **unless** an explicit per-`(E,S)` override says otherwise (the override always wins). Both defaults ship **OFF** (visible); hiding is opt-in. **Visibility gates sync:** `hidden ⟹ not synced`, and replicating a mind into `S` (the sync-membership list under *Cross-node Psyche sync*) requires `E` be visible in `S` — visibility is the outer gate, sync-membership the inner opt-in within visible subnets.

**home subnet (per endpoint)**:
The single subnet that anchors an endpoint's *defaults*: the default **sync** scope (subnet-exclusive mind replication — *Cross-node Psyche sync*) and the subnet that qualifies its **bare name**. **Assigned at creation:** the node's **sole** subnet automatically; on a **multi-subnet** node it **must be specified** at creation (no silent guess — mirrors the *resolution policy*'s refuse-and-qualify); on an **unpaired** node the endpoint is local-only until first join, when home is set. Distinct from advertisement: identity is **node-global and advertised into *every* subnet the node belongs to** per the visibility defaults — home subnet is only the default-scope *anchor*, not a limit on where the endpoint is visible or addressable. **Home is immutable** — there is **no re-home** (ADR-0010). To place an agent's mind in a different subnet, **fork** it (see *`spt endpoint fork`*) and optionally delete the original; this avoids any scope-migration / stale-mind ambiguity (the source is untouched until explicitly deleted).

<!-- [doc->REQ-ENDPOINT-UNBOUND-ATTACH] -->
**Unbound endpoint**:
The lifecycle point between *spawn* and *bind*: an spt-hosted endpoint whose broker **session + PTY are live** but whose harness has **not yet bound** its perch (the *post-spawn seam* hasn't fired — e.g. the harness is waiting on a startup prompt). On-disk status `unbound` (spawn → `unbound`; bind → `online`; session death → `offline`). An Unbound endpoint is **attachable** (a live PTY — `spt rc` and the `endpoint run` attach reach it, so an operator can see and drive the harness, including clearing a bind-gating prompt) but **not message-addressable** (no bound `session_id` yet — messaging stays gated on `online`). Distinct from *offline* (no session) and from *online* (bound). In the picker it renders **hollow** (and hollow-controlled when driven) — not amber, which is the *harness-only* (online-but-not-broker-controllable) state, the opposite of attachable.

<!-- [doc->REQ-INST-14] -->
**resource advertisement (subnet resource registry)**:
A per-endpoint **free-text blurb** describing the services/functions the endpoint can serve — an agent **yellow-pages** for service discovery, distinct from *capability declaration* (machine-readable, which endpoint *types* a node hosts) and from *endpoint visibility* (whether it's addressable at all). **Both-authored + mutable:** config seeds a default; the agent refines its own at runtime (`spt endpoint description set …`). It is **not a separate registry** — it is a field on the endpoint record and a **projection** of the subnet registry (`(id, node, resources-blurb)`), surfaced as the "subnet resource registry" view. **Gated by the access layers:** an endpoint excluded from subnet S (visibility) — or, later, one whose access whitelist (*endpoint access whitelist*) excludes the viewer's node — never appears in that view; discovery leaks nothing a viewer couldn't reach. Synced as **registry data, not context** (directory metadata, not the agent's mind). M4 (needs the distributed registry).

**resolution policy**:
How bare `ling` resolves when multiple instances exist. Rule: local-node instance if present → else most-recently-active → explicit override always available. **Two qualifier syntaxes**, combinable: **subnet-qualified** `home:ling` and **node-qualified** `ling@hfenduleam` (→ `home:ling@hfenduleam`). **Multi-subnet ambiguity rule:** under node-global identity (endpoint ID below), a node on several subnets may see two *distinct* endpoints that share a bare name (a `home:ling` and an unrelated `work:ling`). When a bare `id` is ambiguous across the visible subnets, resolution **refuses and forces qualification** (by subnet or node) rather than guessing. Within a single subnet, bare ids stay unique (join-time collision check, below).

**instantiate-anywhere** (seam day-one, behavior deferred):
From node Y, spin up an instance of an endpoint on node X by routing a launch command to X's daemon, which runs the endpoint's manifest locally. "Agents can spin up instances (including of themselves) on any node, possibly requiring user consent" — so a **consent gate** governs remote instantiation. The consent-UX design is deferred; the model is built to accommodate it.

**cross-instance context freshness** (two-tier):
When a dormant instance is activated again, it is unobtrusively fed the latest Psyche context — newest-among-peers *and* newer than its own. Scoped by the existing Psyche-context split:
- **live context** (per-agent, project-independent — the agent's general mind): synced to **all** instances of the endpoint, regardless of project.
- **project context** (per-agent-per-project — project-specific detail): synced **only** to instances sharing the **same project**.

So all `ling` instances catch up on live context; only same-project `ling` instances catch up on project context. This resolves the "divergent simultaneous work" concern (two instances on different projects don't clobber each other's project detail) without losing the "mind follows everywhere" property.

The precedence/freshness guard (KNOWN-HAZARDS 6.5) — with node identity in the marker — keeps a dormant instance's stale context from clobbering the active authoritative one.

**Authoring directives + memformat** (full design: `docs/CONTEXT-MEMORY.md`): the live/project split is made stable by (1) a per-*topic* cross-project discriminator with default-to-project bias, (2) **commune-source asymmetry** — echo communes are project-primary and live-conservative (the leak fix), and (3) **deep memformat integration** — memformat becomes the load-bearing, self-evolving, **tier-tagged** schema (each topic tagged `live`/`project`, decided once at creation, re-tiering migrates content), whose topics are containers of **structured blocks** (id + Concepts/Aliases/Applies-to + Source + timestamps + Hash + Pinned). A generated **Retrieval Map (INDEX)** per project + per agent realizes cross-agent synthesis + query-routing. **Start-injection** is dense on the working set (live + current-project hot/`Pinned` blocks → immediate productivity) and sparse on the tail (other projects, cross-agent synthesis, cold blocks → pointer + INDEX, on-demand), tunable via a density knob. Several patterns adapted from `BigscreenVR/pi-agent-memory`.

### Off-node reach-back

When the user is driving an instance from off-node, the agent should be able to:
- **detect** that it is being driven remotely (and from which node) — observable from the `spt-daemon`, which knows whether the driving session-surface stream is local or network-attached; (v1)
- **transfer files** to/from the user's (driving) node — file payloads over the messaging substrate; (v1)
- **run commands** on the user's node — remote command execution, **deferred + security-gated** (shares the consent model with instantiate-anywhere; arbitrary cross-node exec is the highest-risk capability in the system). Deferred partly because workarounds exist: the remote-node agent can message a local-node agent to run the command, or the user can interact with a local-node agent directly. (deferred)

### v1 scope line (V1-mid)

- **In v1:** the endpoint-ID / instance data-model split; the subnet registry; resolution policy; remote-drive of *already-running* instances (network session-surface); cross-node Psyche sync (full — replaces gh-repo-sync); cross-instance context-freshness auto-feed; off-node file transfer; off-node remote-drive detection.
- **Seam-compatible but deferred:** instantiate-anywhere + its consent gate; remote command execution + its security gate (workarounds: remote agent messages a local agent, or user interacts with a local agent directly).
- **Rationale:** the data-model split and registry are foundations — deferring them forces a later rewrite, so they are in regardless. Remote-drive of a running instance is the highest-value slice and exercises the whole networking+daemon stack end-to-end. Instantiate-anywhere and remote-exec carry consent/security design that should not gate v1.

## Workspace & versioning

The crate-boundary design is the rebuild's main value (ADR-0001) — honest decoupling that prevents the sister project's tangle from re-forming. Principle: **many small, acyclically-layered crates** (not a few fat ones).

Layering (bottom → top): `spt-proto` (wire envelope grammar, endpoint types, message framing, Ed25519 identity types, typed+binary payload schema — pure types, no I/O) → `spt-store` (spool, perch layout, trust store, registry persistence) → `spt-msg` (delivery TCP+spool-fallback, routing, send/ring) → `spt-net` (Iroh, mDNS, SPAKE2/TOTP pairing, subnet-registry distribution; `net` feature-flagged) and `spt-term` (session-surface trait, PTY impl, broker) and `spt-runtime` (`AgentRuntime` trait, `ManifestRuntime`, manifest schema) → `spt-live` (Psyche/pulse/commune/signoff) → `spt-daemon` (the brain: ties all, hosts broker, update orchestration, instance registry) → `spt` (the `spt.exe` binary; thin CLI).

**public SDK surface** (semver-committed): `spt-proto` (the wire contract — anyone speaking SPT depends on it), `spt-runtime` (the trait third-party harnesses implement), `spt-msg` (the deeper messaging-integration path). Everything else (`spt-store`, `spt-net`, `spt-term`, `spt-live`, `spt-daemon`) is internal/unstable pre-1.0 — reachable, but no stability promise. The *integration* crates are stable; the *machinery* churns. **Private-source consequence (ADR-0014 era):** with the source repo private long-term, the SDK crates are **first-party-only** for now — the third-party integration surface is the **binary** (manifest + `spt api`), exactly what the harness contract was designed for. rustdoc for the SDK crates is still generated and deployed to the Pages docs (no docs.rs — that needs crates.io). Publishing the 3 SDK crates to crates.io (their source goes public; machinery stays private) is the explicit later path if third-party deep integration materializes.

**wire-protocol version vs crate semver** (orthogonal, never conflated): `spt-proto` carries an explicit **wire-protocol version** independent of crate semver. Two nodes — or a node and a linked library consumer — interoperate iff their proto versions are compatible, with a documented **N-1 compat window**. Crate semver governs *API* stability; proto version governs *wire* stability.

## Pairing & trust

How a single user's own nodes come to trust each other and form a subnet. Cross-*user* subnet (two *different* users' subnets interoperating) is explicitly deferred (the sister project's "cross-subnet contract" — very-early future concept), but the model is **built with the seam for it** (see *subnet membership*).

**subnet member / membership proof (seed-proof)**:
A **member** of subnet S is any node that can prove knowledge of S's **current-epoch seed** (the seed every trusted node already holds — *TOTP-seeded SPAKE2 pairing*). This **membership proof** ("seed-proof") is the authorization the inbound gate consumes: a connecting peer is admitted iff it proves current-epoch seed-knowledge, bound to its QUIC-handshake-proven node pubkey (so identity stays per-node and unforgeable — *trust store* / KNOWN-HAZARDS 7.5). Membership = seed-knowledge, **not** pairwise pinning: a member is accepted even by a peer it never paired with (the mesh property), which is what lets non-directly-paired nodes reach each other. Symmetric by design — any member can prove membership and any member can vouch a new one — which is sound under the v1 same-user model (every member is the one user's own node; a compromised member already holds the seed, so seed-proof adds no surface beyond what *seed rotation* already mitigates). _Avoid_: conflating "member" (seed-proof, dynamic) with "trusted peer" (a pairwise pin in the *trust store*).
_Replaces_: the pairwise-`is_trusted` authorization at every inbound gate (registry apply, WAN receive, sync, notif, connection accept).

**member roster**:
Per subnet, the set of member node pubkeys with their addresses and labels — a **discovery** directory ("whom to dial / what to call them"), **not** an authorization list (authorization is *membership proof*). Propagated transitively: seeded into the joiner at pairing and gossiped whenever any path opens, so a node learns members it never paired with. **Discovery-only and therefore forgery-inert** — a fabricated roster entry just names a pubkey that still cannot produce the seed-proof on connect, so it grants nothing; a stale entry is a dead address you fail to dial (harmless, like a stale *trust store* row). Distinct from the *subnet registry*: the roster is node-level membership/address data; the registry is endpoint-level instance data fetched **directly from each member** over a handshake (rows are never relayed — *membership proof* / KNOWN-HAZARDS 4.10). **Shape:** a union-merge grow-set — each node authors its own entry (address/label/machine_id, stamped with its monotonic epoch, merged strictly-greater-wins like `node_label`); **seeded in full at pairing** (the seed-holder hands the joiner the whole current roster, so a fresh node knows even offline members — this folds in the deferred pairing-time hostname capture + post-join address seeding); merged on every member connection. Removal needs a **tombstone** (a grow-set can't delete by omission — an un-tombstoned revokee re-inserts itself on its next connect): a per-pubkey revoked marker that dominates the entry and gates admission (*seed rotation / revoke*). Persists through silence (an offline member keeps its entry); dropped only by tombstone.

**subnet membership (multi-subnet)**:
A node may be a member of **multiple subnets at once** — but, for v1, only multiple subnets *of the same user* (e.g. a `home` fleet and a `work` fleet, each cryptographically its own TOTP seed + trust context). A node simply holds *N* seeds / trust-contexts rather than one. The per-subnet trust boundary is unchanged: within each subnet, all nodes are the one user's own mutually-trusted nodes (no stranger-auth). **Cross-*user* membership (joining another person's subnet) stays deferred**, but the membership model is multiplicity-ready so it can drop in later without re-architecture. An **external subnet** (relative to a given subnet) means *another subnet the node also belongs to* — in v1, always another of the user's own.

**Cross-user seam (forward design, not v1):** the seed key generalizes from *per-subnet* to **per-(subnet, user)** — a seed then doubles as an identifier+authenticator for *whose* it is. The pairing/code prompt becomes **2-stage** (select user / create-new → then select subnet / create-new); in v1 the single implicit user collapses it back to the 1-stage prompt. This makes node↔user attribution fall out for free ("node `enlyzeam` is using Brandon's `work` seed → that machine's active endpoint is Brandon's") and makes **force-unlink of an entire user's node-group** a single operation (drop that user's seeds). v1 stores seeds in a shape that already accommodates the (subnet, user) key.

**seed rotation / revoke**:
Removing a member is **`spt subnet revoke <node>...`** (elevation-gated, revoke-only — *adding* a member never rotates; the joiner just receives the current seed at pairing). Two effects:
- **Immediate:** writes a **roster tombstone** per revoked pubkey (see *member roster*) — propagates over member connections, suppresses that pubkey under the roster's union-merge, and augments the inbound gate to **membership-proof ∧ ¬tombstoned** so the node can't reconnect-and-reinsert (it still holds the seed until rotation). Force-drops its connections.
- **Coalesced rotation:** the tombstone schedules **one** seed rotation (re-mint seed, bump the **seed-rotation epoch** — `SubnetRecord.epoch`, ADR-0005 #10; push the new seed confidentially over member-authenticated TLS connections, **never** in roster/registry gossip — the seed is not roster data) at the close of a **coalescing window (default 1 h)**. Further revokes within the window join the same rotation → **one epoch bump** however many nodes, keeping benign offliners inside the single-epoch *re-seed* grace. `--force-rotate-seed` skips the window and rotates **now** (compromised-node path: seed dies immediately, not in an hour).
A completed re-pair ceremony for a tombstoned pubkey clears its tombstone (deliberate re-admit).

**re-seed (auto-heal grace)**:
A benign member that was **offline during a revoke** returns on the *prior* seed epoch (N-1) and would fail *membership proof* against rotated peers. The grace: a node proving the **immediately-prior** epoch **and still on the *member roster*** is granted a **re-seed-only** restricted connection that hands it the new seed — nothing else. Heals a sleeping node automatically; the **revoked node is off-roster → denied** (not a revocation hole); a node stale by **≥2** rotations (N-2) gets no grace → re-pair. Grace depth is **one epoch** (a verifier only retains the single prior seed); *seed rotation*'s batch-revoke keeps multi-removal to one epoch bump so it stays inside this window.

**pairing vs join (verbs)**:
**Pairing** names the *ceremony* (the TOTP-seeded SPAKE2 exchange between two nodes). **Join** names the *outcome from the subnet's perspective* — a node joins a subnet (`spt subnet join`). User-facing surfaces use **join** (the user's mental object is the subnet, not the node-pair); "pairing" remains correct for the ceremony mechanics and the pre-trust ALPN.
_Avoid_: "pair a subnet" (nodes pair; a node *joins* a subnet).

**joiner / seed-holder (ceremony roles)**:
The **joiner** is the new, not-yet-trusted node running the ceremony's initiator side (`spt subnet join`, typed code). A **seed-holder** is any already-trusted member node holding the subnet seed; every member is one, and any online member's daemon answers the join rendezvous as responder — always-on, no arming step (the user interacts only with the new node; the subnet-global rate limiter is the standing-listener guard).

**TOTP-seeded SPAKE2 pairing**:
The day-one pairing model. A durable **TOTP seed** is the subnet secret: generated on the first node, shown as a QR → stored in the user's authenticator app *and* held by every already-trusted node. To pair a new node, the rotating 6-digit TOTP code is used as the **password for a SPAKE2 (PAKE) handshake** — *not* as a bearer token verified by a seed-holder. A trusted node (online, no human needed — it holds the seed) computes the current code as its PAKE password; the user reads the code off their phone and types it into the new node. Matching codes → PAKE succeeds → pubkeys exchanged and bound, MITM-resistant encrypted channel established.

Why this construction (not plain TOTP-verification, not plain Magic Wormhole):
- **Not TOTP-as-bearer-token** — a 6-digit code sent over the wire for verification is ~20 bits, replayable within its 30s window, and doesn't bind to the key being exchanged (real-time MITM/replay seam). PAKE makes the low-entropy code MITM-resistant and limits attackers to one online guess per attempt (no offline brute force).
- **Better UX than plain Magic Wormhole** — Wormhole needs a fresh code on one node typed into the other, requiring a human/relay at *both* ends each pairing. With the persistent TOTP seed replicated to online trusted nodes, the trusted side is automatic; **the user only ever interacts with the new node** (read phone, type into new machine).

**trust store** (RETIRED — superseded by *member roster* + *membership proof*):
Historically a local TOFU store (`peers.json`) of pinned peer pubkeys that **authorized** inbound connections. The mesh model replaces pairwise-pin authorization with live **seed-proof** (*membership proof*), so the pinned-peer list is gone — **hard cutover, no `peers.json`** (single-user fleet, no migration). Its one surviving function is **warn-on-change**, reframed: an **awareness notice** (not a gate — seed-proof already admitted the peer) fired when a node presenting a **known machine_id** appears under a **new node pubkey** (*"machine M, last seen as K1, now presents K2 — reinstall/maintenance, or investigate"*). Anchored on **machine_id**, not label (hostnames collide; machine_id is stable across an spt reinstall). Honest limit: machine_id is self-asserted in gossip, so this defends against benign confusion, **not** a seed thief (already full-compromise, *seed rotation*-mitigated). Same event drives the **REQ-SUBNET-7 re-pair overwrite** (known machine_id + new key → warn *and* supersede the dead roster entry).

**link discovery (TOTP-epoch + name) — the meet selector is public; the code is not**:
The pairing rendezvous (the **meet**) routes two not-yet-trusted nodes to the same relay rendezvous over the pre-trust pairing ALPN. Its selector is **`(subnet-name, TOTP-epoch)` — both public**, *not* the secret code. Distinguish the two TOTP-derived values (they are routinely conflated):
- **TOTP-epoch** — the current 30 s time-bucket (`floor(unix/30)`, the code's `totp_step`), derivable from the (NTP-corrected) clock alone; one of the two **meet** selector inputs.
- **TOTP-code** — the secret 6-digit `HOTP(seed, epoch)`; the **SPAKE2 password only**, consumed in the ceremony *after* the meet, **never a discovery input**.

The subnet-name (the R-PAIR-4 human label) is the second meet input in all cases, namespacing the rendezvous. Because the meet is **code-independent**, a joiner can find a seed-holder *before* it has the code — which licenses the **two-phase join** (meet on name + epoch first; prompt the code only at the ceremony). _Avoid_: saying "the code routes discovery" — the public epoch does; the secret code only authenticates.
- **join-existing** → enter the target subnet's name; the daemon meets any online seed-holder on `(name, current-epoch)`; the **code is entered for the SPAKE2 ceremony**, not to discover. The joiner never enumerates the subnet's nodes.
- **create-new** → the subnet is **named at creation**: one machine generates the new seed *and names the subnet* up front (becoming the sole seed-holder), then the joiner uses that subnet-name + the new seed's code. No node-name mode — naming simply happens at link start rather than after.

The name only namespaces the rendezvous; a collision (two unrelated pairings sharing a code + name) merely fails the PAKE (wrong password) — no security break, just a retry. **Rendezvous-token hashing (under the hood):** the relay routes the *pre-trust* pairing by a rendezvous token; the payload is already E2E-encrypted (R-NET-2), but the token itself is relay-visible, so spt-core derives it as `H(subnet-name ‖ TOTP-epoch)` rather than the plaintext label — the user/agent still enters the **raw name + code**, the hash is internal.

**fetch-code-from-any-node (per-subnet, QR-optional)**:
Every node in a subnet holds that subnet's seed, so the user can fetch the *current* code for **any subnet the node belongs to** from any node in it — no phone required if a trusted node is handy. Because a node may be in several subnets (*subnet membership*), the fetch is **per-subnet**: with several subnets and no name given, the CLI prompts *"Show the code for which subnet?"*; `spt subnet show-code [name]` bypasses the prompt (the scripted path). Minting a new subnet is its own verb (`spt subnet create <name>`), not a fetch flag. The code is offered optionally as a **QR / `otpauth://` URI** so the seed can be stored directly in an authenticator app (Google Authenticator etc.).

**Node-bound code fetch — and every subnet-membership mutation — is gated behind OS privilege elevation** (Windows UAC / Linux root-or-equivalent): retrieving a subnet's code *from the node*, minting a new subnet (`subnet create` — a seed reveal), and joining one (`subnet join` — enrolling the machine into a trust fabric) all require either hardware/elevated access OR an **elevated endpoint** (an agent whose process is elevated can surface it). The join gate exists because membership is a trust-boundary change: an unprivileged process must not be able to enroll the machine into an attacker's subnet without the user's consent. Read-only subnet views (`subnet status`) are ungated — they reveal no secrets. This means the node-bound path proves real possession of the machine; everyone else falls back to **their own authenticator-app TOTP store** (where the seed was stored at pairing). The gate narrows the multi-subnet exposure — without elevation, mere CLI/agent presence on a node no longer yields *any* subnet's join-code; with it, node compromise still implies full trust loss for that node's subnets (unchanged baseline).

**self-elevating re-launch (cross-platform)**:
<!-- [doc->REQ-ELEVATE-1] -->
When a gated command is run unelevated, spt does not just print "run as administrator" — it **re-launches itself with privilege** so the user reaches the result in one step. The path is chosen by a pure decision seam (`elevation::decide_elevation_path`) from the OS, the current elevation, and the environment: an **interactive Unix TTY** re-execs inline under `sudo`; a **Linux desktop without a TTY** prefers **`pkexec`** (native polkit GUI auth, clean stdio) and falls back to a **terminal-emulator** (`x-terminal-emulator -e sudo …`, then gnome-terminal/konsole/xterm); **Windows** pops a **UAC console** via `ShellExecuteW("runas")`; anything else (headless / no path) prints the absolute-path command for the human. The elevated child runs in its own console — on Windows it self-pauses ("you can close this window") so a fresh UAC console stays legible, since the unprivileged parent **never captures the elevated child's output across the privilege boundary**. The security discipline is `REQ-HAZARD-SELF-ELEVATE` (KNOWN-HAZARDS 5.11): the re-launch re-runs the **exact** invocation with the binary's **absolute** path, **never** widening args, resolving a bare name, or interpolating a crafted arg into a shell string (every launcher passes an argv array; the Windows params string MSVC-quotes each verbatim arg). The user's UAC/polkit/sudo prompt is the only consent gate, and an already-elevated process **never re-elevates** (loop-safe). The mechanism is generic — reused by every gated command, not subnet-specific. On **create** and **join** (and `show-code`), the elevated output includes the subnet's code, `otpauth://` URI, and a **terminal QR** so the seed can be stored straight into an authenticator app from either side of the desk.

**node label**:
A human display name for a node, defaulting to the machine's OS hostname (re-checked at daemon startup; a hostname change updates the label). Advertised through the existing registry gossip so subnet views render `HFENDULEAM (bcead52b…)` instead of bare key hex. The pubkey remains the identity; the label is **addressable**: an `@node` qualifier accepts a label or a key-prefix (`ling@hfenduleam`). Labels are not unique — an ambiguous label follows the resolution policy's refuse-and-qualify rule (refuse, list candidates with key prefixes), never a guess.

**`#` always-on sigil** (ratified 2026-06-21):
A reserved **leading address sigil** marking an [[AlwaysOnEndpoint]], extending the `:`/`@` reserved-delimiter discipline (id charset stays narrow per `REQ-HAZARD-ID-CHARSET`; the sigil lives at the address-grammar layer, **never in the bare id** — `#general` addresses the clean stored id `general`). It is **mandatory and bijective**: `#name` ⟺ an always-on endpoint, bare `name` ⟺ an agent endpoint — so the router resolves the endpoint **class from the address alone**, before any registry lookup. Placement **hugs the id** within the qualified form: `[subnet:]#id[@node]` (e.g. `#general`, `#general@node`, `home:#general@node`). A mid-id `#` stays charset-rejected; only a single leading `#` on the id token is the sigil.

**subnet naming**:
The first time two nodes pair, the user is prompted to **name the subnet**. The name is a human label on the subnet identity (which is cryptographically the shared seed). On every subsequent pairing the same name is shown ("adding node to subnet `<name>`"), giving the user confidence they're extending the right subnet as the fleet grows.

**subnet icon** (GUI metadata): an optional small image (square PNG/WebP, ≤256 KB) representing the subnet in future GUIs. Stored **inline** in subnet metadata (so it syncs for free over the same subnet-material channel as the name; the cap keeps inline-sync trivial — it's an icon, not a media library), **editable by any node in the subnet, any time** (seeded at create/name). **GUI-only consumer** (frontend milestone); data distribution rides M4.

Design caveats (carried forward, none disqualifying): a seed-holder must be online + reachable by the untrusted new node at pairing time (relay allows pre-trust contact on a pairing ALPN); ±1 TOTP window tolerance; rate-limit handshake attempts; **recovery** = re-provision the QR from any still-trusted node (all hold the seed) — losing all nodes *and* the auth app makes the subnet unrecoverable (documented); **revocation** of a paired node is a separate trust-store delete (TOTP gates joining only).

**subnet attachment (attached / detached)**:
Whether this node's daemon is **actively serving** a held subnet membership right now — pairing responder reachable, rendezvous meet listener rotating, registry gossip pumping. **Attached** = serving; **detached** = the membership record (seed) is held on disk but deliberately not served (the daemon neither advertises into nor connects to that subnet). Detachment is a *chosen* per-subnet state — `spt subnet detach/attach <NAME> [--save]` (shipped M8-D2; `--save` persists the startup default in daemon config, renamed from the once-planned `--auto`; an unsaved flip deliberately does not survive a daemon restart). A *degraded* daemon that cannot serve at all (net-less broker — endpoint bind failed) is rendered as **no connection**, never conflated with deliberate detachment. Corollary, ratified 2026-06-06: **membership implies reachability** — the membership-creating verbs (`subnet create`, `subnet join`) ensure the daemon is running, because a sole seed-holder with no responder is a contradiction of the subnet's purpose.

**endpoint description** (canonical CLI term for the resource-advertisement blurb):
The per-endpoint free-text "yellow-pages" line (see §resource advertisement). Surface term is **description**; "blurb" survives only as the internal field name.

**whoami** (alias for endpoint list):
<!-- [doc->REQ-WHOAMI-1] -->
`spt whoami` is a thin **alias for `spt endpoint list`** — it prints the full view with the session's own endpoint **SELF-pinned first**, that pin carrying the endpoint's id, liveness state, and its authored **endpoint description** (the "who am I" answer). There is no separate bare-id command: nothing captured `id=$(spt whoami)` (environment variables don't persist between an agent's tool calls), so there is no scripting contract to preserve. `whoami` stays a top-level hot-path verb (its parse is unchanged, REQ-MSG-9); only the SELF pin's new description line is added behavior.

**endpoint list always merges local perches**:
<!-- [doc->REQ-ENDPOINT-LIST-MERGE-LOCAL] -->
`spt endpoint list` (and therefore `whoami`) **always** appends this node's **LOCAL perch roster** as a trailing section, in addition to the SELF pin and the subnet groups. The subnet groups are the WAN registry snapshot, which lags a just-bound perch by a pump cadence — so without the merge a freshly-online endpoint (or the caller's own, under `whoami`) could be **absent** from its own listing, which reads as lost. The earlier `--local` flag (a separate this-node-only view) is **removed**: the local view is no longer a mode, it is unconditionally part of the merged listing. `--subnet`/`--detail` still shape the subnet portion.

**local-link authentication**:
Cross-node traffic already rides Iroh (E2E-encrypted by node keypairs). The exposed surface is *same-node* channels where potentially-untrusted code touches SPT — a shell binary's HTTP/stdin/relay link, HTTP-to-harness-binary delivery. These require a **per-link handshake at `api bind` that establishes a link token + local-channel encryption**, required on every subsequent message — a local-link auth capability so other local processes can't inject into or read a link. Shell↔broker links specifically are encrypted + handshaked.

**binary-trust disclosure (accepted risk)**:
A shell binary is the least-trusted code in the system (3rd-party, runs on the user's node, agent-controllable) — but the same is true of harness binaries. The capability toolset bounds what an *agent* may ask; it does not sandbox what the *binary* may do with its OS permissions. Optional **shell-binary sandboxing** is deferred (gated like instantiate-anywhere/remote-exec). The baseline stance: **running any adapter or shell binary means trusting it — a disclosed, accepted risk for all spt-core users.**

## Consent & security gates

Gates the high-risk cross-node capabilities. **Trust boundary:** everything is within *one user's own subnet* (their own TOTP-SPAKE2-paired, mutually-trusted nodes; cross-subnet is deferred). So consent is **not** stranger-authentication — it is guardrails against (1) an agent doing something surprising/destructive the user didn't want, (2) **runaway** (spawning instances / burning compute/billing across nodes), (3) **compromise containment** (limiting a compromised node/agent's cross-node blast radius).

**Consent model — hybrid (grants + interactive escalation):**
- **Grant store** — records `capability × subject (agent) × target (node)` (this granularity is sufficient; finer per-command-pattern scoping is a later refinement). **Enforced at the target node** (the node receiving a remote action checks its local grants), **settable subnet-wide** (a grant authored from any node propagates to the target), and **revocable**. Lives with subnet security material (near the trust store), not in the context git repo.
- **Interactive escalation** — an ungranted high-risk action → the target node routes a **consent prompt to the user's most-recently-active session** (the registry-resolution precursor the update-prompt uses; PresenceChannel later generalizes it). Options: **allow-once / allow-always (writes a grant) / deny** — plus the harness's native free-text field (e.g. CC's AskUserQuestion "Other") for refinement.
- **Pre-consent flags** — the shell `can_shutdown` and endpoint `shell_wake_spawn_anywhere` flags are simply grants authored ahead of time via manifest/endpoint settings (same model, different authoring path).

**What is gated vs not:**
- **Gated (default-deny, grant or prompt):** remote command execution (highest risk), instantiate-anywhere (spawn an endpoint on a remote node).
- **Pre-consented by flag:** shell owner-shutdown, shell wake-spawn-anywhere.
- **Ungated:** remote-drive of your *own running* instance (low risk — your own instance; a light "node X is driving `ling@desktop`" notification suffices), cross-node context/Psyche sync (your own data).

**endpoint access whitelist** (distinct from the grant store — the outer reach gate):
A per-endpoint allow-list controlling **who may remotely reach** an endpoint, **by origin node** (and, later, by user). The load-bearing semantic: it checks the **origin node of the inbound interaction** (where the operator is acting), **not** the sender endpoint's identity or home — *"the operator must be on a whitelisted machine; the sender endpoint's location is irrelevant."* It is a **stateful-firewall** model, not a blanket block:
- **Outbound** from the endpoint → any **visible** node (the whitelist never restricts who it talks *to*).
- **Inbound** → a **reply** correlated to the endpoint's own prior outbound (reply traffic, keyed on the inbound `from`) is allowed from any visible node ("established/related"); an **unsolicited/direct** control or message ("new inbound") is allowed only from a **whitelisted** node.
- **Same-node operation is always allowed** (you're at the hardware; the home node never whitelists itself). The whitelist gates **remote (cross-node)** reach only.

**Default empty = open** (any subnet-visible node, current behavior); setting a whitelist *restricts*. **Node-tier ships now (M4)** — whitelist Ed25519 node pubkeys. **User-tier is deferred** (the per-(subnet,user) model, ADR-0006 cross-user seam) — the schema reserves an inert `users` field until then. Enforced **at the target endpoint's node**, synced as **security material near the trust store** (not the context repo), subnet-settable, revocable — same plumbing as the grant store, **different table + polarity** (origin-node/default-open vs agent-subject/default-deny). **Composition — three orthogonal gates, nested:** subnet *visibility* (is it routable here at all?) → *access whitelist* (may your node reach it?) → *capability grants* (may this agent do this high-risk thing?). Discovery (*resource advertisement*) is gated by the first two.

**Scope:** the gated *capabilities* (remote-exec, instantiate-anywhere) are deferred, so the *full* consent UX lands with them. **v1 ships the framework seam** — the grant-store shape + the deliver-consent-to-user mechanism — so the deferred capabilities drop into a coherent gate without a rewrite.

### Messaging substrate

**arriving-message envelope**:
Every message that arrives at a consumer — an agent over `api listen`/`spt ready`, an adapter's hook composer over `api poll`/`api worker-poll`, the cross-node WAN feed — is delivered as the canonical **`<EVENT type="…" from="…">body</EVENT>` envelope** (`spt-proto::event`, the ADR-0001 grammar; self-delimiting, `from` attribution, `<br>` body escaping, `<EVENT-PART>` chunking). It is the one format adapters parse on, identical across every **arriving-message** surface on an agent perch. The early-legacy `__REPLY_TO__:` delivery frame was a mis-elevated relic and is removed (ADR-0020); reply-routing rides the envelope's `from` attribute. **Scope carve-out:** the **shell-command relay** (`api poll <shell-id> --link`) is a distinct *internal* transport, not an arriving-message surface — it carries the **raw MAC'd stamped command frames** the shell child consumes verbatim (verifying the MAC, parsing its own vocabulary), so it is **exempt** from `<EVENT>` composition (`notify_shell_e2e` guards this).

**file-transfer progress**:
Every file transfer over the substrate is addressable and **progress-queryable mid-flight** by both the agent and the SPT binary (for the GUI) — at any point during the transfer. A substrate-wide requirement, not shell-specific (the Shell text+file channel inherits it).

## Installation

<!-- [doc->REQ-INSTALL-1] the two-paths model + the one-line script half (v0.1 phasing below; OS-service leg = docs/DEFERRED.md) -->
<!-- [doc->REQ-INSTALL-2] the marketplace-repackaging stance: relocatable binary + minimal, non-OS-entangled install logic -->
spt-core is per-machine and harness-independent, so it installs *before* and *independent of* any adapter.

**Two install paths, (b) is the primitive:**
- **(a) harness-bootstrapped** — a harness plugin installs spt-core as its bootstrap step (today's cplugs model: `/plugin` pulls the plugin, which fetches the spt-core binary). The plugin's role shrinks to this bootstrap; spt-core self-updates thereafter.
- **(b) standalone** — the user installs spt-core directly with no harness, for a Pi node, a Shell-only node, or a headless server. Path (a) calls into (b).

**Installer form:** a one-line script (`curl | sh` / `irm | iex`) that fetches the **signed** binary for the platform, verifies it, places it, registers the binary dir on system-wide PATH (so adapters call `spt api …` cross-OS), and registers the daemon as an OS service. **v0.1 phasing:** the scripts (hosted at the canonical Pages URL, ADR-0014) download the latest release binary, **sha256-check** it (full ed25519 verification is `spt update`'s job thereafter; first-fetch trust = HTTPS + GitHub), place it, and register the *user* PATH — **OS-service registration is deferred** (daemon auto-start on `spt` invocation covers dev-stage use; gap: node unreachable after reboot until something invokes `spt`). **The install path must be non-interactive** (it doubles as every adapter's pack-in on-demand install — no second mechanism); if the one-liner ever grows interactive elements, a flagged non-interactive mode is mandatory. First-run identity gen + daemon start are already unattended; pairing stays a separate explicit step. Chosen for the dev-tool audience, cross-platform reach, and because the binary self-updates after. **Marketplace-repackaging-friendly:** nodes are foreseen on novel platforms (Android, medium-power Linux handhelds), so the install must be easy to repackage for platform marketplaces (e.g. PortMaster for handhelds, F-Droid-style for Android) — a relocatable binary + minimal, non-OS-entangled install logic.

**Daemon lifecycle:** registered as a systemd user service (Linux) / Windows service or scheduled task for the always-on guarantee, with `spt api`-triggered auto-start as the fallback (above). The first network bind triggers the OS firewall prompt here.

**What "installed" comprises:** the `spt` binary + the broker + the `$SPT_HOME` root (holds `node.key`, trust store, registry, spools, in-memory-seed fallback state). **First run is idempotent + interactive-optional:** generates the node identity and starts the daemon unattended; pairing (and subnet naming) is a separate explicit step.

**Legacy migration at install:** standalone install **auto-detects an existing `claude_skill_owl` (modern SPT) install and offers migration** (the migration commitment above) — identity, agents, tracked Psyche context.

**adapter registration (`spt adapter add`)**:
How a node comes to *know* an adapter — harness or shell. An explicit **`spt adapter add <path>`** (or **`--github <user/repo>`**) validates the manifest against the published JSON Schema and writes a registration record under `{SPT_HOME}/…/adapters/` — a **copy** of the files for `file_pull`-update adapters (spt-core owns what it later swaps) or a **pointer** for `delegated`-update adapters (the plugin owns + updates its own files). One command + one dir for both `kind="harness"` and `kind="shell"`; the `kind` field differentiates. The `--github` form **fetches the manifest first** (readable-before-install — same rule as the `min_spt_core_version` readable-before-update gate), checks compatibility, then completes the install via the manifest's own `[update]` avenue: **install is the first update** (one fetch/swap mechanism, not two). <!-- [doc->REQ-INSTALL-9] release-archive adapter acquisition; a third add source, distinct from the [update] ripple avenue --> A third acquisition source, **`--release <user/repo>` (+ optional `--tag` / `--asset`)**, fetches a **`.spt` archive** asset (a tar whose root holds `manifest.toml` + `strings/` + the pointed-at binaries) from the repo's GitHub release, extracts it to the durable `adapters/_github/` home, and registers the root — shipping **built binaries, source-free and versioned by tag**. It is the path for a dev **monorepo** whose adapter lives in a subdir, where the root-only `--github` clone does not fit (the release CI packs the archive from the existing repo). Like the installer's first binary fetch, first-acquisition trusts **HTTPS + GitHub**; signed verification stays with the `file_pull` *update* avenue. All three sources (local path, `--github`, `--release`) conduct the manifest's own `[update]` avenue once — install is the first update — so **acquisition is distinct from, and does not alter, the automatic ripple-update route**; an eager-extract acquisition (`--release` / `gh_release`) reports **`ADAPTER_INSTALLED`** (the files are already extracted + registered; the `[update]` avenue merely conducts on the update engine, not at add time), distinct from a `file_pull`-no-payload-yet add which is genuinely **`ADAPTER_INSTALL_PENDING`** (the payload arrives later over the update engine) — the two are no longer conflated under one "deferred" label. <!-- [doc->REQ-INSTALL-9] --> the release-archive fetch is the natural transport the deferred `file_pull` update would later reuse (with signature verification added). Harness-bootstrapped install (path a) calls it from the plugin's bootstrap; standalone install (path b) calls it for shell-only / Pi nodes. Registration is **node-local** — it means *"this node can drive/launch this adapter,"* distinct from advertising an endpoint into a subnet. The **registered-adapter set** the self-updater ripple-updates (see Self-update) is exactly this record set. **`adapter add` is non-destructive:** re-adding an already-registered adapter (any source landing at the same `_github/<safe>` home) is **refused** (`ADAPTER_ADD_ALREADY_REGISTERED` → use `adapter update` to refresh in place, or `adapter remove` then re-add to replace) rather than clobbering the live install; and when it does (re)populate a home it **stages-then-swaps** (fetch/clone to a sibling staging dir, swap into place only on success), so a failed fetch never strands the prior manifest+binaries as a dangling pointer — the same never-strand discipline `adapter update` already uses. <!-- [doc->REQ-INSTALL-13] -->.

**Removal (`spt adapter remove`) is soft-deregister:** the adapter is **hidden from new-creation / picker lists immediately**, but existing and live instances **keep running** under it (lazy — nothing is force-torn-down). An optional manifest **`uninstall` command template** (the inverse of the install/`[update]` avenue — e.g. `claude plugin uninstall spt`) cleans the adapter's own artifacts; it runs **once no instance is live** under the adapter (or immediately with `--force`, which cascades teardown). Once actually uninstalled, an endpoint whose synced **adapter history** still references it renders as **"needs install"** in the resume picker and offers `adapter add` (the `--github` path closes the loop).

## Project infrastructure

**spt-releases** (publish-target repo, `SaberMage/spt-releases`):
The public face of spt-core. The source repo (`SaberMage/spt-core`) stays **private long-term**; everything dev-facing publishes here: the README, GitHub Releases (binaries + dev assets), and the rendered dev docs (GitHub Pages at `https://sabermage.github.io/spt-releases`). Pure publish target — doc *truth* and CI doc-generation stay in the source repo; a publish pipeline pushes rendered artifacts. Everything in it is authored public-safe from day one (created private, flips public later). **The Pages URL is the permanent canonical URL** baked into docs, `llms.txt`, the install one-liner, and adapter bootstrap code — no custom domain on Pages (lapse-proof; ADR-0014). `spt-core.decidel.com` may exist only as a human-facing redirect. **Licensing splits by artifact:** install scripts / `mock-adapter` / docs-site content = MIT (devs copy these by design); the `spt` binary = short proprietary freeware license (free to run + redistribute unmodified, no warranty) with an explicit clause that **building adapters/shells against the public contract is unrestricted and royalty-free**.
_Avoid_: treating it as a second source repo; authoring doc truth there.

**issue / feature tracking**:
GitHub for v1 (mature API + `gh` CLI already wired; de-risks any agent-files-issues automation). This is *project infrastructure*, not a runtime user dependency — it does not compromise the product's "no central operator" promise the way the removed runtime gh-sync did. **tangled.org** (git on AT Protocol, self-hostable knots) is documented as the principled migration target, aligned with spt-core's decentralized ethos — especially if agents ever file issues over spt's own P2P layer into a self-hosted tracker. v2+ story.

### Frontend (day-one)

The day-one headed frontend is a launcher/manager UX, not just a raw `attach`. The end user cares about *reaching a target agent*, not about PTY mechanics. The frontend must:

- **List** all running and historic endpoints.
- **Launch a historic endpoint** that isn't currently running, reusing the same manifest that originally launched it.
- **Tap into** (attach to) a running endpoint.
- **Init a new endpoint** using a known adapter (e.g. claude-code / spt-plugin).

This is the realization of the "guided resume" (resume by project or agent, XMB-style filtered selection) and "management GUI" (per-endpoint panes: latest session output, live/project context, psyche log) sketches. `attach` is one operation under this frontend, not the whole frontend.

**guided-resume picker (`spt resume`, no-arg)**:
The CLI sibling of the frontend's guided resume — one command that lists endpoints **grouped by locality, most-recently-used within each group**: `on-node / current-project → on-node / other-project → off-node`, mirroring the *resolution policy*'s local-first preference. Selection **chains conditionally**: a **running** instance → attach/tap-in (no adapter step — already live under one); a **non-running** endpoint → into the **adapter selector** (*adapter selection*: history head = default → prior adapters → "choose a different adapter") → *home subnet* / other creation prompts as needed → launch; a **"+ new endpoint"** entry → the full creation flow. Off-node picks respect the reach + consent gates (remote-drive of your own running instance is ungated; a cold off-node launch is *instantiate-anywhere*, gated/deferred). It unifies the no-id `spawn-session` picker, the resume adapter-history UX, and locality resolution into one pipeline; the day-one frontend renders the same selection logic graphically.
