# v0.13.2 W3 — Live daemon-coordinated adapter update: DESIGN-GATE brief

**This is a DESIGN-GATE, not a build order.** W3 is both the keystone and the riskiest wave: it touches the daemon lifecycle just hardened in v0.12.x/v0.13.0, and it absorbs a live field bug (perri F-010×F-015). Per the *gate-against-documented-design* discipline: **return a concrete, code-grounded DESIGN PROPOSAL first; do NOT write impl yet.** doyle confirms the design, then you build.

Authoritative design: **ADR-0025** (`docs/adr/0025-live-daemon-coordinated-adapter-update.md`) + the **W3** section of `V0.13.2-ADAPTER-PACKAGING-JIT.md`. Read both. This brief refines them with a field-bug-driven addendum ADR-0025 does NOT yet cover.

## The field bug that reshapes W3 (perri F-010×F-015, 2026-06-22)

A claude-spt update bricked on Windows. `claude-spt-psyche.exe` was running **detached** (manifest `[session.psyche_init] detach=true`, `Stdio::null`, unsupervised) **straight from the shared install dir**. Its parent brain was **DEAD** (the orphan lingered ~4h), so the orphan kept holding the binary lock and `spt adapter update` / `add --release` failed with `extract: tar exit Some(1)`.

**Why this breaks ADR-0025's premise:** ADR-0025 splits adapter binaries into **resident** (cycled on update) vs **ephemeral** (excluded, "self-heal on next spawn"), and classifies the Psyche as ephemeral *because it assumed the Psyche is daemon-hosted in-process (ADR-0004)*. The field proves claude-spt's Psyche is a **separate install-dir binary** — an unsupervised locker that W3's resident-only stop/restart would NOT release. The keystone ships with a hole unless W3 absorbs this.

**Existing lifecycle facts (grounded — confirm in your proposal):**
- `livehost.rs` `stop_host` ALREADY reaps the detached Psyche **by owned handle** (REQ-HAZARD-UNHOST-PSYCHE-REAP, test `stop_host_reaps_the_detached_psyche_process`). It works only on a **graceful** stop with the **owning brain alive**.
- The v0.12.0 **ID-specific scoped-reap** (pid alive AND exe basename == psyche program AND cmdline contains `<id>-psyche`, fail-safe-decline on any unreadable signal) fires only at **brain START**, before the first reconcile.
- perri's orphan slipped past BOTH: the owning brain died abruptly (handle lost → `stop_host` never ran), and nothing reaps a brain-less perch's orphan at **endpoint-stop** time.

## The W3 design, three-part psyche-lock fix (doyle's direction — propose the concrete shape)

**(c) PRIMARY — psyche-from-own-copy.** A daemon-spawned **long-lived install-dir binary that runs detached/unsupervised** (today: the Psyche) should execute from a **per-endpoint private copy**, never the shared install-dir binary, so it can never lock the update target — regardless of reap timing. Propose: where the private copy lives (per-endpoint dir), when it's made (at host/spawn), how an update refreshes it (the next spawn copies the new bytes — consistent with "ephemeral self-heals on next spawn"), and **prove `<install_dir>/<program>` resolution (REQ-INSTALL-11) still holds** (the spawn resolves the private-copy path, not the install-dir path). State explicitly whether the **resident translation binary** also gets own-copy treatment or stays in-install-dir (it's resident → W3 stops it → lock releases → it can stay; your call, justify it).
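A minimal sketch of the own-copy idea, to make the questions above concrete. Everything here is an assumption for illustration: `stage_private_copy` is a hypothetical helper, not an existing spt function, and the `<endpoint_dir>/bin/<program>` layout is one possible answer to "where the private copy lives", not a decided one.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical helper (not an existing spt function): copy
/// `<install_dir>/<program>` into a per-endpoint private directory and
/// return the path a detached child would be spawned from.
fn stage_private_copy(
    install_dir: &Path,
    program: &str,
    endpoint_dir: &Path,
) -> io::Result<PathBuf> {
    // Resolution still starts from `<install_dir>/<program>`; only the
    // path handed to the spawn changes.
    let source = install_dir.join(program);

    // Assumed layout: `<endpoint_dir>/bin/<program>`.
    let private_dir = endpoint_dir.join("bin");
    fs::create_dir_all(&private_dir)?;
    let target = private_dir.join(program);

    // Copy under a temporary name, then rename, so a half-written file is
    // never the spawn target. On Windows the rename fails if a previous
    // private copy is still running; the caller would have to reap that
    // process first or decline the spawn.
    let staging = private_dir.join(format!("{program}.staging"));
    fs::copy(&source, &staging)?;
    fs::rename(&staging, &target)?;
    Ok(target)
}
```

Calling this at every spawn is what would make "the next spawn copies the new bytes" hold: after an update rewrites the install-dir binary, the next call overwrites the private copy with the new content. The proposal still has to say how a still-running old private copy is handled, since that is exactly where (a) comes in.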

**(a) SUPPORTING — stop-path / brain-death orphan reap.** Extend the v0.12.0 ID-specific scoped-reap to the **endpoint-stop path and brain-death reconcile**, so a brain-less perch's orphan Psyche is reaped (the handle-reap can't, the brain is gone). Keep the fail-safe-decline invariant (a missed dup is bounded; a wrong-kill is catastrophic — never kill on an unreadable signal). This fixes the orphan *leak* independent of updates.
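The fail-safe-decline invariant can be sketched as a pure decision function, separate from whatever reads the OS signals. The types and names below (`ProcessSignals`, `ReapDecision`, `decide_scoped_reap`) are hypothetical and do not mirror the existing v0.12.0 code; the three checks are the ones this brief lists for the ID-specific scoped-reap.

```rust
/// What the reaper could read about a candidate pid. `None` means the
/// signal was unreadable (permission denied, process raced away, ...).
struct ProcessSignals {
    alive: Option<bool>,
    exe_basename: Option<String>,
    cmdline: Option<String>,
}

#[derive(Debug, PartialEq)]
enum ReapDecision {
    Reap,
    Decline(&'static str),
}

/// Hypothetical decision function: reap only when every signal is readable
/// AND every check passes. Any unreadable signal declines.
fn decide_scoped_reap(
    signals: &ProcessSignals,
    psyche_program: &str,
    endpoint_id: &str,
) -> ReapDecision {
    let (Some(alive), Some(exe), Some(cmdline)) = (
        signals.alive,
        signals.exe_basename.as_deref(),
        signals.cmdline.as_deref(),
    ) else {
        // A missed duplicate is bounded; a wrong kill is not.
        return ReapDecision::Decline("unreadable signal");
    };
    if !alive {
        return ReapDecision::Decline("pid not alive");
    }
    if exe != psyche_program {
        return ReapDecision::Decline("exe basename mismatch");
    }
    let marker = format!("{endpoint_id}-psyche");
    if !cmdline.contains(marker.as_str()) {
        return ReapDecision::Decline("cmdline lacks endpoint marker");
    }
    ReapDecision::Reap
}
```

Keeping the decision pure means the endpoint-stop path and the brain-death reconcile could share it with the existing brain-start call site, and each decline reason can be unit-tested without spawning a process.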

**(b) BACKSTOP — extract-to-temp + atomic per-file swap.** One busy/locked file must not fail the WHOLE update. Compose with W3b's CRC swap. NOTE (b) alone can't replace a *running* locked `.exe` on Windows — that's why (c)/(a) are primary, (b) is the backstop.
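One possible shape for the per-file swap in (b), assuming the archive has already been extracted into a staging directory on the same volume as the install dir (so `fs::rename` is a move, not a copy). `SwapReport` and `swap_staged_files` are hypothetical names; the point of the sketch is only that a per-file failure is recorded and the loop continues.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Outcome of a per-file swap: which targets were replaced and which were
/// left untouched, with the error that blocked them.
struct SwapReport {
    swapped: Vec<PathBuf>,
    deferred: Vec<(PathBuf, io::Error)>,
}

/// Hypothetical per-file swap: move each staged file over its installed
/// counterpart, recording failures instead of aborting on the first one.
fn swap_staged_files(staging_dir: &Path, install_dir: &Path, files: &[&str]) -> SwapReport {
    let mut report = SwapReport { swapped: Vec::new(), deferred: Vec::new() };
    for rel in files {
        let from = staging_dir.join(rel);
        let to = install_dir.join(rel);
        // Create the target's parent first so nested files can land.
        let result = match to.parent() {
            Some(parent) => fs::create_dir_all(parent).and_then(|_| fs::rename(&from, &to)),
            None => fs::rename(&from, &to),
        };
        match result {
            Ok(()) => report.swapped.push(to),
            // For example a running `.exe` on Windows: keep the old file in
            // place and let the caller decide what to do with the rest.
            Err(err) => report.deferred.push((to, err)),
        }
    }
    report
}
```

What the caller does with a non-empty `deferred` list (retry after a reap, report a partial update, roll back) is part of the recovery the proposal has to specify; this sketch deliberately leaves it open.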

## W3 core (ADR-0025) — propose the concrete shape for each

- **W3a resident-children registry.** Where does it live? (The **broker** is the daemon-state anchor across brain restarts; see the broker-is-daemon-state-anchor lesson, and justify any other placement.) Shape: a per-endpoint record of the live adapter-owned children to cycle (the `TranslationChild` at `translation.rs:240`, plus whatever (c) decides about the Psyche). How does `TranslationChild::spawn` feed it, and how is it torn down?
- **W3b CRC-gated swap.** Use the **existing `sha2` `Sha256`** (already a workspace dep in spt-daemon + spt — `Cargo.toml:42`/`:101`) for content compare; **no new crc crate**. Replace on disk only files whose hash differs from the staged archive; identical files + their still-running binaries untouched.
- **W3c manifest refresh.** A `BrainLifecycle` method (`lifecycle.rs:61` struct; `manifest: Manifest` field at `:70`, `runtime: ManifestRuntime` at `:62`, built in `with_config` `:93`/`:120-129`) that re-reads the new on-disk manifest and rebuilds BOTH `manifest` and `runtime` (`ManifestRuntime::with_install_dir`). Propose signature + call site.
- **W3d daemon-apply IPC.** CLI keeps fetch+verify (no new daemon HTTP). For an adapter with running resident children, CLI hands **apply** to the daemon over a new IPC command (where on the api/broker IPC surface?). Daemon apply **per affected endpoint**: stop resident children (translation; Psyche per (c)) → W3b swap → W3c refresh → restart resident (Psyche re-spawns from its new own-copy). Endpoint **not** running → CLI swaps directly (no lock, no cache, no round-trip). Route in `cmd_adapter_update` (today the gh_release apply path) + `adapter_update.rs`.
- **Bounded + brain-parity + partial-failure (BINDING).** Stop→restart MUST be bounded (`join_bounded` / kill-bounded; **never ride a 240s backstop**). The endpoint (`BrainLifecycle`/broker) **never restarts** — only the resident child cycles. A mid-swap failure must **not strand** the endpoint (stopped binary + half-swapped files): restart-old / atomic-ish on failure — specify the recovery.
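For the W3b bullet, the content-gated decision can be sketched independently of the hash implementation. The brief specifies the workspace `sha2` `Sha256`; to keep this sketch free of external crates, the digest is passed in as a closure, which in the real code would presumably wrap `Sha256`. `plan_swap` and the in-memory maps are hypothetical, standing in for reads of the install dir and the staged archive.

```rust
use std::collections::BTreeMap;

/// Hypothetical planner: return the staged file names that must replace
/// their installed counterparts. A file is swapped when it is new or when
/// its digest differs; identical files are left alone, so the binaries
/// behind them can keep running.
fn plan_swap<D, H>(
    installed: &BTreeMap<String, Vec<u8>>,
    staged: &BTreeMap<String, Vec<u8>>,
    digest: D,
) -> Vec<String>
where
    D: Fn(&[u8]) -> H,
    H: PartialEq,
{
    staged
        .iter()
        .filter(|(name, bytes)| match installed.get(*name) {
            // Present on disk: swap only if the content hash changed.
            Some(current) => digest(current.as_slice()) != digest(bytes.as_slice()),
            // Not on disk yet: always install.
            None => true,
        })
        .map(|(name, _)| name.clone())
        .collect()
}
```

The planner's output is also a natural input to W3d: if none of the files to swap belongs to a running resident child, the daemon could skip the stop/restart for that endpoint entirely. Whether W3 takes that shortcut is a design question for the proposal, not something this sketch decides.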

## W3e — int keystone, NO mocks (the ship-blocker)

perri's **exact 4-step repro** is the acceptance (real spt-hosted endpoint + dummy-harness + real daemon, NO mocks):

1. Bring up a claude-spt-style live agent wall-X → daemon spawns the detached Psyche.
2. Kill the Psyche's PARENT (brain/livehost) → parent-dead orphan.
3. `spt endpoint stop wall-X` → perch alive=false.
4. `spt adapter update` → (pre-fix) tar overwrites the locked binary → `Access denied (os error 5)` / exit 1 → whole update fails.

**PASS post-W3:** step 3 reaps the orphan AND/OR the Psyche never ran from the shared target AND/OR the locked-file extract no longer fails the whole update. The **Windows locked-binary** case is the regression test. Add a graceful-stop variant + a happy-path live-update variant (endpoint stays live, serves the new binary). Reuse the v0.12.1 dummy-harness fixture. If it flakes on the contended runner, use the `SPT_ATTACH_IPC_DEADLINE_MS`/watchdog env-knob precedent.

## Docs + traceability (in your proposal, name what you'll write)

- **ADR-0025 amendment** for the separate-process-Psyche lock vector + the three-part fix (the resident/ephemeral split needs a third axis: *ephemeral-but-install-dir-locking → run from own copy*).
- **REQ activation:** `REQ-ADAPTER-LIVE-UPDATE` → `["doc","impl","unit","int"]`. Decide whether the stop-path/brain-death orphan reap needs its **own** `REQ-HAZARD-*` (likely yes — a distinct lifecycle hazard from the existing brain-start reap) or folds into an existing one; **propose, doyle confirms the REQ shape before you mint.**

## What to return (the DESIGN-GATE packet)

A concrete proposal: the registry location+shape; the (c) own-copy mechanism + REQ-INSTALL-11 proof; the (a) stop-path reap design vs the existing handle-reap; the W3b/W3c/W3d shapes with file:line call sites; the bounded-stop + partial-failure recovery; the W3e fixture plan; the ADR amendment + REQ/REQ-HAZARD plan. **No impl until doyle confirms.** Flag any place ADR-0025 / the JIT is wrong or underspecified.
