# v0.12.1 — Lifecycle + Picker JIT Plan

**Why:** v0.12.0 shipped (counter 25) but real-harness testing (operator, `claude-spt`, Windows) showed the
lifecycle fixes do **not** deliver in the real daemon + broker + PTY path. v0.12.0 was gated on **mock adapters +
in-process `reconcile_once`** — green tests, broken reality. v0.12.1 reopens with **real-harness gating** (no mocks).

**Branch:** `v0.12.1-lifecycle` off `main@5f9ea23`. **Release:** v0.12.1 PATCH (deployah, after doyle gate-passes all waves).
**Roles:** doyle designs/mints/gates · todlando executes · deployah releases.

**Meta-fix (binding for this sprint):** every lifecycle gate runs against the **real dummy-harness fixture + a real
detached daemon** — NOT `reconcile_once` in-proc, NOT a mock with a live-draining peer. doyle independently re-runs each.

---

## Wave 1 — Test infrastructure (unblocks all real-harness gates)

- **T1 — dummy-harness adapter fixture** (operator's idea). A `kind="harness"` adapter whose command is a trivial program
  that (a) binds its perch on startup (the harness contract), (b) prints a stdout line on an interval, (c) stays alive
  until killed. Drives the REAL `spt endpoint run` → `launch_harness_brokered_in` → broker PTY → `rc` attach path.
  Isolates "broker wedged / endpoint died" from "a real harness failed to launch." Becomes the permanent regression
  fixture the v0.12.0 mock tests never had. Lives in the test tree; runs against a scratch target dir (livehost-E2E pattern).
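
A minimal sketch of what the T1 fixture program could look like, assuming the tick line carries the `DUMMY_HARNESS_TICK` marker named in the L0 gate. `bind_perch`, `run_ticks`, the `seq=` field, and the interval are hypothetical names and values; the real perch-binding contract is not shown in this plan, so it is stubbed out here.

```rust
use std::io::{self, Write};
use std::thread;
use std::time::Duration;

/// Marker the gates look for in attached output (name taken from the L0 gate).
const TICK_MARKER: &str = "DUMMY_HARNESS_TICK";

/// Format one tick line. Kept separate so tests can pin the wire format.
fn tick_line(seq: u64) -> String {
    format!("{} seq={}", TICK_MARKER, seq)
}

/// Placeholder for step (a), binding the perch on startup.
/// ASSUMPTION: the real harness contract is not shown here, so this is a no-op.
fn bind_perch() -> io::Result<()> {
    Ok(())
}

/// Steps (b) and (c): emit a tick line per interval, flushing each one so a
/// PTY reader sees it immediately. `max_ticks = None` runs until killed.
fn run_ticks<W: Write>(
    out: &mut W,
    interval: Duration,
    max_ticks: Option<u64>,
) -> io::Result<u64> {
    let mut seq = 0u64;
    while max_ticks.map_or(true, |max| seq < max) {
        writeln!(out, "{}", tick_line(seq))?;
        out.flush()?;
        seq += 1;
        thread::sleep(interval);
    }
    Ok(seq)
}

fn main() -> io::Result<()> {
    bind_perch()?;
    // Bounded so this sketch terminates; the fixture itself would pass `None`.
    let stdout = io::stdout();
    run_ticks(&mut stdout.lock(), Duration::from_millis(10), Some(3))?;
    Ok(())
}
```

Flushing after every line matters for the gates: a tick stuck in a process-side buffer would look the same as the 0-byte attach symptom.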

## Wave 2 — Lifecycle core

- **L0 — REQ-HAZARD-ENDPOINT-RUN-ATTACH-OUTPUT (KEYSTONE — do FIRST).** Confirmed v0.12.1 Wave 1 via the real
  dummy-harness fixture: a clean `spt rc` attach to a LIVE, heartbeating, psyche-hosted `endpoint run` harness receives
  **0 bytes** over 10s of its flushed `[session.self]` stdout — no death, no wedge. This IS the operator's central
  "attach shows no output," and it blocks the whole "view is independent" goal (re-attach shows nothing). Known-good
  (attach.rs loopback-attach E2Es) proves the broker drains+fans a `spawn_session` PTY child over the same transport →
  the gap is endpoint-run-specific (both paths share `dispatch_spawn`, broker.rs:706/835). ISOLATE path-vs-program first
  (run the attach.rs known-delivering child as the endpoint-run `[session.self]`); then root-cause the mechanism. Candidates:
  (a) `spawn_session_pid` SpawnReq stdio/env/cwd diff; (b) harness stdout write-blocks on a full ConPTY buffer (drain not
  reading THIS pty) → alive-but-0-bytes; (c) ConPTY reader-park (KH 7.6); (d) `rc` subscribe/`resolve_session` for an
  endpoint-run session reads the wrong/empty log. GATE (dummy harness): rc attach to a LIVE endpoint-run harness RECEIVES
  its DUMMY_HARNESS_TICK within a bounded window.
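
The "within a bounded window" part of the L0 gate could be expressed with a helper along these lines. This is a sketch only: it assumes the test already has attach output arriving as byte chunks on a channel, and `saw_marker_within` is a hypothetical name, not an existing test utility.

```rust
use std::sync::mpsc::{self, Receiver, RecvTimeoutError};
use std::thread;
use std::time::{Duration, Instant};

/// Returns true if `marker` shows up in the received bytes before `window`
/// elapses. Chunks are accumulated so a marker split across two reads is
/// still found. Never waits longer than `window`.
fn saw_marker_within(rx: &Receiver<Vec<u8>>, marker: &[u8], window: Duration) -> bool {
    if marker.is_empty() {
        return true;
    }
    let deadline = Instant::now() + window;
    let mut seen: Vec<u8> = Vec::new();
    loop {
        let remaining = deadline.saturating_duration_since(Instant::now());
        if remaining.is_zero() {
            return false;
        }
        match rx.recv_timeout(remaining) {
            Ok(chunk) => {
                seen.extend_from_slice(&chunk);
                if seen.windows(marker.len()).any(|w| w == marker) {
                    return true;
                }
            }
            // No data in time, or the producer went away: the gate fails.
            Err(RecvTimeoutError::Timeout) | Err(RecvTimeoutError::Disconnected) => {
                return false;
            }
        }
    }
}

fn main() {
    let (tx, rx) = mpsc::channel::<Vec<u8>>();
    // Stand-in for the attach pump: delivers the marker split over two chunks.
    let producer = thread::spawn(move || {
        tx.send(b"DUMMY_HARNESS_".to_vec()).unwrap();
        thread::sleep(Duration::from_millis(20));
        tx.send(b"TICK seq=0\n".to_vec()).unwrap();
    });
    let ok = saw_marker_within(&rx, b"DUMMY_HARNESS_TICK", Duration::from_secs(1));
    producer.join().unwrap();
    println!("marker seen: {ok}");
}
```

A hard deadline keeps the failing case cheap: a 0-byte attach returns `false` after the window instead of hanging the gate.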

- **L1 — REQ-HAZARD-VIEWER-CLOSE-DETACH (PRIMARY).** Closing the tab/window where `spt endpoint run` was invoked must
  detach only the `spt rc` pump; the daemon-hosted harness keeps running and is re-attachable.
  ROOT: the daemon never breaks away from the launching terminal's Windows Job Object (`KILL_ON_JOB_CLOSE`); no
  `CREATE_BREAKAWAY_FROM_JOB` anywhere → tab close reaps the daemon's freshly-spawned ConPTY harness subtree. ConPTY
  isolation itself is already correct (portable-pty makes the pseudoconsole in the daemon; no console signal/handle leak).
  FIX: add `CREATE_BREAKAWAY_FROM_JOB` to both daemon spawn paths (`daemon.rs:707` `detached_no_inherit`,
  `deelevate.rs:519` elevated) **+** pin each broker-spawned harness into a **daemon-owned Job Object** (mirror Breap,
  `reap.rs`) as a backstop (survives even if the launching terminal's job leaves `JOB_OBJECT_LIMIT_SILENT_BREAKAWAY_OK` unset). `pty.rs` unchanged. Unix: verify
  the daemon's session-detach already covers terminal-close (SIGHUP scope) — likely no change, add a guard test.
  GATE: spawn daemon under a parent-held `KILL_ON_JOB_CLOSE` job → `endpoint run` a dummy harness → close the parent job →
  assert the harness pid **stays alive** AND `spt rc <id>` re-attaches AND a brand-new endpoint launches.
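
The flag half of the L1 fix might look like the sketch below. The constant values are the Win32 process creation flags; the base flag set and the names `daemon_creation_flags` and `spawn_daemon` are assumptions, since the actual contents of `detached_no_inherit` are not shown in this plan.

```rust
/// Win32 process creation flags (values as defined in the Windows SDK).
const DETACHED_PROCESS: u32 = 0x0000_0008;
const CREATE_NEW_PROCESS_GROUP: u32 = 0x0000_0200;
const CREATE_BREAKAWAY_FROM_JOB: u32 = 0x0100_0000;

/// Flags for spawning the daemon.
/// ASSUMPTION: the base set below is illustrative, not the real
/// `detached_no_inherit` flags.
fn daemon_creation_flags(breakaway: bool) -> u32 {
    let base = DETACHED_PROCESS | CREATE_NEW_PROCESS_GROUP;
    if breakaway {
        base | CREATE_BREAKAWAY_FROM_JOB
    } else {
        base
    }
}

/// Hypothetical spawn helper: applies the flags through the std
/// `CommandExt::creation_flags` extension, which exists only on Windows.
#[cfg(windows)]
#[allow(dead_code)]
fn spawn_daemon(exe: &str) -> std::io::Result<std::process::Child> {
    use std::os::windows::process::CommandExt;
    use std::process::Command;
    Command::new(exe)
        .creation_flags(daemon_creation_flags(true))
        .spawn()
}

fn main() {
    println!("daemon flags: {:#010x}", daemon_creation_flags(true));
}
```

`CREATE_BREAKAWAY_FROM_JOB` only takes effect when the enclosing job permits breakaway (`JOB_OBJECT_LIMIT_BREAKAWAY_OK`), which is one reason the gate spawns the daemon under a parent-held job instead of trusting the flag alone.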

- **L2 — REQ-HAZARD-ATTACH-WEDGE (ROBUSTNESS).** Even a *legitimately* dead PTY child (real crash/kill) + an undrained
  operator pump must NOT wedge the broker.
  ROOT: loopback attach output is a blocking `write_all` into a bounded 64 KB tokio duplex (`nethost.rs:1040,1090`); a
  dead operator (closed tab) stops draining → `write_all` blocks forever (the "loopback never hangs" assumption at
  `nethost.rs:1103` is false) → parks workers in the **2-worker** net runtime (`nethost.rs:640`) → both saturate → every
  new attach/`endpoint run` stalls after `PUMP_IPC_READER: spawned` → 30 s `FIRST_EVENT_GRACE` → "dead or wedged";
  `daemon stop` can't join the stuck workers. Distinct from the removed B1 path-(c) mutex deadlock.
  FIX: make loopback sends fail-fast — a full-buffer / `BrokenPipe` loopback write is an ordinary per-stream error that
  ENDS `serve_attach`; one dead stream can never hold a runtime worker. (Defense-in-depth: raise worker count — but the
  real fix is non-blocking-on-dead-peer.)
  GATE (dummy harness): kill the child abruptly + drop the operator pump without a clean detach → assert a **new**
  endpoint is still served, `brain.sessions()` returns promptly, `daemon stop` completes bounded.
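
The fail-fast rule in the L2 fix can be illustrated with a std-only analog. This is not the tokio duplex code in `nethost.rs`: a bounded `sync_channel` stands in for the 64 KB loopback buffer, and `forward_fail_fast` and `StreamEnd` are hypothetical names. What it shows is that a full buffer or a vanished peer ends that one stream instead of parking the worker.

```rust
use std::sync::mpsc::{sync_channel, SyncSender, TrySendError};

/// Why a single attach stream was ended.
#[derive(Debug, PartialEq)]
enum StreamEnd {
    /// Buffer full: the peer is not draining.
    PeerStalled,
    /// Receiver dropped: the peer is gone (the BrokenPipe case).
    PeerGone,
}

/// Forward chunks without ever blocking. Returns how many chunks were queued,
/// or the reason this stream must end.
fn forward_fail_fast(tx: &SyncSender<Vec<u8>>, chunks: &[&[u8]]) -> Result<usize, StreamEnd> {
    let mut sent = 0usize;
    for chunk in chunks {
        match tx.try_send(chunk.to_vec()) {
            Ok(()) => sent += 1,
            Err(TrySendError::Full(_)) => return Err(StreamEnd::PeerStalled),
            Err(TrySendError::Disconnected(_)) => return Err(StreamEnd::PeerGone),
        }
    }
    Ok(sent)
}

fn main() {
    // Capacity 2, and the receiver never drains: the third chunk fails fast.
    let (tx, _rx) = sync_channel::<Vec<u8>>(2);
    let chunks: [&[u8]; 3] = [b"a", b"b", b"c"];
    println!("{:?}", forward_fail_fast(&tx, &chunks));
}
```

In the async code the equivalent could be a write wrapped in a deadline such as `tokio::time::timeout`; that choice is an assumption here, not something this plan specifies.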

- **L3 — status=online persistence (sub-item, folded into L2's gate).** Three dead endpoints stayed `status=online`.
  B2 (`reconcile_hosted_liveness`) should offline a controllable spt-hosted perch whose broker session is gone — confirm
  whether abrupt Windows child death actually reaps the broker session (so B2 sees it absent) and whether the reconcile
  tick fires. Resolve as part of L2; GATE asserts the dead endpoint is marked **offline within one reconcile tick**.
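
The behavior the L3 gate expects from one reconcile tick can be sketched as a pure function. The types and field names below are hypothetical and do not reflect the real `reconcile_hosted_liveness` signature; the sketch only encodes the rule "hosted, online, broker session gone → offline".

```rust
use std::collections::HashSet;

#[derive(Debug, Clone, Copy, PartialEq)]
enum Status {
    Online,
    Offline,
}

/// Hypothetical roster entry.
#[derive(Debug, Clone)]
struct Perch {
    id: String,
    /// True for a controllable spt-hosted perch.
    hosted: bool,
    status: Status,
    /// Broker session backing this perch, if any.
    session: Option<u64>,
}

/// One reconcile tick: offline every hosted, online perch whose broker
/// session is no longer live. Returns the ids that were flipped.
fn reconcile_hosted(perches: &mut [Perch], live_sessions: &HashSet<u64>) -> Vec<String> {
    let mut offlined = Vec::new();
    for perch in perches.iter_mut() {
        if !perch.hosted || perch.status != Status::Online {
            continue;
        }
        let alive = perch.session.map_or(false, |s| live_sessions.contains(&s));
        if !alive {
            perch.status = Status::Offline;
            offlined.push(perch.id.clone());
        }
    }
    offlined
}

fn main() {
    let mut roster = vec![
        Perch { id: "a".into(), hosted: true, status: Status::Online, session: Some(1) },
        Perch { id: "b".into(), hosted: true, status: Status::Online, session: Some(2) },
    ];
    let live: HashSet<u64> = [1].into_iter().collect();
    println!("offlined: {:?}", reconcile_hosted(&mut roster, &live));
}
```

The rule depends on the broker session actually disappearing after an abrupt child death, which is the open question this item has to confirm first.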

- **L4 — `daemon stop` ends everything (operator item #1, folded).** `daemon stop` did not end all spt processes / take
  everything offline. Largely downstream of L1+L2 (un-wedged broker can stop cleanly; harness in a daemon-owned job is
  reaped by Breap). GATE: after L1+L2, `daemon stop` terminates the daemon + brain + all hosted harness/psyche processes,
  bounded, and the roster goes offline.

## Wave 3 — Picker (operator-raised; ROOTS TBD — investigate before fixing)

- **P1 — REQ-PICKER-HISTORY-FRESH.** `spt endpoint run` picker does **not** show project history for fresh endpoints.
  Investigate the project-history loader (v0.10.0 PICKER-2, `picker/data.rs`) — real bug vs "fresh = no history yet"
  semantics. Then fix.
- **P2 — REQ-PICKER-ONLINE-ACTION.** Picker shows **"Start now"** for endpoints that are already online. Investigate the
  4-state status mapping (v0.10.0 PICKER-1, `picker/model.rs`) — is it reading live/online state correctly, or rendering
  stale/wedged broker state? Then fix (online → "Attach", not "Start now").
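
Whatever the investigation finds, the P2 target behavior can be pinned as a small regression check. The sketch below is illustrative only: the plan does not list the four states in `picker/model.rs`, so the variant names are placeholders, and only the "Attach" and "Start now" labels come from the text above.

```rust
/// Placeholder for the picker's 4-state status.
/// ASSUMPTION: the real variants are not listed in this plan.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum EndpointState {
    Online,
    Offline,
    Starting,
    Unknown,
}

/// Primary picker action for a state. States the plan says nothing about
/// return `None` in this sketch.
fn primary_action(state: EndpointState) -> Option<&'static str> {
    match state {
        EndpointState::Online => Some("Attach"),
        EndpointState::Offline => Some("Start now"),
        EndpointState::Starting | EndpointState::Unknown => None,
    }
}

fn main() {
    for state in [EndpointState::Online, EndpointState::Offline] {
        println!("{:?} -> {:?}", state, primary_action(state));
    }
}
```

If the mapping already behaves this way, the bug is more likely in where the state is read from (stale or wedged broker data) than in the mapping itself.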

## Wave 4 — endpoint list cleanup (operator decision 2026-06-17, dropped — now minted)

- **E1 — REQ-ENDPOINT-LIST-MERGE-LOCAL.** Remove the `--local` flag; the bare `spt endpoint list` ALWAYS merges this
  node's local (unadvertised) perches into the view. Rationale: `spt whoami` is a thin alias — a just-online agent
  running `whoami` must see its own perch. Drop the flag + its `--detail` conflict test + the v0.10.0 hint line +
  `cmd_list_local`; fix the `whoami` alias path. Run `cargo run -p xtask -- gen` (docs-drift, DEFAULT target).
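
The always-merge behavior in E1 could be sketched as follows. `Row` and `merge_local` are hypothetical names, and the "advertised entry wins on a duplicate id" rule is an assumption, not something this plan specifies.

```rust
use std::collections::BTreeMap;

/// Hypothetical list row.
#[derive(Debug, Clone, PartialEq)]
struct Row {
    id: String,
    advertised: bool,
}

/// Merge this node's local (unadvertised) perches into the advertised view.
/// ASSUMPTION: on a duplicate id the advertised row wins. Output is sorted by id.
fn merge_local(advertised: Vec<Row>, local: Vec<Row>) -> Vec<Row> {
    let mut by_id: BTreeMap<String, Row> = BTreeMap::new();
    for row in local {
        by_id.insert(row.id.clone(), row);
    }
    // Inserted second so advertised rows overwrite local duplicates.
    for row in advertised {
        by_id.insert(row.id.clone(), row);
    }
    by_id.into_values().collect()
}

fn main() {
    let advertised = vec![Row { id: "a".into(), advertised: true }];
    let local = vec![Row { id: "me".into(), advertised: false }];
    // A just-online agent sees its own unadvertised perch in the bare list.
    println!("{:?}", merge_local(advertised, local));
}
```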

## Release

- **v0.12.1 PATCH.** deployah drives after doyle gate-pass on all waves. CHANGELOG: cross-check **each bullet's scope vs
  the commit range** (the v0.12.0 lesson — [[changelog-scope-vs-commit-range]]); name the specific delta. Post-publish:
  inform operator + perri (real-harness lifecycle now actually works).

## Open / carried

- Non-blocking test-hardening: drain OUTPUT after `KIND_EXIT` in `broker::spawn_env_reaches_child` (latent flake) — fold
  into Wave 1/2 test work if cheap, else carry.
- Machine cleanup: clear the operator's 3 phantom endpoints (wall-a/b/c) + orphan psyche — scoped, preserve daemon +
  the `doyle` live perch + CI.
