---
name: lifecycle-liveness-milestone
description: "spt-hosted LIFECYCLE & LIVENESS reconciliation milestone (~v0.12.0) — 7-bug cluster operator-surfaced 2026-06-17, diagnosed by doyle (2 agents), dispatched todlando. doyle drives to release (operator mandate)."
metadata: 
  node_type: memory
  type: project
  originSessionId: 16531b26-c5ac-4120-bfb8-113daef39d8f
---

**OPERATOR MANDATE 2026-06-17:** diagnose thoroughly → rope in todlando → see it through to release publish. doyle owns the whole arc (diagnose+gate+drive deployah). Branch off main@v0.11.0, mint registry-first, keystone-first waves, ~v0.12.0 MINOR. Operator is BLOCKED: `spt endpoint run` unusable (attach hangs, daemon stop doesn't stop).

**UNIFYING ROOT:** `info.json status=online` is a ONE-WAY LATCH — set at establish (startup.rs:361/468 set_status STATUS_ONLINE), NEVER reconciled vs real liveness (liveness.rs:80-93 is_perch_alive = status==ONLINE for daemon-hosted). Plus daemon-stop/brain-restart don't tear-down/rehydrate. Every symptom falls out.
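Minimal sketch of the latch, for illustration only. `PerchInfo`, `alive_by_latch` and `alive_reconciled` are hypothetical names, not the real `liveness.rs` API. It contrasts a liveness check that trusts the stored status with one that also consults whether the broker still holds the session.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Status {
    Online,
    Offline,
}

/// Hypothetical stand-in for the status field persisted in info.json.
struct PerchInfo {
    status: Status,
}

/// Latch-only check: trusts whatever was written at establish time.
fn alive_by_latch(info: &PerchInfo) -> bool {
    info.status == Status::Online
}

/// Reconciled check: the latch must also be backed by a real session.
/// `broker_has_session` is an assumed input (e.g. the result of a sessions query).
fn alive_reconciled(info: &PerchInfo, broker_has_session: bool) -> bool {
    info.status == Status::Online && broker_has_session
}
```

With status latched Online and the session gone, `alive_by_latch` still returns true while `alive_reconciled` returns false; that gap is the phantom-online state behind the symptoms.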

**7-bug cluster (proposed REQ ids; todlando mints registry-first):**
- ⭐KEYSTONE REQ-HAZARD-HOSTED-LIVENESS-RECONCILE (B2): broker exit-waiter (broker.rs:889-910) reaps its session table (909)+emits ExitEvent but NEVER clears info.json status; mark_offline (lifecycle.rs:202) only on psyche-teardown+tests. FIX (doyle RULING 2026-06-17, todlando flagged the unattended-case gap): **PULL-reconcile PRIMARY** — the livehost reconcile loop (reconcile_once livehost.rs:226-313) marks status=offline any online-latched perch whose broker session is gone (query KIND_SESSIONS) / child pid dead, past the boot grace window. MIRRORS v0.11.0 REQ-HAZARD-ROSTER-GHOST (advertise_local offline-heal) — same crash-robust self-heal-next-tick shape. PUSH (brain on ExitEvent → mark_offline, broker stays stateless ADR-0004§B) = ADDITIVE FAST-PATH only, IF the brain is reliably subscribed to all hosted sessions; correctness must NOT depend on push (it misses the common UNATTENDED case — no rc, spawn-time controller gone — + the brain-restart gap + dropped frames). B2 + B5 COLLAPSE into ONE reconcile (the livehost loop = single liveness authority: marks-offline-orphans [B2] + gates-psyche-spawn-on-real-live-session-not-latch [B5]). Do FIRST — unblocks B1/H3.
- REQ-HAZARD-RC-ATTACH-FAILFAST (B1): `spt rc` to dead/silent session → PUMP_IPC_READER:spawned + infinite blank (pump polls, stream never streams). rc.rs run_attach 209-231. FIX: gate on is_online post-B2 + fail-fast on no-ack/no-first-output + broker EOFs the attach stream on dead child → rc PumpEnd::BrokerGone (REQ-HAZARD-RC-EOF). PIN exact sub-mechanism w/ repro test FIRST (operator: fresh wall-f attached OK, closed tab, `spt rc wall-f` hung → broker still resolved a session → alive-silent vs dead-not-reaped-on-Windows-tabclose?).
- REQ-ENDPOINT-STOP-OFFLINE (H3): cmd_stop (cli.rs:2994-3010) removes ready + unregisters addr but never sets status offline → a stopped endpoint still reports alive=true. FIX: set_status(OFFLINE). Folds w/ B2.
- REQ-HAZARD-DAEMON-STOP-BARRIER (B3): `daemon stop`→STOPPED then `start`→ALREADY_RUNNING. request_stop (seedmap.rs:240) returns on the KIND_STOPPING ack (174-176) BEFORE seed socket unbinds; is_running ping (daemon.rs:375) wins the exit window. FIX: unbind/stop-gate socket BEFORE ack, OR request_stop waits ping-to-fail.
- REQ-HAZARD-DAEMON-STOP-REAP (Breap): stop leaves ~8 psyches+spt.exe orphaned. Psyches spawned detached (runtime.rs:342-356 Child dropped); livehost stop flag NEVER raised (brainproc.rs:227-230). FIX: raise stop flag + kill children on stop (Windows job object / Unix process-group). Folds w/ B3.
- REQ-HAZARD-LIVEHOST-BOOT-LIVENESS-GATE (B5): `daemon start` spawns a psyche per status=online-latched perch (reconcile_once livehost.rs:285) w/o checking the child is alive → revives phantoms (3 psyches). FIX: gate boot psyche-spawn on real child-liveness + boot residency/pid check (cold start mustn't revive dead-harness perches).
- REQ-HAZARD-BRAIN-RESTART-LIFECYCLE-REHYDRATE (B4, deepest): bare brain-restart (broker survives) → new spawns die/can't-attach until full reset. resume_sessions (brainproc.rs:186) re-subscribes but ALL BrainLifecycle instances LOST (ephemeral brain.rs:254-275) → post-restart endpoints get no livehost. FIX: rebuild BrainLifecycle per resumed live-capable session on brain startup. (Operator: perri's brain kill+restart wedged everything.)
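Sketch of the B2+B5 collapse into one pull-reconcile tick, under stated assumptions. Every name here (`Perch`, `Tick`, `reconcile_tick_sketch`, the `live_sessions` set standing in for a KIND_SESSIONS query result) is hypothetical; this is not the real `reconcile_once`. It only shows the shape: one pass that marks online-latched orphans offline once past the boot grace window, and offers psyche-spawn only for perches with a real live session.

```rust
use std::collections::{BTreeMap, BTreeSet};
use std::time::{Duration, Instant};

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Status {
    Online,
    Offline,
}

/// Hypothetical perch record: latched status plus when it was established.
struct Perch {
    status: Status,
    established_at: Instant,
}

/// Outcome of one tick: which perches were healed, which may get a psyche.
#[derive(Debug, Default, PartialEq, Eq)]
struct Tick {
    marked_offline: Vec<String>,
    spawn_psyche_for: Vec<String>,
}

/// One reconcile pass. `live_sessions` is assumed to be the set of perch
/// names for which the broker currently holds a session.
fn reconcile_tick_sketch(
    perches: &mut BTreeMap<String, Perch>,
    live_sessions: &BTreeSet<String>,
    now: Instant,
    boot_grace: Duration,
) -> Tick {
    let mut tick = Tick::default();
    for (name, perch) in perches.iter_mut() {
        if perch.status != Status::Online {
            continue; // already offline, nothing to reconcile
        }
        if live_sessions.contains(name) {
            // B5: spawn is gated on a real live session, not on the latch.
            tick.spawn_psyche_for.push(name.clone());
            continue;
        }
        // B2: latched online with no session. Heal only once past the grace
        // window, so a perch that is still booting is left alone.
        let age = now.saturating_duration_since(perch.established_at);
        if age >= boot_grace {
            perch.status = Status::Offline;
            tick.marked_offline.push(name.clone());
        }
    }
    tick
}
```

Because the tick recomputes from current state each time, a missed ExitEvent, a brain restart, or an unattended session is healed on the next pass; a push path can only shorten the delay, which is why correctness is kept on the pull side.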

SEQUENCING: B2→(B1+H3)→(B3+Breap)→B5→B4. SEAM BAR HARD: full daemon_lifecycle_real_brain+livehost(bootrace/nonresident)+broker+handoff+attach+brain_swap/resume sweep EVERY wave. doyle gates per-wave → deployah v0.12.0.

**SEPARATE, PAUSED:** the issue-2 idle-message-delivery REDESIGN grill (Q1=A+ accepted: adapter `[message-idle-translation-binary]`, spt-core-managed lifecycle, poll-listener-preferred-when-present, per-harness keystroke choreography). That's a LATER milestone (touches the same rc/broker seam but is the delivery-mechanism redesign, not the lifecycle bugs). Resume the grill after this cluster ships. Also pending from the diagnosis: issue-1 (bare `spt endpoint list` must include local-unadvertised perches, kill --local, whoami aliases it) + issue-3/4 (project-history must not be live-commune-commit-gated, since ready agents need it; store tracked/.seed.git was empty=zero commits, PICKER-2 reader correct but nothing writes). v0.11.0 (issue-5) = NOT a bug (public; cache).
