---
status: investigating
trigger: "Live agent dunsen never had an echo-commune fire within the past hour. Last echo-commune log entry says 'planning kicked off'; planning then moved to plan-verification phase without a new echo-commune firing afterward."
created: 2026-05-13T00:00:00Z
updated: 2026-05-13T00:00:00Z
---

## Current Focus

hypothesis: H1 (top-ranked, see below) — `.more-done` sentinel never created during the 80-min gap because Dunsen's Live perch had no Stop hook fire in that window (parent Claude turn never ended — autonomous post-planning work runs INSIDE one long-running parent turn while sub-agents resolve)
test: confirmed via log: poll iterations 66-69 emit NO `[ECHO]` line at all → `should_fire_decision` returned `SkipNoSentinel` silently → sentinel absent
expecting: cause is at the producer side (hook_idle.rs Stop hook never invoked on Dunsen perch), not the consumer side (wrapper gate is correctly wired)
next_action: surface ranked hypotheses + reproducible verification

## Symptoms

expected: echo-commune should fire periodically (~every 15-20 min) while Live agent is doing work, so Psyche keeps an updated context
actual: 80-minute gap between echo-commune fires across wrapper iterations 65→70 in dunsen.log (last meaningful fire 21:55:13, next forced fire 23:15:37 on usage-limit / handoff event)
errors: none — the wrapper is silently skipping the fire because the sentinel is absent (SkipNoSentinel returns without logging)
reproduction: Live agent invokes a long-running command (e.g. `/gsd-plan-phase X --research`) that triggers a Task-tool subagent. While the subagent runs, the parent Claude turn is NOT in "stopped" state, so Claude Code's Stop hook does NOT fire, so `.more-done` is never created, so the wrapper's 15-min gate never opens.
started: 2026-05-13 ~21:55 PDT (last good fire); resolved only when usage-limit / wrapper handoff at 23:15 forced a fire path

## Eliminated

- hypothesis: `should_fire` being dead-code means the gate is bypassed / broken
  evidence: `should_fire` (echo_fire.rs:95) is a thin boolean adapter over `should_fire_decision`. The compiler dead-code warning is benign — the function is called only by tests (lines 227-363). Production callsite in `fire_echo_commune_if_due` (echo_fire.rs:113) calls `should_fire_decision` directly. The gate is wired correctly.
  timestamp: 2026-05-13

- hypothesis: A recent v1.9.13/v1.9.14 commit broke the echo-commune fire path
  evidence: `git log -- src/owl/hook_idle.rs src/live/wrapper/echo_fire.rs` shows the last touch was 2026-04-18 (commits 0794a6d, ab9fd14, b034404, 801a8e8). v1.9.13/v1.9.14 commits touched INIT_SIGNOFF + EVENT envelope wrapping only — unrelated. The 80-min gap predates v1.9.14 deploy (which happened at 23:15:37, also when the next forced fire occurred).
  timestamp: 2026-05-13

- hypothesis: SubagentStop hook creates `.more-done`
  evidence: grep `\.more-done` across src/ — only writer is `src/owl/hook_idle.rs:97` inside `spawn_echo_commune_if_live()`, which is called ONLY from `hook_idle.rs::run()` (the Stop hook handler). Confirmed in hooks.json: `Stop` → `owl.exe hook-idle`; `SubagentStop` → `owl.exe hook-subagent-stop` which only cleans the working perch — does NOT write a sentinel.
  timestamp: 2026-05-13
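
The first eliminated hypothesis rests on `should_fire` being a thin boolean adapter over `should_fire_decision`. The relationship can be sketched as below. This is a minimal sketch, not the code in echo_fire.rs: the signatures, the `Option<Duration>` encoding of "sentinel absent", and the `SkipTooRecent` variant name are assumptions. Only `Fire`, `SkipNoSentinel`, and the two function names come from these notes.

```rust
use std::time::Duration;

/// Hypothetical reconstruction of the decision type.
/// `Fire` and `SkipNoSentinel` are named in the notes; `SkipTooRecent` is an assumed name.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum FireDecision {
    Fire,
    SkipNoSentinel,
    SkipTooRecent,
}

/// Assumed signature: `None` means the sentinel file does not exist.
fn should_fire_decision(sentinel_age: Option<Duration>, window: Duration) -> FireDecision {
    match sentinel_age {
        None => FireDecision::SkipNoSentinel,
        Some(age) if age >= window => FireDecision::Fire,
        Some(_) => FireDecision::SkipTooRecent,
    }
}

/// Boolean adapter. If only tests call it, the compiler reports it as dead code
/// in non-test builds, while the production path still goes through
/// `should_fire_decision` and is unaffected.
#[allow(dead_code)]
fn should_fire(sentinel_age: Option<Duration>, window: Duration) -> bool {
    should_fire_decision(sentinel_age, window) == FireDecision::Fire
}
```

Under these assumptions, a dead-code warning on the adapter says nothing about whether the gate itself is reachable, because the gate logic lives entirely in the decision function.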

## Evidence

- timestamp: 2026-05-13
  checked: dunsen.log lines 880-927 (poll iterations 66-70 spanning 21:55:18 → 23:15:37)
  found: All four "broken" iterations (66-69) have the same shape: `poll iteration N starting` → `ready file exists: true` → 20-min wait → `poll exited code=0` → `[MSG] from=PULSE_TRIGGER` → `[PSYCHE] resume (exit=0): (empty)`. NO `[ECHO]` line of any kind appears between iter-start and ready-check. The wrapper IS running `fire_echo_commune_if_due()` (an unconditional call at mod.rs:104, before the ready-check log at mod.rs:120), but the function returns silently via `FireDecision::SkipNoSentinel` (echo_fire.rs:115; no log is emitted when the sentinel is absent).
  implication: The `.more-done` sentinel was ABSENT for the entire 80-minute span. The wrapper consumer logic is fine; the producer (Stop hook on dunsen perch) never fired.

- timestamp: 2026-05-13
  checked: hooks.json + hook_idle.rs + hook_subagent_stop.rs + hook_prompt.rs (full source read)
  found: `.more-done` is written at exactly ONE source-code location: `src/owl/hook_idle.rs:97` inside `spawn_echo_commune_if_live`, gated on (a) `OWL_ECHO_COMMUNE` env unset, (b) perch state == `PerchState::Live`, (c) non-empty session_id, (d) `{psyche_id}/ready` exists. Caller is `hook_idle.rs::run()` (the Stop hook). No other hook writes the sentinel: UserPromptSubmit clears `.idle-ready` only; SubagentStop removes working perches; SubagentStart creates working perches; plugin-session-start does session env injection.
  implication: The ONLY way to create `.more-done` is for Claude Code to fire a Stop event on the LIVE perch. If the parent Claude turn never ends, no sentinel is created, no echo-commune fires.

- timestamp: 2026-05-13
  checked: Last good echo-commune content at dunsen.log line 876
  found: `ECHO_COMMUNE (2026-05-13T21:55:13-07:00) from dunsen: Action: /gsd-plan-phase 06.3 --research invoked. Researcher spawn initiated. State: Phase 06.3 context validated. RESEARCH.md not yet created. Planner pending researcher output. Next: Researcher to produce findings ... before planner generates PLAN.md.`
  implication: At 21:55:13, dunsen had just dispatched a researcher SUBAGENT (via Task tool). Sub-agents run inside the parent's Claude turn. The parent turn does NOT end ("Stop") until ALL subagent work + post-subagent parent work completes. That can easily exceed 80 minutes for a `/gsd-plan-phase --research` workflow that involves researcher → planner → planning → verification stages all chained.

- timestamp: 2026-05-13
  checked: dunsen.log iteration 69 (22:55:34) — PSYCHE side `[PSYCHE] resume (exit=0): (empty)`
  found: Psyche IS finishing each haiku-cycle (exit=0 with empty body = "nothing to say this pulse"). Psyche-side Stop hooks fire on the psyche-haiku-child perch. But that perch is NOT state=Live (it's transient), so `spawn_echo_commune_if_live` returns early at hook_idle.rs:81 (`if info.state != PerchState::Live { return; }`). Psyche-side Stops cannot create the sentinel — only Self (Live) Stops can.
  implication: The 20-min PULSE_TRIGGER → Psyche-haiku → Stop cycle does not feed the sentinel. The sentinel depends entirely on the human-facing Self/Live agent's main Claude turn ending.

- timestamp: 2026-05-13
  checked: Forced fire at lines 921-926 (23:15:37)
  found: `[PSYCHE] resume (exit=1): You've hit your org's monthly usage limit`. Immediately after, `poll iteration 70 starting` → `[ECHO] gate open: sentinel stale (or metadata unavailable) — firing echo commune`. Then handoff to v1.9.14 at 23:15:37.
  implication: The gate-open log line is emitted only on `FireDecision::Fire`, which requires the sentinel to EXIST with either age >= the 15-min window or unreadable metadata; `SkipNoSentinel` returns early without logging. So the usage-limit exit (exit=1) on the Psyche side did not open the gate by itself. Between iter 69 (22:55) and iter 70 (23:15), Dunsen's Live Claude must have fired a Stop hook and created the sentinel. The trigger is unconfirmed: the user may have pressed Esc, cleared context with `/clear`, or the autonomous turn may have finally finished. By iter 70 the sentinel was past the 15-min window (or its metadata was unreadable), so the decision was Fire.
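
The consumer-side behavior described in the evidence above (absent sentinel means a silent skip; stale sentinel or unreadable metadata means a logged fire) can be sketched against the filesystem as follows. This is an illustration under assumptions, not the wrapper's code: `decide` and `gate_log_line` are hypothetical names, `SkipTooRecent` and its log text are invented for the sketch, and the gate-open text is abbreviated from the log line quoted above.

```rust
use std::fs;
use std::path::Path;
use std::time::Duration;

/// `SkipTooRecent` is an assumed variant name; the other two appear in the notes.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum FireDecision {
    Fire,
    SkipNoSentinel,
    SkipTooRecent,
}

/// Hypothetical gate: decide from the sentinel file's existence and mtime.
fn decide(sentinel: &Path, window: Duration) -> FireDecision {
    if !sentinel.exists() {
        return FireDecision::SkipNoSentinel;
    }
    // Age is `None` if metadata or mtime cannot be read, or if the mtime
    // lies in the future (`elapsed` returns an error in that case).
    let age = fs::metadata(sentinel)
        .and_then(|m| m.modified())
        .ok()
        .and_then(|mtime| mtime.elapsed().ok());
    match age {
        Some(age) if age < window => FireDecision::SkipTooRecent,
        // Stale sentinel, or metadata unavailable: fire.
        _ => FireDecision::Fire,
    }
}

/// Returns the log line the gate would emit, or `None` for the silent path.
fn gate_log_line(sentinel: &Path, window: Duration) -> Option<String> {
    match decide(sentinel, window) {
        FireDecision::Fire => {
            Some("[ECHO] gate open: sentinel stale (or metadata unavailable)".to_string())
        }
        // Assumed wording; the notes do not quote a too-recent log line.
        FireDecision::SkipTooRecent => {
            Some("[ECHO] gate closed: sentinel too recent".to_string())
        }
        // The silent path that matches iterations 66-69.
        FireDecision::SkipNoSentinel => None,
    }
}
```

In this sketch the absence of any `[ECHO]` line is only diagnostic of a missing sentinel if the other skip path does log something, which is the assumption the inference above relies on.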

## Resolution

root_cause: Echo-commune fires depend on the Self/Live Claude agent's Stop hook, which is the only writer of the `.more-done` sentinel. During a long autonomous turn in which the parent Claude session waits on chained subagent work (Task-tool researcher → planner → planning workflow), the parent's Stop event does not fire, so no sentinel is written and the wrapper's gate stays closed until that turn ends (`should_fire_decision` returns `SkipNoSentinel` silently). The user's hypothesis is essentially correct. They framed the rule as "only on UserPromptSubmit"; the actual rule is "only on Stop hook events on the LIVE perch", which covers the end of any user-initiated turn and of any autonomous turn, but NOT mid-turn subagent boundaries. Long uninterrupted parent turn = no Stop = no sentinel = no echo-commune.
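
The producer-side rule behind this root cause can be sketched as a pure predicate over the four gating conditions recorded in the evidence for `spawn_echo_commune_if_live`. This is a sketch under assumptions: `StopContext`, its field names, `stop_hook_writes_sentinel`, and the `Other` perch state are hypothetical, and only `PerchState::Live` and the four conditions come from these notes.

```rust
/// Only `Live` is named in the notes; `Other` stands in for every non-Live
/// state (for example the transient psyche-haiku-child perch).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum PerchState {
    Live,
    Other,
}

/// Hypothetical snapshot of what the Stop hook sees when it runs.
struct StopContext<'a> {
    /// True if the `OWL_ECHO_COMMUNE` env var is set.
    echo_commune_env_set: bool,
    perch_state: PerchState,
    session_id: &'a str,
    /// True if `{psyche_id}/ready` exists.
    psyche_ready_exists: bool,
}

/// True only when all four conditions hold. This is evaluated only when a
/// Stop event arrives, so a parent turn that never ends never reaches it.
fn stop_hook_writes_sentinel(ctx: &StopContext<'_>) -> bool {
    !ctx.echo_commune_env_set
        && ctx.perch_state == PerchState::Live
        && !ctx.session_id.is_empty()
        && ctx.psyche_ready_exists
}
```

Under this model a Psyche-side Stop fails the perch-state check, and a SubagentStop never evaluates the predicate at all, which leaves the end of the Live agent's own turn as the only event that can produce the sentinel.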

fix: Not applied (investigation-only per request). See "Recommended fix direction" in summary.

verification: See "Reproducible test plan" in summary.

files_changed: []
