---
status: diagnosed
trigger: "Phase 25.3-04 [FILE-DROP] commune arm exited with (file gone) instead of routing through route_inbound_commune_body"
created: 2026-05-22T22:00:00-07:00
updated: 2026-05-22T22:15:00-07:00
---

## Current Focus

hypothesis: CONFIRMED — `psyche-download todlando` (run by operator before psyche boot) called `download_payload → append_pending_sections` which destructively consumed (`fs::remove_file`) `.claude/todlando-commune.md`. The Self listener's prior `scan_drop_files` had already enqueued a `<EVENT type="file_drop">` to the wrapper perch via spool. When the wrapper booted and poll iteration 1 drained that spool envelope, `process_file_drop` ran `fs::read_to_string(path)` on the (now-deleted) file → ENOENT → `(file gone)` branch.
test: code-trace confirmation of single deletion site + timing reconstruction from log
expecting: exactly one `fs::remove_file` of the commune drop path in the codebase, in a code path the operator demonstrably invoked
next_action: return diagnosis (root cause only)

## Symptoms

expected: `[FILE-DROP] route_inbound_commune_body for todlando` line in wrapper log; tracked psyche file materialized with `<!-- spt:source=direct spt:routed_at=... -->` header
actual: `[FILE-DROP] dropped kind=commune ... (file gone)` — tracked-psyche folder created but EMPTY, no route line
errors: none (silent failure path — ENOENT arm returns FileDropOutcome::Continue)
reproduction: invoke `$LIVE psyche-download <id>` (or any caller of `download_payload`) BEFORE wrapper boots while `.claude/<id>-commune.md` is present; then boot the wrapper while the listener's spool still holds the pre-deletion file_drop envelope
started: first observed in Phase 25.3-04 (post-implementation observation); the defect itself is pre-existing, dating from quick-260515-uf1, which introduced the destructive consume

## Resolution

### Root Cause

`src/live/context.rs::append_pending_sections` at **line 615** is the SOLE deletion site for `.claude/{self_id}-commune.md` in the codebase:

```rust
// src/live/context.rs:603-615
let disk_write = fs::OpenOptions::new()
    .create(true)
    .append(true)
    .open(&ctx_path)
    .and_then(|mut f| f.write_all(section.as_bytes()));

match disk_write {
    Ok(()) => {
        // Disk write succeeded → safe to delete drop file. ...
        let _ = fs::remove_file(&path);    // <-- LINE 615: THE DELETION
```

This helper is called on every `download_payload` invocation; the only guard is that the current directory resolves (`src/live/context.rs:496-501`):

```rust
// src/live/context.rs:496-501
if let Ok(cwd) = std::env::current_dir() {
    let appended = append_pending_sections(&mut out, &cwd, self_id);
    ...
}
```

`download_payload` has two production callers (`src/live/context.rs:693` and `:678`, respectively):
1. `run_download(self_id)` — invoked by `$LIVE psyche-download <id>` CLI
2. `download_payload_for_injection(self_id)` — invoked by the SessionStart hook (`src/owl/resume.rs:203` and `:331`), gated on `is_live` perches owned by the current `parent_pid`

The operator-visible evidence pinpoints the trigger: the `psyche-download todlando` output earlier in the session showed `## Pending Commune (written 2026-05-22T21:56:24-07:00)`, a header produced by `append_pending_sections` at `src/live/context.rs:590`. That same invocation then deleted the file.

The Self listener's separate scan (`src/owl/poll.rs::scan_drop_files`, line 1208) is read-only: it never deletes the drop file. It just discovers presence and emits a `<EVENT type="file_drop">` envelope via `compose_file_drop_event` (`src/owl/poll.rs:1236`). Producer-side dedupe (`dedupe_drops`, `src/owl/poll.rs:1265`) was added by quick-260520-pra precisely to suppress repeated emits per file lifecycle. It only prevents new emits, though; it does not retract an envelope already persisted to the wrapper's spool, so a stale envelope can sit in that spool even after the file vanishes mid-lifecycle.

### Read site (`(file gone)` branch)

`src/live/wrapper/mod.rs:1446-1454`:

```rust
let body = match std::fs::read_to_string(path) {
    Ok(s) => s,
    Err(e) if e.kind() == std::io::ErrorKind::NotFound => {
        self.log(&format!(
            "[FILE-DROP] dropped kind={} path={} (file gone)",
            kind, path
        ));
        return FileDropOutcome::Continue;
    }
    ...
};
```

The function returns BEFORE `route_inbound_commune_body` (line 1481) ever executes. That is why the tracked-psyche directory was created (by a different boot-time path) but stayed empty — no synchronous route persisted any slice.

### Timeline reconstruction

1. T-N: operator stages `.claude/todlando-commune.md` (file exists, mtime 2026-05-22T21:56:24-07:00).
2. T-N+ε: operator runs `$LIVE psyche-download todlando` from project root.
   - `run_download → download_payload → append_pending_sections`
   - Reads `.claude/todlando-commune.md` (line 569)
   - Appends to `agents/todlando/live_context.md` (line 603–607)
   - **`fs::remove_file(&path)`** — file deleted (line 615)
   - Operator sees `## Pending Commune (written 2026-05-22T21:56:24-07:00)` in the download output (line 590 format).
3. Independently, Self listener's `scan_drop_files` had run at least once while the file was still present and emitted `<EVENT type="file_drop" kind="commune" path="C:/Users/decid/Documents/projects/claude_skill_owl/.claude/todlando-commune.md">` to the `todlando-psyche` perch via `send::deliver_body`. That envelope landed in the wrapper's spool DB (TCP listener not yet up).
4. T+0 (21:58:25): wrapper starts gen=29.
5. T+0–T+39 (21:58:25–21:59:04): `init_session` spawns `claude -p` with cwd = `psyches/tracked/` (`src/live/wrapper/claude.rs:56`). The SessionStart hook fires inside that subprocess but the project-root `.claude/` is not reachable from that cwd, AND `inject_active_perch_context` filters perches by `parent_pid` to the wrapper's subprocess (not the operator's session) — so even if it ran, it would not find a matching live perch. Either way, the commune file is already gone from step 2; this step is irrelevant to the deletion (defensive note for ruling out an alternative hypothesis).
6. T+39 (21:59:04): poll iteration 1 drains the spool envelope from step 3.
7. T+39: `handle_file_drop_arm` matches `<event type="file_drop"`, parser succeeds, `process_file_drop("commune", "C:/.../todlando-commune.md")` runs.
8. T+39: `fs::read_to_string(path)` → ENOENT → `(file gone)` log line, return `FileDropOutcome::Continue`.
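
The ordering in steps 2, 3, and 6 through 8 can be replayed in isolation. The following is a minimal sketch, not the project's code: `scan`, `consume_destructively`, `drain_one`, `replay_race`, and `DrainOutcome` are hypothetical stand-ins for `scan_drop_files`, `append_pending_sections`, `process_file_drop`, and `FileDropOutcome`, and an in-memory `VecDeque` is assumed in place of the spool DB.

```rust
use std::collections::VecDeque;
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical stand-in for what the receiver observes for one envelope.
#[derive(Debug, PartialEq)]
pub enum DrainOutcome {
    Routed(String),
    FileGone,
}

/// Read-only producer: enqueues the path if the file exists, never deletes it.
pub fn scan(path: &Path, spool: &mut VecDeque<PathBuf>) {
    if path.exists() {
        spool.push_back(path.to_path_buf());
    }
}

/// Destructive consumer: reads the body, then removes the drop file.
pub fn consume_destructively(path: &Path) -> io::Result<String> {
    let body = fs::read_to_string(path)?;
    fs::remove_file(path)?;
    Ok(body)
}

/// Receiver: pops one spooled path and tries to read it.
/// Returns Ok(None) when the spool is empty.
pub fn drain_one(spool: &mut VecDeque<PathBuf>) -> io::Result<Option<DrainOutcome>> {
    let Some(path) = spool.pop_front() else {
        return Ok(None);
    };
    match fs::read_to_string(&path) {
        Ok(body) => Ok(Some(DrainOutcome::Routed(body))),
        Err(e) if e.kind() == io::ErrorKind::NotFound => Ok(Some(DrainOutcome::FileGone)),
        Err(e) => Err(e),
    }
}

/// Replays the race: scan first, destructive consume second, drain last.
pub fn replay_race(dir: &Path) -> io::Result<Option<DrainOutcome>> {
    let drop_file = dir.join("example-commune.md");
    fs::write(&drop_file, "staged body")?;

    let mut spool = VecDeque::new();
    scan(&drop_file, &mut spool); // envelope enqueued while the file exists
    consume_destructively(&drop_file)?; // file deleted before the drain
    drain_one(&mut spool) // stale envelope now points at a missing file
}
```

With this ordering the drain always lands in the `FileGone` arm; swapping the consume and drain calls would route the body instead, which is why the outcome depends purely on who reaches the file first.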

The `claude init` hypothesis (that init's own SessionStart consumed the file) is RULED OUT: init runs with cwd = `psyche_dir()`, and `append_pending_sections` only looks at `cwd.join(".claude")`, so it never sees the project-root drop file. The hypothesis was on the right track in suspecting a destructive consume through `append_pending_sections`; the actual consumer is the operator-invoked `psyche-download`, which runs at project-root cwd.

### Confidence: HIGH

Reasons:
- Only ONE `fs::remove_file` of `{id}-commune.md` exists in the codebase (verified by `Grep` over `src/`). The deletion path is unambiguous.
- The `## Pending Commune (written 2026-05-22T21:56:24-07:00)` header observed by the operator is produced EXCLUSIVELY by `append_pending_sections:590`. That string is the operator-visible signature of the deletion event.
- The file_drop path validation (`/.claude/` containment + `-commune.md` suffix, `src/live/wrapper/mod.rs:319-326`) rules out a malformed-envelope hypothesis — the parse already succeeded (log shows `dispatching: kind=commune path=...`).
- The ENOENT arm at `src/live/wrapper/mod.rs:1448` is the EXACT formatter that produces `(file gone)`. Line-anchored verification.
- The `spool-retain-replays-file-drop-after-delete.md` resolved debug (referenced in the comment at line 1443) documents the same race shape — listener spool re-delivers after a consumer-side delete. That prior debug fixed listener-side dedupe; the producer (`scan_drop_files`) and the operator-invoked deleter (`psyche-download`) are uncoupled, so the race re-emerges through this new pathway.

### Why Phase 25.3-04 didn't catch this

Phase 25.3-04 Plan 04 Defect A added `route_inbound_commune_body` and inlined it inside `process_file_drop`. The test surface for the synchronous route is the integration test at `src/live/wrapper/mod.rs:2222-2225` which directly invokes `route_inbound_commune_body` with a literal body string — it skips `process_file_drop` entirely. The only `process_file_drop` ENOENT test (`process_file_drop_enoent_logs_dropped_not_retaining`, line 3275) is a regression guard for a DIFFERENT defect (`spool-retain-replays-file-drop-after-delete`); it asserts the ENOENT arm logs `(file gone)` correctly — which is exactly the path that fires here. So the gap is not in unit-level coverage but in **integration coverage**: there is no end-to-end test that exercises the sequence (commune file present → psyche-download consumes → wrapper boots and drains spool envelope) and asserts the route lines materialize on disk. The two relevant subsystems (psyche-download's destructive consume; wrapper's spool drain) are tested in isolation but never together.

### Suggested fix direction (NOT applied)

The choice depends on the intended ownership contract:

**Option A: Make `psyche-download` non-destructive.**
Remove the `fs::remove_file(&path)` at `src/live/context.rs:615` and let the wrapper's `process_file_drop` be the sole consumer + deleter (it already does so on `exit_code == 0`, see `src/live/wrapper/mod.rs:1548-onwards`). Risk: duplicate sections in `live_context.md` if `psyche-download` is run multiple times before wrapper consumes — would need a separate idempotency key (mtime hash, or a `.consumed` marker) on the appended sections.
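
The idempotency key for Option A could take roughly the following shape. This is a minimal sketch under stated assumptions: `drop_key`, `append_once`, and the `<!-- drop-key=... -->` marker format are all hypothetical and do not exist in the codebase, and the key is assumed to be derived from the drop file's mtime and length.

```rust
use std::fs;
use std::io::{self, Write};
use std::path::Path;
use std::time::UNIX_EPOCH;

/// Hypothetical key: mtime (seconds.nanos) plus byte length of the drop file.
pub fn drop_key(drop_path: &Path) -> io::Result<String> {
    let meta = fs::metadata(drop_path)?;
    let mtime = meta
        .modified()?
        .duration_since(UNIX_EPOCH)
        .map_err(|e| io::Error::new(io::ErrorKind::Other, e))?;
    Ok(format!("{}.{:09}-{}", mtime.as_secs(), mtime.subsec_nanos(), meta.len()))
}

/// Appends the drop body to the context file unless a section carrying the
/// same key is already present. Never deletes the drop file.
/// Returns Ok(true) if a section was appended, Ok(false) if it was skipped.
pub fn append_once(drop_path: &Path, ctx_path: &Path) -> io::Result<bool> {
    let marker = format!("<!-- drop-key={} -->", drop_key(drop_path)?);

    // A missing context file simply means nothing has been appended yet.
    let existing = match fs::read_to_string(ctx_path) {
        Ok(s) => s,
        Err(e) if e.kind() == io::ErrorKind::NotFound => String::new(),
        Err(e) => return Err(e),
    };
    if existing.contains(&marker) {
        return Ok(false);
    }

    let body = fs::read_to_string(drop_path)?;
    let mut f = fs::OpenOptions::new()
        .create(true)
        .append(true)
        .open(ctx_path)?;
    write!(f, "{}\n{}\n", marker, body)?;
    Ok(true)
}
```

Repeated `psyche-download` runs against an unchanged drop file would then append at most one section, while a rewritten drop file (new mtime or length) gets a new key and a new section.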

**Option B: Make `psyche-download` notify the wrapper synchronously.**
After consume, send a control message to the wrapper perch saying "this file_drop is stale; drop any pending spool entries for path=X". Requires new wire-form symbol + wrapper-side handler. Heavier.

**Option C (smallest delta): Receiver-side staleness check.**
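Inside `process_file_drop`, BEFORE `read_to_string`, check the file's mtime (`fs::metadata(path)`, then `.modified()`) against a per-wrapper-session epoch (e.g. `last_consume_epoch` written to info.json by `append_pending_sections`). If the file mtime predates the epoch, treat as stale-consumed and skip without the `(file gone)` log noise. Caveat: in the race diagnosed here the file is already deleted, so `fs::metadata` fails with `NotFound` as well, and the recorded epoch would have to serve as the evidence of a prior consume. Doesn't fix the lost commune (still gone) but removes the misleading log. **Not sufficient on its own.**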

**Option D (preferred shape): Single-writer contract.**
Pick ONE consumer (the wrapper). `psyche-download` reads `.claude/{id}-commune.md` non-destructively (mirror the data into download output but do NOT remove). The wrapper's file_drop pathway becomes the sole deleter. `live_context.md` materialization happens ONCE, synchronously, inside `route_inbound_commune_body` from the wrapper side. `psyche-download` output for `Pending Commune` becomes a read-only snapshot. This converges with Phase 25.3-04's locked Option A (wrapper-direct routing) — `append_pending_sections` becomes vestigial for the deletion side.
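
The single-writer split in Option D can be sketched as two functions with different rights over the drop file. This is an illustration only: `snapshot_pending` and `consume_and_route` are hypothetical names standing in for the `psyche-download` read path and the wrapper's routing path, and a plain append is assumed in place of whatever `route_inbound_commune_body` actually writes.

```rust
use std::fs;
use std::io::{self, Write};
use std::path::Path;

/// Read-only snapshot for download output.
/// Returns the body if the drop file is present; never deletes it.
pub fn snapshot_pending(drop_path: &Path) -> io::Result<Option<String>> {
    match fs::read_to_string(drop_path) {
        Ok(body) => Ok(Some(body)),
        Err(e) if e.kind() == io::ErrorKind::NotFound => Ok(None),
        Err(e) => Err(e),
    }
}

/// Sole consumer and sole deleter.
/// Appends the body to the context file, and removes the drop file only
/// after the append succeeded. Returns Ok(false) if there was nothing to do.
pub fn consume_and_route(drop_path: &Path, ctx_path: &Path) -> io::Result<bool> {
    let body = match fs::read_to_string(drop_path) {
        Ok(b) => b,
        Err(e) if e.kind() == io::ErrorKind::NotFound => return Ok(false),
        Err(e) => return Err(e),
    };

    let mut f = fs::OpenOptions::new()
        .create(true)
        .append(true)
        .open(ctx_path)?;
    f.write_all(body.as_bytes())?;

    // Deletion happens last, so a failed write leaves the drop file intact.
    fs::remove_file(drop_path)?;
    Ok(true)
}
```

Under this split, any number of snapshots before the wrapper drains leaves the file in place, and the context file receives the body exactly once.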

Integration test to add (whichever option is chosen): seed `.claude/{id}-commune.md`, invoke `download_payload` (or a `psyche-download` subprocess), then simulate the wrapper drain by invoking `process_file_drop` against the now-absent path AND a pre-staged spool envelope; assert that `agents/{id}/live_context.md` contains exactly ONE routed slice (not zero, not two).

### Secondary observation: ANSI escape leak in poll stderr → wrapper log

`src/live/wrapper/mod.rs:1216-1220` and `:1245-1247`:

```rust
self.log(&format!(
    "poll exited code={} stderr={}",
    exit_code,
    stderr.trim()
));
```

`stderr.trim()` only strips whitespace. The poll subprocess emits status lines via `output::owl_status(S_OK, ...)` (cyan-ANSI per CLAUDE.md "owl cyan" convention), producing literal `\x1b[36m✓ READY:todlando-psyche (spt v1.11.6)\x1b[0m` in the wrapper log. The escape codes are inert in the log file but break grep-friendliness and clutter logs read in non-ANSI viewers (Notepad, IDE log panes).

Fix shape (separate defect — not the file-gone root cause):
- Add a small `strip_ansi(&str) -> String` helper (regex over `\x1b\[[0-9;]*[a-zA-Z]`), and apply it to `stderr` before formatting into `self.log(...)`. Two call sites (`mod.rs:1216`, `mod.rs:1245`). Pure-string transform; trivially unit-testable.
- Alternative: pass `--no-color` or set `NO_COLOR=1` in the poll subprocess env at `mod.rs:1197-1200` so the child elects not to emit color in the first place. This is the cleaner shape if `output::owl_status` already honors `NO_COLOR` (worth a one-line check at the emitter side).
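
A possible shape for the `strip_ansi` helper from the first bullet, written without a regex dependency. This is a sketch, not existing code: it hand-rolls the same pattern as the regex above (ESC, `[`, any run of digits or `;`, then one ASCII letter) and leaves incomplete sequences untouched.

```rust
/// Removes ANSI CSI sequences of the form ESC '[' [0-9;]* [A-Za-z].
/// Anything that does not match the full pattern is copied through verbatim.
pub fn strip_ansi(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    let mut chars = input.chars().peekable();

    while let Some(c) = chars.next() {
        if c == '\x1b' && chars.peek() == Some(&'[') {
            // Scan ahead on a clone so a failed match consumes nothing.
            let mut ahead = chars.clone();
            ahead.next(); // skip '['
            while matches!(ahead.peek(), Some(p) if p.is_ascii_digit() || *p == ';') {
                ahead.next();
            }
            if matches!(ahead.peek(), Some(f) if f.is_ascii_alphabetic()) {
                ahead.next(); // consume the final letter
                chars = ahead; // commit: drop the whole sequence
                continue;
            }
        }
        out.push(c);
    }
    out
}
```

At the two log sites this would be applied as `strip_ansi(stderr.trim())` before formatting. If the emitter turns out to honor `NO_COLOR`, setting it on the child process is the simpler fix and this helper becomes unnecessary.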

### Files Involved

- `src/live/context.rs:558-643` — `append_pending_sections` (destructive consume site)
- `src/live/context.rs:615` — `fs::remove_file(&path)` (THE deletion)
- `src/live/context.rs:496-501` — `download_payload` call into the consumer
- `src/live/context.rs:677-679` — `download_payload_for_injection` (SessionStart variant)
- `src/owl/resume.rs:203`, `:331` — SessionStart caller
- `src/live/wrapper/mod.rs:1430-1454` — `process_file_drop` ENOENT arm (the `(file gone)` log)
- `src/live/wrapper/mod.rs:1481-1485` — `route_inbound_commune_body` call (never reached when ENOENT fires)
- `src/owl/poll.rs:1208-1220` — `scan_drop_files` (listener's read-only producer)
- `src/owl/poll.rs:1236-1243` — `compose_file_drop_event` (envelope built before delete)
- `src/owl/poll.rs:1245-onwards` — `dedupe_drops` (in-memory dedupe; does NOT survive wrapper-side spool drain after producer-side delete)
- `src/live/wrapper/mod.rs:1216-1220`, `:1245-1247` — ANSI-leak log site (secondary)

### Specialist Hint

rust
