---
phase: 25.3
reviewers: [codex]
reviewed_at: 2026-05-22T17:55:00-07:00
plans_reviewed: [25.3-01-PLAN.md, 25.3-02-PLAN.md, 25.3-03-PLAN.md, 25.3-04-PLAN.md]
---

# Cross-AI Plan Review — Phase 25.3

## Codex Review

## Summary

The plans are thoughtful and mostly sequenced around the right defect boundaries: B2 first, D paired with B2, C before A. The strongest part is that they treat this as a contract problem across resolver, prompt, encoding, and persistence layers rather than a single missing write. The main weakness is that several plans still contain unresolved architectural choices, contradictory envelope shapes, and test/deploy steps that may fail on Windows or mutate scope mid-plan. I would not execute these as-is without tightening the contracts first.

## 25.3-01 Plan Review

### Strengths

- Correctly isolates B2 as the smallest first fix.
- Uses `last_project_name` as the canonical project source, which matches Phase 24.1.
- Includes caller audits for `build_current_context_blocks` and `derive_current_repo_names`.
- Adds regression coverage for `tracked` false-positive handling.

### Concerns

- **HIGH:** `resolve_self_project_stamp` still falls back to plain cwd `stamp()`, which can still produce `project="tracked"` for outbound EVENT attrs. That preserves the same asymmetry the plan is trying to remove.
- **MEDIUM:** Rejecting literal `"tracked"` globally can break a real project named `tracked`. Better reject it only when the caller cwd is the SPT tracked psyche dir or when the value came from D-06 fallback.
- **MEDIUM:** Tests that mutate `SPT_HOME` may be flaky under parallel `cargo test`. They need serial execution or isolated env handling.
- **MEDIUM:** The plan says the prompt is keyed on `claude_skill_owl`, but the described prompt format only has `CURRENT_PROJECT_CONTEXT:` with no visible project name. That makes smoke verification ambiguous.
- **LOW:** `files_modified` lists `tests/native_owl.rs`, but the task says to put most tests in the private module inside `echo_commune.rs`.

### Suggestions

- Filter `"tracked"` in all three resolver layers: tracked info, legacy perch field, and cwd fallback.
- Add a helper returning `Option<String>` for “safe resolved project name” and make both EVENT stamping and context-block resolution use it.
- Make env-mutating tests serial.
- Add a debug/test-only way to verify the resolved project name used for the prompt, not just the body path.

### Risk Assessment

**MEDIUM.** The core fix is sound, but the remaining cwd fallback can still emit the bad project name and preserve asymmetry.

## 25.3-02 Plan Review

### Strengths

- Correctly identifies that `psyche.md` must teach absorption, not just emission.
- Aligns `psyche.md`, `commune.md`, and `signoff/SKILL.md`.
- Calls out that `psyche.md` is embedded and requires rebuild.

### Concerns

- **HIGH:** Option B expands far beyond listed files. It requires `build_current_context_blocks`, `parse_two_slice`, `route_two_slice`, validation, and tests, but the plan only lists prompt/skill files.
- **HIGH:** The plan alternates between “Psyche writes `projects/<name>/<self_id>.md`” and “Psyche emits envelopes that Rust routes.” That distinction matters. The LLM should not be taught that it directly writes the file unless it actually has that tool path.
- **HIGH:** The must-haves require `<project-context name="...">`, but Option A explicitly keeps no name attr. The plan is internally inconsistent until that decision is resolved.
- **MEDIUM:** Deploying this before the encoding contract is fixed may still leave the Psyche seeing `&lt;project-context&gt;`, making the new absorption rules ineffective.
- **MEDIUM:** LLM behavior is only verified by smoke testing. There should also be static prompt-contract tests or golden prompt checks.

### Suggestions

- Pick Option A or B before finalizing the plan. My recommendation: defer Option B unless you are ready to change parser/writer semantics in this same phase.
- Reword psyche instructions as: “emit this envelope; SPT routes it to disk.”
- If Option B is chosen, make it a separate Rust implementation plan, not a prompt-doc task.
- Move this after Plan 03, or explicitly state that Plan 03’s decoded-at-LLM-boundary contract is assumed.

### Risk Assessment

**HIGH.** The plan’s unresolved envelope-shape decision can cause incompatible implementations across Plans 03 and 04.

## 25.3-03 Plan Review

### Strengths

- Correctly treats encoding as a full-chain contract problem.
- Adds a missing `event_body_unescape` inverse and specifies amp-last order.
- Requires an end-to-end regression for the observed mixed raw/encoded state.
- Explicitly hands off the body-state answer to Plan 04.

### Concerns

- **HIGH:** Decoding must happen after EVENT framing is parsed, not by globally unescaping the full envelope string. Otherwise user body content can become structural-looking `<EVENT>` text in the wrong layer.
- **HIGH:** `depends_on: []` is risky because this plan depends on the envelope shape chosen in Plan 02. Plan 02 and 03 cannot both make independent assumptions.
- **MEDIUM:** “Decode at exactly ONE site” is too rigid. Attr decoding already exists, and multiple final consumers may need body decoding at their own boundary. The better invariant is “one body decode per transport traversal.”
- **MEDIUM:** The plan mentions EVENT-PART chunking as blast radius but does not require a chunk-boundary test.
- **MEDIUM:** Several commands are Bash-style (`tail`, heredoc) despite the project’s Windows-native constraint and PowerShell shell.
- **LOW:** A large contract doc in `poll.rs` may age poorly if encoding behavior spans modules. A small `encoding.rs` module would be cleaner if churn is expected.

### Suggestions

- Define the contract around parsed fields: attrs decoded as attrs, body decoded only after body extraction.
- Add tests for literal `</EVENT>`, literal `<br>`, amp entities, and EVENT-PART split/reassembly.
- Replace smoke commands with PowerShell-compatible here-strings and `Get-Content -Tail`.
- Make Plan 03 precede Plan 02, or require Plan 02’s envelope decision as an input.

### Risk Assessment

**MEDIUM-HIGH.** The investigation-first structure is good, but an incorrect decode boundary would create framing or prompt-injection regressions.

## 25.3-04 Plan Review

### Strengths

- Correctly waits for Plan 03 before routing inbound bodies.
- Makes the key architecture choice explicit: deterministic wrapper route vs LLM round-trip.
- Preserves LLM handoff if direct routing is chosen.
- Treats route failure as nonfatal, which protects live-agent continuity.

### Concerns

- **HIGH:** Option 1 ignores Plan 02 Option B’s possible `name=` attr. If named project-context envelopes are adopted, routing by receiver `last_project_name` may write cross-project context to the wrong file.
- **HIGH:** Direct route followed by stale LLM re-emit can overwrite the newly persisted inbound body. The plan acknowledges this but does not mitigate it.
- **HIGH:** Option 2 relies on prompt compliance for persistence, which is the failure mode that triggered this phase. It is weaker than the defect requires unless paired with enforcement or retry.
- **MEDIUM:** “Route failures are nonfatal” is operationally safe but can hide persistence failure. It needs a clear warning/status surface.
- **MEDIUM:** Tests require mocking `resume_session_with_exit` and `route_two_slice`, but the plan does not call out the necessary dependency injection or helper extraction.
- **LOW:** `$OWL deliver todlando <<'EOF'` is not PowerShell-compatible.

### Suggestions

- Prefer Option 1, but add stale-overwrite mitigation: after direct route, ensure the next prompt baseline includes the routed content and warn/error if haiku output omits project context.
- If Option B from Plan 02 is chosen, route by validated `name=` attr; otherwise route by resolved `last_project_name`.
- Add a monotonic append/merge or source precedence rule so LLM output cannot erase direct-routed inbound content accidentally.
- Extract a small testable helper for “route inbound commune body” rather than testing through full wrapper process flow.

### Risk Assessment

**HIGH** unless Option 1 is chosen with overwrite mitigation. Option 2 is too dependent on LLM compliance for a persistence defect.

## Cross-Plan Recommendations

1. Resolve envelope shape first: no-name `<project-context>` vs named `<project-context name="X">`. This decision affects Plans 02, 03, and 04.
2. Recommended order: Plan 01, Plan 03, Plan 02, Plan 04. Encoding state should be settled before prompt absorption rules and wrapper routing.
3. Avoid deploying after every partial plan unless you intentionally want intermediate behavior live. A single integrated deploy after 01-03, then final deploy after 04, would reduce churn.
4. Replace Bash commands with PowerShell equivalents throughout.
5. Fix cargo filter commands. `cargo test` accepts one test filter; commands like `cargo test --release --lib resolve_self_project_stamp_prefers... read_tracked_last_project_name` are likely invalid.
6. Add one phase-level invariant test: inbound body with live + project envelopes reaches final persistence with no `tracked` project, no encoded tags, and marker text in the expected project file.

Overall phase risk: **HIGH as written**, mainly due to unresolved cross-plan contract choices. With the envelope shape chosen upfront, resolver fallback tightened, and decode boundary precisely defined, this drops to **MEDIUM**.

---

# Cycle 2 Re-Review (post-replan)

**Reviewed at**: 2026-05-22T18:10:00-07:00

## Prior HIGH Resolution

| Prior HIGH | Status | Evidence |
|---|---|---|
| Plan 01: `resolve_self_project_stamp` could still fall back to cwd and emit `project="tracked"` | **RESOLVED** | Plan 01 now routes both stamp and context-block resolution through `info.json.cwd` plus a final cwd fallback guard. |
| Plan 02: Option B exceeded file scope | **RESOLVED** | Plan 02 locks Option A no-name `<project-context>` and removes parser/writer changes from scope. |
| Plan 02: “Psyche writes file” vs “Rust routes envelope” ambiguity | **RESOLVED** | Plan 02 explicitly standardizes on “SPT routes; you emit.” |
| Plan 02: `<project-context name="...">` vs no-name inconsistency | **RESOLVED** | Option A no-name envelope is locked throughout. |
| Plan 03: body decoding before EVENT framing parse would be unsafe | **RESOLVED** | Plan 03 now requires body decode only after framing/body extraction. |
| Plan 03: missing dependency on envelope-shape decision | **RESOLVED** | Option A is locked, and Plan 03 now runs before Plan 02 with that assumption. |
| Plan 04: named envelope routing ambiguity | **RESOLVED** | No-name Option A removes `name=` routing entirely. |
| Plan 04: stale LLM re-emit could overwrite direct-routed content | **PARTIAL** | Monotonic guard is added, but the proposed strict-subset heuristic may reject legitimate summarization or still miss stale rewrites. |
| Plan 04: Option 2 relies on prompt compliance | **RESOLVED** | Option 1 direct wrapper routing is locked/default; Option 2 is explicitly disfavored. |

## Fresh Review

### 25.3-01

- **MEDIUM:** Plan 01 says `"tracked"` rejection is path-context-sensitive at all three layers, but the legacy perch field branch rejects `p == "tracked"` globally. That contradicts the stated invariant, even if today that field is usually unset.
- **MEDIUM:** `resolve_self_project_name_via_info_cwd` appears introduced as a private helper in `src/owl/echo_commune.rs`, but Plan 04 later wants wrapper/tracked code to reuse it. Either make it `pub(crate)` in a shared module or avoid depending on it outside `echo_commune.rs`.
- **LOW:** Adding `serial_test` is pragmatic, but it expands the dependency surface for env tests. A local mutex/test lock would avoid the new dev dependency, though this is not blocking.

### 25.3-02

- **MEDIUM:** The static prompt test checks for absence of `<project-context name=` only in `psyche.md`, while the plan’s invariant says no name examples anywhere in psyche plus skill files. Add the same guard over `commune.md` and `signoff/SKILL.md`.
- **MEDIUM:** The plan still depends on a human/LLM smoke cycle to prove artifact materialization. That is acceptable for prompt behavior, but it should be treated as a real blocking verification, not just a deployment smoke.
- **LOW:** Task numbering skips from Task 0 to Task 2. Not functionally harmful, but easy to confuse during execution.

### 25.3-03

- **HIGH:** Plan 03 says the LLM stdin should have no `&amp;` entities in the end-to-end test. That is too strong if user content legitimately contains a literal encoded string like `&amp;` or `&lt;` as prose. The invariant should assert no transport-encoded envelope tags, not “no entities anywhere.”
- **MEDIUM:** `event_body_unescape` uses `<br>` -> newline first, which means a user’s literal `<br>` text will not round-trip through `event_body_escape` because `<br>` is escaped as `&lt;br&gt;`, not `<br>`. The test named “literal_br” needs to distinguish transport newline markers from user literal `<br>`.
- **MEDIUM:** The plan still includes a blocking decision between Option A and B even though the surrounding text strongly implies the LLM must see decoded literal tags. If convergence is desired, lock Option A unless there is a real reason to keep the checkpoint.

### 25.3-04

- **HIGH:** `route_inbound_commune_body` claims `project_routed: true` whenever `route_two_slice` returns `Ok`, but existing semantics may silently drop the project slice when no project resolves. That makes logs and tests falsely report project persistence. The outcome must reflect actual writes.
- **HIGH:** The monotonic guard is underspecified for real context evolution. “Strict subset by nonblank lines” can reject valid summarization/compaction and can fail to catch stale rewrites that paraphrase or reorder. For a safety guard, prefer marker/version metadata, append-only merge, or explicit preserve-on-conflict behavior with tests for expected LLM summarization.
- **MEDIUM:** Plan 04’s files_modified omits `src/owl/echo_commune.rs`, but Task 2 wires outbound echo_commune to `route_two_slice_monotonic` there.
- **MEDIUM:** The helper placement/resolver dependency is muddy: Plan 04 says route via Plan 01’s resolver in `echo_commune.rs`, but wrapper should not depend on an owl module-private helper. Move the resolver to `common/owlery.rs` or `common/tracked.rs`.
- **LOW:** The structural source-order test for “routes before LLM handoff” is brittle. A small dependency-injected function boundary would give a stronger behavioral test.

**Overall risk after revision**: **HIGH**. Most Cycle 1 contract issues are resolved, but convergence is not reached because Plan 03’s entity invariant and Plan 04’s false route outcome plus weak stale-overwrite guard introduce remaining HIGH risks.

---

# Cycle 3 Re-Review (post-replan-2)

**Reviewed at**: 2026-05-22T18:28:00-07:00
**Status**: NOT CONVERGED — 4 new HIGHs

**Cycle-2 HIGH Resolution**

| Cycle-2 HIGH | Status | Evidence |
|---|---|---|
| Plan 03 entity invariant was too broad: “no `&amp;` entities anywhere” | **RESOLVED** | Cycle 3 reframes the invariant to “no transport-encoded envelope tags” and explicitly preserves user-prose entities. |
| Plan 04 false `project_routed: true` when `route_two_slice` silently skipped project writes | **RESOLVED** | `RouteOutcome` / `SliceWriteState` now distinguishes `Written`, `SkippedNoSlice`, `SkippedNoProjectName`, and `IoError`. |
| Plan 04 stale-overwrite guard was underspecified | **PARTIAL** | The strict-subset heuristic is replaced, but the precedence-marker design is not yet integrated safely with all context read/write paths. |

**New HIGH Issues**

- **HIGH:** Precedence metadata will leak into Psyche context unless every read path strips it. Plan 04 prepends `<!-- spt:source=... -->` to `live_context.md` and `projects/<name>/<self_id>.md`, but existing prompt builders read file bodies raw, including [build_current_context_blocks](C:/Users/decid/Documents/projects/claude_skill_owl/src/owl/echo_commune.rs:399), [compose_init_signoff_payload](C:/Users/decid/Documents/projects/claude_skill_owl/src/live/signoff.rs:43), and other `live/context.rs` project readers. That turns routing metadata into model memory and risks re-emission, summary drift, and marker forgery confusion. Add a central “read context body stripping SPT head marker” helper and use it everywhere context files are consumed.

- **HIGH:** Plan 04’s resolver contract is internally contradictory and can reintroduce `projects/tracked`. It says `route_two_slice_outcome` should return `SkippedNoProjectName` when the Plan 01 resolver returns `None`, but also says it should fall back to existing resolution. The existing project router uses `derive_current_repo_names()` in [route_project_slot](C:/Users/decid/Documents/projects/claude_skill_owl/src/owl/echo_commune.rs:315), which is the original `tracked` false-positive source. For wrapper-direct routing, no unsafe cwd fallback should run; use the shared helper only, or use the safe path-based helper with the tracked filter.

- **HIGH:** Signoff two-slice routing remains divergent. [route_two_slice_signoff](C:/Users/decid/Documents/projects/claude_skill_owl/src/live/signoff.rs:94) still resolves projects with `derive_current_repo_names()` and raw `std::fs::write`, with no shared resolver, no `RouteOutcome`, and no precedence guard. The roadmap explicitly lists signoff / stale signoff as blast radius. Leaving this path unchanged means project-context persistence defects can still reproduce outside commune.

- **HIGH:** Several planned tests are not executable where specified. `tests/native_owl.rs` is an integration test crate, so it cannot call `pub(crate)` internals like `event_body_unescape`, `route_inbound_commune_body`, `route_two_slice_with_precedence`, or private marker helpers. Move those tests into inline `#[cfg(test)]` modules beside the code, or expose a deliberate public test harness. As written, the verification plan will not compile.

**Medium / Low**

- **MEDIUM:** Plan 01 says `project_name_from_cwd_path("/nonexistent/path")` returns `None`, but the specified fallback-to-`file_name()` behavior would return `Some("path")` unless existence/canonicalization is required. Clarify that invalid `info.json.cwd` paths do not become synthetic projects.

- **MEDIUM:** Plan 04’s 60-second stale window is operationally plausible, but it is still time-based rather than causally tied to a specific echo fire. At minimum, make the window configurable and log enough data to debug suppress/proceed decisions.

**NOT CONVERGED**

Blocking HIGHs:
1. Precedence markers are not treated as out-of-band metadata on all context read paths.
2. `route_two_slice_outcome` fallback can reintroduce the `tracked` false positive.
3. Signoff routing remains on the old resolver/write path.
4. Required private-helper tests are placed in integration tests and will not compile as written.

---

# Cycle 4 Re-Review (post-replan-3)

**Reviewed at**: 2026-05-22T18:48:00-07:00
**Status**: NOT CONVERGED — 3 new HIGHs

Cycle-4 is closer, but I would not call it converged.

**Cycle-3 HIGH Resolution**
| Cycle-3 HIGH | Status | Rationale |
|---|---|---|
| Precedence metadata leaks into Psyche context | **PARTIAL** | Plan 04 adds `read_context_body_stripped`, but the plan omits `src/live/context.rs` from `files_modified` even though `download_payload` reads `live_context.md` and `projects/<name>/<self_id>.md` into prompt/session injection paths. The audit commands can also false-pass because several use `-SimpleMatch` with regex-looking patterns. |
| `route_two_slice_outcome` fallback can reintroduce `projects/tracked` | **RESOLVED** | Plan 04 now explicitly requires `route_two_slice_outcome` to use only `resolve_self_project_name_via_info_cwd(self_id)`, with `SkippedNoProjectName` as final when it returns `None`, plus a regression test where cwd would otherwise resolve. |
| Signoff routing remains divergent | **RESOLVED** | Plan 04 expands scope to `src/live/signoff.rs`, refactors `route_two_slice_signoff` through `RouteOutcome` + `route_two_slice_with_precedence(..., "direct")`, and adds audit gates for no `derive_current_repo_names` / raw write path. |
| Private-helper tests placed in integration tests | **PARTIAL** | Most pub(crate)-targeting tests are moved inline. Plan 03 still leaves a “public-boundary” end-to-end encoding test in `tests/native_owl.rs` without naming a concrete public API, and gives conflicting fallback guidance about adding a public test harness. This may still be non-executable as written. |

**New Cycle-4 Issues**
- **HIGH:** Precedence marker placement can break existing live-context YAML front-matter parsing. Plan 04 prepends `<!-- spt:source=... -->` as line 1 of `live_context.md`, but `src/live/context.rs` parses stored stamps from file-head YAML before body injection. If the marker becomes the first line, the existing front-matter parser no longer sees YAML at the head. The plan strips markers for prompt body reads, but does not update stamp/front-matter readers to strip or skip the marker first.

- **HIGH:** The marker-leak audit is not reliable enough to prove HIGH 1 closed. `src/live/context.rs` has prompt/session payload reads around `download_payload` for both live and project context, but Plan 04’s `files_modified` and Task 2A file list omit it. The audit command also uses `Select-String -SimpleMatch` with patterns like `read_to_string.*live_context\.md`, so those are literal strings, not regexes. That can report PASS while raw reads remain.

- **HIGH:** Plan 03’s “decoded body at LLM stdin” contract is still underspecified at the actual wrapper boundary. The wrapper currently passes a full `<EVENT ...>{body}</EVENT>` string to `resume_session_with_exit`. The plan says decode only after body extraction, but does not specify a concrete parser/recomposer or whether the LLM receives only the decoded body or a full EVENT with decoded inner tags. If it is the latter, literal user `</EVENT>` in body becomes structural-looking in the prompt layer despite the code-level framing parse being complete.

- **MEDIUM:** `read_route_guard_window_secs()` reads `.planning/config.json` relative to process cwd. Live wrappers run from the psyche/runtime cwd, not necessarily the repo root, so the configured window may silently default to 60 in deployed use. Use an explicit config path under SPT_HOME or document that this knob is planning/test-only.

- **MEDIUM:** Plan 01’s nonexistent-path clarification is only partly tested. `/nonexistent/path` returns `None`, but a stale path under an existing git repo, e.g. `<repo>/deleted/subdir`, may still resolve via parent `.git` before `path.exists()` is checked. If “nonexistent paths never resolve” is the intended contract, existence must be checked before the git walk or tested for stale children.

**NOT CONVERGED** + blockers: marker stripping/read-site coverage is incomplete and audit gates can false-pass; marker prepending may break live-context YAML front-matter parsing; Plan 03 still lacks a concrete safe decoded-at-LLM-boundary implementation contract.
