# Project Retrospective

Living document — appended at each milestone boundary. Newest milestone first.

## Milestone: v1.7 — Seamlessification

**Shipped:** 2026-05-16
**Phases:** 5 (26, 27, 28, 29, 30) | **Plans:** 26 | **Commits:** 211 | **Diff:** 422 files / +39,141 / -1,851 lines
**Deployed:** v1.10.10 (cplugs HEAD c7d502f)
**Timeline:** 7 days (2026-05-10 → 2026-05-16)

### What Was Built

Friction removal across the live-agent lifecycle. Five phases, each targeting a specific seam where context or control was leaking:

1. **Agent picker** (Phase 26) — `/spt:live` no longer requires an explicit id; auto-launches single agent, fires AskUserQuestion otherwise.
2. **Streaming poll** (Phase 27) — Listener moved from per-message Bash spawn to Monitor-tool stream mode with `<EVENT>` envelopes.
3. **SessionStart psyche-context** (Phase 28) — Live perches auto-emit memformat + context records on `/clear` & `/compact`.
4. **Auto echo-commune** (Phase 29) — Self-side delta is captured at every boundary (`/clear`, `/compact`, orphan-detect); legacy prose envelope replaced by typed `<EVENT type="echo_commune">`.
5. **File-drop protocol** (Phase 30) — `/spt:commune` and `/spt:signoff` skills write `.claude/{id}-{commune,signoff}.md`; wrapper consumes destructively; listener exits cleanly on signoff via `FileDropOutcome::BreakLoop`.
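
The typed envelope that replaced the prose format can be sketched as follows. This is a minimal illustration, not the shipped parser: the `EventType` enum and `parse_event_type` function are hypothetical names, and the sketch assumes double-quoted attribute values on a single-line opening tag. Only the four `type` values come from this document.

```rust
/// Event kinds named in this retrospective. The enum itself is a
/// hypothetical illustration, not the project's actual type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum EventType {
    Msg,
    Alarm,
    EchoCommune,
    FileDrop,
}

impl EventType {
    /// Case-insensitive match on the value of the `type` attribute.
    pub fn from_attr(value: &str) -> Option<Self> {
        match value.to_ascii_lowercase().as_str() {
            "msg" => Some(Self::Msg),
            "alarm" => Some(Self::Alarm),
            "echo_commune" => Some(Self::EchoCommune),
            "file_drop" => Some(Self::FileDrop),
            _ => None,
        }
    }
}

/// Extract the event type from an opening `<EVENT ...>` tag.
/// Anything that is not a well-formed typed envelope yields `None`;
/// there is deliberately no fallback that tries to parse prose.
pub fn parse_event_type(line: &str) -> Option<EventType> {
    let rest = line.trim_start().strip_prefix("<EVENT")?;
    // Require whitespace after the tag name so `<EVENTS ...>` is rejected.
    if !rest.starts_with(char::is_whitespace) {
        return None;
    }
    let attrs = &rest[..rest.find('>')?];
    attrs
        .split_whitespace()
        // Match the whole attribute name, so `subtype="..."` is ignored.
        .find_map(|tok| tok.strip_prefix("type=\"")?.strip_suffix('"'))
        .and_then(EventType::from_attr)
}
```

Returning `None` for unknown or untyped input mirrors the choice to cut legacy envelopes over per release rather than fallback-parse them.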

### What Worked

- **Wrapper-as-dispatcher pattern.** Phase 29's `FIRE_ECHO_COMMUNE_NOW` and Phase 30's `file_drop` envelope both reused the existing inner-poll arm dispatch. New control messages slot in without spawning new processes — single source of truth for what runs in Psyche's session, and the source-order test on the wrapper kept the arms ordered correctly.
- **Shared payload helper landed before its consumers.** Phase 28's `download_payload(self_id)` was extracted as a single helper feeding both CLI and SessionStart injection paths; Phase 30 plan 30-03 then appended Pending sections to the same helper without touching either call-site. Refactor-then-extend cleaned up faster than copy-paste-then-reconcile would have.
- **Defense-in-depth via off-roadmap sub-iterations.** When Phase 30 surfaced two latent defects post-status:passed (wrapper-signoff-doesnt-break-loop, stale-signoff-fires-on-next-session-start), the fixes landed in-tree before milestone audit with regression tests. Treating these as `.planning/debug/` sessions rather than new phases kept the v1.7 surface clean.
- **Live dogfooding caught the long-tail bugs.** The doyle agent's gen43→gen44→gen45 transitions across v1.9.11..v1.10.10 deploys exercised Monitor-tool poll, file-drop consume, and INIT_SIGNOFF predicate fix under real conditions. Each gen handoff was a live UAT.
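
A rough sketch of the wrapper-as-dispatcher idea: every control message is one arm of a single `match` inside the inner poll, so a new message type adds an arm rather than a process. Only `FileDropOutcome::BreakLoop` is named in this document; `Control`, `DropKind`, `FileDropOutcome::Continue`, and `run_inner_poll` are hypothetical names chosen for illustration.

```rust
/// `BreakLoop` is named in this retrospective; `Continue` is an assumed counterpart.
#[derive(Debug, PartialEq, Eq)]
pub enum FileDropOutcome {
    Continue,
    BreakLoop,
}

/// Hypothetical drop-file kinds, after the commune and signoff skills.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DropKind {
    Commune,
    Signoff,
}

/// Hypothetical control messages seen by the wrapper's inner poll.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Control {
    Msg(String),
    FireEchoCommuneNow,
    FileDrop { kind: DropKind },
}

fn handle_file_drop(kind: DropKind) -> FileDropOutcome {
    match kind {
        DropKind::Commune => FileDropOutcome::Continue,
        // A signoff drop ends the loop so the listener can exit cleanly.
        DropKind::Signoff => FileDropOutcome::BreakLoop,
    }
}

/// Single dispatch point: returns a log of the actions it handled.
pub fn run_inner_poll(queue: impl IntoIterator<Item = Control>) -> Vec<String> {
    let mut log = Vec::new();
    for ctl in queue {
        match ctl {
            Control::Msg(body) => log.push(format!("msg:{body}")),
            Control::FireEchoCommuneNow => log.push("echo_commune".to_string()),
            Control::FileDrop { kind } => {
                log.push(format!("file_drop:{kind:?}"));
                if handle_file_drop(kind) == FileDropOutcome::BreakLoop {
                    break;
                }
            }
        }
    }
    log
}
```

In this sketch, anything queued after a signoff drop is never handled, which is the behavior the wrapper-signoff-doesnt-break-loop fix was meant to guarantee.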

### What Was Inefficient

- **Phase 27 VERIFICATION.md was not produced at deploy time.** Phase 27 shipped 2026-05-11 with `27-07-DEPLOY-NOTES.md` standing in for verification artifacts; the canonical VERIFICATION.md had to be backfilled at the v1.7 milestone audit (2026-05-16). The phase was substantively verified — but the doc gate didn't enforce itself. Future deploys should treat VERIFICATION.md as a hard exit criterion for any plan tagged "complete".
- **REQUIREMENTS.md milestone table lagged behind phase work.** AUTO-EC-01..03 (Phase 29) and SC-01..06 (Phase 30) were defined inside their phase artifacts but never propagated to the top-level traceability table until v1.7 close. The milestone audit caught it; a better approach is roadmap-time enforcement that every milestone phase has its REQ-IDs surfaced in REQUIREMENTS.md before execute-phase runs.
- **Stale quick-task tail.** 38 pre-v1.7 quick tasks accumulated `status: missing` because they used the `{slug}-SUMMARY.md` filename convention instead of the bare `SUMMARY.md` the audit tool expects. Because of that mismatch, a one-line `status: complete` frontmatter was effectively invisible to the audit. Resolved at milestone close with a bulk write, but the underlying convention drift still needs to be standardized.
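
One way to make the audit tolerant of both summary filename conventions is sketched below. It does not reflect how the audit tool is actually written: `is_summary_file` and `frontmatter_status` are hypothetical helpers, and the frontmatter is assumed to be a leading block fenced by `---` lines that contains a `status:` key.

```rust
/// True for either convention: bare `SUMMARY.md` or `{slug}-SUMMARY.md`.
pub fn is_summary_file(file_name: &str) -> bool {
    match file_name.strip_suffix("SUMMARY.md") {
        Some("") => true,
        // Require a non-empty slug followed by a hyphen.
        Some(prefix) => prefix.len() > 1 && prefix.ends_with('-'),
        None => false,
    }
}

/// Read the `status:` value from a leading `---` frontmatter block, if any.
pub fn frontmatter_status(doc: &str) -> Option<&str> {
    let mut lines = doc.lines();
    if lines.next()?.trim() != "---" {
        return None;
    }
    for line in lines {
        let line = line.trim();
        if line == "---" {
            break; // end of frontmatter, no status key found
        }
        if let Some(value) = line.strip_prefix("status:") {
            return Some(value.trim());
        }
    }
    None
}
```

Standardizing on one filename would make the first helper unnecessary; the sketch only shows that accepting both is cheap while the drift exists.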

### Patterns Established

- **Typed event envelopes over prose.** `<EVENT type="msg|alarm|echo_commune|file_drop">` is the new contract for poll-stream and inter-agent dispatch. Each envelope carries provenance and descriptor attributes, and predicates match case-insensitively. Legacy prose envelopes are cut over per release rather than fallback-parsed.
- **Destructive consume with write-then-delete ordering.** Drop files (`commune.md`, `signoff.md`) and the on-disk psyche-download append both follow the same order: the append must succeed on disk, and only then is the source unlinked. Retain-on-error keeps the drop file alive across transient FS failures. The loser of a race between `scan_drop_files` and `append_pending_sections` skips silently on `ErrorKind::NotFound`.
- **Wrapper-driven, hook-fire-and-return.** SessionStart hook does not block on subprocess work. It snapshots state (e.g. prior session UUID), dispatches a control message to the wrapper, and returns. The wrapper's inner-poll handles the heavy lifting on its own cadence. Keeps the hook fast and decouples Self's session lifetime from Psyche's processing latency.
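
The write-then-delete ordering can be sketched with the standard library alone. This is an illustration under assumptions, not the code behind `scan_drop_files` or `append_pending_sections`: `consume_drop` and the `Consume` enum are hypothetical names, and the drop file is assumed small enough to read into memory.

```rust
use std::fs::{self, OpenOptions};
use std::io::{self, ErrorKind, Write};
use std::path::Path;

/// Hypothetical result type for one consume attempt.
#[derive(Debug, PartialEq, Eq)]
pub enum Consume {
    Appended,
    Skipped,
}

/// Append the drop file's contents to `download`, then unlink the drop file.
pub fn consume_drop(drop_file: &Path, download: &Path) -> io::Result<Consume> {
    // Race loser: another consumer already took the file, so skip silently.
    let body = match fs::read(drop_file) {
        Ok(bytes) => bytes,
        Err(e) if e.kind() == ErrorKind::NotFound => return Ok(Consume::Skipped),
        Err(e) => return Err(e),
    };

    // Append first. Any error returns early and leaves the drop file
    // in place, so a later pass can retry (retain-on-error).
    let mut out = OpenOptions::new().create(true).append(true).open(download)?;
    out.write_all(&body)?;
    out.sync_all()?;

    // Only after the append has reached disk is the source unlinked.
    match fs::remove_file(drop_file) {
        Ok(()) => Ok(Consume::Appended),
        // Already removed by someone else; our append still succeeded.
        Err(e) if e.kind() == ErrorKind::NotFound => Ok(Consume::Appended),
        Err(e) => Err(e),
    }
}
```

The sketch accepts that two racing consumers could both append before either unlinks; it makes no claim about how the real implementation prevents that.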

### Key Lessons

- **A doc gate that doesn't fail loudly is no gate.** Phase 27 shipped without VERIFICATION.md because nothing blocked the deploy. Future: `gsd-sdk` should refuse to mark a phase complete if VERIFICATION.md is missing.
- **REQ-ID surface area is part of phase scope, not bookkeeping.** When a phase's REQ-IDs only live in phase docs and not REQUIREMENTS.md, downstream tooling (audit, traceability) can't find them. Treat REQUIREMENTS.md propagation as a Wave 1 task, not a milestone-close chore.
- **Live dogfooding > integration tests, when the system is the agent.** Several Phase 29/30 bugs were impossible to reproduce in subprocess integration tests because they depended on real wrapper-handoff timing across binary versions. Investing in doyle/dunsen/todlando perches as continuous live UAT pays compounding returns.
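
The first two lessons both describe gates that should fail loudly. A minimal sketch of what such checks could look like is below. It is not `gsd-sdk`'s actual interface: `check_phase_complete`, `missing_req_ids`, and `GateError` are hypothetical, the verification file is assumed to sit directly in the phase directory under the name `VERIFICATION.md`, and REQ-IDs are matched by plain substring search.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical error type for a failed completion gate.
#[derive(Debug, PartialEq, Eq)]
pub enum GateError {
    MissingVerification(PathBuf),
}

/// Refuse to treat a phase as complete unless VERIFICATION.md exists.
pub fn check_phase_complete(phase_dir: &Path) -> Result<(), GateError> {
    let doc = phase_dir.join("VERIFICATION.md");
    if doc.is_file() {
        Ok(())
    } else {
        Err(GateError::MissingVerification(doc))
    }
}

/// Return the phase REQ-IDs that do not appear in REQUIREMENTS.md.
/// Substring matching is a simplification: `SC-01` would also match `SC-010`.
pub fn missing_req_ids<'a>(requirements_md: &str, phase_ids: &[&'a str]) -> Vec<&'a str> {
    phase_ids
        .iter()
        .copied()
        .filter(|id| !requirements_md.contains(*id))
        .collect()
}
```

Returning an error rather than logging a warning is the point: a caller has to handle the failure before it can mark anything complete.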

### Cost Observations

- Model mix during v1.7: mostly Opus 4.6/4.7 for planning + executor work, Sonnet 4.6 for routine plans, Haiku 4.5 for `gsd-sdk` deterministic queries.
- Sessions: doyle perch alone went through 3 generations (gen43→gen44→gen45) during this milestone.
- Notable: Phase 30 plan 30-01 (Self listener scan + file_drop emit) was the highest-value-per-token plan — single primitive that unlocked the whole file-drop flow downstream.

## Cross-Milestone Trends

| Milestone | Phases | Plans | Days | Theme |
|-----------|--------|-------|------|-------|
| v1.0 MVP | 4 (1, 2, 2.1, 2.2) | 11 | ~3 | Bash live-agent foundation |
| v1.1 Native | 7 (5-9.2) | 24 | 1 | Rust port + golden-test parity |
| v1.5 Spacetime | 8 (10-14.3) | 26 | 9 | Plugin migration + persistent perches + memformat |
| v1.6 MCP Fabric | 4 (15-18) + 10 sub-iter | 13 | 3 (+ ~25 in 18.x) | MCP transport spike → deprioritized → hooks/CLI-first refactor |
| v1.7 Seamlessification | 5 (26-30) | 26 | 7 | Live-agent lifecycle friction removal |

**Notable trends:**
- Sub-iteration count is rising (Phase 18.1..18.8.1 spanned ~3 weeks post-v1.6 ship). Treat as a signal that "milestone ship" is not the same as "feature stable".
- Plan count per phase has stabilized at ~4-6 plans. Phases at or above the top of that range (Phase 27 at 7, Phase 26 at 6) correlate with mid-execution scope discovery, which is worth pre-empting with a sharper discuss-phase.
- LOC velocity (~5,500 net lines added per milestone) suggests we're still in additive growth territory, not yet maintenance mode.

---
*Created: 2026-05-16 at v1.7 close*
