# Domain Pitfalls — v1.7.1 Seamlessification II

**Domain:** SPT polish round — `list` overhaul, argument-hint coverage, fresh-start commune flow, pick-spec correctness, `/spt:live --auto` + casual-language activation, plugin version-change changelog notification
**Researched:** 2026-05-16
**Confidence:** HIGH (codebase-grounded — PROJECT.md, STATE.md decisions, DEPLOY.md, pick_spec.rs, list.rs, hooks.json, every SKILL.md frontmatter)

This file is opinionated for v1.7.1 only. Generic Rust pitfalls (panic safety, unwrap discipline, etc.) are out of scope — the codebase already has those reflexes baked in (see Phase 18.7.1 trace gating, Phase 18.8.1 ExcerptSource enum).

---

## Critical Pitfalls

### CP-1 [item 6]: Version-sentinel race against binary handoff (Phase 18.4/18.5 class)

**What goes wrong:** Version-change changelog notification reads "current-version" from a sentinel file and compares against `plugin.json` (or the running binary's compiled-in version). During a marketplace deploy + handoff cycle, three actors touch state nearly simultaneously:

1. DEPLOY.ps1 step 8 atomically rewrites `installed_plugins.json` to flip `version` + `installPath`.
2. The old `owl.exe` listener detects the flip and self-migrates via stdio-relay (Phase 18.4 trampoline).
3. The new `owl.exe`'s `plugin-session-start` hook fires on the *next* SessionStart and would write the version sentinel.

If the sentinel write happens in step 2 (old binary, reading new `plugin.json` because it sources from `%LOCALAPPDATA%\spt` via `SPT_HOME`, which is global), it gets clobbered by step 3. If it happens in step 3 only, every cold session re-detects "new version" and re-prompts forever. If it lives in `%LOCALAPPDATA%\spt\` it survives `/clear` (good) but is shared across all repos (so a multi-repo user prompting in repo A pre-marks repo B's session as "seen", suppressing the prompt the user *wanted* in repo B).

**Why it happens:** SPT has TWO independent version-of-record concepts already: `plugin/spt/.claude-plugin/plugin.json` (manifest) and the compiled-in `CARGO_PKG_VERSION` baked into `owl.exe`. During handoff, they can briefly diverge (old `owl.exe` v1.10.8 reading new v1.10.10 `plugin.json`). DEPLOY.md §"What it does NOT do" calls out that killing `owl.exe` mid-handoff is forbidden — the same constraint applies to *any* version-comparison hook that writes state mid-handoff.

**Consequences:** Either silent (sentinel never advances → prompt every session = prompt fatigue) or wrong-version (sentinel records "I told user about v1.10.10" while user only ever saw v1.10.8 changelog → silently skips next bump's changelog).

**Prevention:**
- Single source of truth: compare against `env!("CARGO_PKG_VERSION")` (compiled-in), NOT `plugin.json` parsed at runtime. The compiled-in version is the binary the user is actually running; the manifest is a deploy-time artifact.
- Sentinel path: `%LOCALAPPDATA%\spt\.last-seen-version` (NOT `~/.claude/` — that's volatile per Phase 18.2 D-13).
- Atomic write via `common::owlery::atomic_write_string` (already exists, used at 5+ sites since Phase 18.8 P01). Tmp-then-rename, never bare `fs::write`.
- Only the **top-level SessionStart** hook writes the sentinel (NOT `SubagentStart`, NOT `hook-prompt`). One write site = no race.
- Defer write until **after** user has seen the prompt. Guard with a "shown" flag stored in the injected SessionStart context, OR write on next PreToolUse after a SessionStart-injected prompt fires — i.e., "user actually reached the prompt" gate.

**Detection:** Integration test: simulate handoff sequence (set `OWL_HANDOFF_CHILD=1`, spawn old then new binary, assert sentinel file is written exactly once with the new binary's `CARGO_PKG_VERSION`). Reuse `tests/handoff_integration.rs` harness.

**Phase to address:** Phase that builds item 6 (likely last in v1.7.1). Block on CP-2 being resolved first.

---

### CP-2 [item 6]: First-install / manual-install / fresh-clone — no prior sentinel exists

**What goes wrong:** A user who installs SPT for the first time has no `.last-seen-version` file. Naïve implementation reads file, compares to current version, finds mismatch ("file missing" treated as "old version") → fires changelog prompt for v1.10.11 on first session ever. User has zero context for the diff, has not used any prior version. Prompt is noise.

Likewise: a user who manually installed via `/plugin install spt@cplugs` from scratch (no prior versions cached) gets a "you upgraded from <empty> to v1.10.11" prompt that is technically true but useless.

**Why it happens:** "Missing sentinel" is ambiguous: it could mean (a) fresh install, never-seen-before, or (b) user nuked `%LOCALAPPDATA%\spt\` to reset state. Both are common.

**Consequences:** First-run friction. Users learn to dismiss the prompt, then dismiss the *real* prompt when v1.10.12 ships. Boy-who-cried-wolf class regression.

**Prevention:**
- On missing sentinel: write the CURRENT version silently as if already-seen. Do NOT prompt. The first changelog prompt fires on the *first upgrade after install*, not on install itself.
- Add a one-line "Welcome to SPT v{}" stderr emit at top-level SessionStart when sentinel is created from missing — informational, not blocking, not an `AskUserQuestion`.

**Detection:** Test: clean `SPT_HOME`, run `plugin-session-start` with stdin source=`startup`, assert (a) sentinel file exists post-run, (b) no `additionalContext` payload contains "changelog" or "What's new".

**Phase to address:** Same phase as CP-1.

---

### CP-3 [item 5]: SessionStart auto-surface firing on subagent sessions

**What goes wrong:** Item 5 wires SessionStart to auto-surface `pick-spec` JSON. If wired naively into the `plugin-session-start` hook arm, it fires on **every** Claude Code session including spawned subagents (Task tool, GSD researcher agents — like the one writing this file).

Subagents are short-lived, do not need a live agent, and currently spawn under a top-level agent that may already be live. Surfacing pick-spec to a subagent triggers: (a) clutter in the subagent's context, (b) potential auto-launch under `--auto` of a redundant Self perch in the same repo (COLLISION risk against parent's perch).

**Why it happens:** Claude Code's SessionStart hook fires for top-level sessions only — `SubagentStart` is a separate hook (already wired in `hooks.json:53` to `hook-subagent-start`). So in principle the surface IS top-level. BUT: `plugin-session-start` ALSO fires on `--resume` cold-starts of subagent sessions in some Claude Code versions, AND the GSD-researcher and live-agent Psyche-wrapper flows both invoke `claude -p` subprocesses that re-enter SessionStart with `source=startup`.

The classifier needs to refuse auto-surface when:
- The session has `OWL_HANDOFF_CHILD=1` set (it's a handoff child of an existing live agent — see Phase 18.5 Plan 01).
- The session is the Psyche wrapper itself (`SPT_PSYCHE_WRAPPER=1` or detect by cwd matching `psyches/tracked/<id>/` per Phase 18.2/24).
- `CLAUDE_PLUGIN_ROOT`'s parent session is a tracked-perch-bound parent_pid (subagent path).
- Stdin `source` is not `startup` — `clear`/`compact`/`resume` all already get re-orientation per Phase 28; the auto-surface should layer ON TOP of `startup` only.

**Consequences:** Surface fires inside subagents → either user is confused why their researcher agent asks them to "pick a live agent" mid-research, or `--auto` engages and tries to `$LIVE start doyle` while parent already owns the doyle perch → COLLISION error in subagent context → researcher returns "blocked" to parent unnecessarily.

**Prevention:**
- Allow-list approach: auto-surface fires ONLY when source=`startup` AND no `OWL_HANDOFF_CHILD` AND no `SPT_PSYCHE_WRAPPER` AND `parent_pid`-of-this-process is not registered as a tracked perch under `owlery::enumerate_perches()`.
- Add a `--top-level-only` predicate helper in `common::resolve` or new `common::session_classify`. Single source of truth.
- Reject silently (no stderr, no `additionalContext`) when allow-list fails. Stderr at this surface leaks into subagent transcripts.

**Detection:** Integration test seeded with `OWL_HANDOFF_CHILD=1` and another with a parent-perch present in owlery; assert `plugin-session-start` stdout/stderr contains NO pick-spec JSON and NO "no live agent" prompt text.

**Phase to address:** Phase that builds item 5. Same phase as CP-4.

---

### CP-4 [item 5]: Casual-language false-positives — "resume" detection misfires

**What goes wrong:** Item 5 activates `--auto` flow on casual prompts like "pick up where we left off", "keep going", "resume work". This is pattern-based intent detection on `UserPromptSubmit` (or first-prompt context). The patterns are easy to spoof unintentionally.

**Concrete false-positive scenarios (downstream consumer asked for these explicitly):**

1. **Paste of conversation log:** User pastes a transcript that contains the literal string `let's keep going` (an Assistant turn from yesterday's chat). Detector fires `--auto`. Wrong action triggered before user even types their actual prompt.
2. **Documentation reference:** User asks "in the spt docs, what does 'resume work' mean?". The substring `resume work` matches the trigger pattern.
3. **Code paste:** User pastes a `git log --oneline` output containing the word "resume" (e.g., commit `feat: resume after sleep`). Triggers `--auto`.
4. **Quoted question:** "Should I say 'pick up where we left off' to the agent?" — the quoted form matches.
5. **Multi-repo carryover:** User says "keep going" in repo A meaning "continue the current task you were JUST doing in this session" (no live agent intent). Detector fires `$LIVE start` for the most-recent agent in repo A's history — perhaps a stale agent unrelated to current session work.
6. **Cross-language false-friends:** "Let's continue" matches; "continue;" in pasted C code matches; "continue" inside a Python `for` loop discussion matches.
7. **SessionStart with `source=resume`:** Claude Code's built-in `--resume` flag passes `source: "resume"` into the SessionStart hook. The casual-language detector might pattern-match on the source token itself ("resume") rather than user prompt text — wrong layer entirely.

**Concrete false-negative scenarios:**

1. User says "go" (single word). Intent is unambiguous in context (just finished a meal-break) but doesn't match any pattern.
2. User says "ready when you are" — same intent, no keyword overlap.
3. Non-English users: "continúa" / "weitermachen" / variations.
4. User says "resume" (one word) at a fresh session start. Pattern-matches, but ambiguous — resume *what*?

**Why it happens:** Natural language intent detection without LLM reasoning is brittle. SPT's design philosophy (Touch as pure Rust, no LLM in supervisor paths) bumps against the fundamentally LLM-shaped problem of "what did the user mean".

**Consequences:** False-positive = wrong agent activated, wrong work resumed, possibly destructive (e.g., resumed a stale `/gsd-debug` session that opens debug files the user has already moved on from — Phase 18.6 had a real-world incident where MISSED_PULSES triggered wrong-agent `/gsd-plan-phase 19` execution; same class of bug). False-negative = feature feels broken, user falls back to typing `/spt:live` manually anyway.

**Prevention:**
- **Layered detection, not pattern-only.** Three gates that all must pass:
  1. User prompt is SHORT (< 80 chars). Pastes/quoted-docs are usually longer.
  2. User prompt does NOT contain `'` `"` `` ` `` (quote chars suggesting a quote/paste).
  3. User prompt matches the keyword in standalone-word position (regex with word boundaries, NOT substring `contains`).
- **Confirmation hop, not auto-fire:** Even on match, *surface* the candidate ("Resume `doyle` (last active 2h ago)?") via `AskUserQuestion` — never silently fire `$LIVE start`. The "auto" in `--auto` should mean "auto-select candidate" not "auto-execute action".
- **Decouple from GSD entirely.** The downstream consumer flagged "Next obvious body of work" without GSD coupling — DO NOT have item 5 read GSD state or attempt to identify pending phases. That couples two systems and risks firing `/gsd-execute-phase` or build commands on prompts that look like resume. Item 5's domain is "which live agent should I resume" — full stop. The agent itself decides what work to do next.
- **Top-level only** (see CP-3).
- **`source=resume` carve-out:** When Claude Code's own `--resume` flag fires SessionStart with `source: "resume"`, the casual-language detector is OFF entirely — user already resumed an actual session, no need to also resume a live agent.

**Detection:** Unit-test corpus of 30+ prompts (15 should-fire, 15 should-NOT-fire) covering each false-positive scenario above. Use `tests/casual_language_corpus.rs` or similar. Pin the corpus — every new false-positive bug gets a regression case added.

**Phase to address:** Phase that builds item 5.

---

## Moderate Pitfalls

### MP-1 [item 1]: `list` default-change discovery loss + downstream grep callers

**What goes wrong:** Today `$OWL list` and `$LIVE list` show online + offline in one combined view (online first, offline second dimmed). Changing default to online-only means offline perches DISAPPEAR from the default view. Users who relied on `$OWL list` to find "that offline doyle perch I want to revive" suddenly see nothing and assume the perch is gone — but the data is intact, they just need `--all` or `--offline`.

Additionally: **hooks, tests, and skills MAY grep `$OWL list` output**. Three concrete grep-callers:

- `tests/golden_owl.rs` — golden test diffs against `tests/golden/` fixtures. Any default-output change requires regenerating fixtures via `scripts/capture_golden.sh`.
- `src/owl/doctor.rs` — invoked by skills as `$OWL doctor`; if it shells out to or invokes `list_result()`, the default may flow through.
- `plugin/spt/skills/list-ready/SKILL.md` line 17: "Shows active listeners with pending message count. Automatically cleans stale perches." — that doc claim of "active listeners" already implies online-only; the SKILL.md is *already* aligned with the new default. `list-live/SKILL.md` says "Shows active live agents... Cleans stale entries" — similarly aligned.

**Why it happens:** Default-changes are silent breakage for the subset of callers that depend on the old default. Phase 18.7 had a parallel hard-cutover (CLI: `LiveCommands::TimedPulse` → `Commands::NewAlarm`) and they explicitly preserved old test fn names with explanatory comments — same discipline applies here.

**Consequences:**
- Golden test failures across the board on first run after the change.
- Users surprised that "their" offline perches are missing → file false "data loss" bugs.
- Doctor / reboot / revive workflows that previously relied on a comprehensive list lose their pre-flight check.

**Prevention:**
- Grep audit BEFORE changing default: `rg "OWL.*list|LIVE.*list" plugin/spt/ src/ tests/ scripts/ docs/` — every hit must be classified as "expects online-only" / "expects all" / "doesn't parse output". Document the audit table in the phase plan.
- Regenerate golden fixtures and commit them in the same atomic commit as the default change. Don't split across commits — bisect would land on a half-broken state.
- The `run()` (stderr-printing) and `list_result()` (structured) variants currently both accept `online_only: bool`. Audit ALL `list_result` callers — `src/common/outcomes.rs` consumers, MCP server (`src/owl/mcp.rs` or wherever it lives), `$OWL doctor`. Any caller passing `false` today expecting full list still gets full list. Only the CLI default flips.
- **Migration messaging** in CHANGELOG / Phase SUMMARY.md: "BREAKING (UX): `$OWL list` no longer shows offline by default. Use `--all` for previous behavior. Use `--offline` for offline-only. `--here` filters to current repo's cwd."
- Stderr emit in the empty-online case: "No online listeners. Pass `--all` to include offline perches (N offline)." — discoverability gift.

**Detection:** Run `cargo test` BEFORE and AFTER. Diff the golden output. Run `$OWL list` in a repo with a known mix of online + offline perches and confirm `--all`, `--offline`, `--here` each show the expected subset.

**Phase to address:** Phase that builds item 1. First phase in v1.7.1 — lowest dependency.

---

### MP-2 [item 2]: argument-hint frontmatter regressions break skill loading entirely

**What goes wrong:** Claude Code parses SKILL.md frontmatter as YAML. A malformed `argument-hint:` value (unquoted special chars, wrong type, conflict with `description`) can cause Claude Code to silently skip loading the skill. Symptom: `/spt:foo` no longer appears in the slash-command picker.

**Concrete failure modes observed in the existing skill set:**

The current frontmatter uses three inconsistent shapes:
- `argument-hint: <id> [--period <seconds>]` (bare angle brackets, unquoted — works because YAML treats `<...>` as a plain scalar)
- `argument-hint: "[<id>] | --all"` (quoted because `|` is a YAML special char — folded-scalar marker)
- `argument-hint: ""` (empty string, quoted — used by `clear-psyche`, `signoff`, `psyche-download`)
- `argument-hint: (no arguments)` (`whoami` — bare parens, will parse as plain string)
- `argument-hint: <time_spec> -- <message>` (`new-alarm` — bare `--` is fine but could confuse a YAML linter)
- `argument-hint: <msg>` (commune — unquoted; if a future hint contains `:` or `#` it would break)

**Pitfalls when ADDING hints to remaining skills (currently MISSING argument-hint: `list-ready`, `list-live`, `list-psyche`):**

1. **`commune` already has `<msg>`** — but if rewritten to mention multi-line bodies and uses `:`, e.g. `argument-hint: <msg> (multiline ok: heredoc)`, the colon inside parens parses as a YAML key. Quote it.
2. **YAML `#`** — any hint containing a `#` (e.g., `<count> # of items`) becomes a comment-truncated string.
3. **`description: |` block-scalar conflict** — `live/SKILL.md` uses `description: |` as a multi-line block. If `argument-hint` is placed BEFORE `description: |` and YAML indentation is off, the parser folds the hint into the description.
4. **`argument-hint:` vs `description`** — some Claude Code versions display argument-hint inline; description in tooltip. If the same content is duplicated across both, the tooltip becomes redundant. Conversely, if `argument-hint` is set but `description` omits any argument shape, the LLM may not learn what args mean.
5. **Empty string vs missing key** — `argument-hint: ""` is documented in Phase 30 Plan 04 as the explicit no-args convention (T-30-24 mitigation: "key NOT deleted outright per schema-safety policy"). Omitting the key entirely *might* work, but the project has already committed to the empty-string convention. Follow precedent.

**Why it happens:** YAML is full of footguns. Claude Code's frontmatter parser is undocumented in terms of strictness; safest assumption is "strict YAML 1.2".

**Consequences:** A typo in ONE skill's frontmatter might break ONLY that skill. A typo in `commune/SKILL.md` would break the live-agent workflow. Worst case: a typo silently breaks the skill while CI passes (no automated frontmatter validator exists).

**Prevention:**
- **Frontmatter linter test:** add `tests/skill_frontmatter.rs` (already pattern-matches existing test layout). For each `plugin/spt/skills/*/SKILL.md`, parse the frontmatter with `serde_yaml` (already implicit via serde_json — check Cargo.toml, may need adding) OR a hand-rolled tolerant parser. Assert:
  - `name:` exists, matches dir basename
  - `description:` exists, non-empty
  - `argument-hint:` exists (this is the new v1.7.1 invariant)
  - `allowed-tools:` is a YAML array
- **Convention table** in the phase plan: every hint MUST be either a quoted string `"..."` OR a bare-word using only `<>[]a-zA-Z0-9 _-`. No `:`, no `#`, no `|`, no `{}`, no `&`/`*`/`!` anchors.
- **Deploy-gate check:** add to DEPLOY.ps1 step 3 (after build, before sync) a frontmatter validation pass. Fail-fast before pushing broken skills to marketplace.

**Detection:** New test. Manual: `/reload-plugins` after every frontmatter edit, then run `/spt:` and confirm every skill appears in the picker.

**Phase to address:** Phase that builds item 2. Low-risk if test gate exists; high-risk if "just edit the frontmatter" approach.

---

### MP-3 [item 4]: pick-spec v1 schema break risk — Phase 26 froze it

**What goes wrong:** Item 4 fixes a pick-spec bug where live-only repos (all known agents are online) emit `kind:"prompt-new"` instead of returning the online agents as offerable (so user can `fork`). Naïve fix: add a new field like `live_in_repo: [...]` to the existing `pick` or `prompt-new` JSON. This BREAKS the v1 schema contract.

Per `src/live/pick_spec.rs` lines 16-21 (verified read):
> Schema is frozen at v1; future versions add new `kind` values, never re-shape existing ones.

And per STATE.md Decisions:
> 26-03 ships pick-spec JSON contract frozen at v1; new kinds add variants, never re-shape existing

**Why it happens:** The `/spt:live` SKILL.md dispatches on `kind` via a fixed table (lines 159-167 of live/SKILL.md). Adding fields to existing `pick`/`prompt-new` shapes works IF the SKILL stays unaware of them (forward-compat), but the SKILL's dispatch table SPECIFICALLY handles `live: true` only in the `resolve` shape. Adding `live: true` to `pick` breaks the implicit "pick = offline-only candidates" contract.

**Consequences:**
- Existing `/spt:live` SKILL — if not updated in lockstep — will dispatch on the new shape using old rules, and silently `$LIVE start` an agent that's actually live → COLLISION error surfaced to user.
- Other callers of pick-spec (none today, but the schema is documented as a public contract → any external script depending on v1 shape breaks).
- Hidden in tests: `src/live/pick_spec.rs` has 6+ `build_spec` tests asserting EXACT field shapes via `assert_eq!`. Schema change cascades through all.

**Prevention:**
- **Add a new `kind`**, not new fields. Possibilities: `kind:"all-live"` (every known agent is currently live, user must fork or pick something new), or extend `resolve` (already has `live: bool`) to be the SOLE place `live` state is reported.
- **Bump schema** if a true breaking change is needed: add top-level `"schema_version": 2` field to all kinds, and have the SKILL dispatch on `(schema_version, kind)`. Old `owl.exe` callers seeing `schema_version: 2` know to fail-fast with a clear error rather than silently mis-dispatch.
- **The "live-only repos" fix is probably orthogonal to pick-spec:** the real bug is that `build_pick_spec()` filters out online agents (line 149: `.filter(|e| !owlery::is_perch_online(&e.id))`), and when ALL agents are online, the result is `prompt-new` (lying — agents DO exist, they're just live). A cleaner fix: return a NEW kind `kind:"all-live"` with a `live: [{id, last_active}]` array, instructing the SKILL to surface "All known agents are live — fork one or start fresh?". That's additive, not breaking.

**Detection:** Test invariant: every existing test that asserts `kind in {"auto","pick","prompt-new","resolve"}` must still pass unmodified. Net-new tests cover the new kind only.

**Phase to address:** Phase that builds item 4. Couple with item 3 (fresh-start commune) since both flow through the same picker dispatch.

---

### MP-4 [item 3]: Fresh-start commune blocks live boot / fires twice

**What goes wrong:** Item 3 surfaces a first-commune prompt when agent identity is "known-new" (via pick-spec `prompt-new` path OR via fresh psyche-download returning empty). Two failure shapes:

1. **Blocking the boot:** If the prompt is fired via `AskUserQuestion` synchronously *inside* `$LIVE start <id>`, and the user is AFK or ignores it, the live boot stalls — Psyche wrapper waits, listener loop never enters poll, perch never registers `ready`. Other agents trying to `deliver` to this perch get `NO_PERCH` errors.

2. **Firing twice:**
   - Path A: pick-spec returns `prompt-new` → SKILL fires AskUserQuestion → user types new id → SKILL fires first-commune prompt → user types commune body.
   - Path B: pick-spec returns `auto` for a known id → SKILL runs `$LIVE start <id>` → psyche-download returns empty (fresh psyche, never communed) → fresh-start commune prompt fires.
   - **Bug:** the downstream consumer flagged "confusing pick-spec 'known-new' with 'fresh psyche-download empty' — broadened trigger must handle both paths uniformly". If both paths independently fire the prompt, a user creating a brand-new agent via Path A sees the commune prompt, types it, then on next `psyche-download` (could be auto-fired by SessionStart per Phase 28), Path B *also* fires — fresh-start prompt redux.

**Why it happens:**
- The "known-new" signal in Path A comes from `pick-spec.kind == "prompt-new"` (zero history).
- The "fresh psyche" signal in Path B comes from `psyche-download` returning empty/NO-CONTEXT.
- These are RELATED but not identical: a user could revive a dormant agent (Path A says `auto` because history exists) whose psyche-download returns empty (psyche was wiped). The user wants a fresh-start commune *here* but Path A wouldn't have prompted.

**Consequences:** Boot stalls (FP-1) or prompt fatigue (FP-2). Both erode trust.

**Prevention:**
- **Non-blocking prompt:** prompt fires AFTER `$LIVE start` returns (perch registered, listener live). Either inject the prompt as a SessionStart `additionalContext` "remember to commune" nudge, OR fire it on first idle-poll-tick. Boot must not depend on user input.
- **Single sentinel for "first commune sent":** create `{SPT_HOME}/owlery/<id>/.first-commune-sent` (atomic touch, follows Phase 18.8 P01 atomic_write convention). Both Path A and Path B check this sentinel — present → skip prompt. Path A's "user typed commune body" handler writes the sentinel synchronously.
- **psyche-download empty-detection** must distinguish "fresh, never communed" from "psyche-context cleared via /spt:clear-psyche" — clear-psyche is a user action, not a fresh-start. Add a `cleared_at` timestamp inside psyche-context-meta when clear-psyche runs; fresh-start detector checks BOTH context-emptiness AND absence of `cleared_at`.
- **Confirmation that user saw the prompt:** AskUserQuestion has built-in confirmation (user must click). The risk is when the prompt is injected via SessionStart `additionalContext` — the LLM might silently absorb it and never surface to user. Use the explicit `AskUserQuestion` tool, not free-form context injection, for any user-facing first-commune prompt.

**Detection:** Integration test: spawn fresh agent via `prompt-new` path → assert sentinel exists after first commune → second SessionStart cycle does NOT re-fire prompt. Second test: same but psyche-download mid-flow returns empty → assert sentinel still suppresses second fire.

**Phase to address:** Phase that builds item 3. Couple with item 4 (same picker code path).

---

## Minor Pitfalls

### MP-5 [item 1]: `--here` requires reliable cwd-to-perch mapping

`--here` filters to perches whose `info.json.cwd` matches the current cwd. Per `pick_spec.rs:251-298`, cwd matching is done via `fs::canonicalize` of both sides — Windows UNC handling (`\\?\` prefix) is already covered (`to_forward_slash` helper, Phase 30). But `info.json.cwd` is OPTIONAL (`cwd: Option<String>`) and legacy perches don't have it. `--here` must gracefully degrade: legacy `cwd=None` perches are excluded from `--here` (NOT included as "maybe here"). Document this in the SKILL.

### MP-6 [item 2]: `argument-hint` for skills that take stdin not args

Skills like `signoff`, `commune`, `psyche-download` use `""` because they take stdin or use the Write tool, not CLI args. Future maintainers adding `argument-hint: <body>` to these would be WRONG — these skills do not take a positional `<body>` arg; the body is piped/Written. Document: "argument-hint reflects what the user types after the slash command, not what the skill reads from stdin or files."

### MP-7 [item 6]: Version-skip diff — v1.10.5 → v1.12.0 unread interim changelogs

If user skips multiple versions (e.g., they were offline for 2 months), the prompt must decide: show all interim changelogs OR just delta from last-seen to current?

**Recommendation:** show a concise version-list (`v1.10.6 → v1.10.7 → v1.10.8 → ... → v1.12.0`) with a link/path to the consolidated CHANGELOG.md, NOT the full prose of every version. Otherwise the prompt becomes a wall of text and users dismiss without reading.

The CHANGELOG must live in the repo (`CHANGELOG.md` at root, or `docs/CHANGELOG.md`). Currently it does NOT exist as a standalone file — version history is scattered across phase SUMMARY files and the milestone roadmap. **Adding item 6 requires first creating a CHANGELOG.md** that the prompt can reference. Prerequisite work, not a deliverable.

### MP-8 [item 5]: Prompt fatigue on every-session auto-surface

Even with all of CP-3 / CP-4 / MP-7 mitigated, surfacing pick-spec at *every* top-level SessionStart in repos with N≥2 known agents is intrusive. Mitigation: surface only when no live agent currently exists in this repo (`$LIVE list --online` empty), AND only on cold-starts (`source=startup`, NOT `clear`/`compact`/`resume`).

### MP-9 [item 6]: `installed_plugins.json` is the version-of-record at deploy time, not at session time

DEPLOY.md §"Step 8" describes `installed_plugins.json` as the "authoritative pointer flip". But at SessionStart-time, the *running* binary's `CARGO_PKG_VERSION` is authoritative for what the user is *currently using*. Re-read CP-1 prevention: use compiled-in version, NOT manifest. Re-stated here because it bears repeating — the temptation to `serde_json::from_str(installed_plugins.json)` at hook time is strong and wrong.

---

## Phase-Specific Warnings

| Item | Phase Topic | Likely Pitfall | Mitigation |
|------|-------------|----------------|------------|
| 1 | `list` overhaul | Default-change breaks golden tests + downstream grep callers | Audit table + atomic regenerate-fixtures commit (MP-1) |
| 2 | argument-hint coverage | YAML frontmatter regression breaks skill loading silently | Add frontmatter linter test before edits (MP-2) |
| 3 | Fresh-start commune | Blocks boot OR fires twice (Path A + Path B) | Non-blocking prompt + single sentinel + clear-psyche disambiguation (MP-4) |
| 4 | pick-spec false-empty | v1 schema break vs additive new `kind` | Add `kind:"all-live"`, don't re-shape existing (MP-3) |
| 5 | `/spt:live --auto` + casual-language | False-positive on pastes/quotes; subagent misfire; GSD coupling | Layered detection + AskUserQuestion confirm hop + top-level-only + no GSD read (CP-3, CP-4) |
| 6 | Version-change changelog | Handoff race + missing sentinel + multi-repo shared state + CHANGELOG.md prerequisite | Compiled-in version + atomic write + first-install silent + CHANGELOG.md created first (CP-1, CP-2, MP-7, MP-9) |

## Cross-Item Coupling

- **Items 3 + 4** share the pick-spec/`/spt:live` SKILL dispatch path. Plan them as one phase or guarantee phase-3 lands before phase-4 (or vice versa) — interleaving leaves SKILL.md inconsistent with binary.
- **Items 5 + 6** both touch `plugin-session-start.rs`. Avoid interleaving — sequence them.
- **Item 1 + Item 2** are mechanically independent (Rust code vs YAML frontmatter) — safe to parallelize.
- **All items** require `DEPLOY.ps1 -Bump patch` (or `-Bump minor`) and `/reload-plugins`. Plan the bump at end-of-milestone, not per phase, to avoid version-churn during in-flight handoffs (Phase 18.4 lesson).

## Sources

- `src/live/pick_spec.rs` (lines 1-95 verified, schema-frozen comment lines 16-21) — HIGH
- `src/owl/list.rs` (lines 1-271 verified — already has `online_only` param, structure is ready for `--all`/`--offline`/`--here`) — HIGH
- `src/live/list.rs` (lines 1-138 verified — runs unconditional, no online_only flag yet; this is the bigger refactor) — HIGH
- `plugin/spt/skills/*/SKILL.md` frontmatter grep (16 skills total; 14 have argument-hint, 3 missing: `list-ready`, `list-live`, `list-psyche`) — HIGH
- `plugin/spt/hooks/hooks.json` (verified — SessionStart calls `owl.exe plugin-session-start`; SubagentStart is separate `hook-subagent-start` hook) — HIGH
- `docs/DEPLOY.md` (Phase 18.4/18.5 handoff lessons; targeted-prune keep-set; step 8 atomic patch) — HIGH
- `.planning/STATE.md` (26-03 schema-freeze decision; Phase 18.8 atomic_write_string; Phase 18.7.1 spool-direct lessons; quick-260513-vzw EVENT envelope blast-radius) — HIGH
- `.planning/PROJECT.md` (v1.7.1 target features list; Key Decisions table; Constraints) — HIGH
- `plugin/spt/skills/live/SKILL.md` (pick-spec dispatch table lines 159-187; forced-picker rule; Cancel handling) — HIGH

**Skills currently MISSING `argument-hint`** (verified by grep, 14 of 17 SKILL.md files have it — `list-ready`, `list-live`, `list-psyche` are missing): adding these is the substantive item-2 work. The format precedent is established; the risk is silent YAML regression on the OTHER 14 if anyone reflexively edits during sweep.
