# Digest rides an adapter-declared `[digest]` extractor (session-spanning, two-origin)

## Status

accepted (2026-06-13) · **supersedes the source-mechanism lineage of [ADR-0008](0008-live-activity-buffer-pty-digest.md)** (its 2026-06-01 PTY-pattern decision, the 2026-06-12 "normalized logs" amendment, and the M9 "no manifest seam / rides `[history]`" reconciliation). The **product surface** defined in ADR-0008 (a rolling, glanceable activity view distinct from scrollback and transcript) still stands and is broadened here.

## Context

M9 reconciled the digest onto the log source and, in its plan §2, chose "approach A": a single published `{role,text,tool}` contract carried in the adapter's `[history]` records, **no digest-specific manifest seam** — the adapter's existing `[history]` normalizer was expected to emit the contract, and the v0.5.0 release notes told adapter authors exactly that. This decision was made inside the milestone and **was not approved by the operator**, who reopened it on review. Four problems are real:

1. **Dual-consumer conflict.** One `[history]` source feeds *two* consumers with opposite needs: the **echo-commune** pipes records *opaque and verbatim* to a summarizer (it wants full-fidelity records), while the **digest** needs them to *parse* as the contract. One normalizer cannot serve both — emit rich records and the digest silently drops every line; emit contract records and echo loses fidelity.
2. **Silent failure.** `spt_term::projection::record_to_tagged` is `serde_json::from_str(raw).ok()` — any line that is not valid contract JSON is dropped with no count, sample, or log. A normalizer that does not exactly match the contract yields an **empty digest with zero signal**; the author's only symptom is "`spt endpoint digest` returns nothing."
3. **No thread continuity.** A perch's `session_id` is pinned at first bind (`establish_perch` refuses a change while the owner is alive), and the `[history]` `locate_template` resolves a *single* `{session_id}` file. A live agent that `/clear`s (same process, new session, new log file) rotates `session_id` via `api boundary`, but the digest then reads only the new session: it **follows** the thread but cannot **span** it, and goes near-empty at every boundary.
4. **Blind to injected context.** The digest shows only what the agent *did*, not what spt *fed* it (session-start psyche download, echo-commune mirror, incoming owl messages). An echo-commune building its next context delta benefits from seeing what context the agent already has, so that it does not re-summarize that context.
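Problem 2 can be made concrete with a small sketch. This is not spt's code: `parse_contract` is a hypothetical stand-in for the real `serde_json` parse (it only checks for a `"role"` key so the example needs no external crates), and `project_silently` / `project_with_skips` are illustrative names. The point is only the difference between discarding the `Err` and keeping it.

```rust
/// Hypothetical stand-in for the contract parse; the real code uses
/// `serde_json::from_str`. Here a line "parses" only if it looks like a
/// JSON object that contains a "role" key.
fn parse_contract(raw: &str) -> Result<String, String> {
    let line = raw.trim();
    if !(line.starts_with('{') && line.ends_with('}')) {
        return Err("not a JSON object".to_string());
    }
    if !line.contains("\"role\"") {
        return Err("missing `role` field".to_string());
    }
    Ok(line.to_string())
}

/// The silent pattern: `.ok()` turns every failure into `None` and
/// `filter_map` discards it. The caller cannot tell "no activity" from
/// "every line was rejected".
fn project_silently(lines: &[&str]) -> Vec<String> {
    lines.iter().filter_map(|l| parse_contract(l).ok()).collect()
}

/// The same walk, keeping each rejection with its 1-based line number
/// and the reason it was skipped.
fn project_with_skips(lines: &[&str]) -> (Vec<String>, Vec<(usize, String)>) {
    let mut kept = Vec::new();
    let mut skipped = Vec::new();
    for (i, l) in lines.iter().enumerate() {
        match parse_contract(l) {
            Ok(entry) => kept.push(entry),
            Err(reason) => skipped.push((i + 1, reason)),
        }
    }
    (kept, skipped)
}
```

Fed a log of rich, non-contract records, `project_silently` returns an empty vector and nothing else, which is exactly the "returns nothing" symptom; `project_with_skips` returns the same empty vector plus one reason per rejected line.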

A harness's real log shape forces the design: a sampled Claude Code session JSONL carries nine top-level record `type`s (most are UI/metadata noise), `assistant` records hold a `content` **list** mixing `thinking` + `text` + `tool_use` (one line → multiple digest entries, `thinking` excluded), and `user.content` is sometimes a string and sometimes a tool-result list. Extraction means filter-types + walk-array + branch-per-block + skip-thinking — not a flat field map. Every record also carries a `timestamp`, making cross-origin interleave feasible.
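The four steps (filter types, walk the array, branch per block, skip `thinking`) can be sketched as follows. This is a minimal illustration over an already-parsed, simplified model, not the adapter's real extractor: the `Record`, `Content`, `Block`, and `Entry` types and the `extract` function are hypothetical, JSON parsing is left out, and how a tool result maps onto the contract is an assumption.

```rust
/// Hypothetical, simplified model of one content block in a log line.
enum Block {
    Thinking, // payload omitted: it is never surfaced in the digest
    Text(String),
    ToolUse { name: String },
    ToolResult(String),
}

/// `content` is sometimes a plain string and sometimes a list of blocks.
enum Content {
    Plain(String),
    Blocks(Vec<Block>),
}

struct Record {
    kind: String, // the top-level `type`
    timestamp: String,
    content: Content,
}

/// One digest entry in the `{role, text, tool, ts}` shape.
#[derive(Debug, PartialEq)]
struct Entry {
    role: String,
    text: String,
    tool: Option<String>,
    ts: String,
}

fn entry(rec: &Record, text: &str, tool: Option<&str>) -> Entry {
    Entry {
        role: rec.kind.clone(),
        text: text.to_string(),
        tool: tool.map(str::to_string),
        ts: rec.timestamp.clone(),
    }
}

/// One log line in, zero or more digest entries out.
fn extract(rec: &Record) -> Vec<Entry> {
    // 1. Filter types: everything except conversation records is noise.
    if !matches!(rec.kind.as_str(), "assistant" | "user") {
        return Vec::new();
    }
    match &rec.content {
        // String-valued content maps to a single entry.
        Content::Plain(s) => vec![entry(rec, s, None)],
        // 2. Walk the array, 3. branch per block, 4. skip `thinking`.
        Content::Blocks(blocks) => blocks
            .iter()
            .filter_map(|b| match b {
                Block::Thinking => None,
                Block::Text(t) => Some(entry(rec, t, None)),
                Block::ToolUse { name } => Some(entry(rec, "", Some(name.as_str()))),
                // Assumption: a tool result is shown as plain text.
                Block::ToolResult(out) => Some(entry(rec, out, None)),
            })
            .collect(),
    }
}
```

An `assistant` record holding `thinking`, `text`, and `tool_use` blocks yields two entries here, while a metadata record yields none. A flat field map cannot express either outcome, since both depend on branching over a nested list.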

## Decision

The digest gets its **own `[digest]` manifest seam**. Specifically:

- **Imperative extractor, not a declarative DSL.** `[digest]` declares an adapter-supplied command that maps the harness's native log → the `{role, text, tool, ts}` contract. By default it reads the **same source files as `[history]`** (reference its locate pattern; DRY) with an **own-source escape hatch**; **`api digest-entry`** remains the always-available push fallback. We reject a declarative match/extract DSL: a flat map cannot express real nested logs, and a DSL powerful enough to do so reinvents a programming language — and anyone who *could* conform their output to a fixed format could as cheaply push via `api digest-entry` or write the extractor. (This corrects M9 §2, which rejected a digest-specific command only because it wrongly assumed echo's one normalizer could also serve the digest.)
- **Thread-spanning.** A per-endpoint **session ledger** (heir to the sister project's `psyches/tracked/agents/<id>/sessions.log`) records the agent's session rotations; `api boundary` appends the new `session_id` at each `/clear` / `/compact`. The digest enumerates the last *K* sessions (newest-first, bounded), so the rolling window **bridges** a boundary. A session boundary is **rendered distinctively** (a visible `/clear` divider) so a reader sees the context reset without losing the thread.
- **Presentation is adapter-defaulted, consumer-overridable.** Window depth, arg-truncation, and sprint-collapse become `[digest]` defaults the adapter declares and any consumer may override at pull/subscribe; spt-core ships fallback defaults. The fixed "last ~3 turns" window is no longer an spt-core requirement. Extraction stays the adapter's responsibility; *how much* is shown stops being spt-core dogma.
- **Two-origin merge.** The digest interleaves, by `ts`, the adapter's **extracted log records** (agent activity) and spt's own **context-injection entries** — a new collapsible category with subtypes `psyche_download | echo_mirror | owl_message`. spt appends its injection entries to the endpoint's `digest.log` (the existing Path-B sink — spt becomes an internal producer); the projection merges both within the window. The record contract gains an optional `ts` ordering key.
- **Diagnostics are first-class.** `spt adapter digest-proof <adapter> [--sample <log>]` runs the declared extractor over a real sample and prints the parsed records, the rendered digest, and every dropped/malformed line with its reason. The daemon surfaces the same skip-diagnostics at runtime. The silent drop is removed.
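The thread-spanning bullet above can be sketched as a window builder that walks the session ledger newest-first and marks each boundary. This is a sketch under stated assumptions, not the daemon's implementation: `Row`, `spanning_window`, the `window` entry bound, and the `entries_for` lookup are all hypothetical, and the ledger is assumed to be in append order (oldest first).

```rust
/// One rendered row: an entry's text, or a visible session-boundary divider.
#[derive(Debug, PartialEq)]
enum Row {
    Entry(String),
    Divider,
}

/// Builds a rolling window that bridges session boundaries.
/// - `ledger`: session ids in append order (oldest first).
/// - `k`: how many of the most recent sessions to enumerate.
/// - `window`: maximum number of entries to show.
/// - `entries_for`: hypothetical lookup returning one session's entries,
///   oldest first.
fn spanning_window<F>(ledger: &[&str], k: usize, window: usize, entries_for: F) -> Vec<Row>
where
    F: Fn(&str) -> Vec<String>,
{
    let mut rows = Vec::new(); // built newest-first, reversed at the end
    let mut shown = 0;
    for (i, &session) in ledger.iter().rev().take(k).enumerate() {
        if shown >= window {
            break;
        }
        if i > 0 {
            // Crossing from a newer session into an older one.
            rows.push(Row::Divider);
        }
        for text in entries_for(session).into_iter().rev() {
            if shown >= window {
                break;
            }
            rows.push(Row::Entry(text));
            shown += 1;
        }
    }
    rows.reverse(); // chronological order for display
    rows
}
```

Immediately after a `/clear`, the newest session has no entries yet, so the result is the previous session's tail followed by a trailing divider. The reader sees the reset and still sees the thread.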

**Deferred (no consumer yet, by the same logic M9 used):** the GUI collapse/expand UX for context-injection entries, the echo-commune-reads-the-digest delta loop, and the autonomous file-watch freshness nudge land with the milestones that own those surfaces. The **data model** (the `[digest]` seam, `ts`, the context-injection category, thread-spanning) lands now so the adapter-facing contract need not break a second time.
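The data model that lands now can be sketched in a few types plus the `ts` interleave. Everything here is illustrative: the `Origin`, `InjectionKind`, and `Entry` types, the `effective_ts` and `merge` functions, and the use of an integer `ts` are assumptions, as is the rule for entries that omit the optional `ts` (they inherit the last timestamp seen in their own stream).

```rust
/// The three context-injection subtypes named in the decision.
#[derive(Debug, Clone, Copy, PartialEq)]
enum InjectionKind {
    PsycheDownload,
    EchoMirror,
    OwlMessage,
}

/// Which of the two origins produced an entry.
#[derive(Debug, Clone, PartialEq)]
enum Origin {
    Extracted,                // from the adapter's extractor
    Injection(InjectionKind), // appended by spt to `digest.log`
}

#[derive(Debug, Clone, PartialEq)]
struct Entry {
    origin: Origin,
    text: String,
    ts: Option<u64>, // optional ordering key; the integer type is an assumption
}

/// Assumption: an entry without `ts` sorts at the last timestamp seen
/// earlier in the same stream (0 if there was none).
fn effective_ts(stream: Vec<Entry>) -> Vec<(u64, Entry)> {
    let mut last = 0;
    stream
        .into_iter()
        .map(|e| {
            last = e.ts.unwrap_or(last);
            (last, e)
        })
        .collect()
}

/// Interleaves both origins by `ts`. The sort is stable, so entries with
/// equal keys keep their stream order, extracted before injected.
fn merge(extracted: Vec<Entry>, injected: Vec<Entry>) -> Vec<Entry> {
    let mut all = effective_ts(extracted);
    all.extend(effective_ts(injected));
    all.sort_by_key(|(ts, _)| *ts);
    all.into_iter().map(|(_, e)| e).collect()
}
```

Keeping `origin` on every entry is what would let a later consumer collapse or filter the injection category without re-reading either source.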

## Consequences

- **Breaking for adapter authors** (again): the M9 guidance ("emit the contract through your `[history]` normalizer") is replaced by "declare a `[digest]` extractor (or push via `api digest-entry`)." The v0.5.0 release note that announced the opposite is wrong and must be corrected; the **harness integration checklist** in the docs is updated as part of this milestone.
- `digest.log` changes role from a *fallback source* to a *merge input* (it now also carries spt's injection entries).
- New `REQ-*` ids must be minted before implementation (traceability rule 3); the relevant `REQ-TERM-*` titles and evidence are re-pointed under the same restoration discipline M9 used.
- ADR-0008's source mechanism is now fully superseded; only its product-surface framing survives (and is broadened from "recent activity" to "recent context").

## Alternatives considered

- **Declarative detection DSL** in the manifest — rejected: too weak for real nested logs; a sufficiently powerful version is a reinvented language; the push door already covers the simple case.
- **One extractor shared with echo (M9 approach A)** — rejected: the dual-consumer conflict above; it is the decision being reversed.
- **Extend `[history]` instead of a new `[digest]` section** — rejected: `[history]` resolves a single session file; bolting multi-session spanning onto it re-tangles the two consumers we are separating.
- **Follow-and-reset window** (digest shows only the post-`/clear` session) — rejected: loses thread continuity, the whole point of session-spanning; a glance right after `/clear` would show an idle agent mid-task.
- **Build all of the context-mirroring feature now** (GUI + echo-feedback) — rejected: builds consumers whose surfaces do not exist yet; only the data model is expensive to retrofit, so only it lands now.
