# ADR 0001: Memory Health via flagger → scout subagents; preventive dedup is the first concern-kind

- Status: Proposed
- Date: 2026-05-28
- Relates to: `NLSPEC.md` (v1 spec), `CONTEXT.md` (Memory Health, Memory Staleness, Deduplication, Memory Capability, Memory Authority Source)

## Decision

Build **Memory Health** as a two-tier subagent architecture: a **flagger** does cheap,
store-only breadth and emits **concerns**; each concern fans out to a read-only **scout**
subagent that does repo-grounded depth and returns a verdict with evidence; findings are
presented to the **human, who still decides every write**. Deliver it staged: wire the
**`duplicate`** concern-kind first, running **preventively inside `/memory extract`**.

## Why (the chain we walked)

1. **Dedup vs. Health split**, then **"stale" is one composable abstraction** (Map / Hash
   / Memory staleness), then **Health = memory-vs-repo staleness only**.
2. We tried to ship dedup as a cheap store-only flag with the human as repo-context
   oracle — **rejected**: authoritative "same fact?" judgement can require repo context,
   so a store-only heuristic is not trustworthy.
3. **Resolution: scouts supply the repo context.** A read-only scout investigates one
   concern against the code and returns evidence; the human decides. This *un-parks
   Health* — dedup is simply concern-kind #1 in a general flagger→scout pipeline. The
   store-vs-repo axis becomes the **flagger (store-only) vs scout (repo-grounded)** split.
4. **Preventive bounds the fan-out.** Checking happens before writing new candidates, so
   the flagger only looks at the ≤5 (`MAX_CANDIDATES`) incoming candidates, not the whole
   store — a handful of scouts at most, affordable inline.

## Flow (preventive `duplicate` concern, now)

Inside `/memory extract`, after the model proposes candidates and before the write:

1. **Shortlist (flagger, store-only):** for each candidate, find existing memories that
   plausibly overlap (shared concepts / `Applies to` / lexical), compared against full
   Topic File bodies — not the 120-char `INDEX.md` excerpts. High-recall, cheap, no
   verdict.
2. **Scout (per candidate with a non-empty shortlist):** a read-only subagent receives
   the candidate plus the shortlisted Memory Blocks and repo access, and returns a
   verdict (`duplicate` / `distinct` / `uncertain`), the suspected memory id(s), and
   evidence. Exact `canonicalContentHash` match short-circuits to `duplicate` with no
   scout.
3. **Enriched review:** the existing approval loop shows each candidate annotated with
   its verdict + evidence.
4. **Write:** approved candidates go through the existing locked path. No auto-discard.

## Authority (what keeps this v1-legal)

- **Scouts are read-only.** They read store + repo and report; they never mutate.
  Investigation needs no authority; only mutation does.
- **The human decides every write.** Autonomous apply (acting on scout verdicts without a
  human) is deferred until **Memory Capability** / **Memory Authority Source** exist.

## Staging

- **Now:** preventive `duplicate` pipeline above.
- **Later (curative `/memory audit`):** same flagger→scout architecture run over the
  whole store, adding the **`stale`** concern-kind (existing memory vs. current code).
  Staleness is inherently whole-store, so it cannot be preventive.
- **Later still:** authorized auto-apply of low-risk verdicts once Authority Source is
  designed.

## Open question (next, to make it buildable)

**The concern handoff contract** — what the flagger emits per concern and what a scout
returns — plus whether the flagger shortlist is mechanical core (high-recall, bounds
cost) or itself a subagent. The verdict is always the scout's judgement; the question is
only whether the *recall* step is core or model.

## Not part of this ADR

Structural junk (malformed blocks, empty Topic Files, Hash staleness) is mechanically
detectable and already detected by `/memory reindex`; letting reindex *repair* it is a
separate small change, not a Health concern.