# Replay Corpus (Phase 9 / TEST-04)

Initial corpus seed for `mic_test --replay-dir`. CI's
`mic_test_replay_corpus` ctest registration replays it to catch
detection-pipeline regressions against a small set of hand-curated positive
and negative samples.

## Files

| File                       | Expected | Tolerance | Description                                                                  |
| -------------------------- | -------- | --------- | ---------------------------------------------------------------------------- |
| `positive_001.wav`         |        1 |         0 | ~2 s band-limited white-noise burst (mic-cover-like signature).              |
| `negative_silence_001.wav` |        0 |         0 | 5 s silence (baseline negative).                                             |
| `negative_speech_001.wav`  |        0 |         0 | 5 s synthetic speech-like signal (multi-formant + 3 Hz syllable envelope).   |

All seed WAVs are **48 kHz / mono / 16-bit PCM**. They are generated by
`tools/gen-replay-corpus.py` from fixed RNG seeds — re-running the script
on any host produces byte-identical output.
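As an illustration of why a fixed seed yields reproducible files, here is a minimal sketch using only the Python standard library. It is not the contents of `tools/gen-replay-corpus.py`: the function name `render_noise_wav`, the amplitude, and the plain (not band-limited) uniform noise are assumptions made for the example.

```python
import io
import random
import struct
import wave

SAMPLE_RATE = 48_000  # Hz, matching the seed corpus format


def render_noise_wav(seed: int, seconds: float, amplitude: float = 0.25) -> bytes:
    """Return a 48 kHz / mono / 16-bit PCM WAV of seeded white noise (hypothetical helper)."""
    # A private generator: the result depends only on `seed`, not on global RNG state.
    rng = random.Random(seed)
    n_samples = int(seconds * SAMPLE_RATE)
    peak = int(amplitude * 32767)
    # Little-endian signed 16-bit samples drawn uniformly from [-peak, peak].
    frames = b"".join(
        struct.pack("<h", rng.randint(-peak, peak)) for _ in range(n_samples)
    )
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)   # mono
        wav.setsampwidth(2)   # 16-bit PCM
        wav.setframerate(SAMPLE_RATE)
        wav.writeframes(frames)
    return buf.getvalue()
```

Calling `render_noise_wav(1, 0.1)` twice returns identical bytes, because no timestamp or unseeded randomness enters the file. The exact sample values can still depend on the Python version's `random` implementation, so a real generator would pin that as well.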

## Manifest schema

`manifest.json` follows the Phase 9 / 09-CONTEXT.md `D-30` + `D-35` shape:

```jsonc
{
  "version": 1,                         // integer
  "corpus": "...",                      // free-form string identifying the corpus
  "files": [
    {
      "wav":               "...",       // relative path or basename
      "expected_triggers": 1,           // integer
      "tolerance":         0,           // integer; |observed - expected| <= tolerance => pass
      "notes":             "..."        // optional human-readable note
    }
  ]
}
```

The `mic_test --replay-dir <dir> --expect-triggers-from <manifest.json>`
invocation looks up entries by relative path first, then by basename.
Files in the directory that have no matching manifest entry simply have
no expectation set and always count as `pass`.
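The lookup and pass rule described above can be sketched in a few lines of Python. This is an illustration under assumptions, not the `mic_test` implementation: the helper names `find_entry` and `verdict` are hypothetical, the `"fail"` label is assumed (the text only names `pass`), and basename matching is assumed to compare the basename of both the replayed file and the manifest's `wav` field.

```python
from pathlib import PurePosixPath


def find_entry(manifest: dict, rel_path: str):
    """Return the manifest entry for rel_path: relative path first, then basename."""
    files = manifest.get("files", [])
    # First pass: exact relative-path match.
    for entry in files:
        if entry["wav"] == rel_path:
            return entry
    # Second pass: fall back to comparing basenames (assumed interpretation).
    base = PurePosixPath(rel_path).name
    for entry in files:
        if PurePosixPath(entry["wav"]).name == base:
            return entry
    return None


def verdict(manifest: dict, rel_path: str, observed: int) -> str:
    """Apply |observed - expected| <= tolerance; files without an entry pass."""
    entry = find_entry(manifest, rel_path)
    if entry is None:
        return "pass"  # no expectation set
    delta = abs(observed - entry["expected_triggers"])
    return "pass" if delta <= entry["tolerance"] else "fail"
```

With a manifest containing only `positive_001.wav` (expected 1, tolerance 0), `verdict(m, "positive_001.wav", 1)` gives `"pass"`, `verdict(m, "sub/positive_001.wav", 0)` gives `"fail"` via the basename fallback, and any unlisted file passes regardless of its trigger count.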

## Determinism (CONTEXT D-34)

Replay output is byte-identical across runs. The state machine receives
`dt = block_count * 1000 / sample_rate` rather than a `steady_clock`
delta, and `replayWavDir` sorts its directory listing lexicographically
before walking it. Re-running:

```sh
mic_test --replay-dir tests/corpus/replay \
         --expect-triggers-from tests/corpus/replay/manifest.json \
         --json-output build/replay_results.json
```

…produces a byte-identical `replay_results.json` on every invocation
given the same `--profile` (or no profile, as in the seed CI run).
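The two ingredients of that determinism can be sketched in Python. This is an illustration, not the code in `wav_replay.cpp`: the function names are hypothetical, `block_count` is assumed to be the number of samples consumed by a block, and the ordering is assumed to be a plain code-point comparison of file names.

```python
def block_dt_ms(block_count: int, sample_rate: int) -> float:
    """Elapsed time in ms derived from audio position, never from a wall clock."""
    return block_count * 1000 / sample_rate


def replay_order(names):
    """Lexicographic order of WAV names, independent of filesystem enumeration order."""
    return sorted(n for n in names if n.lower().endswith(".wav"))
```

For example, `block_dt_ms(480, 48_000)` is `10.0` on every run and every host, whereas a `steady_clock` delta would vary with scheduling. Likewise `replay_order` returns the same sequence no matter in which order the directory entries were enumerated.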

## Regenerating

```sh
python tools/gen-replay-corpus.py
```

Outputs land in `tests/corpus/replay/` next to this README. Commit any
intentional regeneration with a clear `chore(09-XX):` message describing
why the audio content changed (e.g. detector parameter retuning).

## TEST-01 invariant

This corpus and the harness that consumes it are headless — no SteamVR
runtime, no `driver_micmap.dll`, no OpenVR symbols anywhere in
`apps/mic_test/src/wav_replay.{hpp,cpp}` (enforced by
`cmake/AssertReplayNoVrApi.cmake`).
