# Mesh-D0 — staggered repro harness (red)

> JIT plan for `SUBNET-MESH-PLAN.md` §Build phases · phase 0. First artifact, **no production change**. The harness lands `#[ignore]`d and asserts the *desired* mesh convergence: the acceptance that flips green at D6. It tags REQ-MESH-3 `int` evidence only once it passes (D6), so **no REQ activation here**.

## The bug, in two production seams

The grill repro (*A creates → B joins A → A offline → C joins B → B offline → A online → A and C invisible*) reduces to two facts, both consequences of own-rows-only, pairwise propagation:

1. **Push target is pairwise.** `peerloop.rs:356` — `for peer in trust.peers_in(&sub.name)`. A node advertises its own rows to *directly-paired* peers only. No relay, no roster. (Doc line 11: "no transitive gossip in v1: trust is pairwise per subnet.") → widens to **all roster members** at **D6**.
2. **Apply gate is pairwise.** `registryhost.rs:228,291` — `if !policy.trust.is_trusted(&u.subnet, origin_node) { reject }`. Even if a row from a non-paired member arrived, the receiver drops it. → swaps to the **seed-proof membership** flag at **D5**.

C joins B, so C's trust = {B}; A's trust = {B}. **A and C never pin each other.** Under (1) C's pump never targets A; under (2) A would reject C's row anyway. The result: A and C are mutually invisible and unreachable. Under the mesh design, C learns A from B's roster at the join ceremony (D3), so when A returns C dials A *directly* and A's seed-proof gate admits it (D5), with no B in the loop. D6 widens C's push target to the roster so the row actually flows.
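The argument above can be checked with a toy model. This is a minimal sketch, not the production code: `ToyNode`, `Mode`, `join`, `push_round`, and `staggered_repro` are all hypothetical names, a "row" is reduced to its author's name, and the seed-proof gate is modeled as "the origin is a subnet member". It also assumes that admitting a member's row adds that member to the receiver's roster, which the plan does not state; without some such step A would never learn to push to C.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Toy node (hypothetical): names stand in for node identities.
#[derive(Default)]
pub struct ToyNode {
    pub trust: BTreeSet<&'static str>,    // directly paired peers
    pub roster: BTreeSet<&'static str>,   // subnet members this node knows about
    pub registry: BTreeSet<&'static str>, // authors whose rows this node holds
}

#[derive(Clone, Copy)]
pub enum Mode {
    Pairwise, // today: push to trust, gate on trust
    Mesh,     // target state: push to roster, gate on membership
}

type Nodes = BTreeMap<&'static str, ToyNode>;

/// `joiner` pairs with `host` and inherits the host's roster (the D3 idea).
pub fn join(nodes: &mut Nodes, host: &'static str, joiner: &'static str) {
    let mut inherited = nodes[host].roster.clone();
    inherited.insert(host);
    let h = nodes.get_mut(host).unwrap();
    h.trust.insert(joiner);
    h.roster.insert(joiner);
    let j = nodes.get_mut(joiner).unwrap();
    j.trust.insert(host);
    j.roster.extend(inherited);
}

/// One push round: every online node sends only its OWN row to its targets.
pub fn push_round(nodes: &mut Nodes, online: &[&'static str], mode: Mode) {
    // Collect deliveries first so a round sees a consistent snapshot.
    let mut deliveries = Vec::new();
    for &src in online {
        let n = &nodes[src];
        let targets = match mode {
            Mode::Pairwise => &n.trust,
            Mode::Mesh => &n.roster,
        };
        for &dst in targets {
            if dst != src && online.contains(&dst) {
                deliveries.push((src, dst));
            }
        }
    }
    for (src, dst) in deliveries {
        let admitted = match mode {
            Mode::Pairwise => nodes[dst].trust.contains(src),
            Mode::Mesh => nodes.contains_key(src), // stand-in for a seed proof
        };
        if admitted {
            let r = nodes.get_mut(dst).unwrap();
            r.registry.insert(src);
            if matches!(mode, Mode::Mesh) {
                r.roster.insert(src); // ASSUMPTION: admission teaches the roster
            }
        }
    }
}

/// Replays the staggered repro; returns (A holds C's row, C holds A's row).
pub fn staggered_repro(mode: Mode) -> (bool, bool) {
    let mut nodes: Nodes = BTreeMap::new();
    for name in ["A", "B", "C"] {
        nodes.insert(name, ToyNode::default());
    }
    join(&mut nodes, "A", "B"); // B joins A
    push_round(&mut nodes, &["A", "B"], mode);
    join(&mut nodes, "B", "C"); // A offline, C joins B
    push_round(&mut nodes, &["B", "C"], mode);
    // B offline, A back online: two rounds so A can answer C's first push.
    push_round(&mut nodes, &["A", "C"], mode);
    push_round(&mut nodes, &["A", "C"], mode);
    (
        nodes["A"].registry.contains("C"),
        nodes["C"].registry.contains("A"),
    )
}
```

In this model `staggered_repro(Mode::Pairwise)` leaves A and C without each other's rows, because both only ever target the offline B, while `Mode::Mesh` converges in both directions once C's roster-driven push reaches A.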

## What the harness asserts (the desired end-state)

Two tests, both `#[ignore = "mesh lands at D6 — Mesh-D0 red repro"]`:

1. **`staggered_offliner_still_meshes`** (the acceptance). Topology + stagger exactly as the repro: B is **offline at the critical step**; at the end C and A are both online, B is not. Assert **A's registry holds C's perch row AND C's registry holds A's perch row** (visibility) and **each can resolve a dialable address for the other** (reach). Red today: C's pump targets only B; A's gate rejects C.
2. **`all_online_star_a_reaches_c_b_never_relays`**. A, B, C all online, pairwise A↔B and B↔C only. Assert A↔C converge **and** B never carried a third-party row (every row B forwarded was B-authored — i.e. A's registry never holds a C-row whose *delivering origin* was B). Locks KH 4.10/7.5: the fix is roster relay, **not** row relay.
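The "B never relays" assertion in test 2 could be expressed as a predicate over a delivery log. The sketch below assumes the harness can record, for each applied row, both its author and the origin that delivered it; the `Delivered` struct and both helper functions are hypothetical and do not correspond to any existing type in the crate.

```rust
/// Hypothetical record of one applied row, as the harness might log it.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Delivered {
    pub author: String,       // node that owns the perch row
    pub delivered_by: String, // node whose feed carried the row
    pub perch: String,
}

/// Rows that arrived via a node other than their author (i.e. row relay).
pub fn relayed_rows(log: &[Delivered]) -> Vec<&Delivered> {
    log.iter().filter(|d| d.author != d.delivered_by).collect()
}

/// True if the log contains at least one row authored by `author`.
pub fn holds_row_from(log: &[Delivered], author: &str) -> bool {
    log.iter().any(|d| d.author == author)
}
```

With a log of what A's registry applied, the star test would then assert both `holds_row_from(&a_log, "C")` (A and C converged) and `relayed_rows(&a_log).is_empty()` (no row reached A through a third party).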

## Fidelity decision (why not `run_dispatch_loop`)

`run_dispatch_loop` sources its gate from canonical `SPT_HOME` (`dispatch.rs:443 RegistryGatePolicy::load()`) — three in-process nodes can't each hold a distinct pairwise gate through it. So the harness:

- **Push side = the real loop.** Each node runs production `run_peer_pump` with explicit per-node `PumpPaths` (its own `trust`/`subnet`/`registry`). This captures the **D6** target-widening verbatim — when `peerloop.rs:356` reads the roster instead of `peers_in`, these pumps start targeting the full mesh with zero test change.
- **Receive side = a thin per-node arm.** The test reads each node's broker `NetStreamData` feed events and calls the **production** `node.registry.apply_feed(origin, updates, &per_node_policy)`. This captures the **D5** gate swap verbatim — the swap is *inside* `apply_feed`/`RegistryHost`, so the same call exercises whatever gate is current. (Same direct-`apply_feed` seam `peerloop.rs:265` already uses.)
- **Addresses = omniscient resolver.** All three loopback addresses are known to every node's resolver. Address *discovery* is the roster's job (D3/D4, separately tested); D0 isolates the **relay + gate** bug, which is the center of "the subnet was the pairing graph, not a mesh." A node still only *pushes* where its push-target says — so an omniscient resolver does not mask the bug (C still won't target A until the target widens at D6).

This is faithful to **both** flip seams (D5 gate, D6 target) while runnable in-process today. If a later phase needs the receive arm to ride the real dispatch loop, refine then — the `#[ignore]` lifts at D6 regardless.
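The receive-side point, that calling the same apply function exercises whatever gate is current, can be illustrated in isolation. This is only a sketch of the seam's shape: `Gate`, `PairwiseGate`, `MembershipGate`, `Update`, and `ToyRegistry` are hypothetical stand-ins, and the real `RegistryHost::apply_feed` signature and policy types are not reproduced here.

```rust
use std::collections::HashSet;

/// Hypothetical feed update: just a subnet and a perch label.
pub struct Update {
    pub subnet: String,
    pub perch: String,
}

/// The admission decision, abstracted so the caller never names a policy.
pub trait Gate {
    fn admits(&self, subnet: &str, origin: &str) -> bool;
}

/// Today's behavior: admit only directly paired (subnet, peer) entries.
pub struct PairwiseGate {
    pub trusted: HashSet<(String, String)>,
}

impl Gate for PairwiseGate {
    fn admits(&self, subnet: &str, origin: &str) -> bool {
        self.trusted.contains(&(subnet.to_string(), origin.to_string()))
    }
}

/// Stand-in for the D5 gate: admit any proven (subnet, member) entry.
pub struct MembershipGate {
    pub members: HashSet<(String, String)>,
}

impl Gate for MembershipGate {
    fn admits(&self, subnet: &str, origin: &str) -> bool {
        self.members.contains(&(subnet.to_string(), origin.to_string()))
    }
}

/// Toy registry holding (origin, perch) rows.
#[derive(Default)]
pub struct ToyRegistry {
    pub rows: Vec<(String, String)>,
}

impl ToyRegistry {
    /// Applies every update the gate admits; returns how many were applied.
    pub fn apply_feed(&mut self, origin: &str, updates: &[Update], gate: &dyn Gate) -> usize {
        let mut applied = 0;
        for u in updates {
            if !gate.admits(&u.subnet, origin) {
                continue; // rejected at the gate, same call site either way
            }
            self.rows.push((origin.to_string(), u.perch.clone()));
            applied += 1;
        }
        applied
    }
}
```

The test-side call stays identical when the gate changes, which is the property the harness relies on: the assertion flips from red to green without the receive arm being edited.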

## Build steps

1. New `crates/spt-daemon/tests/mesh.rs`. Lift the shared scaffold from `peerloop.rs` / `propagate.rs`: `unique_name`, `hermetic`, `net_broker`, `connect_retry`, `converge`. No `mod common` exists — copy-verbatim per the sibling tests' convention.
2. A `MeshNode` helper: `{ name, hex, addr, registry: Arc<RegistryHost>, paths: PumpPaths, root: PathBuf }`, plus `advertise(perch_label)` (write an `info.json` perch under its owlery so `advertise_local` has a row) and `policy(trusted: &[&str])` (a per-node `RegistryGatePolicy`).
3. A per-node **receive pump**: a thread that drains the node's broker feed streams and applies under the node's policy (the ~40-line dispatch-arm reimpl). Gate it on a stop flag.
4. A **push pump**: production `run_peer_pump` per online node, explicit `PumpPaths`, omniscient resolver, fast cadences (100ms), `registry_evict_after = 300s` (inert — same as peerloop).
5. Online/offline = start/stop a node's push+receive threads via its own stop flag. Stagger per the repro; `converge()` at each settle point that *current code can* reach (B-holds-A, B-holds-C), then the final mesh assertion that **fails** today.
6. Mark both tests `#[ignore = "..."]`. Reach assertion = each node resolves a non-empty addr for the other through the same resolver the pump uses.
7. `cargo test -p spt-daemon --test mesh -- --ignored` → both run **red** (compile-green, assert-red). `cargo test -p spt-daemon --test mesh` (no `--ignored`) → green-skip (CI stays green).
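Steps 3 and 5 hinge on a thread that can be stopped cleanly and on polling for a settle point. A minimal standard-library sketch of both follows. It assumes the broker feed can be modeled as an `mpsc::Receiver<String>`; `ReceiveArm` and `wait_until` are hypothetical names, and `wait_until` is a stand-in for, not a copy of, the `converge` helper lifted from the sibling tests.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::{mpsc, Arc, Mutex};
use std::thread;
use std::time::{Duration, Instant};

/// Hypothetical per-node receive arm: drains a feed until told to stop.
pub struct ReceiveArm {
    stop_flag: Arc<AtomicBool>,
    handle: Option<thread::JoinHandle<()>>,
}

impl ReceiveArm {
    /// Takes the node "online": spawns the drain thread.
    pub fn start(feed: mpsc::Receiver<String>, applied: Arc<Mutex<Vec<String>>>) -> Self {
        let stop_flag = Arc::new(AtomicBool::new(false));
        let flag = Arc::clone(&stop_flag);
        let handle = thread::spawn(move || {
            while !flag.load(Ordering::Relaxed) {
                // Short timeout so the stop flag is re-checked regularly.
                match feed.recv_timeout(Duration::from_millis(20)) {
                    Ok(row) => applied.lock().unwrap().push(row),
                    Err(mpsc::RecvTimeoutError::Timeout) => continue,
                    Err(mpsc::RecvTimeoutError::Disconnected) => break,
                }
            }
        });
        Self { stop_flag, handle: Some(handle) }
    }

    /// Takes the node "offline": signals the thread and waits for it to exit.
    pub fn stop(mut self) {
        self.stop_flag.store(true, Ordering::Relaxed);
        if let Some(h) = self.handle.take() {
            h.join().unwrap();
        }
    }
}

/// Polls `cond` until it holds or `timeout` elapses; returns whether it held.
pub fn wait_until(timeout: Duration, mut cond: impl FnMut() -> bool) -> bool {
    let deadline = Instant::now() + timeout;
    loop {
        if cond() {
            return true;
        }
        if Instant::now() >= deadline {
            return false;
        }
        thread::sleep(Duration::from_millis(10));
    }
}
```

Joining the thread in `stop` matters for the stagger: once it returns, the node is definitely no longer applying rows, so a later settle point cannot be satisfied by a straggling delivery to a node that is supposed to be offline.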

## Done when

- `mesh.rs` compiles; `--ignored` run shows both tests failing on the *mesh* assertion (visibility/reach), not on setup.
- Default `cargo test` skips them (CI green).
- `traceable-reqs check` clean (no REQ-MESH stage activated — D0 mints no evidence).
- Commit: `test(mesh): red staggered + star repro harness — Mesh-D0`.
