# Goal: open-region resume rule (SPEC §6.11.2)

After this task, `attractor run resume <run-id>` correctly
recovers a run that crashed inside a parallel region. The engine
detects an "open region" (a `component` node with a
`NodeCompleted` entry whose matching `tripleoctagon` join has
none), reconstructs each branch's
state from the journal, re-dispatches in-flight branches, then
proceeds. Per-branch resume reuses §6.4's existing three-case
state machine unchanged.

Sequential resume (the §6.4 three-case machine) must not regress.

## Files to touch

- `src/attractor/engine/engine.py` — the resume code path:
  - Before applying §6.4's case (a) / (b) / (c) on the latest
    journal entry, run an **open-region check** (§6.11.2).
  - If a region is open: discover the matching join, reconstruct
    per-branch progress from `NodeCompleted(branch_name=X)` entries,
    re-dispatch in-flight branches via the existing
    `_run_parallel_region` / `_traverse_branch` infrastructure, then
    continue from the join's outgoing edge.
  - If no region is open: proceed with §6.4 unchanged.
- `tests/test_engine/test_parallel_resume.py` (new file) — resume
  scenarios listed below.
- `traceable-reqs.toml` — promote
  `REQ-EXEC-PARALLEL-RESUME.required_stages` from `["doc"]` to
  `["doc", "impl", "unit"]`.

## Detection rule (SPEC §6.11.2)

> A parallel region is open if some `component` node has a
> `NodeCompleted` entry whose matching `tripleoctagon` has no
> `NodeCompleted` entry.

Implementation sketch:

```python
def _find_open_region(workflow, entries) -> tuple[str, str] | None:
    """Return (fanout_id, join_id) for the latest open region, or None."""
    # Scan entries for NodeCompleted on FANOUT-kind nodes.
    # For each, check if there's a later NodeCompleted on the matching
    # JOIN. If not — region is open. Return the LATEST open one.
```

Use `_find_join_for_fanout` from `engine.py` (already lives there
from P3) to resolve the matching join id.
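
To make the scan concrete, here is a minimal, self-contained sketch of the detection rule. It is an illustration under stated assumptions, not the engine's code: the `NodeCompleted` dataclass is a hypothetical stand-in for the real journal entry (only `branch_name` and `next_node` are named in this brief), and the `fanout_to_join` dict stands in for `_find_join_for_fanout`. Entries are assumed to be in journal order.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class NodeCompleted:
    """Hypothetical stand-in for the journal entry; real fields may differ."""

    node_id: str
    next_node: str | None = None
    branch_name: str | None = None


def find_open_region(
    fanout_to_join: dict[str, str],
    entries: list[NodeCompleted],
) -> tuple[str, str] | None:
    """Return (fanout_id, join_id) for the latest open region, or None.

    `fanout_to_join` stands in for `_find_join_for_fanout`; `entries` is
    assumed to be in journal (append) order.
    """
    # Walk backwards so the first open fan-out we meet is the latest one.
    for index in range(len(entries) - 1, -1, -1):
        entry = entries[index]
        join_id = fanout_to_join.get(entry.node_id)
        if join_id is None:
            continue  # not a fan-out node
        if any(later.node_id == join_id for later in entries[index + 1:]):
            continue  # the matching join completed after this fan-out
        return entry.node_id, join_id
    return None


entries = [
    NodeCompleted("start", next_node="fan"),
    NodeCompleted("fan"),
    NodeCompleted("a1", next_node="a2", branch_name="A"),
]
print(find_open_region({"fan": "join"}, entries))  # ('fan', 'join')
```

Only a join entry written *after* the fan-out entry closes the region, so a fan-out revisited in a loop is still reported as open when its previous iteration's join is already in the journal.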

## Per-branch reconstruction

For each outgoing edge of the open component:

1. Derive `branch_name` (label or dst, same as `_run_parallel_region`).
2. Scan `entries` for `NodeCompleted(branch_name=branch_name)`.
   The latest such entry tells us where this branch stopped.
3. If that latest entry is the branch's terminal (`next_node ==
   join_id`), the branch is **complete** — don't re-dispatch.
4. Otherwise the branch is **in-flight** — figure out where it
   resumes by applying §6.4's case (a) or (b) to that entry,
   *scoped to the branch's worktree ref* rather than the main one.

A branch with NO `NodeCompleted(branch_name=X)` entries hasn't
started yet — re-dispatch from the branch's first node.
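
The three outcomes above (complete, in-flight, not started) can be sketched as a small classifier. This is a hedged illustration: `NodeCompleted` is again a hypothetical stand-in for the journal entry, `BranchStatus` and `classify_branch` are names invented for the example, and the sketch assumes the caller passes only the entries written after the open fan-out's own `NodeCompleted`, so that an earlier loop iteration cannot leak in.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum


@dataclass(frozen=True)
class NodeCompleted:
    """Hypothetical stand-in for the journal entry; real fields may differ."""

    node_id: str
    next_node: str | None = None
    branch_name: str | None = None


class BranchStatus(Enum):
    COMPLETE = "complete"        # latest entry points at the join
    IN_FLIGHT = "in-flight"      # has entries, but none reaches the join
    NOT_STARTED = "not-started"  # no entries at all


def classify_branch(
    branch_name: str,
    join_id: str,
    region_entries: list[NodeCompleted],
) -> tuple[BranchStatus, NodeCompleted | None]:
    """Classify one branch and return the entry it stopped at, if any.

    `region_entries` is assumed to hold, in journal order, only the
    entries written after the open fan-out's own NodeCompleted.
    """
    latest = None
    for entry in region_entries:
        if entry.branch_name == branch_name:
            latest = entry  # later entries overwrite earlier ones
    if latest is None:
        return BranchStatus.NOT_STARTED, None
    if latest.next_node == join_id:
        return BranchStatus.COMPLETE, latest
    return BranchStatus.IN_FLIGHT, latest


entries = [
    NodeCompleted("a1", next_node="join", branch_name="A"),
    NodeCompleted("b1", next_node="b2", branch_name="B"),
]
for name in ("A", "B", "C"):
    status, _ = classify_branch(name, "join", entries)
    print(name, status.value)
```

Here branch A is complete, B is in-flight, and C has not started. For an in-flight branch, the returned entry is the one to which §6.4's case (a) or (b) would then be applied; that step is not shown.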

## When all branches done but no join entry

If the open-region scan finds that every branch has a terminal
`NodeCompleted(branch_name=X, next_node=join_id)` but the join
itself has no `NodeCompleted`, the engine writes the join's
`NodeCompleted` (AND-fold across branch outcomes, concatenated
captured output) and continues from the join's outgoing edge.
This is the §6.11.2 (e) sub-case: the crash happened *between*
the last branch finishing and the join entry being written.
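
As a rough illustration of the fold, the sketch below assumes each branch outcome reduces to a boolean `success` plus a captured `output` string, and that outputs are concatenated with no separator in the order the live join would see the branches. `BranchResult` and `fold_join` are hypothetical names; the real engine likely has a richer outcome type, and reusing whatever fold the live join path already performs is preferable to a second implementation.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class BranchResult:
    """Hypothetical summary of a branch's terminal NodeCompleted entry."""

    branch_name: str
    success: bool
    output: str


def fold_join(results: list[BranchResult]) -> tuple[bool, str]:
    """AND-fold branch outcomes and concatenate captured output.

    `results` is assumed to be ordered the way the live join would see
    the branches (for example, fan-out edge order).
    """
    if not results:
        raise ValueError("a join needs at least one completed branch")
    success = all(result.success for result in results)
    output = "".join(result.output for result in results)
    return success, output


print(fold_join([
    BranchResult("A", True, "lint ok\n"),
    BranchResult("B", False, "tests failed\n"),
]))
```

A single failed branch makes the folded outcome a failure, while both outputs are kept in order.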

## Tests

`tests/test_engine/test_parallel_resume.py` (new file). Tags:
`# [unit->REQ-EXEC-PARALLEL-RESUME]` on each test;
`# [int->REQ-DOD-RESUME-BETWEEN-NODE]` on the end-to-end cases.

Minimum coverage:

1. **No open region → sequential resume unchanged.** A run that
   crashed at a sequential node resumes via existing §6.4 case
   (a) or (b) without entering the new code path.
2. **Crashed mid-branch.** Build a run with a parallel region;
   manually write journal entries that simulate "branch B
   crashed after its first node committed but before its second
   node". Call `resume`; assert branch B re-enters its second
   node and completes; the join AND-fold proceeds.
3. **All branches done, join not written.** Manually write
   entries for all branches' terminal nodes; do NOT write the
   join's `NodeCompleted`. Call `resume`; assert the engine
   writes the join entry with correct AND-fold and the run
   completes.
4. **Branch hasn't started yet.** Build entries where the
   `component`'s `NodeCompleted` is written but no branch has
   any entries (crash *immediately* after fan-out). Call
   `resume`; assert all branches re-dispatch from their first
   nodes.
5. **Mixed: one branch complete, one in-flight, one not
   started.** Resume should leave the complete branch alone,
   continue the in-flight one, and dispatch the not-started one.

For tests 2-5, use a fixture helper to manually construct
journal entries on a fresh state branch — avoids needing to
crash an actual run in the middle of execution.
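
One possible shape for that helper is sketched below. It only builds entries in memory; persisting them to the state branch would go through the real journal API, which this brief does not describe. `NodeCompleted` is a hypothetical stand-in for the journal entry, and `branch_entries` / `mixed_region_entries` are names invented for the example.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class NodeCompleted:
    """Hypothetical stand-in for the journal entry; real fields may differ."""

    node_id: str
    next_node: str | None = None
    branch_name: str | None = None


def branch_entries(
    branch_name: str,
    nodes: list[str],
    join_id: str,
    completed: int,
) -> list[NodeCompleted]:
    """Build entries for the first `completed` nodes of a branch.

    The last node's `next_node` is the join, so `completed == len(nodes)`
    yields a finished branch and `completed == 0` yields no entries.
    """
    if not 0 <= completed <= len(nodes):
        raise ValueError("completed must be between 0 and len(nodes)")
    successors = [*nodes[1:], join_id]
    return [
        NodeCompleted(node, next_node=successors[i], branch_name=branch_name)
        for i, node in enumerate(nodes[:completed])
    ]


def mixed_region_entries() -> list[NodeCompleted]:
    """Scenario 5: A complete, B in-flight, C not started."""
    return [
        NodeCompleted("fan"),  # the fan-out's own entry
        *branch_entries("A", ["a1", "a2"], "join", completed=2),
        *branch_entries("B", ["b1", "b2"], "join", completed=1),
        *branch_entries("C", ["c1"], "join", completed=0),
    ]
```

Parameterising on `completed` lets tests 2-5 share one builder: each scenario differs only in how far each branch got before the simulated crash.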

## Definition of done

- The new open-region check + per-branch reconstruction in
  `engine.py` carries `# [impl->REQ-EXEC-PARALLEL-RESUME]`.
- Per-branch resume reuses the existing `_traverse_branch` and
  `_run_parallel_region` infrastructure — no duplication of
  branch dispatch logic.
- Tests carry `# [unit->REQ-EXEC-PARALLEL-RESUME]`.
- `traceable-reqs.toml` lists
  `REQ-EXEC-PARALLEL-RESUME.required_stages` as
  `["doc", "impl", "unit"]`.
- `uv run pytest && uv run ruff check src tests && uv run pyright`
  all pass.
- **All 443 existing tests must still pass**: sequential resume
  (§6.4) cases (a) / (b) / (c) and the existing parallel
  happy-path tests are unchanged.
- `uv run traceable-reqs check --json` reports
  REQ-EXEC-PARALLEL-RESUME complete on doc + impl + unit.

## References

- SPEC.md §6.11 (the entire parallel-checkpoint section — read
  fully before touching engine code)
- SPEC.md §6.11.2 (the one new rule + the (e) "all branches done,
  join not fired" sub-case)
- SPEC.md §6.4 (sequential three-case state machine the new rule
  composes with)
- `src/attractor/engine/engine.py`:
  - `Engine.resume` — the existing resume method
  - `_run_parallel_region`, `_traverse_branch`, `_find_join_for_fanout`
    (P3 — already there)
  - `_replay_counters` — existing helper for replaying visit_counts
    and latest_outcomes
- `src/attractor/engine/journal.py` — `NodeCompleted.branch_name`
  (P3 — already there)
- P3 commit 4ceadba for the existing parallel-traversal code

## Out of scope

- **`predecessor_branches` payload to merge agents** (§6.3
  extension). Still deferred to a separate chunk.
- **Per-branch `attractor run show` rendering** with in-flight
  markers — UX, not SPEC.
- **`goal_gate` keying by `(node_id, branch_name)`** — only
  relevant if the same node id appears in multiple branches,
  which is rare. §6.11.5 calls this out as deferred.

After this task lands, the v0.2 parallel-nodes implementation
is feature-complete. All five `REQ-*-PARALLEL-*` requirements
are at `[doc, impl, unit]`. The only remaining SPEC item from
§14.2 priority #2 is the §6.3 `predecessor_branches` extension
for merge agents, which is small and can ship alongside the
canonical merger workflow.
