# Attractor — Specification (v0)

> Status: draft. This is a thinking document, not a contract. Sections marked
> `TODO` are unresolved.

A local-first, DOT-graph workflow engine for unattended AI software work.

---

## 1. Vision

In dynamical systems, an *attractor* is a state a system evolves toward.
Workflows here define the attractors — completion nodes, verification gates,
human-decision points — and a run is the trajectory by which an AI coding
agent reaches them.

The framing borrows from lights-out ("dark factory") manufacturing, where a
production line keeps running without humans on the floor. An Attractor run
operates the same way: AI coding agents execute multi-stage workflows
unattended — overnight, over a weekend, across a long break — and a human
only intervenes at explicit gates. No babysitting a REPL. No reviewing a
50-file diff you don't trust. The graph defines where humans are needed;
everywhere else, the machine runs.

Attractor is **local-first**:

- Workflows execute on your machine.
- Checkpoints, history, and metadata stay in local git refs and are
  never pushed to a remote.
- No cloud sandboxes. No hosted service.
- The engine's own network footprint is just LLM provider APIs. Workflow
  scripts and agent tools can make whatever network calls the author writes
  (`cargo` fetching crates, `curl` in a tool script, `git push` or `gh`
  from a `bash` invocation) — that's their security domain, not the
  engine's.

This is a deliberate constraint on the **engine**, not on what workflows can
do. Attractor itself never phones home, never pushes its own bookkeeping
refs (everything under `refs/heads/attractor/`, see §7.2) to a remote, and
never calls a hosted service. What your workflows reach out to — including
`git push` from inside a tool or agent node — is up to you.

---

## 2. Goals and Non-Goals

### Goals (v0)

- Define multi-stage AI workflows as **versioned, diffable graph artifacts**.
- Run workflows **unattended** with deterministic checkpoint/resume.
- Route stages to the **right model** (cheap for routine work, frontier for
  hard parts) via a CSS-like stylesheet.
- Pause at **human gates**; surface decisions through an event stream that any
  UI can consume.
- Express verification (tests, linters, LLM-as-judge) as a **first-class
  graph idiom** — composed from existing primitives (`tool` + `goal_gate`
  + a `FAILURE` edge to a fix-up `agent` node), not buried inside an
  agent prompt. See §6.6 for the canonical pattern.
- Keep all run state in **local git** so existing git tooling works.

### Non-Goals (v0)

- Cloud sandboxes (Daytona, k8s, remote VMs).
- Multi-tenant or hosted operation.
- GitHub PR creation, remote pushes, team collaboration features.
- Web UI as a first deliverable. The engine is headless; UIs come later.
- A package registry of pre-built workflows.

### Explicit non-features (may revisit later)

- Daemon mode beyond a single foreground server process.
- Mid-run live editing of the graph.
- Speculative parallel execution / branching futures.

---

## 3. Inspirations and Prior Art

Two reference repos are vendored under `references/`:

- **`references/attractor`** — open NLSpecs for an Attractor-style software
  factory. Three documents: orchestrator (DOT graph), coding agent loop,
  unified LLM client.
- **`references/fabro`** — StrongDM's Rust implementation of the same idea.
  Useful as a study target, especially `fabro-checkpoint` (cleanly separated
  local git plumbing).

Attractor is a **clean-room implementation** inspired by both. We are not
forking fabro. We will copy ideas — and occasionally specific module shapes
where they're well-factored — but the codebase is ours.

Where we diverge from the references by design:

- **Local-only by default.** The upstream attractor spec is agnostic on
  this; fabro pushes to remotes. We don't.
- **Smaller surface area.** Fewer node types, no MCP/Slack on day one.
- **Single binary, single user.** No multi-tenant assumptions.

---

<!-- [doc->REQ-ARCH-THREE-LAYERS] -->
## 4. Architecture

Three layered modules with one-way dependencies. v0.1 packages them as a
single Python package `attractor`, distributed via uv (`uv tool install
attractor`) with a `console_scripts` entry point. Internal module
boundaries enforce the layer order; modules can be promoted to standalone
packages later if an out-of-tree caller appears (§12).

```
+-----------------------------------------------------+
|  Host: CLI, future TUI, future web UI               |
+-----------------------------------------------------+
                       |
                       v
+-----------------------------------------------------+
|  Layer 3: Orchestrator (DOT-graph engine)           |
|    parse graph, traverse, dispatch, checkpoint      |
+-----------------------------------------------------+
                       |
                       v
+-----------------------------------------------------+
|  Layer 2: Coding agent loop                         |
|    Claude Agent SDK session, skills, hooks, MCP     |
+-----------------------------------------------------+
                       |
                       v
+-----------------------------------------------------+
|  Layer 1: LLM transport (Claude Agent SDK +         |
|  anthropic SDK) — auth, streaming, retry            |
+-----------------------------------------------------+
                       |
                       v
+-----------------------------------------------------+
|  External: Anthropic API                            |
+-----------------------------------------------------+
```

Cross-cutting:

- **Local checkpoint store** (git-backed via `pygit2`) — used by Layer 3
  for run state and by Layer 2 for session history.
- **Event stream** — Layer 2 (via the SDK's event callbacks) and Layer 3
  emit typed events to a single host-supplied async iterator. A pub/sub
  bus is deferred until a second consumer exists.
- **Sandbox / execution environment** — v0.1 calls a `LocalExec` class
  directly (`asyncio.subprocess`). An `ExecutionEnvironment` protocol
  lands with container execution (§14).
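
As a rough illustration of the `LocalExec` idea, the sketch below runs one shell command through `asyncio.subprocess` and maps the exit code to an outcome using the tool rule from §6.2. The `ExecResult` type, the `run` method, and the `cwd` parameter are assumptions made for this example, not the engine's actual interface.

```python
import asyncio
from dataclasses import dataclass
from typing import Optional


@dataclass
class ExecResult:
    outcome: str          # "SUCCESS" or "FAILURE"
    exit_code: int
    captured_output: str  # combined stdout and stderr


class LocalExec:
    """Hypothetical local runner: one shell command per call, no sandboxing."""

    def __init__(self, cwd: Optional[str] = None) -> None:
        self.cwd = cwd  # assumed to be the run's worktree directory

    async def run(self, script: str) -> ExecResult:
        proc = await asyncio.create_subprocess_shell(
            script,
            cwd=self.cwd,
            stdout=asyncio.subprocess.PIPE,
            stderr=asyncio.subprocess.STDOUT,  # fold stderr into stdout
        )
        stdout, _ = await proc.communicate()
        code = proc.returncode if proc.returncode is not None else -1
        return ExecResult(
            outcome="SUCCESS" if code == 0 else "FAILURE",
            exit_code=code,
            captured_output=stdout.decode(errors="replace"),
        )
```

A caller would use it as `result = await LocalExec(cwd=worktree_path).run("cargo nextest run")`, where `worktree_path` is whatever directory the run executes in.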

Layer 1 and Layer 2 are thin wrappers around the [Claude Agent SDK](
https://docs.anthropic.com/en/docs/agents/agent-sdk-overview): the SDK
owns the conversation loop, tool dispatch, streaming, and ecosystem
features (skills, hooks, MCP, sub-agents). Attractor's value-add is
Layer 3 — the DOT-graph orchestrator that drives the SDK sessions per
workflow node.

---

## 5. Workflow Definition (DOT)

Workflows are Graphviz DOT files. The graph **is** the workflow: nodes are
stages, edges are transitions, attributes configure behavior.

### 5.1 Why DOT

- Workflows are directed graphs; the format should be too.
- Existing tooling (Graphviz CLI, viewers, language servers) renders DOT to
  SVG/PNG for free.
- Plain text → diffs cleanly, reviews in a PR, version controls naturally.
- A constrained subset stays predictable while remaining extensible through
  custom attributes.

<!-- [doc->REQ-DOT-PARSE-SUBSET] -->
### 5.2 Accepted subset (constraints)

- One `digraph` per file. No `strict`, no undirected, no multiple graphs.
- Directed edges only (`->`). `--` is rejected.
- Bare-identifier node IDs (`[A-Za-z_][A-Za-z0-9_]*`). Display names go in
  `label`.
- Attributes inside `[ ... ]` may be separated by commas, whitespace, or both
  (matches Graphviz). The §5.4 example uses whitespace.
- No HTML labels.
- `// line` and `/* block */` comments stripped before parsing.
- Semicolons optional.
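
Two of the constraints above are easy to show in code. The following is a minimal sketch, not the parser itself: `is_valid_node_id` applies the bare-identifier pattern, and `strip_comments` removes both comment forms while leaving double-quoted attribute values untouched (so a URL inside a `prompt` is not mistaken for a comment). Both function names are hypothetical, and the choice to replace a block comment with a single space is an assumption.

```python
import re

NODE_ID = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_valid_node_id(text: str) -> bool:
    """True when `text` is a bare identifier as required for node IDs."""
    return NODE_ID.fullmatch(text) is not None


def strip_comments(src: str) -> str:
    """Remove // and /* */ comments, leaving double-quoted strings intact."""
    out = []
    i, n = 0, len(src)
    in_string = False
    while i < n:
        ch = src[i]
        if in_string:
            out.append(ch)
            if ch == "\\" and i + 1 < n:
                out.append(src[i + 1])  # keep the escaped character as-is
                i += 2
                continue
            if ch == '"':
                in_string = False
            i += 1
        elif ch == '"':
            in_string = True
            out.append(ch)
            i += 1
        elif src.startswith("//", i):
            end = src.find("\n", i)
            i = n if end == -1 else end  # the newline itself is kept
        elif src.startswith("/*", i):
            end = src.find("*/", i + 2)
            if end == -1:
                raise ValueError("unterminated block comment")
            out.append(" ")  # a comment still separates tokens
            i = end + 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)
```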

<!-- [doc->REQ-DOT-SHAPE-HANDLERS] -->
### 5.3 Shape → handler mapping

| Shape           | Handler  | Purpose                                |
|-----------------|----------|----------------------------------------|
| `Mdiamond`      | `start`  | Entry. Exactly one per graph. No-op.    |
| `Msquare`       | `exit`   | Exit. Exactly one per graph. No-op.     |
| `box`           | `agent`  | LLM/coding agent task (default shape).  |
| `hexagon`       | `human`  | Human-in-the-loop gate. Pauses.         |
| `parallelogram` | `tool`   | External command / API call.            |
| `component`     | `fanout` | Parallel fan-out. No-op. See §6.9.      |
| `tripleoctagon` | `join`   | Parallel fan-in (AND). No-op. See §6.9. |

Explicit `type="..."` overrides the shape mapping.

Deferred shapes (see §14): `house` (sub-workflow). The `diamond` /
`conditional` shape from earlier drafts is not in v0 — under §6.1's
edge-label routing, a tool node
(`script="test -f X && exit 0 || exit 1"`) or an agent node already
covers deterministic branching. Workflows mined from prior
Attractor-style projects often route via `shape=diamond`; that idiom is
not supported here. `attractor validate` rejects any shape not in the
table above that lacks a `type=` override — there is no handler to bind to.
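
The resolution rule can be sketched as a table lookup with an override. This is an illustrative sketch only: `SHAPE_HANDLERS` mirrors the table above, while `resolve_handler` and its error message are hypothetical names and wording.

```python
SHAPE_HANDLERS = {
    "Mdiamond": "start",
    "Msquare": "exit",
    "box": "agent",
    "hexagon": "human",
    "parallelogram": "tool",
    "component": "fanout",
    "tripleoctagon": "join",
}
DEFAULT_SHAPE = "box"  # the table marks `box` as the default shape


def resolve_handler(attrs: dict) -> str:
    """Pick a handler name: explicit `type` first, then the shape table."""
    explicit = attrs.get("type")
    if explicit:
        return explicit
    shape = attrs.get("shape", DEFAULT_SHAPE)
    if shape not in SHAPE_HANDLERS:
        # Mirrors the validate-time rejection of unmapped shapes.
        raise ValueError(f"shape {shape!r} has no handler and no type= override")
    return SHAPE_HANDLERS[shape]
```

For example, `resolve_handler({"shape": "hexagon"})` yields `"human"`, and a node with no `shape` at all resolves to `"agent"`.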

### 5.4 Minimum viable example

```dot
digraph PlanImplementVerify {
    graph [
        goal = "Implement the change described in goal.md and verify it."
        model_stylesheet = "
            *       { model: claude-haiku-4-5;   reasoning_effort: low;  }
            .heavy  { model: claude-sonnet-4-6;  reasoning_effort: high; }
        "
    ]

    start [shape=Mdiamond, label="Start"]
    exit  [shape=Msquare,  label="Exit"]

    plan      [label="Plan",      prompt="Read goal.md. Write plan.md.", reasoning_effort=high]
    approve   [shape=hexagon, label="Approve plan"]
    implement [label="Implement", class=heavy, prompt="Execute plan.md."]
    test      [shape=parallelogram, label="Run tests", script="cargo nextest run"]
    judge     [label="Verify",    prompt="Did the change satisfy goal.md?", goal_gate=true]

    start -> plan -> approve
    approve -> implement [label="approve"]
    approve -> plan      [label="revise"]
    implement -> test -> judge -> exit
}
```

<!-- [doc->REQ-DOT-ATTRIBUTES-V01] -->
### 5.5 Attribute set (v0)

The engine recognizes a small fixed set of attributes. Anything else is
parsed but ignored by the engine — useful for renderer hints or notes that
don't affect execution.

**Graph attributes** (set inside `graph [...]`):

| Attribute            | Type    | Purpose                                                 |
|----------------------|---------|---------------------------------------------------------|
| `goal`               | string  | Workflow's overall goal. Provided to agent nodes as a system-message header preceding their per-node `prompt`. Other handlers ignore it. |
| `model_stylesheet`   | string  | Model routing CSS (§10).                                |
| `default_max_visits` | integer | Fallback for nodes without explicit `max_visits` (§6.3).|
| `arguments`          | string  | Positional run-argument declarations (§6.8).             |
| `default_read_only`  | bool    | Per-workflow default for node `read_only` (§6.7). Defaults `false`.|

**Node attributes (universal — apply to any handler):**

| Attribute       | Type    | Purpose                                                    |
|-----------------|---------|------------------------------------------------------------|
| `label`         | string  | Display name (defaults to node ID).                        |
| `shape`         | string  | Graphviz shape; selects handler (§5.3).                    |
| `type`          | string  | Handler override; takes precedence over `shape`.           |
| `class`         | string  | Stylesheet matching token (§10).                           |
| `goal_gate`     | bool    | Must reach `SUCCESS` before `exit` is allowed (§6.2).      |
| `max_visits`    | integer | Loop bound (§6.3).                                         |
| `read_only`     | bool    | Node does not mutate the run tree (§6.7). Defaults to the graph's `default_read_only` (else `false`). Only meaningful on agent/tool nodes. |

**Node attributes (handler-specific):**

<!-- [doc->REQ-AGENT-NODE-SCOPE] -->
| Attribute          | Handler              | Purpose                            |
|--------------------|----------------------|------------------------------------|
| `prompt`           | agent, human         | Agent: LLM instruction. Human: question or context shown at the gate. |
| `reasoning_effort` | agent                | Model effort hint (low/med/high).  |
| `model`            | agent                | Overrides stylesheet (§10).        |
| `script`           | tool (parallelogram) | Shell command to run.              |
| `allowed_tools`    | agent                | Comma-separated tool patterns (e.g. `"Bash(git merge:*), Read, Edit"`). When set, **replaces** the inherited host-scope tool list for this node; `report_outcome` is always prepended. `None` (attribute absent) = inherit host scope unchanged (v0.1 default). |
| `mcp_servers`      | agent                | Comma-separated MCP server names to include beyond the always-present `attractor` server. Reserved for future use; no other servers are registered in v0.1. |
| `hooks`            | agent                | *(deferred — see §15)* Hook bundle names. Requires a hook-bundle registry (not yet designed); attribute is parsed and rejected on non-agent nodes but has no runtime effect in v0.1. |

Placing `allowed_tools`, `mcp_servers`, or `hooks` on a non-agent node (tool, human-gate, fanout, join, start, exit) is a **validation error** — the engine cannot route these attributes to a non-agent handler, so the mistake is caught at `validate` time rather than silently ignored. `read_only` is valid on agent and tool nodes only; placing it elsewhere is a validation error.
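
A placement check of this kind might look like the sketch below. The function name `placement_errors` and the message wording are assumptions for illustration; only the rules themselves come from the paragraph above.

```python
AGENT_ONLY = {"allowed_tools", "mcp_servers", "hooks"}
READ_ONLY_HANDLERS = {"agent", "tool"}


def placement_errors(node_id: str, handler: str, attrs: dict) -> list:
    """Return validation messages for attributes placed on the wrong handler."""
    errors = []
    if handler != "agent":
        for name in sorted(AGENT_ONLY.intersection(attrs)):
            errors.append(
                f"{node_id}: `{name}` is only valid on agent nodes "
                f"(handler is {handler})"
            )
    if "read_only" in attrs and handler not in READ_ONLY_HANDLERS:
        errors.append(
            f"{node_id}: `read_only` is only valid on agent and tool nodes "
            f"(handler is {handler})"
        )
    return errors
```

An empty list means the node passes this particular check; a validator would collect the messages across all nodes and report them together.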

**Edge attributes:**

| Attribute | Type   | Purpose                                                                   |
|-----------|--------|---------------------------------------------------------------------------|
| `label`   | string | Routing token (§6.1). For tool/agent edges, matches the node's outcome (`SUCCESS`, `FAILURE`, `PARTIAL_SUCCESS`, or `NO_OP`). For `hexagon` (human) edges, matches the user's submitted choice. |

This is the entire v0 attribute surface. Adding to it is a spec change.

---

## 6. Execution Model

<!-- [doc->REQ-EXEC-NODE-LIFECYCLE] -->
<!-- [doc->REQ-EXEC-CONFIG-CASCADE] -->
<!-- [doc->REQ-EXEC-EDGE-ROUTING] -->
### 6.1 Stages and transitions

The engine traverses the graph from `start`. At each node:

1. Resolve handler (by `type` or shape).
2. Resolve effective config: graph defaults → stylesheet (§10) → per-node
   attributes (which override the stylesheet).
3. Execute handler. Emit events. Produce an outcome (`SUCCESS`,
   `FAILURE`, `PARTIAL_SUCCESS`, or `NO_OP`; see §6.2).
4. Checkpoint.
5. Pick the next edge by matching the outcome (or, for human gates, the
   user's submitted choice) against edge `label`.

Edge `label` is the single routing token. For tool/agent nodes, it matches
the outcome string. For human gates, it matches the choice the user
selected. An unlabeled edge is the default — taken when no labeled edge
matches.

When a node reports `FAILURE`, the engine requires an explicit
`label="FAILURE"` edge to continue. If no such edge exists, the run
ends with status `incomplete`. The unlabeled-default rule applies only
to success-like outcomes (`SUCCESS`, `PARTIAL_SUCCESS`, `NO_OP`) and to
human-gate choices. To allow a node to continue past failure, route the
FAILURE edge explicitly.

This is a runtime check, not a structural one — `attractor validate`
does not flag a missing FAILURE edge. Terminating on failure (run ends
`incomplete`, `captured_output` preserved on the worktree branch for
inspection) is a legitimate workflow design: not every author wants
fix-up loops everywhere.
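
The label-matching rules in this subsection can be sketched as follows. This is a simplified illustration that covers only the status and human-choice steps; the `Edge` type and `route_by_token` name are hypothetical, and returning `None` when nothing matches is an assumption (the text above only specifies the `FAILURE` case, where the run ends `incomplete`).

```python
from dataclasses import dataclass
from typing import Optional, Sequence

SUCCESS_LIKE = {"SUCCESS", "PARTIAL_SUCCESS", "NO_OP"}


@dataclass(frozen=True)
class Edge:
    target: str
    label: Optional[str] = None  # None means an unlabeled (default) edge


def route_by_token(
    token: str, edges: Sequence[Edge], is_human_gate: bool = False
) -> Optional[Edge]:
    """Pick the outgoing edge for an outcome or a human-gate choice."""
    # A labeled edge that matches the token always wins.
    for edge in edges:
        if edge.label == token:
            return edge
    # The unlabeled default applies to success-like outcomes and to
    # human-gate choices, never to FAILURE.
    if is_human_gate or token in SUCCESS_LIKE:
        for edge in edges:
            if edge.label is None:
                return edge
    return None
```

With edges `[Edge("next")]`, a `FAILURE` outcome returns `None`, while `NO_OP` follows the unlabeled edge to `next`.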

<!-- [doc->REQ-EXEC-OUTCOME-PREFERRED-LABEL] -->
<!-- [doc->REQ-EXEC-OUTCOME-SUGGESTED-IDS] -->
**Routing hints (`preferred_label`, `suggested_next_ids`).** A
handler may emit a `preferred_label` alongside its status: a
free-form string the engine treats as a routing hint. When set and
an outgoing edge has a matching `label=`, that edge wins regardless
of status. This lets a handler steer the engine without changing its
status (e.g., a tool node classifying its FAILURE as an env-error can
set `preferred_label="env-error"` and route around the generic fixup
loop).

A handler may also emit `suggested_next_ids: tuple[str, ...]`, an
ordered sequence of candidate downstream node IDs. The edge selector
picks the first unconditional edge whose target appears in the
sequence (list order, not source order). Its priority sits below
`preferred_label` and above status-based routing. Useful for handlers
that classify their outcome into one of N targets without needing a
labeled edge per target.

If a hint is set but doesn't match any edge, the engine falls through
to status-based routing: hints are advisory, status is the floor.
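
The hint step could be sketched like this. It is a minimal illustration under two assumptions: "unconditional" is read as "carries no `condition=` attribute", and edges that do carry a condition are skipped here because they are handled only in the condition step. `HintEdge` and `route_by_hints` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class HintEdge:
    target: str
    label: Optional[str] = None
    condition: Optional[str] = None  # raw `condition=` text, if any


def route_by_hints(
    edges: Sequence[HintEdge],
    preferred_label: Optional[str] = None,
    suggested_next_ids: Sequence[str] = (),
) -> Optional[HintEdge]:
    """Return the edge chosen by hints, or None to fall through to status."""
    plain = [e for e in edges if e.condition is None]
    # 1. preferred_label beats suggested_next_ids.
    if preferred_label:
        for edge in plain:
            if edge.label == preferred_label:
                return edge
    # 2. Walk the suggestions in list order, not edge source order.
    for node_id in suggested_next_ids:
        for edge in plain:
            if edge.target == node_id:
                return edge
    return None  # hints are advisory; the caller routes by status
```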

<!-- [doc->REQ-EXEC-EDGE-CONDITION-DSL] -->
**Edge `condition=` boolean expressions.** An outgoing edge may
carry a `condition` attribute holding an AND-combined boolean
expression. When non-empty and evaluating true, the edge wins *over*
status-based routing — it's checked before `preferred_label`,
before status-name match, and before unlabeled-default. Grammar
(upstream attractor-spec §10):

```
ConditionExpr ::= Clause ( '&&' Clause )*
Clause        ::= Key Op Literal
Key           ::= 'outcome' | 'preferred_label' | 'context.' Path
Op            ::= '=' | '!='
Literal       ::= '"' [^"]* '"' | BareLiteral
```

Example:

```
verify -> deploy [condition="outcome=SUCCESS && preferred_label!=draft"]
verify -> recover [condition="outcome=FAILURE"]
```

`outcome` compares against the StrEnum value verbatim (`"SUCCESS"` /
`"FAILURE"` / `"PARTIAL_SUCCESS"` / `"NO_OP"`; uppercase, matching the
`label=` strings authors already use). `context.*` is reserved: it
parses but always evaluates to the empty string in v0.2 (a future slice
may expose latest_outcomes / visit_counts / graph attributes through
this namespace). Bad syntax surfaces as a validation error at
`attractor validate` time. Condition-bearing edges are considered ONLY
in the condition step; if their condition doesn't match, they don't
fall back into the label-based steps. This avoids accidentally
re-matching via a stale `label=`.
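
A small parser and evaluator for this grammar might look like the sketch below. The grammar above does not define `Path` or `BareLiteral`, so the character classes used for them here are assumptions, as are the function names `parse_condition` and `evaluate`.

```python
import re

_KEY = r"(?P<key>outcome|preferred_label|context\.[A-Za-z0-9_.]+)"
_OP = r"(?P<op>!=|=)"
_LIT = r'(?P<lit>"[^"]*"|[^\s"&=!]+)'  # quoted literal or assumed bare literal
_CLAUSE = re.compile(rf"\s*{_KEY}\s*{_OP}\s*{_LIT}\s*")


def parse_condition(expr: str) -> list:
    """Parse `Clause ('&&' Clause)*` into (key, op, literal) tuples."""
    clauses, pos = [], 0
    while True:
        m = _CLAUSE.match(expr, pos)
        if m is None:
            raise ValueError(f"bad condition near offset {pos}: {expr!r}")
        lit = m["lit"]
        if lit.startswith('"'):
            lit = lit[1:-1]  # drop the surrounding quotes
        clauses.append((m["key"], m["op"], lit))
        pos = m.end()
        if pos == len(expr):
            return clauses
        if not expr.startswith("&&", pos):
            raise ValueError(f"expected '&&' at offset {pos}: {expr!r}")
        pos += 2


def evaluate(clauses: list, outcome: str, preferred_label: str = "") -> bool:
    """AND all clauses together; `context.*` keys read as the empty string."""
    for key, op, lit in clauses:
        if key == "outcome":
            actual = outcome
        elif key == "preferred_label":
            actual = preferred_label or ""
        else:
            actual = ""  # reserved namespace
        if (actual == lit) != (op == "="):
            return False
    return True
```

A validator would call `parse_condition` on every non-empty `condition=` value and report the `ValueError` text; the engine would call `evaluate` at routing time.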

<!-- [doc->REQ-EXEC-OUTCOMES-DYNAMIC] -->
<!-- [doc->REQ-EXEC-OUTCOMES-STRUCTURAL] -->
<!-- [doc->REQ-EXEC-GOAL-GATES] -->
<!-- [doc->REQ-EXEC-OUTCOME-STATUSES] -->
<!-- [doc->REQ-EXEC-OUTCOME-NO-OP] -->
### 6.2 Outcomes and goal gates

- Nodes report a typed `OutcomeStatus` outcome — `SUCCESS`,
  `FAILURE`, `PARTIAL_SUCCESS`, or `NO_OP` — and the engine routes via
  the matching edge label (§6.1). `PARTIAL_SUCCESS` means "the handler
  produced useful work but did not fully complete its goal"; it
  satisfies `goal_gate=true` like `SUCCESS` does (upstream
  attractor-spec §3.4) and routes like `SUCCESS` by default (prefers
  `label="PARTIAL_SUCCESS"`, falls back to `label="SUCCESS"` →
  unlabeled). `NO_OP` means "doing nothing is the correct result"
  (for example: the issue is already fixed, cannot be reproduced, is
  missing required context, contradicts product behavior, or needs
  human clarification). It satisfies `goal_gate=true` and routes via
  `label="NO_OP"` before the success-like unlabeled fallback. Per-handler
  rules:
  - **Tool** (`parallelogram`): `SUCCESS` if the script exits 0,
    `FAILURE` otherwise. Tools do not emit `PARTIAL_SUCCESS` or
    `NO_OP`.
  - **Agent** (`box`): the Layer-2 agent loop drives the model until
    one of:
    - (a) the model returns a final assistant message with no pending
      tool calls — outcome `SUCCESS`, unless the session made **zero**
      tool calls *and* the final message has non-empty text, in which
      case the runner classifies it as `PARTIAL_SUCCESS` with
      `preferred_label="no-progress"` (the exported
      `NO_PROGRESS_LABEL` constant). The agent ended cleanly with
      explanatory text but did no observable work — a workflow
      author who wants the engine to act on that signal wires
      `agent_node -> handler [label="no-progress"]`; the default
      behavior is unchanged because `PARTIAL_SUCCESS` routes like
      `SUCCESS` (§6.1). <!-- [doc->REQ-EXEC-OUTCOME-NO-PROGRESS] -->
      `report_outcome` (path b) and terminal-error (paths c/d) take
      priority over the heuristic — explicit agent declaration always
      wins;
    - (b) the agent invokes the `report_outcome(outcome, reason)` tool
      (§8.2) — outcome is the argument (`SUCCESS`, `FAILURE`, or
      `NO_OP`), `reason` becomes the node's `captured_output` for
      downstream labeled-edge payloads (§6.3);
    - (c) the per-node turn limit (default 200) or timeout (default 30
      minutes) is hit — outcome `FAILURE`, `captured_output` describes
      which limit fired;
    - (d) a provider error survives Layer-1 retry (auth, invalid
      model, persistent rate-limit, etc.) — outcome `FAILURE`,
      `captured_output` is the provider's error message.
  - **Start** (`Mdiamond`): always `SUCCESS`. Routes via its single
    outgoing edge.
  - **Exit** (`Msquare`): terminal. Reaching `exit` ends the run; the
    engine does not pick an outgoing edge. Run status is `completed`
    if all `goal_gate` nodes are satisfied (see below), `incomplete`
    otherwise.
  - **Human** (`hexagon`): the outcome is the user's submitted choice,
    matched against edge labels per §11.
  - **Conditional** (`diamond`): always `SUCCESS`. Pure routing node;
    no handler-side work. The edge selector picks the next node via
    the standard priority (condition= edges → preferred_label →
    status → unlabeled). `prompt`, `script`, and `goal_gate` are
    rejected at validation since the node does no work. Upstream
    attractor-spec §2.8. <!-- [doc->REQ-DOT-CONDITIONAL-SHAPE] -->
- Nodes marked `goal_gate=true` must reach `SUCCESS`,
  `PARTIAL_SUCCESS`, or `NO_OP` before the engine accepts an `exit`
  transition.
  Satisfaction is checked against the node's *latest* visit — a node
  that reached `SUCCESS` once but `FAILURE` later does not satisfy
  the gate.
- <!-- [doc->REQ-EXEC-RETRY-TARGET] -->
  When `exit` is reached with unmet gates, the engine checks for a
  `retry_target` (then `fallback_retry_target`) attribute on the
  first unmet gate, then the graph attributes of the same name. If
  any references an existing node, traversal jumps to that node and
  resumes instead of finalizing INCOMPLETE. Cycle protection is the
  existing `max_visits` cap on each node — a retry chain that never
  satisfies the gate eventually exhausts visits and the run ends
  incomplete via that path. Coexists with the FAILURE-edge fixup
  pattern (§6.6); authors choose which model suits their case.
- If `exit` is reached with unmet goal gates and no `retry_target`
  resolves to an existing node, the run ends with status `incomplete`.
  To get retry behavior without `retry_target`, model it in the graph:
  route failures to a fix-up node and loop back to the verifier (see §6.6).
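
The latest-visit rule for goal gates can be sketched as below. This is an illustrative simplification: it ignores `retry_target` entirely, treats a gate that was never visited as unmet (an assumption), and uses hypothetical names (`unmet_goal_gates`, `run_status_at_exit`) with the visit history modelled as `(node_id, outcome)` pairs in execution order.

```python
SATISFYING = {"SUCCESS", "PARTIAL_SUCCESS", "NO_OP"}


def unmet_goal_gates(gate_ids, history):
    """Return gate nodes whose latest visit did not satisfy the gate."""
    latest = {}
    for node_id, outcome in history:
        latest[node_id] = outcome  # later visits overwrite earlier ones
    return [g for g in gate_ids if latest.get(g) not in SATISFYING]


def run_status_at_exit(gate_ids, history):
    """Status when `exit` is reached, before any retry_target handling."""
    return "incomplete" if unmet_goal_gates(gate_ids, history) else "completed"
```

For instance, a `verify` gate that reached `SUCCESS` and then `FAILURE` on a later visit is reported as unmet.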

<!-- [doc->REQ-EXEC-MAX-VISITS] -->
<!-- [doc->REQ-EXEC-WORKTREE-PERSIST] -->
<!-- [doc->REQ-EXEC-REVISIT-CONTEXT] -->
### 6.3 Loop control

A node that fails routes to its `label="FAILURE"` edge (per §6.1). To
retry the same node, the workflow author models a loop in the graph —
typically a fix-up node that loops back (see §6.6 for the canonical
verify/fixup idiom). To bound loops:

- `max_visits=N` caps the total number of times the engine can enter a
  node across the run. Counts every entry, including loops back via
  fix-up nodes. Inherits from graph `default_max_visits` if unset.
- When a node would be entered for the (N+1)th time, the engine treats
  that as an unrecoverable failure and ends the run with status
  `incomplete`.

For *transient* failures (network blip, file lock), repetition belongs
in the script itself (`for i in 1 2 3; do cmd && break; done`), not in
the engine. The engine's retry surface is only the graph-level loop.

**Worktree state across visits.** The engine does **not** revert the
worktree (§6.7) between visits to a node, nor when looping back via the
graph. Every entry sees the filesystem as the previous one left it.
Reasoning:

- For agent nodes, the partial work from a failed visit is information,
  not garbage. The agent can read what it produced last time, see why it
  failed, and adjust. Forcing a clean slate would throw away signal.
- For the §6.6 verify/fixup idiom, no-revert is required for correctness:
  reverting before re-running `verify` after `fixup` edited code would
  undo fixup's work.
- For tool nodes that genuinely need a clean state per visit, the script
  itself is the right place to express that (`rm -rf target/ && cargo
  test`). The worktree is the workflow author's domain.

Agent visits after the first receive a fixed context payload describing
the prior visit: `{visit_n, prior_outcome, captured_output}`. v0 does not
include diffs, tool-call summaries, or other richer payload — those are
deferred until a workflow demonstrates need.

**Entry-edge context.** When an agent node enters via a labeled
edge — either a tool/agent `FAILURE`/`NO_OP` edge or a human-gate
choice edge like `revise` — the payload additionally includes
`predecessor_node` and `entry_edge_label`. For FAILURE or NO_OP edges
from tool or agent predecessors it also includes `predecessor_outcome`
and `captured_output` (this is what the §6.6 fix-up pattern relies on
for `fixup` to read `verify`'s captured output, and what no-op report
nodes use to preserve the reason/evidence). For human-gate
edges, if the host attached a `reason` (§11), it appears as
`predecessor_reason`. For entry from a `tripleoctagon` (v0.2),
the payload includes a structured `predecessor_branches` list —
see §6.10.3.

**Worked example: `max_visits` and the verify/fixup loop.** With
`max_visits=5` on `verify` in the §6.6 pattern, verify executes up to
5 times. After verify's 5th `FAILURE`, the engine routes to `fixup`,
fixup runs and routes back to verify — but verify's would-be 6th
entry is treated as an unrecoverable failure and the run ends
`incomplete`. So `max_visits=N` on a verify-style node bounds the
loop at **N verify executions and N fixup executions**; the final
fixup's changes are never re-verified. Size the cap accordingly.
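
The walk-through above can be reproduced with a toy simulation. This sketch models only the two-node loop and the `max_visits` check on `verify`; the function name and the `verify_passes_on` parameter (the 1-based visit on which verify succeeds, or `None` for never) are hypothetical.

```python
from collections import Counter


def simulate_verify_fixup(max_visits, verify_passes_on=None):
    """Return (run_status, executions per node) for the verify/fixup loop."""
    visits = Counter()
    node = "verify"
    while True:
        if node == "verify":
            if visits["verify"] >= max_visits:
                # The would-be (N+1)th entry ends the run.
                return "incomplete", dict(visits)
            visits["verify"] += 1
            if visits["verify"] == verify_passes_on:
                return "completed", dict(visits)
            node = "fixup"  # FAILURE edge
        else:
            visits["fixup"] += 1
            node = "verify"  # fixup always loops back


status, counts = simulate_verify_fixup(max_visits=5)
print(status, counts)
```

Calling it with `verify_passes_on=3` shows the loop ending early with status `completed`, which is a quick way to sanity-check a cap before running a real workflow.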

<!-- [doc->REQ-CHECKPOINT-DUAL-COMMIT] -->
### 6.4 Checkpoint and resume

After each node completes, the engine writes two commits:

- A **state checkpoint** to `refs/.../run/<id>/state` (§7.2) recording
  the engine's position, the node's outcome, and event metadata.
- A **worktree commit** on `refs/.../run/<id>/worktree` (§7.2) capturing
  any file changes the node produced. Empty diffs produce no commit;
  trailers identify the run-id and node (§7.5). This is the durability
  story: file changes only survive across runs and into the worktree
  branch via these commits — without them, removing the worktree
  directory on success (§6.7) would lose the work.

Together the two commits make the node boundary atomic from the user's
perspective.

A run is identified by a UUID. Resume: `attractor run resume <id>`.
Resume reads the latest state checkpoint, restores engine state, and
continues from the next pending node — running it against the worktree
(§6.7) as it stands. If the engine died mid-node, the in-progress node
is re-run; per §6.3 the worktree is not reverted, so the handler sees
whatever partial state the interrupted attempt left behind. Any file
changes the failed attempt made are uncommitted on the worktree branch
— they'll be committed when a node successfully completes.

**Resume state machine.** State commits are written ONLY after a
handler returns; the engine never records a "node started, not yet
completed" entry as a separate journal write. Resume therefore
distinguishes three cases deterministically from `(journal tail,
worktree-branch HEAD)`:

- **(a) Idle between nodes.** Latest journal entry is `NodeCompleted`
  with a chosen `next_node`. Resume re-enters `next_node` — same code
  path as a fresh start, just from a non-start position.
- **(b) Mid-node interruption.** The worktree branch's HEAD differs
  from the journal's recorded `worktree_commit_after`. The previous
  attempt completed its handler but crashed before the dual-commit
  pair was written. Resume re-enters the same node; the worktree is
  NOT reverted (§6.3), and the handler sees the partial state from
  the failed attempt. Per §16, the resumed result may diverge from
  an uninterrupted run.
- **(c) Paused at gate.** Latest entry is `PausedAtGate`. Resume
  returns `Paused` again unless a `GateResponded` entry follows, in
  which case the response is consumed and traversal continues along
  the matching edge.

Operators can predict resume behaviour by inspecting
`git log <state-branch>` + `git rev-parse <worktree-branch>`.
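
One way to picture the three cases is as a pure function of the journal tail and the worktree branch HEAD. The sketch below is an interpretation, not the engine's code: the `JournalEntry` shape and the returned action names are hypothetical, and case (b) is read as re-entering the journal's `next_node`, since state commits are written only after a handler returns.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class JournalEntry:
    kind: str  # "NodeCompleted", "PausedAtGate", or "GateResponded"
    node: str
    next_node: Optional[str] = None
    worktree_commit_after: Optional[str] = None
    choice: Optional[str] = None


def classify_resume(tail: JournalEntry, worktree_head: str):
    """Map (journal tail, worktree HEAD) to a resume action and its argument."""
    if tail.kind == "PausedAtGate":
        return ("paused", tail.node)            # case (c), no response yet
    if tail.kind == "GateResponded":
        return ("follow_choice", tail.choice)   # case (c), response recorded
    if tail.kind == "NodeCompleted":
        if worktree_head != tail.worktree_commit_after:
            # Case (b): the worktree moved past the last recorded state.
            return ("reenter_with_partial_state", tail.next_node)
        return ("enter_next", tail.next_node)   # case (a)
    raise ValueError(f"unknown journal entry kind: {tail.kind!r}")
```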

<!-- [doc->REQ-EXEC-CONCURRENCY] -->
### 6.5 Concurrency

Sequential traversal (§6.1–§6.6) executes one tool/agent/human node
at a time within a workflow. v0.2 adds the structural `component` /
`tripleoctagon` shapes (§6.9), which fan the active frontier out
and back in without changing the per-node lifecycle.

Multiple runs may execute simultaneously in the same repo, each in its
own worktree (§6.7). They share the underlying `.git` directory; git's
own locking handles contention.

**Runtime model.** The engine is built on `asyncio`. Layer 1 streaming
LLM responses (§9.1) and Layer 2 event streams (§8.1) are async-native;
sync would force a thread-per-stream just to read tokens. `pygit2` is
sync and stays that way — checkpoint calls are wrapped in
`asyncio.to_thread` at the boundary. This keeps the `checkpoint`
package runtime-agnostic (§12): the package exposes sync APIs and the
engine decides how to dispatch them. Future parallel handlers (§14)
get fan-out for free via `asyncio.gather` / `asyncio.TaskGroup`,
without reshaping the runtime.
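
The sync-at-the-core, async-at-the-boundary split can be sketched with `asyncio.to_thread`. In this illustration `write_checkpoint_sync` is a hypothetical stand-in for a blocking `pygit2`-backed write (it just sleeps), and the heartbeat task exists only to show that the event loop keeps running while the blocking call is in flight.

```python
import asyncio
import time


def write_checkpoint_sync(run_id: str, node_id: str) -> str:
    """Stand-in for a blocking, sync checkpoint write (hypothetical)."""
    time.sleep(0.05)  # simulate disk and git work
    return f"{run_id}:{node_id}"


async def write_checkpoint(run_id: str, node_id: str) -> str:
    """Async boundary: run the sync call in a worker thread."""
    return await asyncio.to_thread(write_checkpoint_sync, run_id, node_id)


async def demo():
    ticks = 0

    async def heartbeat():
        nonlocal ticks
        while True:
            await asyncio.sleep(0.01)
            ticks += 1  # advances only if the loop is not blocked

    hb = asyncio.create_task(heartbeat())
    result = await write_checkpoint("run-1", "plan")
    hb.cancel()
    return result, ticks


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Keeping the sync function free of any `asyncio` import is what lets a checkpoint package stay runtime-agnostic; only the thin wrapper knows about the event loop.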

<!-- [doc->REQ-EXEC-VERIFY-FIXUP-IDIOM] -->
### 6.6 Verification idiom

Verification (tests, linters, LLM-as-judge) is a graph pattern composed
from existing primitives, not a dedicated handler. The handler set stays
small; the graph is the verification language.

The pattern:

1. A `tool` node (`parallelogram`) runs the check — `cargo nextest`,
   `clippy`, a doc-builder, or a judge LLM invoked via `bash`.
2. `goal_gate=true` on that node prevents the engine from accepting an
   `exit` transition until the node has reached `SUCCESS`.
3. Two edges leave the verify node: one labeled `SUCCESS` to the next
   stage (or `exit`), one labeled `FAILURE` to a fix-up `agent` node
   prompted to read the failure output and patch the code.
4. The fix-up node has an outgoing edge back to the verify node.
5. `max_visits` on the verify node bounds the loop so failures don't
   churn forever (§6.3).

```dot
verify [
    shape=parallelogram,
    script="cargo nextest run && cargo clippy -- -D warnings",
    goal_gate=true,
    max_visits=3
]
fixup [
    prompt="The verify step failed. Read the build output and fix all issues."
]

implement -> verify
verify    -> exit  [label="SUCCESS"]
verify    -> fixup [label="FAILURE"]
fixup     -> verify
```

A workflow may use this pattern multiple times — a "preflight" verify
before implementation, a "final" verify before exit, and so on. Each is
just another tool node with its own gate and fix-up loop.

<!-- [doc->REQ-EXEC-RUN-ISOLATION] -->
### 6.7 Run isolation

Each run executes in its own git worktree, never the user's working tree.
At run start the engine creates a worktree (location TBD — likely under
`.attractor/worktrees/<run-id>/`) checked out to whatever ref the run
was launched from. All agent file operations and tool scripts execute
inside that worktree.

This gives:

- **Safety.** A run launched at 2am cannot collide with edits the user
  makes in the morning.
- **Concurrency.** Multiple runs can execute simultaneously without
  stepping on each other (subject to the v0 limit in §6.5).
- **Clean teardown.** When a run completes, its worktree can be removed
  or retained for inspection; the user's working tree is untouched
  either way.

<!-- [doc->REQ-EXEC-READONLY-NODE] -->
**Conditional isolation (`read_only`).** Isolation is only load-bearing
when a node *mutates* the tree. A node may declare `read_only=true`
(per-node, defaulting to the graph's `default_read_only`, itself
`false`) to say "I only read." Effects:

- **Agent nodes are enforced read-only:** when a `read_only` agent node
  has no explicit `allowed_tools`, the engine restricts pi's tool
  allowlist to the non-mutating set (`read`, `grep`, `find`, `ls`, plus
  the always-present `report_outcome`), dropping `write`/`edit`/`bash`.
  The agent *cannot* mutate the run tree — real enforcement, not a
  hint. An explicit `allowed_tools` overrides this.
- **Parallel branches share the tree (share only what we can
  enforce):** a branch whose nodes are *all* enforced read-only agents
  reads the parent run worktree (the trigger-time snapshot)
  concurrently instead of getting its own isolated sub-worktree, so an
  agent search/read fan-out creates no extra checkouts. A branch
  containing any mutating node — or any tool/shell node, which can't be
  enforced read-only — keeps its isolated sub-worktree (§6.10). The
  classification is static (the engine knows each branch's node set),
  so fresh-start and resume agree. Merging mutating branches back
  remains an explicit graph step (§6.10.3); the join never merges.
  Race-free: because every mutating branch is isolated, nothing writes
  the shared snapshot during the region.
- **Sequential nodes are unaffected for correctness:** they share the
  single run worktree as always, and an empty diff produces no commit
  (§6.4), so a read-only sequential node already costs nothing.

Why a graph-level default rather than a global one: a global
`read_only=true` default would tool-restrict *every* writing agent and
break existing mutating workflows. The default lives on
`default_read_only` (false) so a search/triage workflow opts in
wholesale while implement-style workflows are unaffected; node-level
`read_only` overrides the graph default either way.
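
The resolution order described above can be sketched as follows. The dictionary representation of node and graph attributes, the `kind` key, and the function names are hypothetical. Treating an agent with an explicit `allowed_tools` as *not* enforced read-only is also an assumption, made because the override can re-admit mutating tools.

```python
from __future__ import annotations

# Non-mutating pi tools named in the text, plus the always-present report_outcome.
READ_ONLY_TOOLS = ("read", "grep", "find", "ls", "report_outcome")


def effective_read_only(node_attrs: dict, graph_attrs: dict) -> bool:
    """Node-level read_only wins; otherwise use default_read_only (itself false)."""
    if "read_only" in node_attrs:
        return bool(node_attrs["read_only"])
    return bool(graph_attrs.get("default_read_only", False))


def tool_allowlist(node_attrs: dict, graph_attrs: dict) -> tuple[str, ...] | None:
    """Return the allowlist to hand to pi, or None to leave pi's defaults alone."""
    explicit = node_attrs.get("allowed_tools")
    if explicit is not None:
        return tuple(explicit)          # an explicit allowed_tools overrides
    if effective_read_only(node_attrs, graph_attrs):
        return READ_ONLY_TOOLS
    return None


def branch_shares_parent_tree(branch_nodes: list[dict], graph_attrs: dict) -> bool:
    """True only if every node is an agent the engine can enforce read-only."""
    return bool(branch_nodes) and all(
        node.get("kind") == "agent"               # tool/shell nodes never qualify
        and node.get("allowed_tools") is None     # assumed: override = not enforced
        and effective_read_only(node, graph_attrs)
        for node in branch_nodes
    )
```

Because the inputs are all static graph attributes, the same classification falls out on a fresh start and on resume.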

**Worktree lifecycle.** On successful `exit` (run status `completed`),
the engine removes the worktree directory (`git worktree remove`). The
underlying branch ref (§7.2) is retained — it's the durable record of
all file changes the run made, and `git`'s content-addressing makes it
essentially free to keep. You can `git diff main..<ref>` to review or
merge it back at your leisure.

On failure, abort, or `incomplete` exit, both the worktree directory
and the branch ref are retained for inspection. To clean up manually,
run `git worktree remove .attractor/worktrees/<run-id>` and
`git branch -D attractor/run/<run-id>/worktree`. A bulk `prune`
command is deferred (§14).

<!-- [doc->REQ-EXEC-LOCAL-ENV] -->
<!-- [doc->REQ-EXEC-TOOL-CAPTURE] -->
<!-- [doc->REQ-EXEC-INPUT-COPY] -->
<!-- [doc->REQ-EXEC-RUN-ARGUMENTS] -->
### 6.8 Tool and agent execution

Both `tool` (parallelogram) and `agent` (box) nodes execute inside the
run's worktree (§6.7).

- **Working directory.** Worktree root.
- **Environment.** Inherited from the parent process. The engine
  removes provider API keys (§9.4) before invoking, but otherwise
  passes variables through (`PATH`, `HOME`, `USER`, etc.) so
  `~`-expansion and user tooling (`gh`, `cargo`, `git`) work without
  extra wiring. This is also what lets agents reach standard skill
  locations (`./.claude/skills/`, `~/.claude/skills/`, etc.) if a
  workflow points them there.
- **Shell (tool nodes).** `sh -c "<script>"` on Unix-like systems,
  `pwsh -NoProfile -Command "<script>"` on Windows. The script string
  is passed verbatim; multi-statement scripts use shell-native
  operators (`&&`, `;`, pipelines). Dispatch via
  `asyncio.create_subprocess_exec`.
- **stdout/stderr capture (tool nodes).** Combined and captured. The
  first 256 KiB and last 256 KiB are retained; if total output
  exceeds 512 KiB, the middle is replaced with
  `[...elided N bytes...]`.
- **Default timeout (tool nodes).** 30 minutes. v0 has no per-node
  override; long-running scripts wrap themselves. When the timeout
  fires, the engine signals the script (per the signal rule below)
  and the node reports `FAILURE` with `captured_output` indicating
  the timeout (e.g. `[timeout: 30m exceeded]`) plus any output
  captured before the kill.
- **Signals.** Ctrl-C on the engine sends SIGINT (Unix) or a Ctrl-C
  event (Windows) to the running tool, waits up to 5 seconds, then
  force-kills it (SIGKILL on Unix). The timeout path above follows
  the same escalation. On Windows the engine additionally invokes
  `taskkill /F /T` to terminate the entire process tree:
  `asyncio.subprocess.Process.terminate` only signals the immediate
  child, and tool scripts commonly spawn grandchildren that would
  otherwise orphan and hold inherited pipes open.
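
The head/tail capture rule can be sketched in a few lines. The function name `elide_middle` is hypothetical, and reading `N` as the number of dropped bytes, with the marker spliced in without surrounding newlines, is an assumption about formatting the text does not pin down.

```python
HEAD_BYTES = 256 * 1024
TAIL_BYTES = 256 * 1024


def elide_middle(output: bytes, head: int = HEAD_BYTES, tail: int = TAIL_BYTES) -> bytes:
    """Keep the first `head` and last `tail` bytes; replace the rest with a marker."""
    if len(output) <= head + tail:
        return output                       # at or under the limit: keep everything
    elided = len(output) - head - tail
    marker = f"[...elided {elided} bytes...]".encode("ascii")
    return output[:head] + marker + output[len(output) - tail:]


# Small limits make the behaviour easy to see.
sample = elide_middle(b"abcdefghij", head=3, tail=3)
```

Keeping both ends matters for build tools: the head usually shows the invocation and first error, the tail the final summary.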

**Per-run input files.** When a run starts, any `--input <name>=<path>`
flags on `attractor run` (§13) copy host files into the worktree
root before traversal begins. This is the standard channel for passing
per-run goals (`goal.md`, `issue.json`, etc.) that early nodes read.

**Workflow-defined run arguments.** A workflow may declare positional
run arguments in the graph-level `arguments` string. Each non-empty line
declares one argument in positional order:

```dot
graph [
    arguments = "
        issue_url { type: url; required: true; help: GitHub issue URL; }
    "
]
```

Names must match `[a-z][a-z0-9_]*`. Supported types are `string` and
`url`; `url` accepts absolute `http://` and `https://` URLs. `required`
defaults to `true`; optional arguments may be omitted only from the end
of the positional list. `help` is descriptive text for CLI errors and
future UIs.

`attractor run <workflow.dot> [ARG...]` binds extra positional arguments
to the workflow's declarations in order. If the workflow declares no
arguments, extra positional values are an error. If binding succeeds, the
engine writes `attractor-args.json` in the worktree root before traversal
begins. The file is a UTF-8 JSON object mapping argument names to string
values, for example:

```json
{
  "issue_url": "https://github.com/owner/repo/issues/123"
}
```

Tool and agent nodes read this file explicitly. This keeps arguments
inside the same local worktree artifact as copied input files, so
checkpoint/resume never depends on ambient process state such as
environment variables.
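
A minimal sketch of declaration parsing and positional binding, assuming the one-line `name { key: value; ... }` form shown above. The function names, the default of `string` when `type` is omitted, and the restriction that `help` text contains no `;` are assumptions for illustration, not the engine's parser.

```python
from __future__ import annotations

import json
import re
from urllib.parse import urlsplit

DECL_RE = re.compile(r"^(?P<name>[a-z][a-z0-9_]*)\s*\{(?P<body>.*)\}$")


def parse_arguments(spec: str) -> list[dict]:
    """Parse the graph-level `arguments` string, one declaration per non-empty line."""
    decls = []
    for raw in spec.splitlines():
        line = raw.strip()
        if not line:
            continue
        m = DECL_RE.match(line)
        if m is None:
            raise ValueError(f"bad argument declaration: {line!r}")
        decl = {"name": m["name"], "type": "string", "required": True, "help": ""}
        for field in m["body"].split(";"):
            if not field.strip():
                continue
            key, _, value = field.partition(":")
            key, value = key.strip(), value.strip()
            if key == "required":
                decl["required"] = value == "true"
            elif key in ("type", "help"):
                decl[key] = value
            else:
                raise ValueError(f"unknown field {key!r} in {line!r}")
        if decl["type"] not in ("string", "url"):
            raise ValueError(f"unsupported type {decl['type']!r}")
        decls.append(decl)
    return decls


def bind_arguments(decls: list[dict], values: list[str]) -> dict[str, str]:
    """Bind positional values in order; optional arguments drop off the end only."""
    if len(values) > len(decls):
        raise ValueError("too many positional arguments")
    bound: dict[str, str] = {}
    for i, decl in enumerate(decls):
        if i >= len(values):
            if decl["required"]:
                raise ValueError(f"missing required argument: {decl['name']}")
            continue
        if decl["type"] == "url":
            parts = urlsplit(values[i])
            if parts.scheme not in ("http", "https") or not parts.netloc:
                raise ValueError(f"{decl['name']}: not an absolute http(s) URL")
        bound[decl["name"]] = values[i]
    return bound


decls = parse_arguments("""
    issue_url { type: url; required: true; help: GitHub issue URL; }
""")
bound = bind_arguments(decls, ["https://github.com/owner/repo/issues/123"])
args_json = json.dumps(bound, indent=2)  # contents for attractor-args.json
```

A workflow with no declarations yields an empty list, so any extra positional value trips the "too many" check, matching the error case described above.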

Captured tool output is checkpointed (§6.4) and made available to a
successor agent that enters via a `label="FAILURE"` edge — see §6.3's
cross-node failure context.

<!-- [doc->REQ-DOT-PARALLEL-SHAPES] -->
<!-- [doc->REQ-EXEC-PARALLEL-FANOUT] -->
<!-- [doc->REQ-EXEC-PARALLEL-JOIN] -->
### 6.9 Parallel nodes (v0.2)

Two structural shapes extend the sequential traversal model with
concurrent branches: `component` fans the active frontier out, and
`tripleoctagon` joins it back. Both are no-op nodes — no handler,
no LLM call, no script — they shape the *traversal*, not the work.

A region delimited by a `component` and its matching `tripleoctagon`
is a **parallel region**. Each outgoing edge from the component is
the head of a **branch**; every branch must eventually reach the
matching join. The engine dispatches branches concurrently (§6.5's
asyncio runtime, via `asyncio.TaskGroup`), and the join blocks
until every branch has completed.

#### 6.9.1 Fan-out: `component`

A node with `shape=component` (or `type="component"`) marks a
fan-out point.

- No handler. `prompt`, `script`, `goal_gate`, and `class` are
  rejected by the validator — there is no model to select and no
  work to gate.
- ≥ 1 outgoing edge. ALL outgoing edges fire concurrently. They
  are NOT choices; the §6.1 SUCCESS/FAILURE routing rules do not
  apply at a `component`.
- Outgoing edge labels are *branch names* for events and journal
  output (e.g. `BranchStarted(name="tests")`). Optional; if
  present they must be unique within the component.
- A component's own outcome is unconditionally SUCCESS — the
  failure surface is the branches, not the fork itself.
- `max_visits` is allowed; a component re-entered via an outer
  fix-up loop participates in cycle bounds like any other node.

#### 6.9.2 Fan-in: `tripleoctagon`

A node with `shape=tripleoctagon` (or `type="tripleoctagon"`) is
the join of a parallel region.

- No handler.
- ≥ 2 incoming edges from the same parallel region.
- 0–2 outgoing edges following §6.1 SUCCESS/FAILURE routing — the
  join slots into the normal flow like any other node.
- Incoming edges must be unlabeled. The validator rejects labels
  on edges that terminate at a tripleoctagon — they would be
  ignored, so the rejection catches author confusion.
- **Outcome = logical AND of each branch's terminal outcome.**
  SUCCESS iff every branch completed and every branch's last node
  in `latest_outcomes` was SUCCESS; FAILURE otherwise.
- `goal_gate=true` is allowed. The AND already encodes "all
  branches must succeed", so a goal-gated join behaves identically
  to a goal-gated tool/agent for §6.2's exit-acceptance rule.

#### 6.9.3 Branch failure and partial completion

A branch can fail to reach the join in two ways: (a) a node inside
the branch hits `max_visits`; (b) a node has no routable outgoing
edge for its outcome (e.g. FAILURE with no FAILURE edge).

In both cases, **sibling branches are NOT cancelled.** Each branch
runs to its own natural termination — either reaching the join or
failing individually. The join then folds the dead branch in as
FAILURE.

Rationale: parallel work is expensive, and a failed sibling does
not invalidate the work of branches that already succeeded. The
fixup-after-join pattern (§6.9.5) inspects which branch failed via
the journal and acts on the surviving evidence.
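
One way to get the no-cancel behaviour with asyncio is to make each branch coroutine absorb its own failure and return an outcome, then AND-fold the results at the join. The sketch below uses `asyncio.gather`; the names `run_branch` and `run_region` and the boolean outcome model are hypothetical. The same wrapper would be needed under `asyncio.TaskGroup`, which cancels the remaining tasks as soon as one of them raises.

```python
import asyncio
from typing import Awaitable, Callable

Branch = Callable[[], Awaitable[bool]]


async def run_branch(branch: Branch) -> bool:
    """Run one branch to its own termination; a dead branch folds to FAILURE."""
    try:
        return await branch()
    except Exception:
        return False          # never propagate, so siblings keep running


async def run_region(branches: dict[str, Branch]) -> tuple[bool, dict[str, bool]]:
    """Dispatch every branch concurrently and AND-fold the outcomes at the join."""
    names = list(branches)
    results = await asyncio.gather(*(run_branch(branches[n]) for n in names))
    outcomes = dict(zip(names, results))
    return all(outcomes.values()), outcomes


async def ok() -> bool:
    await asyncio.sleep(0)
    return True


async def dead_end() -> bool:
    raise RuntimeError("no routable FAILURE edge")


join_success, per_branch = asyncio.run(
    run_region({"tests": ok, "lint": dead_end, "typecheck": ok})
)
```

Here the `lint` branch dies, the other two still complete, and the join reports FAILURE while keeping the per-branch outcomes for the fix-up step.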

#### 6.9.4 Validation rules

A graph with one or more parallel regions must satisfy:

1. Every `component` pairs with exactly one `tripleoctagon`, the
   region's **matching join**, such that every directed path
   leaving the component eventually reaches that join.
2. No path leaves a parallel region except via the matching join.
   A branch may not reach `exit`, a different region's join, or
   any node outside the region.
3. **No nesting in v0.2.** A region's branches may not contain
   another `component`. Deferred to v0.3 — relax this rule once
   the engine's frontier scheduler is exercised enough to handle
   nested groups.
4. Multiple **non-overlapping** regions are allowed in series
   (e.g. `start → c1 → […] → t1 → c2 → […] → t2 → exit`).
5. Branches may contain loops bounded by `max_visits` per §6.3
   (e.g. an inner verify ↔ fixup pair).
6. A `component` rejects `prompt`, `script`, `goal_gate`, and
   `class`. Outgoing edge labels, when present, must be unique
   within the component.
7. A `tripleoctagon`'s incoming edges must be unlabeled.
   Outgoing edges follow §6.1.
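
Rules 1 to 3 amount to a reachability check from each `component`. The sketch below is one possible shape for that check, not the engine's validator: the adjacency-dict graph representation and the name `matching_join` are assumptions, and it covers only dead ends, nested components, and join uniqueness.

```python
from __future__ import annotations


def matching_join(
    component: str, edges: dict[str, list[str]], shapes: dict[str, str]
) -> str:
    """Return the single tripleoctagon that every path from `component` reaches.

    Raises ValueError on a dead end (rules 1-2), a nested component (rule 3),
    or when the region does not converge on exactly one join.
    """
    joins: set[str] = set()
    seen: set[str] = set()
    stack = list(edges.get(component, []))
    while stack:
        node = stack.pop()
        if shapes.get(node) == "tripleoctagon":
            joins.add(node)        # a path stops at the first join it meets
            continue
        if node in seen:
            continue               # loops inside a branch are allowed (rule 5)
        seen.add(node)
        if shapes.get(node) == "component":
            raise ValueError(f"nested component {node!r} in region of {component!r}")
        successors = edges.get(node, [])
        if not successors:
            raise ValueError(f"path from {component!r} ends at {node!r} before a join")
        stack.extend(successors)
    if len(joins) != 1:
        raise ValueError(f"{component!r} reaches {len(joins)} joins, expected 1")
    return joins.pop()


# A three-branch verifier region with a fix-up edge back to the fan-out.
edges = {
    "implement": ["fanout"],
    "fanout": ["tests", "lint", "typecheck"],
    "tests": ["join"], "lint": ["join"], "typecheck": ["join"],
    "join": ["exit", "fixup"],
    "fixup": ["fanout"],
}
shapes = {"fanout": "component", "join": "tripleoctagon"}
found = matching_join("fanout", edges, shapes)
```

Because the walk stops at the join, the fix-up edge back to `fanout` is never followed and does not count as nesting.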

#### 6.9.5 Worked example: parallel verifiers + fixup

The canonical use: run independent verifiers concurrently, then —
if any failed — feed the combined output to a fixup agent that
retries the whole region.

```dot
implement -> fanout

fanout    [shape=component, label="run verifiers"]
tests     [shape=parallelogram, script="pytest"]
lint      [shape=parallelogram, script="ruff check ."]
typecheck [shape=parallelogram, script="pyright"]
join      [shape=tripleoctagon, label="all green?", max_visits=3]
fixup     [label="Read the verifier output and fix the issues."]

fanout -> tests     [label="tests"]
fanout -> lint      [label="lint"]
fanout -> typecheck [label="typecheck"]

tests     -> join
lint      -> join
typecheck -> join

join -> exit  [label="SUCCESS"]
join -> fixup [label="FAILURE"]
fixup -> fanout
```

The fixup agent enters via the join's FAILURE edge. Its
predecessor-context payload (§6.3) carries `predecessor_node = join`,
`entry_edge_label = "FAILURE"`, and a `captured_output` that
concatenates the failed branches' outputs prefixed by branch name.
The agent then loops back to `fanout`, re-running the region under
the join's `max_visits=3` bound.

#### 6.9.6 What this section defers

This section pins **syntax and per-node semantics**. The runtime
mechanics that follow from it are spec'd separately:

- **Journal entry shapes and resume semantics** for fan-out /
  fan-in — see §6.11. The design landed on one new field
  (`branch_name` on `NodeCompleted`) plus one new resume rule,
  with branch lists derived from the static graph and join
  outcomes computed at write time. No new entry types.
- **Worktree model under parallel** — see §6.10 (per-branch
  sub-worktrees, no auto-merge, recombination as a graph step).
- **Cancellation semantics** beyond the no-cancel-on-sibling-fail
  rule (e.g. user-initiated `attractor run abort` during a parallel
  region). Same treatment as today's sequential abort (§13); details
  pinned with the engine impl.

<!-- [doc->REQ-EXEC-PARALLEL-WORKTREE] -->
### 6.10 Worktree model under parallel (v0.2)

§6.7 pins one worktree directory per *run*. This section specifies
what that means inside a parallel region (§6.9): each branch gets
its own worktree directory and branch ref, the run's main worktree
is untouched across the region, and recombining branch work is a
graph step (agent or tool) — not engine policy.

This is the only choice that keeps file-mutating branches safe
without forcing the engine to ship a merge-conflict resolver. The
canonical agentic answer is "let an agent merge it" — exposed as
just another graph node downstream of the join.

#### 6.10.1 Per-branch sub-worktrees and refs

At fan-out, the engine creates one sub-worktree per branch off the
**fork point** — the run's main worktree HEAD at the moment the
`component` is entered.

- **Sub-worktree dir.** `.attractor/worktrees/<run-id>/branch/<name>/`,
  created via `git worktree add` from the fork-point commit.
- **Branch ref.** `refs/heads/attractor/run/<run-id>/branch/<name>`.
  This is a first-class durable artifact in the run's namespace,
  alongside `state` and `worktree` (§7.2).
- **Branch name resolution.** The component's outgoing edge label
  is the branch name when present; when absent, the destination
  node id is used. Two unlabeled edges to the same destination are
  rejected by the validator (§6.9.4 rule 6 extension).
- **Per-branch checkpointing.** Each node inside a branch performs
  today's dual commit (§6.4) — state branch + the **branch's own
  worktree ref** (not the main worktree ref). The branch's
  worktree ref advances linearly with one commit per completed node
  in that branch.
- **Lifecycle.** Sub-worktree directories are removed when the run
  completes successfully (mirror of §6.7). On failure / abort /
  incomplete, both the directories and the branch refs are
  retained for inspection. Branch refs persist beyond
  directory removal — they are the durable record of each
  branch's file work.
- **Prune.** `attractor prune` (deferred, §14.1) removes everything
  under `refs/heads/attractor/run/<id>/`, which sweeps up branch
  refs automatically.
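
Branch name resolution and the derived paths can be sketched as below. The function names are hypothetical, and rejecting *any* duplicate name (including a label that collides with an unlabeled edge's destination id) is an assumption that goes slightly beyond the two cases the text names. The sketch also does not check that a label is a valid git ref component.

```python
from __future__ import annotations


def branch_names(edges: list[tuple[str, str | None]]) -> list[str]:
    """edges: (destination node id, optional label) per outgoing component edge."""
    names = [label if label else dest for dest, label in edges]
    dupes = sorted({n for n in names if names.count(n) > 1})
    if dupes:
        raise ValueError(f"duplicate branch names: {dupes}")
    return names


def branch_ref(run_id: str, name: str) -> str:
    return f"refs/heads/attractor/run/{run_id}/branch/{name}"


def branch_dir(run_id: str, name: str) -> str:
    return f".attractor/worktrees/{run_id}/branch/{name}"


names = branch_names([("agent_a", "a"), ("agent_b", None)])
refs = [branch_ref("RUN", n) for n in names]
```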

#### 6.10.2 Main worktree state across the region

The run's main worktree directory and `worktree` ref are
**untouched** across the parallel region:

- **Pre-fanout.** Main worktree HEAD = fork point.
- **During the region.** Main worktree HEAD is unchanged; no
  commits are written to the main `worktree` ref. All file work
  happens in sub-worktrees on branch refs.
- **Post-join.** Main worktree HEAD is still at the fork point.
  Successors of the `tripleoctagon` see fork-point state on disk.

If a workflow wants branch work reflected in the main worktree, it
adds an explicit recombination step (§6.10.3). Until that step
runs, branch work is isolated to the branch refs.

#### 6.10.3 Recombining branch work

Recombination is a **graph step**, not engine behaviour. The
`tripleoctagon` does not merge; it folds outcomes and routes per
§6.1. Three patterns cover the practical use cases:

- **No merge (verifier pattern, §6.9.5).** Branches read shared
  state and produce captured output (test failures, lint
  warnings). The fixup-after-join agent operates on the main
  worktree and consumes branch outputs via predecessor context.
  No recombination needed.
- **Agent merge.** A `box`-shape agent node downstream of the
  join, prompted to merge specific branch refs into the current
  worktree and resolve any conflicts. The agent calls `git merge`
  (or any strategy it prefers), reads conflict markers if any,
  edits to resolve, and commits. This is the canonical case for
  branches that produce overlapping source edits.
- **Tool merge.** A `parallelogram` tool node running a
  deterministic merge script (`git merge --strategy=ours`,
  `git rebase`, custom merge driver, etc.). Useful when the
  merge policy is mechanical.

A merge step's `cwd` is the main run worktree, and its commits
advance the main `worktree` ref. After a successful merge step,
the main worktree reflects the merged state and subsequent nodes
see it on disk.

**Predecessor-context payload from a `tripleoctagon`.** When a
successor node enters via the join's outgoing edge (either
SUCCESS or FAILURE), the §6.3 entry-edge payload includes a
structured `predecessor_branches` list:

```json
[
  {
    "name":            "agent_a",
    "ref":             "refs/heads/attractor/run/<id>/branch/agent_a",
    "success":         true,
    "captured_output": "..."
  },
  {
    "name":            "agent_b",
    "ref":             "...",
    "success":         true,
    "captured_output": "..."
  }
]
```

A merge agent reads this from its context and knows exactly which
refs to merge. No runtime ref-spelunking required. The existing
`captured_output` field on the payload still carries the join's
combined output (concatenated, branch-name prefixed) — the
structured list is additive.
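
For illustration, a join's payload could be assembled like this. The function name `join_payload`, the `[name] ` prefix format, and the newline separator are assumptions; the field names come from the text. The sketch concatenates every branch's output into `captured_output`.

```python
def join_payload(run_id: str, join_id: str, branches: list[dict]) -> dict:
    """branches: [{"name", "success", "captured_output"}, ...] in graph order."""
    listed = [
        {
            "name": b["name"],
            "ref": f"refs/heads/attractor/run/{run_id}/branch/{b['name']}",
            "success": b["success"],
            "captured_output": b["captured_output"],
        }
        for b in branches
    ]
    # Assumed format: one "[branch] output" chunk per branch, newline-separated.
    combined = "\n".join(f"[{b['name']}] {b['captured_output']}" for b in branches)
    return {
        "predecessor_node": join_id,
        "entry_edge_label": "SUCCESS" if all(b["success"] for b in branches) else "FAILURE",
        "captured_output": combined,
        "predecessor_branches": listed,
    }


payload = join_payload(
    "RUN",
    "join",
    [
        {"name": "agent_a", "success": True, "captured_output": "ok"},
        {"name": "agent_b", "success": False, "captured_output": "conflict in b.py"},
    ],
)
```

A merge agent would read `predecessor_branches[*].ref` directly, while a fix-up agent can work from the flat `captured_output` alone.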

**Worked example: parallel agents + agent merge.**

```dot
plan -> fanout

fanout    [shape=component]
agent_a   [label="Refactor module A", prompt="..."]
agent_b   [label="Refactor module B", prompt="..."]
agent_c   [label="Refactor module C", prompt="..."]
join      [shape=tripleoctagon]
merger    [prompt="Merge the listed branch refs into the current worktree; resolve any conflicts in favour of the more conservative change."]
verify    [shape=parallelogram, script="pytest && ruff check . && pyright"]

fanout -> agent_a [label="a"]
fanout -> agent_b [label="b"]
fanout -> agent_c [label="c"]
agent_a -> join
agent_b -> join
agent_c -> join
join   -> merger [label="SUCCESS"]
join   -> fixup  [label="FAILURE"]
merger -> verify
```

#### 6.10.4 Engine invariants

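- One worktree directory per branch; no shared filesystem between
  branches inside a region. The one exception is a branch whose
  nodes are all enforced read-only agents, which reads the parent
  run worktree instead (§6.7).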
- No engine-initiated merge. The `tripleoctagon` folds outcomes
  and emits the predecessor-context payload; it does not touch
  the main worktree filesystem.
- Branch refs are first-class durable artifacts; they survive
  branch failure for inspection (mirror of §6.7).
- Per-branch resume reuses today's three-state machine (§6.4)
  unchanged — each branch is its own linear ref chain.
  Multi-branch coordination (which branches are in-flight,
  which completed, when the join fires) is spec'd in §6.11.

#### 6.10.5 What this section defers

- **Sub-worktree dir creation cost** is `O(K)` per region, where
  `K` is the number of branches. For very wide fan-outs this
  matters; benchmarks pinned with the engine impl.
- **Branch-ref garbage collection** during long-running daemons.
  Out of scope for v0.2 (§2 explicit non-feature: no daemon).
- **Cross-region branch reuse.** A branch ref from a prior region
  cannot be referenced as input to a later region in the same
  graph. Adding that would require an attribute syntax and is
  deferred.

<!-- [doc->REQ-EXEC-PARALLEL-RESUME] -->
### 6.11 Parallel region checkpoint and resume (v0.2)

§6.4 pins the dual-commit + three-state resume machine for
sequential runs. v0.2's parallel regions (§6.9, §6.10) extend it
with **one new field on `NodeCompleted` and one new resume rule**;
everything else about a parallel region is derived from the
existing journal and the workflow snapshot stored at run start.

#### 6.11.1 Journal: one new field

`NodeCompleted.branch_name: str | None`

- `null` for sequential nodes (every entry in a sequential run).
- Set to the branch name for nodes executed inside a parallel
  region. `worktree_commit_after` on such an entry refers to the
  branch's own worktree ref (§6.10.1), not the run's main
  `worktree` ref.

The `component` and `tripleoctagon` nodes themselves write plain
`NodeCompleted` entries with `branch_name = null` — they execute
in the main control flow, not inside any branch. The component's
entry records its successful entry into the region; the
tripleoctagon's entry records the AND-fold of branch outcomes
(in `success`) and the combined branch outputs (in
`captured_output`, branch-name-prefixed).

No new entry types. `RunInitialized` already carries the workflow
snapshot, so the engine can statically derive each component's
branch list whenever it needs to.

#### 6.11.2 Resume: one new rule

Resume reads the journal newest-last (§6.4). Before applying the
existing three cases, check for an **open parallel region**:

> A parallel region is open if some `component` node has a
> `NodeCompleted` entry whose matching `tripleoctagon` has no
> later `NodeCompleted` entry.

When a region is open:

1. Statically derive the branch list from the workflow snapshot
   (the `component`'s outgoing edges, by §6.9.4).
2. For each branch: scan the journal for a terminal
   `NodeCompleted(branch_name=X, next_node=join_id)`. Branches
   without such an entry are **in-flight**.
3. Re-dispatch in-flight branches. Per-branch resume uses §6.4's
   three cases unchanged, scoped to that branch's worktree ref —
   each branch is its own linear ref chain (§6.10.1).
4. When the last in-flight branch finishes, the engine writes the
   tripleoctagon's `NodeCompleted` (AND-folded outcome, combined
   output) and the run continues from the join's outgoing edge —
   back to §6.4's standard flow.

Human gates inside a branch (§11) compose naturally: a hexagon
node pauses the run, `PausedAtGate` is written, the gate's
response routes the branch onward, and resume re-enters the
open-region check on the next pass.

#### 6.11.3 Worked example: crash mid-region

Using the §6.10.3 merger workflow. Suppose the engine crashes
after `agent_a` and `agent_b` finished but `agent_c` was mid-run.

Journal state at crash time (seq order, abbreviated):

```
NodeCompleted(node=fanout,  branch=null, success=true,  next_node=null)
NodeCompleted(node=agent_a, branch=a,    success=true,  next_node=join)
NodeCompleted(node=agent_b, branch=b,    success=true,  next_node=join)
```

On resume:

1. Walk newest-last. Latest entry is `NodeCompleted(branch=b)`.
2. Open-region check: `fanout` (a `component`) has a
   `NodeCompleted`; no `NodeCompleted` for the matching `join`.
   Open.
3. Static derive: branches are `a`, `b`, `c`.
4. Per-branch scan:
   - `a`: terminal entry present → done.
   - `b`: terminal entry present → done.
   - `c`: no terminal entry → in-flight.
5. Re-dispatch `c`. §6.4 case (a) or (b) determines exactly where:
   if `c`'s branch worktree HEAD matches its latest
   `worktree_commit_after`, re-enter `next_node`; otherwise
   re-enter the same node.
6. When `c`'s terminal `NodeCompleted` is written, the engine
   writes the tripleoctagon's `NodeCompleted` (`success` = AND of
   `[true, true, c.success]`) and routes to the join's outgoing
   edge.
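
The open-region check and per-branch scan can be replayed on the journal above with a short function. Journal entries are modelled here as plain dicts and the name `in_flight_branches` is hypothetical. Comparing the *latest* component and join entries is an interpretation chosen so that a region re-entered through a fix-up loop is still classified correctly.

```python
from __future__ import annotations


def in_flight_branches(
    journal: list[dict], component: str, join: str, branches: list[str]
) -> list[str] | None:
    """Return in-flight branch names if the region is open, else None.

    An empty list means the region is open but every branch has finished,
    so the join's NodeCompleted is the next thing to write.
    """
    last_component = last_join = -1
    for seq, entry in enumerate(journal):
        if entry["node"] == component:
            last_component = seq
        elif entry["node"] == join:
            last_join = seq
    if last_component == -1 or last_join > last_component:
        return None                                   # region not open
    done = {
        entry["branch"]
        for entry in journal[last_component + 1:]
        if entry["branch"] is not None and entry["next_node"] == join
    }
    return [name for name in branches if name not in done]


journal = [
    {"node": "fanout", "branch": None, "success": True, "next_node": None},
    {"node": "agent_a", "branch": "a", "success": True, "next_node": "join"},
    {"node": "agent_b", "branch": "b", "success": True, "next_node": "join"},
]
pending = in_flight_branches(journal, "fanout", "join", ["a", "b", "c"])
```

Only `c` comes back as in-flight, which is the branch the engine re-dispatches in step 5.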

#### 6.11.4 Engine invariants

- A `component`'s `NodeCompleted` precedes any branch-internal
  entries with that component as their fan-out source.
- A `tripleoctagon`'s `NodeCompleted` is written only after every
  branch listed in the static graph has its terminal
  `NodeCompleted(branch_name=X, next_node=join_id)`. Out-of-order
  writes are an engine bug.
- Branch entries from different in-flight branches may interleave
  freely in `seq` order — `seq` is global; `branch_name` is the
  discriminator.
- Per-branch dual commits land on the branch's own worktree ref;
  concurrent branches commit to different refs and do not contend
  on git locks.

#### 6.11.5 What this section defers

- **Goal-gate keying.** Today's `latest_outcomes` is
  `dict[node_id, bool]`. If a graph routes the same `node_id`
  through multiple branches (allowed but rare), the key becomes
  `(node_id, branch_name | None)` to avoid smearing branch
  outcomes over a same-named sequential one. Pinned with engine
  impl.
- **`attractor run show` rendering** of in-flight parallel regions
  (per-branch progress, in-flight markers). UX, not SPEC.
- **Token aggregation by branch.** Per-run totals sum all per-node
  tokens regardless of branch; branch-level summaries are a UI
  feature.

---

## 7. Local Checkpoint Store

The whole reason this is "dark": run state lives in **local git refs**, never
pushed. Same machine, same repo, no network.

<!-- [doc->REQ-STORE-BRANCH-AS-FS] -->
### 7.1 Approach

A `BranchStore` treats a git branch as a versioned key/value filesystem:

- `ensure_branch()` creates the branch with an empty root commit if missing.
- Each write is one commit; the tree of the new commit = previous tree +
  modifications.
- Reads are tree lookups at the branch tip.
- History is `git log <branch>`.

This is the same pattern fabro uses (see
`references/fabro/lib/crates/fabro-checkpoint`), and the
`rust-v0.1` tag preserves Attractor's first cut of it. The Python
implementation lifts the same shape (`Store`, `TreeEntries`,
`BranchStore`) on top of `pygit2` — no other dependencies.
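
To make the "branch as a versioned key/value filesystem" idea concrete, here is a toy in-memory model of the four operations listed above. It deliberately does not use `pygit2` and is not the real `BranchStore` API: the class name `ToyBranchStore`, its method signatures, and the string-valued tree are all assumptions, and deletions are not modelled.

```python
from __future__ import annotations

import hashlib
import json


class ToyBranchStore:
    """In-memory stand-in: a branch is a list of commits, each with a full tree."""

    def __init__(self) -> None:
        self.commits: list[dict] = []

    def ensure_branch(self) -> None:
        """Create the branch with an empty root commit if it is missing."""
        if not self.commits:
            self._commit({}, "init: empty root")

    def write(self, changes: dict[str, str], message: str) -> str:
        """One write is one commit: new tree = previous tree + modifications."""
        self.ensure_branch()
        tree = {**self.commits[-1]["tree"], **changes}
        return self._commit(tree, message)

    def read(self, path: str) -> str | None:
        """Reads are lookups in the tree at the branch tip."""
        return self.commits[-1]["tree"].get(path) if self.commits else None

    def log(self) -> list[str]:
        """History, newest first, like `git log <branch>`."""
        return [c["message"] for c in reversed(self.commits)]

    def _commit(self, tree: dict[str, str], message: str) -> str:
        parent = self.commits[-1]["id"] if self.commits else None
        payload = json.dumps(
            {"tree": tree, "parent": parent, "message": message}, sort_keys=True
        )
        commit_id = hashlib.sha256(payload.encode()).hexdigest()
        self.commits.append({"id": commit_id, "tree": tree, "message": message})
        return commit_id


store = ToyBranchStore()
store.write({"state.json": '{"stage": "plan"}'}, "checkpoint: plan")
store.write({"events.log": "NodeCompleted"}, "checkpoint: approve")
```

Earlier keys survive later writes because each commit carries the whole tree, which is what makes a read at the tip sufficient.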

<!-- [doc->REQ-STORE-REF-NAMESPACE] -->
### 7.2 Ref namespace

All Attractor refs live under a reserved prefix:

```
refs/heads/attractor/run/<run-id>/state         — execution state machine + events
refs/heads/attractor/run/<run-id>/worktree      — git branch backing the run's worktree directory; the durable record of all file changes
refs/heads/attractor/run/<run-id>/agent/<node>  — agent session history
refs/heads/attractor/meta/                      — engine-wide metadata
```

User branches are never touched.

We place these under `refs/heads/` rather than a custom `refs/attractor/`
namespace. The `worktree` ref is meant to be a real, ordinary git branch
(§6.7) — that's how `git diff main..<ref>`, `git branch -D ...`, and
`git worktree add` (which expects a branch under `refs/heads/`) all work
without special-casing. Visibility in `git branch` is a feature, not
clutter: it's how the user sees what runs exist and reviews their
output. `refs/heads/attractor/...` keeps these grouped by prefix while
remaining fully porcelain-compatible. Fabro does the same
(`refs/heads/fabro/run/<id>`, see
`references/fabro/lib/crates/fabro-workflow/src/git.rs`).

The `worktree` branch is the artifact store — there is no separate
`artifacts` ref. The worktree branch is itself a regular git branch with
the run's full file history; on successful completion the directory is
removed (§6.7) but the branch ref remains for `git diff` or merge.

<!-- [doc->REQ-STORE-AUTHOR-ID] -->
### 7.3 Author identity

Default committer: `Attractor <noreply@attractor.local>`. Configurable
per repo. If a user identity is configured, both identities appear on
the commit, with this ordering:

- **Commit author + committer header:** the configured user. This is
  what `git log --author=<user>` and `git blame` see, so attribution
  for hand-curated runs reads naturally.
- **`Co-Authored-By:` trailer:** the Attractor default. Marks the
  engine's involvement explicitly without displacing the user from
  the primary author slot.

If no user identity is configured, the Attractor default fills both
the author and committer headers and there is no trailer, so
solo-engine runs still have a single clear authorship line.
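
The two cases reduce to a small decision, sketched here with a hypothetical helper `commit_identity` that returns the header identity and any trailers to append. It does not model the per-repo override of the default identity.

```python
from __future__ import annotations

ENGINE_IDENTITY = "Attractor <noreply@attractor.local>"


def commit_identity(user: str | None) -> tuple[str, list[str]]:
    """Return (author/committer identity, trailers) for a checkpoint commit."""
    if user:
        # Configured user takes the header; the engine moves to a trailer.
        return user, [f"Co-Authored-By: {ENGINE_IDENTITY}"]
    return ENGINE_IDENTITY, []


with_user = commit_identity("Ada <ada@example.com>")
solo = commit_identity(None)
```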

<!-- [doc->REQ-STORE-NO-NETWORK] -->
### 7.4 What we explicitly do NOT do

- No `git push`. No remotes. No fetch/push callbacks. The
  `checkpoint` module has zero network code.
- No GitHub API integration in the checkpoint layer. (The engine may
  separately know about remotes for other reasons; checkpointing does not.)
- No automatic GC. Users run `git gc` themselves. An `attractor prune`
  command for old runs is deferred (§14).

<!-- [doc->REQ-STORE-TRAILERS] -->
### 7.5 Trailers for structured metadata

Commit messages use git trailers (RFC 822-style `Key: value` lines) for
machine-readable fields:

```
checkpoint: stage advance plan -> approve

Run-Id: 01HZ...
Stage: plan
Outcome: SUCCESS
Duration-Ms: 2841
Tokens-In: 1820
Tokens-Out: 412
```

Searchable via plain `git log --grep`, parseable via the `trailer` module.

The same trailer convention applies to commits on the worktree branch
(§6.4) — minimum `Run-Id` and `Node` (or `Stage`) so you can tell at a
glance which run and which node produced a given file change.
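
A minimal sketch of writing and reading these trailers, assuming they always sit in the last paragraph of the message. The two function names are hypothetical stand-ins for the `trailer` module, whose real interface is not shown here; `git interpret-trailers --parse` is the git-native way to extract the same fields.

```python
def format_commit_message(subject: str, trailers: dict[str, str]) -> str:
    """Subject line, blank line, then one `Key: value` trailer per line."""
    lines = [subject, ""] + [f"{key}: {value}" for key, value in trailers.items()]
    return "\n".join(lines) + "\n"


def parse_trailers(message: str) -> dict[str, str]:
    """Read `Key: value` lines from the last paragraph of a commit message."""
    paragraphs = [p for p in message.strip().split("\n\n") if p.strip()]
    if len(paragraphs) < 2:
        return {}                      # subject only: no trailer block
    trailers: dict[str, str] = {}
    for line in paragraphs[-1].splitlines():
        key, sep, value = line.partition(": ")
        if sep and key and " " not in key:
            trailers[key] = value      # values stay strings; callers convert
    return trailers


message = format_commit_message(
    "checkpoint: stage advance plan -> approve",
    {"Run-Id": "01HZ...", "Stage": "plan", "Outcome": "SUCCESS", "Duration-Ms": "2841"},
)
parsed = parse_trailers(message)
```

The subject line also contains `: `, which is why the parser looks only at the final paragraph rather than scanning every line.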

---

## 8. Coding Agent Loop

Layer 2. Delegated to [**pi**](https://pi.dev/), a local coding-agent
harness (`@earendil-works/pi-coding-agent`). pi owns the conversation
loop, tool dispatch, streaming, skills, extensions, compaction,
retry, and — crucially — **multi-provider routing**: one harness that
talks to Anthropic, OpenAI (Codex/ChatGPT), Google Gemini, GitHub
Copilot, and ~20 API-key providers. The engine (Layer 3) drives one
pi session per agent node, supplies the workflow's `goal` and the
node's `prompt`, observes the session's event stream, and consumes
its terminal outcome.

We do not hand-roll an agent loop, and we do not bind to any single
provider's SDK. **pi is the contract.** Attractor shells out to the
`pi` binary in its headless JSON-RPC mode (`pi --mode rpc`) so the
integration stays language-agnostic and process-isolated, and so the
providers, tools, and skills pi ships are maintained upstream rather
than in this repo.

> Runtime dependency: the `pi` binary must be on `PATH`. It is a Node
> CLI, installed separately from this Python package (`npm i -g
> @earendil-works/pi-coding-agent`). A missing binary surfaces as an
> `AgentError` at the dispatch boundary (§6.2 path d / setup failure).

<!-- [doc->REQ-AGENT-SESSION-API] -->
<!-- [doc->REQ-AGENT-SESSION-EVENTS] -->
<!-- [doc->REQ-AGENT-SESSION-STEERING] -->
<!-- [doc->REQ-AGENT-EXECUTION-ENV] -->
### 8.1 What pi provides

- Multi-turn conversation with tool use, against **any provider pi
  supports** — selected per node via the `provider/model` string (§9.3).
- JSONL event stream over stdout — `agent_start`/`agent_end`,
  `turn_start`/`turn_end`, `message_update` token deltas,
  `tool_execution_start`/`_end`, `auto_retry_*`, `compaction_*`.
- Configurable per invocation: provider+model, thinking level,
  allowed/excluded tools, system-prompt appends, session persistence.
- **Built-in tools.** `read`, `bash`, `edit`, `write` by default; `grep`,
  `find`, `ls` available via the tool allowlist. These cover read,
  write, edit, shell, and search.
- **Skills.** pi auto-discovers `.agents/skills/` (walking up to the
  repo root) and `.pi/skills/`. Workflow authors compose existing
  skills rather than reinventing review/refactor/lint prompts. (This
  repo already ships skills under `.agents/skills/`.)
- **Extensions.** TypeScript extensions registered via the resource
  loader or `--extension` can add tools, commands, and event hooks.
- **Retry & compaction.** pi handles transient-error retry
  (`auto_retry_*`) and context compaction internally; the engine
  observes but does not reimplement them.

<!-- [doc->REQ-AGENT-TOOLS-V01] -->
### 8.2 Attractor's tool additions

pi's built-in tools plus skills/extensions cover almost everything.
Attractor adds **one** tool, shipped as a tiny pi extension inside
the package (`src/attractor/agent/report_outcome.ts`) and loaded with
`pi --extension <path>`:

- `report_outcome(outcome, reason)` — declares the agent node's
  outcome (§6.2). Default outcome is `SUCCESS` if the agent never
  invokes it and instead terminates with a final assistant message
  (pi's natural `agent_end`). Calling
  `report_outcome("FAILURE", "...")` declares a failure whose
  `reason` becomes the node's `captured_output` for downstream
  FAILURE-edge payloads (§6.3). <!-- [doc->REQ-EXEC-OUTCOME-NO-OP] --> Calling
  `report_outcome("NO_OP", "...")` declares that doing nothing is the
  correct result; the reason is preserved as evidence and routes via
  `label="NO_OP"` when the workflow author wires that edge.

A `read_only` agent node (§6.7) restricts the tool allowlist to the
non-mutating built-ins (`read`, `grep`, `find`, `ls`) plus
`report_outcome`, so the agent cannot `write`/`edit`/`bash` its way into
mutating the shared tree.

The extension validates and normalizes the `outcome` argument and
returns the normalized value in its tool-result `details`; the runner
reads the outcome from the `tool_execution_end` event. Attractor
spawns pi with `--no-extensions` so the repo's *host* extension
(`.pi/extensions/attractor/`, which drives Attractor from pi — the
reverse direction) is not pulled into the sub-agent's own session;
only the explicit `report_outcome` extension is loaded.
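
As a rough illustration of the outcome rule above, the sketch below
scans a session's JSONL lines for a `report_outcome` result and falls
back to `SUCCESS` when the tool was never called. The event field names
(`type`, `toolName`, `result.details.outcome`, `result.details.reason`)
are assumptions made for this example, not pi's documented wire format,
and the outcome set is limited to the three values named in this section.

```python
import json
from typing import Iterable, Optional, Tuple

# Outcomes named in this section; the real set may be larger.
VALID_OUTCOMES = {"SUCCESS", "FAILURE", "NO_OP"}


def outcome_from_events(lines: Iterable[str]) -> Tuple[str, Optional[str]]:
    """Return (outcome, reason) for one agent-node session.

    All field names below are hypothetical stand-ins for pi's schema.
    """
    # Default when the agent ends without calling report_outcome.
    outcome, reason = "SUCCESS", None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        if event.get("type") != "tool_execution_end":
            continue
        if event.get("toolName") != "report_outcome":
            continue
        details = (event.get("result") or {}).get("details") or {}
        # The extension already normalizes; upper() is a defensive extra.
        value = str(details.get("outcome", "")).upper()
        if value in VALID_OUTCOMES:
            # In this sketch a later call overrides an earlier one.
            outcome, reason = value, details.get("reason")
    return outcome, reason
```

Reading the outcome from the tool result rather than from the final
assistant text keeps the routing decision independent of how the model
phrases its last message.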

No dedicated `web_fetch`. Agents that need a URL run `curl` (or `wget`)
through the `bash` tool. Per §1, the engine doesn't restrict outbound
network from agent tools; the workflow author owns that domain.

### 8.3 Session lifecycle

Each agent-node invocation is one pi RPC session that either ends
cleanly (`agent_end` — final assistant message and/or a
`report_outcome` call) or fails (turn-limit cap, wall-clock timeout,
or provider error after pi's own retry). The engine enforces the
turn cap (counting `turn_end` events) and the wall-clock timeout
(`asyncio.wait_for`), sending `abort` and terminating the subprocess
on either.
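
A minimal sketch of the two caps described above, assuming the session
is exposed as an async iterator of event dicts and that `abort` is an
awaitable callback. The names `drive_session` and `TurnLimitExceeded`
are hypothetical, and refusing the next `turn_start` once the counted
`turn_end` events reach the cap is one possible boundary choice.

```python
import asyncio
from typing import AsyncIterator, Awaitable, Callable


class TurnLimitExceeded(Exception):
    """Hypothetical error for a session that hit the turn cap."""


async def drive_session(
    events: AsyncIterator[dict],
    abort: Callable[[], Awaitable[None]],
    max_turns: int,
    timeout_s: float,
) -> int:
    """Consume events until agent_end; return the number of completed turns."""

    async def consume() -> int:
        turns = 0
        async for event in events:
            kind = event.get("type")
            if kind == "turn_start" and turns >= max_turns:
                raise TurnLimitExceeded(f"turn cap of {max_turns} reached")
            if kind == "turn_end":
                turns += 1
            elif kind == "agent_end":
                return turns
        return turns  # stream closed without agent_end

    try:
        # Wall-clock cap over the whole session.
        return await asyncio.wait_for(consume(), timeout=timeout_s)
    except (asyncio.TimeoutError, TurnLimitExceeded):
        # Either cap: ask the session to abort, then let the caller
        # terminate the subprocess and record the failure.
        await abort()
        raise
```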

If a workflow needs to pause for user input mid-conversation, model
it as a `hexagon` (human gate, §11) at the graph level — outside the
agent session. v0 does not surface mid-turn "model asks the user"
prompts as their own state.

> The legacy v0.1 implementation embedded the Claude Agent SDK
> (Anthropic-only) directly. It was removed when pi became the agent
> layer; the archived shape lives in the pre-pi Python commit history.

---

## 9. LLM Transport

Layer 1. pi handles the entirety of LLM transport for every provider
it supports: HTTPS, SSE streaming, retry, rate-limit back-off, and
authentication. Attractor does not import any provider SDK
(`anthropic`, `openai`, `google-genai`, …); it spawns `pi --mode rpc`
and speaks pi's JSON protocol. There is no Attractor-owned transport
code to maintain.

<!-- [doc->REQ-LLM-SURFACE] -->
### 9.1 Surface

- Per agent-node invocation the engine spawns:

  ```
  pi --mode rpc --no-session --offline \
     --no-extensions --extension <report_outcome.ts> \
     [--model <provider/id>] [--tools <names>] \
     [--append-system-prompt <goal>] [--append-system-prompt <revisit>]
  ```

  and drives it: send one `prompt` (the node's task), read the JSONL
  event stream until `agent_end`, then query `get_last_assistant_text`
  and `get_session_stats`.
- `--offline` (a.k.a. `PI_OFFLINE=1`) disables pi's non-LLM startup
  network operations (update checks, etc.), so the only network
  egress is the LLM provider call itself — preserving §1 local-first.
- `--no-session` keeps each node's session ephemeral; cross-node
  state lives in the worktree and the journal, not pi sessions.
- Per-call usage comes from `get_session_stats` (`tokens.input` /
  `tokens.output`); the engine forwards it through `EngineEvent` and
  aggregates per-run (§15 token accounting).
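
The command line above could be assembled along these lines. This is a
sketch: `build_pi_argv` is a hypothetical helper name, and joining tool
names with commas for `--tools` is an assumption about pi's argument
format.

```python
from typing import List, Optional, Sequence


def build_pi_argv(
    extension_path: str,
    model: Optional[str] = None,
    tools: Optional[Sequence[str]] = None,
    system_appends: Sequence[str] = (),
) -> List[str]:
    """Build the argv for one agent-node invocation of pi."""
    argv = [
        "pi", "--mode", "rpc", "--no-session", "--offline",
        "--no-extensions", "--extension", extension_path,
    ]
    if model:
        # Empty or unset model lets pi pick its configured default.
        argv += ["--model", model]
    if tools:
        # Assumption: pi accepts a comma-separated tool list.
        argv += ["--tools", ",".join(tools)]
    for text in system_appends:
        # One flag per appended block (goal, then revisit context).
        argv += ["--append-system-prompt", text]
    return argv
```

Passing the result as a list to `asyncio.create_subprocess_exec` avoids
shell quoting issues with multi-line goal text.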

<!-- [doc->REQ-LLM-PROVIDER-ANTHROPIC] -->
### 9.2 Providers

Multi-provider is now the default, inherited from pi at no
maintenance cost: Anthropic, OpenAI (Codex/ChatGPT subscription or
`OPENAI_API_KEY`), Google Gemini, GitHub Copilot, and the long tail
of API-key providers pi ships. Attractor is provider-agnostic — the
node's `provider/model` string (§9.3) selects the route, and adding
a provider is a pi release, not an Attractor change.

<!-- [doc->REQ-LLM-MODEL-STRINGS] -->
### 9.3 Model strings

Model strings are `provider/model` (e.g. `anthropic/claude-sonnet-4-5`,
`openai/gpt-5.2`, `google/gemini-2.5-pro`), passed verbatim to pi's
`--model`. pi resolves the provider from the prefix. An empty/unset
model lets pi pick its configured default. A bare model with no
provider prefix is forwarded as-is and resolved against pi's default
provider — authors should prefer the explicit `provider/model` form.
The stylesheet's `reasoning_effort` (§10) maps to pi's thinking level
and may be appended as `provider/model:<level>` in a later revision.
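
Attractor forwards the model string to pi unchanged, so no parsing is
required for dispatch. A helper like the following would only serve a
lint message or log line that nudges authors toward the explicit form;
`split_model_string` is a hypothetical name, not part of the package.

```python
from typing import Optional, Tuple


def split_model_string(model: Optional[str]) -> Tuple[Optional[str], Optional[str]]:
    """Split 'provider/model' into (provider, model).

    Returns (None, None) for an empty or unset string and (None, model)
    for a bare model that has no provider prefix.
    """
    if not model:
        return None, None
    provider, sep, name = model.partition("/")
    if not sep:
        return None, model
    # Only the first slash separates the provider; the rest is the model id.
    return provider, name
```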

<!-- [doc->REQ-LLM-AUTH-ENV] -->
### 9.4 Authentication

Authentication is delegated entirely to pi, which resolves credentials
in this order (see pi's providers docs):

1. provider env vars (`ANTHROPIC_API_KEY`, `OPENAI_API_KEY`,
   `GEMINI_API_KEY`, …) for CI / scripted runs;
2. stored credentials in `~/.pi/agent/auth.json` — API keys or OAuth
   tokens from a subscription login (`pi`'s `/login` for Claude
   Pro/Max, ChatGPT Plus/Pro Codex, GitHub Copilot).

The engine never reads or writes credentials — it neither inspects
`~/.pi/agent/auth.json` nor writes any API key to checkpoints,
events, or git refs. Its only job is to dispatch pi; pi owns auth
resolution. The tool-script executor still scrubs the provider key
env vars from tool-node environments (§6.8), so workflow `bash`
scripts never see them.
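
The scrubbing step might look like the sketch below. The variable list
covers only the three names shown above (the source list is open-ended),
and `scrubbed_env` is a hypothetical helper name.

```python
import os
from typing import Dict, Iterable, Mapping, Optional

# Only the names listed in this section; the real list is longer.
PROVIDER_KEY_VARS = ("ANTHROPIC_API_KEY", "OPENAI_API_KEY", "GEMINI_API_KEY")


def scrubbed_env(
    base: Optional[Mapping[str, str]] = None,
    extra_blocked: Iterable[str] = (),
) -> Dict[str, str]:
    """Return a copy of the environment without provider key variables."""
    source = os.environ if base is None else base
    blocked = set(PROVIDER_KEY_VARS) | set(extra_blocked)
    return {key: value for key, value in source.items() if key not in blocked}
```

The result can be passed as `env=` when spawning a tool-node subprocess,
while the pi subprocess inherits the unscrubbed environment.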

If a workflow needs additional credentials for tool scripts (a
GitHub token for `gh`, an SSH key for `git clone`, etc.), the
workflow author exports those as environment variables in the run
environment, the same way provider keys are supplied.

---

<!-- [doc->REQ-ROUTING-STYLESHEET] -->
## 10. Multi-Model Routing (Stylesheet)

CSS-like syntax on the `model_stylesheet` graph attribute:

```
*               { model: cerebras/gpt-oss-120b;        reasoning_effort: low;  }
.coding         { model: openai-codex/gpt-5.3-codex;   reasoning_effort: high; }
.review         { model: anthropic/claude-opus-4-5;    reasoning_effort: high; }
#critical_node  { model: anthropic/claude-opus-4-7;    reasoning_effort: high; }
```

Model strings are `provider/model` (§9.3); the stylesheet routes
across providers because pi does. The example above runs cheap/fast
nodes on Cerebras' GPT-OSS, coding nodes on an OpenAI Codex
subscription, and review/critical nodes on Anthropic — all three
verified end-to-end via the agent live test (`anthropic/...`,
`openai-codex/gpt-5.3-codex`, `cerebras/gpt-oss-120b`). `openai-codex`
is the ChatGPT subscription route; `openai/...` is the API-key route.

Selectors: `*` (all), `.class` (matches node `class` attribute), `#id`
(matches node ID). Specificity: id > class > universal. Last-wins within the
same specificity bucket. Per-node attributes (`model="..."`) override the
stylesheet.

Last-wins applies **per declaration**, not per rule — same as CSS. A
later `.heavy { model: X; }` overrides an earlier `.heavy { model: Y;
reasoning_effort: high; }` only on `model`; the earlier rule's
`reasoning_effort` survives. Rules contribute declarations into the
effective config; they don't replace prior rules wholesale.

Selectors match every node (`start`, `exit`, `agent`, `human`, `tool`).
The stylesheet-settable attributes (`model`, `reasoning_effort`) are
agent-only per §5.5; on non-agent nodes the engine resolves them but
does not apply them. A universal `* { model: ... }` rule is therefore
safe to write — it has no effect on `start`, `exit`, `human`, or
`tool` nodes.
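
The resolution rules above (specificity, last-wins per declaration,
per-node override) can be sketched as follows. The `Rule` type and
`resolve` function are hypothetical names, and treating the node `class`
attribute as a whitespace-separated list is an assumption.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Rule:
    selector: str                 # "*", ".class", or "#id"
    declarations: Dict[str, str]  # e.g. {"model": "...", "reasoning_effort": "..."}


def _specificity(selector: str) -> int:
    if selector.startswith("#"):
        return 2
    if selector.startswith("."):
        return 1
    return 0


def _matches(selector: str, node_id: str, node_class: Optional[str]) -> bool:
    if selector == "*":
        return True
    if selector.startswith("#"):
        return selector[1:] == node_id
    if selector.startswith("."):
        # Assumption: `class` may hold several whitespace-separated names.
        return node_class is not None and selector[1:] in node_class.split()
    return False


def resolve(
    rules: List[Rule],
    node_id: str,
    node_class: Optional[str] = None,
    node_attrs: Optional[Dict[str, str]] = None,
) -> Dict[str, str]:
    """Compute the effective config for one node."""
    effective: Dict[str, str] = {}
    # sorted() is stable: source order survives inside each specificity
    # bucket, so later declarations overwrite earlier ones key by key.
    for rule in sorted(rules, key=lambda r: _specificity(r.selector)):
        if _matches(rule.selector, node_id, node_class):
            effective.update(rule.declarations)
    # Per-node attributes win over anything from the stylesheet.
    effective.update(node_attrs or {})
    return effective
```

Because rules merge key by key, a later `.heavy { model: X; }` leaves an
earlier `.heavy` rule's `reasoning_effort` in place, matching the
per-declaration behaviour described above.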

---

<!-- [doc->REQ-HUMAN-GATE] -->
## 11. Human-in-the-Loop

`hexagon` nodes block the run. The engine emits a `HumanGate` event with:

- Node ID, label, and prompt. If the node has no `prompt` attribute,
  the event's `prompt` field defaults to the node's `label`.
- The labels of the node's outgoing edges (the available choices).

The host (CLI / future UI) collects the user's response and submits it
back. The engine matches the submitted string against an edge label, then
continues down that edge. The submitted value is a label string verbatim
(`approve`, `revise`) — no derived shortcuts. Hosts may offer their own
keystroke UX, but the engine only sees the label.

Hosts may attach an optional `reason` string with the submission on
**any** choice — not only rejection-style ones. If present, the engine
includes it in the next agent visit's context payload (§6.3) as
`predecessor_reason`, regardless of which edge label the user selected.
A `revise` choice that loops back to an agent is the canonical case,
but an `approve` choice may carry rationale forward the same way. The
§13 CLI exposes this as `--reason <text>` on `run respond`.

If the user needs to inspect files before deciding, the host reads them
from the run's worktree (§6.7). The engine doesn't attach artifacts to
the gate event — that's a host concern.

For long pauses, the run can be **suspended to disk** (write checkpoint,
exit the process). Resume picks up at the gate.

### 11.1 Edge cases

These behaviours were pinned by the v0.1 implementation; they're spec
rules, not implementation accidents:

- **Unlabeled outgoing edges from a hexagon are unreachable.** The
  engine only emits LABELED edges in `HumanGate.choices`. A hexagon
  whose only outgoing edge is unlabeled produces an empty `choices`
  list, so `respond()` always fails with `UnknownChoice`. Rationale:
  human gates are explicit decision points; a silent default route
  would obscure intent.
- **Duplicate edge labels route to the first-declared edge.** If a
  graph has two `hexagon -> X [label="approve"]` edges, the first
  one in source order wins. `attractor validate` does NOT flag the
  duplicate (validator catches structural errors; this is ambiguity,
  not invalidity).
- **Unmatched choice → `UnknownChoice`.** The submitted value must
  match a labeled edge exactly. No prefix matching, no fuzzy match,
  no re-prompt loop — re-prompting is a host UI concern.
- **Stale response after graph edit.** The journal's
  `RunInitialized` entry preserves the DOT snapshot at run start;
  resume re-validates the responded choice against that snapshot.
  Edits to the workflow file after the pause do NOT change which
  choices are valid for a given run.
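
The first three rules can be captured in a few lines. This is a sketch
with stand-in types: `Edge` is hypothetical, and the local
`UnknownChoice` class only mimics the exported error of the same name.

```python
from dataclasses import dataclass
from typing import List, Optional


class UnknownChoice(Exception):
    """Stand-in for the engine's UnknownChoice error."""


@dataclass
class Edge:
    target: str
    label: Optional[str] = None


def gate_choices(edges: List[Edge]) -> List[str]:
    """Labels offered at the gate: labeled edges only, in source order."""
    return [edge.label for edge in edges if edge.label]


def route(edges: List[Edge], choice: str) -> Edge:
    """Pick the outgoing edge for a submitted choice."""
    for edge in edges:
        # Exact string match only; the first declared edge wins on duplicates.
        if edge.label and edge.label == choice:
            return edge
    raise UnknownChoice(choice)
```

For the stale-response rule, the same functions would be applied to the
edges parsed from the DOT snapshot stored at run start rather than to
the current workflow file.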

---

## 12. Package Layout

```
pyproject.toml          # UV-managed project + console_scripts entry point
src/
  attractor/
    __init__.py
    cli/                # Click/Typer subcommands; entry point lives here
    engine/             # Layer 3: DOT-graph traversal, dispatch, dual-commit
    workflow/           # DOT parser, validator, typed Graph, stylesheet
    agent/              # thin wrapper around the pi RPC session (§8)
    llm/                # auth + retry helpers (sparse; pi owns most)
    execution/          # LocalExec (asyncio.subprocess + capture + timeout)
                        # (named `execution` to avoid shadowing builtins.exec)
    worktree/           # per-run git worktree manager (pygit2 + git CLI)
    checkpoint/         # git-backed BranchStore (pygit2; opaque blob store)
    testing/            # test scaffolding for hosts (see §12.1); not
                        # part of the stable host API contract
tests/                  # pytest tests, mirror module layout
workflows/              # DOT example library (read-only)
```

Single package `attractor`, installable via UV (`uv tool install
attractor` for users, `uv sync` for contributors). Submodules layer
strictly: `cli` → `engine` → `workflow` / `agent` / `execution` /
`worktree` → `checkpoint` / `llm`. Module boundaries enforce the
dependency direction; v0.1 does not pay for separate package
boundaries on these.

`checkpoint` keeps its own boundary: it depends only on `pygit2` and
stores opaque blobs — it doesn't know about engine `Outcome` or
`Event` types. Those live in the `engine` module.

Sandbox: v0.1 calls a `LocalExec` class directly (in `execution/`). An
`ExecutionEnvironment` protocol lands with container execution (§14).
Utility helpers (redaction, terminal formatting) live wherever they're
used; no `util` module until something is genuinely reused.

If an out-of-tree caller appears for `workflow`, `agent`, `execution`,
`worktree`, or `checkpoint`, the module is extracted to its own PyPI
package at that point. The single-package layout is the cheap default,
not a permanent decision.

<!-- [doc->REQ-API-IN-PROCESS] -->
### 12.1 In-Process Host API

The bundled CLI is one consumer of the engine; an external orchestrator
embedding Attractor in its own Python process is another. To keep the
contract honest, the top-level `attractor` package re-exports the names
a host needs and nothing else.

**Stable surface (importable as `from attractor import ...`):**

- `Engine` — the orchestrator. Construct with `Engine(repo_root)`;
  drive path-based host launches with `await engine.launch(...)`,
  lower-level graph launches with `await engine.run(...)` /
  `await engine.start(...)`, and existing runs with
  `await engine.resume(...)`, `await engine.respond(...)`. Use
  `await engine.start(...)` when the host already has a `ValidGraph`
  and needs the run_id while traversal is still in flight (see below).
- `RunStatus` — terminal / paused state returned by `run` / `resume`.
- `RunSession` — live handle returned by `Engine.start()`. Exposes
  `run_id: str` immediately, `await session.wait() -> RunStatus` for
  the terminal status, `session.done() -> bool` for a non-blocking
  check, and `session.cancel() -> bool` to request abrupt
  termination. `cancel()` delegates to `asyncio.Task.cancel`;
  cancellation is delivered at the traversal's next `await` point
  and `resume(run_id)` picks up from the last checkpoint per §6.4
  (same recovery path as a hard kill — case (a) if the cancel landed
  between nodes, case (b) if mid-node).
- Typed events — `RunStarted`, `NodeStarted`, `NodeOutcome`,
  `HumanGate`, `Checkpointed`, `AgentToolUse`, `RunCompleted`,
  `Aborted`, plus the `EngineEvent` union and the `EventCallback`
  type alias. Events are a synchronous callback by design (§6, §11);
  a host that wants async iteration wraps the callback into its own
  queue.
  <!-- [doc->REQ-AGENT-PROGRESS-EVENTS] -->
  `AgentToolUse(run_id, node_id, tool_name, args_preview)` fires once
  per tool invocation pi emits during an agent node's session,
  arriving between that node's `NodeStarted` and `NodeOutcome`.
  `args_preview` is a short, secrets-conservative summary: each
  value is str-coerced and truncated at 40 chars, and the whole
  line is truncated at 100 chars. Tool names plus typical first-arg
  payloads (file paths, command names, grep patterns) survive
  intact; long prompts and free-form strings get clipped. Surfaces
  what would otherwise be a multi-minute silent gap during heavy
  agent nodes (issue #6). v1 covers tool calls only — assistant
  message text and per-message token deltas are out of scope until
  the secret-preview policy is reviewed for those richer signals.
- Errors — `EngineError` (base), `UnknownRun`, `NotPausedAtGate`,
  `UnknownChoice`, `WorkflowNotValid`, `AbortNotPaused`.
- Inspection — `RunHandle`, `RunSummary`, `NodeRecord` (returned by
  `engine.list_runs()` / `engine.show_run(run_id)`).
- Workflow loading — `parse`, `validate`, `ValidGraph`, `ParseError`,
  `ValidationFailed`. Required to produce the `ValidGraph` argument
  `Engine.run` expects.
- `InputCopy` — describes per-run input files seeded into the worktree
  before traversal.
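
The `args_preview` truncation described in the events bullet could be
approximated as below. The 40 and 100 character limits come from the
text; the `key=value` layout, the space separator, and the absence of an
ellipsis marker are assumptions of this sketch.

```python
from typing import Any, Mapping

VALUE_LIMIT = 40   # per-value cap stated above
LINE_LIMIT = 100   # whole-line cap stated above


def args_preview(args: Mapping[str, Any]) -> str:
    """Summarize tool arguments as a short, clipped one-liner."""
    parts = [f"{key}={str(value)[:VALUE_LIMIT]}" for key, value in args.items()]
    return " ".join(parts)[:LINE_LIMIT]
```

Short values such as file paths pass through whole, while a long prompt
is cut to its first 40 characters before the line-level cap applies.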

**Stability contract.** Names re-exported from `attractor` are stable:
they keep their public signatures and semantics across minor versions,
or change behind a deprecation window. Names reachable only through a
submodule path (`attractor.engine.journal.NodeCompleted`,
`attractor.workflow.Stylesheet`, etc.) are internal: usable, but
subject to change without notice. Journal entry types in particular
are available for tests and advanced inspection but are not part of
the host contract.

A worked example — construct an engine, parse + validate a workflow,
stream events through a callback, handle a human gate — lives in
`docs/host-api.md`.

<!-- [doc->REQ-LAUNCH-PATH-PREFLIGHT] -->
`Engine.launch(workflow_path, ...)` is the host-level launch boundary.
It accepts a workflow artifact path, workflow arguments, input files,
and a base ref; reads/parses/validates the DOT artifact; binds workflow
arguments; preflights input source files and base-ref resolution; then
creates an addressable run and returns a `RunSession`. A preflight
failure rejects the launch before any run refs or worktree directories
are created. Provider-auth preflight and relaunch-from-snapshot are
explicitly deferred until a host workflow demonstrates need.

<!-- [doc->REQ-API-RUN-SESSION] -->
`Engine.run` is the one-shot graph path: it awaits the entire traversal
and returns the terminal `RunStatus`. `Engine.start` is the live-handle
graph path: it performs the eager setup (validate, mint run_id, create
worktree, write the `RunInitialized` entry, emit `RunStarted`) and
returns a `RunSession` whose `run_id` is queryable via
`engine.show()`/`engine.list()` before traversal completes. The two
are equivalent under composition:
`await engine.run(...) == await (await engine.start(...)).wait()`.
Hosts that have a workflow artifact path pick `launch()`; hosts that
already have a `ValidGraph` and just want a status pick `run()`; hosts
that already have a `ValidGraph` and want to address the run while it
executes (correlate logs, surface a cancellation UI, attach inspection)
pick `start()`.

Re-exports are lazy (PEP 562 `__getattr__`). This is load-bearing:
`import attractor.checkpoint` must not transitively pull network
modules (httpx / urllib / etc.) per REQ-STORE-NO-NETWORK, and eager
top-level imports of `Engine` would breach that contract by dragging
in the Claude Agent SDK. Lazy access keeps the ergonomic
`from attractor import Engine` form while preserving the no-network
guarantee for narrow checkpoint imports.
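
A minimal sketch of the PEP 562 pattern, written as a helper so it can
be exercised on a throwaway module. In a real `__init__.py` the
`__getattr__` function would be defined at module level; the helper name
`install_lazy_exports` and the demo mapping to the standard library are
illustrative only.

```python
import importlib
import types
from typing import Dict, Tuple


def install_lazy_exports(
    module: types.ModuleType,
    exports: Dict[str, Tuple[str, str]],
) -> None:
    """Attach a PEP 562 __getattr__ that imports targets on first access.

    `exports` maps a public name to (submodule path, attribute name).
    """

    def __getattr__(name: str):
        try:
            target_module, attr = exports[name]
        except KeyError:
            raise AttributeError(
                f"module {module.__name__!r} has no attribute {name!r}"
            ) from None
        value = getattr(importlib.import_module(target_module), attr)
        setattr(module, name, value)  # cache: later lookups skip __getattr__
        return value

    module.__getattr__ = __getattr__


# Demo on a synthetic module; the target stands in for a heavy submodule.
demo = types.ModuleType("demo_pkg")
install_lazy_exports(demo, {"JSONDecoder": ("json", "JSONDecoder")})
```

Nothing from the target module is imported until `demo.JSONDecoder` is
first accessed, which is the property that keeps a narrow submodule
import from pulling in the rest of the package.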

<!-- [doc->REQ-API-TESTING-HELPERS] -->
**Test scaffolding lives in `attractor.testing`** and is explicitly
NOT part of this stability contract. It currently exports
`create_test_repo(path)` — a helper that initialises a fresh git repo
suitable for `Engine(path)` with `commit.gpgsign` / `tag.gpgsign`
disabled in local config so signing failures in CI / sandbox
environments don't break tests. Hosts use it from their own test
suites; production code (orchestrators driving real user runs) MUST
NOT import it — the host is responsible for the user's git repo and
its config. Symbols in `attractor.testing` are not in
`attractor.__all__` and may change in any release. The Attractor test
suite uses `create_test_repo` via the `seeded_repo` / `dod_repo`
fixtures, so it's exercised on every CI run.

---

<!-- [doc->REQ-CLI-SCAFFOLD] -->
<!-- [doc->REQ-CLI-RUN-LIFECYCLE] -->
<!-- [doc->REQ-CLI-INSPECT] -->
## 13. CLI Surface (v0)

```
attractor init                                          # set up repo for runs
attractor run <workflow.dot> [ARG...] [--input <name>=<path>...] # start a new run
attractor run resume <run-id>                           # resume from last checkpoint
attractor run respond <run-id> <choice> [--reason <s>]  # answer a human gate (§11)
attractor run abort <run-id>                            # terminate a paused run
attractor run list                                      # list runs in this repo
attractor run show <run-id>                             # status, events, current stage
attractor render <workflow.dot>                         # re-emit parsed graph as canonical DOT
attractor validate <workflow.dot>                       # parse + lint, no execution
```

<!-- [doc->REQ-LAUNCH-PATH-PREFLIGHT] -->
`attractor run <workflow.dot>` is the CLI form of path-based launch.
It rejects parse/validation/argument/input/base-ref preflight failures
before creating run refs or worktrees. Once launch succeeds, it prints
the run id as the launch receipt, stays attached by default, and streams
events until the run completes, pauses, or ends incomplete.

`--input <name>=<path>` (repeatable) copies a host file into
`<worktree>/<name>` before traversal begins. Use it to pass per-run
inputs (`goal.md`, `issue.json`, etc.) that early nodes read. See
§6.8.
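
Parsing one `--input` value might look like this. Both function names
are hypothetical, and rejecting an empty name or path is a choice made
for the sketch rather than documented behaviour.

```python
from pathlib import Path
from typing import Tuple


def parse_input_spec(spec: str) -> Tuple[str, Path]:
    """Split '<name>=<path>' at the first '='."""
    name, sep, path = spec.partition("=")
    if not sep or not name or not path:
        raise ValueError(f"expected <name>=<path>, got {spec!r}")
    return name, Path(path)


def destination(worktree: Path, name: str) -> Path:
    """Where the host file is copied: <worktree>/<name>."""
    return worktree / name
```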

Extra positional `ARG` values bind to the workflow's graph-level
`arguments` declarations (§6.8). Bound values are written to
`<worktree>/attractor-args.json` before traversal begins. Values that
begin with `-` should be passed after `--` so Click does not parse them
as CLI options.

`run respond` is the answer-channel for human gates (§11). When the
engine emits a `HumanGate` event, submit a label string matching one
of the gate's outgoing edges and an optional reason. Foreground
`attractor run` sessions may also accept gate input on stdin; the
`respond` subcommand works whether the run is foreground, suspended,
or being driven by a non-CLI host.

<!-- [doc->REQ-CLI-RUN-ABORT] -->
`run abort <run-id>` is the clean-exit path for a `paused` run the
user has decided not to resume. It writes a journal entry that
transitions the run to `incomplete`; refs and the worktree are
preserved so the user can still inspect what happened, and a later
`prune` will remove them without needing `--force`. Abort refuses
on terminal runs (`completed` or `incomplete`) with exit code 2 —
those should go through `prune` instead. Active foreground runs
should still be stopped with Ctrl-C; `abort` is for runs that have
already checkpointed at a `paused` state.

<!-- [doc->REQ-CLI-RENDER] -->
`render <workflow.dot>` parses the file, validates structure
(same checks as `validate`), and writes the parsed graph back out
as canonical DOT on stdout. It's a diagnostic for "what does
attractor actually see in this file" — useful when an attribute
isn't taking effect or a comment is masking a node. Parse or
validation errors exit 2 with the same diagnostics as `validate`.
SVG output is not in this cut; `dot -Tsvg workflow.dot` still works
on raw or rendered DOT alike:

```
attractor render workflow.dot | dot -Tsvg > workflow.svg
```

To stop a runaway *foreground* run, Ctrl-C the process; `run resume`
picks up from the last checkpoint. `prune` is documented in §14.1.

---

## 14. MVP Cut (what ships in v0.1)

In:

- DOT parser for the accepted subset.
- Engine traversal: start, exit, agent (`box`), human (`hexagon`),
  tool (`parallelogram`).
- Checkpoint/resume against local refs.
- Anthropic via Claude Agent SDK (with default skills/hooks/MCP/
  built-in tools available to agent nodes).
- Local execution environment for tool nodes (`asyncio.subprocess`).
- CLI: `init`, `run` (with `--input`), `run resume`, `run respond`,
  `run list`, `run show`, `validate`.
- Stylesheet with `*`, `.class`, `#id` selectors.

Deferred to v0.2+:

- Parallel (`component`, `tripleoctagon`).
- Sub-workflow (`house`).
- Non-Anthropic providers.
- `render`, `run abort`, and `prune` commands.
- Container execution environment.
- TUI / web UI.
- Per-node `hooks` scoping — `allowed_tools` and `mcp_servers` shipped
  in §5.5; `hooks` is deferred pending a hook-bundle registry design
  (§15 open question 4).
- Explicit sub-workflow dispatch via the SDK's `Task` tool.

<!-- [doc->REQ-CLI-PRUNE-SELECTORS] -->
### 14.1 `attractor prune` (implemented post-v0.1)

The deferred `prune` command above has the clearest user pull —
every workflow run leaves a `state` branch and a `worktree` branch
under `refs/heads/attractor/run/<id>/...` (per §7.2). v0.1 keeps
them all, intentionally, so the user can `git log` history and
`git diff` worktree contents. After a few dozen runs the ref list
gets noisy. This subsection documents the shipped surface; for the
implementation, see `attractor.engine.prune`,
`attractor.cli.prune`, and `Engine.prune_run` /
`Engine.auto_prune_completed` in `attractor.engine.engine`.

**Surface**

```
attractor prune [--run-id ID]... [--status STATUS]...
                [--older-than DURATION]
                [--all-completed]
                [--dry-run]
                [--force]
```

- `--run-id ID` — prune a specific run. Repeatable.
- `--status completed|incomplete|paused` — by status. Repeatable.
- `--older-than 7d` / `30d` / `12h` — by `started_at` age.
- `--all-completed` — convenience for `--status completed`.
- `--dry-run` — list what would be pruned; don't touch refs or dirs.
- `--force` — required to prune `paused` runs (they may still be
  awaiting a response) or runs whose worktree directory still exists with
  uncommitted changes.

Selectors compose with AND across categories and OR within
(`--status completed --status incomplete --older-than 7d` means
"COMPLETED or INCOMPLETE, AND older than 7 days").

Calling `prune` with no selectors errors out — you don't want
typo-prune-everything semantics. The error suggests `--all-completed`
or `--dry-run` as the next move.
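
The selector semantics (OR within a category, AND across categories,
error on no selectors) can be sketched as follows. The `Run` record and
function names are hypothetical, only the `h` and `d` duration suffixes
shown above are handled, and "older than" is read as strictly greater.

```python
import re
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import AbstractSet, Iterable, List, Optional

_UNITS = {"h": "hours", "d": "days"}


def parse_duration(text: str) -> timedelta:
    """Parse values such as '7d' or '12h'."""
    match = re.fullmatch(r"(\d+)([hd])", text)
    if not match:
        raise ValueError(f"bad duration: {text!r}")
    return timedelta(**{_UNITS[match.group(2)]: int(match.group(1))})


@dataclass
class Run:
    run_id: str
    status: str           # "completed", "incomplete", or "paused"
    started_at: datetime


def select_runs(
    runs: Iterable[Run],
    run_ids: AbstractSet[str] = frozenset(),
    statuses: AbstractSet[str] = frozenset(),
    older_than: Optional[timedelta] = None,
    now: Optional[datetime] = None,
) -> List[Run]:
    """Apply prune selectors; an unset category does not filter."""
    if not run_ids and not statuses and older_than is None:
        raise ValueError("prune requires at least one selector")
    now = now or datetime.now(timezone.utc)
    selected = []
    for run in runs:
        if run_ids and run.run_id not in run_ids:
            continue
        if statuses and run.status not in statuses:
            continue
        if older_than is not None and now - run.started_at <= older_than:
            continue
        selected.append(run)
    return selected
```

Under this reading, `--all-completed` would simply be passed as
`statuses={"completed"}`.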

<!-- [doc->REQ-CLI-PRUNE-SAFETY] -->
**Behaviour per run**

For each resolved target run-id:

1. Resolve `RunSummary` via `Engine.show(run_id)`. If the state
   branch is missing entirely, warn and skip (corruption).
2. Refuse to prune `paused` runs unless `--force` is set. Refuse to
   prune runs whose worktree dir still exists AND has uncommitted
   changes unless `--force`.
3. If `--dry-run`, print one line per target and exit.
4. Delete refs under `refs/heads/attractor/run/<id>/`:
   - `state`, `worktree`, and any `agent/<node>` refs.
5. Remove the worktree directory via `git worktree remove --force`
   if still present, then `git worktree prune` to clean up the
   admin record. (The directory is normally gone for COMPLETED
   runs per §6.7; INCOMPLETE/PAUSED retain it.)
6. Print `pruned <id> <status>` for each.

`prune` does NOT touch `refs/heads/attractor/meta/` — that's
engine-wide metadata, not per-run. It also does NOT run `git gc`;
that's a user concern.
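
Steps 2 and 3 amount to a small decision function. The sketch below
assumes the status and worktree state have already been looked up;
`prune_action` and its string results are hypothetical names.

```python
def prune_action(
    status: str,
    worktree_exists: bool,
    worktree_dirty: bool,
    force: bool,
    dry_run: bool,
) -> str:
    """Return 'refuse', 'report', or 'prune' for one target run."""
    if not force:
        if status == "paused":
            return "refuse"      # paused runs need --force
        if worktree_exists and worktree_dirty:
            return "refuse"      # uncommitted changes need --force
    if dry_run:
        return "report"          # list the target, touch nothing
    return "prune"               # delete refs, remove worktree
```

Checking the refusal rules before `--dry-run` means a dry run reports
the same refusals a real run would hit.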

**Exit codes**

- `0` — at least one run pruned (or `--dry-run` succeeded).
- `1` — nothing to prune given the selectors. (Different from `0`
  so shell pipelines can branch on "did anything happen.")
- `2` — selector parse error, ambiguous match, or refusal that
  would need `--force`.

**Tests**

Unit:
- `Engine.prune_run(run_id)` deletes refs and worktree dir; returns
  a summary object listing what was removed.
- Safety: prune refuses paused without `--force`; prune refuses a
  dirty worktree without `--force`.
- Empty/missing state branch: prune warns + skips.

Integration:
- End-to-end via CliRunner: run a tool-only workflow to completion,
  call `prune --all-completed`, assert refs are gone and `attractor
  run list` shows no runs.
- `--dry-run` produces the same line output without removing.

**Out of scope for v0.2**

- Garbage-collecting the underlying git objects. The refs are
  removed; the user runs `git gc` themselves (already documented
  at §7.4).
- Reaping orphaned `attractor/run/.../agent/...` refs when the
  parent `state` ref is gone. Belongs in a separate `attractor doctor`
  command; deferred to v0.3.

<!-- [doc->REQ-CLI-AUTOPRUNE] -->
**Automatic pruning on `attractor run`**

Manual `prune` covers the explicit cleanup case. The implicit case —
"I've been using attractor for a month and have 80 stale COMPLETED
runs cluttering `git branch -l`" — wants pruning that happens
without the user thinking about it. Every `attractor run` invocation
performs an automatic prune pass before the new run starts:

- **Scope.** COMPLETED runs only. INCOMPLETE runs hold failure
  context the user may want to investigate; PAUSED runs are
  literally waiting on the user. Both are kept regardless of age.
- **Default retention.** 5 days, measured from `started_at`. Old
  enough that "yesterday's broken run" is still inspectable; short
  enough that routine use doesn't accumulate refs forever. Not
  configurable in v0.2 — users who want a different cutoff invoke
  manual `prune --older-than X` themselves, or pass `--no-auto-prune`
  to skip the auto pass entirely.
- **Disable per-invocation.** `attractor run --no-auto-prune
  <workflow.dot>` skips the auto pass for that run. Useful when
  scripting against a known set of run-ids, or when running inside
  a tight test loop.
- **Failure handling.** If the auto-prune pass raises (refs
  corrupted, worktree locked, anything), log a one-line warning
  to stderr and continue with the new run. Auto-prune is
  best-effort; it must not block forward progress.
- **Output.** Silent on zero pruned. On `N > 0` pruned, print one
  line to stderr: `auto-prune: removed N completed run(s) older
  than 5d`. The new run's banner / events stream follow normally.

Implementation reuses the same `Engine.prune_run` primitive; the
auto pass is just a `prune --status completed --older-than 5d`
behind the scenes, with the warn-and-continue policy on errors.
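
The warn-and-continue policy can be sketched as a thin wrapper. Here
`prune_pass` is a hypothetical callable that performs the pass and
returns the number of runs removed, and the wording of the warning line
is an assumption; the success line follows the format given above.

```python
import io
import sys
from typing import Callable, TextIO

RETENTION = "5d"  # fixed retention in v0.2


def auto_prune(
    prune_pass: Callable[[], int],
    enabled: bool = True,
    stderr: TextIO = sys.stderr,
) -> int:
    """Run the best-effort prune pass; never raise into the caller."""
    if not enabled:               # --no-auto-prune
        return 0
    try:
        removed = prune_pass()
    except Exception as exc:      # best-effort: warn and continue
        print(f"auto-prune: skipped ({exc})", file=stderr)
        return 0
    if removed > 0:               # silent when nothing was pruned
        print(
            f"auto-prune: removed {removed} completed run(s) older than {RETENTION}",
            file=stderr,
        )
    return removed


# Usage with a captured stream instead of the real stderr.
buf = io.StringIO()
auto_prune(lambda: 3, stderr=buf)
```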

**Out of scope for v0.2 (auto-prune)**

- User-configurable retention (per-repo or per-workflow). v0.2 ships
  with the 5-day default; configurability lands when a user
  demonstrates need. Manual `prune --older-than` already covers
  one-off needs.
- Auto-pruning of INCOMPLETE / PAUSED runs. Both have user-facing
  signal value and should never be touched without explicit
  consent (the manual `--force` path covers it).

**Requirement IDs (in `traceable-reqs.toml`, doc tags above)**

- `REQ-CLI-PRUNE-SELECTORS` — the surface above.
- `REQ-CLI-PRUNE-SAFETY` — refusal rules for paused / dirty / no-
  selector cases.
- `REQ-CLI-AUTOPRUNE` — the on-`run` auto-prune pass: COMPLETED-
  only, 5-day retention, warn-and-continue, `--no-auto-prune`
  opt-out.

### 14.2 v0.2 priorities

§14 lists what was deferred; this subsection orders that list and
gives one-line rationale. Ordering is guidance — the next agent
picks based on signal at the time — but the dependency notes
between items are binding.

1. **`attractor render` and `run abort`.** ✅ Shipped in v0.2 —
   see §13 for the surface. Both are thin wrappers over existing
   engine state: `render` re-emits the parsed graph as canonical
   DOT (no SVG; pipe to `dot -Tsvg` if needed); `run abort`
   transitions a `paused` run to `incomplete` via a new journal
   entry, preserving refs for inspection.
2. **Parallel nodes (`component`, `tripleoctagon`).** Foundational
   graph capability and the largest engine change in v0.2. Unblocks
   (3) and the canonical concurrent-verification idiom (§6.6,
   sequential-only in v0.1). Touches the traversal scheduler,
   checkpoint semantics for concurrent writes, and the events
   stream.
3. **Sub-workflow (`house`).** A node references another `.dot`
   file and runs it as a child. Requires (2) for parallel sub-
   workflows; a strictly serial implementation could ship without
   it — decide at scoping time.
4. **Per-node `hooks` scoping.** `allowed_tools` and `mcp_servers`
   per-node scoping shipped (§5.5). `hooks` is deferred: the SDK's
   hook callbacks are Python callables, not strings, so a hook-bundle
   registry must be designed before the DOT attribute can be wired up.
   Layer-2 work, no graph changes needed beyond the registry.
5. **Non-Anthropic providers (OpenAI, Gemini).** §9.2 enumerates
   them. Layer-1 work: add transports, normalize on the Claude
   Agent SDK's message shape, extend the stylesheet model-string
   parser. The stylesheet selector grammar already handles multi-
   provider routing.
6. **Container execution environment.** Sandboxed tool execution
   (Docker / Podman). Big surface: mount strategy, env propagation,
   network policy, lifecycle. Local-first constraint stands —
   containers run on the user's machine.
7. **TUI.** Not blocking anything. §2 calls web UI a v0 non-goal,
   but TUI is fair game — implement against the events stream that
   any future UI consumes.
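
Item 4 turns on mapping a string-valued DOT attribute to Python callables. One possible shape for such a hook-bundle registry is sketched below. Everything here is an assumption for illustration: the `HookRegistry` class, the `Hook` signature, and the comma-separated attribute syntax are hypothetical, and none of it reflects the SDK's actual callback API.

```python
from typing import Callable, Dict, List

# A hook is modelled here as any callable taking an event dict; the real
# SDK callback signature is not specified in this document.
Hook = Callable[[dict], None]


class HookRegistry:
    """Maps bundle names (usable as DOT attribute strings) to callables."""

    def __init__(self) -> None:
        self._bundles: Dict[str, List[Hook]] = {}

    def register(self, name: str, hooks: List[Hook]) -> None:
        """Register a named bundle; duplicate names are rejected."""
        if name in self._bundles:
            raise ValueError(f"hook bundle already registered: {name!r}")
        self._bundles[name] = list(hooks)

    def resolve(self, attr_value: str) -> List[Hook]:
        """Resolve an attribute value such as "audit,lint" (assumed syntax)."""
        hooks: List[Hook] = []
        for name in (part.strip() for part in attr_value.split(",")):
            if not name:
                continue  # tolerate empty segments and an empty attribute
            if name not in self._bundles:
                raise KeyError(f"unknown hook bundle: {name!r}")
            hooks.extend(self._bundles[name])
        return hooks
```

Failing loudly on an unknown bundle name would let a validation step reject the graph before a run starts, rather than silently dropping a hook mid-run.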

**Deferred past v0.2** (with reason):

- `attractor doctor` — §14.1 pins this at v0.3.
- Web UI — §2 explicit v0 non-goal; revisit when a TUI exists.
- Daemon mode, mid-run graph editing, speculative parallel
  execution — §2 explicit non-features.
- Skill-discovery precedence (§15) — gated on a workflow that
  demonstrably hits the precedence question.
- User-configurable prune retention — §14.1 says "until a user
  demonstrates need."

No REQ IDs are allocated here. Each item gets its REQs added to
`traceable-reqs.toml` when concretely scoped (the pattern §14.1
followed for prune).

---

## 15. Open Questions

Most v0.1-blocking questions were resolved during the Rust v0.1
implementation (`rust-v0.1` tag): the stylesheet parser is
hand-rolled (§10), resume idempotency is the three-state machine
in §6.4, and human-gate edge cases are pinned in §11.1. Those
decisions carry forward to the Python implementation.

Resolved during the Python v0.1 implementation:

- **Predecessor revisit context.** Threaded through the engine's
  traversal loop and reconstructed from the journal on resume.
  Agent nodes entered via a tool/agent `FAILURE` edge or a
  human-gate choice edge receive the §6.3 entry-edge payload:
  `predecessor_node`, `entry_edge_label`, and `predecessor_outcome`,
  plus the predecessor's `captured_output` on `FAILURE` edges and
  `predecessor_reason` on human-gate edges that carry a reason.
- **Token accounting — per-run aggregation.** Per-call `Usage`
  from the SDK is summed inside `attractor.agent.runner` into the
  `AgentOutcome` returned to the engine; the engine writes the
  per-node totals into the `NodeCompleted` journal entry (and the
  matching `Tokens-In`/`Tokens-Out` commit trailers per §7.5), and
  aggregates the run total in `RunSummary`. `attractor run show`
  displays both per-node and per-run totals.
- **Windows PowerShell fallback.** Stays as SPEC §6.8 mandates:
  `pwsh` only. PowerShell 5.1 (`powershell.exe`) is **not** a
  silent fallback — the syntax surface (cmdlet defaults,
  `Invoke-RestMethod` JSON handling, etc.) drifts enough that a
  fallback would produce workflows that behave differently between
  hosts with and without PS 7 installed, undermining the "DOT file
  is the workflow" portability claim. Hosts without `pwsh` get an
  `ExecError` with an install link from `LocalExec`. Workflow
  authors who need PS 5.1 specifically can shell out to it
  explicitly from a tool script.
- **`loop_restart` edge attribute (upstream §2.7)** —
  **deliberate divergence from upstream.** Upstream defines a
  boolean `loop_restart` edge attribute that "terminates the
  current run and re-launches with a fresh log directory" when
  the engine follows the edge. Our engine covers the equivalent
  shapes via `retry_target` (REQ-EXEC-RETRY-TARGET, gate-failure
  jump) and the verify-fixup idiom (REQ-EXEC-VERIFY-FIXUP-IDIOM,
  composed from `max_visits` + `goal_gate` + `FAILURE` edge).
  Adding `loop_restart` would introduce a *third* retry channel
  with weaker semantics: the same-run / new-run boundary is
  ambiguous (new run id? same journal? resume behaviour?), and
  workflow authors would face a "which retry mechanism" choice
  with no clear guidance. Skip until a concrete use case
  surfaces that the existing primitives can't model. Workflow
  authors who need a hard reset today can start a fresh run with
  `attractor run` from the CLI.
  <!-- [doc->REQ-EXEC-RETRY-TARGET] -->
- **`RETRY` outcome status (upstream §3.5)** — **deliberate
  divergence from upstream.** Upstream attractor-spec defines a
  fourth `OutcomeStatus` value, `RETRY`, used by the engine's
  retry-with-backoff loop to re-invoke the same handler.  Our
  engine already covers the retry shape via two existing
  mechanisms: `max_visits` caps re-entries of any node (the
  verify-fixup idiom in §6.6 composes from that), and
  `REQ-EXEC-RETRY-TARGET` provides the "exit-with-unmet-gate jumps
  to a recovery node" path. Adding `RETRY` as a third channel
  would create three different ways to express "try again", and
  the additional value would force handler authors to decide
  between `RETRY` (engine re-invokes the handler in place) and
  `FAILURE` with `retry_target` (engine routes to a recovery node).
  That buys neither an expressivity gain nor a simpler workflow
  shape, so we skip `RETRY` as a typed value. Workflow authors who need the
  in-place retry semantics can model it with a `goal_gate`
  + `max_visits` loop the way `workflows/preflight.dot`
  demonstrates.  <!-- [doc->REQ-EXEC-OUTCOME-STATUSES] -->
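
The token-accounting item above describes a two-level fold: per-call usage into a per-node total, then per-node totals into a run total. A minimal sketch of that arithmetic follows. The `Usage` field names and the helper functions are assumptions; only the `Tokens-In` / `Tokens-Out` trailer keys come from this document, and the exact trailer value format in §7.5 may differ.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class Usage:
    """Per-call token counts; field names here are assumptions."""
    tokens_in: int = 0
    tokens_out: int = 0

    def __add__(self, other: "Usage") -> "Usage":
        return Usage(self.tokens_in + other.tokens_in,
                     self.tokens_out + other.tokens_out)


def sum_usage(calls: Iterable[Usage]) -> Usage:
    """Fold per-call usage into one total (the per-node figure)."""
    total = Usage()
    for call in calls:
        total = total + call
    return total


def run_total(per_node: Dict[str, Usage]) -> Usage:
    """Aggregate per-node totals into the per-run total."""
    return sum_usage(per_node.values())


def trailers(usage: Usage) -> str:
    """Render Tokens-In / Tokens-Out trailer lines for one node."""
    return f"Tokens-In: {usage.tokens_in}\nTokens-Out: {usage.tokens_out}"
```

For example, two calls of `Usage(100, 20)` and `Usage(50, 5)` in one node sum to `Usage(150, 25)`, which is the figure that would land in that node's journal entry and trailers.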

What remains:

- **Skill discovery precedence.** pi auto-loads skills from
  project (`.agents/skills/`, `.pi/skills/`) and user
  (`~/.agents/skills/`, `~/.pi/agent/skills/`) locations. When a name
  appears in more than one, pi's resolution order is the default; if a
  workflow needs a different precedence (e.g. pin a specific version),
  this needs a SPEC rule. Deferred until a workflow demonstrates need.
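
To make the open question concrete, here is a sketch of what an explicit precedence rule could look like if one were ever specified. It is purely illustrative: `resolve_skill` is a hypothetical helper, the first-match-wins rule is an assumption, and the sketch deliberately does not encode pi's actual resolution order, which the caller supplies. It also assumes a skill is an entry named `name` directly under a skills root.

```python
from pathlib import Path
from typing import Optional, Sequence


def resolve_skill(name: str, roots: Sequence[Path]) -> Optional[Path]:
    """Return the first root's copy of `name`, or None if no root has it.

    `roots` is ordered from highest to lowest precedence. The order is
    supplied by the caller; this sketch does not encode pi's own order.
    """
    for root in roots:
        candidate = root / name
        if candidate.exists():
            return candidate
    return None
```

Under this rule, swapping the order of the project and user roots is enough to flip which copy of a duplicated skill wins, which is the behaviour a SPEC rule would have to pin down.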

---

## 16. Definition of Done (v0.1)

A v0.1 release is done when:

<!-- [doc->REQ-DOD-WORKFLOW-RUN] -->
<!-- [doc->REQ-DOD-LOCAL-FIRST] -->
<!-- [doc->REQ-DOD-RESUME-BETWEEN-NODE] -->
<!-- [doc->REQ-DOD-RESUME-MID-NODE] -->
<!-- [doc->REQ-DOD-CRATE-INDEPENDENCE] -->
<!-- [doc->REQ-DOD-VALIDATE-ERRORS] -->

- A user can write a 5-node DOT workflow, run it end-to-end on a fresh repo,
  hit a human gate, resume it, and reach `exit`.
- All run state lives in local git refs under `refs/heads/attractor/`
  (§7.2). The engine makes no remote calls beyond the LLM provider API.
- A run interrupted at any node can be resumed with `run resume <id>` and
  reach `exit`. A *between-node* interruption produces the same final outcome
  as an uninterrupted run (modulo LLM non-determinism). A *mid-node*
  interruption resumes successfully but may produce a different result,
  because the resumed handler sees whatever partial state the interrupted
  attempt left in the worktree (per §6.3 / §6.7).
- The orchestrator, agent layer, and LLM transport are layered modules
  within `attractor` with one-way dependencies; each has unit tests.
  The `checkpoint` module depends only on `pygit2` (no upstream
  `attractor.*` imports) and has its own unit tests. The `agent`
  module has at least one live test that drives `pi` against a real
  provider (gated behind an env var so CI can skip when no provider
  credential is configured).
- `attractor validate` rejects malformed graphs with actionable errors
  (line, column, what was expected).
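
The `checkpoint` independence criterion above lends itself to a mechanical check. The sketch below scans a module's source for absolute imports of other `attractor` modules using the standard-library `ast` module. The function name and the dotted path `attractor.checkpoint` are assumptions for illustration; this is one way a unit test might enforce the rule, not a description of the project's actual test.

```python
import ast
from typing import List


def upstream_imports(
    source: str,
    package: str = "attractor",
    own: str = "attractor.checkpoint",  # assumed dotted path of the module
) -> List[str]:
    """Return absolute imports of `package` modules other than `own`.

    Relative imports (e.g. `from . import x`) are ignored, on the
    assumption that a module may import from within its own package.
    """
    def under(name: str, prefix: str) -> bool:
        return name == prefix or name.startswith(prefix + ".")

    hits: List[str] = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            names = [node.module]
        else:
            continue
        hits.extend(n for n in names if under(n, package) and not under(n, own))
    return hits
```

A test could read each file of the `checkpoint` module and assert that `upstream_imports(text)` is empty. Because the check parses source rather than importing it, it runs without `pygit2` or a provider credential installed.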
