# Goal: skip worktrees for read-only nodes (conditional isolation)

Status: **design proposal.** Additive — keeps today's worktree
checkpoint/resume model intact for mutating work; only *skips*
worktree creation where it is provably needless. No DoD downgrade.

## Motivation

The engine creates a git worktree per run, and a sub-worktree per
parallel branch, automatically. That's needless when a node (or a
whole fan-out) is **read-only** — search, read, analyze. Creating N
checkouts for a 5-way search fan-out is pure overhead.

Isolation is only load-bearing when a node **mutates** the tree (so it
needs its own sandbox + a worktree commit for durability/resume). Make
worktree creation conditional on that distinction.

## Model: a binary read/write distinction

One boolean node attribute, `read_only`. It defaults to `true` (see A2);
`read_only=false` preserves today's behavior. There is no third "manual"
mode: it's redundant (see below).

- **`read_only=true` (read) — the default.** The node only reads. The
  engine creates **no** isolated worktree:
  - *sequential:* runs against the existing run worktree but is
    non-mutating; no worktree commit attempted (nothing to capture).
  - *parallel:* **agent** read-only branches **share the parent run
    worktree** (enforced, can't mutate). **Tool/shell** read-only
    branches keep an isolated sub-worktree by default (unenforceable),
    unless the author opts them into sharing. See "the non-agent write
    gap" below.
  - *agent nodes:* `read_only` **auto-restricts pi's `--tools`** to the
    non-mutating set (`read, grep, find, ls` + `report_outcome`),
    dropping `write`/`edit`/`bash`. This is real **enforcement**, not
    just a contract — the agent cannot mutate the shared tree. (An
    explicit `allowed_tools=` still wins if the author overrides it.)
- **`read_only=false` (write / mutating)** — *today's behavior.*
  Isolated worktree, dual-commit (§6.4), and in a parallel region a
  per-branch sub-worktree on a branch ref. **Merging that branch work
  back is an explicit downstream graph step — a merger node — not engine
  behavior** (§6.10.3: the `tripleoctagon` folds outcomes and routes; it
  does not merge). All current guarantees hold.
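
The agent tool restriction above can be sketched as a small resolver. This is a minimal illustration, not the engine's code: `resolve_agent_tools` is a hypothetical helper, and the full (mutating) tool set is an assumption built from the three tools the text names as dropped. The precedence is the one stated above: an explicit `allowed_tools=` wins, and `report_outcome` is always present.

```python
from typing import List, Optional, Sequence

# Tool names come from the proposal text. The complete default set is an
# assumption: the text only names write/edit/bash as the tools dropped.
READ_ONLY_TOOLS = ("read", "grep", "find", "ls")
MUTATING_TOOLS = ("write", "edit", "bash")
ALWAYS_ON_TOOLS = ("report_outcome",)


def resolve_agent_tools(
    read_only: bool = True,
    allowed_tools: Optional[Sequence[str]] = None,
) -> List[str]:
    """Return the tool names an agent node may use (hypothetical helper)."""
    if allowed_tools is not None:
        # An explicit author override wins over the read_only restriction.
        tools = list(allowed_tools)
    elif read_only:
        tools = list(READ_ONLY_TOOLS)
    else:
        tools = list(READ_ONLY_TOOLS + MUTATING_TOOLS)
    # report_outcome is always available; avoid listing it twice.
    for tool in ALWAYS_ON_TOOLS:
        if tool not in tools:
            tools.append(tool)
    return tools
```

Note that an override such as `allowed_tools=["read", "bash"]` re-enables mutation, so under this sketch such a node would no longer be *enforced* read-only even if it is still declared `read_only=true`.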

### Mixed fan-out semantics (Phase 2)

Isolation is decided **per branch**, computed statically. A parallel
region's branches (fan-out edge → join) have a known node set from the
workflow snapshot (the engine already derives this for parallel resume,
§6.11), so:

> A branch **shares the parent worktree** iff *every* node statically in
> that branch resolves `read_only=true` (and that read-only status is
> enforceable; see "the non-agent write gap" below); otherwise the
> branch gets an isolated sub-worktree. Recombination (the explicit
> merger step, §6.10.3) applies **only to the isolated (mutating)
> branches**.

**Race-freedom.** Because every *mutating* branch is isolated in its own
sub-worktree, nothing writes the parent tree during the region — so the
read-only branches sharing it all see a stable snapshot (the fan-out
commit).

**The non-agent write gap (and the fix).** Agent read-only nodes are
*enforced* (tool allowlist), but a tool/shell node's `bash` script can
write anything — the engine can't sandbox it without a container
(deferred §14). A mis-declared read-only *shell* branch sharing the tree
in parallel would race. So: **share only what we can enforce.**

- **Agent** read-only branch → **share** the parent tree (enforced;
  cannot mutate). The primary AI search/triage fan-out → 0 checkouts.
- **Tool/shell** read-only branch → **keep an isolated sub-worktree by
  default** (cheap; we can't prove it won't write). An author may opt a
  shell branch into sharing explicitly, accepting the contract.
- Sequential read-only tool nodes are unaffected: they run in the run's
  own private worktree, with no sibling to race against, and any diff
  they do leave is committed normally.
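
The per-branch decision described so far (static "all nodes read-only" check, then "share only what we can enforce") can be sketched as follows. The `Node` type, the `"agent"`/`"tool"` kind labels, and the `opted_in` set are illustrative assumptions; the proposal does not specify how a shell branch opts into sharing.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List


@dataclass(frozen=True)
class Node:
    name: str
    kind: str               # "agent" or "tool" (assumed kind labels)
    read_only: bool = True  # the proposal's default


def branch_shares_parent(nodes: List[Node], share_opt_in: bool = False) -> bool:
    """True if the branch may run directly in the parent run worktree."""
    # Static rule: one non-read-only node makes the whole branch mutating.
    if not all(n.read_only for n in nodes):
        return False
    # Enforcement rule: only agent nodes are provably non-mutating.
    if all(n.kind == "agent" for n in nodes):
        return True
    # Tool/shell nodes share only when the author explicitly opts in.
    return share_opt_in


def plan_fan_out(
    branches: Dict[str, List[Node]],
    opted_in: FrozenSet[str] = frozenset(),
) -> Dict[str, str]:
    """Label each branch of a parallel region as 'shared' or 'isolated'."""
    return {
        name: "shared" if branch_shares_parent(nodes, name in opted_in) else "isolated"
        for name, nodes in branches.items()
    }


branches = {
    "triage": [Node("triage", "agent")],
    "refactor": [Node("plan", "agent"), Node("apply", "agent", read_only=False)],
    "scan": [Node("scan", "tool")],
}
plan = plan_fan_out(branches)
```

Here `triage` shares the run worktree, `refactor` is isolated as a whole because it contains a write node, and the shell branch `scan` stays isolated unless it is named in `opted_in`. Opting in never overrides the static rule: a branch containing a write node is isolated regardless.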

Belt-and-suspenders: a **post-node dirty check** on any read-only node
— if it left the tree dirty, **fail the node loudly** rather than
silently committing/corrupting.
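
One way the dirty check could look, as a sketch: it relies on `git status --porcelain`, which prints one line per changed or untracked path and nothing for a clean tree. `ReadOnlyViolation` and `assert_tree_clean` are hypothetical names, and the status reader is injectable so the logic can be exercised without a repository.

```python
import subprocess
from typing import Callable


class ReadOnlyViolation(RuntimeError):
    """Raised when a read-only node leaves its worktree dirty."""


def git_status_porcelain(worktree: str) -> str:
    """Return the raw `git status --porcelain` output for a worktree."""
    result = subprocess.run(
        ["git", "-C", worktree, "status", "--porcelain"],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout


def assert_tree_clean(
    worktree: str,
    node_id: str,
    status: Callable[[str], str] = git_status_porcelain,
) -> None:
    """Fail loudly if a read-only node modified or created files."""
    changed = [line for line in status(worktree).splitlines() if line.strip()]
    if changed:
        raise ReadOnlyViolation(
            f"read-only node {node_id!r} left {len(changed)} path(s) dirty "
            f"in {worktree}: {changed}"
        )
```

This sketch assumes the tree was clean before the node ran. Paths matched by `.gitignore` are not reported by `git status --porcelain` by default, so writes to ignored locations would pass unnoticed.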

Example, `[search, search, search, refactor, lint]`: the 3 search
branches + `lint` (all read-only, and assuming `lint` is an agent node
or has opted into sharing) share one tree (0 extra checkouts);
`refactor` (contains a write node) gets its own sub-worktree and
leaves its work on a branch ref. Merging it into the main tree is the
workflow's explicit merger step (§6.10.3), not engine behavior.

**Mix within one branch** (reads then writes): the branch contains a
non-read_only node → classified mutating → isolated as a whole. The
read-only nodes inside it just run in that branch's sub-worktree; their
per-node `read_only` there only governs agent tool-enforcement, not
isolation.

Classification mechanism (decided): the static “all nodes in the branch
are read_only” check above — no new edge attribute, no lazy copy-on-write.

**Run-level worktree: always created — it is the snapshot.** The run
worktree is a checkout of the ref the run was launched from (§6.7), so
the whole run sees a stable, consistent tree immune to whatever the
user edits in their live working tree mid-run. This matters for
read-only runs *too* — a search/analysis run wants a fixed snapshot of
the code it was triggered against. So we do **not** skip the run
worktree for fully read-only runs (an earlier draft suggested reading
the launch tree directly — that was wrong; it would expose the run to
concurrent edits and break the snapshot).

`read_only` therefore governs only: (a) per-*branch* sub-worktree
creation in parallel — read-only branches read the **one run worktree**
(the snapshot) instead of spawning N more checkouts; and (b) agent
tool enforcement. The disk win is "no per-branch fan-out checkouts,"
never "no run worktree."

### Why no `manual` mode

A write node already has its own isolated worktree, and its script can
run `git worktree add` (or clone, or mkdir temp) as many times as it
wants *from there*. So "multiple worktrees set up by one node" is just
a write node doing writes; it needs no special engine mode. Read vs.
write is the whole spectrum: either you mutate the run tree (engine
isolates + checkpoints you) or you don't (share, no isolation).

## Why this is small (vs. removing worktrees)

- §6.4 checkpoint/resume: **unchanged** for mutating nodes. Read-only
  nodes already produce empty diffs → no worktree commit today, so
  there's nothing new to checkpoint.
- DoD §16 "all run state in git refs" + "between-node resume
  reproduces": **preserved** — read-only nodes have no durable file
  output to reproduce.
- Only new engine logic: "is this node/branch read-only? if so, don't
  call `create_branch` (parallel) / don't bother with worktree-commit;
  point cwd at the shared run worktree."
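
To illustrate how small the parallel-path change is, here is a sketch of the cwd decision. `create_branch` is named in this document, but its signature is not, so the one-argument callable below is an assumption, as is the `resolve_branch_cwd` helper itself.

```python
from typing import Callable


def resolve_branch_cwd(
    branch_id: str,
    shares_parent: bool,
    run_worktree: str,
    create_branch: Callable[[str], str],
) -> str:
    """Return the directory a parallel branch should run in."""
    if shares_parent:
        # Read-only branch: no sub-worktree and no branch ref are created;
        # the branch reads the run worktree (the snapshot) directly.
        return run_worktree
    # Mutating branch: today's path, an isolated sub-worktree on its own ref.
    return create_branch(branch_id)


# Usage with a stand-in for the real create_branch call:
created = []


def fake_create_branch(branch_id: str) -> str:
    created.append(branch_id)
    return f"/runs/r1/branches/{branch_id}"


search_cwd = resolve_branch_cwd("search-1", True, "/runs/r1/worktree", fake_create_branch)
refactor_cwd = resolve_branch_cwd("refactor", False, "/runs/r1/worktree", fake_create_branch)
```

Only the mutating branch reaches `create_branch`; the read-only branch costs no checkout.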

## Decisions needed

- **A1 — attribute spelling.** Boolean `read_only=true` (decided — no
  `isolation` enum, no `manual` mode; read vs. write is the whole
  spectrum).
- **A2 — default (DECIDED: `read_only=true`).** Disk-efficient by
  default; authors opt into `read_only=false` where work mutates.
  Footgun: an *unmarked mutating parallel branch* shares the tree with
  siblings and can race. Mitigations: (1) agent nodes are enforced
  read-only via the tool allowlist, so they can't mutate unmarked; (2)
  post-node dirty-tree check for tool/shell nodes (A3); (3) sequential nodes are
  unaffected (run worktree still exists; empty diff = no commit).
- **A3 — enforcement (share only what we can enforce).** Agent nodes:
  **enforced** (read-only tool allowlist) → may share the tree in
  parallel. Tool/shell nodes: unenforceable → keep an isolated
  sub-worktree by default (opt-in to share), plus a post-node
  dirty check that **fails** a read-only node which mutated the tree.
- **A4 — scope.** Does `read_only` only change parallel branching, or
  also document/skip the (already-empty) sequential worktree commit?
  Rec: applies everywhere, only *materially* changes parallel fan-out.
- **A5 — read-only tool set.** `read, grep, find, ls` (+ the always-on
  `report_outcome`); `bash` excluded by default because it can mutate.
  Author can re-add tools via explicit `allowed_tools=`.

## Code touch points (after sign-off)

- `workflow/graph.py` + `validate.py`: add `read_only` node attribute
  (bool); validate it's only meaningful on tool/agent nodes.
- `engine.py` parallel path (`_traverse_branch`, the `create_branch`
  call sites ~1716/1958): if every node statically in a branch is
  `read_only` and the branch is shareable (enforced agent nodes, or an
  opted-in tool/shell branch), skip `create_branch` and run it with
  cwd = parent run worktree.
  (Merging mutating branches back stays an explicit graph step per
  §6.10.3; the engine never auto-merges at the join.)
- `engine.py` sequential: thread `read_only` into `_dispatch_handler`
  to skip the worktree-commit attempt (cosmetic; diff is empty anyway).
- Optional: post-node dirty check for `read_only` nodes → warn event.
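
A rough sketch of what the attribute validation in `validate.py` might check. Everything here is illustrative: the `"agent"`/`"tool"` kind labels, the string-valued attribute encoding (`"true"`/`"false"`), and the `resolve_read_only` helper are assumptions; only the `read_only` and `default_read_only` names come from this document.

```python
from typing import Mapping, Optional

# Assumed labels for the node kinds on which read_only is meaningful.
KINDS_WITH_READ_ONLY = frozenset({"agent", "tool"})


def resolve_read_only(
    kind: str,
    attrs: Mapping[str, str],
    default_read_only: bool = True,
) -> Optional[bool]:
    """Validate and resolve a node's read_only attribute.

    Returns None for node kinds where the attribute does not apply.
    """
    raw = attrs.get("read_only")
    if kind not in KINDS_WITH_READ_ONLY:
        if raw is not None:
            raise ValueError(f"read_only is not meaningful on a {kind!r} node")
        return None
    if raw is None:
        # Unmarked node: fall back to the workflow-level default.
        return default_read_only
    if raw not in ("true", "false"):
        raise ValueError(f"read_only must be 'true' or 'false', got {raw!r}")
    return raw == "true"
```

Rejecting the attribute on other node kinds, rather than ignoring it, keeps a misplaced `read_only=false` from silently having no effect.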

## SPEC / reqs impact (after decisions)

- §5.5: add the `read_only` boolean node attribute.
- §6.7: note isolation is conditional (decided per branch); read-only
  branches share the run worktree, which is still always created.
- §6.10: read-only branches share the parent worktree; mutating
  branches get sub-worktrees. Merging branches back remains an explicit
  graph step (merger node), unchanged — the join still does not merge.
- traceable-reqs: extend `REQ-EXEC-RUN-ISOLATION` (conditional
  isolation) + the §6.10 parallel-worktree req; new
  `REQ-EXEC-READONLY-NODE` (doc/impl/unit, +int for a shared-tree
  parallel search fan-out).

## Interactions
- **pi migration:** none — the agent node already takes an arbitrary
  cwd; read-only just points it at the shared worktree.
- **sub-workflow plan:** unaffected by this; recombination still
  applies to mutating children.

---

## Alternative considered (rejected as too big): remove worktrees, track cwd

Make the engine track only a `cwd` run var and let workflows create all
worktrees themselves; checkpoint state-only. This solves the same
needless-isolation problem but **downgrades load-bearing guarantees**:
the engine stops owning file durability, so "between-node resume
reproduces the same outcome" (DoD §16) no longer holds, and parallel
recombination moves into workflow scripts. Bigger blast radius, weaker
guarantees, more author burden. The conditional-isolation model above
gets the same practical win (no worktrees for search fan-out) without
the downgrade, so prefer it unless a concrete need forces the full
pivot.

## Phasing
1. [DONE] SPEC §5.5/§6.7 edit + `REQ-EXEC-READONLY-NODE`.
2. [DONE] `read_only` attribute + `default_read_only` + validation +
   enforced read-only agents + unit tests.
3. [DONE] Parallel path: enforced read-only (agent) branches share the
   parent run worktree; mutating or tool/shell branches isolate. Static
   classification, fresh-start + resume. Fan-in merge stays a graph
   step (§6.10.3) — the join never merges.
4. [DONE] Sequential: read-only nodes already cost nothing (empty diff
   → no commit); enforced for agents.
5. [DONE] Integration tests: read-only fan-out shares one cwd (no
   sub-worktrees); mixed fan-out isolates only the writer.
6. Optional dirty-tree warning for read-only nodes.
