---
title: "Checkpoints"
description: "How Fabro uses Git to checkpoint and resume workflow runs"
---

Fabro checkpoints every workflow run using Git plus the durable run store. After each node completes, Fabro commits the file changes and execution state so that interrupted runs can be resumed exactly where they left off. This happens automatically — no configuration required beyond running inside a Git repository.

## Two branches, two purposes

Each run creates two Git branches that work in tandem:

| Branch | Ref format | Contains |
|---|---|---|
| **Run branch** | `fabro/run/{run_id}` | File changes made by agents and commands — the actual work product |
| **Metadata branch** | `fabro/meta/{run_id}` | Checkpoint JSON, the workflow graph, run and start records, and offloaded artifacts |

The run branch is a regular Git branch that grows one commit per completed node. The metadata branch is an orphan branch (no shared history with your code) that stores structured data using Git's object database directly — no working tree needed.

### Run branch commits

After each node finishes, Fabro stages file changes and creates a commit on the run branch. Files matching `[run.checkpoint] exclude_globs` patterns (configured in [run.toml](/execution/run-configuration#runcheckpoint) or [settings.toml](/administration/server-configuration#runcheckpoint-section)) are excluded from staging:

```
fabro(01JKXYZ...): plan (succeeded)

Fabro-Run: 01JKXYZ...
Fabro-Completed: 2
Fabro-Checkpoint: a1b2c3d4...
```

The commit message follows a structured format:

| Part | Description |
|---|---|
| Subject line | `fabro({run_id}): {node_id} ({status})` |
| `Fabro-Run` trailer | The run ID |
| `Fabro-Completed` trailer | Number of completed nodes so far |
| `Fabro-Checkpoint` trailer | SHA of the corresponding commit on the metadata branch |

The `Fabro-Checkpoint` trailer links each run branch commit to its metadata branch commit, so you can navigate from file changes to the full execution state and back.

Fabro disables Git commit and tag signing for checkpoint commits created inside a sandbox. Your personal or repository-level signing settings can stay enabled, but sandbox bookkeeping does not need access to your signing key.

### Metadata branch

The metadata branch (`fabro/meta/{run_id}`) is an orphan branch that stores structured run data using Git's object storage directly. Fabro writes it from inside the sandbox with Git plumbing commands, without checking out a metadata worktree. It is initialized at run start with:

- **`run.json`** — Current projection snapshot: run spec, start/status records, current checkpoint, conclusion, sandbox, retro state, and other run-level metadata
- **`graph.fabro`** — Workflow source for the run

After each node, the metadata branch is updated with:

- **`run.json`** — Refreshed projection snapshot with the new current checkpoint
- **`stages/{rank:03}-{node_id}@{visit}/...`** — Execution-order-prefixed per-stage trace files (prompts, responses, status, diffs, stdout/stderr, and tool metadata)
- **`stages/retro/*.md`** — Retro prompt/response text when present

## What's in a checkpoint

The `run.json.checkpoint` snapshot captures everything needed to resume a run:

| Field | Description |
|---|---|
| `timestamp` | When the checkpoint was created |
| `current_node` | The node that just completed |
| `next_node_id` | The next node the engine would execute |
| `completed_nodes` | Ordered list of all completed node IDs |
| `node_retries` | How many retry attempts each node has used |
| `node_outcomes` | Full outcome (status, context updates, usage) for each completed node |
| `context_values` | Snapshot of the entire [run context](/execution/context) |
| `git_commit_sha` | SHA of the run branch commit at this checkpoint |
| `loop_failure_signatures` | Failure signature counts for loop detection |
| `restart_failure_signatures` | Failure signature counts across loop-restart edges |

The durable run store also keeps the current checkpoint so `resume`, `inspect`, and API reads do not need to rely on scratch files.

## Worktrees

Fabro uses Git worktrees to isolate workflow runs from your working directory. When a local run starts in a Git repository:

1. Fabro records the current HEAD as the **base SHA**
2. Creates a new branch `fabro/run/{run_id}` at that SHA
3. Adds a worktree at `{run_dir}/worktree` on that branch
4. Changes into the worktree directory for the duration of the run

This means your original working directory stays untouched while the agent makes changes in the worktree. When the run completes, Fabro removes the worktree and restores your original directory.

<Note>
If the working directory has uncommitted changes, the worktree starts from committed `HEAD` and those uncommitted changes are not included. Fabro logs a warning so you can commit, stash, or run explicitly in place when that is what you want.
</Note>

For Docker and Daytona sandboxes, the repository is cloned into the sandbox and checkpoint Git operations run there. Both the run branch and metadata branch are pushed to origin from the sandbox after each checkpoint when pushing is configured.

## Resuming a run

Resume an interrupted run from its durable checkpoint:

```bash
fabro resume 01JKXYZ
```

Fabro resolves the run by ID prefix, validates that durable state contains a checkpoint, and asks the server to continue execution. No workflow file or override flags are needed — all configuration is restored from persisted state.

<Accordion title="What happens during resume">
1. Fabro looks up the run directory by ID prefix
2. Validates that a checkpoint exists in durable state and no engine process is already running
3. Cleans stale local artifacts from the previous execution
4. Resets status to `Submitted` and spawns a new engine subprocess with `--resume`
5. The engine restores the full context, completed node list, retry counts, and failure signatures from durable state
6. If the checkpointed node used `full` fidelity, downgrades the first resumed node to `summary:high` (since the original conversation thread no longer exists in memory)
7. Continues execution from `next_node_id`
</Accordion>

## The checkpoint cycle

Here's the full sequence that runs after every node completes:

1. **Append checkpoint event** — Persist the new checkpoint into durable run state
2. **Write metadata branch** — Serialize the checkpoint and any new artifacts to the metadata branch (shadow commit)
3. **Commit to run branch** — Stage all file changes, commit with structured trailers linking to the shadow commit SHA
4. **Update durable checkpoint** — Persist the `git_commit_sha` associated with the run-branch commit

Steps 2-4 are best-effort — if any Git operation fails, the run continues and emits a `RunNotice` warning event. Resume still uses the durable checkpoint in the run store.

## Inspecting run history

Because checkpoints are plain Git commits, you can inspect them with standard Git tools:

```bash
# View the commit log for a run
git log fabro/run/01JKXYZ... --oneline

# See what an agent changed at a specific node
git show fabro/run/01JKXYZ...

# Diff the full run against the starting point
git diff main..fabro/run/01JKXYZ...

# Read checkpoint data from the metadata branch
git show fabro/meta/01JKXYZ...:run.json | jq .checkpoint.current_node
```

## Rewinding to an earlier checkpoint

If a later stage goes off-track, you can rewind a terminal run to an earlier checkpoint and resume from there instead of restarting the entire workflow. Rewind creates a replacement run at the target checkpoint, archives the source run, and prints the new run ID to resume:

```bash
# List the checkpoint timeline
fabro rewind <RUN_ID> --list

# Rewind to a specific checkpoint
fabro rewind <RUN_ID> plan@2

# Resume from the rewound point
fabro resume <NEW_RUN_ID>
```

The source run must already be terminal (`succeeded`, `failed`, or `dead`). If the source is archived, unarchive it first. If archive fails after the replacement run is created, do not retry `fabro rewind`; archive the source run manually.

See [`fabro rewind`](/reference/cli#fabro-rewind) for the full command reference.

## Forking a run

If you want to explore an alternate path from a checkpoint without archiving the original run, use `fabro fork` instead of `fabro rewind`. Fork creates a new independent run branching from the target checkpoint; the original run stays intact.

```bash
# List checkpoints
fabro fork <RUN_ID> --list

# Fork from a specific checkpoint
fabro fork <RUN_ID> plan@2

# Resume the forked run
fabro resume <NEW_RUN_ID>
```

Use **rewind** when a terminal run should be abandoned and replaced from an earlier point. Use **fork** when you want to try a different approach while keeping the original run as a reference.

`fabro rewind --list`, `fabro fork --list`, `fabro rewind`, and `fabro fork` are server-backed. Timeline listing reads durable run-store checkpoints; it does not rebuild missing metadata branches.

See [`fabro fork`](/reference/cli#fabro-fork) for the full command reference.

## When checkpointing is active

Git checkpointing activates automatically when:

- The run uses a Git repository and checkpointing has not been explicitly disabled
- Local runs can create a Git worktree under the run scratch directory
- Docker or Daytona can clone the configured GitHub origin into the sandbox

It is skipped when:

- The working directory is not a Git repository
- The run uses `--dry-run`
- The run is explicitly started in place with checkpointing disabled
