---
phase: 35.2-psyche-sync-setup-data-loss-reconcile-bootstrap-determinism
reviewed: 2026-05-28T00:00:00Z
depth: standard
files_reviewed: 7
files_reviewed_list:
  - src/common/sync.rs
  - src/common/tracked.rs
  - src/owl/psyche_sync_setup.rs
  - tests/sync_accept_flow_two_machine.rs
  - tests/sync_reconcile_against_remote.rs
  - tests/sync_two_machine_attach.rs
  - CLAUDE.md
findings:
  critical: 1
  warning: 6
  info: 3
  total: 10
status: issues_found
---

# Phase 35.2: Code Review Report

**Reviewed:** 2026-05-28T00:00:00Z
**Depth:** standard
**Files Reviewed:** 7
**Status:** issues_found

## Summary

Reviewed the Phase 35.2 deterministic-bootstrap + reconcile + per-ref-push work
across `tracked.rs` (epoch-locked `commit-tree`), `sync.rs`
(`reconcile_against_remote` / `ReconcileVerdict` / `PerRefOutcome` / per-ref push
in `accept_flow`), the `psyche_sync_setup` dispatcher status tags, and the three
new integration tests.

The reconcile/push data-loss surface is well-constructed: the raw
`Command::status()` exit-1-vs-128 split is correct (`run_git_checked` does
collapse both into `Nonzero`, so the raw call is genuinely required), the rebase
runs in the linked worktree rather than the bare seed, `RebaseFailed` correctly
skips the push to preserve remote state, and every push is a plain (non-forced)
push so a non-fast-forward is rejected rather than clobbering remote history. No
data-loss path was found in the push loop itself.

The headline defect is in the **deterministic bootstrap**: the epoch-locked
`commit-tree` pins the commit *dates* and passes `-c user.name`/`user.email`, but
it does NOT neutralize the four `GIT_*_NAME`/`GIT_*_EMAIL` environment variables,
which take precedence over `-c` config in git. On any machine that exports those
(common in CI and many dev shells), the bootstrap commit SHA diverges — defeating
SYNC-BOOTSTRAP-DET-01, which the entire reconcile design relies on to treat `main`
as a shared baseline. The new determinism test cannot catch this because A and B
run under the same process environment.

## Critical Issues

### CR-01: Deterministic bootstrap SHA is broken by inherited `GIT_*` identity env vars

**File:** `src/common/tracked.rs:245-272`
**Issue:** The bootstrap `commit-tree` locks the dates and supplies identity via
`-c user.name=spt-bootstrap -c user.email=spt@local`, but the spawned `Command`
inherits the full parent environment (Rust's `std::process::Command` does not
clear env by default; confirmed no `env_clear`/`env_remove` of git identity vars
anywhere in `src/`). Git resolves committer/author identity in this precedence
order:

```
GIT_AUTHOR_NAME / GIT_AUTHOR_EMAIL / GIT_COMMITTER_NAME / GIT_COMMITTER_EMAIL  (env)
        ↓  override
-c user.name / -c user.email   (the values this code sets)
```

So on any machine where the operator (or CI) exports e.g. `GIT_COMMITTER_NAME` /
`GIT_COMMITTER_EMAIL` / `GIT_AUTHOR_NAME` / `GIT_AUTHOR_EMAIL`, the bootstrap
commit is built with that identity instead of `spt-bootstrap`/`spt@local`. Because
the commit SHA is a hash over (tree, author identity+date, committer identity+date,
message), the `refs/heads/main` SHA then differs from a machine without those env
vars set. SYNC-BOOTSTRAP-DET-01 — "every machine's main converges on the
byte-identical SHA" — is violated, which in turn breaks the reconcile loop's
assumption that `main` is a clean shared baseline (RC-2 in the bootstrap test
rationale). When the bases differ, second-machine attach can produce spurious
divergence / failed rebases on `main`.

The in-code comment at lines 263-264 ("Epoch-lock both dates → byte-identical
commit object across machines") asserts a guarantee the code does not actually
provide.

**Fix:** Pin the identity via environment variables (which beat `-c` and also
beat any inherited values when set explicitly), or explicitly remove the inherited
ones. Setting all four env vars is the robust form:

```rust
let commit_out = commit_cmd
    .args([
        "-C", &seed_s,
        "commit-tree", empty_tree,
        "-m", "init: tracked seed",
    ])
    .env("GIT_AUTHOR_NAME", BOOTSTRAP_NAME)
    .env("GIT_AUTHOR_EMAIL", BOOTSTRAP_EMAIL)
    .env("GIT_COMMITTER_NAME", BOOTSTRAP_NAME)
    .env("GIT_COMMITTER_EMAIL", BOOTSTRAP_EMAIL)
    .env("GIT_AUTHOR_DATE", "1970-01-01T00:00:00Z")
    .env("GIT_COMMITTER_DATE", "1970-01-01T00:00:00Z")
    .output()?;
```

(The `-c user.name/email` args may be dropped once the env vars are set, since the
env wins regardless.) Add a regression test that sets a bogus `GIT_COMMITTER_NAME`
in the child env before `ensure_seed` and asserts the resulting `main` SHA matches
the clean-env SHA.

## Warnings

### WR-01: `is_remote_404` substring match on bare `"404"` over-classifies as a permanent hard stop

**File:** `src/common/sync.rs:207-211` (consumed at `253` and `820`/`875`)
**Issue:** `is_remote_404` returns true if stderr `contains("404")` anywhere. Any
git/gh stderr that incidentally contains the three digits `404` (a path component,
an object count, a timestamp fragment, an unrelated HTTP code in a proxy message)
flips `SyncState::Failing` — a documented *hard stop* (D-17) that is NOT cleared by
the transient backoff path and requires the user to re-run setup. A transient
network blip whose error text happens to include `404` would permanently disable
sync.

**Fix:** Tighten to phrase-anchored matches and drop the bare-digit clause:

```rust
fn is_remote_404(stderr: &str) -> bool {
    stderr.contains("Repository not found")
        || stderr.contains("remote: Repository")
        || stderr.contains("HTTP 404")
        || stderr.contains("404 Not Found")
}
```

### WR-02: `ProbeFailed` reconcile verdict emits no operator-visible status tag

**File:** `src/owl/psyche_sync_setup.rs:141-153`
**Issue:** `emit_outcome_tag` only emits a reconcile-side tag for `Reconciled`
(`RECONCILED:`) and `RebaseFailed` (`DIVERGED:`). When `reconcile_against_remote`
returns `ProbeFailed` (merge-base exited 128 / spawn failure / signal), the push is
still attempted (`sync.rs:828-840` only skips push for `RebaseFailed`), so the
operator sees a bare `PUSHED:`/`PUSH_FAILED:` with no signal that the ancestry
probe failed and the push therefore went out blind. The rustdoc at lines 138-140
explicitly lists `ProbeFailed` as a "quiet variant (no tag emitted)", but a probe
failure is an anomaly worth surfacing, not a happy-path no-op.

**Fix:** Emit a distinct tag for `ProbeFailed`, e.g.:

```rust
match &o.reconcile {
    Reconciled => eprintln!("RECONCILED:{}", o.branch),
    RebaseFailed(_) => eprintln!("DIVERGED:{}", o.branch),
    ProbeFailed => eprintln!("PROBE_FAILED:{}", o.branch),
    _ => {}
}
```

(and register `PROBE_FAILED:{branch}` in the CLAUDE.md status-tag inventory).

### WR-03: Doc comment for `reconcile_against_remote` is attached to the wrong function

**File:** `src/common/sync.rs:512-554` (over `worktree_for_branch` at `555`)
**Issue:** The large `///` block beginning at line 512 documents
`reconcile_against_remote` (SYNC-RECON-FETCH-01, the routing matrix, exit-1-vs-128
rationale), then at line 543 — still inside the same uninterrupted `///` block —
abruptly switches to "Resolve the linked-worktree path that has `branch` checked
out…", which is the doc for `worktree_for_branch`. Because the block is contiguous
and immediately precedes `fn worktree_for_branch`, Rustdoc binds the entire
512-554 text to `worktree_for_branch`. The actual `reconcile_against_remote`
(line 592) carries only the short Plan-35.2-02 visibility note (586-591) as its
doc — its real routing-matrix documentation is mis-filed. This is exactly the kind
of mismatch a future maintainer trusts and is misled by.

**Fix:** Split the block: move lines 512-541 (and the blank-line break) to sit
directly above `pub fn reconcile_against_remote` (replacing/merging with the
586-591 note), and keep only the 543-554 worktree-path text above
`fn worktree_for_branch`.

### WR-04: `handle_disable` leaves a stale `remote_url`/`acked_ts` and does not reset backoff counters

**File:** `src/owl/psyche_sync_setup.rs:84-93`
**Issue:** `--disable` sets `state = Failing` and stamps a `user-disabled` reason,
but leaves `remote_url`, `acked_ts`, `consecutive_failures`, and
`next_retry_after_ts` untouched. If sync was previously in a transient-backoff
state, the disable preserves a future `next_retry_after_ts`; on a later re-enable
the operator's accept_flow does reset those (sync.rs:871-874), so this is not a
correctness BLOCKER — but the half-updated record is confusing for the doctor
surface, which reads `last_failure_reason="user-disabled"` alongside a possibly
non-zero `consecutive_failures` from the prior real failures. The two reasons
collide in one field.

**Fix:** On disable, also reset the transient-failure bookkeeping so the disabled
record is unambiguous:

```rust
s.state = SyncState::Failing;
s.last_failure_reason = Some("user-disabled".to_string());
s.last_failure_ts = Some(now_iso_utc());
s.consecutive_failures = 0;
s.next_retry_after_ts = None;
```

### WR-05: `two_machine_attach`/`bootstrap-determinism` tests cannot detect the CR-01 env-var hole

**File:** `tests/sync_two_machine_attach.rs:23-57`, `tests/sync_two_machine_attach.rs:131-157`
**Issue:** Both the determinism test and the two-machine-attach test run "machine
A" and "machine B" in the *same process* with the *same inherited environment*.
If `GIT_COMMITTER_NAME`/etc. are set, BOTH machines pick up the same value, so
their bootstrap SHAs still match and the rebase still succeeds — the tests pass
green while the real cross-machine invariant (different environments on genuinely
different hosts) is broken (see CR-01). The suite therefore gives false
confidence in the determinism guarantee.

**Fix:** Add a test that bootstraps machine A with a clean child env and machine B
with `GIT_COMMITTER_NAME=someone GIT_COMMITTER_EMAIL=x@y` injected (e.g. via a
guard mirroring `GitConfigGlobalGuard`), and assert the two `main` SHAs are still
identical. This both pins the fix and documents the requirement.

### WR-06: `accept_flow` Step 5b re-shells `git branch --set-upstream-to` for branches whose push failed or was skipped

**File:** `src/common/sync.rs:848-864`
**Issue:** Step 5b unconditionally iterates `list_local_branches(&seed)` and runs
`git branch {b} --set-upstream-to=origin/{b}` for every local branch, including
branches whose Step-5 push was skipped (`RebaseFailed`) or failed
(`PUSH_FAILED`). For a `RebaseFailed` branch, `origin/{b}` may not reflect the
local tip and the upstream wiring is set against a ref the local branch could not
be reconciled to — a misleading upstream that the later `pull_branch` fetch+rebase
will then chase. It is soft-failed (`let _ =`) so it cannot crash, but it wires
upstreams for branches that explicitly were left un-pushed to preserve remote
state, partially undermining the "skip to preserve" intent.

**Fix:** Drive Step 5b off the `outcomes` vec and set upstream only for branches
whose push actually succeeded (`Some(Ok(()))`), rather than re-enumerating all
local branches.

## Info

### IN-01: `user` from `gh api user` is interpolated into repo arg and URL without validation

**File:** `src/common/sync.rs:719-760`
**Issue:** `user` (from `probe_gh_user` → `gh api user --jq .login`) is spliced
into `format!("{}/{}", user, SYNC_REPO_NAME)` (gh argv) and the canonical URL.
It is passed as a separate argv element (no shell), so command injection is not
possible, and GitHub logins are constrained — but a value containing `/` or `..`
would alter the repo path / URL shape. Low risk given the trusted source.
**Fix:** Optionally validate against `[A-Za-z0-9-]+` (the GitHub username charset)
before use, mirroring `tracked::validate_id_chars`.

### IN-02: `SYNC_PROBE_TIMEOUT` constant is dead

**File:** `src/common/sync.rs:58-59`
**Issue:** `SYNC_PROBE_TIMEOUT` is declared with `#[allow(dead_code)]`; `gh_present`
uses a hardcoded `Duration::from_millis(500)` sleep (line 183) instead. The
constant documents a 2s budget that is never applied, so the doc and behavior
disagree.
**Fix:** Either use `SYNC_PROBE_TIMEOUT` in `gh_present` (via a real wait loop) or
delete the constant.

### IN-03: `classify_and_record_outcome` `_branch` parameter is unused

**File:** `src/common/sync.rs:241`
**Issue:** The `_branch: &str` parameter is threaded through every call site
(`pull_branch`, `push_branch`) but never read — failure records are not
branch-scoped, so the persisted `last_failure_reason` cannot distinguish which
branch failed. Not a bug, but the parameter is misleading dead weight.
**Fix:** Either drop the parameter, or include the branch in the recorded
`last_failure_reason` so doctor can attribute the failure.

---

_Reviewed: 2026-05-28T00:00:00Z_
_Reviewer: Claude (gsd-code-reviewer)_
_Depth: standard_
