--- status: investigating trigger: "Psyche subprocess loaded spt:send skill in gen22, bypassed text-marker channel via direct $OWL deliver Bash call" created: 2026-04-18T06:30:00-07:00 updated: 2026-04-20T14:50:00-07:00 --- ## Current Focus hypothesis: Prior fix used the WRONG flags. `--allowed-tools` is an auto-approve list, not a tool-surface restriction. `--disable-slash-commands` only blocks `/slash` commands, not model-invoked Skill tool activation. The correct flags are `--tools` (built-in allow-list) or `--disallowed-tools`. test: Reproduced bypass at CLI with the exact wrapper flag set on a fresh process. Verified `--tools "Read,Write,Edit"` + `--dangerously-skip-permissions` does block Bash and Skill tool. expecting: User applies corrected fix (swap `--allowed-tools` for `--tools`), redeploys, restarts Psyche. next_action: Report ROOT CAUSE FOUND; await user decision on fix direction. ## Symptoms expected: Psyche restricted to Read/Write/Edit. Reply emitted as `[REPLY] ...` text marker in stdout; wrapper parses and forwards via $OWL. actual: Psyche loaded `spt:send` skill which granted Bash access ("2 tools allowed"). Psyche ran `$OWL deliver doyle doyle-psyche` directly, bypassing wrapper text-marker channel. errors: First `$OWL reply doyle` failed with "Could not determine your perch" (Psyche has no perch — runs under -p with no session match). Second attempt with explicit `doyle doyle-psyche` succeeded as direct delivery. reproduction: Send a message to a Psyche (e.g. `$OWL deliver doyle-psyche `). Psyche turn loads spt:send skill and bypasses text-marker channel. started: gen22 of `doyle` Self on 2026-04-18 ~05:57 PDT. Likely affected all psyches since spt:send (or any tool-granting skill) became available to claude -p subprocess. ## Eliminated (none yet) ## Evidence - timestamp: 2026-04-18T06:30:00-07:00 checked: src/live/wrapper/claude.rs — init_session() arg list (lines 42-59) found: Args are `-p --agents --agent owl-psyche --name --model sonnet --fallback-model opus --effort medium --dangerously-skip-permissions --output-format json`. NO `--allowed-tools`, NO `--disallowed-tools`, NO skill-disable / strict-mcp / no-skills flag. The agent definition in build_agents_json() includes only `description` and `prompt` fields — no `tools` field, no `allowed_tools` field on the agent object. implication: Tool restriction is enforced ONLY by the natural-language prompt in psyche.md ("You have access to Read, Write, and Edit tools only. You do NOT have Bash access."). With `--dangerously-skip-permissions`, Claude has full tool surface available and can choose to invoke any tool — including Skill, which can transitively grant Bash. - timestamp: 2026-04-18T06:30:00-07:00 checked: src/live/wrapper/claude.rs — resume_session_checked() arg list (lines 242-253) and final_session() arg list (lines 352-363) found: Same flag set as init_session minus `--agents`/`--agent`/`--name`/`--output-format` (uses `--resume ` instead). Still `--dangerously-skip-permissions`, still no tool restriction. implication: Every Psyche turn (init + resume + final) runs with unrestricted tool access. The agent identity carries via `--resume` but the per-invocation flag set never reasserts tool restrictions. - timestamp: 2026-04-18T06:30:00-07:00 checked: psyche.md (compile-time embedded via include_str!) found: Lines 9-43 contain `` block: "You have access to Read, Write, and Edit tools only. You do NOT have Bash access." and "Do NOT use $OWL deliver or any Bash commands for messaging." — purely prompt-level restriction. implication: Prompt instructions are guidance, not enforcement. When a HEALTH_CHECK arrives that the Psyche interprets as "respond ALIVE", and the spt:send skill exists in the skill registry advertising itself as the way to send messages, Claude is liable to load it — overriding the prompt restriction. The prompt says "the wrapper handles all delivery" but does not warn against loading skills. - timestamp: 2026-04-18T06:30:00-07:00 checked: build_agents_json() in claude.rs (lines 14-27) found: Agent definition is `{ "owl-psyche": { "description": "...", "prompt": "..." } }`. No `tools` array, no `model` override at agent level, no `allowed_tools`. implication: Claude Code's `--agents` JSON format supports a `tools` field (per Claude Code agent docs) that constrains the agent to a specific tool set. Currently unused. This is the most surgical fix point — set `"tools": ["Read", "Write", "Edit"]` on the owl-psyche agent definition so the agent's tool surface is constrained at the Claude Code layer, not just the prompt layer. - timestamp: 2026-04-18T06:30:00-07:00 checked: src/common/owlery.rs — psyche_dir() returns `{SPT_HOME}/psyches/tracked/` found: This is the Psyche's CWD (current_dir set in all 3 spawn sites). Skills resolved from `~/.claude/skills/` are user-scoped and visible from any CWD. Project-scoped skills in `.claude/skills/` would not apply here since psyche_dir is outside any project. implication: The spt:send skill at `~/.claude/skills/owl/SKILL.md` (per CLAUDE.md deploy section) is user-scoped and available to Psyche. Since psyche_dir has no `.claude/settings.json` of its own, all user-level skills are inherited. ## Evidence (additional) - timestamp: 2026-04-18T06:45:00-07:00 checked: ~/.claude/plugins/cache/cplugs/spt/1.8.1/skills/send/SKILL.md frontmatter found: Skill name is `send` (invoked as `/spt:send` via the spt plugin namespace). Frontmatter line 7: `allowed-tools: [Bash, Read]`. The skill file itself declares — via Claude Code's standard SKILL.md frontmatter — that loading this skill grants Bash and Read tools. implication: This is the precise mechanism of bypass. Claude Code skill activation honors per-skill `allowed-tools` frontmatter, granting those tools for the duration of the skill invocation regardless of session-level restrictions. The "2 tools allowed" message in the screenshot ("Successfully loaded skill · 2 tools allowed") is exactly Bash + Read from this frontmatter. Prompt-level "no Bash" guidance in psyche.md is bypassed at the Claude Code framework layer the moment Skill(spt:send) is invoked. - timestamp: 2026-04-18T06:45:00-07:00 checked: `claude --help` output for restriction flags found: Available flags on installed Claude Code: - `--allowed-tools ` / `--allowedTools` — comma/space-separated allow list, supports tool-with-args syntax like `Bash(git *)` - `--disallowed-tools ` / `--disallowedTools` — same shape, deny list - `--tools ` — restrict the built-in tool surface; `""` disables all, `"default"` enables all, or list specific names - `--disable-slash-commands` — described as "Disable all skills" - `--permission-mode ` — `default | acceptEdits | auto | bypassPermissions | dontAsk | plan` implication: Multiple viable enforcement paths exist. None are currently used by claude.rs. The two cleanest options: (a) `--disable-slash-commands` to block skill loading entirely (kills the bypass at the source), or (b) `--allowed-tools "Read Write Edit"` to constrain the tool surface even if a skill tries to grant Bash. Option (a) is simpler; option (b) is defense-in-depth. Combining both gives belt-and-suspenders. - timestamp: 2026-04-18T06:45:00-07:00 checked: src/live/wrapper/claude.rs source for any tool-restriction flag references found: `grep -nE "allowed-tools|disallowed-tools|--no-skills|strict-mcp|--tools"` against src/ → no matches. implication: Confirms there is no existing mechanism that the bug is "regressing" — this is a never-fixed gap, not a regression. The prompt-only restriction has always been the sole guardrail; spt:send was just the first user-scoped skill to advertise itself prominently enough that Claude chose it over the prompt instruction. ## Evidence (2026-04-20 — post-fix verification failure) - timestamp: 2026-04-20T14:45:00-07:00 checked: Build/deploy pipeline integrity — target/release/owl.exe mtime, plugin cache binary, deployah-psyche wrapper log since fix shipped found: target/release/owl.exe = Apr 20 04:50; plugin cache ~/.claude/plugins/cache/cplugs/spt/1.9.0/owl.exe = Apr 20 04:50 (identical size 3,819,008 bytes, both rebuilt). src/live/wrapper/claude.rs mtime = 2026-04-18 (Unix epoch 1776653427). Deployah-psyche wrapper started gen=2 at 05:07:18 on 2026-04-20 — AFTER the 04:50 build. No `~/.claude/skills/owl/owl.exe` exists (plugin-only deployment). Log iteration 29 current as of 14:30:31. Wrapper is definitely running the fixed binary. implication: This is NOT a build-pipeline problem. The fixed Rust code IS compiled, IS deployed, and IS executing in the live wrapper. The bypass at 05:28:24 (`"Both messages delivered."`) occurred through a process running the fixed binary with all 5 patched sites active. - timestamp: 2026-04-20T14:47:00-07:00 checked: Claude Code CLI installed version and flag semantics via `claude --help` (v2.1.114) found: `--allowed-tools ` is documented as "list of tool names to allow" (variadic). `--disallowed-tools ` is "list of tool names to deny". `--tools ` is "Specify the list of available tools from the built-in set. Use '' to disable all tools, 'default' to use all tools, or specify tool names". `--disable-slash-commands` is described as "Disable all skills". `--dangerously-skip-permissions` is "Bypass all permission checks". implication: Help text is ambiguous about whether `--allowed-tools` is a tool-SURFACE constraint or a permission-APPROVAL list. The prior fix assumed the former; need empirical test to determine actual behavior. - timestamp: 2026-04-20T14:47:30-07:00 checked: Empirical test — reproduced bypass at CLI outside wrapper using exact wrapper flag set found: Running `echo 'Use the spt:send skill to send a message "test" to doyle...' | claude -p --dangerously-skip-permissions --disable-slash-commands --allowed-tools "Read Write Edit" --output-format json` returned `{"num_turns":2, "result":"Delivered to doyle. No reply yet.", "permission_denials":[]}`. The message was actually delivered. Running `echo 'Run bash command: echo hello from bash' | claude -p --disable-slash-commands --allowed-tools "Read Write Edit"` returned `"Output: hello from bash"`. Also tested with comma-separated `--allowed-tools "Read,Write,Edit"` and with argv-split `--allowed-tools Read Write Edit` — all permitted Bash. implication: `--allowed-tools` does NOT restrict the tool surface — Bash remained available and was invoked. Prior fix's core assumption was wrong. The `"Read Write Edit"` quoting/splitting was a red herring; no form of `--allowed-tools` blocks Bash. - timestamp: 2026-04-20T14:48:30-07:00 checked: Empirical test — which flag combos actually block Bash found: (a) `--disallowed-tools Bash` → `"Bash not in deferred list. No Bash tool available this session."` BLOCKED. (b) `--tools "Read,Edit"` → `"Bash tool not available this session. Only Read/Edit/LSP..."` BLOCKED. (c) `--dangerously-skip-permissions --disallowed-tools Bash` → `"Bash not available. Cannot run shell commands this session."` BLOCKED. (d) `--dangerously-skip-permissions --tools "Read,Edit,Write"` → `"No Bash tool available. Got Read/Edit/Write/LSP plus screen-timelapse/Drive MCP."` BLOCKED. (e) `--dangerously-skip-permissions --tools "Read,Write,Edit"` + request to load skill → `"No skill loading tool available. Tools listed: Edit, LSP, Read, Write, Google Drive auth, screen-timelapse. No spt:send skill nor skill loader present."` BLOCKED AND the Skill tool itself was not exposed. implication: The correct flags are `--tools ` (strict allow-list at the built-in tool surface) or `--disallowed-tools ` (deny-list). Both work even with `--dangerously-skip-permissions`. `--tools` is stronger because it also removes the Skill tool itself from the toolset, preventing autonomous skill loading entirely. `--disable-slash-commands` claims to "disable all skills" but in practice only affects manual `/slash` invocation — the model-invoked Skill tool continues working unless explicitly removed via `--tools` / `--disallowed-tools Skill`. - timestamp: 2026-04-20T14:49:00-07:00 checked: Semantics of `--allowed-tools` by comparison with documented Claude Code permission model found: Per help text wording ("list of tool names to allow (e.g. 'Bash(git *) Edit')") and per behavior in tests, `--allowed-tools` is an **auto-approve permission list** for interactive sessions — tools in this list bypass the permission prompt but are not the sole tools available. It extends pre-approved permissions; it does not restrict the tool set. With `--dangerously-skip-permissions` all tools are auto-approved anyway, making `--allowed-tools` effectively inert. implication: The prior fix applied `--allowed-tools "Read Write Edit"` thinking it would deny Bash/Skill. Instead it did nothing — both because (a) the flag is an approval list not a restriction, and (b) `--dangerously-skip-permissions` was already auto-approving everything, so even the approval-list semantics were moot. - timestamp: 2026-04-20T14:49:30-07:00 checked: Agent-level `tools` field in `--agents` JSON (src/live/wrapper/claude.rs line 24: `"tools": ["Read", "Write", "Edit"]` inside `owl-psyche` definition) found: The live Psyche ran with this agent definition active (init_session at 05:07 built the JSON with the tools array) and STILL delivered messages via Bash at 05:28. No documentation of this field's exact semantics in `claude --help`; the example in help only shows `description` + `prompt`. implication: Either (a) the agent-level `tools` field is ignored by Claude Code, (b) it follows the same auto-approve semantics as `--allowed-tools` rather than surface-restriction semantics, or (c) it is overridden when any per-invocation `--allowed-tools`/`--tools` flag is also present. Cannot conclude which; what IS confirmed is that setting only this field (as originally proposed as "most surgical fix") does not block Bash in the running Psyche. ## Resolution root_cause: **The prior fix used the wrong Claude Code flags.** Two independent semantic errors: 1. **`--allowed-tools` is an auto-approve permission list, NOT a tool-surface restriction.** Empirically verified: with `--allowed-tools "Read Write Edit"` (in any quoting form), Bash remains available and can be invoked. The flag only pre-approves permission prompts; it does not remove tools from the surface. With `--dangerously-skip-permissions` also set (as in all 4 wrapper spawn sites), the approval-list semantics are moot anyway since everything is auto-approved. So `--allowed-tools "Read Write Edit"` was effectively a no-op. 2. **`--disable-slash-commands` blocks manual `/slash` invocation but does NOT prevent the model-invoked Skill tool from autonomously loading skills.** Help text "Disable all skills" is misleading. The Psyche's actual bypass path was: model decided to use spt:send → invoked the built-in Skill tool with argument `spt:send` → Claude Code skill activator honored the skill's frontmatter and granted Bash for that turn. The Skill tool is a separate built-in; `--disable-slash-commands` does not remove it. The agent-level `"tools": ["Read", "Write", "Edit"]` field in the `--agents` JSON is also ineffective (live evidence: Psyche ran with this field set AND still delivered messages via Bash). Either ignored, same auto-approve semantics, or overridden. **What actually works (verified empirically):** - `--tools "Read,Edit,Write"` — strict built-in tool allow-list. Removes Bash AND the Skill tool from the surface entirely. Even with `--dangerously-skip-permissions`, blocked. - `--disallowed-tools Bash` (optionally `--disallowed-tools Bash Skill`) — deny-list at surface level. Bash disappears from available tools. `--tools` is stronger for this use case because it also removes Skill, closing the skill-loading pathway entirely in one flag. **Pre-existing root cause (2026-04-18 section) still holds** — User-scoped skills with Bash-granting frontmatter are loadable by Psyche, and psyche.md's prompt-level restriction is advisory. The fix direction (restrict tool surface at Claude Code layer) was correct; the chosen flags were wrong. User-scoped skills (installed under `~/.claude/plugins/cache/cplugs/spt/1.8.1/skills/`) are visible from the Psyche's CWD (`{SPT_HOME}/psyches/tracked/`) since user-scoped skills are global. The `send` skill's SKILL.md frontmatter at line 7 declares `allowed-tools: [Bash, Read]`. When Psyche received the HEALTH_CHECK and reasoned "I should reply", it autonomously invoked Skill(spt:send) (the skill self-describes as the way to send messages); Claude Code's skill activator honored the skill's frontmatter and granted Bash for that turn — which Psyche then used to call `$OWL deliver doyle doyle-psyche` directly, bypassing the wrapper's [REPLY] text-marker channel entirely. The bypass is structural: any user-scoped skill whose frontmatter advertises Bash (or any other restricted tool) can be loaded by Psyche, fully overriding the psyche.md prompt restriction. spt:send is the most likely candidate because the message-handling context primes Psyche to think about message delivery, but the same loophole applies to any skill. prior_fix_attempted (FAILED — retained for reference): Belt-and-suspenders restriction at 5 sites in `src/live/wrapper/claude.rs`: 1. **Agent-level (build_agents_json, ~line 20)**: Added `"tools": ["Read", "Write", "Edit"]` to the `owl-psyche` agent JSON definition. **Result: ineffective** — live Psyche ran with this and still delivered Bash. 2. **Per-invocation flags (4 sites: init_session, resume_session, resume_session_checked, final_session)**: Added: - `--disable-slash-commands` — **Result: ineffective** for model-invoked Skill tool; only blocks manual `/slash` invocation. - `--allowed-tools "Read Write Edit"` — **Result: ineffective** — this is an auto-approve permission list, not a tool-surface restriction. With `--dangerously-skip-permissions` also set, it is effectively a no-op. proposed_fix_direction (NOT applied — diagnose-only mode): At the 4 per-invocation spawn sites in `src/live/wrapper/claude.rs` (init_session, resume_session, resume_session_checked, final_session), REPLACE `--allowed-tools "Read Write Edit"` with `--tools "Read,Edit,Write"`. The `--tools` flag restricts the built-in tool surface to the listed names, removing Bash AND the Skill tool itself — closing both the direct-Bash and skill-loading paths in one change. Comma-separated single-arg form is confirmed working via empirical test. Leave `--dangerously-skip-permissions` in place (it does not override `--tools`). The `--disable-slash-commands` flag can stay (harmless, also blocks `/slash` manual invocation) or be removed (redundant since `--tools` already removes Skill). The agent-level `"tools": [...]` field can stay (harmless) or be removed (confirmed ineffective in this version). verification (of prior fix): Runtime verification on 2026-04-20 — FAILED. Live Psyche at 05:28:24 (log line 22-23) reported "Both messages delivered" after running with the patched binary (build 04:50 2026-04-20). Bypass still active. New evidence section added today documents the mechanism. files_changed_in_prior_fix: - src/live/wrapper/claude.rs (5 hunks: build_agents_json, init_session args, resume_session args, resume_session_checked args, final_session args) — all still present in source; awaiting user decision to replace `--allowed-tools` with `--tools`.