# Architecture

**Analysis Date:** 2026-03-28

## Pattern Overview

**Overall:** File-system-based message queue with a CLI dispatcher

**Key Characteristics:**
- Zero external dependencies -- uses only bash and coreutils
- Inter-process communication via filesystem polling (no sockets, no daemons)
- Two-component system: a bash runtime (`owl.sh`) and an LLM instruction manifest (`SKILL.md`)
- Designed to be consumed by Claude Code agents, not human users directly

## Layers

**Instruction Layer (SKILL.md):**
- Purpose: Teaches Claude Code agents how to use the owl messaging system. This is the "API documentation" that the LLM reads as a skill.
- Location: `SKILL.md`
- Contains: Subcommand reference, behavioral rules (when to block, when to re-poll, how to parse messages), error recovery steps
- Depends on: Nothing
- Used by: Claude Code's skill loader (injected into agent context at runtime)

**Runtime Layer (owl.sh):**
- Purpose: Implements all messaging primitives as a single bash script with a case-dispatch pattern
- Location: `owl.sh`
- Contains: Perch lifecycle management (setup, poll, stop), message delivery (deliver, reply, send), directory listing and stale cleanup (list)
- Depends on: Filesystem at `$HOME/.claude/owlery/`
- Used by: Claude Code agents via the `$OWL` environment variable

**Specification Layer (docs/):**
- Purpose: Design documents and deployment instructions. Not consumed at runtime.
- Location: `docs/DEPLOY.md`, `docs/SKILL_SPEC-LIVE.md`
- Contains: Deployment workflow, and a future "Live Agent" spec that extends owl with Psyche/Spine concepts
- Depends on: Nothing
- Used by: Human developers

## Data Flow

**Message Send (agent A to agent B):**

1. Agent B runs `$OWL poll <B-id> --setup` which creates `$HOME/.claude/owlery/<B-id>/inbox/` and a `ready` file
2. Agent A runs `$OWL deliver <B-id> <A-id> <<< "message"` which writes a timestamped `.msg` file into B's inbox
3. Agent B's polling loop (`_poll_once`) detects the `.msg` file, prints its contents to stdout, and deletes the file
4. Agent B parses the `__REPLY_TO__:<A-id>` header line from the output, then processes the message body
5. Agent B runs `$OWL reply <A-id> <B-id> <<< "response"` to send the response back to A
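
The exchange above can be approximated with plain files. This is a minimal sketch, not the code in `owl.sh`: the helper names `sketch_deliver` and `sketch_poll_once` are hypothetical, `$RANDOM` is an assumed source of the random suffix, and a temporary directory stands in for `$HOME/.claude/owlery/` so the example is safe to run.

```bash
#!/usr/bin/env bash
# Illustrative sketch only; not the actual owl.sh implementation.
# A temp directory stands in for $HOME/.claude/owlery/.
OWLERY="$(mktemp -d)"

# Hypothetical helper: queue stdin as a message for <target>, tagged with <sender>.
sketch_deliver() {
  local target="$1" sender="$2"
  local file="$OWLERY/$target/inbox/$(date +%s)-$RANDOM.msg"
  {
    printf '__REPLY_TO__:%s\n' "$sender"   # header line parsed in step 4
    cat                                     # message body from stdin
  } > "$file"
}

# Hypothetical helper: print and delete the oldest pending message, if any.
sketch_poll_once() {
  local inbox="$OWLERY/$1/inbox" oldest
  oldest="$(ls -1t "$inbox" | tail -1)"
  [ -n "$oldest" ] || return 1
  cat "$inbox/$oldest"
  rm -f "$inbox/$oldest"
}

# Step 1: B registers a perch (inbox plus ready marker).
mkdir -p "$OWLERY/agent-b/inbox"
touch "$OWLERY/agent-b/ready"

# Step 2: A delivers a message to B.
sketch_deliver agent-b agent-a <<< "status?"

# Steps 3-4: B consumes the message and splits the header from the body.
received="$(sketch_poll_once agent-b)"
reply_to="$(head -n 1 <<< "$received" | sed 's/^__REPLY_TO__://')"
body="$(tail -n +2 <<< "$received")"
echo "from=$reply_to body=$body"
```

After the run, B's inbox is empty again because the message file is removed as soon as it has been printed.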

**Ephemeral Send (one-shot with reply wait):**

1. Agent A runs `$OWL send <target-id> <my-id> <<< "msg"`, which creates a temporary reply perch for A, delivers the message to the target, and then polls A's own inbox for a reply
2. When the reply arrives, the ephemeral perch self-cleans (removes ready, info.json, inbox, directory)
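
A rough sketch of the one-shot pattern follows. It is an approximation under stated assumptions, not the `send` branch of `owl.sh`: `sketch_put` and `sketch_send` are hypothetical names, the limit of ten one-second polls is invented for the example, and a background subshell plays the role of the target agent. The sketch also writes each message under a temporary name and renames it into the inbox so the poller never sees a half-written file; this analysis does not say whether `owl.sh` does the same.

```bash
#!/usr/bin/env bash
# Illustrative sketch only; not the actual owl.sh implementation.
OWLERY="$(mktemp -d)"   # stands in for $HOME/.claude/owlery/

# Hypothetical helper: write stdin into <target>'s inbox with a reply-to header.
# The temp-file-then-rename step is a choice made for this sketch.
sketch_put() {
  local target="$1" sender="$2" name tmp
  name="$(date +%s)-$RANDOM.msg"
  tmp="$OWLERY/$target/.$name.tmp"
  { printf '__REPLY_TO__:%s\n' "$sender"; cat; } > "$tmp"
  mv "$tmp" "$OWLERY/$target/inbox/$name"
}

# Hypothetical helper: one-shot send that waits for a single reply.
sketch_send() {
  local target="$1" me="$2" perch="$OWLERY/$2" msg tries=0
  mkdir -p "$perch/inbox"                 # temporary reply perch for the sender
  touch "$perch/ready"
  sketch_put "$target" "$me"              # forwards this function's stdin
  # Poll our own inbox; the 10 x 1s limit is an assumption for this sketch.
  while [ "$tries" -lt 10 ]; do
    msg="$(ls -1t "$perch/inbox" | tail -1)"
    if [ -n "$msg" ]; then
      tail -n +2 "$perch/inbox/$msg"      # print the reply body only
      break
    fi
    sleep 1
    tries=$((tries + 1))
  done
  rm -rf "$perch"                         # the ephemeral perch removes itself
}

# Simulated target: a perch whose owner answers after a short delay.
mkdir -p "$OWLERY/agent-b/inbox"
( sleep 1; mkdir -p "$OWLERY/agent-a/inbox"; sketch_put agent-a agent-b <<< "pong" ) &

answer="$(sketch_send agent-b agent-a <<< "ping")"
wait
echo "reply: $answer"
```

Once the reply has been printed, nothing of A's perch remains on disk, while the original message is still queued in B's inbox because the simulated target never polls.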

**State Management:**
- All state is filesystem-based under `$HOME/.claude/owlery/`
- Each "perch" is a directory: `owlery/<id>/` containing `ready`, `info.json`, and `inbox/`
- `ready` file: presence signals the perch is active and accepting messages
- `info.json`: stores owl_id, start timestamp, mode (listen/wait/once), and PID of the polling process
- Messages are individual files: `inbox/<epoch>-<random>.msg`
- Stale detection: `list` and `_check_alive` verify the PID in `info.json` is still running via `kill -0`
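
To make the on-disk state concrete, here is a small sketch that creates a perch and then checks its recorded PID. It is not taken from `owl.sh`: the helper names are hypothetical, the JSON key names other than `owl_id` are assumptions, and a temporary directory replaces `$HOME/.claude/owlery/`.

```bash
#!/usr/bin/env bash
# Illustrative sketch only; not the actual owl.sh implementation.
OWLERY="$(mktemp -d)"   # stands in for $HOME/.claude/owlery/

# Hypothetical helper: create the three pieces of perch state for <id>.
# Key names other than owl_id are assumptions made for this sketch.
sketch_setup() {
  local id="$1" mode="$2" pid="$3" perch="$OWLERY/$1"
  mkdir -p "$perch/inbox"
  printf '{"owl_id":"%s","started":%s,"mode":"%s","pid":%s}\n' \
    "$id" "$(date +%s)" "$mode" "$pid" > "$perch/info.json"
  touch "$perch/ready"   # sketch choice: mark the perch active only once it is complete
}

# Hypothetical helper: extract the recorded PID using only sed (no jq dependency).
sketch_perch_pid() {
  sed -n 's/.*"pid":\([0-9][0-9]*\).*/\1/p' "$OWLERY/$1/info.json"
}

sketch_setup agent-b listen "$$"
pid="$(sketch_perch_pid agent-b)"

# kill -0 sends no signal; it only reports whether the process can be signalled.
if kill -0 "$pid" 2>/dev/null; then
  echo "agent-b alive (pid $pid)"
else
  echo "agent-b stale"
fi
```

Here the perch records the PID of the running shell, so the liveness check succeeds.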

## Key Abstractions

**Perch:**
- Purpose: Represents a registered listener (an agent's mailbox)
- Structure: `$HOME/.claude/owlery/<id>/` with `ready`, `info.json`, `inbox/` subdirectory
- Lifecycle: Created by `setup`, `poll --setup`, or `send`; destroyed by `stop`, by the ephemeral self-cleanup that follows `send`, or by `list` when the perch is stale

**Message File:**
- Purpose: A single queued message waiting for delivery
- Format: First line is `__REPLY_TO__:<sender-id>`, remaining lines are the message body
- Location: `owlery/<target-id>/inbox/<timestamp>-<random>.msg`
- Pattern: Write-once, read-once, delete-on-read (consumed by `_poll_once`)
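
The message format lends itself to a two-step read: the first line is the header, and everything after it is the body. The sketch below shows one way to do that in bash. The file name and contents are made-up examples, and the code is not the parsing logic prescribed by `SKILL.md`.

```bash
#!/usr/bin/env bash
# Illustrative sketch only; the file name and contents are made-up examples.
msg_file="$(mktemp)"
printf '%s\n' '__REPLY_TO__:agent-a' 'line one' 'line two' > "$msg_file"

# Read the header from the first line, then treat everything after it as the body.
{
  IFS= read -r header
  body="$(cat)"
} < "$msg_file"

sender="${header#__REPLY_TO__:}"   # strip the fixed prefix to get the sender id
rm -f "$msg_file"                  # delete-on-read: a message is consumed once

echo "sender=$sender"
echo "$body"
```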

**Owl Commands (case dispatch):**
- Purpose: Each subcommand is a case branch in `owl.sh`
- Commands: `setup`, `poll`, `deliver`, `reply`, `send`, `list`, `stop`
- Pattern: `$OWL <command> [args...]` -- first argument is always the command, remaining are positional
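
A case-dispatch script of this kind typically has the following skeleton. The branch bodies are placeholders and the function name `sketch_owl` is hypothetical; how `owl.sh` handles an unknown command is not covered by this analysis, so the fallback branch is an assumption.

```bash
#!/usr/bin/env bash
# Illustrative skeleton only; the branch bodies are placeholders, not owl.sh code.
sketch_owl() {
  local cmd="${1:-}"
  [ "$#" -gt 0 ] && shift          # the remaining arguments are positional
  case "$cmd" in
    setup|poll|deliver|reply|send|list|stop)
      echo "would run '$cmd' with $# argument(s): $*"
      ;;
    *)
      # Assumed fallback; the real script's behavior here is not documented.
      echo "unknown command: '$cmd'" >&2
      return 1
      ;;
  esac
}

sketch_owl deliver agent-b agent-a
sketch_owl bogus || echo "dispatch rejected it"
```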

## Entry Points

**owl.sh (runtime):**
- Location: `owl.sh`
- Triggers: Called by Claude Code agents via `$OWL <command> [args...]`
- Responsibilities: All messaging operations -- perch lifecycle, message delivery, polling, cleanup

**SKILL.md (skill manifest):**
- Location: `SKILL.md`
- Triggers: Loaded by Claude Code's skill system when the `owl` skill is active
- Responsibilities: Instructs agent behavior -- which commands to run, how to handle messages, error recovery

## Error Handling

**Strategy:** Fail-fast with descriptive status codes printed to stdout

**Patterns:**
- `_check_alive` exits with code 1 and prints `NO_PERCH: <id> is not listening` or `STALE: <id>'s poll process is dead`
- `deliver`, `reply`, and `send` all call `_check_alive` before writing, preventing messages to dead agents
- `list` auto-cleans stale perches (dead PIDs or missing ready files) -- this is the primary garbage collection mechanism
- The `SKILL.md` instruction layer includes a recovery procedure for when the `$OWL` environment variable is not set (`SKILL.md` lines 11-15)
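
The liveness check and the cleanup pass can be sketched together. This is an approximation, not the real `_check_alive` or `list` code: the helper names are hypothetical, the `pid` key in `info.json` is an assumption, and the check returns 1 instead of exiting so that the cleanup loop can reuse it.

```bash
#!/usr/bin/env bash
# Illustrative sketch only; not the actual owl.sh implementation.
OWLERY="$(mktemp -d)"   # stands in for $HOME/.claude/owlery/

# Hypothetical stand-in for _check_alive: prints a status line and returns 1 on failure.
# The "pid" key name in info.json is an assumption.
sketch_check_alive() {
  local id="$1" perch="$OWLERY/$1" pid
  if [ ! -f "$perch/ready" ]; then
    echo "NO_PERCH: $id is not listening"
    return 1
  fi
  pid="$(sed -n 's/.*"pid":\([0-9][0-9]*\).*/\1/p' "$perch/info.json" 2>/dev/null)"
  if [ -z "$pid" ] || ! kill -0 "$pid" 2>/dev/null; then
    echo "STALE: $id's poll process is dead"
    return 1
  fi
}

# Hypothetical stand-in for the cleanup pass in list: drop perches that fail the check.
sketch_list() {
  local dir id
  for dir in "$OWLERY"/*/; do
    [ -d "$dir" ] || continue
    id="$(basename "$dir")"
    if sketch_check_alive "$id" >/dev/null; then
      echo "$id"
    else
      rm -rf "$dir"
      echo "CLEANED:$id"
    fi
  done
}

# One live perch (this shell's PID) and one whose recorded process has exited.
mkdir -p "$OWLERY/live/inbox" "$OWLERY/dead/inbox"
touch "$OWLERY/live/ready" "$OWLERY/dead/ready"
printf '{"owl_id":"live","pid":%s}\n' "$$" > "$OWLERY/live/info.json"
true & dead_pid=$!; wait "$dead_pid"
printf '{"owl_id":"dead","pid":%s}\n' "$dead_pid" > "$OWLERY/dead/info.json"

sketch_check_alive ghost
sketch_check_alive dead
sketch_list
```

A sender would run the check before writing anything into the target's inbox, which is how delivery to a dead agent is refused.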

## Cross-Cutting Concerns

**Logging:** No logging system. Commands print status codes to stdout/stderr (e.g., `READY:id`, `SENT:id`, `REPLIED:id`, `CLEANED:id`, `STOPPED:id`).

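**Validation:** Minimal. `_check_alive` validates that the target perch exists and that its PID is live. IDs and message bodies are not sanitized; since an ID becomes a directory name under `owlery/`, an ID containing `/` or `..` could resolve to a path outside the intended perch.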

**Authentication:** None. Any process with filesystem access to `$HOME/.claude/owlery/` can read/write messages. Security relies on OS-level file permissions.

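**Concurrency:** No locking. Two agents writing to the same inbox at the same time do not overwrite each other as long as their `<epoch>-<random>` filenames differ, which makes collisions unlikely but not impossible. `_poll_once` reads the oldest message by modification time (`ls -1t | tail -1`), which approximates FIFO ordering; messages whose modification times tie may not be consumed in send order.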
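
The oldest-first selection can be demonstrated in isolation. In this sketch the file names and timestamps are made-up, and the modification times are set explicitly so the order is deterministic; it illustrates the `ls -1t | tail -1` idiom and is not the `_poll_once` implementation.

```bash
#!/usr/bin/env bash
# Illustrative sketch only; file names and timestamps are made-up examples.
inbox="$(mktemp -d)"

# Create three messages with explicit, distinct modification times (POSIX touch -t).
touch -t 202601010000.03 "$inbox/c.msg"
touch -t 202601010000.01 "$inbox/a.msg"
touch -t 202601010000.02 "$inbox/b.msg"

# ls -1t lists newest first, so the last line is the oldest file.
order=""
while oldest="$(ls -1t "$inbox" | tail -1)" && [ -n "$oldest" ]; do
  order="$order $oldest"
  rm -f "$inbox/$oldest"
done
echo "consumed in order:$order"
```

The files are consumed by modification time, not by creation order or by name, which is why the ordering is only as reliable as the timestamps.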

---

*Architecture analysis: 2026-03-28*
