# VPN Cluster — Repo Map (overlay patterns, NOT a VPN model for SPT)

Three production overlay/VPN systems studied for coordination patterns. SPT shares
the "scattered agents discover each other and exchange small messages" problem but
explicitly does NOT want to be a VPN: no kernel tun device, no daemon, no
always-on control plane. The interesting borrowables are coordinator topology,
identity bootstrap UX, peer-state sync cadence, and (very selectively)
NAT-traversal hints.

Repos read:
- `docs/research/inspiration/repos/slackhq__nebula/` (4.8 MB, Go)
- `docs/research/inspiration/repos/tonarino__innernet/` (1.2 MB, Rust)
- `docs/research/inspiration/repos/juanfont__headscale/` (390 MB, Go) — README,
  `AGENTS.md`, `hscontrol/state/`, `hscontrol/auth.go`, `hscontrol/noise.go`,
  `hscontrol/mapper/`, `hscontrol/derp/derp.go`, `hscontrol/db/preauth_keys.go`
  only. Heavy generated/test/integration trees skipped.

---

## nebula (Go, Slack)

- Repo: `docs/research/inspiration/repos/slackhq__nebula/`
- Architecture: single binary `nebula`, every node runs the same daemon; some
  nodes are flagged `lighthouse.am_lighthouse=true` and act as discovery brokers.
  No central database — lighthouses just cache who reported what, in RAM.
- Entry: `main.go:23` `Main()` wires `PKI -> Firewall -> Lighthouse -> Punchy ->
  HandshakeManager -> Interface`. Build: `cmd/nebula/`. PKI CLI: `cmd/nebula-cert/`.

### Lighthouse (coordinator)

- File: `lighthouse.go`
- Struct `LightHouse` at `lighthouse.go:29`. Fields of interest:
  `amLighthouse bool`, `addrMap map[netip.Addr]*RemoteList` (vpn-addr → cached
  reports), `lighthouses atomic.Pointer[[]netip.Addr]` (who *my* upstream
  lighthouses are), `relaysForMe atomic.Pointer[[]netip.Addr]`.
- Constructor: `NewLightHouseFromConfig(ctx, l, c, cs, pc, p) (*LightHouse, error)`
  at `lighthouse.go:82`.
- Handler dispatch: `LightHouseHandler.HandleRequest` at `lighthouse.go:1082`
  switches on `NebulaMeta` proto type → `handleHostQuery` (`:1123`),
  `handleHostQueryReply` (`:1288`), `handleHostUpdateNotification` (`:1322`),
  `handleHostPunchNotification` (`:1401`).
- Periodic state push: `StartUpdateWorker` at `lighthouse.go:869` ticks every
  `lighthouse.interval` seconds (default 10s) and calls `SendUpdate` (`:908`)
  which packages local interface addrs + configured `advertise_addrs` into a
  `HostUpdateNotification` to every upstream lighthouse. `TriggerUpdate` (`:901`)
  is a non-blocking early-fire after a fresh handshake.
- Auth model: **mutual cert auth, no shared secret with the lighthouse**. The
  lighthouse just validates a Noise IX handshake using the same CA-signed cert
  every node carries. No "lighthouse owns the database of who's allowed"
  concept — a node either has a CA-valid cert or it doesn't.
- Data plane vs control plane: **same wire, same UDP socket, same Noise tunnel**.
  Lighthouse traffic is just an in-band protobuf inside the encrypted data
  channel (`NebulaMeta_*` message types). There is no separate "control port".

### NAT hole-punching

- File: `punchy.go` (the whole file is ~230 lines, very readable)
- Struct `Punchy` at `punchy.go:35`. `holepunchJob` (`:22`): either an immediate
  `target netip.AddrPort` to fire a UDP byte at, or a `vpnAddr` to send an
  encrypted Nebula `Test` packet to.
- `Punchy.Schedule(target, vpnAddr)` at `punchy.go:142`: queues a 1-byte UDP punch
  after `punchy.delay` (default 1s). `ScheduleRespond(vpnAddr)` (`:151`): a
  delayed encrypted test packet, gated by `punchy.respond`.
- `Punchy.SendPunch(hostinfo)` at `punchy.go:167`: immediate keepalive byte for
  an idle tunnel. Skips lighthouses (their normal update interval keeps NAT
  state warm).
- Punch signaling flow (the actually-clever bit):
  - When the lighthouse answers `HostQuery` with `HostQueryReply` (`:1181`), it
    *also* calls `sendHostPunchNotification` (`:1187`) to notify the **queried**
    peer: "someone with these endpoints is trying to reach you; please send
    1-byte UDP packets to those endpoints so the firewall state opens".
    Both sides then punch simultaneously from behind their NATs. This is the
    entire mechanism: no STUN, no ICE candidate gathering, just "lighthouse
    tells both sides each other's `endpoint:port`".
  - Receiver path: `handleHostPunchNotification` at `lighthouse.go:1401` →
    `punchy.Schedule(b, detailsVpnAddr)` for each V4/V6 addr (`:1423`, `:1430`).
- Relay fallback when punching fails: `relay_manager.go`. `relayManager.StartRelays`
  at `relay_manager.go:60` walks configured relays and tunnels handshake
  packets through them. Requires both ends to opt in (`relay.am_relay` /
  `relay.use_relays`).
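
The punch-signaling flow above can be sketched as a toy in-memory broker. This is a minimal illustration, not nebula's code: `Broker`, `host_update`, `host_query`, and the `"punch"` message kind are hypothetical names, and the `outbox` list stands in for real UDP sends.

```python
from dataclasses import dataclass, field

Endpoint = tuple[str, int]  # (ip, port) as last reported to the broker


@dataclass
class Broker:
    """In-RAM rendezvous broker: remembers the last endpoints each peer pushed."""
    last_seen: dict[str, list[Endpoint]] = field(default_factory=dict)
    # (recipient, message kind, endpoints) tuples that would go out on the wire.
    outbox: list[tuple[str, str, list[Endpoint]]] = field(default_factory=list)

    def host_update(self, peer: str, endpoints: list[Endpoint]) -> None:
        # Periodic push from a peer: overwrite whatever was cached before.
        self.last_seen[peer] = list(endpoints)

    def host_query(self, asker: str, target: str) -> list[Endpoint]:
        """Answer the asker and, in the same step, tell the target to punch back."""
        target_eps = self.last_seen.get(target, [])
        asker_eps = self.last_seen.get(asker, [])
        if target_eps and asker_eps:
            # Hypothetical message kind standing in for the punch notification.
            self.outbox.append((target, "punch", asker_eps))
        return target_eps


broker = Broker()
broker.host_update("laptop-a", [("203.0.113.7", 4242)])
broker.host_update("laptop-b", [("198.51.100.9", 4242)])

# A asks for B: A learns B's endpoints, and B is told to punch toward A.
reply = broker.host_query("laptop-a", "laptop-b")
```

After the query, both sides hold each other's endpoints and can send their punch packets at roughly the same time. A query for an unknown peer returns an empty list and queues nothing.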

### Identity / node-join UX

- PKI: `cmd/nebula-cert/` (`ca.go`, `keygen.go`, `sign.go`, `passwords.go`).
  Flow: operator runs `nebula-cert ca -name X` once to mint a self-signed root,
  then `nebula-cert sign -name node1 -ip 10.x.y.z/24 -groups admin,server` per
  node. CA private key never leaves the operator's box.
- Result: `host.crt` + `host.key` files copied to the new node alongside a YAML
  config that lists the lighthouse VPN-IPs in `static_host_map` and
  `lighthouse.hosts`.
- Loading: `pki.go:27` `PKI{cs, caPool}`, `CertState` (`:33`) holds parsed
  cert, version (v1 or v2), private key. Reloadable via config callback.
- No registration RPC. A new node just shows up, completes Noise IX with the
  lighthouse, and starts sending `HostUpdateNotification`s. Lighthouse caches
  them in RAM (`addrMap`). On lighthouse restart the cache is lost and refilled
  organically within one update interval.

### Peer-state sync

- Pull: node sends `HostQuery(targetVpnAddr)` lazily, only when it needs to
  reach `targetVpnAddr` for the first time (or its cached `RemoteList` ages
  out). Lighthouse replies with cached V4/V6/relay candidates.
- Push: every node periodically pushes its own observed addresses
  (`SendUpdate` at `lighthouse.go:908`, interval 10s default).
- `RemoteList` per-peer cache: `remote_list.go`. Coalesces "what I learned" +
  "what peer reported": `coalesceAnswers` at `lighthouse.go:1244`.
- Eventual consistency, **no global view** anywhere. Lighthouses don't gossip
  to each other — each peer pushes to every lighthouse it knows.

### Borrow

- **Punch-signaling pattern** as inspiration only: lighthouse-mediated
  rendezvous where the broker tells *both* sides about each other's last-seen
  endpoint in a single roundtrip. If SPT ever does cross-NAT delivery between
  two laptop perches, this 3-message pattern (Query → Reply + PunchNotif → both
  send UDP) is the minimum-viable design.
- **Reload-via-config-callback** pattern (`config.C.RegisterReloadCallback`):
  every subsystem registers a closure that runs when config changes; the
  closure compares `c.HasChanged("key")` and acts. Useful for SPT if we ever
  want hot-reload of perch configuration without restart.
- **In-band control protocol** over the same transport as data: avoids the
  "wait, is the control plane up?" failure mode that headscale lives with.
- **Lazy peer discovery** (`Query` only when traffic appears) — matches SPT's
  pattern of "look up perch when about to deliver".
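
A minimal sketch of the reload-via-callback shape, assuming a Python stand-in for the config object. `Config`, `has_changed`, and `register_reload_callback` here are illustrative names modeled on the description above, not nebula's actual API.

```python
from typing import Any, Callable


class Config:
    """Holds current and previous settings so callbacks can diff them."""

    def __init__(self, settings: dict[str, Any]) -> None:
        self._current = dict(settings)
        self._previous: dict[str, Any] = {}
        self._callbacks: list[Callable[["Config"], None]] = []

    def get(self, key: str, default: Any = None) -> Any:
        return self._current.get(key, default)

    def has_changed(self, key: str) -> bool:
        # True when the value differs from the one before the last reload.
        return self._current.get(key) != self._previous.get(key)

    def register_reload_callback(self, cb: Callable[["Config"], None]) -> None:
        self._callbacks.append(cb)

    def reload(self, settings: dict[str, Any]) -> None:
        self._previous, self._current = self._current, dict(settings)
        for cb in self._callbacks:
            cb(self)


applied: list[str] = []


def on_reload(c: Config) -> None:
    # Only act when the key this subsystem cares about actually changed.
    if c.has_changed("poll_interval"):
        applied.append(f"poll_interval={c.get('poll_interval')}")


cfg = Config({"poll_interval": 10, "name": "perch-1"})
cfg.register_reload_callback(on_reload)
cfg.reload({"poll_interval": 10, "name": "perch-2"})  # unrelated change: ignored
cfg.reload({"poll_interval": 5, "name": "perch-2"})   # relevant change: applied
```

Each subsystem owns its own diffing logic, so the reload path never needs to know which keys matter to whom.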

### Avoid

- **Persistent UDP listener / daemon model**: every nebula node holds a
  long-lived UDP socket bound to a port. SPT explicitly has no daemon — the
  binary forks short-lived processes per command. Lighthouse + Punchy + handshake
  state machine all assume "I'm always running".
- **CA hierarchy + cert distribution**: requires offline operator-managed CA
  key, plus a per-node sign step. Too much ceremony for SPT's "copy `owl.exe`
  and go" goal. SPT's TCP-registry-on-localhost model needs no PKI.
- **TUN/TAP device + kernel routing**: nebula owns layer-3. SPT is a
  message-bus, not a network — never touch the kernel.
- **In-RAM-only lighthouse state**: works because peers re-push every 10s, but
  requires the lighthouse to be reachable continuously. SPT's spool-to-disk
  is more resilient for the "everyone on laptops" case.

---

## innernet (Rust, tonari)

- Repo: `docs/research/inspiration/repos/tonarino__innernet/`
- Workspace crates: `server/`, `client/`, `client-core/`, `shared/`,
  `wireguard-control/`, `hostsfile/`, `netlink-request/`, `publicip/`.
- Architecture: **explicitly centralized**. One `innernet-server` per network
  holds the SQLite source of truth; clients fetch state over HTTP *inside* the
  WireGuard tunnel.

### Coordinator (innernet-server)

- File: `server/src/lib.rs`
- Entry: `server/src/main.rs:1`. Subcommands at `main.rs:39` — `new` (init
  wizard), `serve`, `add-cidr`, `add-peer`, `disable-peer`, `enable-peer`,
  `rename-peer`, `rename-cidr`, `add-association`, `uninstall`.
- HTTP routes: `server/src/api/user.rs:16` `routes()` — `GET /state`,
  `GET /capabilities`, `POST /redeem`, `PUT /endpoint`, `PUT /candidates`.
  Admin routes at `server/src/api/admin/` (peer / cidr / association CRUD).
- `Context` struct at `server/src/lib.rs:51`: `db: Arc<Mutex<Connection>>`,
  `endpoints: Arc<RwLock<HashMap<String, SocketAddr>>>`, `interface`, `backend`,
  `public_key`. Database is SQLite (`rusqlite`, `Connection::open`).
- Auth model: **TCP source-IP authentication inside WireGuard**. The HTTP API
  binds to the server's WireGuard interface address only; if a packet arrives,
  WireGuard has already authenticated it cryptographically. `Session` struct
  (`lib.rs:59`) is built by looking up the source IP in the peer table.
- Data plane vs control plane: **control = HTTP over WireGuard; data =
  raw WireGuard between peers**. After redemption a peer talks to other peers
  directly via WG and contacts the server only for state pulls and endpoint
  reports.
- Init wizard: `server/src/initialize.rs:110` `init_wizard()` walks the operator
  through CIDR, listen port, and endpoint detection (`innernet-publicip` crate),
  generates a WG keypair, then calls `populate_database` (`:66`) to write the
  root CIDR, the "innernet-server" CIDR, and the server's own peer row.

### NAT traversal

- File: `client-core/src/nat.rs` (whole file, 139 lines)
- Approach: **endpoint candidate list, no hole-punching**. Each peer reports
  its locally-observed addresses via `PUT /candidates` (`api/user.rs:142`,
  max 10). The server also tracks the source addr it saw on the last connection
  via `inject_endpoints`. Other peers receive the candidate list as part of
  `/state` and iterate.
- `NatTraverse` at `nat.rs:17`. Methods:
  - `NatTraverse::new(interface, backend, &[PeerDiff]) -> Result<Self>` (`:23`)
  - `step(&mut self) -> Result<()>` (`:98`) — pops one candidate per peer,
    writes it as the peer's WG endpoint, sleeps `STEP_INTERVAL` (5s) polling
    `is_recently_connected`.
  - `refresh_remaining(&mut self)` (`:71`) — drops peers that connected or have
    no candidates left.
- Helper: `set_endpoint(public_key, endpoint) -> Option<PeerConfigBuilder>`
  (`:132`) — resolves a DNS name and constructs a WG peer update.
- **No UDP punching.** The assumption is that WG keepalives plus at least one
  reachable endpoint per peer (or one peer with a public IP) are enough.
  `PERSISTENT_KEEPALIVE_INTERVAL_SECS` (server-initialize) keeps NAT mappings alive.
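
The candidate-iteration loop can be sketched as follows. This is a simplified model, not a port of `nat.rs`: `CandidateStepper` and the `is_connected` probe are hypothetical, and the 5s sleep between steps is left out so the example runs instantly.

```python
from collections import deque
from typing import Callable


class CandidateStepper:
    """Tries one endpoint candidate per unconnected peer on each step."""

    def __init__(self, candidates: dict[str, list[str]],
                 is_connected: Callable[[str, str], bool]) -> None:
        # Peers with no candidates at all are dropped up front.
        self.remaining = {p: deque(eps) for p, eps in candidates.items() if eps}
        self.is_connected = is_connected  # hypothetical probe: (peer, endpoint) -> bool
        self.chosen: dict[str, str] = {}

    def is_finished(self) -> bool:
        return not self.remaining

    def step(self) -> None:
        for peer in list(self.remaining):
            candidates = self.remaining[peer]
            endpoint = candidates.popleft()
            if self.is_connected(peer, endpoint):
                self.chosen[peer] = endpoint
                del self.remaining[peer]
            elif not candidates:
                del self.remaining[peer]  # out of candidates, give up on this peer


reachable = {("alice", "10.0.0.5:51820"), ("bob", "203.0.113.4:51820")}
stepper = CandidateStepper(
    {"alice": ["192.168.1.5:51820", "10.0.0.5:51820"],
     "bob": ["203.0.113.4:51820"],
     "carol": ["172.16.0.9:51820"]},
    lambda peer, ep: (peer, ep) in reachable,
)
steps = 0
while not stepper.is_finished():
    stepper.step()
    steps += 1
```

The worst-case convergence time is the longest candidate list times the step interval, which is why the candidate cap matters.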

### Identity / node-join UX

- File: `server/src/api/user.rs:85` (the `redeem` handler), `client/src/main.rs:262`
  (the `install` flow).
- Flow:
  1. Admin runs `innernet add-peer <iface>` → server (or admin client)
     generates a *temporary* WG keypair and writes a `PeerInvitation`: a TOML
     file (`InterfaceConfig {interface, server}`, `shared/src/interface_config.rs:17`)
     holding the temp private key, the server's public key, and its endpoints.
  2. Operator hand-delivers the `.toml` file to the new machine.
  3. New peer runs `innernet install /path/to/invite.toml` (`client/src/main.rs:262`).
     `redeem_invite` (called at `main.rs:283`) brings up WG with the temp key,
     reaches the server's internal endpoint, generates its *real* keypair,
     `POST /v1/user/redeem` with the new pubkey (`api/user.rs:93`). Server
     updates the peer row, schedules a delayed WG reconfig (`:122`, after
     `REDEEM_TRANSITION_WAIT` so the HTTP response can flush).
  4. Invitation is one-shot: `is_redeemed` flag flips, `expired_invite_sweeper`
     (`lib.rs:424`) GCs unredeemed invites.
- File permissions: `chmod 0600` on the invitation file (`lib.rs:97`).
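
The one-shot invitation lifecycle, reduced to its state machine. This is a sketch under simplifying assumptions: `InviteRegistry` and its methods are hypothetical, random hex strings stand in for WireGuard keys, and the caller passes time in explicitly so the behavior is deterministic. In innernet the temp key authenticates through the tunnel itself; here it is just compared.

```python
import secrets
from dataclasses import dataclass


@dataclass
class Peer:
    name: str
    public_key: str
    is_redeemed: bool
    invite_expires: float


class InviteRegistry:
    def __init__(self) -> None:
        self.peers: dict[str, Peer] = {}

    def add_peer(self, name: str, ttl_secs: float, now: float) -> dict[str, str]:
        """Create a peer row with a throwaway key and return the invitation."""
        temp_key = secrets.token_hex(16)  # stand-in for a temporary keypair
        self.peers[name] = Peer(name, temp_key, False, now + ttl_secs)
        return {"name": name, "temp_key": temp_key}

    def redeem(self, invite: dict[str, str], new_public_key: str, now: float) -> bool:
        """Swap the throwaway key for the peer's real key, exactly once."""
        peer = self.peers.get(invite["name"])
        if peer is None or peer.is_redeemed or now > peer.invite_expires:
            return False
        if not secrets.compare_digest(peer.public_key, invite["temp_key"]):
            return False
        peer.public_key, peer.is_redeemed = new_public_key, True
        return True

    def sweep_expired(self, now: float) -> list[str]:
        """Drop invitations that were never redeemed before their deadline."""
        dead = [n for n, p in self.peers.items()
                if not p.is_redeemed and now > p.invite_expires]
        for name in dead:
            del self.peers[name]
        return dead


reg = InviteRegistry()
invite = reg.add_peer("laptop", ttl_secs=60, now=0)
reg.add_peer("forgotten", ttl_secs=60, now=0)
first = reg.redeem(invite, "real-pubkey", now=10)       # succeeds
second = reg.redeem(invite, "attacker-pubkey", now=11)  # invitation already used
swept = reg.sweep_expired(now=120)
```

A leaked invitation file is only useful until the legitimate peer redeems it or the sweeper removes it.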

### Peer-state sync

- Pull only: `innernet up` (`client/src/main.rs:378`) loops by default and
  re-fetches every `--interval`. Each iteration: `fetch` calls `GET /v1/user/state`,
  server returns `State { peers, cidrs }` filtered to what *this* peer is
  allowed to see (`api/user.rs:62`, `get_all_allowed_peers`).
- No push from server to client. State staleness window = poll interval.
- Endpoint reporting: client sends `PUT /endpoint` if it has overridden its
  endpoint, and `PUT /candidates` with local NIC addresses, both inside
  the WG tunnel.
- Server-side cache of last-seen socket addr per peer:
  `Endpoints = Arc<RwLock<HashMap<String, SocketAddr>>>` (`lib.rs:48`),
  injected into `/state` responses.

### Borrow

- **Invitation-as-file UX**: a single short-lived TOML containing a throwaway
  key + server contact info, redeemed once for a permanent key. This is a
  much cleaner mental model than "send your pubkey to the admin and wait".
  SPT could use the same shape for `spt invite` → `spt accept invite.toml`
  if we ever do cross-host registration.
- **CIDR-as-namespace + association graph** for ACL: every peer belongs to a
  CIDR; CIDRs are explicitly associated for cross-traffic. Conceptually maps to
  "perch groups" — SPT could group perches and allow targeting "group:foo".
- **`Endpoints` HashMap injected into responses**: server caches the source
  addr of the last HTTP hit and offers it as a candidate to peers. SPT's TCP
  registry already does this implicitly; the pattern of "the *coordinator's*
  view of where you actually came from is itself a piece of peer metadata"
  is worth keeping in mind.
- **`expired_invite_sweeper`** background task (`lib.rs:424`): GC unredeemed
  invitations on a timer. SPT already does similar for stale perches; the
  pattern of one tiny dedicated background task per cleanup concern is clean.
- **Atomic in-flight key rotation with a deliberate flush delay**
  (`REDEEM_TRANSITION_WAIT`, `api/user.rs:111-135`): swap the WG peer only
  *after* the HTTP response flushes, because the response itself rides the
  tunnel that's about to be rekeyed. The general pattern — "schedule the
  disruptive state change for *after* the current RPC finishes" — applies
  to any RPC that mutates its own transport.
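
The "mutate the transport only after the response is out" pattern from the last bullet can be sketched like this. Note the swap: innernet uses a timed delay (`REDEEM_TRANSITION_WAIT`), while this sketch uses an explicit post-send hook to keep it deterministic. `Transport`, `handle_redeem`, and `serve_one` are hypothetical names.

```python
from typing import Callable

log: list[str] = []


class Transport:
    """Toy transport whose key can be swapped; sends are tagged with the key in use."""

    def __init__(self, key: str) -> None:
        self.key = key

    def send(self, payload: str) -> None:
        log.append(f"{self.key}:{payload}")


def handle_redeem(transport: Transport, new_key: str) -> tuple[str, Callable[[], None]]:
    """Return the response plus the disruptive change to run after it is sent."""
    def rekey() -> None:
        transport.key = new_key
    return "redeemed", rekey


def serve_one(transport: Transport, new_key: str) -> None:
    response, after_flush = handle_redeem(transport, new_key)
    transport.send(response)  # still goes out under the old key
    after_flush()             # only now mutate the transport


t = Transport("temp-key")
serve_one(t, "real-key")
t.send("hello")
```

The handler never touches the transport directly; it hands the change back to the serving loop, which decides when it is safe to apply.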

### Avoid

- **Single coordination server is mandatory**. If `innernet-server` is down,
  `innernet add-peer` fails and existing peers run on stale state. SPT must
  not have a global single point of failure.
- **WireGuard tunnel for the control plane**: requires kernel WG (or
  `wireguard-go` userspace), root, `setcap`, `/etc/wireguard/` config, systemd
  unit. SPT's "no install" promise dies instantly.
- **SQLite at the coordinator** as the source of truth: works for innernet
  because the server is always-on; SPT's coordinator-less model means peers
  *are* the source of truth, and per-perch SQLite spools converge by gossip.
- **Peer identities can never be reused**: peers cannot be deleted, only disabled
  (`README.md:108`). A defensible WG choice, but for SPT a perch dying must
  fully release its ID.
- **No NAT-punching at all**: relies on "someone in your network has a public
  IP". SPT can't make that assumption.

---

## headscale (Go, ~390 MB)

- Repo: `docs/research/inspiration/repos/juanfont__headscale/`
- Heavyweight: most of the size is `integration/` Docker-based e2e tests
  (Tailscale client containers, derp containers) and `gen/` protobuf output.
  Production code lives in `hscontrol/` and `cmd/headscale/`.
- Architecture: **a reimplementation of the Tailscale control server** that
  speaks the Tailscale (Noise/TS2021) wire protocol. Tailscale clients connect
  to headscale as if it were the SaaS, get a tailnet config back, then talk
  WireGuard peer-to-peer. There is no headscale data plane — DERP relays are
  separate processes.
- Key reading: `AGENTS.md`, which is unusually good documentation aimed at AI agents.

### Coordinator / control plane

- `cmd/headscale/main.go` — single binary, runs as a long-lived server.
- `hscontrol/app.go` wires HTTP / gRPC / Noise listeners.
- **`hscontrol/state/state.go` is the central coordinator.** `State` struct,
  constructor `NewState(cfg) (*State, error)` at `state.go:177`. All
  cross-subsystem mutations go through `State` methods:
  `CreateUser` (`:358`), `Connect/Disconnect` (`:572`, `:610`),
  `SetNodeExpiry`, `SetNodeTags`, `SetApprovedRoutes`,
  `UpdateNodeFromMapRequest` (the big sync point, at `state.go:2351` per
  AGENTS.md; the file is huge).
- **`hscontrol/state/node_store.go` is a copy-on-write in-memory cache.**
  `NodeStore` struct (`node_store.go:104`) holds
  `data atomic.Pointer[Snapshot]` plus a single `writeQueue chan work`.
  Reads (`GetNode`, `ListPeers`) are pointer loads — no lock. Writes
  (`PutNode` `:188`, `UpdateNode` `:227`, `UpdateNodes` `:245`,
  `DeleteNode` `:271`) push to the queue; a single writer goroutine
  (`processWrite` `:347`, `applyBatch` `:393`) drains in batches and
  atomically swaps a new `Snapshot` (`:140`) in. `snapshotFromNodes` (`:576`)
  rebuilds derived indices (peers, routes, primary route election).
- Auth handshake: `hscontrol/noise.go`. `NoiseUpgradeHandler` at `noise.go:96`
  upgrades an HTTP `/ts2021` request to a Tailscale Noise session via
  `controlhttpserver.AcceptHTTP` (`:120`), then mounts a chi router serving
  `/machine/register` → `RegistrationHandler` (`:736`) and
  `/machine/map` → `PollNetMapHandler` (`:696`) inside the Noise tunnel.
- Registration: `hscontrol/auth.go:26` `handleRegister(ctx, req, machineKey)`.
  Three branches:
  - Past-expiry → logout (`:34`, calls `handleLogout` at `:148`).
  - `req.Auth.AuthKey != ""` → `handleRegisterWithAuthKey` (`:357`), which
    calls `state.HandleNodeFromPreAuthKey`.
  - Otherwise interactive: `handleRegisterInteractive` (`:421`) → mints
    an `AuthID`, returns `RegisterResponse.AuthURL` for the user to visit
    (OIDC or CLI command). Client polls via `waitForFollowup` (`:269`) on
    a per-AuthID `AuthRequest.WaitForAuth()` channel until the operator
    accepts via `headscale nodes register` or the OIDC callback fires.
- PreAuth keys: `hscontrol/db/preauth_keys.go:47` `CreatePreAuthKey(tx, uid,
  reusable, ephemeral, expiration, aclTags) (*PreAuthKeyNew, error)`. Generates
  `hskey-auth-{12-char prefix}-{64-char secret}`. Stores bcrypt hash of the
  secret only — the plaintext is shown once at creation (`:122`).
- Data plane vs control plane: **fully separate**. Headscale is *only* the
  control plane (registration, peer-map distribution, ACL evaluation).
  Data plane is WireGuard peer-to-peer + DERP relays. DERP relays
  (`hscontrol/derp/`) are optional separate processes; headscale ships their
  addresses in the `DERPMap` it includes in every `MapResponse`.
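
The `NodeStore` shape (lock-free snapshot reads, one writer draining a bounded queue in batches) translates to a small Python sketch. This is an approximation under stated assumptions: the class and method names mirror the notes above but the code is not a port, and it relies on CPython attribute assignment being a single reference swap in place of `atomic.Pointer`.

```python
import queue
import threading
from types import MappingProxyType
from typing import Optional


class NodeStore:
    """Readers load an immutable snapshot; one writer thread publishes new ones."""

    def __init__(self, batch_size: int = 16) -> None:
        self._snapshot = MappingProxyType({})
        self._writes: queue.Queue = queue.Queue(maxsize=1024)  # bounded write queue
        self._batch_size = batch_size
        self.batches_applied = 0
        threading.Thread(target=self._writer, daemon=True).start()

    def get(self, node_id: str) -> Optional[dict]:
        return self._snapshot.get(node_id)  # no lock: just a reference load

    def snapshot(self) -> MappingProxyType:
        return self._snapshot

    def put(self, node_id: str, value: Optional[dict]) -> None:
        done = threading.Event()
        self._writes.put((node_id, value, done))
        done.wait()  # block until the batch containing this write is published

    def delete(self, node_id: str) -> None:
        self.put(node_id, None)  # None is the delete marker in this sketch

    def _writer(self) -> None:
        while True:
            batch = [self._writes.get()]
            while len(batch) < self._batch_size:
                try:
                    batch.append(self._writes.get_nowait())
                except queue.Empty:
                    break
            nodes = dict(self._snapshot)  # copy; the published map is never mutated
            for node_id, value, _ in batch:
                if value is None:
                    nodes.pop(node_id, None)
                else:
                    nodes[node_id] = value
            self._snapshot = MappingProxyType(nodes)  # publish in one assignment
            self.batches_applied += 1
            for _, _, done in batch:
                done.set()


store = NodeStore()
store.put("node-1", {"online": True})
before = store.snapshot()
store.put("node-2", {"online": False})
store.delete("node-1")
```

A reader that grabbed `before` keeps a consistent view even though later writes have replaced the published snapshot. The copy is shallow, so stored values should be treated as immutable.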

### NAT traversal

- **Headscale itself does NOT implement hole-punching.** NAT traversal is
  client-side, in the Tailscale client, using Tailscale's own DISCO protocol.
- What headscale provides:
  - DERP relay list (`hscontrol/derp/derp.go:25` `loadDERPMapFromPath`,
    `:44` `loadDERPMapFromURL`, `:76` `mergeDERPMaps`). DERP is a fallback
    TCP-relay for "we can't punch through" — basically a meet-in-the-middle
    server.
  - Endpoint distribution: each tailscale client reports its discovered
    candidates in `MapRequest.Endpoints` and `Hostinfo.NetInfo`
    (`hscontrol/state/maprequest.go:15` `netInfoFromMapRequest` — picks new
    NetInfo if present, falls back to last-seen).
- `hscontrol/derp/server/` — embedded DERP server (optional). For most
  deployments operators point at Tailscale-operated DERP servers.

### Identity / node-join UX

Two flows:

1. **PreAuth key (machine-to-machine, scripted):**
   - Operator: `headscale preauthkeys create -u <user>` →
     `db/preauth_keys.go:47` `CreatePreAuthKey` returns
     `hskey-auth-XXXXXXXXXXXX-{64chars}`.
   - Node: `tailscale up --login-server https://headscale.example.com --authkey
     hskey-auth-...` — single command, done.
   - Server: `auth.go:357` `handleRegisterWithAuthKey` validates, calls
     `state.HandleNodeFromPreAuthKey` (in `state.go`, around `:2280` for the
     re-register path I read). Creates node, assigns IPv4+IPv6 from the
     `IPAllocator`, returns `MachineAuthorized: true`.

2. **Interactive (human-in-the-loop):**
   - Node: `tailscale up --login-server https://headscale.example.com` — no key.
   - Server: `auth.go:421` `handleRegisterInteractive` mints `AuthID`, returns
     `AuthURL` that prints in the tailscale client: "Open this URL to
     register: https://headscale.example.com/register/nodekey:..."
   - Operator runs `headscale nodes register --user X --key nodekey:...` to
     accept. The pending `AuthRequest` (held in an LRU cache, see
     `state.go:188`) gets `FinishAuth(AuthVerdict)` and the client's
     `waitForFollowup` (`auth.go:269`) unblocks.

- Cap on unauthenticated cache fill: LRU bounded by `RegisterCacheMaxEntries`
  (`state.go:183`).
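
The interactive flow boils down to "park a pending request in a bounded cache, block the client until someone decides". A minimal sketch, assuming hypothetical `AuthRequest` and `PendingAuth` classes; eviction here is oldest-first by insertion, which is a simplification of a true LRU.

```python
import threading
from collections import OrderedDict
from typing import Optional


class AuthRequest:
    def __init__(self) -> None:
        self._done = threading.Event()
        self.verdict: Optional[bool] = None

    def finish(self, approved: bool) -> None:
        self.verdict = approved
        self._done.set()

    def wait(self, timeout: float) -> Optional[bool]:
        """Return the verdict, or None if nobody decided within the timeout."""
        return self.verdict if self._done.wait(timeout) else None


class PendingAuth:
    """Bounded map of auth id -> pending request; oldest entries are evicted."""

    def __init__(self, max_entries: int) -> None:
        self._max = max_entries
        self._items: "OrderedDict[str, AuthRequest]" = OrderedDict()
        self._lock = threading.Lock()

    def create(self, auth_id: str) -> AuthRequest:
        with self._lock:
            req = AuthRequest()
            self._items[auth_id] = req
            while len(self._items) > self._max:
                self._items.popitem(last=False)  # evict the oldest pending request
            return req

    def approve(self, auth_id: str) -> bool:
        with self._lock:
            req = self._items.pop(auth_id, None)
        if req is None:
            return False
        req.finish(True)
        return True


pending = PendingAuth(max_entries=2)
first = pending.create("auth-1")
second = pending.create("auth-2")
third = pending.create("auth-3")  # cache is full, so auth-1 is evicted

# Simulate the operator approving shortly after the client starts waiting.
threading.Timer(0.05, pending.approve, args=("auth-3",)).start()
verdict = third.wait(timeout=2.0)
evicted_verdict = first.wait(timeout=0.01)
```

The bound is what keeps unauthenticated clients from filling memory: a flood of registration attempts only pushes out older pending requests, which then time out on the client side.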

### Peer-state sync

- **Long-poll over Noise:** `noise.go:174` mounts `POST /machine/map` →
  `PollNetMapHandler`. The map session (`hscontrol/poll.go:31` `mapSession`)
  holds a `chan *tailcfg.MapResponse`, a keepAlive ticker (`:24`, 50s +
  jitter), and stays open for hours/days. Each `MapResponse` is a full or
  partial view of the tailnet.
- **Streaming batcher distributes updates:** `hscontrol/mapper/batcher.go:39`
  `NewBatcher(batchTime, workers, mapper)`. Per-node connection state in
  `mapper/node_conn.go:38` `multiChannelNodeConn` — one entry per active
  long-poll, with `pending []change.Change` and a `workMu` mutex to serialize
  in-order delivery. `Batcher.AddWork(r ...change.Change)` (`batcher.go:377`)
  is the fanout entry. Workers (`worker` `:449`, `addToBatch` `:549`,
  `processBatchedChanges` `:622`) coalesce changes into one `MapResponse` per
  poll session per batch tick.
- `change.Change` is a typed event (NodeAdded, PolicyChange, RouteChange, etc.).
  Comes out of `State` methods: `state.go` returns `change.Change` from every
  mutating method so callers know what to broadcast.
- Online status: `IsOnline` is set exclusively in `State.Connect`/`Disconnect`
  (`state.go:572`, `:610`), not derived from polling.
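
Typed change events plus a coalescing batcher, as a rough sketch. The names follow the notes above, but the change kinds, the response shape, and the "policy change forces a full update" rule are illustrative assumptions rather than headscale's behavior.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    kind: str          # e.g. "node_added", "policy_changed" (illustrative names)
    node_id: str = ""  # empty for changes that are not about a single node


class Batcher:
    """Queues typed changes per connected node and flushes one response per tick."""

    def __init__(self) -> None:
        self._pending: dict[str, list[Change]] = defaultdict(list)
        self._connected: set[str] = set()

    def connect(self, node_id: str) -> None:
        self._connected.add(node_id)

    def add_work(self, *changes: Change) -> None:
        # Fan out: every connected session gets every change queued.
        for node_id in self._connected:
            self._pending[node_id].extend(changes)

    def tick(self) -> dict[str, dict]:
        """Coalesce everything queued since the last tick into one message per node."""
        out = {}
        for node_id, changes in self._pending.items():
            if not changes:
                continue
            out[node_id] = {
                # Assumption: a policy change invalidates everything.
                "full": any(c.kind == "policy_changed" for c in changes),
                "peers_changed": sorted({c.node_id for c in changes if c.node_id}),
            }
        self._pending.clear()
        return out


b = Batcher()
b.connect("n1")
b.connect("n2")
b.add_work(Change("node_added", "n3"))
b.add_work(Change("node_added", "n4"), Change("policy_changed"))
responses = b.tick()
```

Three changes arrive in two calls, yet each session receives a single response per tick. The producer only states what changed; the batcher decides who hears about it and when.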

### Borrow

- **Copy-on-write in-memory store with single-writer goroutine** — exactly the
  shape we want for SPT's per-perch registry. `NodeStore` reads are lock-free
  (`atomic.Pointer` swap on a `Snapshot` struct); all writes funnel through
  one `chan work` and get batched. Bounded queue (`writeQueue`), bounded
  batch size, atomic publish. This is the cleanest implementation of the
  pattern I've read recently — directly applicable if SPT ever has hot reads
  contending with hot writes on the same data.
- **Typed `change.Change` events as the bus between layers**: every State
  method returns "here's what observers should know about", and a separate
  batcher decides who to notify and when. SPT's poll-driven model could
  benefit from the same explicit-event-type discipline rather than diffing
  state in the receiver.
- **PreAuthKey UX**: `hskey-auth-{prefix}-{secret}`, prefix is plaintext (for
  log/audit), secret is bcrypt-hashed. Single-shot vs reusable flag, optional
  expiry, optional ACL tag binding. Maps cleanly to "SPT one-shot
  joining-token" if we ever need cross-host bootstrap.
- **Two-mode registration (machine token OR interactive accept)**: critical
  for both scripted (CI, containers) and human (laptop) flows. The
  `waitForFollowup` channel pattern (`auth.go:269`) — where the client polls
  while the operator approves — is exactly how SPT's `live` agents could
  request operator approval before joining a shared namespace.
- **Long-poll-with-jitter keepalive** (`poll.go:24`, 50s base + up to 9s
  jitter): avoids the thundering herd of N clients all keepalive-ing at the
  same second. Useful pattern for SPT's poll loops.
- **DERP-as-optional-relay separation**: relay is a different binary, deployed
  by people who need it, advertised through the control plane. The control
  plane never proxies data. SPT should keep its "messages flow point-to-point;
  the registry never sees payload" invariant.
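
The jittered keepalive is a one-liner worth pinning down. A minimal sketch, using the 50s base and 9s maximum jitter quoted above; `keepalive_interval` is a hypothetical helper, and the uniform distribution is an assumption.

```python
import random
from typing import Optional


def keepalive_interval(base_secs: float = 50.0, max_jitter_secs: float = 9.0,
                       rng: Optional[random.Random] = None) -> float:
    """Return the base interval plus a uniform random jitter in [0, max_jitter_secs]."""
    rng = rng or random.Random()
    return base_secs + rng.uniform(0.0, max_jitter_secs)


# Seeded only so the example is repeatable; real callers would not pass a seed.
rng = random.Random(42)
intervals = [keepalive_interval(rng=rng) for _ in range(1000)]
```

With N clients each drawing their own interval, keepalives spread over a 9s window instead of landing on the same second.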

### Avoid

- **Long-lived HTTP server holding tens of thousands of open streams** is
  fundamentally incompatible with SPT's no-daemon model. The
  `multiChannelNodeConn` machinery exists *because* there are persistent
  poll sessions to manage. SPT's poll loop is opposite: short-lived process
  reads, exits.
- **GORM + SQLite + 30+ migration files** with a frozen migrations array
  (`AGENTS.md` "Database Migration Rules"): the operational cost of
  schema-as-code at the coordinator is significant. SPT's per-perch
  `info.json` + spool DB is fine; do not centralize.
- **OIDC integration** (`hscontrol/oidc.go`, `oidc_*_test.go`): adds an
  external dependency (an OIDC provider) for human registration. SPT can
  stay with filesystem-permission-based identity (POSIX UID / Windows ACL on
  the runtime root).
- **Policy engine with autogroups, tag owners, route approvals, SSH grants**
  (`hscontrol/policy/v2/`): tailscale's ACL feature surface. Massive scope,
  only justified by "you need a real VPN ACL". SPT's access control is
  filesystem perms; do not invent a DSL.
- **Reimplementing someone else's wire protocol** (`tailcfg.MapRequest` etc.):
  headscale is forever chasing Tailscale client compatibility. SPT controls
  both ends of its protocol — never accept that burden.
- **Bcrypt for preauth tokens at this volume**: fine for headscale's 1 RPS
  registration rate; would be a measurable issue at SPT's "many short-lived
  processes per second" cadence. Use HMAC-with-prefix-lookup instead.
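
One way the HMAC-with-prefix-lookup alternative could look. This is a sketch, not a vetted design: `TokenStore` and the `spt-join-` token format are hypothetical, and the store is an in-memory dict.

```python
import hashlib
import hmac
import secrets


class TokenStore:
    """Stores HMAC-SHA256 digests of token secrets, indexed by plaintext prefix."""

    def __init__(self, server_key: bytes) -> None:
        self._server_key = server_key
        self._digests: dict[str, bytes] = {}

    def _digest(self, secret: str) -> bytes:
        return hmac.new(self._server_key, secret.encode(), hashlib.sha256).digest()

    def create(self) -> str:
        prefix = secrets.token_hex(6)   # 12 hex chars, safe to log
        secret = secrets.token_hex(32)  # 64 hex chars, shown to the operator once
        self._digests[prefix] = self._digest(secret)
        return f"spt-join-{prefix}-{secret}"  # hypothetical token format

    def verify(self, token: str) -> bool:
        parts = token.split("-")
        if len(parts) != 4 or parts[:2] != ["spt", "join"]:
            return False
        prefix, secret = parts[2], parts[3]
        expected = self._digests.get(prefix)  # O(1) lookup by plaintext prefix
        if expected is None:
            return False
        # Constant-time comparison of the stored and recomputed digests.
        return hmac.compare_digest(expected, self._digest(secret))


tokens = TokenStore(server_key=secrets.token_bytes(32))
token = tokens.create()
# Flip the last character to simulate a forged secret with a valid prefix.
forged = token[:-1] + ("0" if token[-1] != "0" else "1")
```

A fast keyed hash is acceptable here because the secret is 256 bits of randomness, so there is nothing to brute-force; a deliberately slow hash like bcrypt mainly protects low-entropy passwords.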

---

## Cross-cluster patterns

### Shared design choices

- **Discovery is always "coordinator holds last-seen endpoints; peers learn
  each other's endpoints from it".** Nebula does it in-band over Noise with
  lazy queries; innernet does it via polled HTTP-over-WG; headscale does it
  via pushed map responses over Noise-over-HTTP. None of them gossip peer
  state between peers: all three rely on the coordinator (lighthouse / server
  / control plane) as the rendezvous point.
- **Eventual consistency with periodic push + on-demand pull.** Nebula:
  10s push + lazy query. Innernet: poll-interval pull only. Headscale: long-poll
  push from server + on-change pull. SPT's spool+poll is closer to innernet.
- **NAT traversal is "give both ends the candidate list, let them race"**,
  with no STUN/ICE/TURN orchestration at the coordinator. Nebula uses 1-byte
  UDP punches as the cheapest signal. Innernet doesn't even punch, just
  iterates candidates with a 5s timer. Headscale delegates entirely to
  client-side DISCO.
- **Identity bootstrap is always offline → online**: someone creates a
  credential out-of-band (CA-signed cert / invitation TOML / preauth key) and
  hand-delivers it. None of the three has self-service registration without
  an admin step. SPT should follow this — never auto-trust a new perch.
- **Control traffic always rides an authenticated, encrypted channel, but its
  coupling to the data plane varies.** Nebula: in-band protobuf inside the
  same Noise tunnels as data. Innernet: HTTP inside the WG tunnel. Headscale:
  fully separate from the data plane (Noise over HTTP, then HTTP/2 for the
  map poll).
- **Single-writer + many-reader concurrency** is the dominant pattern at the
  coordinator. Nebula uses `sync.RWMutex` + `atomic.Pointer` for hot config.
  Headscale uses `atomic.Pointer[Snapshot]` + a single-goroutine batcher.
  Innernet is the outlier: it serializes all SQLite access behind
  `Arc<Mutex<Connection>>`. SPT's perch registry already uses similar
  isolation; the headscale `NodeStore` pattern is the most ambitious and
  cleanest of the three.

### Relevance ranking for SPT (highest first)

1. **headscale `NodeStore` copy-on-write pattern** (`node_store.go`). Directly
   applicable shape for any hot-read SPT registry data. ~900 LOC, very legible.
2. **innernet invitation-TOML + redeem pattern**. The simplest, cleanest
   "trust bootstrap" UX I've seen. If SPT ever does cross-machine perches,
   this is the model.
3. **nebula lighthouse → both-sides-punch signaling** (`lighthouse.go:1187`,
   `:1401`). Smallest viable NAT-rendezvous design if SPT ever needs it.
   Not needed for the localhost-only case we're in today.
4. **headscale typed `change.Change` events + batcher fan-out**. A clean
   replacement for the "subscriber diffs state every poll" pattern SPT uses.
   Lower priority — only worth adopting once polling cost matters.
5. **innernet `expired_invite_sweeper` + nebula `RegisterReloadCallback`**:
   both are tiny, single-purpose cleanup/reload hooks. SPT already does similar
   things ad hoc; making them explicit as "one small task per concern" might
   improve readability.

### Hard "do not borrow" list

- Long-lived daemon, kernel network device, root requirement (all three).
- CA hierarchy with operator-managed offline key (nebula).
- Single mandatory coordinator process (innernet, headscale).
- ACL DSL with policy compilation (headscale).
- Bcrypt'd tokens on a hot validation path (headscale pays this only at
  registration; SPT would pay it on every short-lived process).
- GORM + 30+ migrations + frozen-migration-array discipline (headscale).
- WireGuard / Noise / DERP wire protocol surface (all three) — SPT controls
  both sides of its TCP+spool protocol; do not import someone else's.
