# 14 - Arda as OAuth 2.0 Authorization Server (REVISED)

**Status: NOT STARTED**

**Supersedes:** `SPEC.md` in this folder. This revision resolves the ship-blockers raised in the engineering review: (1) IP allowlist interaction with admin_api's EC2 security group, (2) arda-logout vs OAuth-token lifecycle independence. It also folds in smaller review wins: code minting moved into arda (no service hop), concrete CSRF design, per-endpoint feature flags, and a refresh-rotation schema for reuse detection.

## Summary

Turn arda into an OAuth 2.0 authorization server so external apps can let Bigscreen users sign in via arda and then call admin_api on their behalf. Flow: external app redirects to arda → arda authenticates the user (existing login flow) and shows a consent screen → arda issues a one-time authorization code → external app exchanges it at `/oauth/token` for an access token + refresh token → external app uses the access token as `Authorization: Bearer` against admin_api and can learn the user's identity via `/oauth/userinfo`.

**Design principle: purely additive.** The existing first-party JWT access/refresh token system, login flow, `allowed_apps.json` service-to-service keys, and `x-access-token` + app-key double-auth on admin_api all stay unchanged. OAuth is a new code path alongside them; existing v2 tokens keep working indefinitely.

**Scope constraints:**
- **Internal / trusted-partner apps only** for v1 — SuperUser-managed clients via a small dev portal in arda. No self-service developer signup, no review queue.
- **Users sign in via the external app** — `/oauth/userinfo` ships in v1 so apps learn the authenticated user's identity. Full OIDC with `id_token` minting deferred.
- **First consumer: factory/fabricator tool** — initial scopes target `fabricator:*`, `orders:*`, plus identity scopes.

---

## 🔑 Key architectural decisions from review

These answers shape the rest of the spec; read them first.

### 1. IP allowlist: OAuth clients must have a whitelisted static server IP

admin_api runs on EC2 behind a security group that only admits known IP addresses. This is not relaxed for OAuth traffic — OAuth only adds *user-delegated identity* on top of the existing network perimeter; it does not replace the perimeter.

Therefore:

- **Every OAuth client must declare one or more static server IP addresses** (typically an AWS Elastic IP or equivalent static NAT egress) at registration time.
- **The admin_api security group is updated whenever a client is registered** to admit that IP for port 443.
- **The OAuth client registration flow has a prerequisite manual step**: the SuperUser registering a client must file a security-group-update request (or self-serve via the AWS console if they have access). Client creation in the dev portal is gated on providing the IP; this spec assumes the IP is whitelisted out-of-band before the client's first `/oauth/token` call.
- **Public/native clients cannot use this OAuth provider for v1.** Mobile apps, browser SPAs, and desktop apps don't have stable server IPs. Only confidential server-side clients are supported. The `clientType` column keeps the `'public'` option for forward compatibility, but the dev portal will reject `clientType=public` creations until a future phase relaxes the IP-allowlist constraint.
- **At runtime, admin_api enforces the IP allowlist first (via the EC2 security group), then OAuth enforces identity and scope.** If an OAuth token is presented from a non-whitelisted IP, the request is dropped at the network layer before OAuth code runs — so no "token leaked but blocked by IP" branch in application code. This is correct defense-in-depth.
- **Registration audit row** on client creation includes the declared `serverIps[]` and the AWS security-group rule ID for traceability. If the client's IP changes, a new audit row is written.

Implication for the factory tool (first consumer): it runs on an AWS instance with a persistent EIP; that EIP goes in the security group when we register `fabricator_tool` as an OAuth client.
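
The dev portal has to reject malformed `serverIps` entries before anyone files a security-group request. A minimal sketch of that check follows; `isValidServerIpEntry` and `validateServerIps` are hypothetical names, not existing helpers, and the sketch assumes entries are either a bare IPv4/IPv6 address or an address with a CIDR prefix.

```typescript
import { isIP } from "node:net";

// Hypothetical helper: accepts "203.0.113.10" or "203.0.113.0/28" style entries.
function isValidServerIpEntry(entry: string): boolean {
    const [address, prefix, ...rest] = entry.trim().split("/");
    if (rest.length > 0 || address === undefined) return false;
    const family = isIP(address); // 0 = not an IP, 4 = IPv4, 6 = IPv6
    if (family === 0) return false;
    if (prefix === undefined) return true; // bare address, treated as a single host
    if (!/^\d{1,3}$/.test(prefix)) return false;
    return Number(prefix) <= (family === 4 ? 32 : 128);
}

// Hypothetical registration-time check: at least one entry, all well-formed.
function validateServerIps(serverIps: string[]): string[] {
    if (serverIps.length === 0) {
        throw new Error("serverIps must contain at least one entry");
    }
    const invalid = serverIps.filter((entry) => !isValidServerIpEntry(entry));
    if (invalid.length > 0) {
        throw new Error(`invalid serverIps: ${invalid.join(", ")}`);
    }
    return serverIps.map((entry) => entry.trim());
}
```

This only checks syntax. Whether the address is actually admitted by the security group remains the out-of-band step described above.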

### 2. Arda logout does NOT invalidate OAuth tokens

The two lifecycles are independent:

- **Arda login session** — created on `POST /auth/login`, bound to `userSessionId` in v2 JWTs. Arda cookie auth. Deactivated on `GET /auth/logout`, which calls `AuthDatabase.deactivateAccessToken(hashOfArdaAccessToken)` and removes the refresh token from `RefreshTokenWhitelist`.
- **OAuth grant session** — created on successful `POST /oauth/token` code exchange, bound to a **new** `oauthGrantId` in v3 JWTs (NOT reused from arda's `userSessionId`). Tracked in Redis set `oauth:user_client_tokens:<userId>:<clientId>`. Deactivated only by: (a) user revoking the grant in arda's `/settings/connected-apps`, (b) admin disabling the client, (c) refresh token rotation consuming the old refresh, (d) natural expiry.

**Concrete implementation:**

The v3 OAuth token `sub` uses `oauthGrantId`, not `userSessionId`:

```ts
sub: {
    version: "v3",
    bigscreenAccountId,
    oauthGrantId,        // NEW: separate from arda's userSessionId; stays valid past arda logout
    grantType: "oauth_user",
    clientId,
    scope: ["fabricator:read", "orders:read"],
    authTime: 1713100000
}
```

`oauthGrantId` is a UUID that arda generates when it mints the authorization code (step 4 of the flow below), persists on the `oauth_grants` row, and passes to auth-api inside the code payload. It is what admin_api validates against when the token arrives. The arda `userSessionId` that existed when the user consented is **not** carried into the OAuth token at all; this is the key design choice that decouples the lifecycles.

`AuthDatabase.deactivateAccessToken` and `deactivateRefreshToken` operate per-token-hash, not per-session-id. Arda logout only removes arda-issued hashes. The new Redis set `oauth:user_client_tokens:<userId>:<clientId>` tracks OAuth token hashes separately; arda logout never touches it.

**Side effect:** if a user logs out of arda and then logs back in as a *different* Bigscreen user on the same browser, the first user's OAuth tokens (issued to third-party apps) remain valid — because those tokens belong to that user, not to the browser session. This is correct OAuth semantics: tokens are bound to users, not to arda browser sessions.

---

## Current State

No OAuth infrastructure exists. Thorough search of the monorepo confirmed:

- No `oauth2-server`, `@node-oauth/oauth2-server`, `oidc-provider`, or `openid-client` dependencies in any `package.json`.
- `passport` appears transitively in `webapps/package.json` but is not imported anywhere.
- The only existing OAuth usage is as a **client** to Brightcove in `api/src/media/Brightcove.ts:26` (`grant_type=client_credentials`).
- No database tables for OAuth clients, grants, authorization codes, or scopes.

### Primitives we already have (and will reuse)

| Primitive | Location | Reuse |
|---|---|---|
| RS256 JWT signing + verification | `auth/Auth.ts:138-220` | v3 tokens reuse the same signing machinery with an extended `sub` |
| Access-token whitelist (Redis) | `auth/AuthDatabase.ts:103-116` (set `accessToken:{accountId}`) | Reuse for OAuth access-token revocation |
| Refresh-token whitelist (Firestore) | `auth/AuthDatabase.ts:63-85` (`RefreshTokenWhitelist`) | Reuse for OAuth refresh tokens, with extra fields for rotation chain |
| Renewal nonces (Redis) | `auth/AuthDatabase.ts:87-100` | Parallel pattern for `oauth:code:*` Redis keys |
| JWT payload dispatch | `auth/Auth.ts:103-117` (`Object.assign(this, payload.sub)`) | Extending `sub` with new fields flows through automatically |
| Admin_api double auth | `apps/admin_api/admin_api.ts:43-56` (`verifyAdminRequest`) | Rewrite to detect OAuth bearer and skip app-key check |
| AccessPolicy enforcement | `auth/AuthApi.ts:345` (`getAccessPolicyHandler`) | Extend with parallel `requireScopeAndPolicy` |
| Arda login flow | `webapps/src/server/api.js:98-132` (`login`), `webapps/src/components/Auth/Login.jsx` | Extend to honor a `returnTo` query param |
| Arda Redis access | `@bigscreen/lib` `RedisClient()` (already used throughout) | Arda mints authorization codes directly into Redis — no service hop |
| Arda Express routing | `webapps/arda/arda.js` | Add `/oauth/authorize` routes before the line 87 catch-all |
| Postgres pool | `lib/PostgresDatabase.ts:251` (`Postgres.getFabricatorClient()`) | OAuth tables on the fabricator pool |
| Semantic UI + Pug templates | `webapps/src/components/**`, `webapps/arda/views/index.pug` | Consent screen and dev portal UI |

---

## Architecture

### Service boundaries

| Service | Port | OAuth role | New endpoints |
|---|---|---|---|
| arda (Express + React) | 3010 | Authorization server + authorization-code minting | `GET /oauth/authorize`, `POST /oauth/authorize/decision`, `GET /developers/*`, `GET /settings/connected-apps` |
| auth-api (`apps/api/api.ts`) | 3009 | Token issuer + token inspection | `POST /oauth/token`, `POST /oauth/revoke`, `POST /oauth/introspect`, `GET /oauth/userinfo`, `GET /.well-known/jwks.json`, `GET /.well-known/oauth-authorization-server` |
| admin_api (`apps/admin_api/admin_api.ts`) | 3999 | Resource server + OAuth client CRUD | `/admin/oauth/clients/*` (SuperUser CRUD); existing routes enforce OAuth scope via new middleware |

**Change from SPEC.md:** the `POST /oauth/codes` service-to-service endpoint on auth-api is REMOVED. Arda mints authorization codes directly into its own Redis connection. Codes are 32 random bytes stored under `oauth:code:<code>` with a 60s TTL; they don't need cryptographic signing or access to the RS256 private key, so there's no architectural reason to cross a service boundary. This removes a network hop and an additional failure mode.

### Authorization Code + PKCE flow (end-to-end)

```
1. External app (factory tool, running on whitelisted-IP server) →
   https://arda.bigscreencloud.com/oauth/authorize
     ?response_type=code
     &client_id=fabricator_tool
     &redirect_uri=https://tool.example.com/cb
     &scope=fabricator:read%20orders:read
     &state=<random>
     &code_challenge=<sha256-b64url(verifier)>
     &code_challenge_method=S256

2. Arda validates:
     - oauth_clients row exists, not disabled, clientType=confidential (v1)
     - redirect_uri exact-match in client.redirectUris
     - requested scopes ⊆ client.allowedScopes
     - state present
     - code_challenge present + method=S256
   If cookie missing → 302 /login?returnTo=<same-origin-validated>. Login flow returns here.
   If a non-revoked oauth_grants row for (userId, clientId) already covers requested scope → skip to step 4.

3. Arda renders Pug consent page. Sets double-submit CSRF cookie `oauthCsrf=<random>`.
   User clicks Allow → POST /oauth/authorize/decision (body includes hidden input
   `_csrf=<sameValue>` + original request params re-submitted).
   Server validates cookie==form value; upserts oauth_grants; writes audit row.

4. Arda mints code directly:
     - code = crypto.randomBytes(32).toString("base64url")
     - oauthGrantId = uuid()  (NEW: bound to this grant, persisted on oauth_grants row)
     - SET oauth:code:<code> EX 60 NX {clientId, userId, oauthGrantId, redirectUri,
       scope, codeChallenge, codeChallengeMethod, authTime}
   Arda 302s to redirect_uri?code=<code>&state=<state>.

5. External app backend → POST https://api.bigscreencloud.com/oauth/token
     grant_type=authorization_code
     &code=<code>
     &redirect_uri=...
     &client_id=...
     &code_verifier=...
   Confidential clients additionally authenticate via HTTP Basic with client_secret.

6. Auth-api /oauth/token:
     a. GETDEL oauth:code:<code>  (atomic one-time-use)
     b. Verify redirect_uri + client_id match stored
     c. Verify PKCE: base64url(sha256(code_verifier)) === codeChallenge
     d. Bcrypt-compare client_secret
     e. Load BigscreenAccount; reject if banned; check user still holds
        at least one required policy for each requested scope
     f. Mint v3 OAuth access token + refresh token
        (sub includes oauthGrantId, NOT userSessionId)
     g. AuthDatabase.activateAccessToken + activateRefreshToken (OAuth-tagged)
     h. SADD oauth:user_client_tokens:<userId>:<clientId> <tokenHash>
        (for per-client revocation; independent of arda's accessToken:<userId> set)
     i. Return { access_token, token_type:"Bearer", expires_in:900,
                 refresh_token, scope }
        (the rotation chainId is stored server-side only and is never
        returned to the client; see §Refresh rotation)

7. External app (from its whitelisted IP) → admin_api:
     Authorization: Bearer <oauth_access_token>
     (No x-access-token, no app key. EC2 security group allows the IP.)

8. admin_api verifyAdminRequest detects grantType=oauth_user in the JWT sub,
   loads OAuth context (client enabled, oauth_grants row for oauthGrantId not revoked,
   grant's scope still covers token's claimed scope — handles mid-session revoke),
   enforces scope via requireScopeAndPolicy,
   enforces AccessPolicy identically to first-party path.
```
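
Step 6c is easy to get subtly wrong, because the stored challenge is the base64url encoding of the SHA-256 digest rather than a hex digest. The sketch below shows one way to do the comparison using only `node:crypto`; the function names are illustrative and do not exist in the codebase.

```typescript
import { createHash, timingSafeEqual } from "node:crypto";

// S256 transform from RFC 7636 section 4.2: BASE64URL(SHA256(ASCII(code_verifier))).
function computeS256Challenge(codeVerifier: string): string {
    return createHash("sha256").update(codeVerifier, "ascii").digest("base64url");
}

// Hypothetical helper for step 6c: compare against the challenge stored with the code.
function verifyPkceS256(codeVerifier: string, storedChallenge: string): boolean {
    // RFC 7636 section 4.1: 43 to 128 characters from the unreserved set.
    if (!/^[A-Za-z0-9\-._~]{43,128}$/.test(codeVerifier)) return false;
    const expected = Buffer.from(computeS256Challenge(codeVerifier));
    const actual = Buffer.from(storedChallenge);
    // timingSafeEqual throws on length mismatch, so check lengths first.
    return expected.length === actual.length && timingSafeEqual(expected, actual);
}
```

The verifier/challenge pair in RFC 7636 Appendix B is a convenient fixture for `tests/auth/OAuth.spec.ts`.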

### Token format (JWT v3)

Reuse RS256 signing, `JWT_ACCESS_TOKEN_ISSUER`, existing Redis/Firestore whitelists. Only the `sub` payload changes:

```ts
// Existing v2 (first-party, unchanged):
sub: { version: "v2", bigscreenAccountId, userSessionId }

// New v3 (OAuth only):
sub: {
    version: "v3",
    bigscreenAccountId,
    oauthGrantId,              // NEW: independent from arda's userSessionId.
                                //       Arda logout will NOT invalidate OAuth tokens.
    grantType: "oauth_user",   // discriminator
    clientId: "fabricator_tool",
    scope: ["fabricator:read", "orders:read"],
    authTime: 1713100000
}
```
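
Because v2 and v3 payloads share one `sub` field, consumers need a reliable way to tell them apart. One possible typing is a discriminated union with a type guard, sketched below. The type names and `describeSub` are illustrative, and a real verifier would still validate the decoded payload's shape at runtime before trusting it.

```typescript
// Illustrative types mirroring the two payload shapes above.
type V2Sub = { version: "v2"; bigscreenAccountId: string; userSessionId: string };
type V3OAuthSub = {
    version: "v3";
    bigscreenAccountId: string;
    oauthGrantId: string;
    grantType: "oauth_user";
    clientId: string;
    scope: string[];
    authTime: number;
};
type TokenSub = V2Sub | V3OAuthSub;

// Type guard: true only for OAuth-issued tokens.
function isOAuthSub(sub: TokenSub): sub is V3OAuthSub {
    return sub.version === "v3" && sub.grantType === "oauth_user";
}

// Example consumer: the compiler prevents reading userSessionId on a v3 token.
function describeSub(sub: TokenSub): string {
    return isOAuthSub(sub)
        ? `oauth:${sub.clientId}:${sub.oauthGrantId}`
        : `first-party:${sub.userSessionId}`;
}
```

With this shape, code that handles an OAuth token cannot accidentally reach for `userSessionId`, which reinforces the lifecycle separation at compile time.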

Access token TTL stays at 15 minutes. Refresh token TTL **reduced to 30 days** for OAuth (vs first-party's 3 months) — refresh rotation is enforced, so the leaked-token blast radius is already limited, but 30 days is a more responsible default for delegated access and matches common OAuth practice.

### Refresh token rotation (new schema)

v3 OAuth refresh tokens are rotated on every use. The Firestore `RefreshTokenWhitelist` document shape is extended for OAuth records:

```ts
// Existing (v2, first-party):
{ bigscreenAccountId, token }

// New (v3, OAuth):
{
    bigscreenAccountId,
    token,                    // SHA1 hash as today
    version: "v3",
    clientId,
    oauthGrantId,
    chainId: uuid,            // same for all tokens in a rotation chain
    generation: 1,            // increments on each rotation
    createdAt: unix,
    parentHash: null | hash   // previous token in chain; null for the first
}
```

On refresh:
1. Look up current refresh token record by hash; validate `clientId`, `oauthGrantId` match what's in the token JWT.
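2. If already deactivated, this is a **reuse attempt**: deactivate the entire chain (`chainId`), revoke all access tokens in the OAuth grant (remove all members of `oauth:user_client_tokens:<userId>:<clientId>` that were issued under this `oauthGrantId`), write a security audit row `refresh_reuse_detected`, return 401. Detecting this requires deactivated OAuth refresh records to remain queryable (flagged inactive rather than deleted), since a deleted record cannot be told apart from a token that never existed.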
3. Otherwise: mint new access token and new refresh token with `generation+1`, `parentHash = oldHash`, same `chainId` + `oauthGrantId`. Deactivate old refresh token. Activate new.
4. Scope on the new tokens is ⊆ the old refresh token's scope. Narrowing OK; widening rejected.

Reuse detection closes the "attacker stole refresh token and used it before legitimate client could" window.
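
The rotation steps above can be sketched with an in-memory store standing in for Firestore. Everything here is hypothetical (`RefreshChainStore`, the `active` flag, opaque random tokens instead of JWTs), and access-token revocation is left out. The point is only to show rotation, scope narrowing, and chain-wide revocation on reuse.

```typescript
import { createHash, randomBytes, randomUUID } from "node:crypto";

interface RefreshRecord {
    clientId: string;
    oauthGrantId: string;
    chainId: string;
    generation: number;
    parentHash: string | null;
    scope: string[];
    active: boolean; // assumption: flagged rather than deleted, so reuse stays detectable
}

// SHA1 only to mirror the existing whitelist's hashing.
const hashToken = (token: string): string => createHash("sha1").update(token).digest("hex");

class RefreshChainStore {
    private records = new Map<string, RefreshRecord>();
    auditEvents: string[] = [];

    private store(record: RefreshRecord): string {
        const token = randomBytes(32).toString("base64url");
        this.records.set(hashToken(token), record);
        return token;
    }

    issueFirst(clientId: string, oauthGrantId: string, scope: string[]): string {
        return this.store({
            clientId, oauthGrantId, chainId: randomUUID(),
            generation: 1, parentHash: null, scope, active: true,
        });
    }

    rotate(presented: string, clientId: string, requestedScope?: string[]): { refreshToken: string; scope: string[] } {
        const oldHash = hashToken(presented);
        const old = this.records.get(oldHash);
        if (!old || old.clientId !== clientId) throw new Error("invalid_grant");
        if (!old.active) {
            // Reuse attempt: kill every token in the chain, then record the event.
            this.records.forEach((r) => { if (r.chainId === old.chainId) r.active = false; });
            this.auditEvents.push("refresh_reuse_detected");
            throw new Error("invalid_grant");
        }
        const scope = requestedScope ?? old.scope;
        // Narrowing is fine; widening is rejected.
        if (!scope.every((s) => old.scope.includes(s))) throw new Error("invalid_scope");
        old.active = false;
        const refreshToken = this.store({
            ...old, generation: old.generation + 1, parentHash: oldHash, scope, active: true,
        });
        return { refreshToken, scope };
    }

    isActive(token: string): boolean {
        return this.records.get(hashToken(token))?.active === true;
    }
}
```

Replaying the first token after one rotation deactivates the legitimately rotated token as well, which is the intended blast radius: neither the attacker nor the client keeps a working refresh token, and the user has to re-authorize.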

---

## Session independence: what arda logout does and doesn't do

### What arda logout does (unchanged behavior)

- Calls `GET /auth/logout` on auth-api, which:
  - `AuthDatabase.deactivateAccessToken(hashOfArdaAccessToken)` — removes from `accessToken:<userId>` Redis set
  - `AuthDatabase.deactivateRefreshToken(hashOfArdaRefreshToken)` — removes Firestore row
- Clears arda cookies
- Redirects to /login

### What arda logout explicitly does NOT do (new guarantee)

- Does NOT touch `oauth:user_client_tokens:<userId>:<clientId>` — OAuth token hashes stay
- Does NOT invalidate `oauth_grants` rows
- Does NOT deactivate OAuth access tokens (Redis) or OAuth refresh tokens (Firestore)
- Does NOT cause admin_api to 401 on existing OAuth bearer tokens

### Guarantees this preserves

- A user can sign into `fabricator_tool` via OAuth, close arda, log out of arda — and `fabricator_tool`'s access token keeps working until its 15-minute expiry, after which it refreshes normally.
- If the user later signs into arda as a different Bigscreen account on the same browser, it does not affect the first user's OAuth grants.

### How to invalidate an OAuth token (the only ways)

- User visits arda `/settings/connected-apps` and revokes the grant → `oauth_grants.revokedAt` set; all entries in `oauth:user_client_tokens:<userId>:<clientId>` removed from Redis; all OAuth refresh tokens for this (userId, clientId) deactivated in Firestore.
- Admin visits arda `/developers/apps/:clientId` and disables → client-wide equivalent of the above.
- Refresh rotation reuse detected → chain-wide revocation as described above.
- The client calls `POST /oauth/revoke` (RFC 7009) for a token it holds.
- Natural expiry.

---

## Scope model: orthogonal to AccessPolicy

**Rule: scope narrows, never expands.** OAuth routes require BOTH (a) token scope covers the action AND (b) user still holds a matching AccessPolicy. A SuperUser authorizing a narrow scope gets narrow access via that token; a non-SuperUser authorizing a broad scope still can't do what their AccessPolicies don't permit.

### v1 initial scope set

| Scope | Purpose | Required AccessPolicy (any of) |
|---|---|---|
| `openid` | Enables `/oauth/userinfo` identity lookup | any |
| `profile` | Expose username, email, createdAt on userinfo | any |
| `fabricator:read` | Read `/admin/fabricator/*`, `/admin/inventory/*` GET | FabricatorReadOnly, Fabricator, FabricatorAdmin, Admin |
| `fabricator:write` | Mutate fabricator data | Fabricator, FabricatorAdmin, Admin |
| `orders:read` | Read `/admin/shop/*`, `/admin/big_orders/*` | Fabricator, Inventory, Admin, SuperUser |
| `orders:write` | Mutate orders | Fabricator, Inventory, Admin |
| `accounts:read` | Read user profiles (factory contact info lookup) | Moderator, AccountsReadOnly, Admin |

**Deferred:** `accounts:write`, `reports:read`, `network:read`, `admin:all`.

Scope → required-policy mapping lives in a new file `auth/OAuthScopes.ts` exported from `@bigscreen/auth`.
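
As a rough illustration of what that file and the "scope narrows, never expands" rule could look like, the sketch below encodes the table above and checks both conditions. The names `REQUIRED_POLICIES` and `isAllowed` are placeholders, and policies are modeled as plain strings, which may not match how `AccessPolicy` is represented today.

```typescript
type Scope =
    | "openid" | "profile"
    | "fabricator:read" | "fabricator:write"
    | "orders:read" | "orders:write"
    | "accounts:read";

// "any" means no specific AccessPolicy is required beyond being authenticated.
type PolicyRule = readonly string[] | "any";

// Mirrors the v1 scope table; placeholder for the real map in auth/OAuthScopes.ts.
const REQUIRED_POLICIES: Record<Scope, PolicyRule> = {
    "openid": "any",
    "profile": "any",
    "fabricator:read": ["FabricatorReadOnly", "Fabricator", "FabricatorAdmin", "Admin"],
    "fabricator:write": ["Fabricator", "FabricatorAdmin", "Admin"],
    "orders:read": ["Fabricator", "Inventory", "Admin", "SuperUser"],
    "orders:write": ["Fabricator", "Inventory", "Admin"],
    "accounts:read": ["Moderator", "AccountsReadOnly", "Admin"],
};

// Both conditions must hold: the token carries the scope AND the user holds a matching policy.
function isAllowed(required: Scope, tokenScopes: readonly string[], userPolicies: readonly string[]): boolean {
    if (!tokenScopes.includes(required)) return false;
    const rule = REQUIRED_POLICIES[required];
    return rule === "any" || rule.some((policy) => userPolicies.includes(policy));
}
```

For example, an `Admin` whose token only carries `fabricator:read` fails a `fabricator:write` check, and a `FabricatorReadOnly` user fails it even if the token somehow carries `fabricator:write`.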

---

## Database Schema

New migration file `apps/db_setup/oauth_db_setup.ts`, mirroring the style of `beyond_db_setup.ts`. Tables live on admin_api's Postgres fabricator pool.

```sql
CREATE TABLE oauth_clients (
    "uniqueId"         uuid PRIMARY KEY DEFAULT uuid_generate_v4(),
    "clientId"         VARCHAR(48)  NOT NULL UNIQUE,
    "clientSecretHash" VARCHAR(128) NOT NULL,          -- bcrypt; required (public clients not supported in v1)
    "clientType"       VARCHAR(16)  NOT NULL,          -- 'confidential' only in v1
    "name"             VARCHAR(128) NOT NULL,
    "description"      TEXT,
    "logoUrl"          VARCHAR(512),
    "homepageUrl"      VARCHAR(512),
    "redirectUris"     TEXT[]       NOT NULL,          -- exact-match validated
    "allowedScopes"    TEXT[]       NOT NULL,
    "serverIps"        TEXT[]       NOT NULL,          -- NEW: whitelisted static server IPs (CIDR allowed)
    "securityGroupRuleIds" TEXT[],                      -- NEW: AWS SG rule IDs for traceability
    "ownerAccountId"   VARCHAR(64)  NOT NULL,
    "createdAt"        BIGINT       NOT NULL,
    "disabledAt"       BIGINT,
    "disabledReason"   TEXT
);
CREATE INDEX idx_oauth_clients_owner ON oauth_clients("ownerAccountId");

CREATE TABLE oauth_grants (
    "uniqueId"      uuid PRIMARY KEY DEFAULT uuid_generate_v4(),
    "oauthGrantId"  uuid        NOT NULL UNIQUE,       -- NEW: bound to tokens issued under this grant
    "userId"        VARCHAR(64) NOT NULL,
    "clientId"      VARCHAR(48) NOT NULL REFERENCES oauth_clients("clientId"),
    "scopes"        TEXT[]      NOT NULL,
    "grantedAt"     BIGINT      NOT NULL,
    "updatedAt"     BIGINT      NOT NULL,
    "revokedAt"     BIGINT,
    UNIQUE ("userId", "clientId")
);
CREATE INDEX idx_oauth_grants_user       ON oauth_grants("userId");
CREATE INDEX idx_oauth_grants_client     ON oauth_grants("clientId");
CREATE INDEX idx_oauth_grants_grant_id   ON oauth_grants("oauthGrantId");

CREATE TABLE oauth_audit_log (
    "uniqueId"       uuid PRIMARY KEY DEFAULT uuid_generate_v4(),
    "at"             BIGINT      NOT NULL,
    "eventType"      VARCHAR(48) NOT NULL,
    "actorAccountId" VARCHAR(64),
    "clientId"       VARCHAR(48),
    "userId"         VARCHAR(64),
    "ip"             VARCHAR(64),
    "details"        jsonb
);
CREATE INDEX idx_oauth_audit_at     ON oauth_audit_log("at" DESC);
CREATE INDEX idx_oauth_audit_client ON oauth_audit_log("clientId", "at" DESC);
```

Audit event types: `client_created`, `client_disabled`, `client_enabled`, `client_updated`, `secret_rotated`, `grant_created`, `grant_updated`, `grant_revoked`, `token_issued`, `token_refreshed`, `token_revoked`, `refresh_reuse_detected`, `auth_denied`, `redirect_mismatch`, `pkce_failure`, `csrf_failure`, `ip_registration_mismatch`, `scope_denied`.

Authorization codes live in **Redis** (key `oauth:code:<code>`, TTL 60s). Redemption is atomic via `GETDEL`.
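
To make the one-time-use and TTL semantics concrete, here is a small in-memory model of the code store. It is a stand-in for Redis, not the real client: `mint` plays the role of `SET ... EX 60 NX`, `redeem` the role of `GETDEL`, and the class name and injectable clock are assumptions made for testability.

```typescript
import { randomBytes } from "node:crypto";

interface CodePayload {
    clientId: string;
    userId: string;
    oauthGrantId: string;
    redirectUri: string;
    scope: string[];
    codeChallenge: string;
    codeChallengeMethod: "S256";
    authTime: number;
}

class AuthorizationCodeStore {
    private entries = new Map<string, { payload: CodePayload; expiresAt: number }>();
    private ttlMs: number;
    private now: () => number;

    constructor(ttlMs: number = 60000, now: () => number = Date.now) {
        this.ttlMs = ttlMs;
        this.now = now;
    }

    // Models SET oauth:code:<code> EX 60 NX: never overwrite an existing key.
    mint(payload: CodePayload): string {
        const code = randomBytes(32).toString("base64url");
        const key = `oauth:code:${code}`;
        if (this.entries.has(key)) throw new Error("code collision");
        this.entries.set(key, { payload, expiresAt: this.now() + this.ttlMs });
        return code;
    }

    // Models GETDEL: the entry is removed whether or not it is still valid.
    redeem(code: string): CodePayload | null {
        const key = `oauth:code:${code}`;
        const entry = this.entries.get(key);
        this.entries.delete(key);
        if (!entry || entry.expiresAt <= this.now()) return null;
        return entry.payload;
    }
}
```

A second `redeem` of the same code returns `null`, which is the behaviour the token endpoint relies on to reject replayed codes.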

---

## Grant Types

Supported:
- **Authorization Code + PKCE (S256)** — the only flow for minting tokens.
- **Refresh Token grant with rotation** — see §Refresh token rotation above.

Not supported:
- Implicit (deprecated), Password grant (deprecated), Client Credentials (covered by existing `allowed_apps.json`), Device Authorization (defer).

---

## CSRF on `/oauth/authorize/decision` (concrete design)

No external library needed; use a double-submit cookie pattern:

1. On `GET /oauth/authorize` that renders consent (i.e., when prior grant is insufficient), arda generates `csrf = crypto.randomBytes(32).toString("hex")`.
2. Arda sets cookie `oauthCsrf=<csrf>; SameSite=Lax; Secure; Path=/oauth/` (the `HttpOnly` attribute is omitted).
3. The consent Pug template includes `<input type="hidden" name="_csrf" value="<csrf>">` and resubmits the original authorize parameters (`client_id`, `redirect_uri`, `scope`, `state`, `code_challenge`, `code_challenge_method`) as hidden fields.
4. On `POST /oauth/authorize/decision`, arda reads both and requires `request.cookies.oauthCsrf === request.body._csrf`. Mismatch → 403 with audit row `csrf_failure`. Clear cookie on successful decision.

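Rationale: double-submit works because an attacker-controlled origin cannot read or set cookies on `arda.bigscreencloud.com`, so it cannot submit a form value that matches the cookie. Leaving the cookie readable by script (no `HttpOnly`) is not what makes this work: the server reads the cookie from the request and renders the same value into the hidden field, so `HttpOnly` could be added without breaking the flow. `SameSite=Lax` protects against most CSRF vectors; the form-value double-check is belt-and-braces.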

This is approximately 30 lines of code in `webapps/src/server/oauth.js`. No dependency added.
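
The core of that code is small. The sketch below shows the token generation, the cookie string, and a constant-time comparison as pure functions. It is written in TypeScript for consistency with the other examples even though `oauth.js` is plain JavaScript, and the function names are illustrative.

```typescript
import { randomBytes, timingSafeEqual } from "node:crypto";

// Step 1: 32 random bytes, hex-encoded (64 characters).
function newCsrfToken(): string {
    return randomBytes(32).toString("hex");
}

// Step 2: Set-Cookie value with the attributes described above.
function csrfCookieHeader(token: string): string {
    return `oauthCsrf=${token}; Path=/oauth/; Secure; SameSite=Lax`;
}

// Step 4: cookie and form value must both be present and identical.
function csrfMatches(cookieValue: string | undefined, formValue: string | undefined): boolean {
    if (!cookieValue || !formValue) return false;
    const a = Buffer.from(cookieValue);
    const b = Buffer.from(formValue);
    // timingSafeEqual throws on length mismatch, so compare lengths first.
    return a.length === b.length && timingSafeEqual(a, b);
}
```

A plain `===` comparison, as written in step 4, is also workable here; the constant-time variant is a cheap hardening choice rather than a requirement of the pattern.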

---

## Feature flags (rollback story)

Every new endpoint is gated on an env var. Default in staging: all on. Default in production: all off until explicit rollout.

| Env var | Default (prod) | Gates |
|---|---|---|
| `OAUTH_AUTHORIZE_ENABLED` | `false` | `GET /oauth/authorize`, `POST /oauth/authorize/decision` |
| `OAUTH_TOKEN_ENABLED` | `false` | `POST /oauth/token` |
| `OAUTH_USERINFO_ENABLED` | `false` | `GET /oauth/userinfo` |
| `OAUTH_REVOKE_ENABLED` | `false` | `POST /oauth/revoke`, `POST /oauth/introspect` |
| `OAUTH_ADMIN_API_ENABLED` | `false` | `verifyAdminRequest` accepts OAuth tokens (when false, OAuth bearers are rejected even if they're otherwise valid) |
| `OAUTH_DEV_PORTAL_ENABLED` | `false` | `/developers/*` UI + `/admin/oauth/clients/*` CRUD |
| `OAUTH_CONNECTED_APPS_ENABLED` | `false` | `/settings/connected-apps` + `/admin/oauth/my_grants/*` |

When a gate is off, the endpoint returns `501 Not Implemented` with `{error: "endpoint_disabled"}`. No DB reads, no token validation, no side effects.

Global kill switch: setting all gates to `false` disables the entire OAuth surface without requiring a deploy, provided the per-flag env vars are reloaded at runtime. If env vars only load at boot, flipping a flag requires a rolling restart, which is acceptable for this threat model.
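
One way to keep the gate check uniform across services is a single pure function that every handler calls first. The sketch below assumes a flag is on only when its value is the literal string `true`; that convention and the function names are assumptions, not something the current config loader is known to do.

```typescript
type OAuthFlag =
    | "OAUTH_AUTHORIZE_ENABLED" | "OAUTH_TOKEN_ENABLED" | "OAUTH_USERINFO_ENABLED"
    | "OAUTH_REVOKE_ENABLED" | "OAUTH_ADMIN_API_ENABLED"
    | "OAUTH_DEV_PORTAL_ENABLED" | "OAUTH_CONNECTED_APPS_ENABLED";

type DisabledResponse = { status: 501; body: { error: "endpoint_disabled" } };

// Assumption: anything other than the exact string "true" counts as off.
function isFlagEnabled(flag: OAuthFlag, env: Record<string, string | undefined>): boolean {
    return env[flag] === "true";
}

// Returns the response to send when the gate is off, or null when the handler may proceed.
function gateResponse(flag: OAuthFlag, env: Record<string, string | undefined>): DisabledResponse | null {
    return isFlagEnabled(flag, env)
        ? null
        : { status: 501, body: { error: "endpoint_disabled" } };
}
```

Reading `env` on every call, instead of caching it at import time, is what would let a runtime reload take effect without a restart.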

---

## API Endpoints

### arda (port 3010)

| Method | Endpoint | Auth | Purpose |
|---|---|---|---|
| GET | `/oauth/authorize` | cookie (redirect to /login if missing) | Validate params, skip-consent if prior grant, render consent page |
| POST | `/oauth/authorize/decision` | cookie + CSRF double-submit | Upsert `oauth_grants`, mint code into Redis, 302 to redirect_uri |
| GET | `/login?returnTo=...` | none | Existing login extended to honor same-origin `returnTo` (reject protocol-relative) |
| GET | `/developers` | cookie + SuperUser | Developer portal home |
| GET | `/developers/apps/:clientId` | cookie + SuperUser | Client editor |
| GET | `/settings/connected-apps` | cookie | User's own grants list (`oauth_grants` rows) |
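
The "validate params" part of `GET /oauth/authorize` can be expressed as one pure function over the query parameters and the loaded client row. The following is a sketch under assumptions: the result shape and error strings are illustrative (some borrow audit event names from this spec, others RFC 6749 error codes), and loading the client from Postgres is out of scope.

```typescript
interface OAuthClient {
    clientId: string;
    clientType: "confidential" | "public";
    redirectUris: string[];
    allowedScopes: string[];
    disabledAt: number | null;
}

type AuthorizeParams = Partial<Record<
    "response_type" | "client_id" | "redirect_uri" | "scope" | "state" | "code_challenge" | "code_challenge_method",
    string
>>;

type AuthorizeResult = { ok: true; scopes: string[] } | { ok: false; error: string };

function validateAuthorizeRequest(params: AuthorizeParams, client: OAuthClient | undefined): AuthorizeResult {
    const fail = (error: string): AuthorizeResult => ({ ok: false, error });
    // Client and redirect_uri come first: if either is bad, never redirect back to the app.
    if (!client || client.clientId !== params.client_id || client.disabledAt !== null) return fail("unauthorized_client");
    if (client.clientType !== "confidential") return fail("unauthorized_client");
    if (!params.redirect_uri || !client.redirectUris.includes(params.redirect_uri)) return fail("redirect_mismatch");
    if (params.response_type !== "code") return fail("unsupported_response_type");
    if (!params.state) return fail("invalid_request");
    if (!params.code_challenge || params.code_challenge_method !== "S256") return fail("invalid_request");
    const allowed = client.allowedScopes;
    const scopes = (params.scope ?? "").split(" ").filter((s) => s.length > 0);
    if (scopes.length === 0 || !scopes.every((s) => allowed.includes(s))) return fail("invalid_scope");
    return { ok: true, scopes };
}
```

`Array.prototype.includes` on `redirectUris` gives the exact-match behaviour the spec asks for; no normalization or prefix matching is applied.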

### auth-api (port 3009)

| Method | Endpoint | Auth | Purpose |
|---|---|---|---|
| POST | `/oauth/token` | client_id + client_secret (HTTP Basic or form) | Exchange code or refresh_token for tokens |
| POST | `/oauth/revoke` | client_id + client_secret | RFC 7009 token revocation |
| POST | `/oauth/introspect` | client_id + client_secret | RFC 7662 token introspection |
| GET | `/oauth/userinfo` | OAuth bearer | Return `{ sub, username, email, createdAt }` based on scopes |
| GET | `/.well-known/jwks.json` | none | RSA public key |
| GET | `/.well-known/oauth-authorization-server` | none | OAuth 2.0 AS Metadata (RFC 8414) |

(`POST /oauth/codes` is REMOVED from the auth-api surface — arda mints codes directly now.)
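
For `/oauth/userinfo`, "based on scopes" can be read as follows: `openid` is required for any response, and `profile` unlocks the remaining fields. A sketch of that filtering is below. It assumes `sub` is the `bigscreenAccountId`, which this spec does not state explicitly, and the `Account` shape is a simplified placeholder.

```typescript
// Placeholder for the fields of BigscreenAccount that userinfo needs.
interface Account {
    bigscreenAccountId: string;
    username: string;
    email: string;
    createdAt: number;
}

interface Userinfo {
    sub: string;
    username?: string;
    email?: string;
    createdAt?: number;
}

// Returns null when the token lacks `openid`; the handler would answer 403 insufficient_scope.
function buildUserinfo(account: Account, tokenScope: readonly string[]): Userinfo | null {
    if (!tokenScope.includes("openid")) return null;
    const claims: Userinfo = { sub: account.bigscreenAccountId }; // assumption: sub = account id
    if (tokenScope.includes("profile")) {
        claims.username = account.username;
        claims.email = account.email;
        claims.createdAt = account.createdAt;
    }
    return claims;
}
```

Keeping the filter in one function makes it straightforward to test that a token without `profile` never leaks the email address.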

### admin_api (port 3999)

| Method | Endpoint | Policy | Purpose |
|---|---|---|---|
| GET | `/admin/oauth/clients` | SuperUser | List all clients |
| POST | `/admin/oauth/clients` | SuperUser | Create (requires `serverIps` in body; returns one-time client_secret) |
| GET | `/admin/oauth/clients/:clientId` | SuperUser | Fetch client |
| PUT | `/admin/oauth/clients/:clientId` | SuperUser | Update name/description/logo/redirect URIs/scopes/serverIps |
| POST | `/admin/oauth/clients/:clientId/rotate_secret` | SuperUser | Rotate secret |
| POST | `/admin/oauth/clients/:clientId/disable` | SuperUser | Disable + invalidate all outstanding tokens for this client |
| POST | `/admin/oauth/clients/:clientId/enable` | SuperUser | Re-enable |
| GET | `/admin/oauth/clients/:clientId/grants` | SuperUser | List all user grants |
| GET | `/admin/oauth/clients/:clientId/audit` | SuperUser | Event log |
| GET | `/admin/oauth/my_grants` | any authenticated | Current user's grants |
| DELETE | `/admin/oauth/my_grants/:clientId` | any authenticated | Revoke consent + all outstanding tokens |

---

## Files to Create

| File | Purpose | Approx LOC |
|---|---|---|
| `auth/OAuthScopes.ts` | Scope enum + scope → required-policy map | 80 |
| `auth/OAuthClientDatabase.ts` | Postgres CRUD for `oauth_clients`, `oauth_grants`, `oauth_audit_log` | 400 |
| `auth/OAuthTokens.ts` | `generateOAuthAccessToken`, `generateOAuthRefreshToken`, refresh rotation + reuse detection, `verifyAndEnforceOAuthContext` | 400 |
| `auth/OAuthApi.ts` | Auth-api HTTP handlers: `/oauth/token`, `/revoke`, `/introspect`, `/userinfo`, `/.well-known/*`; `requireScopeAndPolicy` middleware | 550 |
| `apps/db_setup/oauth_db_setup.ts` | Creates OAuth tables | 140 |
| `webapps/src/server/oauth.js` | Arda handlers for `/oauth/authorize`, `/oauth/authorize/decision`, CSRF double-submit, direct code minting into Redis | 300 |
| `webapps/arda/views/oauth_consent.pug` | Server-rendered consent screen | 70 |
| `webapps/src/components/Developers/DevHome.jsx` | Developer portal landing | 150 |
| `webapps/src/components/Developers/OAuthClientEditor.jsx` | Create/edit client form (includes `serverIps` field with validation) | 350 |
| `webapps/src/components/Developers/OAuthClientSecretModal.jsx` | One-time secret display modal | 50 |
| `webapps/src/components/Developers/OAuthGrantsList.jsx` | Admin view of user grants | 120 |
| `webapps/src/components/Developers/OAuthAuditLog.jsx` | Audit log viewer | 100 |
| `webapps/src/components/Settings/ConnectedApps.jsx` | User's "Connected apps" with per-app revoke | 150 |
| `tests/auth/OAuth.spec.ts` | End-to-end OAuth tests | 700 |
| `docs/oauth.md` | Integrator-facing OAuth documentation (includes IP-whitelist prerequisite, flow examples, scope table) | 450 |
| `docs/FORGEMASTER.md` | Project explanation doc per CLAUDE.md convention | 300 |

## Files to Modify

| File | Change |
|---|---|
| `auth/Auth.ts` | Extend `AccessTokenPayload` at line 103 with optional `grantType`, `clientId`, `scope`, `oauthGrantId` (typed; no constructor change because `Object.assign(this, payload.sub)` handles it). Extend `requiresValidAccessTokenInternal` at line 488: on `grantType === "oauth_user"`, call `OAuthClientDatabase.isClientEnabled` + grant-still-covers-scope check via `oauthGrantId`. Add `v3` version constant. |
| `auth/AuthApi.ts` | `getAccessPolicyHandler` at line 345 rejects OAuth tokens with `403 insufficient_scope`. Unmigrated routes are OAuth-closed by default. |
| `auth/AuthDatabase.ts` | Helpers `trackOAuthToken(userId, clientId, tokenHash)`, `revokeAllTokensForUserClient(userId, clientId)`, `revokeRefreshChain(chainId)`. Extend `activateAccessToken` at line 103 with optional `clientId`. |
| `auth/index.ts` | Export `OAuthScopes`, `OAuthClientDatabase`, `OAuthTokens`, `OAuthApi`. |
| `apps/api/api.ts` | Mount OAuth endpoints (gated on feature flags). Enable the rate limiter commented out at lines 93-94 (decide: re-enable existing code vs adopt `express-rate-limit`). |
| `apps/admin_api/admin_api.ts` | Rewrite `verifyAdminRequest` (lines 43-56) to detect OAuth bearer and skip app-key check when OAuth (gated on `OAUTH_ADMIN_API_ENABLED`). Add `/admin/oauth/clients/*` CRUD. Migrate pilot routes to `requireScopeAndPolicy` in Phase D. |
| `webapps/arda/arda.js` | Add `/oauth/authorize` + `/oauth/authorize/decision` BEFORE catchall at line 87. Add `/api/oauth/*` proxies for dev portal. |
| `webapps/src/server/api.js` | `login` (lines 98-132) honors same-origin `returnTo` (validate starts with `/`, reject protocol-relative). |
| `webapps/src/components/Auth/Login.jsx` | Honor `returnTo` query param on submit success. |
| `webapps/arda/app/App.jsx` | Add routes: `/developers*`, `/settings/connected-apps`. |
| `webapps/arda/app/ArdaWrapper.jsx` (lines 81-93) | Add Developers menu item, SuperUser-gated. |
| `docs/testing.md` | Document OAuth test harness expectations. |
| `.env` sample / `DEV_SETUP.md` | New env vars: feature flags above, `OAUTH_ISSUER_URL`, `OAUTH_CODE_TTL_SECONDS`, `OAUTH_REFRESH_TTL_DAYS`, `OAUTH_CSRF_COOKIE_NAME`. |

## Files NOT to Modify (deliberately)

- `auth/Auth.ts:138-167, 186-220` (`generateRefreshToken` / `generateAccessToken`) — untouched. OAuth uses new factories in `OAuthTokens.ts` that invoke the same RS256 machinery but produce v3 payloads.
- `tests/allowed_apps.json` — different trust model, kept separate.
- Existing `api.get("/admin/...", getAccessPolicyHandler(...))` routes — migration is opt-in per route, one at a time, in Phase D onwards.

---

## Security Requirements

All P0. None optional.

- **IP allowlist first, OAuth second.** admin_api's EC2 security group is the outer perimeter. OAuth only layers identity+scope on top.
- **Every OAuth client registration requires declaring `serverIps[]`.** Audit row on every IP change.
- **Public clients rejected in v1** (no PKCE-only flow allowed). `'public'` remains a permitted `clientType` value in the schema for forward compatibility, but registration requests with `clientType=public` are rejected.
- **Redirect URI exact-match only.** No wildcards. Localhost any-port only when `NODE_ENV !== 'production'`.
- **`state` parameter required.** Missing → 400 `invalid_request`.
- **PKCE required (S256 only).** `plain` rejected at `/oauth/token`.
- **Client secret server-to-server only.** HTTP Basic or form-post. Never in URLs.
- **Rate limiting** on all `/oauth/*` endpoints. The commented-out limiter at `apps/api/api.ts:93-94` must be re-enabled or replaced (see open questions) before Phase C ships.
- **Per-client token tracking in `oauth:user_client_tokens:<userId>:<clientId>`** — enables per-client revocation separate from user-wide revocation.
- **Scope never widens on refresh.** New token's scope must be ⊆ refresh token's scope.
- **Refresh rotation with chain reuse-detection.** Reuse attempt → revoke entire chain + all access tokens issued under `oauthGrantId`.
- **Dynamic CORS on `/oauth/token`** — allow-list of origins derived from registered `redirectUris`.
- **Open-redirect protection on `returnTo`** — validate starts with `/`, reject `//host` protocol-relative.
- **JWKS endpoint** exposes only the public RSA key. Publishing it discloses nothing secret, since verification keys are public by design.
- **Audit everything** to `oauth_audit_log` AND existing Logger.
- **Log redaction** for `code`, `code_verifier`, `client_secret`, access/refresh tokens. Follow `getSanitizedRequestString` pattern at `auth/AuthApi.ts:24`.
- **Admin disables client** → `disabledAt` set; all outstanding tokens for that client revoked; next request 401s.
- **User revokes grant** → `oauth_grants.revokedAt` set; all outstanding tokens for that (user, client) pair revoked; next request 401s.
- **OAuth-standard error responses** (RFC 6749 §5.2, RFC 6750 `WWW-Authenticate`).
- **CSRF double-submit cookie** on `/oauth/authorize/decision` as detailed above.
- **Arda logout never touches OAuth tokens** — documented invariant; tests enforce this.
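
The redirect-URI rule above (exact match, with a localhost any-port relaxation outside production) could look roughly like the following. It is a sketch under assumptions: the function name is hypothetical, and "localhost any-port" is interpreted here as "a registered `localhost` URI matches the same URI on any port", which the spec does not spell out.

```typescript
// Hypothetical helper. `isProduction` would be derived from NODE_ENV by the caller.
function redirectUriAllowed(requested: string, registered: string[], isProduction: boolean): boolean {
    // Exact string match: no wildcards, no normalization.
    if (registered.includes(requested)) return true;
    if (isProduction) return false;

    // Non-production only: allow a registered localhost URI on any port (assumed interpretation).
    let req: URL;
    try {
        req = new URL(requested);
    } catch {
        return false;
    }
    if (req.hostname !== "localhost") return false;

    return registered.some((candidate) => {
        let reg: URL;
        try {
            reg = new URL(candidate);
        } catch {
            return false;
        }
        return (
            reg.hostname === "localhost" &&
            reg.protocol === req.protocol &&
            reg.pathname === req.pathname &&
            reg.search === req.search &&
            reg.hash === req.hash
        );
    });
}
```

Comparing raw strings first keeps the production path free of URL parsing quirks; only the development relaxation parses the URI.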

---

## Rollout Phases

Each phase lands independently behind feature flags. Existing first-party behavior unchanged throughout.

### Phase A — Client registration plumbing

- Postgres migration (`oauth_db_setup.ts`)
- `OAuthClientDatabase.ts` CRUD, bcrypt hashing, audit writes
- Admin_api `/admin/oauth/clients/*` routes (SuperUser)
- Arda `/developers` portal UI
- `serverIps` field enforced at creation; `clientType=public` rejected
- All `/oauth/*` flow endpoints return `501 endpoint_disabled`

**Deliverable**: admins can register clients. No OAuth flow yet.
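
The two registration-time rejections in this phase (`serverIps` required, `clientType=public` refused) might be enforced with a check along these lines. The input shape, the result type, and the function name are assumptions for illustration; only `serverIps` and `clientType` come from this spec.

```typescript
import { isIP } from "node:net";

// Assumed request shape; a real registration payload would carry more fields.
interface ClientRegistrationInput {
    clientType: string;
    serverIps?: string[];
}

type RegistrationCheck =
    | { ok: true }
    | { ok: false; status: 400; reason: string };

// Hypothetical validator run before any database write.
function checkRegistration(input: ClientRegistrationInput): RegistrationCheck {
    if (input.clientType === "public") {
        return { ok: false, status: 400, reason: "public clients are not accepted in v1" };
    }
    if (!input.serverIps || input.serverIps.length === 0) {
        return { ok: false, status: 400, reason: "serverIps is required" };
    }
    // isIP returns 0 for anything that is not a literal IPv4 or IPv6 address.
    const bad = input.serverIps.find((ip) => isIP(ip) === 0);
    if (bad !== undefined) {
        return { ok: false, status: 400, reason: `not an IP address: ${bad}` };
    }
    return { ok: true };
}
```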

### Phase B — `/authorize` + consent + code mint (arda-local)

- Arda `/oauth/authorize` GET (parameter validation, login redirect, prior-grant memoization)
- Arda `/oauth/authorize/decision` POST (CSRF double-submit, grant upsert, Redis code mint, 302)
- Consent Pug template
- `/login?returnTo=` honored

**Deliverable**: end-to-end authorize flow produces a code at the client's redirect URI. Token endpoint disabled.
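
As a generic illustration of the double-submit comparison used by `/oauth/authorize/decision`, the sketch below mints a random token and compares the cookie copy against the form copy in constant time. The concrete design specified earlier in this document takes precedence; the function names here are hypothetical, and any binding of the token to the session is out of scope for this sketch.

```typescript
import { randomBytes, timingSafeEqual } from "node:crypto";

// Hypothetical: value written to both the CSRF cookie and a hidden consent-form field.
function mintCsrfToken(): string {
    return randomBytes(32).toString("base64url");
}

// Returns true only when both copies are present and identical.
function csrfTokensMatch(cookieValue: string | undefined, formValue: string | undefined): boolean {
    if (!cookieValue || !formValue) return false;
    const a = Buffer.from(cookieValue);
    const b = Buffer.from(formValue);
    // timingSafeEqual throws on length mismatch, so check lengths first.
    return a.length === b.length && timingSafeEqual(a, b);
}
```

A mismatch would map to the 403 listed in the Phase B verification cases.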

### Phase C — `/token` + refresh with rotation

- Auth-api `/oauth/token` (authorization_code + refresh_token grants)
- PKCE S256 enforcement
- Confidential client authentication (Basic + form-post)
- v3 OAuth JWTs with `oauthGrantId`
- Refresh token rotation + chain reuse detection
- `/.well-known/jwks.json`, `/.well-known/oauth-authorization-server`

**Deliverable**: clients can exchange code for tokens. Admin_api still rejects OAuth tokens.
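
Refresh rotation with chain reuse detection can be illustrated with an in-memory model. This is a sketch, not the planned storage layer: the class name, the use of random UUIDs as tokens, and the `Map`/`Set` storage are all assumptions, and the real implementation would persist state and also revoke access tokens issued under the same `oauthGrantId`.

```typescript
import { randomUUID } from "node:crypto";

type RotateResult =
    | { ok: true; refreshToken: string }
    | { ok: false; error: "invalid_grant" };

// Hypothetical in-memory stand-in for the refresh-rotation store.
class RefreshChainSketch {
    private tokens = new Map<string, { grantId: string; used: boolean }>();
    private revokedGrants = new Set<string>();

    issue(grantId: string): string {
        const token = randomUUID();
        this.tokens.set(token, { grantId, used: false });
        return token;
    }

    rotate(presented: string): RotateResult {
        const record = this.tokens.get(presented);
        if (!record || this.revokedGrants.has(record.grantId)) {
            return { ok: false, error: "invalid_grant" };
        }
        if (record.used) {
            // Reuse of an already-rotated token: revoke the entire chain.
            this.revokedGrants.add(record.grantId);
            return { ok: false, error: "invalid_grant" };
        }
        record.used = true;
        return { ok: true, refreshToken: this.issue(record.grantId) };
    }

    isGrantRevoked(grantId: string): boolean {
        return this.revokedGrants.has(grantId);
    }
}
```

Keying revocation on the grant rather than on individual tokens means a single replay invalidates every descendant token, including the one the legitimate client currently holds.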

### Phase D — Admin_api OAuth enforcement + pilot migration

- `verifyAdminRequest` detects OAuth tokens (gated on `OAUTH_ADMIN_API_ENABLED`)
- `requireScopeAndPolicy` middleware
- Migrate pilot routes:
  - `GET /admin/profile` (openid + profile)
  - `GET /admin/fabricator/*` read endpoints
  - `GET /admin/inventory/*` read endpoints
  - `GET /admin/shop/orders`, `GET /admin/big_orders/*` (orders:read)

**Deliverable**: first real OAuth admin_api call succeeds end-to-end. Factory tool builds against this.
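
The decision that `requireScopeAndPolicy` has to make can be sketched as a pure function, separate from any Express wiring. The function name, argument shapes, and return type below are assumptions; the `WWW-Authenticate` value follows the RFC 6750 format for `insufficient_scope`. The spec does not name an error code for the policy failure, so none is set here.

```typescript
interface ScopePolicyDecision {
    status: 200 | 403;
    error?: string;
    wwwAuthenticate?: string;
}

// Hypothetical core of the middleware: scope check first, then the existing policy check.
function decideScopeAndPolicy(
    tokenScopes: string[],
    requiredScopes: string[],
    userHasPolicy: boolean
): ScopePolicyDecision {
    const missing = requiredScopes.filter((scope) => !tokenScopes.includes(scope));
    if (missing.length > 0) {
        return {
            status: 403,
            error: "insufficient_scope",
            wwwAuthenticate: `Bearer error="insufficient_scope", scope="${requiredScopes.join(" ")}"`,
        };
    }
    if (!userHasPolicy) {
        // Token scope is fine, but the user lacks the access policy for this route.
        return { status: 403 };
    }
    return { status: 200 };
}
```

Both checks must pass: the scope limits what the client may do, and the policy limits what the user may do.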

### Phase E — Userinfo, revocation, audit UI

- Auth-api `GET /oauth/userinfo` (gated on `OAUTH_USERINFO_ENABLED`)
- Auth-api `POST /oauth/revoke` (RFC 7009)
- Auth-api `POST /oauth/introspect` (RFC 7662)
- Arda `/settings/connected-apps` with per-app revoke
- Arda `/developers/apps/:clientId/audit` viewer
- Admin "Disable client" action invalidates all outstanding tokens for that client

**Deliverable**: full revocation story. Sign-in-with-Bigscreen via `/oauth/userinfo` works.
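
The scope-dependent shape of the `/oauth/userinfo` response (`sub` for `openid`, plus username and email when `profile` is granted) might be assembled as follows. The user record shape and function name are assumptions, and returning `null` when `openid` is absent is an assumed convention that leaves the error response to the caller.

```typescript
// Assumed minimal user record; the real user model has more fields.
interface UserRecordSketch {
    id: string;
    username: string;
    email: string;
}

// Hypothetical claims builder for GET /oauth/userinfo.
function buildUserinfoClaims(user: UserRecordSketch, scopes: string[]): Record<string, string> | null {
    // Without the openid scope there is nothing to return (assumed behavior).
    if (!scopes.includes("openid")) return null;
    const claims: Record<string, string> = { sub: user.id };
    if (scopes.includes("profile")) {
        claims.username = user.username;
        claims.email = user.email;
    }
    return claims;
}
```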

### Phase F (deferred) — Full OIDC id_token

Only if a concrete consumer asks. Add `id_token` minting on `scope=openid`, `/.well-known/openid-configuration`, nonce binding, `id_token_hint`.

### Phase G (deferred) — Public clients

Adds PKCE-only client authentication (no `client_secret`) for browser SPAs and mobile apps. Requires relaxing the IP-allowlist constraint on a dedicated public admin_api edge, or proxying all public-client admin_api calls through arda. Explicitly out of v1.

---

## Verification

Run existing auth + admin_api test suites at every phase. New tests per phase:

1. **Phase A** (`tests/auth/OAuth.spec.ts`):
   - Create client → receive one-time secret; secret hash round-trips bcrypt
   - Rotate secret → old rejected, new accepted
   - Disable → client record read-only
   - Non-SuperUser cannot access `/admin/oauth/clients/*`
   - Creating a client without `serverIps` → 400
   - Creating with `clientType=public` → 400
   - Audit rows written for every mutation

2. **Phase B**:
   - Invalid `client_id`, wrong `redirect_uri`, missing `state`, missing PKCE, scope not ⊆ allowedScopes → 400
   - Happy path: unauthenticated → /login?returnTo → login → back → consent → callback with code+state
   - Skip-consent: re-run, no consent shown, redirect immediate
   - `returnTo=//evil.com` rejected at login
   - CSRF: mismatched cookie/form value → 403

3. **Phase C**:
   - Happy path: code → tokens
   - Wrong `code_verifier`, reused code, expired code, wrong `redirect_uri` → 400
   - Confidential client without secret → 401 `invalid_client`
   - `code_challenge_method=plain` rejected
   - Refresh rotation: refresh → new pair; old rejected on next use; scope subset enforced
   - **Rotation reuse attack**: use old refresh after new one issued → entire chain revoked + audit row + access tokens nuked
   - `/.well-known/jwks.json` returns valid JWKS

4. **Phase D**:
   - Migrated route with correct scope + policy → 200
   - Correct scope but user lacks policy → 403
   - Wrong scope → 403 `insufficient_scope` with `WWW-Authenticate` header
   - Unmigrated route with OAuth token → 403
   - First-party `x-access-token` on migrated route → still works
   - **Arda logout does not break OAuth**: sign into arda as user A, authorize factory_tool, log out of arda; factory_tool's OAuth token continues to work against admin_api. Assert `oauth:user_client_tokens:<userId>:<clientId>` unchanged after logout.
   - OAuth call from non-whitelisted IP → dropped at security group (manual test in staging)

5. **Phase E**:
   - `GET /oauth/userinfo` with openid scope → 200 with `sub`
   - With profile scope added → 200 with username + email
   - User revokes via `/settings/connected-apps` → `oauth_grants.revokedAt` set; next admin_api call → 401
   - Admin disables client → all outstanding tokens from that client → 401
   - `POST /oauth/revoke` with valid refresh token → subsequent refresh → 400

6. **Acceptance test** (end of Phase D; the `/oauth/userinfo` step is enabled once Phase E lands):
   Thin CLI simulating factory_tool: opens browser to `/oauth/authorize` with localhost callback, receives code, exchanges for tokens, calls `GET /admin/fabricator/scans`, calls `GET /oauth/userinfo`, prints authenticated user. Scriptable via ts-mocha, included in `tests/auth/OAuth.spec.ts`.

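7. **Session independence test** (explicit, runs in every phase after B; the admin_api assertions apply from Phase D onward, since admin_api rejects OAuth tokens before then):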
   - Log in to arda as user U1 → userSessionId S1
   - Authorize client C1 → OAuth tokens bound to oauthGrantId G1
   - Log out of arda
   - OAuth access token against admin_api → still 200
   - Refresh OAuth token → still works
   - Log in again as user U2 (different account)
   - OAuth token for U1/C1 → still 200 (tokens belong to U1, not to arda browser session)
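
For the Phase C `code_verifier` and `code_challenge_method` cases, a known-answer fixture helps. The sketch below implements S256 verification as defined in RFC 7636; the function names are hypothetical, and where the stored challenge lives (for example alongside the code in Redis) is left to the real implementation. The RFC 7636 Appendix B pair (verifier `dBjftJeZ4CVP-mB92K27uhbUJU1p1r_wW1gFWFOEjXk`, challenge `E9Melhoa2OwvFrEMTJguCHaoeK1t8URWbuGJSstw-cM`) is a convenient test vector.

```typescript
import { createHash, timingSafeEqual } from "node:crypto";

// code_challenge = BASE64URL(SHA256(ASCII(code_verifier))), RFC 7636 section 4.2.
function s256Challenge(codeVerifier: string): string {
    return createHash("sha256").update(codeVerifier, "ascii").digest("base64url");
}

// Hypothetical check run by the token endpoint against the challenge stored at authorize time.
function verifyPkceS256(codeVerifier: string, storedChallenge: string, method: string): boolean {
    // Only S256 is accepted; "plain" is rejected outright.
    if (method !== "S256") return false;
    // RFC 7636 section 4.1: 43 to 128 characters from the unreserved set.
    if (!/^[A-Za-z0-9\-._~]{43,128}$/.test(codeVerifier)) return false;
    const expected = Buffer.from(s256Challenge(codeVerifier));
    const actual = Buffer.from(storedChallenge);
    return expected.length === actual.length && timingSafeEqual(expected, actual);
}
```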

---

## Implementation Notes

- **Postgres writes use the fabricator pool** (`Postgres.getFabricatorClient()`), matching admin_api's existing convention at `lib/PostgresDatabase.ts:251`.
- **Indentation: 4 spaces**, per CLAUDE.md.
- **Never read `.env` files directly.** Document new env vars in `.env.sample` / `DEV_SETUP.md`.
- **Absolute paths always** — no `cd ../..`.
- **Integration tests run from `tests/` with ts-mocha**, e.g. `ts-mocha --bail --exit --timeout=500000 auth/OAuth.spec.ts`.
- **Yarn workspace builds only** — use `yarn workspace <name> build`, never `cd <subdir> && yarn build`.
- **Logger usage** — `@bigscreen/lib` Logger + `DiscordLogger`. Emit security-critical events (IP mismatch, refresh reuse, PKCE failure, redirect mismatch, client disable) at warning level to Discord.
- **CLAUDE.md requires `docs/FOR[yourname].md` per project** — create `docs/FORGEMASTER.md` covering: architecture, why the three-service split (with code minted on arda), why reuse JWT v3, why `oauthGrantId` independent of `userSessionId`, security rationale, bugs encountered.

---

## Operational runbook (registration prerequisites)

When registering a new OAuth client, the SuperUser must:

1. Collect the client's `serverIps` (static EIPs / NAT egress). Validate they're static by asking the integrator to confirm — dynamic IPs break the model.
2. File (or self-serve) an AWS security-group update for admin_api's SG: add an inbound rule on port 443 from each declared IP. Capture the rule ID(s).
3. Register the client in `/developers` with `serverIps[]` and `securityGroupRuleIds[]` populated.
4. Send the client its `client_id` + `client_secret` (one-time reveal). Secure channel (1Password, encrypted email, in-person).
5. Confirm the integrator's first `/oauth/token` exchange and first OAuth-authenticated admin_api call both succeed from the declared IPs before closing the ticket.

On client disable:
1. Disable in `/developers` (sets `disabledAt`, invalidates tokens).
2. Remove the corresponding AWS security-group rules (prevents reactivation without re-review).
3. Audit row `client_disabled` captures the disable and the security-group rule removal.

---

## Open questions to resolve during implementation

- **Rate limiter** (`apps/api/api.ts:93-94`): re-enable existing commented code or adopt `express-rate-limit`? Resolve before Phase C.
- **OAuth access token TTL**: default is 15 min. If the factory tool pilot shows that refresh churn is painful, consider 30-60 min in the Phase D retrospective.
- **Audit log retention**: confirm indefinite vs GDPR-driven expiry. Default: indefinite with periodic `userId` anonymization on rows older than a year.
- **Consent screen UX copy**: who writes the scope descriptions? Draft with engineering, review with product before Phase B ships.
- **Security-group automation**: v1 is manual AWS console. v2 could auto-add rules via IAM-gated API call from admin_api. Not in v1 scope.
- **`allowed_apps.json` entry required for OAuth clients?** **No.** OAuth bearer tokens carry `clientId` inside the JWT; `allowed_apps.json` is not consulted on the OAuth code path. Confirmed in the `verifyAdminRequest` rewrite.