# Architecture

The Bigscreen Cloud is not a single monolith. It's a small fleet of Node/TypeScript services that cooperate over HTTP, WebSockets, and AWS SQS queues, plus a React admin SPA and a couple of auxiliary workers. Each service runs as its own Node process (local dev) or its own EC2 instance (production).

This document gives the system-level picture. For per-service detail, drill into the [services/](./services/) pages.

## Component Diagram

```mermaid
flowchart TB
    subgraph Clients
        VR[VR Unity App]
        WEB[Website]
        ARDA[Arda SPA]
        IOS[iOS Scan App]
        CNC[CNC Machine]
        FUSION[Fusion 360]
    end

    subgraph Public["Public-facing services"]
        API["apps/api<br/>(:3009)"]
        CLOUDAPI["cloud/cloud_api<br/>(:3002)"]
        WSS["cloud/ws_server<br/>(:3003)"]
        MS["cloud/media_server_next<br/>(mediasoup RTC)"]
    end

    subgraph Private["Private services"]
        ADMIN["apps/admin_api<br/>(:3999)<br/>IP-restricted"]
        WORKER["cloud/cloud_worker"]
        MON["cloud/monitor"]
        KB["apps/kb"]
        SCAN["apps/scan_yourself"]
    end

    subgraph Data["Data stores"]
        PG[("Postgres")]
        REDIS[("Redis +<br/>Redlock")]
        DDB[("DynamoDB")]
        S3[("S3")]
        CW[("CloudWatch")]
    end

    subgraph Queues["AWS SQS queues"]
        BCQ[BigscreenCloud]
        WIN[WebsocketInbox]
        WOUT[WebsocketOutbox]
    end

    VR --> API
    VR --> CLOUDAPI
    VR <-.websocket.-> WSS
    VR <-.RTC.-> MS

    WEB --> API
    WEB --> CLOUDAPI

    ARDA -->|/api/admin| ADMIN
    ARDA -->|/cloud/admin| CLOUDAPI

    IOS --> SCAN
    SCAN --> ADMIN
    CNC --> ADMIN
    FUSION --> ADMIN

    API --> PG
    API --> REDIS
    API --> CW

    ADMIN --> PG
    ADMIN --> REDIS
    ADMIN --> S3
    ADMIN --> CW
    ADMIN --> KB

    CLOUDAPI --> REDIS
    CLOUDAPI --> DDB
    CLOUDAPI --> BCQ
    CLOUDAPI -.HTTP.-> MS

    WSS <--> WIN
    WSS <--> WOUT

    WORKER <--> BCQ
    WORKER <--> WIN
    WORKER --> REDIS

    MS <-.remote queue.-> CLOUDAPI

    MON -.UDP listen.-> MS
```

## Process & Port Map

Every service in the `dev` configuration (see repo-root `package.json` `dev` script) runs concurrently under nodemon:

| Service | Entry file | Port | Nodemon config |
|---------|-----------|------|----------------|
| apps/api | [apps/api/api.ts](../apps/api/api.ts):19 | 3009 | [nodemon.api.json](../nodemon.api.json) |
| apps/admin_api | [apps/admin_api/admin_api.ts](../apps/admin_api/admin_api.ts):26 | 3999 (configurable via `argv[2]`) | [nodemon.admin-api.json](../nodemon.admin-api.json) |
| cloud/cloud_api | [cloud/cloud_api/cloud_api.ts](../cloud/cloud_api/cloud_api.ts):48 | 3002 (`CLOUD_API_SERVER_PORT`) | [nodemon.cloud-api.json](../nodemon.cloud-api.json) |
| cloud/ws_server | [cloud/ws_server/ws_server.ts](../cloud/ws_server/ws_server.ts):15 | 3003 (`WEBSOCKET_SERVER_PORT`) | [nodemon.ws-server.json](../nodemon.ws-server.json) |
| webapps/arda | [webapps/arda/arda.js](../webapps/arda/arda.js):19 | 3010 (`PORT`) | [nodemon.arda.json](../nodemon.arda.json) |
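
The port column above can be read as a small resolution rule. The following is a minimal sketch, assuming the number is the fallback and the parenthesised name (an environment variable or `argv[2]`) is an override; that precedence is an assumption, and `PortSpec`, `PORT_MAP`, `parsePort`, and `resolvePort` are hypothetical names that do not exist in the repo.

```typescript
// Sketch only: PortSpec, PORT_MAP, parsePort and resolvePort are illustrative names.
type Env = Record<string, string | undefined>;

interface PortSpec {
  service: string;
  defaultPort: number;
  envVar?: string; // environment variable listed in the table, if any
  argvIndex?: number; // position in argv listed in the table, if any
}

const PORT_MAP: PortSpec[] = [
  { service: "apps/api", defaultPort: 3009 },
  { service: "apps/admin_api", defaultPort: 3999, argvIndex: 2 },
  { service: "cloud/cloud_api", defaultPort: 3002, envVar: "CLOUD_API_SERVER_PORT" },
  { service: "cloud/ws_server", defaultPort: 3003, envVar: "WEBSOCKET_SERVER_PORT" },
  { service: "webapps/arda", defaultPort: 3010, envVar: "PORT" },
];

// Accepts only integer ports in 1..65535; anything else counts as "not set".
function parsePort(raw: string | undefined): number | undefined {
  if (raw === undefined || raw.trim() === "") return undefined;
  const n = Number(raw);
  return Number.isInteger(n) && n > 0 && n < 65536 ? n : undefined;
}

// Assumed precedence: argv override, then env override, then the table default.
function resolvePort(spec: PortSpec, env: Env, argv: string[]): number {
  const fromArgv = spec.argvIndex !== undefined ? parsePort(argv[spec.argvIndex]) : undefined;
  const fromEnv = spec.envVar !== undefined ? parsePort(env[spec.envVar]) : undefined;
  return fromArgv ?? fromEnv ?? spec.defaultPort;
}
```

Passing `env` and `argv` as parameters rather than reading `process` directly keeps the rule easy to test; a caller would typically pass `process.env` and `process.argv`.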

Services that do not run under nodemon by default (started manually or by CI / factory automation):

| Service | Entry file | Notes |
|---------|-----------|-------|
| cloud/cloud_worker | [cloud/cloud_worker/cloud_worker_v2.ts](../cloud/cloud_worker/cloud_worker_v2.ts) | Polling loop, no HTTP port |
| cloud/media_server_next | [cloud/media_server_next/media_server_next.ts](../cloud/media_server_next/media_server_next.ts) | Mediasoup workers; UDP ports 10000–59999 |
| cloud/media_server_aws | [cloud/media_server_aws/media_server_aws.ts](../cloud/media_server_aws/media_server_aws.ts) | Legacy AWS-specific variant |
| cloud/monitor | [cloud/monitor/](../cloud/monitor) | Python UDP performance monitor (jitter / loss) |
| apps/kb | [apps/kb/](../apps/kb) | Knowledge-base RAG service + Discord bot |
| apps/scan_yourself | [apps/scan_yourself/](../apps/scan_yourself) | iOS face-scan backend + Lambda |

## Network Topology

```mermaid
flowchart LR
    subgraph Internet["Public internet"]
        direction LR
        CLIENT[Any client]
    end

    subgraph Bigscreen["Bigscreen office /<br/>allow-listed IPs"]
        STAFF[Staff workstations]
        FACTORYNET[Factory network<br/>CNC, Fusion]
    end

    subgraph AWS["AWS VPC"]
        API_S["apps/api<br/>public LB"]
        CLOUD_S["cloud/cloud_api<br/>public LB"]
        WS_S["cloud/ws_server<br/>public LB"]
        MS_S["media_server_next<br/>public UDP"]

        ADMIN_S["apps/admin_api<br/>SG: Bigscreen IPs only"]
        WORKER_S[cloud/cloud_worker]
        DATA_S[("Postgres, Redis,<br/>DynamoDB, S3")]
    end

    CLIENT -->|HTTPS| API_S
    CLIENT -->|HTTPS| CLOUD_S
    CLIENT -->|WSS| WS_S
    CLIENT -->|UDP| MS_S

    STAFF --> ADMIN_S
    FACTORYNET --> ADMIN_S

    API_S --> DATA_S
    CLOUD_S --> DATA_S
    WS_S --> DATA_S
    WORKER_S --> DATA_S
    ADMIN_S --> DATA_S
```

The defining rule: `apps/admin_api` is reachable only from a curated allow-list of IP addresses, enforced by an EC2 security group. The other client-facing services are public over HTTPS, WSS, or UDP.

## Tech Stack

| Area | Choice | Notes |
|------|--------|-------|
| Language | TypeScript 5 | `tsconfig.json` at repo root |
| Runtime | Node 18 (via Volta) | Pinned in root `package.json` → `volta.node = "18.20.8"` |
| HTTP | Express 5 | All three REST servers (`apps/api`, `apps/admin_api`, `cloud/cloud_api`) |
| WebSockets | `ws` 8 | `cloud/ws_server` |
| Real-time media | [mediasoup](https://mediasoup.org/) | Data, mic, audio, video transports per peer |
| Caching / locks | Redis 6.2 + Redlock | Distributed locks for room/user mutations |
| Primary DB | Postgres 15+ | Accounts, orders, analytics, fabricator data |
| Document DB | DynamoDB | Cloud state (room history, some multiplayer bookkeeping) |
| Vector DB | pgvector on Postgres | KB embeddings |
| Queue | AWS SQS | `BigscreenCloud`, `WebsocketInbox`, `WebsocketOutbox` |
| Object store | AWS S3 | RDC builds, scan uploads |
| Logging | Pino + Winston + Discord webhooks | CloudWatch aggregation |
| Secrets | AWS Secrets Manager | Via `lib/SecretsLoader` |
| Package mgmt | Yarn 3 (Berry) | **Never npm.** See repo [CLAUDE.md](../CLAUDE.md) |
| Build | `tsc --build` | Workspace-aware |
| Bundler (SPA) | Webpack 4 | Arda only |

## Authentication Model

Every HTTP request to any service passes through a two-tier check (application key, then access token), with an optional third tier for access policies. See [libraries/auth.md](./libraries/auth.md) for the detailed mechanics.

```mermaid
flowchart TB
    REQ[HTTP request] --> B1{Authorization<br/>Bearer &lt;application-key&gt;?}
    B1 -- no --> R401[401 Unauthorized]
    B1 -- yes --> B2[Identifies which app<br/>is calling]
    B2 --> AT{Access token<br/>required?}
    AT -- no --> OK1[Continue]
    AT -- yes --> B3{Valid JWT in<br/>x-access-token header?}
    B3 -- no --> R401b[401 Unauthorized]
    B3 -- yes --> B4{Access policy<br/>check?}
    B4 -- passes --> OK2[Continue]
    B4 -- fails --> R403[403 Forbidden]
```

Tier 1: **application key**. Every request must identify the calling application (`apps/api` itself is an app; so are `cloud-admin-api`, `website-cloud-api`, `hyperbeam-webhook`, `media-server-api`, etc.). The key is sent as the bearer token in the standard `Authorization` header.

Tier 2: **access token**. A JWT identifying the logged-in user, carried in the `x-access-token` header (alongside `x-refresh-token`, `x-timestamp`, and `x-nonce` for rotation and replay protection). Not every endpoint requires one: `/auth/login` cannot, because the caller does not yet have a token, and the Shopify webhook uses HMAC signing instead.

Tier 3 (optional): **access policy**. The JWT carries policy claims (Admin, Moderator, SuperUser, Fabricator, CommunityModerator, Inventory, Reporting, etc.). Admin-API routes are gated behind specific policies.
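
The tiers can be sketched as a single decision function. This is a minimal illustration, not the code in `libraries/auth`: `HeaderMap`, `RouteRule`, `AuthDeps`, and `authorize` are hypothetical names, key lookup and JWT verification are injected as stubs, and the sketch assumes a route that lists several policies requires all of them.

```typescript
// Sketch only: every name below is illustrative, not the repo's API.
type HeaderMap = Record<string, string | undefined>; // header names assumed lowercase

interface RouteRule {
  requiresAccessToken: boolean;
  requiredPolicies: string[]; // empty list means no policy gate (assumed: all must match)
}

interface Claims {
  userId: string;
  policies: string[];
}

interface AuthDeps {
  lookupApplication(key: string): string | undefined; // app name, or undefined if unknown
  verifyAccessToken(jwt: string): Claims | undefined; // claims, or undefined if invalid
}

type Decision =
  | { status: 200; app: string; claims?: Claims }
  | { status: 401 | 403; reason: string };

function authorize(headers: HeaderMap, rule: RouteRule, deps: AuthDeps): Decision {
  // Tier 1: the bearer token must identify a known calling application.
  const match = /^Bearer\s+(\S+)$/i.exec(headers["authorization"] ?? "");
  const key = match?.[1];
  const app = key !== undefined ? deps.lookupApplication(key) : undefined;
  if (app === undefined) return { status: 401, reason: "unknown application key" };

  // Endpoints such as login skip the user-level tiers entirely.
  if (!rule.requiresAccessToken) return { status: 200, app };

  // Tier 2: a valid JWT must be present in x-access-token.
  const token = headers["x-access-token"];
  const claims = token !== undefined ? deps.verifyAccessToken(token) : undefined;
  if (claims === undefined) return { status: 401, reason: "invalid access token" };

  // Tier 3: the JWT's policy claims must cover what the route demands.
  const missing = rule.requiredPolicies.filter((p) => !claims.policies.includes(p));
  if (missing.length > 0) {
    return { status: 403, reason: `missing policy: ${missing.join(", ")}` };
  }
  return { status: 200, app, claims };
}
```

Keeping the decision free of Express types makes the 401/403 split easy to unit-test; a middleware wrapper would only translate `Decision` into a response.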

**Alternate admission path (admin_api only)**: an OAuth 2.0 JWT access token issued by the cloud's own OAuth provider (plan 14) can stand in for tiers 1 + 2 on `apps/admin_api`. A bearer that looks like a JWT is verified, checked against the active-token Redis set, and cross-referenced with the `oauth_clients` + `oauth_grants` tables; on success the request proceeds with scope + policy enforcement via `AuthApi.requireScopeAndPolicy`. See [services/oauth.md](./services/oauth.md).
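
The alternate path amounts to a shape test followed by three checks. The sketch below is one way that could look, under stated assumptions: "looks like a JWT" is taken to mean three non-empty base64url segments, the checks are injected as synchronous stubs (the real ones would be async Redis and Postgres calls), and a JWT-shaped bearer that fails any check is rejected rather than retried as an application key. `OAuthChecks`, `OAuthTokenInfo`, and `admitBearer` are hypothetical names; scope and policy enforcement via `AuthApi.requireScopeAndPolicy` is not modelled.

```typescript
// Sketch only: interface and field names here are hypothetical.
const BASE64URL_SEGMENT = /^[A-Za-z0-9_-]+$/;

// A compact JWT is three non-empty base64url segments joined by dots.
function looksLikeJwt(bearer: string): boolean {
  const parts = bearer.split(".");
  return parts.length === 3 && parts.every((p) => BASE64URL_SEGMENT.test(p));
}

interface OAuthTokenInfo {
  tokenId: string;
  clientId: string;
  scopes: string[];
}

interface OAuthChecks {
  verifyJwt(token: string): OAuthTokenInfo | undefined; // signature and expiry
  isActiveToken(tokenId: string): boolean; // stands in for the active-token Redis set
  hasClientAndGrant(clientId: string): boolean; // stands in for oauth_clients + oauth_grants
}

type Admission =
  | { path: "application-key" }
  | { path: "oauth"; admitted: true; info: OAuthTokenInfo }
  | { path: "oauth"; admitted: false; failedAt: "verify" | "active-set" | "client-grant" };

function admitBearer(bearer: string, checks: OAuthChecks): Admission {
  // Anything that is not JWT-shaped goes through the normal tier 1 + 2 path.
  if (!looksLikeJwt(bearer)) return { path: "application-key" };

  const info = checks.verifyJwt(bearer);
  if (info === undefined) return { path: "oauth", admitted: false, failedAt: "verify" };
  if (!checks.isActiveToken(info.tokenId)) {
    return { path: "oauth", admitted: false, failedAt: "active-set" };
  }
  if (!checks.hasClientAndGrant(info.clientId)) {
    return { path: "oauth", admitted: false, failedAt: "client-grant" };
  }
  return { path: "oauth", admitted: true, info };
}
```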

## Deployment

Production runs each service on a dedicated EC2 instance (or a set of instances behind a load balancer). The build pipeline has historically been Jenkins, but the `jenkins/` folder is currently empty; deployment scripts live elsewhere (see the devops repo referenced in [README.md](../README.md#generating-environmnent-files)).

Each of the five dev-critical services has its own nodemon config at the repo root (`nodemon.*.json`), so `yarn dev` at the root starts all five concurrently. The others (`cloud_worker`, `media_server_next`, `monitor`, `kb`, `scan_yourself`) are started manually when needed.

## Startup Sequences

Most services share this boot shape. Per-service detail lives in each service doc.

```mermaid
sequenceDiagram
    participant P as Process
    participant ENV as dotenv
    participant LOG as Logger
    participant SEC as SecretsLoader
    participant DB as Postgres
    participant RED as Redis
    participant Q as MessageQueue SQS
    participant SRV as Express / ws / mediasoup

    P->>ENV: dotenv.config()
    P->>SEC: Load AWS secrets
    P->>LOG: initialize (Pino + CloudWatch + Discord)
    P->>DB: create connection pool
    P->>RED: connect + register Redlock
    P->>Q: initQueue(BigscreenCloud, WebsocketInbox, ...)
    P->>SRV: register routes / start listening
    Note over SRV: Ready to serve
```

`cloud/ws_server` and `cloud/cloud_worker` additionally call `Cloud.restoreExistingBigscreenUsers()` to recover state from Redis across restarts. `cloud/media_server_next` additionally registers itself with `cloud/cloud_api` and retrieves an auth token before accepting peer connections.
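
The shared boot shape is essentially an ordered list of steps that must each succeed before the next one runs. The sketch below is a minimal illustration of that ordering, assuming fail-fast behaviour; `COMMON_BOOT_ORDER`, `StepImpls`, and `boot` are hypothetical names, and the step bodies are supplied by the caller rather than taken from the repo.

```typescript
// Sketch only: step names mirror the diagram; implementations come from the caller.
const COMMON_BOOT_ORDER = [
  "dotenv", // dotenv.config()
  "secrets", // load AWS secrets
  "logger", // initialize logging
  "postgres", // create connection pool
  "redis", // connect + register Redlock
  "queues", // initialize SQS queues
  "server", // register routes / start listening
] as const;

type StepName = (typeof COMMON_BOOT_ORDER)[number];
type StepImpls = Record<StepName, () => Promise<void> | void>;

// Runs each step in order and stops at the first failure (assumed fail-fast).
async function boot(impls: StepImpls, log: (msg: string) => void): Promise<StepName[]> {
  const completed: StepName[] = [];
  for (const name of COMMON_BOOT_ORDER) {
    try {
      await impls[name]();
    } catch (err) {
      const detail = err instanceof Error ? err.message : String(err);
      throw new Error(`boot failed at "${name}" after [${completed.join(", ")}]: ${detail}`);
    }
    completed.push(name);
    log(`boot: ${name} ok`);
  }
  return completed;
}
```

Service-specific extras, such as restoring users from Redis or registering a media server with `cloud/cloud_api`, would fit as additional steps inserted before `server`.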

## Where to go next

- **Full list of yarn workspaces** → [workspaces.md](./workspaces.md)
- **How specific requests flow through the system** → [data-flows.md](./data-flows.md)
- **Every third-party / AWS service we depend on** → [external-services.md](./external-services.md)
- **Per-service detail** → [services/](./services/)
