# cloud/ws_server — WebSocket Connection Manager

`ws_server` is a thin but important service: it holds the live WebSocket connection for every active VR user, keeps it alive with pings, and delivers push messages originating from `cloud_api` to the right user. Messages to and from the rest of the stack travel only over SQS queues (`WebsocketInbox` inbound from users, `WebsocketOutbox` outbound to users); shared user state goes to Redis through `Cloud.*`.

**Entry:** [cloud/ws_server/ws_server.ts](../../cloud/ws_server/ws_server.ts):15 — port `WEBSOCKET_SERVER_PORT` or 3003.
**Package:** `bigscreen_ws_server`.
**Depends on:** [`@bigscreen/lib`](../libraries/lib.md), [`@bigscreen/auth`](../libraries/auth.md), [`@bigscreen/cloud`](../libraries/cloud-handlers.md).

## Connection Lifecycle

```mermaid
sequenceDiagram
    autonumber
    actor U as VR client
    participant WSS as cloud/ws_server
    participant AUTH as Auth.verifyAccessToken
    participant C as Cloud @bigscreen/cloud
    participant REDIS as Redis
    participant WOUT as SQS WebsocketOutbox

    U->>WSS: WS connect /?<base64(JSON)>
    Note right of U: JSON = { accessToken, systemInfo }
    WSS->>AUTH: verify(accessToken)
    AUTH-->>WSS: AccessTokenPayload
    WSS->>WSS: new BigscreenWebsocket(...)
    WSS->>C: Cloud.addBigscreenUser(user)
    C->>REDIS: HSET bigscreenUser:<id>
    WSS->>WSS: activeWebsockets[id] = bsw

    loop 100 ms
        WSS->>WOUT: getMessages (for this websocketServerId)
        WOUT-->>WSS: messages
        WSS->>U: ws.send(payload)
    end

    loop 10 s
        WSS->>U: ping
        U-->>WSS: pong (or nothing)
        WSS->>WSS: unansweredPingCount++<br/>terminate if > 5
    end

    loop 30 s
        WSS->>C: update active connection count
    end

    U--xWSS: close
    WSS->>C: Cloud.removeBigscreenUser(user)
    C->>REDIS: DEL bigscreenUser:<id>
    WSS->>WSS: delete activeWebsockets[id]
```

Clients authenticate **via URL payload** on connect — a base64-encoded JSON blob containing `accessToken` and `systemInfo`. See the `BigscreenWebsocket` constructor at [ws_server.ts:34](../../cloud/ws_server/ws_server.ts).
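
To make the URL payload format concrete, here is a minimal sketch of decoding it. It is not the code in `ws_server.ts`: the `ConnectPayload` type and the `parseConnectPayload` helper are hypothetical names, and the sketch assumes the base64 blob is the entire query string and that the JSON is ASCII.

```typescript
// Hypothetical shape of the connect payload. Only `accessToken` and
// `systemInfo` are named in this document.
interface ConnectPayload {
    accessToken: string;
    systemInfo: unknown;
}

// Parse a request URL of the form "/?<base64(JSON)>".
// Returns null instead of throwing so the caller can simply close the socket.
function parseConnectPayload(requestUrl: string): ConnectPayload | null {
    const queryStart = requestUrl.indexOf("?");
    if (queryStart === -1) return null;
    try {
        // Assumption: the whole query string is the (possibly percent-encoded) blob.
        const encoded = decodeURIComponent(requestUrl.slice(queryStart + 1));
        // atob yields one character per byte, which is enough for ASCII JSON.
        const parsed: unknown = JSON.parse(atob(encoded));
        if (typeof parsed !== "object" || parsed === null) return null;
        const candidate = parsed as Record<string, unknown>;
        const accessToken = candidate["accessToken"];
        if (typeof accessToken !== "string") return null;
        return { accessToken, systemInfo: candidate["systemInfo"] };
    } catch {
        // Malformed percent-encoding, base64, or JSON.
        return null;
    }
}
```

The returned `accessToken` would then be handed to `Auth.verifyAccessToken`, as in the diagram above. A `null` result means the connection should be rejected before any token verification is attempted.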

## The `BigscreenWebsocket` Wrapper

Every live connection is represented by:

```ts
class BigscreenWebsocket {
    websocket: WebSocket;
    id: string;                    // HashUtils.generateRandomId2()
    userSessionId: string;
    bigscreenAccountId: string;
    systemInfo: any;
    version: string;
    accessToken: string;
    unansweredPingCount = 0;
}
```

The `id` is a freshly generated random id, not the user's account id — that's intentional because a single user can have multiple sessions (VR headset + phone, etc.). `activeWebsockets[id]` is the local process registry.
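
The reason for keying on a per-connection id can be illustrated with a small sketch. The `SessionEntry` type and the three helper functions below are hypothetical and keep only the two fields needed for the illustration; the real registry stores full `BigscreenWebsocket` instances.

```typescript
// Hypothetical minimal view of a connection; the real class has more fields.
interface SessionEntry {
    id: string;                 // per-connection random id
    bigscreenAccountId: string; // shared by all of one user's sessions
}

// Local process registry, keyed by connection id rather than account id.
const activeWebsockets: Record<string, SessionEntry> = {};

function registerSession(entry: SessionEntry): void {
    activeWebsockets[entry.id] = entry;
}

function unregisterSession(id: string): void {
    delete activeWebsockets[id];
}

// Linear scan over the registry; fine for a sketch.
function sessionsForAccount(accountId: string): SessionEntry[] {
    return Object.keys(activeWebsockets)
        .map((id) => activeWebsockets[id])
        .filter((entry) => entry.bigscreenAccountId === accountId);
}
```

If the map were keyed by `bigscreenAccountId` instead, a second login from the same account would overwrite the first entry, and closing either connection would remove the other from the registry.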

## What `ws_server` Does NOT Do

- **Does not process inbound user messages.** The `onUserMessage` handler is currently a no-op ([ws_server.ts:98](../../cloud/ws_server/ws_server.ts)). There's a path for routing inbound messages through `WebsocketInbox` to `cloud_worker`, but most inbound traffic today is just pings.
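- **Does not hold business state.** Everything mutable lives in Redis via `Cloud.*`. `ws_server` is effectively stateless aside from the in-process connection map. The sockets in that map are lost on restart; `restoreExistingBigscreenUsers()` then reconciles the user entries left behind in Redis (see Startup & Recovery).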
- **Does not pick message recipients.** `cloud_api` / `cloud_worker` decide what goes to whom and enqueue with the correct `websocketServerId`. `ws_server` just delivers what's targeted at its own id.
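
The "just delivers" role can be sketched as a single pass over a batch of outbox messages. The message shape here is an assumption: this document does not specify how a message names its target connection, so `targetWebsocketId`, `OutboundMessage`, `SendableSocket`, and `deliverBatch` are all hypothetical.

```typescript
// Hypothetical message shape; the real WebsocketOutbox format is not shown here.
interface OutboundMessage {
    targetWebsocketId: string;
    payload: string;
}

// Structural stand-in for the part of a WebSocket this sketch needs.
interface SendableSocket {
    send(data: string): void;
}

// Forward each message to its local socket. No routing decisions are made:
// a message either matches a local connection or is reported as undelivered.
function deliverBatch(
    messages: OutboundMessage[],
    sockets: Record<string, SendableSocket | undefined>
): OutboundMessage[] {
    const undelivered: OutboundMessage[] = [];
    for (const message of messages) {
        const socket = sockets[message.targetWebsocketId];
        if (socket === undefined) {
            // The connection closed between enqueue and delivery.
            undelivered.push(message);
            continue;
        }
        socket.send(message.payload);
    }
    return undelivered;
}
```

What happens to undelivered messages (drop, log, or requeue) is a policy choice this sketch leaves to the caller.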

## Startup & Recovery

```mermaid
sequenceDiagram
    autonumber
    participant P as Node process
    participant CFG as updateConfig()
    participant MQ as MessageQueue
    participant C as Cloud
    participant SRV as WebSocket.Server

    P->>CFG: derive websocketServerId<br/>from announced IP
    P->>MQ: initQueue(WebsocketOutbox, websocketServerId)
    P->>C: Cloud.initialize({ watchExpired: true })
    P->>C: restoreExistingBigscreenUsers()
    P->>SRV: new WebSocket.Server({ port })
    Note over SRV: Ready for connections
```

`restoreExistingBigscreenUsers()` walks Redis for users that were registered against this server before a restart. They're left in Redis with a watched expiry, so if they don't reconnect quickly they get cleaned up by `cloud_worker`'s periodic sweep. See [data-flows.md#4-websocket-message-delivery-server--client](../data-flows.md#4-websocket-message-delivery-server--client) for the message-delivery flow.

## Health & Scale Notes

- **Fan-out** is horizontal: many `ws_server` processes run in parallel, each owning its own `websocketServerId` and its own slice of the `WebsocketOutbox` queue. There is no shared in-memory state between them.
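- **Dead connections**: the lifecycle diagram terminates a connection once `unansweredPingCount` exceeds 5, and pings go out every 10 s, so a silent peer can linger for roughly 50 to 60 s before it is kicked. In practice, most drops are detected sooner by the `ws.on('close')` event.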
- **Startup recovery** is designed to tolerate abrupt restarts without leaking ghost users in Redis.
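
The dead-connection rule can be sketched as one sweep function called from the 10 s timer. This follows the lifecycle diagram (increment, then terminate if the count exceeds 5) but is not the actual implementation: `PingableConnection`, `sweepConnections`, `onPong`, and `MAX_UNANSWERED_PINGS` are hypothetical names.

```typescript
// Structural stand-in for the parts of BigscreenWebsocket this sketch needs.
interface PingableConnection {
    unansweredPingCount: number;
    ping(): void;
    terminate(): void;
}

// Threshold taken from the lifecycle diagram ("terminate if > 5").
const MAX_UNANSWERED_PINGS = 5;

// One tick of the ping loop. Returns the ids that were terminated.
function sweepConnections(connections: Record<string, PingableConnection>): string[] {
    const terminated: string[] = [];
    for (const id of Object.keys(connections)) {
        const connection = connections[id];
        connection.unansweredPingCount++;
        if (connection.unansweredPingCount > MAX_UNANSWERED_PINGS) {
            connection.terminate();
            delete connections[id];
            terminated.push(id);
        } else {
            connection.ping();
        }
    }
    return terminated;
}

// A pong handler would reset the counter for that connection.
function onPong(connection: PingableConnection): void {
    connection.unansweredPingCount = 0;
}
```

Under these assumptions a peer that never answers receives five pings and is terminated on the sixth tick, while any pong in between resets the count.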

## Further reading

- What `Cloud.addBigscreenUser` / `removeBigscreenUser` actually do to Redis → [libraries/cloud-handlers.md](../libraries/cloud-handlers.md)
- How messages reach `WebsocketOutbox` in the first place → [services/cloud-api.md](./cloud-api.md)
- The message-processing side of the queues → [services/cloud-worker.md](./cloud-worker.md)
