# Phase 13: Data Flow Architecture Document - Research

**Researched:** 2026-03-22
**Domain:** Data architecture documentation — data dictionaries, flow diagrams, agent-facing prescriptive references
**Confidence:** HIGH (entire codebase read directly; no external library research needed)

---

<user_constraints>
## User Constraints (from CONTEXT.md)

### Locked Decisions

#### Document Format
- Layered reference style (like an API spec / data dictionary hybrid)
- Each data layer gets its own section with field tables and source annotations
- ASCII flow diagrams showing sync pipeline, write-back paths, and fallback behavior
- Both descriptive (what exists) and prescriptive (rules agents must follow)

#### Document Location
- `.planning/DATA-FLOW.md` alongside PROJECT.md, REQUIREMENTS.md, ROADMAP.md

#### Data Layer Definitions — RECIPIENTS (from GitHub Project)
- Source: GitHub Project columns
- Existing GH columns: Title, Shopify Profile URL, Purpose, Vision Rx - OD, Vision Rx - OS
- New GH columns needed: discord_username, discord_user_id, product names (parallel array), product Shopify URLs (parallel array)
- CORRECTION: `github_profile_url` does NOT exist and was never intended — agents must never re-introduce it
- Purpose field determines a colored ring around the recipient's profile pic in the UI
- Vision Rx fields (OD = right eye, OS = left eye) are recipient-level detail data
- Recipients are people, NOT shipments. `shipment_status` and `tracking_state` do NOT belong on recipients.

#### Data Layer Definitions — CARDS (from Shopify orders + GitHub Issues)
- Primary source: Shopify orders tagged "wit-what" (Shopify-first pipeline)
- Cloud storage: GitHub Issues in `BigscreenVR/beyond-outgoing` repo, tagged `ww-card`
- `recipient_name` renamed to `recipient_key` — cards reference a valid recipient key, not a display name
- `first_item_image_hint` retired — products displayed as individual image squares
- Notes are a collection `{date, content}[]`, not a single `latest_note` string
- `item_summary` replaced with deterministic product reference list
- Shipment status and tracking state are card-level fields sourced from Shopify

#### Data Layer Definitions — PRODUCTS (new layer)
- Cloud storage: GitHub Issues in `BigscreenVR/beyond-outgoing` repo, tagged `ww-product`
- Standalone catalog — products can be Shopify-linked OR manually created
- Two product types: Serializable (tracked instances, lifecycle) and Non-serializable (freely assignable)
- GH Project parallel-array columns for product data per recipient row

#### Data Layer Definitions — SERIAL INSTANCES (sub-entity of Products)
- Lifecycle: Created -> Assigned -> In Transit -> Delivered -> Return In Transit -> Returned -> Available
- Only one active assignment per serial instance at a time
- Cloud storage: within the parent product's GH Issue (body or structured comment TBD during planning)

#### SQLite Cache Architecture
- Full mirror of all entities
- App always reads from SQLite — sync updates SQLite from GH/Shopify sources
- Offline mode: show cached data with persistent offline indicator; edits queue locally
- Schema: App-optimized relational schema with normalized tables and foreign keys (NOT JSON blobs)
- Location: `%APPDATA%/WITwhat/`
- Conflict resolution: Last-write-wins with audit log table for all overwrites

#### CLAUDE.md Enforcement
- Short pointer (3-5 lines) in CLAUDE.md pointing to `.planning/DATA-FLOW.md`
- Critical inline rules: no new fields without source doc, SQLite is single read source, GH Issues are cloud of record
- Mandatory read + update obligation for agents

### Claude's Discretion
- Exact ASCII diagram style and detail level
- Table formatting choices within the layered reference structure
- How to organize the prescriptive rules section (inline per layer vs. consolidated)
- Level of detail for serial instance lifecycle documentation

### Deferred Ideas (OUT OF SCOPE)
- SQLite migration implementation
- GH Issues storage implementation (ww-card / ww-product)
- Product catalog UI
- Serial number tracking UI
- Parallel-array GH column creation
- discord_username / discord_user_id GH column creation
</user_constraints>

---

## Summary

Phase 13 produces a single authoritative document (`.planning/DATA-FLOW.md`) that maps the entire data architecture of WITwhat. This is a documentation-only phase — no code is changed. The document must be good enough that a future agent reading it can understand every data entity, every field's source, every sync direction, and every rule without reading any code.

The codebase was read in full. The current code contains four architectural inconsistencies that the document must name explicitly and permanently:

- `github_profile_url` is present on Recipient, RecipientCardSnapshot, DashboardCardViewModel, and the service RecipientSnapshot, but was never an actual GH Project column and must never be re-introduced.
- `shipment_status` and `tracking_state` appear on the `Recipient` domain model but belong exclusively on cards (Shopify order data).
- `latest_note` is a scalar in the current snapshot, but the architecture calls for a dated collection.
- `item_summary` is currently a pipe-separated string but must become a deterministic product reference list.

The document's value lies in three things: (1) precise field tables with source/owner annotations, (2) ASCII flow diagrams of the sync pipeline that agents can navigate without running the code, and (3) a prescriptive rules section that gives a hard NO to the most common agent mistakes.

**Primary recommendation:** Write DATA-FLOW.md as four horizontal layers (Recipients, Cards, Products, Serial Instances) crossed with four vertical concerns (fields + sources, sync flow, storage, rules). Use ASCII box-and-arrow diagrams. Put the corrections and prohibitions in a prominent "AGENT RULES" section at the top where they cannot be missed.

---

## Current Codebase: Complete Field Inventory

This section is the raw evidence the planner needs to write the DATA-FLOW.md. Every struct and every field has been catalogued from source.

### Layer 1: RECIPIENTS

**Currently in code** — `crates/core/src/domain/recipient.rs`:

| Field | Type | Status | Correct Owner | Notes |
|-------|------|--------|--------------|-------|
| `recipient_id` | String | KEEP | Recipient | Derived from GH Title or GH username from Shopify Profile URL |
| `github_item_id` | Option<String> | KEEP | Recipient | GH Project item node ID for write-back |
| `shopify_customer_id` | Option<String> | KEEP | Recipient | Extracted from Shopify Profile URL GH field |
| `shopify_order_id` | Option<String> | MOVE | Card | Belongs on card, not recipient |
| `github_profile_url` | Option<String> | **REMOVE** | NONE | Does not exist as a GH Project column. Never re-introduce. |
| `shipment_status` | Option<String> | **MOVE** | Card | Sourced from Shopify fulfillment; belongs on card |
| `tracking_state` | Option<String> | **MOVE** | Card | Sourced from Shopify tracking; belongs on card |
| `discord_username` | Option<String> | KEEP | Recipient | GH Project column (to be added) |
| `discord_user_id` | Option<String> | KEEP | Recipient | GH Project column (to be added) |
| `unresolved_fields` | Vec<String> | KEEP | Recipient | Validation warnings during ingest |
| `discrepancy` | RecipientDiscrepancy | KEEP | Recipient | UI warning state |
| `manual_override` | RecipientOverride | KEEP | Recipient | User-set discord_username override |
| `version` | i64 | KEEP | Recipient | Optimistic concurrency |

**New fields needed on Recipient** (from CONTEXT.md decisions):

| Field | Type | Source | Notes |
|-------|------|--------|-------|
| `purpose` | Option<String> | GH Project "Purpose" column | Drives colored ring in UI |
| `vision_rx_od` | Option<String> | GH Project "Vision Rx - OD" | Right eye prescription |
| `vision_rx_os` | Option<String> | GH Project "Vision Rx - OS" | Left eye prescription |
| `product_names` | Vec<String> | GH Project parallel-array column | Product names assigned to recipient |
| `product_shopify_urls` | Vec<String> | GH Project parallel-array column | Shopify product URLs (index-aligned with product_names) |
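
The index alignment between the two parallel-array columns can be sketched as below. This is a minimal illustration, not code from the repository: `AssignedProduct` and `zip_product_columns` are hypothetical names, and it assumes that an empty URL entry marks a product without a Shopify link and that a length mismatch should be rejected rather than silently truncated.

```rust
/// One product assigned to a recipient, rebuilt from the two parallel columns.
/// Hypothetical type, used only for this sketch.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct AssignedProduct {
    pub name: String,
    /// None when the product has no Shopify link (assumed: empty URL entry).
    pub shopify_url: Option<String>,
}

/// Pairs `names[i]` with `urls[i]`.
/// Assumption: a length mismatch means the columns are out of sync and the
/// row should be reported, not partially ingested.
pub fn zip_product_columns(
    names: &[String],
    urls: &[String],
) -> Result<Vec<AssignedProduct>, String> {
    if names.len() != urls.len() {
        return Err(format!(
            "parallel arrays out of sync: {} names vs {} urls",
            names.len(),
            urls.len()
        ));
    }
    Ok(names
        .iter()
        .zip(urls.iter())
        .map(|(name, url)| {
            let url = url.trim();
            AssignedProduct {
                name: name.trim().to_string(),
                shopify_url: if url.is_empty() {
                    None
                } else {
                    Some(url.to_string())
                },
            }
        })
        .collect())
}
```

Rejecting mismatched lengths keeps a shifted index from attaching the wrong URL to a product name.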

**GH Project column mapping** (current `project_mapping.rs` behavior):

| GH Column | Maps to | How |
|-----------|---------|-----|
| `Title` | `recipient_key` (fallback) | Used when Shopify Profile URL absent |
| `Shopify Profile URL` | `shopify_profile_url` on GithubMappedRecipient | customer_id extracted via URL parse |
| `github_profile_url` | NOT USED — column does not exist | Was an agent assumption |
| `discord_username` | `discord_username` | Direct field read |
| `discord_user_id` | `discord_user_id` | Direct field read |

### Layer 2: CARDS

**Currently in code** — `crates/app/src/service_client.rs` `RecipientCardSnapshot`:

| Field | Type | Status | Source | Notes |
|-------|------|--------|--------|-------|
| `recipient_id` | String | KEEP, RENAME | GH Title or Shopify URL | Rename to `recipient_key` per decisions |
| `recipient_name` | String | KEEP, RENAME | GH Title | Rename to display name; `recipient_key` is the key |
| `discord_username` | Option<String> | KEEP | Recipient (GH) | Recipient-level |
| `discord_user_id` | Option<String> | KEEP | Recipient (GH) | Recipient-level |
| `github_profile_url` | Option<String> | **REMOVE** | Does not exist | Propagated from incorrect Recipient field |
| `shopify_customer_id` | Option<String> | KEEP | Shopify order | Card-level |
| `shopify_order_id` | Option<String> | KEEP | Shopify order | Card-level |
| `email` | Option<String> | KEEP | Shopify order | Card-level |
| `shipment_status` | Option<String> | KEEP | Shopify fulfillment | Card-level (correctly here) |
| `shipment_status_date` | Option<String> | KEEP | Shopify fulfillment | Card-level |
| `item_summary` | String | REPLACE | Items | Replace with product reference list |
| `latest_note` | Option<String> | REPLACE | Manual | Replace with `notes: Vec<{date, content}>` |
| `first_item_image_hint` | Option<String> | REMOVE | Items | Retired; products shown as image squares |
| `partial_data` | bool | KEEP | Sync | Indicates incomplete sync |
| `last_updated_at` | Option<SystemTime> | KEEP | Sync | Staleness computation |
| `is_unassigned` | bool | KEEP | Sync | True when no GH recipient match |
| `unassigned_customer_name` | Option<String> | KEEP | Shopify order | Display name for unassigned cards |

**New fields needed on Card** (architecture):

| Field | Type | Source | Notes |
|-------|------|--------|-------|
| `recipient_key` | String | GH Project | Renamed from `recipient_id` conceptually |
| `product_refs` | Vec<ProductRef> | GH Issue body | Replaces `item_summary` |
| `notes` | Vec<NoteEntry {date, content}> | GH Issue comments | Replaces `latest_note` |
| `github_issue_id` | Option<String> | GH Issue | Cloud storage ref (ww-card) |

**Slint CardData struct** (current, `dashboard.slint`):

| Slint Field | Rust Source | Notes |
|-------------|-------------|-------|
| `recipient-name` | `recipient_name` | Display name |
| `status-pill` | `status_pill` (ViewModel) | Formatted shipment status |
| `status-date` | `status_date_inline` | |
| `item-summary` | `item_summary` | Pipe-separated string (to be replaced) |
| `item-squares` | `[ItemSquareData]` | One entry per product |
| `note-preview` | `note_preview` | First 80 chars of latest note |
| `archive-state` | `archive_state` as int | 0=Active, 1=TBA, 2=Archived |
| `discord-user-id` | `discord_user_id` | For avatar fetch |
| `shopify-customer-url` | Built from customer_id + store_slug | |
| `shopify-order-url` | Built from order_id + store_slug | |
| `package-id` | `package_id` | For item CRUD operations |
| `active-item-id` | `active_item_id` | Currently unused |
| `is-unassigned` | `is_unassigned` | |
| `unassigned-customer-name` | `unassigned_customer_name` | |
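
The two URL fields built from an ID plus `store_slug` might look like the sketch below. The function names are hypothetical. The customer path follows the URL shape accepted by `extract_customer_id_from_shopify_profile_url`; the `orders` path segment is an assumption and should be checked against the real binding code.

```rust
/// Builds the Shopify admin customer URL for a card.
/// Shape taken from the profile URLs the extractor accepts.
pub fn shopify_customer_url(store_slug: &str, customer_id: &str) -> String {
    format!("https://admin.shopify.com/store/{store_slug}/customers/{customer_id}")
}

/// Builds the Shopify admin order URL for a card.
/// Assumption: orders live under the `orders` path segment.
pub fn shopify_order_url(store_slug: &str, order_id: &str) -> String {
    format!("https://admin.shopify.com/store/{store_slug}/orders/{order_id}")
}
```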

### Layer 3: PRODUCTS (new layer, not yet in code)

Per CONTEXT.md decisions, this is a NEW data layer that does not yet exist in code. The document must define its full schema for future implementers.

**Product entity** (target architecture):

| Field | Type | Source | Notes |
|-------|------|--------|-------|
| `product_id` | String | Generated | UUID or slug |
| `name` | String | GH Issue body or manual | Display name |
| `image_url` | Option<String> | GH Issue body | Product image |
| `shopify_product_url` | Option<String> | GH Issue body | Optional Shopify link |
| `is_serializable` | bool | GH Issue body | Drives serial instance tracking |
| `github_issue_id` | String | GH Issue | Cloud storage ref (ww-product) |

**GH Issue storage** (ww-product):
- Label: `ww-product`
- Repo: `BigscreenVR/beyond-outgoing`
- Body: structured YAML or JSON with all fields above

### Layer 4: SERIAL INSTANCES (sub-entity of Products, not yet in code)

| Field | Type | Notes |
|-------|------|-------|
| `serial_id` | String | e.g. "PC-001" |
| `product_id` | String | FK to parent product |
| `state` | SerialInstanceState enum | Lifecycle state |
| `assigned_card_id` | Option<String> | FK to card (null when Available) |
| `assigned_at` | Option<DateTime> | When assignment occurred |

**SerialInstanceState lifecycle:**
```
Created -> Assigned -> In Transit -> Delivered -> Return In Transit -> Returned -> Available
```
Constraint: Only one active assignment per serial instance at any time. "Active" covers every state from Assigned up to, but not including, Returned: Assigned, In Transit, Delivered, and Return In Transit.
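
A minimal sketch of the lifecycle as a Rust enum follows. Nothing here exists in code yet. Three points are assumptions made for illustration: `Available` loops back to `Assigned` on re-assignment, "active" is read as Assigned through Return In Transit, and a new assignment is only legal from `Created` or `Available`.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SerialInstanceState {
    Created,
    Assigned,
    InTransit,
    Delivered,
    ReturnInTransit,
    Returned,
    Available,
}

impl SerialInstanceState {
    /// The single legal successor of each state.
    /// Assumption: `Available` loops back to `Assigned` on re-assignment.
    pub fn next(self) -> Self {
        match self {
            Self::Created => Self::Assigned,
            Self::Assigned => Self::InTransit,
            Self::InTransit => Self::Delivered,
            Self::Delivered => Self::ReturnInTransit,
            Self::ReturnInTransit => Self::Returned,
            Self::Returned => Self::Available,
            Self::Available => Self::Assigned,
        }
    }

    /// True while the instance counts as actively assigned
    /// (assumed reading: Assigned up to, but not including, Returned).
    pub fn is_active(self) -> bool {
        matches!(
            self,
            Self::Assigned | Self::InTransit | Self::Delivered | Self::ReturnInTransit
        )
    }

    /// Assumption: a new assignment may start only from Created or Available.
    pub fn can_assign(self) -> bool {
        matches!(self, Self::Created | Self::Available)
    }
}
```

Checking `can_assign()` before setting `assigned_card_id` is one way to enforce the one-active-assignment constraint at the call site.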

### Current Storage (to be documented as "existing" vs "target")

**Existing local persistence files** (all in `%APPDATA%/WITwhat/`):
- `config.toml` — AppConfig (shopify_store_slug, shopify_secret_ref, github_project_node_id, github_project_url)
- `archive_store.json` — ArchiveStore: HashMap<recipient_id, ArchiveRecord>
- `pending_edits.json` — PendingEditQueue: Vec<PendingEdit> (SaveNote | AddItem | RemoveItem | RenameItem)

**Current in-memory storage** (non-persistent):
- `Repository` — HashMap-backed: recipients, packages, items, conflict_notes
- `LiveClient.unassigned_cards` — Arc<Mutex<Vec<RecipientCardSnapshot>>>
- `LiveClient.recipient_list` — Arc<Mutex<Vec<RecipientListEntry>>>

**Target storage** (SQLite full-mirror — documented but not yet implemented):
- Single `.db` file at `%APPDATA%/WITwhat/witwhat.db`
- Tables: recipients, cards, products, serial_instances, notes, archive_records, pending_edits, sync_audit_log
- App always reads from SQLite; background sync writes to SQLite

### Sync Pipeline (current implementation)

The sync pipeline in `run_sync_cycle` (live_client.rs) is Shopify-first:

```
GH Project rows
    |
    v
map_rows() -> Vec<GithubMappedRecipient>
    |
    |  keyed by: extract_customer_id_from_shopify_profile_url(shopify_profile_url)
    v
HashMap<customer_id, recipient_idx>
    +
Shopify orders_by_tag("wit-what")
    |
    v  for each order:
   order.customer_id in map?
    YES -> matched RecipientCardSnapshot (recipient_key from GH)
    NO  -> unassigned RecipientCardSnapshot (unassigned:order:{id})
    |
    v
(all_snapshots, recipient_list) returned
    |
    v
slint::invoke_from_event_loop -> UI update
```
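
The matching step in the diagram can be sketched in isolation as follows. The structs are trimmed-down, hypothetical stand-ins (`MappedRecipient`, `Order`, `CardStub`), not the real `GithubMappedRecipient` or `RecipientCardSnapshot`, and the sketch assumes the customer ID has already been extracted from the Shopify Profile URL.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a mapped GH Project row.
pub struct MappedRecipient {
    pub recipient_key: String,
    pub shopify_customer_id: Option<String>,
}

/// Hypothetical stand-in for a Shopify order tagged "wit-what".
pub struct Order {
    pub id: String,
    pub customer_id: Option<String>,
}

#[derive(Debug, PartialEq, Eq)]
pub struct CardStub {
    pub recipient_key: String,
    pub shopify_order_id: String,
    pub is_unassigned: bool,
}

/// Produces one card per order. Recipients without orders produce no card.
pub fn match_orders(recipients: &[MappedRecipient], orders: &[Order]) -> Vec<CardStub> {
    // customer_id -> index into `recipients`
    let by_customer: HashMap<&str, usize> = recipients
        .iter()
        .enumerate()
        .filter_map(|(idx, r)| r.shopify_customer_id.as_deref().map(|id| (id, idx)))
        .collect();

    orders
        .iter()
        .map(|order| {
            let matched = order
                .customer_id
                .as_deref()
                .and_then(|id| by_customer.get(id))
                .map(|&idx| &recipients[idx]);
            match matched {
                // Order belongs to a known recipient: use the GH key.
                Some(r) => CardStub {
                    recipient_key: r.recipient_key.clone(),
                    shopify_order_id: order.id.clone(),
                    is_unassigned: false,
                },
                // No recipient match: synthesize the unassigned key.
                None => CardStub {
                    recipient_key: format!("unassigned:order:{}", order.id),
                    shopify_order_id: order.id.clone(),
                    is_unassigned: true,
                },
            }
        })
        .collect()
}
```

Because the outer loop iterates over orders, a GH recipient with no matching order never yields a card. That is the Shopify-first property.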

**Shipment status derivation** (current, in run_sync_cycle):

| Shopify fulfillment_status | Displayed as |
|----------------------------|--------------|
| `"fulfilled"` | `"Delivered"` |
| `"partial"` | `"In Transit"` |
| `"restocked"` | `"Returned"` |
| Other string | That string verbatim |
| None (null) | `"Unfulfilled"` |

**Transformation pipeline** (layer by layer):

```
GH Project row (HashMap<String, String>)
    -> GithubMappedRecipient (project_mapping.rs)
    -> Recipient domain model (core/domain/recipient.rs)  [via Repository]
    -> RecipientAggregate (service/db/models.rs)           [with packages + items]
    -> service RecipientSnapshot (service/api/recipients.rs)
    -> app RecipientCardSnapshot (service_client.rs)       [via map_snapshot in live_client.rs]
    -> DashboardCardViewModel (dashboard/view_model.rs)    [via project_snapshot in projection.rs]
    -> Slint CardData (dashboard.slint)                    [via main.rs binding]
```

**Write-back paths** (current):
- GH Project write-back: `update_field_text` via `GithubProjectClient` (after recipient picker assignment)
- Note save: `save_note` via `DashboardDataClient` -> `service::api::items::save_package_note` -> Repository
- Item add/remove/rename: `add_item` / `remove_item` / `rename_item` via `DashboardDataClient`

---

## Architecture Patterns

### Recommended DATA-FLOW.md Document Structure

```
# DATA-FLOW.md

## AGENT RULES (read this section first)
  - Critical corrections and prohibitions
  - Three mandatory rules summary

## Overview: Four Data Layers
  - ASCII diagram of the full system

## Layer 1: Recipients
  - Source, fields table, sync direction, rules

## Layer 2: Cards
  - Source, fields table, sync direction, rules

## Layer 3: Products
  - Source, fields table, sync direction, rules

## Layer 4: Serial Instances
  - Source, lifecycle, rules

## Sync Pipeline
  - ASCII flow diagram
  - Step-by-step description
  - Shipment status mapping table

## Storage Architecture
  - Current state (files + in-memory)
  - Target state (SQLite schema philosophy)
  - Offline mode behavior
  - Conflict resolution policy

## Transformation Pipeline
  - Layer-by-layer transformation diagram
  - Key struct at each layer

## Prescriptive Rules
  - Data ownership rules
  - Source authority rules
  - Agent obligation rules
```

### ASCII Diagram Style

Use box-and-arrow diagrams. Unicode box-drawing characters render correctly in most markdown viewers and on GitHub; ASCII `+--+` boxes are the fallback for plaintext-only contexts. Because this document lives in `.planning/` and is viewed in editors and on GitHub, prefer Unicode box-drawing.

**Recommended style (Unicode box-drawing):**
```
┌─────────────────────┐     ┌─────────────────┐
│  GitHub Project     │────>│  GH CLI ingest  │
│  (source of truth)  │     └────────┬────────┘
└─────────────────────┘              │
                                     v
                             ┌───────────────┐
                             │  SQLite cache │
                             │  (read layer) │
                             └───────────────┘
```

**ASCII fallback style (when Unicode uncertain):**
```
+---------------------+     +-----------------+
|  GitHub Project     |---->|  GH CLI ingest  |
|  (source of truth)  |     +--------+--------+
+---------------------+              |
                                     v
                             +---------------+
                             |  SQLite cache |
                             |  (read layer) |
                             +---------------+
```

Slint renders the app UI, not these docs. The file lives in `.planning/` and is read in editors and on GitHub, so GitHub markdown rendering is the target and Unicode box-drawing is safe to use. Confidence: HIGH (verified by reading current `.planning/` docs).

### Prescriptive Rules Section Patterns

**Best practice from architecture documentation:** Put corrections and prohibitions in a dedicated, highly visible section at the TOP of the document, not buried in per-layer sections. Agents scan documents, so whatever comes first is what they are most likely to act on. Inline rules within each layer section should be short reminders with a "see AGENT RULES" pointer.

**Effective rule format:**
```
### RULE-01: github_profile_url does not exist
**Status:** PROHIBITED
**Background:** Early agents introduced `github_profile_url` assuming it was a GH Project
column. It is not and has never been. The field is present in some structs as technical debt.
**Action:** Never add or read this field. When refactoring touching Recipient or
RecipientCardSnapshot, remove it.
```

This format (rule ID, status label, background, action) makes it machine-parseable and scannable.
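
As a rough illustration of "machine-parseable", a rule heading in this format can be split into ID and title with a few lines of Rust. `parse_rule_heading` is a hypothetical helper, not something in the codebase, and it assumes headings always take the form `### RULE-<digits>: <title>`.

```rust
/// Parses a heading such as `### RULE-01: github_profile_url does not exist`
/// into (rule id, title). Returns None for any other line.
pub fn parse_rule_heading(line: &str) -> Option<(String, String)> {
    let rest = line.trim().strip_prefix("### ")?;
    let (id, title) = rest.split_once(':')?;
    // The ID must be "RULE-" followed by one or more ASCII digits.
    let digits = id.strip_prefix("RULE-")?;
    if digits.is_empty() || !digits.chars().all(|c| c.is_ascii_digit()) {
        return None;
    }
    Some((id.to_string(), title.trim().to_string()))
}
```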

### SQLite Schema Design Patterns

For a full-mirror cache of API data with normalized tables:

**Schema philosophy rules for this project:**
1. Every entity gets its own normalized table (not JSON blob columns).
2. Foreign keys enforced with `ON DELETE CASCADE` for dependent entities (notes -> cards, serial_instances -> products). SQLite only enforces foreign keys when `PRAGMA foreign_keys = ON` is set on the connection.
3. A `sync_audit_log` table captures every write with timestamp and source (GH, Shopify, or Manual) for conflict resolution.
4. Conflict resolution is last-write-wins: the higher `synced_at` timestamp wins. The losing value goes into `sync_audit_log`.
5. Offline edits enqueued in a `pending_edits` table with status (Pending | Flushing | Failed). On reconnect, flush in order.
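
Rule 4 can be sketched as a pure function. The types below (`FieldWrite`, `AuditEntry`, `Source`) are hypothetical and simplified. The sketch assumes `synced_at` is stored as epoch seconds, that ties keep the existing value, and that identical values need no audit entry; none of these are decided in CONTEXT.md.

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Source {
    Gh,
    Shopify,
    Manual,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct FieldWrite {
    pub value: String,
    /// Assumed representation: seconds since the Unix epoch.
    pub synced_at: i64,
    pub source: Source,
}

/// What would be appended to `sync_audit_log` when a value is overwritten.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct AuditEntry {
    pub field_name: String,
    pub losing_value: String,
    pub winning_value: String,
    pub losing_source: Source,
}

/// Last-write-wins: the write with the higher `synced_at` is kept.
/// Assumption: on a timestamp tie the existing value stays.
pub fn resolve(
    field_name: &str,
    existing: FieldWrite,
    incoming: FieldWrite,
) -> (FieldWrite, Option<AuditEntry>) {
    let incoming_wins = incoming.synced_at > existing.synced_at;
    if existing.value == incoming.value {
        // Nothing is overwritten, so there is nothing to audit.
        return (if incoming_wins { incoming } else { existing }, None);
    }
    let (winner, loser) = if incoming_wins {
        (incoming, existing)
    } else {
        (existing, incoming)
    };
    let audit = AuditEntry {
        field_name: field_name.to_string(),
        losing_value: loser.value,
        winning_value: winner.value.clone(),
        losing_source: loser.source,
    };
    (winner, Some(audit))
}
```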

**Proposed SQLite table list** (for the document to define):

```
recipients        -- id, recipient_key, title, purpose, vision_rx_od, vision_rx_os,
                  --   discord_username, discord_user_id, shopify_customer_id,
                  --   shopify_profile_url, github_item_id, synced_at

cards             -- id, recipient_key (FK nullable), shopify_order_id, shopify_customer_id,
                  --   is_unassigned, unassigned_customer_name, shipment_status,
                  --   shipment_status_date, email, github_issue_id, synced_at

card_products     -- card_id (FK), product_id (FK), serial_id (FK nullable)
                  --   (junction table linking cards to their assigned products/serials)

notes             -- id, card_id (FK), date, content, synced_at

products          -- id, name, image_url, shopify_product_url, is_serializable,
                  --   github_issue_id, synced_at

serial_instances  -- id, product_id (FK), serial_number, state, assigned_card_id (FK nullable),
                  --   assigned_at, synced_at

archive_records   -- recipient_id, state (0=Active,1=TBA,2=Archived), archived_at,
                  --   auto_archived, manual_unarchive_override

pending_edits     -- id, kind, payload_json, status, created_at, attempted_at, error

sync_audit_log    -- id, entity_type, entity_id, field_name, old_value, new_value,
                  --   written_at, source (GH|Shopify|Manual)
```
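
To make the "normalized tables with foreign keys" philosophy concrete, here is a hedged sketch of DDL for two of the proposed tables, held in a Rust constant so it could be handed to whichever SQLite binding is eventually chosen. The constant name, column types, and constraints are assumptions. Only the table and column names come from the list above, and the `cards` table is abbreviated.

```rust
/// Hypothetical DDL (SQLite dialect) for the `cards` and `notes` tables.
/// Column types and NOT NULL choices are illustrative assumptions.
pub const CARDS_AND_NOTES_DDL: &str = "
PRAGMA foreign_keys = ON;

CREATE TABLE IF NOT EXISTS cards (
    id               INTEGER PRIMARY KEY,
    recipient_key    TEXT,
    shopify_order_id TEXT,
    is_unassigned    INTEGER NOT NULL DEFAULT 0,
    shipment_status  TEXT,
    synced_at        TEXT
);

CREATE TABLE IF NOT EXISTS notes (
    id        INTEGER PRIMARY KEY,
    card_id   INTEGER NOT NULL REFERENCES cards(id) ON DELETE CASCADE,
    date      TEXT NOT NULL,
    content   TEXT NOT NULL,
    synced_at TEXT
);
";
```

The `PRAGMA` line matters because SQLite leaves foreign-key enforcement off by default; without it the `ON DELETE CASCADE` clause has no effect.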

---

## Don't Hand-Roll

| Problem | Don't Build | Use Instead | Why |
|---------|-------------|-------------|-----|
| Conflict detection | Custom timestamp comparison code | Standard last-write-wins with audit log | Complexity scales badly with number of fields |
| ASCII diagrams | Custom diagram renderer | Literal text in the .md file | This is documentation, not code |
| Field validation | Custom parser for GH column values | Existing `map_rows` skip+warn pattern | Already implemented; extend it |
| Note ordering | Custom date sort | Sort by `date` field descending; latest first | Simple and deterministic |

---

## Common Pitfalls

### Pitfall 1: Reintroducing github_profile_url
**What goes wrong:** An agent reads the Recipient struct, sees `github_profile_url`, and assumes it maps to a real GH Project column. It queries or writes that column name, gets no data, and silently produces empty-string bugs.
**Why it happens:** The field exists in multiple structs (Recipient, RecipientCardSnapshot, DashboardCardViewModel, service RecipientSnapshot) as technical debt from early design.
**How to avoid:** The DATA-FLOW.md AGENT RULES section must be the first thing agents read. The prohibition must be explicit with "PROHIBITED" status.
**Warning signs:** Any code that reads `row.fields.get("github_profile_url")` outside of the existing legacy path in project_mapping.rs.

### Pitfall 2: Putting shipment_status on Recipient
**What goes wrong:** A future agent adds shipment tracking to the Recipient domain model because it already has `shipment_status` and `tracking_state` fields there. This creates a data ownership ambiguity where the UI might show stale recipient-level status instead of the authoritative card-level (Shopify) status.
**Why it happens:** Both fields exist on `Recipient` today (they are technical debt). Agents copy existing patterns.
**How to avoid:** Document explicitly that these fields are on Recipient only as migration residue. All shipment status reads must come from cards (Shopify data).

### Pitfall 3: Using item_summary as the product reference
**What goes wrong:** An agent builds product-related UI features by reading `item_summary` (a pipe-separated string) and splitting on `|`. This is fragile, loses ordering, and cannot carry product IDs.
**Why it happens:** `item_summary` is currently used everywhere in the pipeline. It's the easy path.
**How to avoid:** Document that `item_summary` is deprecated architectural debt. New product-aware code must use the product reference list.

### Pitfall 4: Treating latest_note as the full notes history
**What goes wrong:** An agent implementing note history reads `latest_note` (a single string) and cannot find older notes. Or an agent writes only `latest_note` on a new card, discarding the full notes collection that lives in the GH Issue comments.
**Why it happens:** `latest_note` is present in the current pipeline at every layer.
**How to avoid:** Document that `latest_note` is a UI convenience field holding only the most recent note; the card preview (`note_preview`) shows its first 80 chars. The authoritative notes are `Vec<NoteEntry {date, content}>` sourced from GH Issue comments.
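
A small sketch of how the preview could be derived from the collection rather than stored separately. `NoteEntry` mirrors the `{date, content}` shape from the decisions; representing `date` as an ISO-8601 string (so that lexical order equals chronological order) is an assumption, and the helper names are hypothetical.

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct NoteEntry {
    /// Assumed to be an ISO-8601 date string, e.g. "2026-03-22".
    pub date: String,
    pub content: String,
}

/// Most recent note by `date`, or None when the card has no notes.
pub fn latest_note(notes: &[NoteEntry]) -> Option<&NoteEntry> {
    notes.iter().max_by(|a, b| a.date.cmp(&b.date))
}

/// Preview text for the card: first 80 characters of the latest note.
/// Counts characters, not bytes, so multi-byte text is never split.
pub fn note_preview(notes: &[NoteEntry]) -> Option<String> {
    latest_note(notes).map(|n| n.content.chars().take(80).collect())
}
```

Deriving the preview on read leaves a single source of truth for notes, so no code path can write only the scalar.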

### Pitfall 5: Reading from Repository instead of SQLite
**What goes wrong:** Once SQLite is implemented, a new agent writes a feature that calls `repo.load_recipient_aggregate()` (the in-memory Repository) instead of the SQLite read path. The data is stale or empty.
**Why it happens:** Repository is a concrete struct with convenient methods. Agents follow the existing pattern.
**How to avoid:** The AGENT RULES section must state: "SQLite is the single read source. Never read from in-memory Repository in production code."

### Pitfall 6: Misreading the Shopify-first pipeline as GH-first
**What goes wrong:** An agent adds a GH-only recipient to the cards list (bypassing Shopify orders). This creates a card with no shipment data, causing confusing UI state.
**Why it happens:** The pre-Phase-12.1.1 pipeline was GH-first. Documentation references to the old pipeline still exist.
**How to avoid:** The sync pipeline section must be unambiguous: cards exist only because a Shopify order tagged "wit-what" exists. GH recipients without matching orders do NOT produce cards.

---

## Code Examples

### Shipment Status Derivation (current, authoritative)
```rust
// Source: crates/app/src/live_client.rs run_sync_cycle()
let shipment_status = Some(match order.fulfillment_status.as_deref() {
    Some("fulfilled") => "Delivered".to_string(),
    Some("partial")   => "In Transit".to_string(),
    Some("restocked") => "Returned".to_string(),
    Some(other)       => other.to_string(),
    None              => "Unfulfilled".to_string(),
});
```

### customer_id Extraction from Shopify Profile URL
```rust
// Source: crates/app/src/live_client.rs
pub fn extract_customer_id_from_shopify_profile_url(url: &str) -> Option<String> {
    url.trim_end_matches('/')
        .split('/')
        .next_back()
        .filter(|s| !s.is_empty() && s.chars().all(|c| c.is_ascii_digit()))
        .map(|s| s.to_string())
}
// Accepts: https://admin.shopify.com/store/mystore/customers/12345
// Returns: Some("12345")
```

### Recipient Key Derivation (current GH mapping)
```rust
// Source: crates/integrations/src/github/project_mapping.rs
// Priority in code: github_profile_url username > Title field.
// The github_profile_url column does NOT exist in the current GH Project,
// so in practice Title is always used (the Shopify-first pipeline).
// Effective behavior, paraphrased (`row.fields` is a HashMap<String, String>):
let recipient_key: Option<String> = row
    .fields
    .get("Title")
    .map(|title| title.trim().to_string());
```

### Archive State Three-State Enum
```rust
// Source: crates/app/src/dashboard/archive.rs
pub enum ArchiveState {
    Active,        // Normal, fully visible
    ToBeArchived,  // Dimmed, will become Archived after 12h or on restart
    Archived,      // Hidden by default (show_archived=false hides these)
}
// Auto-archive trigger: status_pill == "Returned"
// TBA -> Archived: on app restart (unconditional) or after 12h mid-session
```
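
The transitions described in the comments above can be written out as three small functions. This is a sketch under assumptions, not the code in `archive.rs`: it assumes the "Returned" trigger moves a card from Active to ToBeArchived (not straight to Archived), it ignores `manual_unarchive_override`, and the function names are hypothetical.

```rust
use std::time::Duration;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ArchiveState {
    Active,
    ToBeArchived,
    Archived,
}

/// Grace period before a ToBeArchived card becomes Archived mid-session.
const TBA_GRACE: Duration = Duration::from_secs(12 * 60 * 60);

/// Applied after each sync. Assumption: "Returned" moves Active to ToBeArchived.
pub fn on_sync(state: ArchiveState, status_pill: &str) -> ArchiveState {
    match state {
        ArchiveState::Active if status_pill == "Returned" => ArchiveState::ToBeArchived,
        other => other,
    }
}

/// Applied once at app start: every ToBeArchived card is archived unconditionally.
pub fn on_restart(state: ArchiveState) -> ArchiveState {
    match state {
        ArchiveState::ToBeArchived => ArchiveState::Archived,
        other => other,
    }
}

/// Applied periodically mid-session with the time spent in ToBeArchived.
pub fn on_tick(state: ArchiveState, in_tba_for: Duration) -> ArchiveState {
    match state {
        ArchiveState::ToBeArchived if in_tba_for >= TBA_GRACE => ArchiveState::Archived,
        other => other,
    }
}
```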

### PendingEdit Variants (current offline queue)
```rust
// Source: crates/app/src/dashboard/edit_queue.rs
pub enum PendingEdit {
    SaveNote    { package_id: String, note: String },
    AddItem     { package_id: String, display_name: String, image_hint: Option<String>, shopify_product_id: Option<String> },
    RemoveItem  { item_id: String },
    RenameItem  { item_id: String, new_display_name: String },
}
// Serialized with #[serde(tag = "kind")]. Persisted to pending_edits.json.
// On successful sync, drain_all() is called to flush the queue.
```
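
For the target `pending_edits` table, flushing "in order" with a per-edit status could look like the sketch below. `QueuedEdit`, `EditStatus`, and `flush_in_order` are hypothetical names. Stopping at the first failure, so that a later edit is never applied ahead of an earlier one, is an assumed policy, not a locked decision.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum EditStatus {
    Pending,
    Flushing,
    Failed,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct QueuedEdit {
    pub id: u64,
    /// Stand-in for a serialized PendingEdit.
    pub payload: String,
    pub status: EditStatus,
}

/// Sends queued edits oldest-first and removes each one that succeeds.
/// Assumption: stop at the first failure to preserve ordering.
/// Returns the number of edits flushed.
pub fn flush_in_order<F>(queue: &mut Vec<QueuedEdit>, mut send: F) -> usize
where
    F: FnMut(&QueuedEdit) -> Result<(), String>,
{
    let mut flushed = 0;
    while !queue.is_empty() {
        queue[0].status = EditStatus::Flushing;
        match send(&queue[0]) {
            Ok(()) => {
                queue.remove(0);
                flushed += 1;
            }
            Err(_) => {
                // Leave the edit at the head of the queue for the next attempt.
                queue[0].status = EditStatus::Failed;
                break;
            }
        }
    }
    flushed
}
```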

---

## State of the Art

| Old Approach | Current Approach | When Changed | Impact on Document |
|--------------|------------------|--------------|-------------------|
| GH-first pipeline (recipients drove cards) | Shopify-first pipeline (orders drive cards) | Phase 12.1.1 | Document must describe CURRENT pipeline only; old pipeline is dead |
| `latest_note` scalar | Notes collection `{date, content}[]` | Architecture target (Phase 13) | Document the target; mark scalar as migration debt |
| `item_summary` pipe-string | Product reference list | Architecture target | Document the target; mark pipe-string as migration debt |
| In-memory Repository | SQLite full-mirror cache | Architecture target | Document the target; current state is "transitional" |
| JSON files (archive_store.json, pending_edits.json) | SQLite tables | Architecture target | Files migrate into SQLite; document both current and target |

**Deprecated/outdated:**
- `first_item_image_hint` on RecipientCardSnapshot: retired; products shown as item-squares
- `email` field on Recipient domain model: was never populated (see Phase 08 decision); lives only on Shopify orders
- `package_id` / `active_item_id` on CardData: current placeholder for item CRUD; will be superseded by product reference model

---

## Open Questions

1. **Serial instance cloud storage location within GH Issue**
   - What we know: Per CONTEXT.md, serial instances are stored "within the parent product's GH Issue (body or structured comment TBD during planning)"
   - What's unclear: Whether instances go in the Issue body (structured JSON/YAML block) or as individual comments (one per instance lifecycle event). Comments give audit trail but complicate bulk read.
   - Recommendation: Document both options in DATA-FLOW.md with a "TBD during GH Issues implementation phase" note. The architecture doc should describe what must be represented; the exact encoding is a Phase 14+ decision.

2. **recipient_key vs recipient_id naming**
   - What we know: CONTEXT.md says `recipient_name` is renamed to `recipient_key` and cards reference a valid recipient key. But the codebase uses `recipient_id` everywhere as the primary key on Recipient.
   - What's unclear: Is `recipient_key` the human-readable key (GH username / Title) and `recipient_id` the system UUID? Or are they the same concept with a rename?
   - From the code: `recipient_id` IS the human-readable key (it is set to `github_recipient.recipient_key.clone()` in run_sync_cycle). There is no UUID.
   - Recommendation: Document that `recipient_id` and `recipient_key` are the same field. In future architecture (SQLite), add a true UUID primary key and preserve `recipient_key` as the human-readable identifier.

3. **GH Issue body format for ww-card and ww-product**
   - What we know: Issues tagged `ww-card` store card metadata; issues tagged `ww-product` store product definitions.
   - What's unclear: The exact structured format (JSON front-matter? YAML block? Custom format?) is not yet decided.
   - Recommendation: Document the required fields that must be representable; leave the exact encoding format as a "TBD during GH Issues implementation phase" note with a placeholder example.

---

## Validation Architecture

`.planning/config.json` does not set `workflow.nyquist_validation`; because the key is absent, validation is treated as enabled. Phase 13 produces only documentation files, so there are no Rust code changes and no new tests are needed.

### Test Framework
| Property | Value |
|----------|-------|
| Framework | cargo test (Rust built-in) |
| Config file | none (uses workspace defaults) |
| Quick run command | `cargo test --workspace` |
| Full suite command | `cargo test --workspace` |

### Phase Requirements -> Test Map
This phase produces `.planning/DATA-FLOW.md` and a CLAUDE.md update. There are no behavioral code changes and no test mappings. The verification criterion is document existence and content review.

| Req ID | Behavior | Test Type | Automated Command | File Exists? |
|--------|----------|-----------|-------------------|-------------|
| DOC-01 | DATA-FLOW.md exists at .planning/DATA-FLOW.md | manual | `ls .planning/DATA-FLOW.md` | Wave 0 |
| DOC-02 | CLAUDE.md contains DATA-FLOW.md reference | manual | `grep -l DATA-FLOW CLAUDE.md` | Wave 0 |

### Wave 0 Gaps
None — this phase creates documentation only. No test infrastructure is needed.

---

## Sources

### Primary (HIGH confidence)
All findings are derived from direct codebase reads. Every struct, field, and function name cited was read from source.

- `crates/core/src/domain/recipient.rs` — Recipient, RecipientDiscrepancy, RecipientOverride
- `crates/core/src/domain/package.rs` — Package
- `crates/core/src/domain/item.rs` — Item, OwnershipMode
- `crates/app/src/service_client.rs` — RecipientCardSnapshot, DashboardDataClient, ItemCatalogEntry, RecipientSummary
- `crates/app/src/dashboard/view_model.rs` — DashboardCardViewModel
- `crates/app/src/dashboard/projection.rs` — project_snapshot, transformation logic
- `crates/app/src/live_client.rs` — run_sync_cycle, map_snapshot, RecipientListEntry, SyncUpdateCallback
- `crates/app/src/config.rs` — AppConfig, config paths
- `crates/app/src/dashboard/archive.rs` — ArchiveState, ArchiveRecord, ArchiveStore
- `crates/app/src/dashboard/edit_queue.rs` — PendingEdit, PendingEditQueue
- `crates/integrations/src/github/project_client.rs` — GithubProjectRow, GithubProjectClient
- `crates/integrations/src/github/project_mapping.rs` — GithubMappedRecipient, map_rows
- `crates/integrations/src/shopify/order_fulfillment_client.rs` — ShopifyOrder, ShopifyOrderFull, Fulfillment, TrackingEvent
- `crates/service/src/db/repository.rs` — Repository (in-memory)
- `crates/service/src/db/models.rs` — RecipientAggregate
- `crates/service/src/api/recipients.rs` — RecipientSnapshot, get_recipient_snapshot
- `crates/app/ui/dashboard.slint` — CardData struct (Slint)
- `.planning/phases/13-data-flow-architecture-document/13-CONTEXT.md` — User decisions

### Secondary (MEDIUM confidence)
- `.planning/STATE.md` — Historical decisions and phase decisions log
- `.planning/REQUIREMENTS.md` — v1 requirement definitions

---

## Metadata

**Confidence breakdown:**
- Current field inventory: HIGH — read directly from source
- Architecture corrections (github_profile_url, shipment_status on recipient): HIGH — confirmed by code and CONTEXT.md
- Target architecture (SQLite, product layer, serial instances): HIGH — locked decisions in CONTEXT.md
- ASCII diagram style recommendations: MEDIUM — based on markdown rendering conventions
- SQLite table design: MEDIUM — based on decisions; exact schema is Claude's discretion

**Research date:** 2026-03-22
**Valid until:** 2026-04-22 (stable domain; expires if codebase structs are refactored before Phase 13 executes)
