# Pitfalls Research

**Domain:** Adding SQLite persistence, GH Issues cloud storage, product catalog, serial tracking, and offline sync to an existing Rust/Slint desktop app
**Researched:** 2026-03-22
**Confidence:** HIGH (based on codebase inspection + established Rust/SQLite/GH API patterns)

---

## Critical Pitfalls

### Pitfall 1: Reading In-Memory Repository After SQLite Exists

**What goes wrong:**
After SQLite is introduced, `Repository` (HashMap-backed, `Arc<Mutex<Repository>>`) remains compiled into the codebase. Callers that previously read from it continue doing so. SQLite becomes a write-only sink that nobody reads from — the app still works (from stale in-memory state) so the bug is invisible until the process restarts.

**Why it happens:**
The existing `LiveClient` constructor wires `repo: Arc<Mutex<Repository>>` today. The path from sync → `repo.lock().unwrap().upsert_recipient(...)` is used everywhere. When SQLite is added alongside, developers add writes to SQLite but forget to remove/redirect the downstream reads. The `Repository` struct stays alive (it's used by `RecipientAggregate` / `get_recipient_snapshot()` in `service/api/recipients.rs`), making the code compile and appear correct.

**How to avoid:**
Treat the SQLite migration as a cut-over, not an addition. The target phase must:
1. Replace `Repository` reads with SQLite reads in one atomic refactor (not incrementally).
2. Compile-gate the old `Repository` reads behind `#[cfg(test)]` immediately after SQLite is wired.
3. RULE-03 in DATA-FLOW.md already mandates this — enforce it structurally, not just by convention.

**Warning signs:**
- `Repository` struct still has callers in `service/api/` that aren't test code after SQLite phase.
- `get_recipient_snapshot()` still touches `repo.packages` or `repo.items`.
- App shows stale item data after process restart (items added this session don't survive reboot).

**Phase to address:**
SQLite full-mirror cache phase (first SQLite phase). Must be the complete cut-over, not an additive layer.

---

### Pitfall 2: GH Issues Body Format Lock-In Before Schema Is Stable

**What goes wrong:**
The first implementation of `ww-card` GH Issue bodies uses an ad-hoc format (e.g., plain text with some structure). Later phases (serial instances, product refs, notes) need to extend this format. Parsing breaks on all existing issues, or the format forks into multiple incompatible versions across real issues in production.

**Why it happens:**
DATA-FLOW.md intentionally leaves the body format TBD ("JSON, YAML, or custom — TBD during GH Issues implementation phase"). Developers pick whatever is easiest for the first feature and don't plan for forward-compatibility. The `notes` field (GH Issue comments) and `product_refs` field (body) both need structured encoding, but they are implemented in separate phases.

**How to avoid:**
Define the canonical body format (strongly recommend JSON with a top-level `"version"` key) in the GH Issues phase before writing any real issues. Include a `schema_version` field from day one. Write a migration function that handles `schema_version` absent (legacy) → current. Test round-trip parse/emit before touching real issues.

**Warning signs:**
- GH Issue body format is decided per-feature ("just use YAML here, JSON there").
- No `schema_version` or `v:` field in the body.
- Serial instance phase parses body differently than card phase.

**Phase to address:**
GH Issues cloud storage phase (when ww-card issues are first created). The body schema must be finalized before any issues are written to production.

---

### Pitfall 3: Offline Queue Conflicts With Concurrent Sync Writes

**What goes wrong:**
A user edits a note offline (queued to `pending_edits`). While offline, a background sync reads GH Issue comments and updates the SQLite `notes` table. On reconnect, the flush writes the pending edit to GH Issues. The edit lands in GH, but the `notes` table already has a stale snapshot. The UI shows the pre-edit version for the next 5 minutes until the next sync.

More severely: two edits to the same card arrive out-of-order at GH Issues (flush races sync). The later sync cycle reads the GH state before the flush commit is visible, and overwrites the SQLite notes table with the pre-edit state — effectively losing the user's edit at the SQLite layer (even though GH has it).

**Why it happens:**
The current `pending_edits.json` + background sync thread already have a partial race. DATA-FLOW.md specifies `Pending → Flushing → Failed` states but does not specify how the sync cycle should behave while a flush is in-flight. The flush and sync run on the same background thread today (`run_sync_cycle` in `live_client.rs`) but will need coordination once SQLite is the read source.

**How to avoid:**
1. Flush pending edits **before** running the sync cycle on reconnect — in strict sequence, not concurrent.
2. After flush completes (all `Flushing → done`), run full sync to get the authoritative post-flush state from GH.
3. Do not run a sync cycle if any edit is in `Flushing` state.
4. After writing a pending edit to SQLite locally, also apply it to the in-memory `notes` / `product_refs` tables immediately so the UI reflects the edit without waiting for the next sync cycle. (Optimistic local apply.)

**Warning signs:**
- Flush and sync run concurrently (two separate spawned tasks without a gate).
- Sync cycle checks for `pending_edits` being empty but does not check `Flushing` state.
- User's note edits visually "revert" for one sync cycle after reconnect.

**Phase to address:**
Offline mode with pending edit flush-on-reconnect phase. Must define the explicit flush-then-sync sequencing rule before implementing reconnect logic.

---

### Pitfall 4: Serial Instance Reassignment Without State Machine Enforcement

**What goes wrong:**
A serial instance in state `InTransit` or `Delivered` gets reassigned to a new card. The product catalog shows a device as simultaneously assigned to two recipients. Data integrity breaks silently — the GH Issue body is updated with the new `assigned_card_id` but the old card's `product_refs` still references the serial number.

**Why it happens:**
The `SerialInstanceState` machine (Created → Assigned → InTransit → Delivered → ReturnInTransit → Returned → Available) enforces reassignment only when state is `Available`. But the enforcement point is in application code, not in the SQLite schema. When product assignment is done via GH Issue write-back, there is no database-level constraint preventing the double assignment. Offline edit flushes bypass any application-layer validation.

**How to avoid:**
1. Enforce the `assigned_card_id` constraint at the SQLite layer: add a `UNIQUE` constraint on `serial_instances.assigned_card_id` (when not NULL), or a CHECK constraint that prevents non-NULL `assigned_card_id` when state is not in `{Assigned, InTransit, Delivered, ReturnInTransit}`.
2. In application code, validate state machine transition before writing to either SQLite or GH Issues — not just one.
3. When reading serial instances from GH Issues during sync, detect any instance appearing in multiple card `product_refs` and log to `sync_audit_log` as a conflict rather than silently overwriting.

**Warning signs:**
- `card_products` table has two rows for the same `serial_id` with different `card_id` values and no constraint preventing this.
- Serial instance assignment is only validated in the UI callback, not in the sync ingest path.
- Sync cycle accepts GH Issue data without checking for duplicate serial assignments across cards.

**Phase to address:**
Serial instance tracking phase (Layer 4). SQLite constraints must be part of the initial schema, not added after the fact.

---

### Pitfall 5: Deprecated Field Removal Breaking the Transformation Pipeline Mid-Refactor

**What goes wrong:**
`item_summary`, `latest_note`, `github_profile_url`, `shipment_status` on `Recipient`, and `first_item_image_hint` are all deprecated per DATA-FLOW.md. Removing one field mid-refactor breaks the transformation pipeline: `GithubMappedRecipient → Recipient → RecipientAggregate → RecipientSnapshot → RecipientCardSnapshot → DashboardCardViewModel → Slint CardData`. Any struct that still reads the removed field fails to compile. The temptation is to remove the field from the domain model but leave a dummy value in the downstream structs — which reintroduces the field under a different name.

**Why it happens:**
The transformation pipeline has 7 layers (DATA-FLOW.md "Layer-by-Layer Diagram"). Each layer copies fields forward. A field must be removed at every layer simultaneously or the code won't compile — this feels risky, so developers remove it from the top but leave dummy values at the bottom ("I'll clean up later"). The dummy values become permanent.

**How to avoid:**
Remove deprecated fields as a complete vertical slice through all 7 layers in one PR per deprecated field. Do not remove from domain model and leave in DTO. The order should be: remove from Slint UI first (terminal layer), then ViewModel, then app DTO, then service DTO, then domain model. Bottom-up removal ensures the compiler confirms each layer is clean before moving up.

**Warning signs:**
- `github_profile_url` is removed from `Recipient` but `DashboardCardViewModel` still has the field (set to `None` or `""`).
- `item_summary` is removed from domain model but `RecipientCardSnapshot` has `item_summary: String` set to `""` as a placeholder.
- `Option::unwrap_or_default()` pattern on a field that was supposed to be removed.

**Phase to address:**
Deprecated field cleanup phase. Each field should be tracked as a separate task with a compiler-verified completion check.

---

### Pitfall 6: GH API Rate Limits on Bulk Issue Creation

**What goes wrong:**
The product catalog phase creates `ww-product` GH Issues for all existing products, and the card phase creates `ww-card` GH Issues for all existing cards, potentially in a batch. GitHub's REST API rate limit is 5,000 requests/hour for authenticated requests (or 1,000/hour for GHES). If the initial migration creates 50+ cards and 20+ products in a loop without backoff, rate limits are hit mid-migration, leaving the database in a partially-migrated state (some cards have `github_issue_id`, others don't).

**Why it happens:**
Initial migration is a one-time script that loops over all entities and creates issues sequentially. Developers assume that at small scale (50 cards, 20 products), rate limits won't apply. The real danger is the combination: create issue + add label + add body + possibly update GH Project field = 3-4 API calls per entity. At 50 cards that's 150-200 calls, which is fine for rate limits but hits the secondary rate limit (burst limit: no more than 90 requests per minute in some scenarios).

**How to avoid:**
1. Add a 200ms minimum delay between GH Issue creation calls during initial migration.
2. Check the `X-RateLimit-Remaining` header after each call; if below 100, sleep until `X-RateLimit-Reset`.
3. Write migration as idempotent: if `github_issue_id` is already set in SQLite, skip the create call.
4. On failure mid-migration, the next run resumes from where it left off (idempotency covers this).

**Warning signs:**
- GH Issue creation loop has no rate limit handling.
- Migration is not idempotent (re-running creates duplicate issues).
- `github_issue_id` is not written to SQLite atomically with the issue creation (crash between create and save = orphaned issue).

**Phase to address:**
GH Issues cloud storage phase. Rate limit handling must be in the initial implementation, not added after seeing failures.

---

### Pitfall 7: Slint Z-Order Blocking Touch Interactions on Product Item Squares

**What goes wrong:**
The product item squares (`item-squares` in `dashboard.slint`) need to be interactive (tap to view product detail, or to assign a serial number). If a full-card `TouchArea` is declared later in the `.slint` file than the item squares, it will intercept all pointer events on the squares, making them non-interactive. This already burned the project once (per MEMORY.md: "full-card TouchArea declared last blocks all child interactions").

**Why it happens:**
Slint renders and routes events in declaration order (last declared = topmost in z-order). Developers add a card-level click handler for the primary card action and declare it at the bottom of the card component for readability — unaware it will block all child elements.

**How to avoid:**
Declare full-card `TouchArea` elements **before** child interactive elements in the `.slint` file. Use `z: -1` on the background `TouchArea` if the element ordering cannot be changed. For new card interactions (product square tap, serial assignment button), ensure they are declared after any background touch areas.

**Warning signs:**
- A new `TouchArea` is added at the bottom of a card component that spans the full width/height.
- Product item squares don't respond to clicks in manual testing.
- A new button is added to the card but clicking it triggers the card's primary action instead.

**Phase to address:**
Any phase that adds new interactive elements to cards (product catalog UI, serial instance assignment UI).

---

### Pitfall 8: `Arc<Mutex<Repository>>` Deadlock During Sync-to-SQLite Transition

**What goes wrong:**
The existing `Repository` is `Arc<Mutex<Repository>>`. The background sync thread holds the lock during `upsert_recipient()` / `upsert_package()` calls (which loop over all entities). If UI callbacks also try to acquire the same lock (e.g., the item CRUD callbacks call `save_package_note` → `service::api::items::save_package_note` → `repo.lock()`), a deadlock occurs if the sync is mid-write when the callback fires.

When SQLite is introduced, this pattern must not be replicated. SQLite connections in rusqlite are not `Send` by default; using `Arc<Mutex<Connection>>` across threads is the standard workaround but reintroduces the same deadlock risk.

**How to avoid:**
1. Use a connection pool (e.g., `r2d2-sqlite` or `deadpool-sqlite`) instead of a single `Arc<Mutex<Connection>>`. Connection pools hand out connections without a global lock.
2. Keep sync writes and UI callback writes on separate connections from the pool.
3. Use SQLite WAL mode (`PRAGMA journal_mode=WAL`) so readers don't block writers.
4. Do not hold a SQLite connection lock across async boundaries or across Slint's `invoke_from_event_loop`.

**Warning signs:**
- `Arc<Mutex<rusqlite::Connection>>` is used as the single database handle passed to all threads.
- App freezes for 5 seconds during sync (UI thread waiting for the mutex).
- Deadlock is non-deterministic and only observed under load.

**Phase to address:**
SQLite full-mirror cache phase. Connection management must be decided before any SQLite code is written.

---

## Technical Debt Patterns

| Shortcut | Immediate Benefit | Long-term Cost | When Acceptable |
|----------|-------------------|----------------|-----------------|
| Keep `Repository` alive after SQLite as a fallback | Reduces scope of SQLite phase | Violates RULE-03; production reads never fully cut over; two sources diverge | Never — cut over completely |
| Use `item_summary` pipe-separated string for product display while product catalog is being built | Avoids blocking phase | `item_summary` is RULE-07 deprecated; any new code that reads it is debt that accumulates | Only if product display is gated behind a feature flag and removed in the same release |
| JSON blob for serial instance storage in GH Issue body | Simpler to read/write | Forward migrations require re-parsing all issues; no per-field audit trail | Acceptable for MVP if `schema_version` key is included from day one |
| Single `Arc<Mutex<Connection>>` for SQLite | Faster to implement than a pool | Deadlock under concurrent sync + UI write; no parallelism | Never in production; only in unit tests with single-threaded executor |
| Skip `schema_version` in GH Issue bodies | Saves 1 field | Any schema change requires querying all issues to determine their format | Never — 1 field prevents unlimited future pain |

---

## Integration Gotchas

| Integration | Common Mistake | Correct Approach |
|-------------|----------------|------------------|
| GitHub Issues REST API | Creating issue then separately adding label in a second call — if label call fails, issue exists without label and is unfindable by label query | Pass `labels: ["ww-card"]` in the initial `POST /repos/{owner}/{repo}/issues` body |
| GitHub Issues REST API | Assuming `GET /issues?labels=ww-card` returns all issues — GitHub paginates at 30 items by default | Always pass `per_page=100` and follow `Link: rel="next"` headers for pagination |
| GitHub Issues REST API | Updating issue body with `PATCH` using the full body string when only one field changed — concurrent sync + user edit overwrites each other's changes | Parse body JSON, mutate the specific field, re-serialize; do not overwrite with a stale full-body string |
| GitHub Issues REST API | Using `issue_number` (integer) vs `node_id` (string) — GraphQL needs `node_id`, REST needs `number` | Store both `github_issue_number` (int) and `github_issue_id` (node_id string) in SQLite; use the right one per API type |
| Shopify Admin REST API | Using deprecated `2021-01` API version for orders — fulfillment fields moved in `2022-04` | Always specify explicit API version in URL and target the most recent stable version |
| Windows Credential Manager (keyring) | Calling `keyring::Entry::get_password()` on the UI thread — it can show a system prompt that blocks Slint's event loop | Call credential reads on the background sync thread, never on the UI thread |

---

## Performance Traps

| Trap | Symptoms | Prevention | When It Breaks |
|------|----------|------------|----------------|
| Loading all cards from SQLite into memory on every sync update | Acceptable at 20 cards, imperceptible; at 200 cards causes noticeable UI stutter during sync | Use incremental upsert: only load changed records by comparing `synced_at` timestamps | ~100+ cards |
| Fetching all GH Issue comments for every card on every sync cycle to rebuild `notes` | Each card = 1 extra HTTP call; 50 cards = 50 extra calls per 5-minute cycle | Sync comments lazily (only when a card's issue `updated_at` has changed) or use `ETag` conditional requests | ~20+ cards with notes |
| Rebuilding entire Slint `ModelRc` on every sync | Causes all card views to re-render and flicker | Use diff-based update: add/remove/mutate individual rows in the `VecModel` rather than replacing it | Any scale — visible immediately |
| Holding SQLite write transaction open across GH API calls | GH API call takes 500ms–2s; other threads cannot write during this time | Fetch from GH, store in memory, open transaction, write, close transaction — never hold transaction across network I/O | Immediate — any network latency |

---

## Security Mistakes

| Mistake | Risk | Prevention |
|---------|------|------------|
| Writing Shopify access token to `config.toml` or `pending_edits.json` as a fallback | Token exposed in plaintext file on disk; accessible to any process running as the same user | Already prevented: token lives in Windows Credential Manager via `keyring` crate. MEMORY.md flags this. Never add a plaintext fallback. |
| Logging GH API tokens or Shopify tokens in `debug.log` | Token in logs is as bad as token in config | Sanitize all credentials before logging; log only the first 4 chars + `****` |
| GH Issue body contains `shopify_order_id` in plaintext | GH Issues in `beyond-outgoing` are readable by all org members; order IDs could expose customer data | Acceptable risk per project design (org-internal repo); but do not include customer email, name, or address in issue bodies |
| Storing `github_issue_id` in SQLite without validating it belongs to the expected repo | A crafted sync response could point to issues in a different repo | Validate that all `github_issue_id` values match the configured `beyond-outgoing` repo before storing |

---

## UX Pitfalls

| Pitfall | User Impact | Better Approach |
|---------|-------------|-----------------|
| No visual distinction between "syncing" and "offline" states | User does not know if data is fresh or stale | Show distinct icons/colors: spinning sync indicator vs. static offline icon; never show the same state for both |
| Serial assignment UI blocks card view while waiting for GH API write | User cannot glance at other cards during a slow write | Write to SQLite immediately (optimistic), queue GH write to `pending_edits`, update UI from SQLite write result |
| Product catalog shows all products without filtering to "available for assignment" | User sees already-assigned serials when trying to assign a new one | Filter serial instances to `state = Available` by default in assignment picker |
| Start Return action succeeds (queued) but no confirmation that Shopify return was actually initiated | User thinks return is done, but the pending edit is sitting in queue | Show pending state explicitly: "Return queued — waiting for sync" until the Shopify write confirms |
| Lists feature uses the same card layout as the main dashboard | Users confuse list membership with shipment status | Lists view needs a distinct visual treatment that makes the "this is a list" context clear |

---

## "Looks Done But Isn't" Checklist

- [ ] **SQLite migration:** `Repository` struct has no production callers after cutover — verify with `grep -r "repo.lock()" crates/service/src/api/` returns only test files.
- [ ] **GH Issue creation:** Every `ww-card` issue in production has a non-null `github_issue_id` in SQLite — verify no null `github_issue_id` in `cards` table after migration.
- [ ] **Deprecated fields removed:** `github_profile_url` field does not appear in any non-test Rust file — verify with `grep -r "github_profile_url" crates/` and confirm only test fixtures.
- [ ] **Serial instance state machine:** SQLite `serial_instances` table has a constraint preventing the same `serial_id` from appearing in two active `card_products` rows — verify with a test that tries to double-assign and confirms an error.
- [ ] **Offline flush sequencing:** Reconnect logic runs flush before sync — verify by reading the reconnect code path and confirming `pending_edits` with `Flushing` status blocks sync start.
- [ ] **GH Issues pagination:** The `list_issues_by_label()` function follows `Link: rel="next"` headers — verify with a mock server returning exactly 30 issues and confirming a second page request is made.
- [ ] **Rate limit handling:** GH Issue creation loop checks `X-RateLimit-Remaining` — verify with a mock that returns `remaining: 5` and confirms backoff.
- [ ] **Body schema version:** Every newly-created `ww-card` and `ww-product` issue body contains a `schema_version` key — verify in integration test.

---

## Recovery Strategies

| Pitfall | Recovery Cost | Recovery Steps |
|---------|---------------|----------------|
| In-memory Repository still read after SQLite exists | HIGH | Full audit of all `repo.lock()` callers; replace with SQLite reads; retest all sync + item CRUD paths |
| GH Issue body format incompatible across phases | HIGH | Write a migration that reads all `ww-card` and `ww-product` issues, re-parses with a lenient parser, and re-writes with canonical format; must be run on real issues before next phase ships |
| Serial instance double-assignment | MEDIUM | SQL query to find conflicting assignments; resolve by picking the most recent `assigned_at`; log to `sync_audit_log`; re-run sync to confirm clean state |
| Offline flush races with sync | MEDIUM | Add a global `FlushInProgress: AtomicBool` gate; sync cycle checks it before starting; flush sets it on start, clears on completion |
| Deprecated field left as dummy value | LOW | `grep` sweep + compiler-aided removal; low blast radius per field |
| Rate limit hit during bulk issue migration | LOW | Migration is idempotent; re-run after `X-RateLimit-Reset` time; issues already created are skipped |

---

## Pitfall-to-Phase Mapping

| Pitfall | Prevention Phase | Verification |
|---------|------------------|--------------|
| Reading in-memory Repository after SQLite exists | SQLite full-mirror cache | `grep -r "repo.lock()" src/` returns only test files |
| GH Issue body format lock-in | GH Issues cloud storage (first implementation) | Body parse/emit round-trip test on all schema fields |
| Offline queue conflicts with sync | Offline mode / pending edit flush phase | Reconnect integration test: offline edit → reconnect → sync → verify edit persists |
| Serial instance double-assignment | Serial instance tracking phase | SQL constraint test: double-assign returns error |
| Deprecated field removal breaking pipeline | Deprecated field cleanup phase | Zero non-test references to any deprecated field name |
| GH API rate limits on bulk migration | GH Issues cloud storage phase | Unit test with mock returning `X-RateLimit-Remaining: 5` confirms backoff |
| Slint z-order blocking product squares | Product catalog UI / serial assignment UI | Manual click test on item squares confirms they receive events |
| SQLite deadlock from single connection | SQLite full-mirror cache (connection setup) | Concurrent write test: sync thread + UI callback simultaneous write completes without hang |

---

## Sources

- Codebase inspection: `crates/service/src/db/repository.rs`, `crates/app/src/live_client.rs`, `crates/service/src/sync/` (direct read, 2026-03-22)
- DATA-FLOW.md authoritative rules: RULE-01 through RULE-08 (project document, 2026-03-22)
- MEMORY.md known issues: Slint z-order / TouchArea precedence, TextInput arrow key capture (project memory, current)
- GitHub REST API documentation: rate limits (5,000/hr authenticated), pagination defaults (30/page), issue creation parameters — HIGH confidence based on stable API behavior
- rusqlite threading model: not `Send` by default, `Arc<Mutex<Connection>>` pattern — HIGH confidence (well-established Rust ecosystem pattern)
- SQLite WAL mode behavior — HIGH confidence (official SQLite documentation, stable since 3.7.0)

---
*Pitfalls research for: v1.1 migration features (SQLite, GH Issues, product catalog, serial tracking, offline sync)*
*Researched: 2026-03-22*
