# Screen Timelapse MCP Server

## What This Is

An MCP (Model Context Protocol) server that enables AI agents to capture sequences of screenshots over time from specific windows, screen regions, or the full desktop, and compile them into a single grid image for visual comparison. Designed for agents debugging timing-based issues like app hangs, UI flicker, or transient visual bugs.

## Core Value

Agents can visually observe how a screen region changes over time in a single digestible image, enabling diagnosis of timing-based issues that a single screenshot cannot capture.

## Current Milestone: v1.2 Capture Profiles & Frame Repository

**Goal:** Named capture profiles, timing profiles, profile-aware capture, and frame repository with subset grid compilation.

**Target features:**
- Per-project named screenshot profiles (save/list/delete MCP tools)
- Timing profiles for capture interval presets
- Profile-aware start_capture with merge semantics
- Frame repository and subset grid compilation
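
The milestone does not spell out what "merge semantics" means for a profile-aware `start_capture`. A minimal sketch of one plausible reading follows, assuming that explicit call arguments override a saved profile, which in turn overrides built-in defaults. The `CaptureOptions` fields, the `DEFAULTS` values, and `resolveOptions` are hypothetical names chosen for illustration, not the server's actual schema.

```typescript
// Hypothetical option shape: field names are illustrative only.
interface CaptureOptions {
  target?: string;      // e.g. a window title, or "desktop"
  intervalMs?: number;  // time between frames
  maxFrames?: number;   // cap on the number of frames
}

// Assumed built-in defaults; the real values are not stated in this document.
const DEFAULTS: Required<CaptureOptions> = {
  target: "desktop",
  intervalMs: 1000,
  maxFrames: 9,
};

// Later sources win: defaults < saved profile < explicit call arguments.
// Undefined fields are skipped so they never erase a lower-priority value.
function resolveOptions(
  profile: CaptureOptions,
  explicit: CaptureOptions,
): Required<CaptureOptions> {
  const merged: Required<CaptureOptions> = { ...DEFAULTS };
  for (const source of [profile, explicit]) {
    if (source.target !== undefined) merged.target = source.target;
    if (source.intervalMs !== undefined) merged.intervalMs = source.intervalMs;
    if (source.maxFrames !== undefined) merged.maxFrames = source.maxFrames;
  }
  return merged;
}
```

Under this assumption, a profile that sets `target` and `intervalMs` combined with a call that passes only `intervalMs` keeps the profile's target, takes the call's interval, and falls back to the default frame cap.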

## Requirements

### Validated

- [x] Window/region targeting: choose a specific app window, screen region, region relative to a window, or full desktop (v1.0 Phase 2)
- [x] Timed capture: take multiple screenshots over a configurable time period (v1.0 Phase 1)
- [x] Configurable interval: set the time between each screenshot capture (v1.0 Phase 1)
- [x] Maximum capture limit: set a cap on the number of screenshots taken (v1.0 Phase 1)
- [x] Grid compilation: compile all captured screenshots into a single grid image (v1.0 Phase 1)
- [x] MCP resource exposure: expose the compiled grid image as an MCP resource (v1.0 Phase 1)
- [x] MCP tool interface: expose capture configuration and triggering as MCP tools (v1.0 Phase 1)
- [x] Timestamp overlay: burn timestamp into each grid cell (v1.0 Phase 1)
- [x] Delta highlighting: optionally highlight regions that changed between frames (v1.0 Phase 3)
- [x] GIF/animation export: optionally export as animated GIF (v1.0 Phase 3)
- [x] Publish as cplugs marketplace plugin (v1.0 Phase 4)
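
The timed capture, interval, and maximum capture limit requirements interact: the cap can end a capture before the time period does. A rough sketch of that interaction, assuming the first frame is taken at time zero and frames are scheduled at fixed offsets; `planCapture` is a hypothetical helper, not part of the server's API.

```typescript
// Hypothetical helper: returns the offsets (in ms from capture start)
// at which frames would be taken.
function planCapture(
  durationMs: number,
  intervalMs: number,
  maxFrames: number,
): number[] {
  if (durationMs < 0 || intervalMs <= 0 || maxFrames <= 0) {
    throw new RangeError("duration must be >= 0; interval and maxFrames must be > 0");
  }
  const offsets: number[] = [];
  // Stop at whichever limit is reached first: the time period or the frame cap.
  for (let t = 0; t <= durationMs && offsets.length < maxFrames; t += intervalMs) {
    offsets.push(t);
  }
  return offsets;
}
```

For example, a 2000 ms period at a 500 ms interval yields five frames when the cap is generous, while a cap of 4 truncates a 10 second period to the first four offsets.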

### Active

- [ ] Window/region targeting: choose a specific app window, screen region, region relative to a window, or full desktop
- [ ] Timed capture: take multiple screenshots over a configurable time period
- [ ] Configurable interval: set the time between each screenshot capture
- [ ] Maximum capture limit: set a cap on the number of screenshots taken
- [ ] Grid compilation: compile all captured screenshots into a single square grid image
- [ ] MCP resource exposure: expose the compiled grid image as an MCP resource for agent consumption
- [ ] MCP tool interface: expose capture configuration and triggering as MCP tools
- [ ] Timestamp overlay: burn timestamp into each grid cell so agents know when each frame was captured
- [ ] Delta highlighting: optionally highlight regions that changed between consecutive frames
- [ ] GIF/animation export: optionally export the capture sequence as an animated GIF in addition to the grid
- [ ] Publish as cplugs marketplace plugin at C:\Users\decid\.claude\plugins\marketplaces\cplugs
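
The "single square grid image" requirement implies choosing a column and row count for an arbitrary number of frames. One common approach, shown here as a sketch rather than the project's confirmed layout rule, is to take the ceiling of the square root as the column count and fill rows left to right. `gridLayout` and `cellPosition` are hypothetical names.

```typescript
interface GridLayout {
  columns: number;
  rows: number;
}

// Near-square layout: columns = ceil(sqrt(n)), rows = just enough to hold n frames.
function gridLayout(frameCount: number): GridLayout {
  if (!Number.isInteger(frameCount) || frameCount < 1) {
    throw new RangeError("frameCount must be a positive integer");
  }
  const columns = Math.ceil(Math.sqrt(frameCount));
  const rows = Math.ceil(frameCount / columns);
  return { columns, rows };
}

// Zero-based cell coordinates for the frame at `index`, filling row by row
// so that reading order matches capture order.
function cellPosition(index: number, columns: number): { column: number; row: number } {
  return { column: index % columns, row: Math.floor(index / columns) };
}
```

With this rule, nine frames give a 3 x 3 grid and ten frames give 4 columns by 3 rows, leaving the trailing cells empty.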

### Out of Scope

- Video recording / continuous streaming -- too heavy for diagnostic use, grid image is sufficient
- Cross-machine remote capture -- local desktop only
- OCR / text extraction from screenshots -- agents can do this themselves from the grid image
- Real-time live preview -- this is a capture-and-review tool, not a monitoring dashboard

## Context

- Target platform: Windows 11 (primary), with potential cross-platform support later
- MCP servers exchange JSON-RPC 2.0 messages; this server uses the stdio transport
- The tool will be published to the cplugs marketplace (C:\Users\decid\.claude\plugins\marketplaces\cplugs)
- Primary consumers are AI agents (Claude Code, etc.) diagnosing timing-dependent UI issues
- Screenshot capture on Windows requires Win32 APIs or a capture library such as screenshot-desktop (sharp processes images but does not capture the screen)
- Grid compilation needs image processing (sharp, jimp, or canvas)
- The MCP SDK (@modelcontextprotocol/sdk) provides the server framework
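
To make the stdio and JSON-RPC point concrete, the sketch below builds the kind of request a client would send to invoke a tool. `tools/call` is the MCP method for tool invocation and `start_capture` is the tool named in this document; the argument names and the `buildToolCall` and `frameForStdio` helpers are assumptions for illustration. In practice the MCP SDK handles this framing, so the code shows the wire shape rather than anything the server needs to implement by hand.

```typescript
interface JsonRpcRequest {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params?: Record<string, unknown>;
}

// Builds a "tools/call" request. The tool arguments are passed through as-is;
// their names are whatever the tool's input schema defines.
function buildToolCall(
  id: number,
  name: string,
  args: Record<string, unknown>,
): JsonRpcRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

// The stdio transport carries one JSON message per line. JSON.stringify without
// indentation escapes newlines inside strings, so the output stays on one line.
function frameForStdio(message: JsonRpcRequest): string {
  return JSON.stringify(message) + "\n";
}
```

A call such as `frameForStdio(buildToolCall(1, "start_capture", { intervalMs: 500 }))` produces a single newline-terminated line that can be written to the server's stdin.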

## Constraints

- **Platform**: Windows 11 primary target -- must use Windows-compatible screenshot APIs
- **Protocol**: Must implement MCP server spec (tools + resources over stdio)
- **Distribution**: Must be publishable as a cplugs marketplace plugin
- **Performance**: Screenshot capture should not noticeably slow the target application
- **Image Size**: Grid images must stay small enough for LLM consumption, which may require downscaling frames before compilation
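
The image size constraint does not name a pixel limit. As a sketch of how such a limit could be enforced, the function below computes a uniform downscale factor so the compiled grid fits within a maximum edge length. `maxEdgePx` is a hypothetical parameter and `fitGrid` a hypothetical helper; the actual limit is not specified in this document.

```typescript
interface FittedGrid {
  scale: number;   // factor applied to every frame, never above 1
  width: number;   // final grid width in pixels
  height: number;  // final grid height in pixels
}

// Computes the scale needed so a columns x rows grid of equally sized cells
// fits inside a maxEdgePx x maxEdgePx square. Frames are never upscaled.
function fitGrid(
  cellWidth: number,
  cellHeight: number,
  columns: number,
  rows: number,
  maxEdgePx: number,
): FittedGrid {
  if ([cellWidth, cellHeight, columns, rows, maxEdgePx].some((v) => v <= 0)) {
    throw new RangeError("all dimensions must be positive");
  }
  const gridWidth = cellWidth * columns;
  const gridHeight = cellHeight * rows;
  const scale = Math.min(1, maxEdgePx / gridWidth, maxEdgePx / gridHeight);
  return {
    scale,
    width: Math.round(gridWidth * scale),
    height: Math.round(gridHeight * scale),
  };
}
```

For a 3 x 3 grid of 1920 x 1080 frames and an assumed 2880 px limit, the scale is 0.5 and the final grid is 2880 x 1620.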

## Key Decisions

| Decision | Rationale | Outcome |
|----------|-----------|---------|
| TypeScript + MCP SDK | Standard MCP server stack, good ecosystem support | -- Pending |
| Grid image (not video) | Single image is digestible by LLMs in one pass | -- Pending |
| cplugs marketplace distribution | User-specified requirement for plugin publishing | -- Pending |

## Evolution

This document evolves at phase transitions and milestone boundaries.

**After each phase transition** (via `/gsd-transition`):
1. Requirements invalidated? -> Move to Out of Scope with reason
2. Requirements validated? -> Move to Validated with phase reference
3. New requirements emerged? -> Add to Active
4. Decisions to log? -> Add to Key Decisions
5. "What This Is" still accurate? -> Update if drifted

**After each milestone** (via `/gsd-complete-milestone`):
1. Full review of all sections
2. Core Value check -- still the right priority?
3. Audit Out of Scope -- reasons still valid?
4. Update Context with current state

---
*Last updated: 2026-04-13 after Phase 8 (Screenshot Profiles) completion*
