# Decompiler Stack — How to Extract `.gmd` Files

> Era-appropriate tooling for GameMaker 5.3a `.gmd` extraction.
> **Primary daily extractor:** `tools/extract-gmd/` (this repo's TS Node CLI — port of LateralGM `GmFileReader` semantics).
> **Reference parser:** [LateralGM](wiki/11-tool-lateralgm.md) `GmFileReader.java` (used as cross-check oracle, not runtime).

This document is a thin wrapper around [`wiki/15-extraction-pipeline.md`](wiki/15-extraction-pipeline.md). For the full ranked methodology read that page first; this page adds installation/provenance notes and the canonical-platform rule for reproducible byte-identity.

---

## Primary daily extractor: `tools/extract-gmd/`

A TypeScript Node 22 CLI that walks the `.gmd` binary sequentially per [`wiki/03-gmd-format.md`](wiki/03-gmd-format.md) and emits a [GmkSplitter-style](wiki/12-tool-gmksplitter.md) one-file-per-resource tree under `extracted/<source>/...`.

### Install

```bash
cd tools/extract-gmd
npm install     # exact-pinned versions (no ^ or ~)
```

Pinned dependencies (deliberate — drift breaks EXT-07 byte-identity):

| Library | Version | Why |
|---------|---------|-----|
| sharp | 0.34.5 | Deterministic PNG encoder; surfaces libvips version in MANIFEST header |
| typescript | 5.6.3 | Strict mode + `noUncheckedIndexedAccess` |
| tsx | 4.21.0 | TS execution without build step |
| vitest | 4.1.5 | Test runner for golden fixtures |
| @types/node | 25.6.0 | Node 22 type defs |
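
The exact-pin rule can be checked mechanically. The following is a minimal sketch, assuming the dependency map is read from `package.json`; `findUnpinned` is a hypothetical helper and not part of `tools/extract-gmd`:

```typescript
// Hypothetical helper: flags dependency specs that are not exact versions.
// "Exact" here means plain MAJOR.MINOR.PATCH with an optional prerelease or
// build suffix, and no range operators such as ^, ~, >, * or x.
type DependencyMap = Record<string, string>;

const EXACT_VERSION = /^\d+\.\d+\.\d+(?:-[0-9A-Za-z.-]+)?(?:\+[0-9A-Za-z.-]+)?$/;

function findUnpinned(deps: DependencyMap): string[] {
  return Object.entries(deps)
    .filter(([, spec]) => !EXACT_VERSION.test(spec))
    .map(([name, spec]) => `${name}@${spec}`)
    .sort();
}

// Versions from the table above, plus one deliberately loose spec.
const exampleDeps: DependencyMap = {
  sharp: "0.34.5",
  typescript: "5.6.3",
  tsx: "^4.21.0", // loose on purpose, to show a violation
};
```

Calling `findUnpinned(exampleDeps)` reports only the `tsx` entry; a check like this could run before the golden-fixture tests so that version drift is caught before it shows up as hash drift.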

### Run

```bash
# From repo root:
pnpm extract:all                                              # extract both source files
pnpm tsx tools/extract-gmd/cli.ts extract <input.gmd> <out>   # single file
pnpm tsx tools/extract-gmd/cli.ts verify <source-dir>         # SHA256 drift check
pnpm extract:verify                                           # verify both extracted trees
```

Output structure (per [`wiki/12-tool-gmksplitter.md`](wiki/12-tool-gmksplitter.md) plus this project's D-07):

```
extracted/<source>/
  settings.json
  scripts/<NN>-<name>.gml
  objects/<NN>-<name>/
    meta.json
    events/<event-name>.dnd.json     # canonical lossless DnD descriptor
    events/<event-name>.gml          # auto-transcompiled (read-primary diff target)
  sprites/<NN>-<name>/
    meta.json
    frames/img_NNN.png               # deterministic PNG via pinned sharp
  backgrounds/<NN>-<name>/{meta.json, image.png}
  rooms/<NN>-<name>/{meta,instances,tiles,backgrounds}.json + creation-code.gml
  sounds/<NN>-<name>/{meta.json, audio.<wav|mid|mp3>}
  fonts/<NN>-<name>/{meta.json, glyphs.png}
  paths/<NN>-<name>.json
  timelines/<NN>-<name>/moments/<step>.{dnd.json,gml}
  datafiles/<NN>-<name>.<ext>
  UNKNOWN-ACTIONS.md                  # only present if any DnD action_id was unmapped
  MANIFEST.sha256                     # reproducibility checkpoint
```

The numeric `<NN>-` prefix preserves native source order: a resource's position in the source `.gmd`'s sequential block stream IS its resource ID. Rooms reference objects by ID, so if a room references object ID 17, the directory `objects/0017-objX/` must hold exactly that object, regardless of how the names sort alphabetically.
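
The ID-to-directory mapping can be sketched as a pair of round-trip helpers. Both function names are hypothetical, and the 4-digit zero padding is an assumption taken from the `0017-objX` example rather than from the extractor's source:

```typescript
// Hypothetical helpers: a resource ID round-trips through its directory name.
// ASSUMPTION: 4-digit zero padding, mirroring the `0017-objX` example.
function resourceDirName(id: number, name: string): string {
  if (!Number.isInteger(id) || id < 0) {
    throw new RangeError(`resource ID must be a non-negative integer, got ${id}`);
  }
  return `${String(id).padStart(4, "0")}-${name}`;
}

function resourceIdFromDirName(dirName: string): number {
  const match = /^(\d+)-/.exec(dirName);
  if (match === null || match[1] === undefined) {
    throw new Error(`no numeric ID prefix in "${dirName}"`);
  }
  return Number.parseInt(match[1], 10);
}
```

With fixed-width padding, a plain lexicographic directory listing also reproduces source order, so a room's reference to object ID 17 can be resolved by prefix alone.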

### Reproducibility (EXT-07)

Re-running the extractor on the same input produces byte-identical output. CI verifies this via `pnpm extract:verify`.
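
To illustrate what a byte-identity checkpoint involves, here is a minimal sketch that builds a `sha256sum`-style manifest from an in-memory file map. `buildManifest` is a hypothetical name, and the real `MANIFEST.sha256` layout (including its header) may differ:

```typescript
import { createHash } from "node:crypto";

// Hypothetical sketch: one "<sha256>  <path>" line per file.
// ASSUMPTION: the real manifest format and header are not reproduced here.
function buildManifest(files: Map<string, Uint8Array>): string {
  const paths = [...files.keys()].sort(); // fixed order keeps the output stable
  const lines = paths.map((path) => {
    const bytes = files.get(path);
    if (bytes === undefined) throw new Error(`missing entry: ${path}`);
    const digest = createHash("sha256").update(bytes).digest("hex");
    return `${digest}  ${path}`;
  });
  return lines.join("\n") + "\n";
}
```

Sorting the paths before hashing matters as much as the hashes themselves: directory traversal order differs between filesystems, and an unsorted manifest would drift even when every file is identical.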

**Canonical platform: `linux-x64` (Linux x64-musl).** Sharp ships pre-built libvips per platform; minor libvips version differences between Linux / macOS / Windows can change PNG filter selection. The committed `MANIFEST.sha256` files are computed on `linux-x64` (Fly.io deploy target). On macOS or Windows dev machines, hash drift on PNG files is treated as a soft warning, NOT a CI failure. The MANIFEST header records the canonical sharp + libvips versions for this reason — any future libvips bump is immediately visible.
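
The soft-warning policy in the paragraph above can be expressed as a small classifier. This is a sketch only: `classifyDrift` is a hypothetical name, and treating non-PNG drift as fatal on every platform is an assumption inferred from the byte-identity requirement:

```typescript
// Hypothetical policy sketch: PNG drift is fatal only on the canonical platform.
// ASSUMPTION: drift in any non-PNG file is fatal everywhere.
type DriftSeverity = "ok" | "warn" | "fail";

const CANONICAL_PLATFORM = "linux-x64"; // value taken from the text above

function classifyDrift(
  path: string,
  expectedSha: string,
  actualSha: string,
  platform: string, // e.g. `${process.platform}-${process.arch}`
): DriftSeverity {
  if (expectedSha === actualSha) return "ok";
  const isPng = path.toLowerCase().endsWith(".png");
  return isPng && platform !== CANONICAL_PLATFORM ? "warn" : "fail";
}
```

Keying the decision on file type rather than on platform alone keeps `.gml` and `.json` drift fatal on developer machines, where the PNG encoder is not involved.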

See [`wiki/15-extraction-pipeline.md`](wiki/15-extraction-pipeline.md) §Validation for the broader 4-tier methodology.

---

## Rank 1 — Native source (already satisfied)

The two source `.gmd` files for this project are present in `legacy/open-source-release/`:

- `BN Online Client 5-8.gmd`
- `BN Online Master 5-4.gmd`

No recovery needed. Run `tools/extract-gmd` directly. This is the daily-dev path.

---

## Rank 2 — Reference parser (LateralGM headless cross-check)

[LateralGM 1.8.234](https://github.com/IsmAvatar/LateralGM/releases) is the canonical open-source `.gmd` parser ([`wiki/11-tool-lateralgm.md`](wiki/11-tool-lateralgm.md)). It is NOT used as a runtime in this project — `tools/extract-gmd` is the daily extractor. LateralGM is used only as a verification oracle (D-02): a small Java probe loads the JAR, parses the `.gmd`, and prints resource counts per block; the TS extractor's output is asserted against those counts.

| Item | Value |
|------|-------|
| JAR | `lateralgm-1.8.234.jar` |
| Java runtime | 1.8+ (project box runs OpenJDK 1.8.0_392) |
| Action_ID library XMLs | Ported once into `tools/extract-gmd/data/action-ids.json` (D-03) |
| Probe script | `tools/extract-gmd/test/oracle/probe.java`; **deferred to local-dev follow-up** (not a Phase 1 deliverable; manual-only per `.planning/phases/01-extraction/01-VALIDATION.md`). A reviewer with a JVM and the LateralGM JAR can author the probe ad hoc to assert per-block resource counts. |
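
On the TypeScript side, the oracle assertion reduces to comparing two count maps. A minimal sketch, assuming the probe's output has already been parsed into a plain object; the block names and the `diffCounts` helper are hypothetical:

```typescript
// Hypothetical sketch: compare per-block resource counts from the LateralGM
// probe against the TS extractor's counts.
// ASSUMPTION: both sides are available as { blockName: count } objects.
type BlockCounts = Record<string, number>;

function diffCounts(oracle: BlockCounts, extracted: BlockCounts): string[] {
  const blocks = new Set([...Object.keys(oracle), ...Object.keys(extracted)]);
  const mismatches: string[] = [];
  for (const block of [...blocks].sort()) {
    const want = oracle[block] ?? 0; // a block missing on one side counts as 0
    const got = extracted[block] ?? 0;
    if (want !== got) mismatches.push(`${block}: oracle=${want} extracted=${got}`);
  }
  return mismatches;
}
```

An empty result means the two parsers agree on every block; any entry names the block to inspect first.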

---

## Rank 3 — WinXP VM + GameMaker 5.3a IDE (documented fallback only)

For this project, Rank 3 is **documentation-only** (D-04). Per CONTEXT.md, the Rank-1 native-source path is already satisfied for both `.gmd` files we care about. The WinXP VM is needed only if a future phase requires Rank-3 dynamic recovery from a `.exe`-only snapshot (e.g. an older Master build with no surviving `.gmd`).

If/when Rank-3 is needed, follow the procedure in [`wiki/15-extraction-pipeline.md`](wiki/15-extraction-pipeline.md) §"Rank 3 — VBGAMER45 GMD-Recovery + GM Decompiler v2.1":

1. Provision a Windows XP SP3 VM image (legitimate license; project does not ship a VM image).
2. Install the GameMaker 5.3a IDE inside the VM. Do NOT use the `legacy/open-source-release-extras/Game Maker 5.3a/gmaker5.3a.zip` shipped in this repo: it is a cracked, PARADOX-tagged release. Source a legitimate copy from a YoYo Games archive or a personal license instead.
3. Install GMD-Recovery and GM Decompiler v2.1 inside the VM (era-appropriate Win32 tools).
4. Run the recovery tool against the orphaned `.exe`; export the recovered `.gmd`.
5. Copy the `.gmd` out of the VM via shared folder; run `tools/extract-gmd` against it.

---

## Rank 4 — Modern decompilers (HARD BAN — do not use)

Per [`wiki/13-modern-tool-incompat.md`](wiki/13-modern-tool-incompat.md), the following tools target GameMaker Studio's `data.win` chunk format and **WILL FAIL** on 5.3a `.gmd` (which is monolithic sequential ZLIB, no `FORM` headers):

- **UndertaleModTool (UTMT)** — fails with "FORM header not found"
- **Altar.NET** — same failure mode
- Any tool described as "Studio-era" or that mentions `data.win` chunks

Reaching for these is the canonical "Phase 1 burned a week" failure mode (RESEARCH.md Pitfall 1). If a future maintainer searches "decompile gamemaker", they will find these — this section is the warning.
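
The format difference is visible in the first four bytes, so a pre-flight check can tell the two families apart before any tool is chosen. A minimal sketch; `sniffFormat` and its result labels are hypothetical:

```typescript
// Hypothetical pre-flight check: a Studio-era `data.win` begins with the
// ASCII tag "FORM", while a legacy file begins with the little-endian
// int32 magic 1234321.
type SniffResult = "legacy-gm" | "studio-data-win" | "unknown";

function sniffFormat(head: Uint8Array): SniffResult {
  if (head.length < 4) return "unknown";
  const isForm =
    head[0] === 0x46 && head[1] === 0x4f && head[2] === 0x52 && head[3] === 0x4d; // "FORM"
  if (isForm) return "studio-data-win";
  const view = new DataView(head.buffer, head.byteOffset, head.byteLength);
  return view.getInt32(0, true) === 1234321 ? "legacy-gm" : "unknown";
}
```

Because the check reads content rather than the filename, it gives the same answer for a `.gmd` and for a renamed copy of it.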

---

## Source `.gmd` magic + version (one-time hex-dump record)

Captured 2026-05-02 from the two source files (resolves RESEARCH.md assumption A5):

| File | First 16 bytes (hex) | Magic | Format version |
|------|----------------------|-------|----------------|
| `legacy/open-source-release/BN Online Client 5-8.gmd` | `91 d5 12 00 12 02 00 00 00 00 00 00 10 15 03 00` | 1234321 (0x12D591 LE) | 530 |
| `legacy/open-source-release/BN Online Master 5-4.gmd` | `91 d5 12 00 12 02 00 00 00 00 00 00 10 15 03 00` | 1234321 (0x12D591 LE) | 530 |

Both files are GameMaker `530` format (5.3a-era). The TS extractor's `header.ts`
whitelist `[530, 542, 600, 701, 800, 810]` covers both.

Bytes 8-11 are the v530-specific 4-byte reserved slot (always `00 00 00 00`); bytes 12-15
are the start of the `gameId` int32 (here `0x00031510 = 202000`). The 16-byte DPlay GUID
follows at byte 16 (not shown above).
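
The byte layout described above can be verified against the recorded hex dump with a few little-endian reads. This is a sketch under stated assumptions: the field names and `parseHeaderPrefix` are illustrative and need not match `header.ts`, and the offsets for bytes 8-15 apply to version 530 only:

```typescript
// Hypothetical sketch of the first 16 header bytes of a v530 file.
// ASSUMPTION: names are illustrative; the layout past byte 8 is v530-specific.
interface GmdHeaderPrefix {
  magic: number;
  version: number;
  reserved: number;
  gameId: number;
}

const GM_MAGIC = 1234321;
const SUPPORTED_VERSIONS = [530, 542, 600, 701, 800, 810];

function parseHeaderPrefix(bytes: Uint8Array): GmdHeaderPrefix {
  if (bytes.length < 16) throw new Error("need at least 16 bytes");
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  const magic = view.getInt32(0, true); // all fields are little-endian int32
  const version = view.getInt32(4, true);
  if (magic !== GM_MAGIC) throw new Error(`bad magic: ${magic}`);
  if (!SUPPORTED_VERSIONS.includes(version)) {
    throw new Error(`unsupported version: ${version}`);
  }
  return { magic, version, reserved: view.getInt32(8, true), gameId: view.getInt32(12, true) };
}

// The 16 bytes recorded in the table above.
const recordedBytes = Uint8Array.from(
  "91 d5 12 00 12 02 00 00 00 00 00 00 10 15 03 00"
    .split(" ")
    .map((hex) => Number.parseInt(hex, 16)),
);
```

Parsing `recordedBytes` yields magic `1234321`, version `530`, a zero reserved slot, and `gameId` `202000`, matching the values in the table and the paragraph above.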

---

## File extension cross-reference

Per [`wiki/14-gb1-backups.md`](wiki/14-gb1-backups.md), `.gb1` files are byte-identical to `.gmd`. The `tools/extract-gmd` CLI accepts either extension: the parser identifies the format by the magic number `1234321` at byte 0, not by the filename suffix. When older snapshots come into scope (Phase 3+), pass the `.gb1` path to `extract-gmd extract` unchanged.

---

## See also

- [`wiki/15-extraction-pipeline.md`](wiki/15-extraction-pipeline.md) — the substantive 4-tier methodology
- [`wiki/03-gmd-format.md`](wiki/03-gmd-format.md) — `.gmd` binary format spec
- [`wiki/04-dnd-serialization.md`](wiki/04-dnd-serialization.md) — DnD action node format
- [`wiki/11-tool-lateralgm.md`](wiki/11-tool-lateralgm.md) — LateralGM canonical parser
- [`wiki/12-tool-gmksplitter.md`](wiki/12-tool-gmksplitter.md) — GmkSplitter VCS-tree pattern
- [`wiki/13-modern-tool-incompat.md`](wiki/13-modern-tool-incompat.md) — modern-tool ban list
- [`wiki/14-gb1-backups.md`](wiki/14-gb1-backups.md) — `.gb1` ↔ `.gmd` byte-identity
- [`../tools/extract-gmd/README.md`](../tools/extract-gmd/README.md) — local tool README
- [`../.planning/codebase/CONCERNS.md`](../.planning/codebase/CONCERNS.md) — legal/IP gates before publication
