Using spec-driven development to vibecode a music composition tool

Spec-driven development (aka writing a product requirements document, or PRD, first) is the workflow I typically use with AI agents. It lets me take my pre-existing notes on hobbies and projects and quickly turn them into springboards for vibecoding.

I play in a few bands and often encounter the laborious process of transcribing music and arranging it for our instrumentation. I wanted to quickly make a low-stakes tool to help with the process.

For developing a PRD, I like to use an iterative 'ask me clarifying questions' prompt in ChatGPT and seed it with a rough description of my user needs, often pasted directly from throwaway Obsidian notes:

Let's iterate on a PRD together. Each response, give the current PRD and then ask me 5 questions which enhance it by 10%.

Try to avoid making assumptions about the target product. Your goal is to understand, document, and probe intelligently. Your job is NOT to randomly pad out the PRD.

Starting point: I want to tap a rhythm on my phone (to a metronome), have it quantised, and see the result as sheet music. I want to set the pitches of these notes and listen to playback, and also visualise as sheet music. Finally I want to export as a transferable format like MIDI or lilypond.

Once I'm happy with the output, I'll copy it directly into a PRD.md file (after first stripping out any of ChatGPT's header/footer fluff).

My kick-off prompt in Claude Code is then something like: "Read PRD.md and turn it into a step-by-step PLAN.md." It's important to understand the conceptual difference: the PRD is a fixed specification of the use case we're aiming to achieve; the plan is the sequence of steps we need to go through to implement it.

Helpfully, Claude will often naturally write the plan into phases:

### Phase 1 — Project Scaffold
**Goal:** Running app skeleton with correct tooling wired up.

- `server/` — FastAPI app serving `dist/` (or Vite dev proxy)
- `client/` — Vite + React + TypeScript
- Vitest + Playwright configured
- Layout components: `ConfigPanel`, `TapZone`, `NotationView`, `PianoRoll`, `PlaybackBar`
- Placeholder routing between views (tabs or single-page stacked)
- CI: `pytest` + `vitest run` in Makefile

### Phase 2 — Tap Input + Metronome
**Goal:** User can configure BPM/time-sig, start a metronome, tap, and see raw tap events logged.

- `AudioEngine` singleton: Tone.js `Transport` + `MetronomePlayer`
  - Downbeat emphasis (louder click, different pitch)
  - Optional visual pulse (CSS flash on beat)
- `RecordingManager`: state machine `idle → armed → recording → stopped`
  - Start: sync tap clock to Tone.js transport start
  - Capture: `performance.now()` on every tap keypress/click/touch
  - Stop: at user trigger or 60s hard cap
- Visual: tap flash feedback, tap counter, elapsed time
- **Tests:** mock timestamps → verify event list

### Phase 3 — Quantisation Engine
**Goal:** Raw tap timestamps → `Note[]` sequence. This is the core algorithm; build TDD.
...

which has the added bonus of hyper-efficient next-step prompting:

### 🤖 Assistant - 06/05/2026

9/9 tests passing, clean build at 48KB.

**What was built:**

- `engine/quantise.ts` — pure function `quantise(taps, config): Note[]` with exported `gridIntervalBeats` helper. Algorithm: ms→beats, snap to nearest grid point, deduplicate same-cell taps, note-per-tap (one grid unit each), rests fill gaps >1 grid unit, optional trailing rest to fill total recording duration.

- `tests/quantise.test.ts` — 9 tests covering: empty input, grid snapping, rest detection, triplet subdivisions, multi-bar sequences, deduplication, pitch/flag invariants.

- `TapZone.svelte` — wired to call `quantise()` on stop, writing results directly into the session store via `setNotes()`. The stopped UI now shows "N notes · M rests quantised".

Phase 4 (VexFlow notation rendering) is next.

---

### 👤 User - 06/05/2026

proceed with phase 4
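The algorithm summarised in Claude's Phase 3 message above (ms→beats, snap to grid, deduplicate, fill gaps with rests, optional trailing rest) can be sketched roughly as follows. This is an illustrative sketch, not the actual contents of `engine/quantise.ts`: the `QuantiseConfig` and `Note` shapes, their field names, and the choice to emit a leading rest when the first tap is late are all assumptions.

```typescript
// Hypothetical shapes: the post does not show the real Note or config types.
interface QuantiseConfig {
  bpm: number;         // tempo in beats per minute
  gridBeats: number;   // grid resolution in beats (e.g. 0.25 = four cells per beat)
  totalBeats?: number; // optional recording length, used for the trailing rest
}

interface Note {
  startBeat: number;
  durationBeats: number;
  isRest: boolean;
}

function quantise(tapsMs: number[], config: QuantiseConfig): Note[] {
  const msPerBeat = 60000 / config.bpm;

  // ms -> beats -> index of the nearest grid cell (early taps are clamped to cell 0).
  const cells = tapsMs.map((ms) =>
    Math.max(0, Math.round(ms / msPerBeat / config.gridBeats))
  );

  // Deduplicate taps that snapped to the same cell, then put them in order.
  const uniqueCells = Array.from(new Set(cells)).sort((a, b) => a - b);

  const notes: Note[] = [];
  const push = (fromCell: number, toCell: number, isRest: boolean): void => {
    notes.push({
      startBeat: fromCell * config.gridBeats,
      durationBeats: (toCell - fromCell) * config.gridBeats,
      isRest,
    });
  };

  let cursor = 0; // first grid cell not yet covered by a note or rest
  for (const cell of uniqueCells) {
    if (cell > cursor) push(cursor, cell, true); // a rest fills any empty cells
    push(cell, cell + 1, false);                 // each tap becomes one grid unit
    cursor = cell + 1;
  }

  // Optional trailing rest so the sequence spans the whole recording.
  if (config.totalBeats !== undefined) {
    const endCell = Math.round(config.totalBeats / config.gridBeats);
    if (endCell > cursor) push(cursor, endCell, true);
  }
  return notes;
}

// Three slightly-off taps at 120 BPM on a half-beat grid, over a 4-beat recording.
const example = quantise([0, 510, 1490], { bpm: 120, gridBeats: 0.5, totalBeats: 4 });
console.log(example.map((n) => (n.isRest ? "rest" : "note")).join(" "));
// note rest note rest note rest
```

Working in integer grid-cell indices until the final step keeps deduplication and gap detection free of floating-point comparisons, which matters more once triplet grids are involved.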

One slight hurdle I would note is the need to handle environment setup - I had to run a few sudo, apt and python -m venv commands before Claude could get into coding the phases. I'm sure this could be made more efficient with a curated set of bootstrap scripts, or --dangerously-skip-permissions, but I'm OK with spending a few more minutes on careful, project-specific setup.

While a project is still fairly simple, Claude can quite reliably write and implement its own plan, and the user's role is mostly UX/UI testing and reporting back on any bugs. For example, I found that quantisation wasn't working exactly as intended, but Claude hadn't picked up on it because it could only see that the code ran without errors (and I hadn't added much by way of rich feedback signals at this point). I did try to shortcut the process by asking it to "implement a 'gold standard' set of tests where simulate an imperfect human-like tapping of known rhythms to see if they match up after quantisation" but it didn't seem to improve the accuracy. A more sophisticated testing and optimisation suite would probably be worth building out.
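As a rough idea of what such a 'gold standard' check could look like, here is a minimal sketch that jitters a known rhythm with seeded random timing errors and measures how many onsets survive quantisation. Everything named here is hypothetical: `seededRandom`, `simulateTaps`, `snapToGrid` (a simplified stand-in for the project's real `quantise()`), and `onsetAccuracy` are not from the actual codebase, and uniform jitter is only a crude model of human timing.

```typescript
// Small seeded PRNG (mulberry32-style) so the simulated "human" error is reproducible.
function seededRandom(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Turn a known rhythm (onsets as grid-cell indices) into tap timestamps in ms,
// each shifted by a uniform random error of up to maxJitterMs either way.
function simulateTaps(
  onsetCells: number[],
  bpm: number,
  gridBeats: number,
  maxJitterMs: number,
  rand: () => number
): number[] {
  const msPerCell = (60000 / bpm) * gridBeats;
  return onsetCells.map((cell) => cell * msPerCell + (rand() * 2 - 1) * maxJitterMs);
}

// Stand-in for the real quantiser: snap each tap to its nearest grid cell.
function snapToGrid(tapsMs: number[], bpm: number, gridBeats: number): number[] {
  const msPerCell = (60000 / bpm) * gridBeats;
  const cells = tapsMs.map((ms) => Math.max(0, Math.round(ms / msPerCell)));
  return Array.from(new Set(cells)).sort((a, b) => a - b);
}

// Fraction of expected onsets that appear in the quantised result.
function onsetAccuracy(expected: number[], actual: number[]): number {
  if (expected.length === 0) return actual.length === 0 ? 1 : 0;
  const actualSet = new Set(actual);
  return expected.filter((cell) => actualSet.has(cell)).length / expected.length;
}

// One bar of 4/4 on a quarter-beat grid: beats 0, 1, 1.5, 2, then four quick taps from beat 3.
const rhythm = [0, 4, 6, 8, 12, 13, 14, 15];
const taps = simulateTaps(rhythm, 120, 0.25, 40, seededRandom(42));
console.log(onsetAccuracy(rhythm, snapToGrid(taps, 120, 0.25)));
// 1, because 40 ms of jitter is below half a grid cell (62.5 ms at 120 BPM)
```

The interesting cases are the ones where accuracy drops below 1: raising `maxJitterMs` past half a cell, or extending the simulation with a constant latency offset or tempo drift, would give the agent a numeric signal to optimise against rather than just "the tests pass".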