Pipeline Design 17

Jump to bottom

Seth Ford edited this page Feb 12, 2026 · 2 revisions

The design.md file is in a sensitive directory. Let me output the ADR directly instead.

Design: Add SQLite persistence layer for team state

Context

Shipwright currently stores all runtime state as flat files under ~/.shipwright/:

events.jsonl — append-only JSONL, no locking, rotates at 50 MB. Re-parsed in full by dashboard/server.ts via readEvents() on 18 separate call sites per request cycle.
daemon-state.json — JSON read-modify-write with flock (FD 7, 5 s timeout) + atomic tmp-mv. Re-read on every dashboard poll.
costs.json — JSON array capped at 1000 entries, flock (FD 200, 10 s timeout) + atomic write.
memory/{hash}/failures.json, decisions.json — per-repo JSON, flock 10 s + atomic write, capped at 100 entries.

Problems this solves:

Dashboard performance — readEvents() re-parses the entire events.jsonl (up to 50 MB) on every API call. calculateDoraGrades (dashboard/server.ts:1099) does a full linear scan of events for each request. With 18 call sites, a single dashboard page load can parse the same file dozens of times.
Query limitations — computing "cost per day", "stage duration percentiles", or "failure frequency by category" requires full-file scans with in-memory filtering. SQL indexing eliminates this.
Concurrency — multiple daemons (fleet mode) writing events concurrently rely on >> append atomicity, which is not guaranteed for lines exceeding PIPE_BUF (4 KB on macOS). SQLite WAL mode provides proper concurrent-read/single-write semantics.
Data evolution — adding new queryable fields requires changing every consumer. SQL migration files provide structured schema evolution.

Constraints:

Bash 3.2 compatible (no associative arrays, no readarray, no ${var,,})
sqlite3 CLI may not be installed on all systems — bash writes must degrade gracefully
Dashboard uses Bun runtime — must use bun:sqlite (built-in, zero-dep)
Existing JSONL/JSON files are the source of truth for non-dashboard consumers and cannot be removed
22 existing test suites must continue passing

Decision

Dual-write architecture with SQLite as read-optimized secondary store

Data flow:

Bash scripts (write path) Dashboard (read path)
───────────────────────── ─────────────────────
emit_event() ──┬──▶ events.jsonl server.ts
 └──▶ sqlite3 INSERT ├── bun:sqlite queries (primary)
 └── file-based fallback (if DB missing)
cost_record() ─┬──▶ costs.json
 └──▶ sqlite3 INSERT
daemon_track_job() ─┬──▶ daemon-state.json
 └──▶ sqlite3 INSERT/UPDATE

Key decisions:

JSONL remains authoritative. Bash scripts always write to JSONL/JSON first. The SQLite insert is a best-effort secondary write guarded by command -v sqlite3 >/dev/null 2>&1 and || true. If SQLite is unavailable or the write fails, the system behaves identically to today.
Dashboard reads SQLite first, files second. Each query function in server.ts attempts a SQL query via bun:sqlite. If the DB file doesn't exist or the query fails, it falls back to the current readFileSync + parse logic. This makes the migration completely non-breaking.
Two runtimes, one database. Bash uses the sqlite3 CLI binary. TypeScript uses bun:sqlite (Bun's built-in FFI binding). Both access ~/.shipwright/shipwright.db. WAL mode (PRAGMA journal_mode=WAL) allows concurrent readers while one writer holds the lock. A 5-second busy timeout (PRAGMA busy_timeout=5000) prevents immediate failures under contention.
Schema managed via numbered migration files. scripts/lib/migrations/001_initial_schema.sql contains the initial 8-table schema. A schema_version table tracks applied migrations. Both the bash library (sw_db_migrate) and the TypeScript module (ShipwrightDB.migrate()) run the same .sql files, ensuring schema consistency regardless of which runtime initializes the database first.
Feature-gated in bash. The presence of sqlite3 on PATH controls whether dual-writes happen. SW_SQLITE_DISABLED=1 can explicitly opt out. No code changes needed to disable.
extra TEXT column for overflow. Tables like events and pipeline_runs include an extra TEXT column for JSON-encoded fields that don't warrant dedicated columns. This prevents schema churn for uncommon event properties while keeping frequently-queried fields as proper indexed columns.

Schema (8 tables, 15 indexes)

schema_version (version INTEGER PK, applied_at TEXT, description TEXT)
events (id INTEGER PK AUTOINCREMENT, type TEXT NOT NULL, ts TEXT NOT NULL,
 ts_epoch INTEGER NOT NULL, issue INTEGER, stage TEXT, result TEXT,
 duration_s REAL, cost_usd REAL, model TEXT, extra TEXT)
 → idx: type, ts_epoch, issue, (type + ts_epoch)
pipeline_runs (id INTEGER PK AUTOINCREMENT, issue INTEGER NOT NULL,
 branch TEXT, template TEXT, status TEXT NOT NULL DEFAULT 'running',
 started_epoch INTEGER NOT NULL, completed_epoch INTEGER,
 duration_s REAL, pr_url TEXT, worktree_path TEXT, extra TEXT)
 → idx: status, issue, started_epoch
pipeline_stages (id INTEGER PK AUTOINCREMENT, run_id INTEGER NOT NULL,
 stage TEXT NOT NULL, status TEXT NOT NULL,
 started_epoch INTEGER, completed_epoch INTEGER,
 duration_s REAL, cost_usd REAL,
 FOREIGN KEY (run_id) REFERENCES pipeline_runs(id))
cost_entries (id INTEGER PK AUTOINCREMENT, ts_epoch INTEGER NOT NULL,
 input_tokens INTEGER, output_tokens INTEGER, model TEXT,
 stage TEXT, issue INTEGER, cost_usd REAL NOT NULL)
 → idx: ts_epoch, model, issue
sessions (id TEXT PK, machine TEXT, started_at TEXT, last_heartbeat TEXT,
 daemon_pid INTEGER, state TEXT DEFAULT 'active', active_jobs TEXT)
metrics (id INTEGER PK AUTOINCREMENT, metric_name TEXT NOT NULL,
 value REAL NOT NULL, period TEXT, period_start TEXT,
 computed_at TEXT NOT NULL)
 → idx: (metric_name + period_start)
memory_entries (id INTEGER PK AUTOINCREMENT, repo_hash TEXT NOT NULL,
 entry_type TEXT NOT NULL, pattern TEXT, root_cause TEXT,
 fix TEXT, category TEXT, seen_count INTEGER DEFAULT 1,
 frequency REAL, last_seen TEXT, extra TEXT)
 → idx: (repo_hash + entry_type), last_seen

Error handling

Bash writes: Every SQLite insert is wrapped in || true. sw_db_available() runs once per script invocation (cached in _SW_DB_AVAILABLE). If sqlite3 is not on PATH, all sw_db_* calls become no-ops.
Dashboard reads: Each SQL query is in try/catch. On any SQLite error, the function falls back to file-based reads. Logged to stderr, not surfaced to user.
Migration failures: schema_version isn't updated until migration succeeds. Next startup retries. Migrations use BEGIN/COMMIT for atomicity.

Import path

shipwright db import reads existing files and bulk-inserts:

events.jsonl → events (batched in 1000-row transactions)
costs.json → cost_entries
memory/*/failures.json → memory_entries (entry_type='failure')
memory/*/decisions.json → memory_entries (entry_type='decision')

Idempotent — checks MAX(ts_epoch) and only imports newer entries.

Alternatives Considered

1. Replace JSON files entirely with SQLite

Pros: Single source of truth, no dual-write complexity, simpler long-term.

Cons: Breaking change for ~15 bash consumers that use jq to parse JSON files. Pipeline scripts, cost checks, memory injection, self-optimization would all need rewriting. The flock + atomic-write pattern in sw-cost.sh:158, sw-daemon.sh:1041, and sw-memory.sh:219 is battle-tested; removing it risks regressions across all 22 test suites.

Verdict: Rejected. Migration risk outweighs simplicity. Dual-write validates SQLite before committing to it.

2. Use PostgreSQL or another client-server database

Pros: Better concurrency, richer query language, proven at scale.

Cons: Requires a running server process. Shipwright targets developer laptops and CI runners — a PostgreSQL dependency is a non-starter.

Verdict: Rejected. SQLite's serverless model matches Shipwright's architecture.

3. Use an embedded key-value store (LevelDB, LMDB)

Pros: Simpler data model, fast point lookups.

Cons: No SQL — DORA metrics, cost aggregations, activity feeds still need custom code. No CLI for bash. Requires native bindings (unlike bun:sqlite which is built-in).

Verdict: Rejected. SQL aggregation is the primary value.

4. Add caching layer to dashboard instead of SQLite

Pros: Simpler change — cache readEvents() with TTL invalidation.

Cons: Solves only repeated-parse, not query limitations. DORA still requires full scans. No help with concurrency, no schema evolution, memory grows with file size.

Verdict: Rejected. Caching is a band-aid.

Implementation Plan

Files to create

File	Purpose
`scripts/lib/migrations/001_initial_schema.sql`	8 tables, 15 indexes, `schema_version` tracking
`scripts/lib/db.sh`	Bash SQLite library: `sw_db_path()`, `sw_db_available()`, `sw_db_migrate()`, `sw_db_ensure()`, `sw_db_query()`, `sw_db_insert_event()`, `sw_db_insert_cost()`, `sw_db_insert_pipeline_run()`, `sw_db_update_pipeline_run()`. Double-source-guarded. WAL + 5 s busy timeout. Bash 3.2 compatible.
`scripts/sw-db.sh`	CLI: `shipwright db migrate\|status\|import\|export`
`scripts/sw-db-test.sh`	Test suite: schema, CRUD, DORA w/ seeded data, import idempotency, 5-writer concurrency, degradation without `sqlite3`
`dashboard/db.ts`	`ShipwrightDB` class via `bun:sqlite`: migrations, event CRUD, pipeline queries, SQL DORA grades, cost breakdowns, paginated activity, session mgmt, memory queries

Files to modify

File	Change
`scripts/lib/helpers.sh:56`	Dual-write in `emit_event()` — source `db.sh`, gate behind `sw_db_available`, `
`scripts/sw-cost.sh:146`	`sw_db_insert_cost` inside existing `flock` in `cost_record()`
`scripts/sw-daemon.sh:1466`	Pipeline lifecycle: `sw_db_insert_pipeline_run` at spawn, `sw_db_update_pipeline_run` at reap
`scripts/sw-memory.sh:190`	`sw_db_insert_memory` in `memory_capture_failure()` and decision recording
`dashboard/server.ts:604`	Import `ShipwrightDB`, replace 14+ file-scan functions with SQL, keep file fallback
`scripts/sw:~200`	Add `db)` case to subcommand router
`package.json:32`	Append `&& bash scripts/sw-db-test.sh` to test script
`.claude/CLAUDE.md`	Add new files to tables, `shipwright db` to commands, DB path to runtime state

Dependencies

sqlite3 CLI — optional for bash (pre-installed on macOS, apt install sqlite3 on Linux)
bun:sqlite — built-in to Bun, zero additional npm dependencies

Risk areas

emit_event() hot path — adds ~5–10 ms per sqlite3 CLI invocation. Guarded by || true and cached availability check. Monitor stage durations post-rollout.
WAL file growth — fleet mode with many daemons. Mitigated by short-lived queries and default wal_autocheckpoint (~4 MB).
Two runtimes, one DB — both must set PRAGMA journal_mode=WAL on every open. Inconsistent journal modes cause corruption.
Test isolation — must use isolated $HOME. Existing suites already mock this.
Missing sqlite3 — all bash writes gated behind command -v sqlite3. Dashboard uses bun:sqlite. System degrades to file-only.
Large import — 50 MB events.jsonl takes 10–30 s. Batched transactions + progress output + idempotent.

Validation Criteria

sqlite3 ~/.shipwright/shipwright.db ".tables" shows all 8 tables after shipwright db migrate
schema_version tracks migration 001 with timestamp and description
emit_event "test.event" "key=val" writes to both events.jsonl AND events table
cost_record writes to both costs.json AND cost_entries table
Dashboard /api/status returns identical DORA grades from SQLite vs file fallback
Dashboard falls back to file reads when shipwright.db is deleted mid-run
shipwright db import populates from existing JSONL/JSON, idempotent on re-run
shipwright db export produces valid JSONL from SQLite
sw-db-test.sh passes: schema, CRUD, DORA, import, concurrent writes (5 writers), degradation
All 22 existing test suites pass (only package.json changes)
With sqlite3 removed from PATH: emit_event still writes JSONL, no errors on stderr
PRAGMA journal_mode returns wal from both bash and TypeScript
No Bash 3.2 incompatibilities (no associative arrays, readarray, ${var,,}, ${var^^})
Pipeline stage durations don't regress measurably (< 50 ms overhead per emit_event)

Pipeline Design 17

Design: Add SQLite persistence layer for team state

Context

Decision

Dual-write architecture with SQLite as read-optimized secondary store

Schema (8 tables, 15 indexes)

Error handling

Import path

Alternatives Considered

1. Replace JSON files entirely with SQLite

2. Use PostgreSQL or another client-server database

3. Use an embedded key-value store (LevelDB, LMDB)

4. Add caching layer to dashboard instead of SQLite

Implementation Plan

Files to create

Files to modify

Dependencies

Risk areas

Validation Criteria

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!