Pipeline Design 16

Seth Ford edited this page Feb 13, 2026 · 1 revision


Design: Add Linux systemd support for process supervision

Context

Shipwright's sw-launchd.sh (364 lines) provides macOS-only process supervision via launchd user agents. It generates three plist files (daemon, dashboard, connect) under ~/Library/LaunchAgents/ and manages them via launchctl load/unload. A hard check_macos() gate at line 43 blocks execution on non-Darwin platforms. Linux users — particularly those running headless CI servers or remote development machines — have no equivalent way to auto-start the daemon, dashboard, or connect services.

Constraints from the codebase:

  • All scripts must be Bash 3.2 compatible (no associative arrays, no readarray, no ${var,,})
  • scripts/lib/compat.sh already provides is_macos(), is_linux(), and the _COMPAT_UNAME override for testing
  • The daemon traps SIGINT and SIGTERM at line 4330 of sw-daemon.sh and touches a $SHUTDOWN_FLAG file to request graceful shutdown. The poll loop checks this flag at 1-second intervals and waits up to 30 seconds for workers to finish (lines 4137–4199)
  • CI already runs on a macos-latest / ubuntu-latest matrix (.github/workflows/test.yml)
  • The test harness pattern uses mock binaries in $PATH, PASS/FAIL counters, TEMP_DIR sandboxing, and HOME redirection (see sw-daemon-test.sh lines 22–60)
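
The shutdown behaviour described in the constraints drives the KillSignal and TimeoutStopSec choices later in this document, so a minimal sketch of the flag-file pattern may help. It is not the code from sw-daemon.sh: request_shutdown, poll_loop, and the flag path are hypothetical names chosen for illustration, and the real loop also drains workers.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of a trap + flag-file shutdown; names are illustrative.
SHUTDOWN_FLAG="${TMPDIR:-/tmp}/sw-shutdown-demo.$$"

request_shutdown() {
  # The signal handler only records the request; the poll loop acts on it.
  touch "$SHUTDOWN_FLAG"
}
trap request_shutdown INT TERM

poll_loop() {
  # Poll once per second for up to $1 ticks.
  # Returns 0 if a shutdown was requested, 1 if the tick budget ran out.
  local max_ticks="$1" ticks=0
  while [ ! -f "$SHUTDOWN_FLAG" ] && [ "$ticks" -lt "$max_ticks" ]; do
    sleep 1
    ticks=$((ticks + 1))
  done
  [ -f "$SHUTDOWN_FLAG" ]
}
```

Because the handler does nothing but touch a file, a supervisor that sends SIGTERM and then waits longer than the drain window lets the loop finish cleanly.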

Decision

Extend sw-launchd.sh with platform-dispatching install/uninstall/status commands that branch on is_macos() / is_linux() from lib/compat.sh. The macOS plist logic stays intact (extracted into install_launchd(), uninstall_launchd(), status_launchd() helper functions). Parallel install_systemd(), uninstall_systemd(), status_systemd() functions generate systemd user-level unit files.

systemd unit design

| Property | Value | Rationale |
|---|---|---|
| Unit directory | ~/.config/systemd/user/ | User-level; no root required, mirrors launchd's user-agent model |
| KillSignal | SIGTERM | Daemon traps SIGTERM → touches shutdown flag → graceful drain |
| TimeoutStopSec | 35 (daemon), 10 (dashboard, connect) | Daemon waits 30s for workers; 35s gives margin before SIGKILL |
| Restart | on-failure | Auto-restart on crash, but not on clean shipwright daemon stop |
| StandardOutput | journal | journald provides log rotation at no extra cost; logs are queryable via journalctl --user -u shipwright-daemon |
| WantedBy | default.target | Standard user session target |
| Environment | PATH + HOME | Matches the macOS plist EnvironmentVariables pattern |
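
As a rough illustration of how the properties in the table could end up in a unit file, the sketch below writes a daemon unit with a heredoc. write_daemon_unit and its two arguments are hypothetical names, and the Description text is invented; only the directive values come from the table.

```shell
#!/usr/bin/env bash
# Sketch only: write_daemon_unit and its arguments are hypothetical.
write_daemon_unit() {
  local unit_dir="$1" sw_bin="$2"   # sw_bin should already be an absolute path
  mkdir -p "$unit_dir"
  # Unquoted EOF so $sw_bin, $PATH and $HOME are expanded at install time.
  cat > "$unit_dir/shipwright-daemon.service" <<EOF
[Unit]
Description=Shipwright daemon

[Service]
ExecStart=$sw_bin daemon start
KillSignal=SIGTERM
TimeoutStopSec=35
Restart=on-failure
StandardOutput=journal
Environment="PATH=$PATH"
Environment="HOME=$HOME"

[Install]
WantedBy=default.target
EOF
}
```

A call such as `write_daemon_unit "$HOME/.config/systemd/user" /usr/local/bin/sw` would produce the file; the dashboard and connect units would differ only in ExecStart and in using TimeoutStopSec=10.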

loginctl enable-linger is called during install so services survive user logout on headless servers. The install function checks for loginctl availability and warns (non-fatal) if absent.
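
A minimal sketch of that non-fatal linger step, assuming a helper named enable_linger (a hypothetical name) and plain stderr warnings in place of whatever warning helper the script actually uses:

```shell
#!/usr/bin/env bash
# Sketch only: enable_linger is a hypothetical helper name.
enable_linger() {
  if ! command -v loginctl >/dev/null 2>&1; then
    echo "warning: loginctl not found; services may stop at logout" >&2
    return 0
  fi
  # Enable lingering for the current user; a failure is reported, not fatal.
  if ! loginctl enable-linger "$(id -un)" >/dev/null 2>&1; then
    echo "warning: loginctl enable-linger failed; services may stop at logout" >&2
  fi
  return 0
}
```

Returning 0 on every path keeps the install going on machines without a full systemd login session.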

Three service units are generated:

| Unit | ExecStart | Notes |
|---|---|---|
| shipwright-daemon.service | <sw_bin> daemon start | 35s stop timeout |
| shipwright-dashboard.service | <bun_bin> run <repo>/dashboard/server.ts | 10s stop timeout |
| shipwright-connect.service | <sw_bin> connect start | Only created if ~/.shipwright/team-config.json exists |

Error handling: Each systemctl --user call is guarded with || true + warning output, matching the existing launchd pattern of non-fatal load/unload failures (see lines 227–244 of current sw-launchd.sh). Unsupported platforms (neither macOS nor Linux) get a clear error message and exit 1.
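
One way to express that guard is a thin wrapper around systemctl --user that turns failures into warnings. The sketch below assumes hypothetical helper names (sc_user, start_units) and is not taken from sw-launchd.sh.

```shell
#!/usr/bin/env bash
# Sketch only: sc_user and start_units are hypothetical helper names.
sc_user() {
  # Run systemctl --user with the given arguments; warn instead of failing.
  systemctl --user "$@" 2>/dev/null || {
    echo "warning: systemctl --user $* failed (continuing)" >&2
    return 0
  }
}

start_units() {
  # Reload unit definitions, then enable and start each unit passed in.
  local unit
  sc_user daemon-reload
  for unit in "$@"; do
    sc_user enable "$unit"
    sc_user start "$unit"
  done
}
```

Since every call goes through sc_user, a missing or failing systemctl produces warnings but still lets the install write its unit files.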

Platform dispatch pattern:

cmd_install() {
  # Dispatch on the platform helpers from lib/compat.sh.
  if is_macos; then
    install_launchd
  elif is_linux; then
    install_systemd
  else
    error "Unsupported platform"
    exit 1
  fi
}

Alternatives Considered

  1. Separate sw-systemd.sh script + new CLI subcommand. Pros: no risk of breaking existing macOS behavior; clear separation. Cons: duplicates sw binary resolution, log directory setup, and connect-conditional logic; requires a new CLI router entry in scripts/sw; users must learn a different command per platform. Extending the existing script was chosen because the user-facing command should be platform-agnostic.

  2. System-level systemd units (/etc/systemd/system/). Pros: survive reboots without linger; visible to all users. Cons: requires sudo for install/uninstall, which breaks the no-root model that mirrors launchd user agents; multi-user install is out of scope. User-level units are the correct analog.

  3. Docker/container-based supervision. Pros: works on any OS. Cons: heavy dependency; the daemon itself spawns tmux sessions and Claude processes that assume a host environment; container isolation would break the core workflow.

Implementation Plan

  • Files to create:

    • scripts/sw-launchd-test.sh — New test suite (~13 tests) covering both platforms via _COMPAT_UNAME override
  • Files to modify:

    • scripts/sw-launchd.sh — Replace check_macos() with platform dispatch; extract macOS logic into install_launchd()/uninstall_launchd()/status_launchd(); add install_systemd()/uninstall_systemd()/status_systemd(); update help text and header
    • package.json (line 32) — Append && bash scripts/sw-launchd-test.sh to the test script chain
    • .github/workflows/test.yml — Add Run launchd tests step after the tmux tests block (line 92)
  • Dependencies: None. systemctl, loginctl, and journalctl ship with systemd, which is the default on most mainstream Linux distributions. The script degrades gracefully if they are missing.

  • Risk areas:

    • _COMPAT_UNAME override fidelity — Tests mock the platform but can't exercise real systemctl/launchctl calls. The mock binaries in $PATH must faithfully simulate exit codes and stdout patterns. Mitigated by testing file generation content (grep for KillSignal=SIGTERM, TimeoutStopSec=35, etc.) rather than relying on tool behavior.
    • loginctl enable-linger on CI — GitHub Actions ubuntu runners may not have a full systemd user session. The install test should verify unit file generation without requiring systemctl daemon-reload to succeed. Mock loginctl in the test PATH.
    • Plist regression — Extracting macOS logic into helper functions could introduce bugs if variable scoping (local) is wrong. The existing macOS tests (if run on macOS CI) serve as regression guard.
    • Bash 3.2 compatibility — All new code must avoid associative arrays, readarray, ${var,,} lowercase, ${var^^} uppercase. No new syntax risks in this change since it's straightforward conditionals and heredocs.
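
The mitigation of asserting on generated file content, rather than on tool behaviour, could look roughly like the sketch below. It borrows the PASS/FAIL counter and TEMP_DIR ideas named in the constraints, but assert_contains is a hypothetical helper and the unit file is a hand-written stand-in for what a real install run would produce.

```shell
#!/usr/bin/env bash
# Sketch of a content-based check. The unit file here is a stand-in; in a real
# suite it would come from running the install command with HOME redirected.
PASS=0
FAIL=0

assert_contains() {
  # usage: assert_contains <file> <fixed-string> <label>
  if grep -qF -- "$2" "$1" 2>/dev/null; then
    PASS=$((PASS + 1))
    echo "PASS: $3"
  else
    FAIL=$((FAIL + 1))
    echo "FAIL: $3"
  fi
}

TEMP_DIR="$(mktemp -d)"
UNIT_FILE="$TEMP_DIR/.config/systemd/user/shipwright-daemon.service"
mkdir -p "$(dirname "$UNIT_FILE")"
printf '%s\n' '[Service]' 'KillSignal=SIGTERM' 'TimeoutStopSec=35' \
  'Restart=on-failure' > "$UNIT_FILE"

assert_contains "$UNIT_FILE" "KillSignal=SIGTERM" "daemon unit uses SIGTERM"
assert_contains "$UNIT_FILE" "TimeoutStopSec=35" "daemon unit has 35s stop timeout"
assert_contains "$UNIT_FILE" "Restart=on-failure" "daemon unit restarts on failure"

rm -rf "$TEMP_DIR"
echo "passed=$PASS failed=$FAIL"
```

Checks of this kind do not depend on a working systemd user session, which is why they suit CI runners where systemctl --user may be unavailable.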

Validation Criteria

  • shipwright launchd install on Linux generates three .service files in ~/.config/systemd/user/ with correct KillSignal=SIGTERM, TimeoutStopSec (35 for the daemon, 10 for dashboard and connect), Restart=on-failure, and StandardOutput=journal directives
  • shipwright launchd install on macOS continues to generate three .plist files in ~/Library/LaunchAgents/ (no regression)
  • shipwright launchd uninstall on Linux removes unit files and calls systemctl --user disable/stop for each service
  • shipwright launchd status on Linux queries systemctl --user is-active and shows journal entries
  • Connect service is only generated when ~/.shipwright/team-config.json exists (both platforms)
  • loginctl enable-linger is called during install; failure is a warning, not a hard error
  • Unsupported platforms (neither macOS nor Linux) get a clear error and exit 1
  • New sw-launchd-test.sh passes on both macos-latest and ubuntu-latest CI runners
  • All 23 test suites pass (npm test — 22 existing + 1 new)
  • No Bash 3.2 violations (no associative arrays, no readarray, no ${var,,})
  • Unit file ExecStart paths are resolved to absolute paths (no relative sw references)
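
The absolute-path criterion lends itself to a small automated check. The sketch below assumes a hypothetical helper, execstart_is_absolute, and ignores systemd's optional ExecStart prefixes such as "-" or "@", which the units described here do not use.

```shell
#!/usr/bin/env bash
# Sketch only: execstart_is_absolute is a hypothetical helper name.
execstart_is_absolute() {
  # Succeeds only if the file has at least one ExecStart= line and every
  # ExecStart= line begins its command with "/".
  local file="$1"
  grep -q '^ExecStart=' "$file" || return 1
  ! grep '^ExecStart=' "$file" | grep -qv '^ExecStart=/'
}
```

Running it over each generated .service file would flag any unit whose command was written as a bare sw rather than a resolved path.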
