-
Notifications
You must be signed in to change notification settings - Fork 0
Releases: Conalh/AgentPulse
v0.8.0: Headless --once parity (hide-idle/max-sessions), fail-closed CI gate, and hardening
A correctness-and-hardening pass driven by an external code review. The headless agentpulse live --once path (the form the GitHub Action runs) is brought in line with the TUI and the inputs the Action already documented; the strict CI gate no longer passes when its own analysis crashed; the macOS notifier escaping is fixed; and discovery gains cost bounds while the report gains an opt-in redaction layer. Detector scope, the Action's supply-chain model, and the CI output's privacy surface are now documented honestly instead of being overstated or left implicit.
Fixed — live --once silently ignored --hide-idle and --max-sessions
Both flags are parsed into LiveOptions by the CLI and forwarded by the GitHub Action (max-sessions defaults to 20), and the interactive TUI honours both — but runOnceMode never read them, so in headless/CI mode they were no-ops despite the Action documenting them. --once now applies both: --hide-idle drops zero-activity sessions from the report (error rows are kept; idle sessions never gate, so this can't change the gate), and --max-sessions caps the printed per-session listing. The cap is deliberately display-only — bucketCounts, hasGatingFinding, and the JSON sessions[] still reflect every analyzed session, so a drifting/stuck session below the cap still fails the build — and the text report announces the truncation (showing freshest N of M sessions) rather than hiding it. SessionSnapshot now carries toolInvocationCount (sourced from recap.enriched) for the idle filter and JSON consumers.
Fixed — strict CI gate failed open on transcript/parser errors
In --once, a session whose pulse() threw (unreadable or corrupt transcript, parser failure) becomes a snapshot with bucket error. The gating set is {drifting, stuck} and --strict exited 1 only on hasGatingFinding, so a run consisting solely of broken transcripts exited 0 under --strict — a governance gate passing green though its own analysis never ran. New --fail-on-error (and Action input fail-on-error, defaulting to true) makes --strict also fail when any session is in the error bucket. Bare --strict without the flag keeps the prior advisory-on-error behaviour so local use isn't disrupted. The exit-code decision is now a pure, exported gateExitCode() so the gate contract is unit-tested directly.
Fixed — macOS notification could break out of the AppleScript string literal
The --notify os macOS branch assembles a single display notification "..." with title "..." expression from transcript-derived text (the session label is ultimately basename(cwd), unsanitized) but escaped only double-quotes. A label ending in a backslash turned the closing quote into an escaped quote so the literal never terminated — an AppleScript-injection/breakage vector — and a raw newline made the one-line expression a syntax error. The escaper now strips control characters (CR/LF included) and escapes the backslash before the quote, and all platforms clamp label length. The Windows PowerShell branch was already sound (every value lands in a single-quoted literal with '→'' doubling, the complete escaping there) and is unchanged apart from the length clamp; Linux uses pure argv. New direct tests fuzz the escaper with quotes, backslashes, newlines, and an injection-shaped payload.
Changed — narrowed the text-only verification failure heuristic
classifyResultText (consulted only when a verification command has no exit code — callers prefer the exit code) included a bare /\berror\b/i in its failure table, so any result text merely mentioning the word "error" was classified as a failure, which could bias the stuck/converging/sequence verdicts. The bare match is replaced with runner-shaped error contexts (Error: at line start, npm ERR!, error TS####, N error(s), compilation/build/command failed); genuinely-red output still matches, an incidental "error" no longer does. A stale comment in sequences.ts (claiming indeterminate verifications count as failures for stuck_loop, when the detector only counts clear failures) is corrected.
Added — bounded discovery (--max-depth, --exclude)
Discovery and the polling watcher walked transcript roots with an unbounded recursive descent and applied staleness only after every file was found and stat'd, so a broad --roots (or a CI artifact dir with deep history) could incur unnecessary I/O. --max-depth <N> caps how far below each root the walk descends and --exclude <d1,d2,...> prunes directory names (e.g. node_modules,.git) before recursing; both are threaded through the one-shot discovery and the live watcher, and exposed as Action inputs. Defaults are unchanged (unbounded / no exclusions), so existing behaviour is preserved.
Added — opt-in redaction for --once output (--redact)
The Action streams the report into the GitHub step summary and an optional sticky PR comment; narratives, signals, and the transcript path embed transcript-derived absolute paths, path clusters, and file names, with no way to scrub them. New --redact <none|paths|all> (and Action input redact): paths reduces file paths / path clusters in narratives, signals, and the transcript path to basenames; all additionally drops narratives and signals, leaving label, bucket, confidence, and drift count. Applies to both text and json.
Documented — detector scope, supply chain, and CI privacy
- Drift-detector scope. The README now states that
driftingmeans "a known risky pattern fired," not "this session is safe," and lists the three rule families the detector actually implements (privileged-path access;curl/wgetpiped to a shell; write outside the repo root) plus the notable patterns it does not cover. The "network execution" wording that overstated the single shell-pipe rule is corrected. - Action supply chain. The Action executes the published npm package
@conalh/agentpulse@<version>, not the checked-out action source; pinning the git ref pins only the version string while the code and its semver-compatible dependencies resolve from npm at run time. This trust boundary is now documented in the README andaction.yml. - CI privacy. The README documents exactly what transcript-derived content reaches the step summary / PR comment (and the additional fields in the JSON snapshot), with guidance to use
redactand not to upload the raw JSON publicly.
Assets 2
v0.7.3: Normalize agent-gov-core dependency to ^1.3.0
Internal
- Bumped
agent-gov-coredependency^1.2.1→^1.3.0to align with the rest of the suite (all five detectors are on^1.3.0). - No behavior change. AgentPulse uses core's transcript parsers and the
Finding/Reportcontract — all unchanged across 1.2.1→1.3.0. The 1.3.0 additions (diff-input guards) are for PR-diff detectors and aren't on AgentPulse's path. Verdicts and report output are unchanged. - README + example workflow self-pins bumped to
v0.7.3. - Takes effect for the GitHub Action once
@conalh/agentpulse@0.7.3is published to npm (the Actionnpx-installs the published package).
Assets 2
v0.7.2: Classifier + parse-cache correctness pass (multibyte fix, dedup, fewer false positives)
A correctness-and-consistency pass across the classifier pipeline and the TUI. Several rules silently mis-fired on real (non-synthetic) sessions — duplicated verification vocabulary across three layers, a stuck_loop that never triggered on actual transcripts, prose-only sessions flagged as stuck, and an incremental parser that double-counted events on any non-ASCII transcript — alongside internal de-duplication of constants/namespaces and two TUI render/label fixes.
Fixed — verification vocabulary was duplicated across three layers
The set of commands that count as a "verification" (running tests, a type checker, a linter, or a build) and the pass/fail result-text tables were maintained as three divergent copies — one each in enrich.ts, sequences.ts, and trajectory.ts. The lists had genuinely drifted (go vet, cargo clippy, mypy, ruff, gradle, mvn, npm run, playwright, prettier were recognized by the sequence detector but not the verification-trend computation), so a command could count toward the edit→verify cycle but not the pass/fail trend — or vice versa — silently. All three layers now share src/verification.ts (isVerificationCommand + classifyResultText); the trend layer gains the tools it was missing and switches from a loose substring match to the precise word-boundary regex the other layers already used.
Fixed — stuck_loop never fired on real transcripts (only synthetic fixtures)
Layer 2.5's detectStuckLoop read the verification exit code off the Bash tool_use event, but the parser emits tool_use and tool_result as separate events with the exit code on the tool_result (keyed by toolUseId). So every real session's verification outcome resolved to indeterminate, no failures were counted, and a wedged edit→fail→edit→fail loop silently degraded to a healthy tdd_loop. analyzeSequences now links tool_result outcomes by id — the same fix computeVerificationTrend got in v0.6.1 — so stuck_loop fires on real transcripts. Locked in by a new expectedSequence assertion on the golden corpus.
Fixed — prose-only sessions false-positived as stuck
A documentation session (5+ .md edits, no test command) tripped refuse_to_verify and flipped to stuck, because the detector counted every edit toward its ≥4 threshold. Editing a doc set legitimately has nothing to verify, so detectRefuseToVerify now counts only code edits — prose extensions (.md/.mdx/.markdown/.txt/.rst/.adoc/.org) don't count, and unknown paths still count as code (conservative toward firing). New doc-edits-no-tests corpus fixture guards the converging outcome; a .ssh-keys cwd fixture guards that the privileged-path detector doesn't misfire on hyphenated lookalikes.
Fixed — incremental parse cache duplicated events on any non-ASCII session
The incremental tail-reader (parseCache.ts) tracks a byte offset of the last complete line, but it derived how far it advanced from a UTF-16 string index (raw.lastIndexOf('\n')). Any multibyte character — emoji, accents, CJK, all common in agent prose, code, and tool output — made the two diverge, so the next read rewound before the true position, re-parsed already-consumed lines, and appended them as duplicate events. Those duplicates inflate the edit/verification counts the classifier runs on, so the live dashboard could land on the wrong bucket for essentially any real session. (The whole-file recap path was unaffected; this was incremental-only — the live TUI and orchestrator.) The reader now finds the newline in the raw buffer (0x0A can't appear inside a UTF-8 multibyte sequence) so the offset is exact. New regression fixture reproduces the duplicate.
Fixed — narrative privileged-path lede claimed coverage the detector lacked
renderDrifting's privileged-path matcher tested for gnupg and credential, but PRIVILEGED_PATH_RULES only ever emits ssh_path / aws_path / kube_path / etc_shadow / private_var — so those terms were dead and the lede over-promised "credentials." The matcher and wording now reflect what's actually flagged (SSH/AWS/Kube/system paths).
Changed — internal duplication cleanup
Default timing constants (DEFAULT_WINDOW_MS, DEFAULT_REFRESH_INTERVAL_MS, DEFAULT_STALE_MS) were re-spelled as literals across five files with two different spellings of the same value; they now live in src/defaults.ts and are imported everywhere. The agent_pulse.live_drift_<slug> namespace convention was likewise built in one place and parsed back with two divergent copies (one using a magic +7); a new src/drift.ts owns the prefix plus both directions (driftKind / driftSlug). No behaviour change. Also corrected stale comments (orchestrator coalescing "exactly once" → "~2 pulses max", the exploring bucket's "no edits yet", stacked JSDoc on the exceptions loader, the DiscoverOptions.staleMs default documented as "24 hours" when it is 1 hour, and evictParseCache's "called by the watcher" — it's the orchestrator reacting to a watcher remove event).
Fixed — the TUI re-rendered every second even with nothing to animate
App's 1 s wall-clock interval drives only the whitelist-preview countdown banner, but it ran unconditionally for the whole session — so the root reconciled every second (children bail out via React.memo, but the parent render itself doesn't), the exact thrash the v0.5.3 self-clocking-footer refactor set out to remove. The interval now runs only while a preview banner is on screen.
Fixed — session detail header disagreed with its own list row
The detail pane resolved its title as projectName ?? id, while the session list, the sort key, and the headless --once snapshot all use fallbackLabel(). So the same session could read as a 12-char hex id in the detail header but as the path-inferred name (e.g. "AgentPulse") in the list, an empty projectName rendered as a blank header, and — most visibly — a user's rename alias updated the list row but not the detail header. The detail pane now shares fallbackLabel() and applies the same alias prefix.
Assets 2
v0.7.1: session-noise + cross-runtime accuracy pass
[0.7.1] — 2026年05月28日
Session-noise and cross-runtime accuracy pass, driven by real multi-terminal session sets. The TUI had quietly accumulated fixes that the headless agentpulse live --once path never inherited, so machine-readable output and --notify diverged from what the dashboard showed.
Fixed — session noise the headless --once snapshot missed (#11)
Three dashboard-hygiene rules lived only inside the TUI (src/tui/App.tsx, src/tui/SessionList.tsx) where the --once path couldn't reach them. They are now in shared modules (src/sessions/subagents.ts, src/labels.ts) and wired into both surfaces:
- SDK subagents leaked into the snapshot. The TUI filtered
agent-<hex>.jsonlsubagent transcripts by default (the--show-subagentsopt-in, since v0.2.3), but--oncedid not. A real session with onelabparent + four subagents reported five identicallabrows in JSON.once.tsnow uses the sharedisSubagentTranscript()predicate — same default-off behaviour, same--show-subagentsopt-in. The predicate also catches the structural<parent>/subagents/agent-<hex>.jsonllayout where the parent's project name leaks into the child. - Co-named sessions were indistinguishable in JSON/notify.
SessionListhas suffixed colliding<project>|<runtime>rows with· <hex>since v0.4.4, but--onceemitted bareprojectName. Two parallelAgentPulsesessions (e.g. one Claude Code, one Codex) couldn't be told apart in output or notifications. A newlabelfield on the snapshot carries the disambiguated string via the sharedcomputeDisambigSuffixes()helper;projectNameis preserved unchanged for existing programmatic consumers, and--notifybodies now uselabel.
Fixed — project names were systematically wrong on multi-word Claude Code projects (#11)
decodeSlug's last-hyphen-segment rule is structurally lossy: a slug like C--FullStee-nutrition-experiment-lab can't be split back into prefix + project because hyphens are both path-separator encodings and legal name characters, so every multi-word project decoded to its last segment (lab, brief, ...). The transcript's first line carries the authoritative cwd, so naming now prefers basename(cwd) and falls back to slug-decode only when cwd is absent. Compounding it, extractCwdFromFirstLine was strictly first-line-only — but Claude Code transcripts open with permission-mode / file-history-snapshot lines before the first cwd-bearing message, so the cwd-first path never ran. It now scans up to 25 lines / 64 KB. A subagent-shaped cwd basename (agent-<hex>) is rejected so a project is never mislabelled as a tooling artifact.
Fixed — exploring narrative contradicted its own signals (#11)
The low-confidence exploring fallback (trajectory.ts:799) fires regardless of edit count, but the narrative renderer unconditionally appended "No edits yet — it's still figuring out the shape." A session with 20 edits in the window directly contradicted its Signals line. renderExploring now branches on edit count and emits an honest hedge ("Mixed signal: N edits in the window but no clear direction yet") when the fallback fires with edits present.
Fixed — current Codex desktop session format
Recognizes the current Codex desktop transcript shape, including custom_tool_call / custom_tool_call_output events that the substrate parser doesn't yet model, so Codex desktop sessions classify (editing / verification / drift) instead of falling through to other. Adds codex-desktop-* golden corpus fixtures.
Bumped — agent-gov-core ^1.2.1
v0.7.0 pinned ^1.2.0 against an unpublished substrate version, so npm ci hit ETARGET in CI. Substrate has since shipped v1.2.0 + v1.2.1 (streaming reader patch on top of the Antigravity parser); the caret now matches the registry.
Tests
240 tests pass (up from 228 at v0.7.0). New coverage for subagent filtering (both directions), co-named disambiguation, cwd-first naming across the metadata prelude, the exploring-with-edits narrative branch, and Codex desktop classification. Windows CI flake fixed by adding maxRetries/retryDelay to ~60 recursive rmSync teardowns (lazy handle release intermittently threw ENOTEMPTY/EBUSY).
Assets 2
v0.7.0: Native Antigravity session tracking support
Added — Native Antigravity Session Tracking
Added first-class support for Google DeepMind's Antigravity transcript format (~/.gemini/antigravity/brain). Integrates seamlessly with the thread-safe agent-gov-core@1.2.0 substrate parser to provide fully aligned session tracking and live TUI verdict support.
-
Stateless Sequential
toolUseIdLinkage: Uses the new threadableactiveToolCallsparameter from the substrate parser to map Antigravity's asymmetricalreplace_file_content$\leftrightarrow$ REPLACE_FILE_CONTENTtools sequentially while avoiding any module-level state. -
Empty
assistant_messageConditionally Pruned: Automatically checks for empty planner responses and suppresses empty placeholder messages. -
CommandLine$\to$ commandNormalization: Auto-translates Antigravity parameters insideunwrapArgsso that existing verifiers and checkers require zero modification downstream. -
Cwd-level Path Resolution: Scans early lines in the session JSONL to locate and extract
Cwd/DirectoryPath/SearchPath, and populates the event-levelcwdfor robust drift and relative path analyses. -
Exit Code Extraction: Grounded parsing of the verified
RUN_COMMANDexit-code shapes (failure, completed successfully, and status error). -
Orphan Cleanup: Integrates memory-safe cleanup inside
parseCache.tsto prune active tool call mappings once their corresponding events fall out of the cache's active window. -
Golden Integration scenario: Added
antigravity-converging.jsonlgolden replay case to the test corpus to ensure regression protection.
Assets 2
v0.6.2: Codex external inspection fixes (cross-runtime file-paths + relative path drift)
Fixed — multi-runtime adapter drift (Codex external inspection)
The README claimed Claude Code / Cursor / Codex support, and the substrate parsers in agent-gov-core@1.1.0 do parse all three line formats correctly. But the consumer layers in this package (enrichment, sequences, trajectory) were written against the Anthropic tool vocabulary and silently misclassified Cursor + Codex events.
A Codex inspection (post-v0.6.1) confirmed: running a real Codex fixture through pulse() counted both Codex tool calls as other — zero editing, zero verification, no primary files. Cursor's toolInput.path (vs Anthropic's toolInput.file_path) was equally invisible to the enrichment and sequence layers (the trajectory layer already handled both — partial coverage that made the gap easy to miss).
The shape of the fix:
-
New
src/normalize.tscentralizes two helpers used wherever a consumer reaches into aTranscriptEvent:canonicalToolName(name)maps Codex'sshell→Bashandapply_patch→Edit. Unknown names pass through verbatim.extractFilePath(toolInput)triesfile_path/path/filePath/notebook_pathin order, plus anapply_patchfallback that parses*** Add File: <path>/*** Update File: <path>markers from the Codex patch body.
-
src/enrich.ts:classifyToolUseroutes the toolName throughcanonicalToolNamebefore the switch.getFilePathdelegates to the shared normalizer. -
src/sequences.ts:classifyEvent+extractFilePath— same treatment. -
src/trajectory.ts:isVerificationEventaccepts canonicalBash(so Codexshelllands). The Write-shaped drift detector also accepts canonicalEdit(so Codexapply_patchparticipates). -
MCP tool detection stays keyed off the raw name (canonicalization only rewrites known runtime aliases, not MCP-prefix patterns).
Fixed — relative paths false-tripped outside-repo drift
isPathOutsideNormalizedRoot compared the raw file path against the normalized root. A relative path like src/utils/date.ts never starts with c:/dev/agentpulse/, so every relative Write triggered the outside-repo detector — a false positive on legitimate edits. Cursor (and Codex's apply_patch) emit relative paths constantly.
src/trajectory.ts:isPathOutsideNormalizedRootnow accepts an optionalcwdparameter. NewisAbsolutePathhelper detects Unix/...and WindowsC:/...shapes; relative paths get resolved against the event'scwdbefore comparison. Whencwdis unavailable, the function defensively returnsfalse(NOT outside) — a false negative on drift detection is strictly better than a false positive on legitimate edits.
Fixed — watcher derived project name before extracting cwd
src/sessions/watcher.ts called deriveProjectName(absPath, root.path) BEFORE extractCwdFromFirstLine, missing the cwd-aware fallback that src/sessions/discovery.ts uses. For Codex's date-shaped slugs (~/.codex/sessions/2026/05/23/...), a newly-watched session showed worse labels (date stub) until the next discovery cycle picked it up.
- Swapped the call order to match discovery: extract cwd first, pass it into
deriveProjectName. Three-line fix.
Fixed — npm tarball missed the README hero image
package.json files whitelist included dist, LICENSE, README.md, CHANGELOG.md — but not assets/. The v0.6.1 README references ./assets/dashboard.svg, so anyone viewing the README on npmjs.com saw a broken image. Added assets/dashboard.svg to the whitelist.
Added — cross-runtime corpus fixtures
The golden corpus (added in v0.6.1) was Claude Code-only, which is exactly why the multi-runtime gap stayed hidden through 222 tests. v0.6.2 adds:
test/corpus/cursor-converging.jsonl— Anthropic envelope withtoolInput.pathinstead offile_path. Locks in theextractFilePathfix.test/corpus/codex-converging.jsonl— Codexresponse_itemshape withshell+apply_patch. Locks in the canonical-name + apply_patch-path-extraction fixes.
Both expect converging with confidence ≥ 0.6. Pre-fix they would have landed as exploring or idle.
Added — targeted relative-path drift tests
Three new tests in trajectory.test.mjs:
- Relative Write resolved against cwd inside repoRoot → no drift
- Relative Write without cwd → no drift (defensive default)
- Absolute Write outside repo still trips drift (regression check for the v0.6.2 narrowing)
Tests
227 (was 222). Five new: two corpus scenarios + three trajectory tests.
Bumped
- Action ref in
README.md+examples/agentpulse-pr-check.yml:@v0.6.1→@v0.6.2.
Assets 2
v0.6.1: Regression armor — property tests + golden corpus + classifier bug fix
The "regression armor" batch — property tests (Gemini #9) + golden replay corpus (Gemini #3). Both items closed the inspection backlog. Both invisible to users; both immediately earned their keep.
Added — golden replay corpus (Gemini #3)
test/corpus/ is a small, labelled collection of representative transcript JSONLs paired with the bucket each must classify into. test/corpus.test.mjs reads manifest.json, runs the full pulse() pipeline against each fixture, and asserts the verdict. Six scenarios cover the six trajectory buckets — converging, stuck, exploring, done, drifting (shell_exfil), idle.
Adding a scenario: drop <name>.jsonl into test/corpus/, add a manifest entry with { file, endAt, windowMs, expectedBucket, minConfidence, expectedDriftCount }, run npm test. New scenarios pick up automatically.
Fixed — verification trend was silently no_data for ALL real parsed transcripts
The corpus surfaced this on its first run. computeVerificationTrend in src/trajectory.ts looked for toolResultExitCode on the tool_use event — the synthetic test fixture shape. But the real parser emits tool_use and tool_result as SEPARATE TranscriptEvent objects linked by toolUseId; exit codes live on the tool_result. So every real Claude Code / Cursor / Codex transcript yielded verificationTrend === 'no_data', which meant the converging bucket effectively never fired in production — only in synthetic tests.
- Fix:
computeVerificationTrendnow builds atoolUseId → tool_resultmap up front and falls back to the linkedtool_resultwhen thetool_useitself has no exit code. Backward-compatible — the synthetic shape (exit code on the tool_use) still works. - The corpus catches the symptom:
converging.jsonlhas a fail→pass test sequence and asserts the bucket isconvergingwith confidence ≥ 0.6. Pre-fix that asserted, post-fix it passes.
This is the single biggest classifier-quality fix since v0.3.5. Anyone running v0.5.x against a real session was getting a downgraded verdict (likely exploring or idle instead of converging) anytime the agent did TDD-shaped work. v0.6.1 restores the intended behaviour.
Added — property tests on the pure-core layers (Gemini #9)
test/property.test.mjs runs randomized-input fuzz loops (200 iterations each) over the pure-function surfaces and asserts invariants that must hold for ANY input:
classifyTrajectory: confidence always in [0, 1], bucket always one of the six known values, deterministic (same inputs → same verdict), drifts array only non-empty ondriftingsparkline: output length always equals requested width, empty-input fallback respectedparseDuration: linearity (N * parseDuration('1m') === parseDuration('Nm')), malformed inputs reject without throwingapplyHysteresis: input state never mutated, stable bucket only flips after two consecutive agreementsanalyzeSequences: pattern always in the known set, confidence in [0, 1]readOutcomeSignal:idleGapMsnever negative
Hand-rolled Math.random() generator + seeded PRNG (set AGENTPULSE_PROPERTY_SEED=<n> to replay a specific run). Zero deps — fast-check would be a single-purpose addition and these invariants are simple enough.
Fixed — CLI module side-effected on import
src/cli.ts called main() unconditionally at module load. Importing parseDuration from it (which the property tests needed) ran the CLI's main function, printed usage, and exited — breaking any consumer that wanted to use the file as a library. Guarded main() with the standard ESM idiom (process.argv[1] === fileURLToPath(import.meta.url)).
Tests
222 (was 204). Eighteen new tests:
- 12 property tests covering invariants across
trajectory,sequences,sparkline,cli(parseDuration),hysteresis,outcome - 6 corpus scenarios (one per bucket) running end-to-end through the full pipeline
Bumped
- Action ref in
README.md+examples/agentpulse-pr-check.yml:@v0.6.0→@v0.6.1.
Assets 2
v0.6.0: Incremental transcript parsing — biggest perf win
Added — incremental transcript parsing (Gemini #1)
The biggest perf win on the inspection backlog. Pre-v0.6, every live-mode refresh re-read the full transcript file end-to-end: a 2-hour Claude Code session with a 10 MB JSONL ate ~10 MB of disk I/O and N thousand JSON.parse calls every 30 s, of which most events were immediately dropped by the windowing filter. With N sessions in the dashboard, multiply by N.
v0.6.0 reads only what's new. Per-path state tracks the byte offset of the last complete line we've parsed; subsequent pulses tail-read [lastOffset, currentSize) instead of the whole file.
How it works (src/parseCache.ts)
On each call to readWindowFromCache(path, since, until):
statthe file. ComparemtimeMs+sizeagainst the cached values.- No change: return window-filtered cached events. Zero I/O beyond the stat.
- Grew: open the file,
read(buf, 0, size - lastOffset, lastOffset)— exactly the new bytes. Parse appended lines, append to cached event list, advancelastOffsetpast the last complete\n. Bytes after the last\n(an in-progress line) get deferred to the next pulse. - Shrank or mtime regressed: file was rotated/truncated. Evict the cache entry and full-re-read on this pulse.
- Missing file: throws (matches
parseTranscriptDir's contract; the orchestrator captures this intoSessionState.error).
Cached events are pruned with a 1-hour margin past the window start so a 24-hour session doesn't grow memory without bound, while still tolerating --window resizes without a full re-read.
Wiring
PulseOptions.events(new) — caller-supplied pre-parsed events that bypasspulse()'s own parser entirely. CLI callers (recap,live --once) leave this unset; the orchestrator sets it.src/orchestrator.ts:runPulse— callsreadWindowFromCache(path, startAt, endAt, { silent: true })first, thenpulse({ events, endAt, ... }). The explicitendAtkeeps the window math consistent between the cache filter and pulse's internal startAt computation.evictParseCache(path)wired into the orchestrator'sremove(sessionId)so the cache doesn't hang onto state for a deleted session.
Substrate
No agent-gov-core change needed for this release. The substrate's v1.1.0 per-runtime line parsers (parseAnthropicLine, parseCodexLine, isCodexLine, etc.) and timestamp helpers (coerceTimestamp, interpolateTimestamps, isRecord) cover everything the cache needs. The cache is pure AgentPulse logic on top of the existing public surface.
Expected speedup
For a 10 MB transcript on a 30 s refresh:
- Pre-v0.6: ~10 MB read + N thousand
JSON.parsecalls per pulse. - Post-v0.6: typical pulse reads only what the agent appended in the last 30 s (often a few KB) + parses ~5–50 lines.
The bigger the transcript, the bigger the gain. CI / live --once mode is unaffected — that path keeps using parseTranscriptDir (whole-file, batch).
Why a minor bump (v0.6.0)
PulseOptions.events is an additive optional field — fully backward-compatible. The behaviour change is internal (live-mode orchestrator routes through the cache); external API is unchanged. Calling this v0.6.0 instead of v0.5.7 to mark a meaningful internal performance shift, not because anything broke.
Tests
204 (was 194). Ten new tests in parseCache.test.mjs:
- First-call full parse
- Second-call cache hit (no re-read)
- Tail-read only of appended bytes
- Window filter applied at each call (sliding window)
- Incomplete final line deferred to next read
- Rotation/truncation (file shrinks) evicts cache and re-reads
- Missing file throws (matches parseTranscriptDir contract)
- File disappearing between pulses throws on the next call
evictParseCache(path)forces a full re-read- Old events past the prune threshold drop out of the cached set
The orchestrator's existing 7 tests cover the integration; one test had to verify that "errors are captured" still works — the cache's new "throw on missing file" behaviour preserves it.
Bumped
- Action ref in
README.md+examples/agentpulse-pr-check.yml:@v0.5.6→@v0.6.0.
Assets 2
v0.5.6: JSON read cache + AGENTPULSE_PROFILE dev mode
Added — mtime-keyed JSON file cache (Gemini #4)
loadAliases and loadExceptions are called on every pulse — with 10 sessions on a 30 s refresh, that's 20+ small JSON file reads per cycle, most returning identical content. v0.5.6 wraps both behind a shared process-wide cache that compares mtime before re-reading.
- New
src/jsonCache.tsexportsreadJsonCached(path, parse, whenMissing)andinvalidateJsonCache(path). On every call:statthe file, compare mtime against cached value; mtime match → return cached parse output, mismatch (or ENOENT vs. present) → re-read + reparse + cache. aliases.ts:readAliasFile+exceptions.ts:loadExceptionsnow route through it. Same observable behaviour, fewer disk reads.setAliasandappendExceptionsinvalidate the cache after a successful write so the next read picks up the new content even if mtime resolution lags (Windows + some network filesystems can be ~1 s late).- Failure modes (parse error, read error) are cached as
nullwith an mtime sentinel of0— so a permanently-broken file doesn't get reparsed every pulse, but a fixed file is detected on the next mtime change. - Most visible on Windows where small-file stat+read latency adds up. On a 10-session refresh, this is roughly a ×ばつ reduction in alias/exception disk operations per cycle.
Added — dev-only per-layer profiling (Gemini #10)
Set AGENTPULSE_PROFILE=1 in the environment to log per-layer pulse timings to stderr:
[agentpulse:profile] parse: 142.3ms · enrich: 3.1ms · sequence: 0.4ms · exceptions: 0.2ms · classify: 1.1ms · narrative: 0.1ms
- Off by default — one env-var read per pulse, zero allocation in the hot path. Reads dynamically (not at module-load) so tests can toggle freely.
- Goes to stderr because stdout is in the TUI's alt-screen buffer during
agentpulse live— any write to stdout there causes whole-window flicker on cmd.exe. - Pairs naturally with the incremental-parse work coming in v0.6.0+: the per-layer numbers tell you immediately whether
parse:actually dropped after a refactor.
Tests
194 (was 183). Eleven new tests across two new files:
jsonCache.test.mjs(9 tests): missing-file → fallback, valid-file → parsed, parser-not-re-invoked-on-cache-hit, mtime-change-forces-reread, malformed-JSON → fallback + cached as missing, missing-then-appearing detected,invalidate()forces re-read,clear()drops everything.cli.test.mjs(2 tests):AGENTPULSE_PROFILE=1produces the expected stderr line shape AND stdout JSON stays valid; absence of the env var produces no profile line.
Bumped
- Action ref in
README.md+examples/agentpulse-pr-check.yml:@v0.5.5→@v0.5.6.
Assets 2
v0.5.5: CI speedup — parallel --once + Action uses the npm package
CI speedup batch
Two compounding changes from the external inspection backlog (Gemini #5 + #6). No user-facing behaviour change in the live TUI; the GitHub Action and live --once mode both get visibly faster.
Changed — live --once now runs sessions in parallel
Pre-fix, runOnceMode awaited each session's pulse() sequentially in a for loop. On CI runs against artifact directories with dozens of historical transcripts, this scaled linearly — and CI runs against artifact directories with dozens of historical transcripts is the whole point of --once.
- New
runConcurrent(tasks, concurrency)helper insrc/once.ts(exported for direct unit testing). Hand-rolled — 12 lines, zero deps. Spawns up toconcurrencyworkers that pull tasks off the list in order; per-index result assignment preserves the original input order so the text report iterates sessions in their discovery order, same as the pre-fix loop. - Default concurrency is
6. Tunable viaAP_ONCE_CONCURRENCY=<N>env var for users with unusual CI runner shapes (giant artifacts on slow disks → lower; fast NVMe + many sessions → higher). - On a typical run with 24 historical transcripts, CI wall-clock for the analysis step drops from ~24 ×ばつ per-session latency to ~ceil(24/6) ×ばつ per-session latency — roughly ×ばつ faster for that step.
Changed — GitHub Action invokes the published npm package
The composite action used to npm ci --omit=dev && npm run build on every workflow run before invoking AgentPulse. That's 5-15 s of pure overhead and one extra failure mode (a transient npm registry hiccup mid-ci surfaces as "AgentPulse build failed" rather than a real signal).
- The action now resolves the version from its own checked-out
package.jsonand invokesnpx --yes @conalh/agentpulse@${VERSION} live --once .... First call populates the npm cache; second is a cache hit. The action's git ref (e.g.Conalh/AgentPulse@v0.5.5) and the npm version stay in lockstep automatically — no hardcoded version string inaction.ymlthat could drift. - The
Install AgentPulse dependencies + buildstep is gone. Theactions/setup-node@v4step stays — npx still needs node, and the runner-default node version isn't guaranteed across runners. - The action ref in
README.mdandexamples/agentpulse-pr-check.ymlis bumped to@v0.5.5.
Tests
183 (was 179). Four new tests on the runConcurrent helper:
- Preserves input order in the results array even when tasks resolve in reverse order
- Caps in-flight tasks at the configured concurrency value
- Empty task list returns an empty array
concurrency = 1is strictly sequential (degenerate but legal — useful for debugging)
The action.yml change is verified by inspection — the workflow shape doesn't have a unit-test harness, but the script logic is straightforward shell + the existing live --once CLI surface is exhaustively tested.