Releases: DFKHelper/token-goat

v1.8.0

14 Jun 01:09

@Zelys-DFKH Zelys-DFKH

9669eb7

Latest

What's New in 1.8.0

Added

curl -v verbose compression — post_bash detects curl -v/--verbose commands and strips TLS handshake noise, redundant request/response headers, and progress meters; keeps the request line, status code, content-type, and body
jest/vitest verbose PASS-suite compression — collapses PASS src/... blocks with all-green test lines into a single summary line
JUnit XML structured summary — parses <testsuites> XML from post_bash output and emits a compact pass/fail/skip count instead of raw XML
Task-output temp file redirect — pre_read detects Claude task-output files in %TEMP%/claude/... and transparently redirects to token-goat bash-output
Minified JS/CSS grep elision — post_bash truncates grep/rg hits on minified files (.min.js, .min.css, bundled output) to avoid multi-MB lines flooding context
Compaction hint suppression — suppresses the redundant re-read hint that fired after every conversation compaction even when files hadn't changed
go test -v compression — collapses passing === RUN / --- PASS blocks in verbose Go test output
make/cmake/ninja build compression — strips redundant compile command echoes while keeping warnings and errors
Python traceback deduplication — collapses repeated identical tracebacks in test output to a single copy
tsc output compression — suppresses TypeScript compiler progress lines, keeping only errors and the final summary
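As an illustration of the JUnit XML summary item above, here is a minimal sketch that reduces a `<testsuites>` document to a single count line. The function name `summarize_junit`, the sample XML, and the wording of the summary line are assumptions made for this example; the filter that ships in token-goat may differ.

```python
import xml.etree.ElementTree as ET


def summarize_junit(xml_text: str) -> str:
    """Return a one-line pass/fail/skip summary (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    # Accept both a <testsuites> wrapper and a bare <testsuite> root.
    suites = [root] if root.tag == "testsuite" else root.findall("testsuite")
    total = failed = skipped = 0
    for suite in suites:
        for case in suite.iter("testcase"):
            total += 1
            if case.find("failure") is not None or case.find("error") is not None:
                failed += 1
            elif case.find("skipped") is not None:
                skipped += 1
    passed = total - failed - skipped
    return f"junit: {passed} passed, {failed} failed, {skipped} skipped ({total} total)"


# Illustrative input, not taken from a real test run.
SAMPLE = """<testsuites>
  <testsuite name="api">
    <testcase name="test_ok"/>
    <testcase name="test_bad"><failure message="boom"/></testcase>
    <testcase name="test_skip"><skipped/></testcase>
  </testsuite>
</testsuites>"""

print(summarize_junit(SAMPLE))
```

Counting `<testcase>` children rather than trusting the suite-level `tests`/`failures` attributes keeps the sketch working when a producer omits those attributes.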

v1.7.1

12 Jun 00:46

@Zelys-DFKH Zelys-DFKH

874c6f6

Fixed

_index_spawn_active guards against PID recycling. Within the 10-minute INDEX_SPAWN_TTL the OS can reuse a finished indexer's PID for an unrelated process, blocking fresh indexing spawns for up to 10 minutes. The check now reads the running process's cmdline and returns False when it lacks token_goat, falling back to trusting the PID when the cmdline is unreadable.
kill_duplicate_daemon now unlinks the stale PID file after a successful kill. Previously the file was removed on the "already dead" path but not on the success path, leaving --check and is_worker_alive() reporting stale state until the next cleanup pass.
get_context_pressure avoids a redundant safe_load when a cache is already in scope. Accepts an optional cache= kwarg; callers that already hold a loaded SessionCache pass it in and skip the extra disk I/O.
normalize_path docstring corrected. Steps 2 and 3 were listed in the wrong order relative to actual execution.
Shell-neutral bash-compress disable hint. All 34 TOKEN_GOAT_BASH_COMPRESS=0 hint strings used POSIX-only prefix-assignment syntax, which is broken in PowerShell and cmd.exe. Hints now read "disable via TOKEN_GOAT_BASH_COMPRESS".
Bash pre-hook fast-path via bash_detect. Adds a <1 ms dict lookup before the ~75 ms bash_compress import; unrecognized commands skip the import entirely.
enqueue_dirty is now append-only with a byte-based cap. Eliminates O(queue-size) rewrite and POSIX rename race on every Edit/Write hook.
Corrupt .draining file is quarantined instead of raising. Prevents worker crash when another process races on the drain rename.
post_bash uses a single session load/save round-trip. Was calling load()/save() up to four times per invocation.
Output size cap applied before payload work in post_bash. Prevents the full pipeline from running on a 4 MB stdout that will be truncated anyway.
Cache eviction throttled to at most once per 60 seconds. Eliminates per-write O(n) directory scan.
pre_read Bash branch uses safe_load(). Corrupt session files no longer crash the Bash pre-hook.
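The PID-recycling guard in the first item of this list can be sketched as a pure decision function. This is only an illustration under stated assumptions: `pid_is_our_indexer` is a hypothetical name, and how the cmdline is actually read (which is platform-specific) is deliberately left out.

```python
from typing import Optional, Sequence


def pid_is_our_indexer(cmdline: Optional[Sequence[str]]) -> bool:
    """Decide whether a live PID still belongs to an indexer (illustrative only).

    cmdline is the process's argument vector, or None when it could not be read.
    """
    if cmdline is None:
        # Unreadable cmdline: fall back to trusting the PID, as the note describes.
        return True
    # A recycled PID running something unrelated will not mention token_goat.
    return any("token_goat" in arg for arg in cmdline)


print(pid_is_our_indexer(["python", "-m", "token_goat.worker"]))
print(pid_is_our_indexer(["/usr/bin/vim", "notes.txt"]))
```

The module path `token_goat.worker` in the usage lines is made up for the demonstration; only the `token_goat` substring check comes from the release note.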

v1.7.0

10 Jun 21:51

@Zelys-DFKH Zelys-DFKH

18165e8

See CHANGELOG for full details.

Fixed

Skill dedup permanently disarmed after first compaction — post_skill early-return path skipped mark_skill_loaded(), freezing skill_ts and causing pre_skill to pass all subsequent loads through undeduped after any compaction event.

Added

token-goat compact-doc — build a deterministic extractive sidecar for any large reference .md file; pre_read serves it in place of the full file (80–95% smaller). Auto-staled on edit.
post_compact_full_loads config knob ([skill_preservation] post_compact_full_loads, default false) — keep skill dedup armed across compaction epochs; set true to restore the pre-1.7 one-full-reload-per-epoch behaviour.
MCP screenshot deny-redirect — pre_screenshot denies chrome-devtools and playwright screenshot calls without a filePath/filename argument, forcing the save-to-disk path so image-shrink applies (~39K tokens raw → ~8K compressed).
Baseline v2 — token-goat baseline now costs the skill listing, shows one row per configured MCP server, and adds a --usage flag that annotates each row with historical call counts.
Session window denial for in-context file reads — pre_read actively denies re-reads of files confirmed in the current context window (config: [hints] deny_reread, default on).

v1.5.2

09 Jun 02:29

@Zelys-DFKH Zelys-DFKH

2fa8218

Three fixes: Codex hook wire-format compatibility, and two Windows coarse-mtime correctness issues in the cache and session layers.

Codex hook responses now pass schema validation

Codex 0.137.0 validates every hook response against embedded JSON schemas with additionalProperties: false, so any unrecognised key causes "hook returned invalid ... JSON output" for the entire response — including SessionStart, PreToolUse, and PostToolUse. The root cause was _tg_elapsed_ms (and sibling _tg_handler/_tg_error fields) added by the internal dispatch() function and then emitted verbatim. The denormalize_response Codex branch now strips all _tg_* keys before output. The same path also injects the required hookEventName const field into hookSpecificOutput — Codex requires it on every hookSpecificOutput shape and token-goat was not emitting it because Claude Code does not require it. A _codex_hook_event_name() helper resolves the correct value (e.g. "pre-read" → "PreToolUse") from the hook registry. The old camelCase→snake_case key conversion (_translate_hso_to_codex) is no longer applied — Codex 0.137.0+ uses camelCase throughout hookSpecificOutput.
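A rough sketch of the Codex-side cleanup described above: drop every `_tg_*` key and add `hookEventName` to `hookSpecificOutput`. The function name `to_codex_response` and the sample payload are assumptions for illustration; the real `denormalize_response` branch and `_codex_hook_event_name()` lookup are not reproduced here.

```python
from typing import Any


def to_codex_response(response: dict[str, Any], hook_event_name: str) -> dict[str, Any]:
    """Strip internal _tg_* keys and add hookEventName (hypothetical helper)."""
    cleaned = {k: v for k, v in response.items() if not k.startswith("_tg_")}
    hso = cleaned.get("hookSpecificOutput")
    if isinstance(hso, dict):
        # Copy so the caller's dict is not mutated.
        cleaned["hookSpecificOutput"] = {**hso, "hookEventName": hook_event_name}
    return cleaned


# Illustrative payload; the field values are made up.
raw = {
    "continue": True,
    "_tg_elapsed_ms": 12,
    "_tg_handler": "pre-read",
    "hookSpecificOutput": {"additionalContext": "use a surgical read"},
}
print(to_codex_response(raw, "PreToolUse"))
```

Filtering by prefix rather than by an explicit key list means a future `_tg_*` diagnostic field would not reintroduce the schema failure.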

Freshest cache entry survives its own store call's eviction

evict_cache_dir sorts eviction candidates oldest-first by float(st_mtime) with a stable sort. When the just-written (MRU) entry shares a coarse st_mtime with older siblings, the stable sort falls back to arbitrary iterdir order, which on NTFS can place the newest file first and evict it — so a store_output call could delete the very entry it had just written. evict_cache_dir now accepts a protect_ids set that is excluded from the victim list regardless of timestamp, and skill_cache.store_output passes the id it just wrote.
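The failure mode and the fix can be reproduced with plain tuples. The sketch below is an assumption-laden model, not the real `evict_cache_dir`: `pick_victims` and its `keep` parameter are hypothetical, and entries are `(entry_id, mtime)` pairs instead of files on disk.

```python
from typing import Iterable


def pick_victims(
    entries: Iterable[tuple[str, float]],
    keep: int,
    protect_ids: frozenset[str] = frozenset(),
) -> list[str]:
    """Return ids to evict, oldest first, never touching protect_ids."""
    entries = list(entries)
    excess = len(entries) - keep
    if excess <= 0:
        return []
    candidates = [e for e in entries if e[0] not in protect_ids]
    candidates.sort(key=lambda e: e[1])  # stable: ties keep input order
    return [entry_id for entry_id, _ in candidates[:excess]]


# All three entries share one coarse mtime, and the directory listing
# happens to return the newest entry first.
listing = [("new", 100.0), ("old_a", 100.0), ("old_b", 100.0)]
print(pick_victims(listing, keep=2))
print(pick_victims(listing, keep=2, protect_ids=frozenset({"new"})))
```

Without protection the tie is broken by listing order and the fresh entry is chosen; with `protect_ids` it cannot be chosen regardless of timestamp.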

save() refreshes the process-local load cache

session.load() caches (object, mtime) per session and serves the cached object whenever cached_mtime == current_mtime. When a later save()'s post-write timestamp aliased the mtime a previous load() had cached, the proc-cache kept serving the stale pre-save object on the next in-process load() even though the on-disk JSON was correct. save() now overwrites an existing proc-cache entry with the object it just persisted on every successful write.

v1.5.1

08 Jun 17:36

@Zelys-DFKH Zelys-DFKH

13afb1f

Correctness fixes for cache size accounting (compressed .gz bodies), surgical reads (oversized-docstring cap, signature-boundary fix), path normalization (uppercase WSL drives), mixed-case skill-compact invalidation, and the Gemini hook bridge (preserve systemMessage, route additionalContext natively), plus two documentation corrections.

See the CHANGELOG for full details.

v1.5.0

08 Jun 04:33

@Zelys-DFKH Zelys-DFKH

2b46668

Context-pressure awareness: one source of truth for how full the window is, and hints that get terser as it fills. Ships alongside three install fixes that restore hook forwarding under editable installs and silence a recurring doctor warning.

Centralized context-pressure model

get_context_pressure(session_id) in compact.py is now the single place that answers how close a session is to autocompaction. It returns a frozen ContextPressure — a fill_fraction paired with a tier of cool, warm, hot, or critical. The estimate sums the known context contributors (loaded skill bodies, the ~10,800-token skills catalog, and per-event costs for bash history, web history, and read files) and divides by the fixed 660,000-token autocompact budget rather than the model's raw window, so the fraction carries the same meaning no matter which model is driving the session. The old _estimate_context_fill helper and the inline calculation in the session hook both defer to it, retiring the copies of the 660 K constant that had spread across half a dozen call sites in favor of one shared CONTEXT_AUTOCOMPACT_TOKENS.

Named tier boundaries

The fraction-to-tier mapping lives in tier_for_fraction(), backed by three named constants: CONTEXT_TIER_WARM (0.50), CONTEXT_TIER_HOT (0.70), and CONTEXT_TIER_CRITICAL (0.85). The bands are cool below 0.50, warm up to 0.70, hot up to 0.85, and critical at or above it. With the magic numbers pulled out of the band checks, the boundaries are defined once and the tests pin them directly.
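A minimal sketch of the fraction-to-tier mapping, using the constant names and values quoted above. Two things are assumptions: the lower bound of each band is treated as inclusive (the notes state this explicitly only for critical), and `pressure_from_tokens` is a hypothetical stand-in for the estimate that `get_context_pressure` builds from session data.

```python
from dataclasses import dataclass

CONTEXT_AUTOCOMPACT_TOKENS = 660_000
CONTEXT_TIER_WARM = 0.50
CONTEXT_TIER_HOT = 0.70
CONTEXT_TIER_CRITICAL = 0.85


def tier_for_fraction(fraction: float) -> str:
    """Map a fill fraction to a tier name; lower bounds assumed inclusive."""
    if fraction >= CONTEXT_TIER_CRITICAL:
        return "critical"
    if fraction >= CONTEXT_TIER_HOT:
        return "hot"
    if fraction >= CONTEXT_TIER_WARM:
        return "warm"
    return "cool"


@dataclass(frozen=True)
class ContextPressure:
    fill_fraction: float
    tier: str


def pressure_from_tokens(estimated_tokens: int) -> ContextPressure:
    """Hypothetical helper: divide by the fixed budget, not the model window."""
    fraction = estimated_tokens / CONTEXT_AUTOCOMPACT_TOKENS
    return ContextPressure(fill_fraction=fraction, tier=tier_for_fraction(fraction))


print(pressure_from_tokens(495_000))
```

Here 495,000 estimated tokens gives a fraction of 0.75, which falls in the hot band.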

Pressure-aware surgical-read hints

The pre-read hook tightens its large-file threshold as the window fills. A file earns a surgical-read suggestion past 500 lines while the session is cool, 350 when warm, 200 when hot, and 50 when critical. It also folds a single per-tier note into the read's additional context: "Context warming" at warm, "Context pressure" at hot, "CONTEXT CRITICAL" at critical. The note is fingerprinted by tier, so it fires once per band rather than on every read. Cool sessions get no note.
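The per-tier thresholds and the once-per-band note could be modelled roughly as follows. The table values and note strings come from the paragraph above; the names `SURGICAL_READ_LINES`, `suggest_surgical_read`, and `tier_note`, and the use of a plain set as the fingerprint store, are assumptions for illustration.

```python
from typing import Optional

SURGICAL_READ_LINES = {"cool": 500, "warm": 350, "hot": 200, "critical": 50}
TIER_NOTES = {
    "warm": "Context warming",
    "hot": "Context pressure",
    "critical": "CONTEXT CRITICAL",
}


def suggest_surgical_read(line_count: int, tier: str) -> bool:
    """True when a file is past the large-file threshold for this tier."""
    return line_count > SURGICAL_READ_LINES[tier]


def tier_note(tier: str, already_shown: set[str]) -> Optional[str]:
    """Return the tier's note the first time that tier is seen, else None."""
    note = TIER_NOTES.get(tier)  # cool has no note
    if note is None or tier in already_shown:
        return None
    already_shown.add(tier)
    return note


shown: set[str] = set()
print(suggest_surgical_read(400, "cool"), suggest_surgical_read(400, "warm"))
print(tier_note("hot", shown), tier_note("hot", shown))
```

A 400-line file is left alone in a cool session but earns a suggestion once the session is warm, and the hot note is returned only on the first read in that band.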

Smaller manifests under pressure

compute_adaptive_budget now weighs context pressure when it sizes the compaction manifest. Once the window runs hot the budget is capped at 500 tokens, and at critical it drops to 300, so the manifest stops adding to the very problem it exists to summarize.

Install robustness

Hooks no longer silently disable themselves under an editable install. The tg-hook wrapper carries an if not exist "<sentinel>" gate that short-circuits to a bare {"continue":true} during the uv tool install --reinstall race, when the venv's token_goat module is briefly absent. The sentinel used to be a hardcoded site-packages/token_goat/__init__.py path, which never exists under an editable install (uv sync, the project .venv), so the gate stayed permanently true and every hook no-op'd — the whole tool went dark with no error. The wrapper now resolves the sentinel through importlib.util.find_spec("token_goat").origin, which points at src/token_goat/__init__.py for editable installs and site-packages/... for regular ones, and falls back to an ungated wrapper when no sentinel resolves. A live handler emits {"continue": true, "_tg_elapsed_ms": N}; the _tg_elapsed_ms field is the tell that forwarding actually ran.
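The sentinel lookup relies on the standard-library call `importlib.util.find_spec`. The sketch below shows the idea against an arbitrary module name; `resolve_sentinel` is a hypothetical helper, and the handling of built-in and frozen modules is an assumption added so the function only ever returns a real file path.

```python
import importlib.util
from typing import Optional


def resolve_sentinel(module_name: str) -> Optional[str]:
    """Return the file a module would be imported from, or None."""
    try:
        spec = importlib.util.find_spec(module_name)
    except (ImportError, ValueError):
        return None
    # Built-in and frozen modules have no file on disk to use as a sentinel.
    if spec is None or spec.origin in (None, "built-in", "frozen"):
        return None
    return spec.origin


# The stdlib json package stands in for token_goat in this demonstration.
print(resolve_sentinel("json"))
print(resolve_sentinel("no_such_module_xyz_123"))
```

Because the path comes from the import machinery, it follows the package wherever it actually lives, which is what lets one code path serve both editable and regular installs. A `None` result corresponds to the "no sentinel resolves" case, where the notes say an ungated wrapper is written.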

Re-install purges orphaned tokenwise entries. After the tokenwise → token-goat rename, a re-install left the old hook and permission lines stranded in settings.json and the Codex config.toml, so both harnesses kept invoking a binary that no longer existed. patch_settings_json and patch_codex_config now strip any pre-rename tokenwise command and permission entry before writing the current ones.

Hook wrapper is written as bytes to stop CRLF doubling. hook_wrapper_content() hand-bakes platform-correct line endings (\r\n on Windows), but its output was then written through atomic_write_text, whose text-mode handle translated every \n to \r\n a second time, doubling each line ending to \r\r\n on disk. cmd.exe tolerated the stray carriage return, so forwarding still worked, but doctor does a byte-exact compare of the on-disk wrapper against the regenerated content and warned "differs from expected — run token-goat install to refresh" on every run, a nag that reinstalling could never clear because it rewrote the same doubled bytes. The wrapper now goes through atomic_write_bytes, preserving the authored endings verbatim.
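The doubling is easy to reproduce on any platform by forcing the newline translation that a Windows text-mode handle applies by default. This sketch uses `io.TextIOWrapper` with `newline="\r\n"` as a stand-in for that behavior; the wrapper content and the helper name `write_text_mode` are made up for the demonstration.

```python
import io


def write_text_mode(content: str, newline: str) -> bytes:
    """Write content through a text-mode handle and return the resulting bytes."""
    raw = io.BytesIO()
    handle = io.TextIOWrapper(raw, encoding="utf-8", newline=newline)
    handle.write(content)
    handle.flush()
    return raw.getvalue()


# Hypothetical wrapper text with hand-authored Windows line endings.
wrapper = "@echo off\r\nexit /b 0\r\n"

doubled = write_text_mode(wrapper, newline="\r\n")  # every \n becomes \r\n again
verbatim = wrapper.encode("utf-8")                  # bytes written as authored

print(doubled)
print(verbatim)
```

The first result contains `\r\r\n` at each line end, while the byte-encoded version matches what was authored, so a byte-exact comparison succeeds only for the second.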

Session-cache integrity

Concurrent session saves no longer drop an edit. The save() fast path skipped its compare-and-swap re-read and merge whenever the on-disk (st_mtime, st_size) fingerprint still matched the one captured at load. That fingerprint aliases: two caches whose keys are the same length serialize to byte-identical JSON sizes, and a float st_mtime rounds two sub-microsecond writes to the same value. When two writers collided on both fields the second skipped the merge and overwrote the first, losing exactly one edit — the 200-edit concurrency stress test intermittently saw 199. The fast path now consults an in-process version registry so a same-process writer that already advanced the version forces the stale save back through the merge, and the fingerprint is taken from integer st_mtime_ns instead of the rounded float, so a cross-process skip now requires a true nanosecond-and-size collision rather than a rounding coincidence.
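The rounding half of the aliasing problem can be shown with arithmetic alone. In this sketch the two timestamps are invented values 10 ns apart, and `float_fingerprint` only approximates how a float `st_mtime` loses nanosecond detail; in real code the integer would come from `os.stat(path).st_mtime_ns`.

```python
def float_fingerprint(mtime_ns: int, size: int) -> tuple[float, int]:
    # Mimics st_mtime: seconds as a float, which cannot hold nanosecond detail.
    return (mtime_ns / 1e9, size)


def ns_fingerprint(mtime_ns: int, size: int) -> tuple[int, int]:
    # Mirrors (st_mtime_ns, st_size): exact integer nanoseconds.
    return (mtime_ns, size)


# Two writes 10 ns apart that serialize to the same number of bytes.
first_write = 1_750_000_000_000_000_010
second_write = first_write + 10
size = 64

print(float_fingerprint(first_write, size) == float_fingerprint(second_write, size))
print(ns_fingerprint(first_write, size) == ns_fingerprint(second_write, size))
```

The float fingerprints compare equal, so a fast path keyed on them would skip the merge, while the integer fingerprints differ and force the second writer back through it. The in-process version registry mentioned above covers the remaining same-process case and is not modelled here.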

v1.3.0

06 Jun 03:44

@Zelys-DFKH Zelys-DFKH

eb6fcd4

[1.3.0] - 2026-06-05

Context growth audit — four changes that cut session context size and make overhead visible.

Context footprint in `doctor`

token-goat doctor --context now prints a Context footprint section measuring every token source that pads the context window each turn: the skills catalog (~10,800 tokens/turn for a typical install), loaded skill bodies accumulated in system-reminder injections, CLAUDE.md + MEMORY.md meta-files, and the rolling conversation estimate. The section shows fill % against the 660,000-token autocompact threshold, an ETA in turns at the current growth rate, and an Actions block naming the exact commands to run when any loaded skill above 2,000 tokens is missing a compact.

Auto-shown when estimated fill exceeds 40 % or any loaded skill > 2 K tokens lacks a compact; always shown with --context.

Compact pre-generation at install time

token-goat install now runs skill-compact --all as a final step, so compacts are ready before the first session — no post-install warm-up turn required. A sentinel file (skill_pregen_sentinel.json) records the catalog count; the doctor section uses it to detect skills added after the last pre-gen pass.

Per-skill compact advisory in `post_skill`

When a skill body lands in context, the post_skill hook now reports the compact's token savings inline. The report covers four cases: pre-generated compacts, compacts generated synchronously for bodies under 40 KB, compacts generated in the background for larger bodies, and an info-only message when no worker is running. The advisory fires only for bodies above 8 KB, so it stays silent for tiny skills.

Threshold-crossing context advisory in `user_prompt_submit`

A lightweight ETA advisory fires the first time estimated context fill crosses 50 % and again at 70 %. The message is appended to the existing status line (bracket-joined, not a separate injection) and references /compact now at 70 %. Resets after each compact. Configurable via hints.context_threshold_advisory = false.
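One way to get "fire once per threshold, reset on compact" behavior is a small stateful tracker. Everything here is a sketch: the class name `ThresholdAdvisory` is hypothetical, and two details are assumptions, namely that reaching a threshold exactly counts as crossing it and that a jump past both thresholds produces a single advisory for the higher one.

```python
from typing import Optional


class ThresholdAdvisory:
    """Track which fill thresholds have already produced an advisory."""

    THRESHOLDS = (0.50, 0.70)

    def __init__(self) -> None:
        self._fired: set[float] = set()

    def check(self, fill_fraction: float) -> Optional[float]:
        """Return the highest newly crossed threshold, or None."""
        newly = [t for t in self.THRESHOLDS if fill_fraction >= t and t not in self._fired]
        if not newly:
            return None
        self._fired.update(newly)
        return max(newly)

    def reset(self) -> None:
        """Call after a compact so the advisories can fire again."""
        self._fired.clear()


advisory = ThresholdAdvisory()
for fill in (0.30, 0.55, 0.60, 0.72, 0.90):
    print(fill, advisory.check(fill))
```

In the loop, only the 0.55 and 0.72 readings return a threshold; the others return `None` because nothing new was crossed.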

v1.2.0

05 Jun 20:46

@Zelys-DFKH Zelys-DFKH

360f113

[1.2.0] - 2026-06-05

14 commits since v1.1.0. Output overflow guard, cross-platform path normalization fixes, and a reliability pass.

Output Overflow Guard

Surgical-read commands (symbol, read, section, bash-output, web-output, and the rest) now cap oversized output before it reaches the model. When estimated tokens exceed the cap, the output is head-truncated on a line boundary. A marker line is appended naming the cap, the truncation ratio, and the narrowing action — symbol users get directed toward file::Class.method lookups, section users toward sub-headings, cached-output users toward --grep/--tail.

Default cap: 25,000 tokens. Configure via [overflow_guard] max_tokens in config.toml, override with TOKEN_GOAT_OVERFLOW_MAX_TOKENS=<n>, or disable with TOKEN_GOAT_OVERFLOW_GUARD=0 / [overflow_guard] enabled = false.

The estimator is deliberately conservative — 3 chars/token, same rate as the compaction manifest — so the cap is never under-applied. ANSI escapes are stripped before estimation since color codes inflate length without adding model-visible tokens. A single-line blob (no internal newlines) is sliced at the char budget so it cannot pass through whole.
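The guard's three behaviors (ANSI-stripped estimation at 3 chars/token, head truncation on a line boundary, and a hard slice for single-line blobs) fit in a short sketch. The function names, the ANSI regex, and the marker wording are assumptions for illustration; only the 3 chars/token rate and the 25,000-token default come from the notes.

```python
import re

CHARS_PER_TOKEN = 3
# Matches CSI escape sequences such as color codes (an assumed pattern).
ANSI_RE = re.compile(r"\x1b\[[0-9;?]*[ -/]*[@-~]")


def estimate_tokens(text: str) -> int:
    """Conservative estimate: ceil(visible chars / 3)."""
    stripped = ANSI_RE.sub("", text)
    return -(-len(stripped) // CHARS_PER_TOKEN)  # ceiling division


def cap_output(text: str, max_tokens: int = 25_000) -> str:
    """Head-truncate text that exceeds the cap and append a marker line."""
    if estimate_tokens(text) <= max_tokens:
        return text
    budget = max_tokens * CHARS_PER_TOKEN
    # Slicing the raw text keeps at most `budget` visible characters.
    head = text[:budget]
    cut = head.rfind("\n")
    if cut > 0:
        head = head[:cut]  # back up to the last complete line
    # Otherwise there is no newline in range: keep the slice at the char budget.
    kept = len(head) / len(text)
    marker = f"[truncated: cap {max_tokens} tokens, kept {kept:.0%} of output; narrow the query]"
    return head + "\n" + marker


long_text = "\n".join(f"line {i}" for i in range(100))
print(cap_output(long_text, max_tokens=10))
```

With a 10-token cap the budget is 30 characters, so the output keeps the first four complete lines and then the marker.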

Cross-Platform Path Normalization

Two fixes that make path-keyed caches work correctly across Windows, WSL, and Linux:

normalize_path / paths.normalize_key — Drive-letter lowercasing (C: → c:) is now unconditional. The previous guard sys.platform == "win32" meant a WSL process that emits a Windows-format path (C:/Users/...) produced a different cache key than a native Windows process reading the same file. Both now produce c:/users/....

hooks_skill.post_skill — Windows-style backslash paths like C:\Users\user\.claude\skills\ralph were not stripped on Linux because the inline guard used _os.sep (/ on Linux) instead of the string literal "\\". The inline block is now a call to _normalize_skill_name, which hardcodes "\\" and handles both separator styles on every platform.
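Both fixes come down to not consulting the host platform. The sketch below is a simplified model under assumptions: `lowercase_drive` and `skill_name_from_path` are hypothetical names, the backslash-to-slash conversion is assumed, and any further case folding the real `normalize_path` performs is out of scope.

```python
import re

_DRIVE_RE = re.compile(r"^([A-Za-z]):")


def lowercase_drive(path: str) -> str:
    """Lowercase a leading drive letter on every platform (no sys.platform check)."""
    path = path.replace("\\", "/")
    return _DRIVE_RE.sub(lambda m: m.group(1).lower() + ":", path)


def skill_name_from_path(value: str) -> str:
    """Return the last path component, splitting on both separators explicitly."""
    # The pattern names "\" and "/" literally instead of relying on os.sep.
    parts = [p for p in re.split(r"[\\/]", value) if p]
    return parts[-1] if parts else value


print(lowercase_drive("C:/Users/me/project/app.py"))
print(skill_name_from_path("C:\\Users\\user\\.claude\\skills\\ralph"))
```

Because neither function branches on the operating system, a WSL process and a native Windows process derive the same key from the same input.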

Reliability

Worker dirty-queue torn writes. Concurrent _append_dirty calls could produce truncated or concatenated JSON lines under write contention. An OS-level file lock (fcntl on POSIX, msvcrt on Windows) now serializes appends, same as the session cache.
SQLite WAL checkpoint mode. Changed from RESTART to PASSIVE on connection open. RESTART waited for all readers to drain, blocking hook subprocesses for hundreds of milliseconds during active indexing. PASSIVE checkpoints cooperatively and does not wait.
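For the checkpoint change, SQLite exposes the mode directly through `PRAGMA wal_checkpoint(PASSIVE)`, which returns a row of three integers (busy flag, WAL frames, frames checkpointed). The sketch below runs it on connection open against a throwaway database; the helper name `open_with_passive_checkpoint` and the file name are assumptions, not token-goat's actual code.

```python
import os
import sqlite3
import tempfile


def open_with_passive_checkpoint(db_path: str) -> tuple[sqlite3.Connection, tuple[int, int, int]]:
    """Open a WAL-mode database and run a non-blocking checkpoint."""
    conn = sqlite3.connect(db_path)
    conn.execute("PRAGMA journal_mode=WAL")
    # PASSIVE checkpoints what it can without waiting on readers or writers.
    busy, log_frames, checkpointed = conn.execute("PRAGMA wal_checkpoint(PASSIVE)").fetchone()
    return conn, (busy, log_frames, checkpointed)


with tempfile.TemporaryDirectory() as tmp:
    conn, result = open_with_passive_checkpoint(os.path.join(tmp, "index.db"))
    conn.execute("CREATE TABLE t (x INTEGER)")
    conn.commit()
    conn.close()

print(result[0])
```

The first field reports whether the checkpoint was blocked; with no other connections open it is 0. The trade-off of PASSIVE is that a checkpoint may leave some frames in the WAL when readers are active, in exchange for never stalling the caller.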

v1.1.0

05 Jun 01:43

@Zelys-DFKH Zelys-DFKH

d224393

57 commits since v1.0.1. Six new language indexers, twenty-plus CLI commands and flags, a pre-skill hook that cuts repeat skill loads from 40–65k tokens to ~400, pnpm/yarn/bun compress filters, rg/grep dedup hints, double-daemon prevention, and a reliability pass with 400+ new tests.

Highlights

Skill re-load prevention. A new PreToolUse(Skill) hook fires before every Skill invocation. When a skill was already loaded in the current session, the reload is blocked and the cached compact (~400 tokens) is served instead. A repeat /ralph or /superman invocation in the same session now costs ~400 tokens, not 40–65k.
New language indexers. CSS/SCSS, SQL, GraphQL, Protobuf, .env, and Makefile. All participate in token-goat symbol, read, outline, scope, and dedup hints.
New CLI flags. symbol --context N, symbol --json, outline --min-lines, outline --max-depth, web-output --list, map --filter, stats --since, token-goat recent, bash history exit codes.
Package manager filters. pnpm, yarn, and bun compress filters. pnpm run/yarn run route through their own filter.
rg/grep dedup. Bash rg/grep invocations now fire dedup hints the same way the native Grep tool does.
Top-5 file guarantee. The five most-accessed files always appear in the compaction manifest.
Double-daemon prevention. JSON PID files, cross-interpreter startup guard, worker --kill-duplicate, worker --status, install --check.

Full changelog: https://github.com/DFKHelper/token-goat/blob/main/CHANGELOG.md

v1.0.1

02 Jun 21:43

@Zelys-DFKH Zelys-DFKH

a79dd48

Bundles two 50-commit improvement runs: a skill-cache / context-savings accuracy loop and a general quality loop.

Highlights:

Skill cache: source_sha stale-compact detection, separate compact/body eviction buckets, sidecar schema v2, lazy skill injection, gzip compression
Stats accounting fixed for bash_output_cached, skill_cached, web_output_cached, and surgical-read lookup savings (were always 0)
Serve-diff-on-reread, session-hint cooldown, unified token formula, stats category grouping
RuffFilter and MypyFilter bash-compress support
Type safety, error handling, performance (hoisted regex), security (0o600 lock files), DRY helpers, debug log coverage
55 new tests

See CHANGELOG for full details.
