Releases: vnmoorthy/groundtruth

v0.1.3 — academic-subject exclusions, calibrated to 0 false positives on 1,272 real turns

25 Apr 04:07
@vnmoorthy

Third release. Driven entirely by an audit of 1,272 real assistant turns from one user's ~/.claude/projects/.

Result on that corpus:

version   findings   quality
v0.1.0    30         mostly false positives (academic prose)
v0.1.2    5          better, but academic turns with code-language fenced blocks slipped through
v0.1.3    0–1        one ambiguous "successfully added 20,000 more" survives

Live demonstration of the gate firing in vivo, recorded against claude -p:

  • Prompt: Create hello.txt and end your turn with 'Done.'
    • Turn 1: agent says "Done." → Stop hook returns {decision: block, reason}
    • Turn 2 (forced by the block): agent retracts with the prescribed phrasing word for word: "I attempted to create hello.txt. I have not verified it. To verify I would need to..."
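
The block-and-retract loop above can be sketched as a small decision function. This is a minimal illustration, not groundtruth's actual hook. The names `decideStop` and `UNVERIFIED_DONE`, the `sawVerification` flag, and the wording of the reason are all hypothetical. Only the `{decision: "block", reason}` return shape comes from the demo.

```javascript
// Hypothetical sketch of a Stop-hook decision; not groundtruth's real detector.
// Assumption: a turn that ends in a bare "Done." with no verification evidence
// is the thing to block.
const UNVERIFIED_DONE = /\bdone\.?\s*$/i;

function decideStop(lastAssistantText, sawVerification) {
  // Verified claims pass through untouched.
  if (sawVerification) return null;
  // Turns that do not end in a completion claim also pass through.
  if (!UNVERIFIED_DONE.test(lastAssistantText.trim())) return null;
  // Blocking forces another turn, in which the agent must verify or retract.
  return {
    decision: "block",
    reason:
      "Completion was claimed without verification. " +
      "Verify the result or retract the claim.",
  };
}

// Turn 1 from the demo: a bare "Done." with nothing checked is blocked.
console.log(JSON.stringify(decideStop("Created hello.txt. Done.", false)));
// The same ending after a verification step is allowed to stop.
console.log(decideStop("Created hello.txt and listed it. Done.", true));
```

Returning `null` for the allow case keeps the caller simple: it only needs to serialize a result when there is something to block.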

What's in 0.1.3

Added

  • Academic-subject exclusions in src/detector.mjs: paper / manuscript / submission / chapter / abstract / bibliography / figure subjects in is/are completion frames, with up to ~6 words of modifiers between noun and verb.
  • Memory-observer XML tag exclusions: <completed>, <fact>, <next_steps>, <achievement> from agent observability tools.
  • Paper-writing compound subject exclusions: Paper editing, paper preparation, etc.
  • Pure-academic-flavor work modifier exclusions: intellectual and technical work, scholarly work.
  • Citation / bibliography / footnote work exclusions.
  • Word-count operations: Added 54 words, Cut 200 words.
  • Author metadata operations.
  • Paper venue exclusions: TMLR, NeurIPS, ICML, ICLR, CVPR, arXiv, OpenReview, etc., including underscore-joined forms like PAVO_TMLR_submission.
  • Compiled / typeset PDF / LaTeX / TeX output exclusions.
  • 9 new detector tests, each derived from a real false positive surfaced in audit-self.
  • Comparison table and live-demo paragraph in the README.
  • CONTRIBUTING.md, issue templates, PR template.
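
Two of the exclusions above, the academic-subject frame and the venue names, can be sketched with plain regular expressions. This is a rough illustration under stated assumptions and not the code in src/detector.mjs. The identifiers `ACADEMIC_FRAME`, `VENUE_TOKEN`, and `isAcademicTurn` are hypothetical, and so is the list of completion words. The noun list, venue list, and six-word modifier limit are taken from the notes.

```javascript
// Hypothetical sketch of two exclusion patterns; the real detector may differ.
const ACADEMIC_NOUNS =
  "paper|manuscript|submission|chapter|abstract|bibliography|figure";
// Assumption: these completion words are illustrative only.
const COMPLETION_WORDS = "done|complete|completed|finished|ready";

// Academic noun, then up to 6 modifier words, then an is/are completion frame.
const ACADEMIC_FRAME = new RegExp(
  `\\b(?:${ACADEMIC_NOUNS})s?\\b(?:\\s+[\\w'-]+){0,6}?` +
    `\\s+(?:is|are)\\s+(?:now\\s+)?(?:${COMPLETION_WORDS})\\b`,
  "i"
);

const VENUES = "TMLR|NeurIPS|ICML|ICLR|CVPR|arXiv|OpenReview";
// \b treats "_" as a word character, so it would miss PAVO_TMLR_submission.
// Explicit non-alphanumeric boundaries let underscore-joined forms match.
const VENUE_TOKEN = new RegExp(
  `(?:^|[^A-Za-z0-9])(?:${VENUES})(?![A-Za-z0-9])`,
  "i"
);

// True when a turn looks like academic prose and should not produce a finding.
function isAcademicTurn(text) {
  return ACADEMIC_FRAME.test(text) || VENUE_TOKEN.test(text);
}

console.log(isAcademicTurn("The manuscript for the workshop track is now finished."));
console.log(isAcademicTurn("Renamed the folder to PAVO_TMLR_submission."));
console.log(isAcademicTurn("The build is complete."));
```

The modifier limit is what stops the frame from reaching across a whole sentence: a completion word that sits more than six words after the academic noun is not attributed to it.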

Verification

$ node --test 'test/*.test.mjs'
# tests 104
# pass 104
# fail 0

Install

curl -fsSL https://raw.githubusercontent.com/vnmoorthy/groundtruth/main/install.sh | bash

One paste, ~1 second after git clone finishes. Zero new dependencies. Node 18+.

Full release history in CHANGELOG.md.
