Verity

Config-driven information scanner with three-layer AI content authenticity detection.

Verity monitors topics you define, scores results for relevance using the LLM of your choice, and — before surfacing anything — runs every item through a three-layer authenticity pipeline to filter out low-credibility sources, AI-generated noise, and press release spam.

Built by Gamut Intelligence.

Why This Exists

Most search-and-score pipelines have the same blind spot: they score relevance but not authenticity. A high-relevance score on a PR Newswire repost or an AI-generated article is noise, not signal. Verity adds a second gate before anything reaches you.

  • Three-layer authenticity scoring — source reputation, content heuristics, and optional LLM-based detection run on every item before it surfaces
  • Scoped by design — searches and scores. No filesystem access, no email sending, no autonomous tool chaining
  • Localhost by default — binds to 127.0.0.1. Never 0.0.0.0
  • Audit everything — every search query, relevance score, and authenticity decision is logged to an append-only audit file
  • No plugin marketplace — your config is your config. No third-party skills, no supply chain risk

Quickstart

# Clone and install
git clone https://github.com/gamutagent/verity.git
cd verity
pip install -r requirements.txt
# Configure
cp config.example.yaml config.yaml
# Edit config.yaml with your topics, keywords, and thresholds
# Set environment variables
export SEARCH_API_KEY="your-serper-or-tavily-key"
export SCORING_API_KEY="your-gemini-or-openai-key"
export SLACK_WEBHOOK_URL="https://hooks.slack.com/services/..."
# Run
python src/scanner.py

How It Works

┌─────────────┐     ┌───────────┐     ┌─────────────────────┐     ┌────────────┐
│ Web Search  │────▶│ LLM Score │────▶│ Authenticity Check  │────▶│   Notify   │
│(per keyword)│     │ (0.0–1.0) │     │ (3-layer pipeline)  │     │ (Slack /   │
└─────────────┘     └───────────┘     └─────────────────────┘     │ Telegram)  │
                                                                  └─────┬──────┘
                                                                        │
                                                            ┌───────────▼──────────┐
                                                            │ Human: 👍 approve /  │
                                                            │        👎 skip       │
                                                            └───────────┬──────────┘
                                                                        │
                                                              ┌─────────▼──────────┐
                                                              │ Approved items     │
                                                              │ accumulate in      │
                                                              │ JSONL / Markdown   │
                                                              └────────────────────┘
  1. Search — runs your keywords against a search API on a cron schedule
  2. Score — each result is scored by an LLM against your topic-specific relevance prompt
  3. Authenticity check — items that pass relevance go through three layers (see below)
  4. Deduplicate — URL hashing prevents resurfacing items you've already seen
  5. Notify — items that pass both gates are pushed to Slack/Telegram with scores and context
  6. Approve — react with 👍 to keep, 👎 to discard — or let high-confidence items auto-approve
  7. Accumulate — approved items build up in a structured file for downstream use
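
The deduplication step (step 4) can be illustrated with a minimal sketch. The normalization rules, the `url_key` helper, and the in-memory `SeenStore` class are assumptions made for illustration; they are not taken from Verity's `store.py`, which persists state to Firestore, SQLite, or JSON.

```python
import hashlib
from urllib.parse import urlsplit, urlunsplit


def url_key(url: str) -> str:
    """Return a stable SHA-256 hex digest for a lightly normalized URL."""
    parts = urlsplit(url.strip())
    # Assumed normalization: lowercase scheme and host, drop the fragment,
    # and strip a trailing slash from the path.
    normalized = urlunsplit((
        parts.scheme.lower(),
        parts.netloc.lower(),
        parts.path.rstrip("/"),
        parts.query,
        "",
    ))
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


class SeenStore:
    """Hypothetical in-memory stand-in for a persistent dedup store."""

    def __init__(self) -> None:
        self._seen: set[str] = set()

    def is_new(self, url: str) -> bool:
        """Record the URL and report whether it was unseen before this call."""
        key = url_key(url)
        if key in self._seen:
            return False
        self._seen.add(key)
        return True
```

Hashing the normalized URL rather than storing it raw keeps keys at a fixed length, which is convenient for document IDs or primary keys in whichever backend holds the state.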

Three-Layer Authenticity Pipeline

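Verity's authenticity engine runs after relevance scoring. The enabled layers are combined into a single composite score (0.0–1.0); Layer 3 contributes only when authenticity.use_llm_layer is true. Items scoring below authenticity.min_score are blocked before they reach you. Auto-approval requires both high relevance and an authenticity score of at least authenticity.auto_approve_min_score.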

| Layer | What it checks | Cost |
|---|---|---|
| Layer 1: Source Reputation | Domain trust tier: authoritative registries and established outlets score high; press wire services and blocklisted domains score low | Zero (YAML lookup) |
| Layer 2: Content Heuristics | 7 deterministic checks: excessive capitalization, promotional language density, missing byline, link-to-text ratio, boilerplate patterns, duplicate-sentence ratio, AI fluency markers | Zero (pure Python) |
| Layer 3: LLM Detection | Optional LLM call asking: "Is this human-reported news or AI-generated/PR content?" Uses your existing scoring API key | 1 API call per item |

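
Two of the Layer 2 checks named above lend themselves to a short sketch. The function names, the word and sentence splitting, and the exact definition of each ratio are assumptions; how `authenticity.py` computes these signals is not documented here.

```python
import re


def capitalization_ratio(text: str) -> float:
    """Fraction of alphabetic words (2+ letters) written entirely in capitals."""
    words = [w for w in re.findall(r"[A-Za-z]+", text) if len(w) >= 2]
    if not words:
        return 0.0
    shouting = sum(1 for w in words if w.isupper())
    return shouting / len(words)


def duplicate_sentence_ratio(text: str) -> float:
    """Fraction of sentences that repeat an earlier sentence (case-insensitive)."""
    # Naive split on ., ! and ? (an assumption, not a full sentence tokenizer).
    sentences = [s.strip().lower() for s in re.split(r"[.!?]+", text) if s.strip()]
    if not sentences:
        return 0.0
    return 1.0 - len(set(sentences)) / len(sentences)
```

Both functions are deterministic and use only the standard library, which matches the "Zero (pure Python)" cost listed for this layer. A real check would still need a threshold or a mapping from ratio to score, which this sketch leaves open.
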
authenticity:
 min_score: 0.4 # block items below this composite score
 auto_approve_min_score: 0.8 # require this for auto-approval (alongside relevance)
 use_llm_layer: false # enable Layer 3 (costs money — disable for high-volume runs)
 source_reputation_path: "config/source_reputation.yaml"

Composite scoring: source × 0.45 + heuristic × 0.55 (without LLM), or source × 0.30 + heuristic × 0.35 + llm × 0.35 (with LLM enabled).
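
The weights stated above, together with the `min_score` and `auto_approve_min_score` thresholds from the config, could be combined roughly as follows. This is a sketch under assumptions: the function names, the `relevance_auto_approve` threshold, the inclusive comparisons, and the three outcome labels are hypothetical and not taken from Verity's source.

```python
from typing import Optional


def composite_score(source: float, heuristic: float, llm: Optional[float] = None) -> float:
    """Weighted authenticity score in [0, 1] using the documented weights."""
    if llm is None:
        # Layer 3 disabled: two-layer weighting.
        return source * 0.45 + heuristic * 0.55
    # Layer 3 enabled: three-layer weighting.
    return source * 0.30 + heuristic * 0.35 + llm * 0.35


def gate(
    relevance: float,
    authenticity: float,
    *,
    min_score: float = 0.4,
    auto_approve_min_score: float = 0.8,
    relevance_auto_approve: float = 0.8,  # assumed; not specified in this README
) -> str:
    """Classify an item as 'blocked', 'auto_approve', or 'needs_review'."""
    if authenticity < min_score:
        return "blocked"
    if authenticity >= auto_approve_min_score and relevance >= relevance_auto_approve:
        return "auto_approve"
    return "needs_review"
```

For example, a source score of 0.9 and a heuristic score of 0.6 give 0.9 × 0.45 + 0.6 × 0.55 = 0.735 without the LLM layer. That clears the 0.4 block threshold but falls short of the 0.8 needed for auto-approval, so the item would go to a human.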

The source reputation database (config/source_reputation.yaml) ships with ~80 pre-classified domains. Add your own.

Configuration

See config.example.yaml for the full reference. Key sections:

| Section | What it controls |
|---|---|
| topics | What to monitor: keywords, relevance prompts, schedules |
| search | Search provider (Serper, Tavily, Brave) and lookback window |
| scoring | LLM provider (Gemini, OpenAI, Anthropic, Ollama) and thresholds |
| authenticity | Three-layer authenticity gate and per-layer config |
| notifications | Where results go (Slack, Telegram, webhook) |
| storage | Where state lives (Firestore, SQLite, local JSON) |
| security | Bind address, rate limits, domain filtering, audit logging |

Model-Agnostic Scoring

Use any LLM for relevance scoring:

scoring:
 provider: "gemini" # or: openai, anthropic, ollama
 model: "gemini-2.5-flash" # cheap and fast for scoring
 temperature: 0.1

For fully local/private operation, use Ollama:

scoring:
 provider: "ollama"
 model: "qwen3:8b"

Gamut Intelligence Integration (Optional)

If you have Gamut API credentials, discovered entities are automatically verified against APAC government registries with confidence scoring:

gamut:
 enabled: true
 api_key_env: "GAMUT_API_KEY"
 auto_verify_entities: true
 attach_confidence_score: true

Deploy

Verity runs anywhere: your laptop, a VPS, or any major cloud provider.

Option 1: Docker Compose (any host)

cp config.example.yaml config.yaml # customize topics and thresholds
cp .env.example .env # fill in API keys
./deploy.sh docker # builds and starts containers

Option 2: Cloud-Native (auto-detect)

./deploy.sh # auto-detect: GCP, AWS, or Azure
./deploy.sh gcp # Cloud Run + Cloud Scheduler + Secret Manager
./deploy.sh aws # ECS Fargate + EventBridge + Secrets Manager
./deploy.sh azure # Container Apps + Timer Trigger + Key Vault

| | GCP | AWS | Azure |
|---|---|---|---|
| Container | Cloud Run | ECS Fargate | Container Apps |
| Scheduler | Cloud Scheduler | EventBridge | Timer Trigger |
| Secrets | Secret Manager | Secrets Manager | Key Vault |
| Storage | Firestore | DynamoDB* | CosmosDB* |

* DynamoDB and CosmosDB storage backends are on the roadmap. Use SQLite (mounted volume) for now.

Option 3: Cron on a VPS

# Run competitors scan daily at 7am
0 7 * * * cd /path/to/verity && python src/scanner.py competitors
# Run tech patterns scan on Fridays
0 7 * * 5 cd /path/to/verity && python src/scanner.py tech_patterns

Security Model

See SECURITY.md for the full security model. Key principles:

  • No ambient authority — Verity can search the web and call an LLM. Nothing else.
  • No secrets in config — all API keys are resolved through a pluggable secrets backend (env vars, .env file, GCP Secret Manager, AWS Secrets Manager, Azure Key Vault)
  • Append-only audit log — every search, relevance score, and authenticity decision is recorded
  • Domain filtering — block or allow-list which domains can be fetched
  • Rate limiting — configurable per-hour caps on search and scoring calls
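
As an illustration of the append-only audit log principle, a minimal sketch might write one JSON object per line to a file opened in append mode. The `append_audit_event` name, the record fields, and the JSONL layout are assumptions; the actual format used by `audit.py` is described in the repository, not here.

```python
import json
import time
from pathlib import Path


def append_audit_event(path: Path, event_type: str, payload: dict) -> None:
    """Append one JSON object per line; existing lines are never rewritten."""
    record = {"ts": time.time(), "type": event_type, **payload}
    # Mode "a" always writes at the end of the file, creating it if needed.
    with path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record, ensure_ascii=False) + "\n")
```

Append mode only guarantees that this code never overwrites earlier entries. Protecting the file against edits by other processes would need operating-system controls such as restrictive permissions, which are outside the scope of this sketch.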

Project Structure

verity/
├── config.example.yaml # Full config reference (copy to config.yaml)
├── config/
│ └── source_reputation.yaml # Domain trust tier database (~80 pre-classified domains)
├── docker-compose.yml
├── Dockerfile
├── requirements.txt
├── src/
│ ├── scanner.py # Main orchestrator (Verity class) + CLI + Cloud Function entry
│ ├── authenticity.py # Three-layer authenticity engine
│ ├── searcher.py # Web search provider abstraction
│ ├── scorer.py # LLM relevance scoring (Gemini/OpenAI/Anthropic/Ollama)
│ ├── notifier.py # Slack, Telegram, webhook delivery
│ ├── store.py # Dedup + approval state + export (Firestore/SQLite/JSON)
│ ├── secrets_resolver.py # Pluggable secrets (env/.env/GCP/AWS/Azure)
│ ├── config_loader.py # YAML loading + validation
│ └── audit.py # Append-only audit logging
├── tests/ # 22 tests covering pipeline logic and authenticity layers
├── deploy.sh # Unified deploy script
├── deploy-gcp.sh
├── deploy-aws.sh
├── deploy-azure.sh
├── SECURITY.md
└── LICENSE # Apache 2.0

Contributing

PRs welcome. Please read SECURITY.md before contributing.

License

Apache 2.0. See LICENSE.


Built by Gamut Intelligence — AI-powered entity verification for PE/VC due diligence.
