Verity

Config-driven information scanner with three-layer AI content authenticity detection.

Verity monitors topics you define, scores results for relevance using the LLM of your choice, and — before surfacing anything — runs every item through a three-layer authenticity pipeline to filter out low-credibility sources, AI-generated noise, and press release spam.

Built by Gamut Intelligence.

Why This Exists

Most search-and-score pipelines have the same blind spot: they score relevance but not authenticity. A high-relevance score on a PR Newswire repost or an AI-generated article is noise, not signal. Verity adds a second gate before anything reaches you.

Three-layer authenticity scoring — source reputation, content heuristics, and optional LLM-based detection run on every item before it surfaces
Scoped by design — searches and scores. No filesystem access, no email sending, no autonomous tool chaining
Localhost by default — binds to 127.0.0.1. Never 0.0.0.0
Audit everything — every search query, relevance score, and authenticity decision is logged to an append-only audit file
No plugin marketplace — your config is your config. No third-party skills, no supply chain risk

Quickstart

# Clone and install
git clone https://github.com/gamutagent/verity.git
cd verity
pip install -r requirements.txt
# Configure
cp config.example.yaml config.yaml
# Edit config.yaml with your topics, keywords, and thresholds
# Set environment variables
export SEARCH_API_KEY="your-serper-or-tavily-key"
export SCORING_API_KEY="your-gemini-or-openai-key"
export SLACK_WEBHOOK_URL="https://hooks.slack.com/services/..."
# Run
python src/scanner.py

How It Works

┌──────────────┐     ┌───────────┐     ┌─────────────────────┐     ┌────────────┐
│  Web Search  │────▶│ LLM Score │────▶│ Authenticity Check  │────▶│   Notify   │
│ (per keyword)│     │ (0.0–1.0) │     │ (3-layer pipeline)  │     │  (Slack /  │
└──────────────┘     └───────────┘     └─────────────────────┘     │  Telegram) │
                                                                   └─────┬──────┘
                                                                         │
                                                             ┌───────────▼──────────┐
                                                             │ Human: 👍 approve /  │
                                                             │        👎 skip       │
                                                             └───────────┬──────────┘
                                                                         │
                                                               ┌─────────▼──────────┐
                                                               │ Approved items     │
                                                               │ accumulate in      │
                                                               │ JSONL / Markdown   │
                                                               └────────────────────┘

Search — runs your keywords against a search API on a cron schedule
Score — each result is scored by an LLM against your topic-specific relevance prompt
Authenticity check — items that pass relevance go through three layers (see below)
Deduplicate — URL hashing prevents resurfacing items you've already seen
Notify — items that pass both gates are pushed to Slack/Telegram with scores and context
Approve — react with 👍 to keep, 👎 to discard — or let high-confidence items auto-approve
Accumulate — approved items build up in a structured file for downstream use
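
The deduplication step can be pictured with a short sketch. This is an illustration only, not the code in src/store.py: the `url_key` helper, the `SeenUrls` class, and the normalization rules (lowercased host, dropped fragment and trailing slash) are assumptions made for the example.

```python
import hashlib
from urllib.parse import urlsplit, urlunsplit


def url_key(url: str) -> str:
    """Return a stable SHA-256 key for a URL (hypothetical normalization)."""
    parts = urlsplit(url.strip())
    # Lowercase scheme and host, drop the fragment and any trailing slash,
    # so trivially different spellings of one URL share a key.
    normalized = urlunsplit((
        parts.scheme.lower(),
        parts.netloc.lower(),
        parts.path.rstrip("/"),
        parts.query,
        "",
    ))
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


class SeenUrls:
    """In-memory stand-in for the dedup state kept by a storage backend."""

    def __init__(self) -> None:
        self._keys: set[str] = set()

    def is_new(self, url: str) -> bool:
        """Record the URL and report whether it was unseen before this call."""
        key = url_key(url)
        if key in self._keys:
            return False
        self._keys.add(key)
        return True
```

A persistent backend (SQLite, Firestore, or a local JSON file) would store the same keys on disk instead of in a set, so items stay suppressed across runs.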

Three-Layer Authenticity Pipeline

Verity's authenticity engine runs after relevance scoring. The enabled layers are combined into one composite score (0.0–1.0); Layer 3 contributes only when use_llm_layer is enabled. Items scoring below authenticity.min_score are blocked before they reach you. Auto-approval requires both high relevance and high authenticity.

Layer	What it checks	Cost
Layer 1: Source Reputation	Domain trust tier — authoritative registries and established outlets score high; press wire services and blocklisted domains score low	Zero (YAML lookup)
Layer 2: Content Heuristics	7 deterministic checks: excessive capitalization, promotional language density, missing byline, link-to-text ratio, boilerplate patterns, duplicate-sentence ratio, AI fluency markers	Zero (pure Python)
Layer 3: LLM Detection	Optional LLM call asking: "Is this human-reported news or AI-generated/PR content?" — uses your existing scoring API key	1 API call per item
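
To make Layer 2 concrete, here is a rough sketch of two of the seven checks (excessive capitalization and duplicate-sentence ratio). The function names, the thresholds (`caps_limit`, `dup_limit`), and the equal-penalty scoring are assumptions for illustration; the real logic lives in src/authenticity.py and may differ.

```python
import re


def caps_ratio(text: str) -> float:
    """Fraction of words (2+ letters) written entirely in capitals."""
    words = re.findall(r"[A-Za-z]{2,}", text)
    if not words:
        return 0.0
    return sum(w.isupper() for w in words) / len(words)


def duplicate_sentence_ratio(text: str) -> float:
    """Fraction of sentences that repeat an earlier sentence verbatim."""
    sentences = [s.strip().lower() for s in re.split(r"[.!?]+", text) if s.strip()]
    if not sentences:
        return 0.0
    return 1.0 - len(set(sentences)) / len(sentences)


def heuristic_score(text: str, caps_limit: float = 0.3, dup_limit: float = 0.2) -> float:
    """Start at 1.0 and subtract an equal share for each failed check."""
    # Both limits are hypothetical values chosen for this example.
    failed = [
        caps_ratio(text) > caps_limit,
        duplicate_sentence_ratio(text) > dup_limit,
    ]
    return 1.0 - sum(failed) / len(failed)
```

Because every check is a pure function of the text, this layer adds no API cost and gives the same result on every run.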

authenticity:
 min_score: 0.4 # block items below this composite score
 auto_approve_min_score: 0.8 # require this for auto-approval (alongside relevance)
 use_llm_layer: false # set to true to enable Layer 3 (one API call per item; keep false for high-volume runs)
 source_reputation_path: "config/source_reputation.yaml"

Composite scoring: source × 0.45 + heuristic × 0.55 (without LLM), or source × 0.30 + heuristic × 0.35 + llm × 0.35 (with LLM enabled).
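
The weighting and the two thresholds can be written out as a small sketch. The weights and the default thresholds come from the text and config above; the function names, the `relevance_ok` flag, and the returned labels are assumptions for this example, since the relevance threshold is configured separately.

```python
from typing import Optional


def composite_score(source: float, heuristic: float, llm: Optional[float] = None) -> float:
    """Weighted authenticity score using the weights quoted above."""
    if llm is None:
        # Layer 3 disabled: two-layer weighting.
        return source * 0.45 + heuristic * 0.55
    # Layer 3 enabled: three-layer weighting.
    return source * 0.30 + heuristic * 0.35 + llm * 0.35


def decide(
    authenticity: float,
    relevance_ok: bool,
    min_score: float = 0.4,
    auto_approve_min_score: float = 0.8,
) -> str:
    """Map a composite score to a (hypothetical) decision label."""
    if authenticity < min_score:
        return "blocked"
    if relevance_ok and authenticity >= auto_approve_min_score:
        return "auto_approve"
    return "needs_review"
```

For instance, a source score of 0.9 with a heuristic score of 0.6 gives 0.9 × 0.45 + 0.6 × 0.55 = 0.735 without the LLM layer: above the 0.4 block threshold but below the 0.8 auto-approval threshold, so the item would go to a human for review.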

The source reputation database (config/source_reputation.yaml) ships with ~80 pre-classified domains. Add your own.

Configuration

See config.example.yaml for the full reference. Key sections:

Section	What it controls
`topics`	What to monitor — keywords, relevance prompts, schedules
`search`	Search provider (Serper, Tavily, Brave) and lookback window
`scoring`	LLM provider (Gemini, OpenAI, Anthropic, Ollama) and thresholds
`authenticity`	Three-layer authenticity gate and per-layer config
`notifications`	Where results go (Slack, Telegram, webhook)
`storage`	Where state lives (Firestore, SQLite, local JSON)
`security`	Bind address, rate limits, domain filtering, audit logging

Model-Agnostic Scoring

Use any LLM for relevance scoring:

scoring:
 provider: "gemini" # or: openai, anthropic, ollama
 model: "gemini-2.5-flash" # cheap and fast for scoring
 temperature: 0.1

For fully local/private operation, use Ollama:

scoring:
 provider: "ollama"
 model: "qwen3:8b"

Gamut Intelligence Integration (Optional)

If you have Gamut API credentials, discovered entities are automatically verified against APAC government registries with confidence scoring:

gamut:
 enabled: true
 api_key_env: "GAMUT_API_KEY"
 auto_verify_entities: true
 attach_confidence_score: true

Deploy

Verity runs anywhere: your laptop, a VPS, or any major cloud provider.

Option 1: Docker Compose (any host)

cp config.example.yaml config.yaml # customize topics and thresholds
cp .env.example .env # fill in API keys
./deploy.sh docker # builds and starts containers

Option 2: Cloud-Native (auto-detect)

./deploy.sh # auto-detect: GCP, AWS, or Azure
./deploy.sh gcp # Cloud Run + Cloud Scheduler + Secret Manager
./deploy.sh aws # ECS Fargate + EventBridge + Secrets Manager
./deploy.sh azure # Container Apps + Timer Trigger + Key Vault

Component	GCP	AWS	Azure
Container	Cloud Run	ECS Fargate	Container Apps
Scheduler	Cloud Scheduler	EventBridge	Timer Trigger
Secrets	Secret Manager	Secrets Manager	Key Vault
Storage	Firestore	DynamoDB*	CosmosDB*

* DynamoDB and CosmosDB storage backends are on the roadmap. Use SQLite (mounted volume) for now.

Option 3: Cron on a VPS

# Run competitors scan daily at 7am
0 7 * * * cd /path/to/verity && python src/scanner.py competitors
# Run tech patterns scan on Fridays
0 7 * * 5 cd /path/to/verity && python src/scanner.py tech_patterns

Security Model

See SECURITY.md for the full security model. Key principles:

No ambient authority — Verity can search the web and call an LLM. Nothing else.
No secrets in config — all API keys are resolved through a pluggable secrets backend (env vars, .env file, GCP Secret Manager, AWS Secrets Manager, Azure Key Vault)
Append-only audit log — every search, relevance score, and authenticity decision is recorded
Domain filtering — block or allow-list which domains can be fetched
Rate limiting — configurable per-hour caps on search and scoring calls
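
As an illustration of the append-only audit principle, the sketch below writes one JSON record per line and only ever opens the file in append mode. The `append_audit` name, the record fields, and the JSONL layout are assumptions for this example; see src/audit.py and SECURITY.md for the actual format.

```python
import json
from datetime import datetime, timezone


def append_audit(path: str, event: str, **fields) -> None:
    """Append one JSON record to the audit file (hypothetical format)."""
    record = {
        "ts": datetime.now(timezone.utc).isoformat(),
        "event": event,
        **fields,
    }
    # Mode "a" positions every write at the end of the file; earlier
    # records are never truncated or rewritten by this function.
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(record, sort_keys=True) + "\n")
```

A call such as `append_audit("audit.jsonl", "authenticity", url="https://example.com/a", score=0.735, decision="needs_review")` would add a single line to the file. Append mode alone does not stop another process from editing the file, so stronger guarantees would need filesystem-level or storage-level controls.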

Project Structure

verity/
├── config.example.yaml # Full config reference (copy to config.yaml)
├── config/
│ └── source_reputation.yaml # Domain trust tier database (~80 pre-classified domains)
├── docker-compose.yml
├── Dockerfile
├── requirements.txt
├── src/
│ ├── scanner.py # Main orchestrator (Verity class) + CLI + Cloud Function entry
│ ├── authenticity.py # Three-layer authenticity engine
│ ├── searcher.py # Web search provider abstraction
│ ├── scorer.py # LLM relevance scoring (Gemini/OpenAI/Anthropic/Ollama)
│ ├── notifier.py # Slack, Telegram, webhook delivery
│ ├── store.py # Dedup + approval state + export (Firestore/SQLite/JSON)
│ ├── secrets_resolver.py # Pluggable secrets (env/.env/GCP/AWS/Azure)
│ ├── config_loader.py # YAML loading + validation
│ └── audit.py # Append-only audit logging
├── tests/ # 22 tests covering pipeline logic and authenticity layers
├── deploy.sh # Unified deploy script
├── deploy-gcp.sh
├── deploy-aws.sh
├── deploy-azure.sh
├── SECURITY.md
└── LICENSE # Apache 2.0

Contributing

PRs welcome. Please read SECURITY.md before contributing.

License

Apache 2.0. See LICENSE.

Built by Gamut Intelligence — AI-powered entity verification for PE/VC due diligence.
