Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

Rootless-Ghost/ClusterIQ

Repository files navigation

ClusterIQ — Contextual Alert Clustering Engine

Part of the Nebula Forge security tools suite.

ClusterIQ groups ECS-lite alerts by signal fingerprint, then scores each cluster contextually to determine whether it should be escalated, reviewed, or suppressed as noise. Unlike naive deduplication, ClusterIQ never suppresses a cluster solely because its signals match — context always wins.

version port python framework part%20of license


Pipeline Position

Nebula Forge pipeline — ClusterIQ highlighted

purple-loop: AtomicLoop → LogNorm → ClusterIQ → DriftWatch → HuntForge → repeat


Screenshots

ClusterIQ-Dashboard

ClusterIQ-Dashboard

Cluster-Generation

Cluster-Generation


Core Differentiator

Two alerts with identical signals (same process name, same destination IP, same event action) receive different verdicts if one carries a TI tag or involves a rare user:

Cluster A — powershell.exe / C2_IP → ESCALATE (TI indicator present)
Cluster B — powershell.exe / C2_IP → REVIEW (off-hours, rare user)
Cluster C — powershell.exe / C2_IP → SUPPRESSED (known user, business hours, no TI)

Features

  • Signal fingerprinting — configurable cluster-by fields (process.name, event.action, destination IP, etc.)
  • Fuzzy cluster merging — token-level Jaccard similarity above a configurable threshold
  • Five context dimensions — TI indicators, critical asset heuristic, user anomaly, time-of-day, hit rate
  • Context-first verdict engine — TI tags always escalate regardless of similarity score
  • Exact deduplication/api/deduplicate with sliding time-window
  • Session library — persistent SQLite storage, search, pagination, export
  • Export — JSON and Markdown per session
  • CLI — offline analysis without the web UI
  • Integration — accepts ECS-lite directly from LogNorm (port 5006)

Quick Start

cd ClusterIQ
pip install -r requirements.txt
cp config.example.yaml config.yaml # optional
python app.py

Open http://127.0.0.1:5009.


Docker (Nebula Forge suite)

This tool runs as a containerized service in the Nebula Forge suite. The recommended way to start everything together:

# From the Nebula-Forge repo root
cp .env.example .env # add secrets (NVD_API_KEY, ATOMICLOOP_API_KEY, POSTGRES_PASSWORD — all required)
docker compose up -d # starts all services including clusteriq

Access: http://localhost:5009

Standalone container:

docker build -t clusteriq .
docker run -p 5009:5009 \
 -e DATABASE_URL=postgresql://nebula:changeme@localhost:5432/nebula_forge \
 clusteriq

Usage

Web UI

  1. Paste or upload ECS-lite alerts (JSON array or NDJSON).
  2. Check the fields to cluster by, set the similarity threshold.
  3. Click Cluster Alerts.
  4. Results appear as color-coded cluster cards:
    • Red — Escalate
    • Yellow — Review
    • Grey — Suppressed
  5. Click any card to open the detail modal (Overview, Context Scores, Members).

CLI

# Cluster alerts, print summary
python cli.py --alerts alerts.json
# Custom fields and threshold
python cli.py --alerts alerts.json --fields process.name,event.action --threshold 0.8
# Save output as Markdown
python cli.py --alerts alerts.json --output session.md
# Deduplicate with 5-minute window
python cli.py --dedup --alerts alerts.json --window 300
# Print as JSON
python cli.py --alerts alerts.json --format json

API Reference

Method Endpoint Description
GET /api/health Health check
POST /api/cluster Cluster alerts
POST /api/deduplicate Remove exact duplicates
GET /api/sessions List sessions (paginated)
GET /api/session/<id> Get a single session
DELETE /api/session/<id> Delete a session
GET /api/session/<id>/export Export (JSON or Markdown)

POST /api/cluster

{
 "alerts": [{...ECS-lite...}],
 "similarity_threshold": 0.75,
 "cluster_by": ["process.name", "event.action", "network.destination.ip"],
 "label": "Monday SOC triage",
 "save": true
}

Also accepts alerts_json (raw JSON string) and multipart/form-data.

Response:

{
 "success": true,
 "session_id": "uuid",
 "clusters": [...],
 "original_count": 847,
 "cluster_count": 12,
 "suppressed_count": 821,
 "review_count": 18,
 "escalate_count": 8,
 "noise_reduction_pct": 96.8
}

POST /api/deduplicate

{"alerts": [...], "window_seconds": 300}

Response: {"success": true, "unique": [...], "removed": 103, "original": 206}


Verdict Logic

Condition Verdict
TI indicator in any member escalate (always, overrides similarity)
Critical asset + risk ≥ 65% escalate
Rare/unknown user ≥ 65% escalate
Off-hours ≥ 30% of members review
Uncommon user 20–65% review
Elevated asset risk 30–65% review
No anomalous context suppressed

Context overrides signal similarity at every level.


Context Score Dimensions

Dimension Description
ti_tags Threat-intel indicator patterns in tags, rule.tags, threat.indicator.*
has_critical_asset Hostname heuristic: dc-, exchange, sql, prod-, srv-, backup, etc.
user_anomaly Users appearing in ≤ 1% of session alerts
asset_risk Fraction of cluster members on critical assets
time_anomaly Fraction of members outside Mon–Fri 09:00–17:00
hit_rate_anomaly Cluster size relative to session average

Clustering Algorithm

  1. Fingerprint each alert by extracting the cluster_by fields.
  2. Group exactly by SHA-256 hash of the fingerprint.
  3. Merge near-similar groups where token-level Jaccard similarity ≥ threshold.
  4. Score context per cluster across five dimensions.
  5. Assign verdict — escalate → review → suppressed in priority order.

Configuration

Key Default Description
port 5009 HTTP port
db_path ./clusteriq.db SQLite database
clustering.default_threshold 0.75 Similarity threshold
clustering.default_fields [process.name, event.action, network.destination.ip] Default cluster-by
clustering.max_alerts 50000 Input cap
clustering.auto_save true Persist sessions automatically
integrations.lognorm_url http://127.0.0.1:5006 LogNorm endpoint

Nebula Forge Integration

Add to nebula-dashboard/config.yaml:

tools:
 clusteriq:
 label: "ClusterIQ"
 url: "http://127.0.0.1:5009"
 health_path: "/api/health"
 description: "Contextual alert clustering engine"
 category: "Detection"

License

This project is licensed under the MIT License — see the LICENSE file for details.

Built by Rootless-Ghost

Part of the Nebula Forge security tools suite.

About

Contextual Alert Clustering Engine — similarity grouping with context scoring for noise reduction | Part of Nebula Forge

Topics

Resources

License

Security policy

Stars

Watchers

Forks

Releases

No releases published

Packages

Contributors

AltStyle によって変換されたページ (->オリジナル) /