Status: Alpha | Python 3.12+ | License: Apache 2.0

Solace

Open-source alert management and incident response platform. Ingest alerts from any monitoring source, deduplicate them, auto-correlate into incidents, and manage the response — all from a single dashboard.

Think PagerDuty / OpsGenie, but open-source and self-hosted.

Features

Authentication & Access Control

JWT-based authentication — Secure login with username/password, 8-hour token expiry
Role-based access control (RBAC) — Three roles: Admin (full access), User (read + acknowledge/resolve), Viewer (read-only)
Default admin account — Auto-seeded on first startup with configurable credentials
First-login password change — Admin account requires password change on first login
API key backward compatibility — Webhook ingestion continues to use X-API-Key header, existing integrations unaffected
User management — Admin panel to create, edit, and deactivate user accounts

On-Call Scheduling

Flexible rotations — Hourly, daily, weekly, or custom rotation intervals
Member management — Add team members to schedules with ordered rotation positions
Timezone-aware handoffs — Configure handoff time and timezone per schedule
Temporary overrides — Swap on-call duty for a time range with reason tracking
"Who's On Call" view — Real-time display of the current on-call person per schedule

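The rotation and override behaviour described above can be sketched in a few lines. This is a minimal illustration, not Solace's scheduler: the function name `current_on_call`, the tuple layout for overrides, and the use of a fixed UTC rotation start are all assumptions made for the example.

```python
from datetime import datetime, timedelta, timezone


def current_on_call(members, rotation_start, interval, now, overrides=()):
    """Return who is on call at `now` (hypothetical helper).

    members        -- names in rotation order (position 0 takes the first shift)
    rotation_start -- timezone-aware datetime of the first handoff
    interval       -- timedelta length of one shift (hourly, daily, weekly, custom)
    overrides      -- iterable of (start, end, member); an active override wins
    """
    for start, end, member in overrides:
        if start <= now < end:
            return member
    # Integer number of completed shifts since the rotation began.
    completed_shifts = (now - rotation_start) // interval
    return members[completed_shifts % len(members)]


start = datetime(2024, 1, 1, 9, 0, tzinfo=timezone.utc)
team = ["ana", "ben", "cho"]
print(current_on_call(team, start, timedelta(weeks=1), start + timedelta(days=8)))
```

A real schedule would also convert the handoff time into the schedule's configured timezone before computing shifts; that step is left out here.
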
Escalation Policies

Multi-level escalation — Define escalation levels with configurable timeouts (1-1440 minutes)
Mixed targets — Each level can notify users directly or the current on-call from a schedule
Repeat support — Policies can repeat through all levels N times before stopping
Service-to-policy mapping — Map services to escalation policies using glob patterns (e.g., billing-*, *)
Priority ordering — When multiple mappings match, the lowest priority number wins
Severity filtering — Optionally restrict mappings to specific severity levels
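
To make the mapping rules concrete, here is a small sketch of glob matching with priority ordering and an optional severity filter. The `Mapping` class and `pick_policy` function are hypothetical names for illustration; Solace's own data model may differ.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass(frozen=True)
class Mapping:
    pattern: str                      # service glob, e.g. "billing-*" or "*"
    policy: str                       # name of the escalation policy
    priority: int                     # lower number wins
    severities: tuple[str, ...] = ()  # empty tuple means "all severities"


def pick_policy(mappings, service, severity):
    """Return the policy of the best matching mapping, or None."""
    candidates = [
        m for m in mappings
        if fnmatchcase(service, m.pattern)
        and (not m.severities or severity in m.severities)
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda m: m.priority).policy


mappings = [
    Mapping("*", "default-policy", priority=100),
    Mapping("billing-*", "billing-policy", priority=10, severities=("critical",)),
]
print(pick_policy(mappings, "billing-api", "critical"))
print(pick_policy(mappings, "billing-api", "low"))
```

A catch-all `*` mapping with a high priority number acts as a fallback, since any more specific mapping with a lower number is preferred when it matches.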

Alert Ingestion & Normalization

6 built-in webhook normalizers — Generic, Prometheus Alertmanager, Grafana, Splunk, Datadog, and Email ingest
Pluggable architecture — Each provider has its own normalizer that maps vendor-specific payloads to Solace's internal format
Auto-severity mapping — Provider-specific priority/severity levels are normalized to Solace's 5-level model (critical, high, warning, low, info)
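
As an illustration of severity normalization, the sketch below maps a provider priority onto the 5-level model. The `DATADOG_PRIORITY_MAP` table and the fallback to `warning` are assumptions made for the example, not the mapping Solace ships with.

```python
SOLACE_SEVERITIES = ("critical", "high", "warning", "low", "info")

# Hypothetical mapping table; the real normalizer may map priorities differently.
DATADOG_PRIORITY_MAP = {
    "P1": "critical",
    "P2": "high",
    "P3": "warning",
    "P4": "low",
    "P5": "info",
}


def normalize_severity(raw, mapping, default="warning"):
    """Map a vendor severity/priority string onto Solace's severity levels."""
    if raw is None:
        return default
    key = str(raw).strip()
    # Try the vendor table first, then accept values already in Solace's model.
    value = mapping.get(key.upper()) or key.lower()
    return value if value in SOLACE_SEVERITIES else default


print(normalize_severity("P1", DATADOG_PRIORITY_MAP))
print(normalize_severity("unknown", DATADOG_PRIORITY_MAP))
```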

Deduplication

Fingerprint-based dedup — SHA256 hash of identity fields (source, name, service, host, labels) ensures identical alerts merge rather than duplicate
Configurable dedup window — Default 5 minutes; identical alerts within the window increment duplicate_count
Occurrence timeline — Every duplicate arrival is tracked with a timestamp for frequency analysis
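
The fingerprint idea can be sketched as follows. The identity fields come from the list above; the canonical JSON encoding is an assumption for this example, and Solace may serialize the fields differently before hashing.

```python
import hashlib
import json


def fingerprint(source, name, service=None, host=None, labels=None):
    """SHA256 over the identity fields of an alert (illustrative encoding)."""
    identity = {
        "source": source,
        "name": name,
        "service": service or "",
        "host": host or "",
        "labels": labels or {},
    }
    # sort_keys makes the hash independent of dict/label ordering.
    canonical = json.dumps(identity, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


a = fingerprint("prometheus", "DiskFull", "postgres", "db-01", {"env": "prod", "az": "a"})
b = fingerprint("prometheus", "DiskFull", "postgres", "db-01", {"az": "a", "env": "prod"})
print(a == b)
```

Fields that change between occurrences, such as the description or timestamps, are deliberately left out so that repeated arrivals of the same alert produce the same hash.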

Incident Correlation

Automatic service-based grouping — Alerts from the same service within a configurable time window (default 10 min) are grouped into a single incident
Severity auto-promotion — Incident severity always reflects the worst alert severity
Auto-resolve — When all alerts in an incident resolve, the incident auto-resolves
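
A rough sketch of service-based grouping with severity promotion is shown below. It assumes the window is measured from the most recent alert in the incident, which the text above does not specify; the dict field names are also hypothetical.

```python
from datetime import datetime, timedelta

# Lower rank means worse severity.
SEVERITY_RANK = {"critical": 0, "high": 1, "warning": 2, "low": 3, "info": 4}


def correlate(alerts, window=timedelta(minutes=10)):
    """Group alerts into incidents by service within a time window."""
    incidents = []
    latest = {}  # service -> most recently opened incident for that service
    for alert in sorted(alerts, key=lambda a: a["received_at"]):
        incident = latest.get(alert["service"])
        if incident is None or alert["received_at"] - incident["last_alert_at"] > window:
            incident = {
                "service": alert["service"],
                "severity": alert["severity"],
                "alerts": [],
                "last_alert_at": alert["received_at"],
            }
            incidents.append(incident)
            latest[alert["service"]] = incident
        incident["alerts"].append(alert)
        incident["last_alert_at"] = alert["received_at"]
        # Promote incident severity to the worst alert severity seen so far.
        if SEVERITY_RANK[alert["severity"]] < SEVERITY_RANK[incident["severity"]]:
            incident["severity"] = alert["severity"]
    return incidents


t0 = datetime(2024, 1, 15, 10, 0)
demo = [
    {"name": "HighCPU", "service": "payment-api", "severity": "high", "received_at": t0},
    {"name": "HighMemory", "service": "payment-api", "severity": "critical",
     "received_at": t0 + timedelta(minutes=2)},
]
print(len(correlate(demo)), correlate(demo)[0]["severity"])
```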

Alert Lifecycle

Full status workflow — Firing → Acknowledged → Resolved, plus Suppressed and Archived states
Acknowledge & resolve — One-click actions from the dashboard or via API
Bulk operations — Select multiple alerts and acknowledge or resolve them in one action
Archive — Archive resolved alerts older than N days to keep the dashboard clean

Incident Management

Incident timeline — Every action (created, alert added, severity changed, acknowledged, resolved) is recorded as a timestamped event
Incident detail view — See all correlated alerts, event audit trail, and incident metadata in one place
Cascade actions — Acknowledging/resolving an incident applies to all its alerts

Notification Channels (5 types)

Slack — Block Kit formatted messages with severity color coding, alert counts, service info, and dashboard links
Microsoft Teams — Adaptive Card messages via incoming webhook or Power Automate workflow URLs
Email — HTML-formatted incident notifications via SMTP with correlated alert tables
Generic Webhook (Outbound) — JSON payload with full incident and alert data, optional shared secret for HMAC verification, custom headers support
PagerDuty — Events API v2 integration; trigger and resolve events carry dedup keys so incidents stay in sync with PagerDuty services
Per-channel filters — Filter notifications by severity and/or service
Rate limiting — Per-channel, per-incident cooldown prevents notification spam
Delivery logs — Every notification attempt is logged with status (pending/sent/failed) and error details
Test button — Send a test notification through any channel from the UI
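
The interaction between per-channel filters and the cooldown can be sketched like this. The `severity` filter key matches the channel examples in this README; the `service` key, the function name, and the use of epoch seconds are assumptions for illustration.

```python
def should_notify(filters, incident, last_sent_at, now, cooldown_seconds=300):
    """Decide whether a channel should send a notification for an incident.

    filters      -- e.g. {"severity": ["critical", "high"], "service": ["payment-api"]}
    last_sent_at -- epoch seconds of the last send for this channel and incident, or None
    now          -- current time in epoch seconds
    """
    severities = filters.get("severity")
    if severities and incident["severity"] not in severities:
        return False
    services = filters.get("service")
    if services and incident["service"] not in services:
        return False
    # Cooldown: suppress repeats for the same channel + incident pair.
    if last_sent_at is not None and now - last_sent_at < cooldown_seconds:
        return False
    return True


incident = {"severity": "critical", "service": "payment-api"}
print(should_notify({"severity": ["critical", "high"]}, incident, None, 1000.0))
print(should_notify({"severity": ["critical", "high"]}, incident, 900.0, 1000.0))
```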

Silence / Maintenance Windows

Time-based suppression — Define start/end times for maintenance windows
Flexible matchers — Match by service (list), severity (list), or label key-value pairs
AND logic — All matchers must match for an alert to be suppressed
CRUD management — Create, edit, and view active/expired windows from the UI
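
The AND logic across matchers can be illustrated with a short sketch. The field names (`starts_at`, `ends_at`, `matchers`) are hypothetical and chosen only for this example.

```python
from datetime import datetime


def is_silenced(alert, window, now):
    """True if `alert` falls inside `window` and satisfies every matcher."""
    if not (window["starts_at"] <= now < window["ends_at"]):
        return False
    matchers = window.get("matchers", {})
    # An empty or missing matcher places no restriction on that field.
    if matchers.get("service") and alert.get("service") not in matchers["service"]:
        return False
    if matchers.get("severity") and alert.get("severity") not in matchers["severity"]:
        return False
    labels = alert.get("labels", {})
    return all(labels.get(k) == v for k, v in matchers.get("labels", {}).items())


window = {
    "starts_at": datetime(2024, 1, 15, 22, 0),
    "ends_at": datetime(2024, 1, 16, 2, 0),
    "matchers": {"service": ["payment-api"], "labels": {"env": "staging"}},
}
alert = {"service": "payment-api", "severity": "high", "labels": {"env": "staging"}}
print(is_silenced(alert, window, datetime(2024, 1, 15, 23, 0)))
```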

Alert Enrichment

Tags — Free-form string tags with add/remove from UI or API; stored as JSONB with GIN index for fast queries
Investigation notes — Timestamped notes with author attribution and full CRUD
External ticket linking — Link alerts to Jira, GitHub, or any URL; auto-prepends https:// if missing
Runbook URL — Editable from the alert detail panel; manually paste a URL or auto-attach via runbook rules
Runbook rules — Pattern-based rules that auto-attach runbook URLs to incoming alerts. Define a service glob pattern (e.g., payment-*), an optional name pattern, and a URL template with variables ({service}, {host}, {name}, {environment}). First matching rule wins (priority-ordered). "Save as Rule" checkbox on the alert panel creates a rule from the current alert in one click.
Raw payload — Full original webhook payload preserved for forensic inspection
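
A minimal sketch of how a runbook rule could be resolved for an incoming alert follows. It assumes lower priority numbers are evaluated first and that the name pattern is also a glob; neither detail is stated above, and the dict keys and `wiki.example.com` URLs are placeholders.

```python
from fnmatch import fnmatchcase

TEMPLATE_FIELDS = ("service", "host", "name", "environment")


def resolve_runbook(rules, alert):
    """Return the runbook URL from the first matching rule, or None."""
    for rule in sorted(rules, key=lambda r: r["priority"]):
        if not fnmatchcase(alert.get("service", ""), rule["service_pattern"]):
            continue
        name_pattern = rule.get("name_pattern")
        if name_pattern and not fnmatchcase(alert.get("name", ""), name_pattern):
            continue
        # Fill {service}, {host}, {name}, {environment}; missing values become "".
        values = {field: alert.get(field, "") for field in TEMPLATE_FIELDS}
        return rule["url_template"].format(**values)
    return None


rules = [
    {"priority": 20, "service_pattern": "*",
     "url_template": "https://wiki.example.com/runbooks/general"},
    {"priority": 10, "service_pattern": "payment-*", "name_pattern": "High*",
     "url_template": "https://wiki.example.com/runbooks/{service}/{name}?host={host}"},
]
print(resolve_runbook(rules, {"service": "payment-api", "name": "HighCPU", "host": "web-01"}))
```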

Alert Auto-Expire

Configurable TTL — Firing alerts auto-resolve after a configurable time-to-live (default 24 hours, 0 to disable)
Admin-only control — Only admins can adjust the TTL at runtime via Settings; set the ALERT_TTL_SECONDS environment variable to persist the value across restarts
Smart exclusions — Acknowledged alerts are excluded from auto-expire (someone is working on it)
Freshness tracking — Duplicate arrivals reset the expiry timer via last_received_at, so actively recurring issues stay open
Auto-expired tag — Expired alerts are tagged auto-expired to distinguish from manual resolution
Cascade — Auto-expired alerts trigger incident resolution and WebSocket events like any other resolve
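
The expiry rules above reduce to a small predicate. This is a sketch under the assumption that an alert is a dict with `status` and `last_received_at` fields; the function name is hypothetical.

```python
from datetime import datetime, timedelta


def should_auto_expire(alert, now, ttl_seconds=86400):
    """True if a firing alert has been quiet for at least the TTL."""
    if ttl_seconds <= 0:
        return False  # 0 disables auto-expire
    if alert["status"] != "firing":
        return False  # acknowledged alerts are excluded; someone is working on them
    # Duplicates refresh last_received_at, so recurring alerts stay open.
    quiet_for = (now - alert["last_received_at"]).total_seconds()
    return quiet_for >= ttl_seconds


now = datetime(2024, 1, 16, 12, 0)
stale = {"status": "firing", "last_received_at": now - timedelta(hours=25)}
print(should_auto_expire(stale, now))
```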

Analytics Dashboard

Alert volume trends — Hourly area chart showing alert ingest rate over time
MTTA/MTTR trends — Daily line chart tracking mean time to acknowledge and resolve
Top noisy services — Bar chart ranking services by alert volume
Severity breakdown — Distribution of alerts across severity levels
Time range selector — Toggle between 7-day, 14-day, and 30-day views
Integrated into Statistics — Expands the existing Statistics view, no separate navigation
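
For readers unfamiliar with the metrics, the sketch below computes MTTA and MTTR as simple means. It assumes both are measured from alert creation, and the field names (`created_at`, `acknowledged_at`, `resolved_at`) are hypothetical.

```python
from datetime import datetime, timedelta
from statistics import mean


def mean_duration_seconds(alerts, end_field, start_field="created_at"):
    """Mean seconds between two timestamps, skipping alerts without the end time."""
    durations = [
        (a[end_field] - a[start_field]).total_seconds()
        for a in alerts
        if a.get(end_field) is not None
    ]
    return mean(durations) if durations else None


t0 = datetime(2024, 1, 15, 10, 0)
alerts = [
    {"created_at": t0, "acknowledged_at": t0 + timedelta(seconds=60),
     "resolved_at": t0 + timedelta(seconds=600)},
    {"created_at": t0, "acknowledged_at": t0 + timedelta(seconds=120),
     "resolved_at": None},
]
print("MTTA:", mean_duration_seconds(alerts, "acknowledged_at"))
print("MTTR:", mean_duration_seconds(alerts, "resolved_at"))
```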

Heartbeat / Dead-Man Monitoring

Dead-man switch — Register expected check-ins; if a service doesn't ping within the interval + grace period, Solace fires an alert
HTTP health checks — Periodically GET a URL; if the response is non-2xx or times out, Solace fires an alert
Automatic recovery — When a failed heartbeat recovers, Solace sends a resolved alert to close the incident
Full pipeline integration — Heartbeat alerts go through the standard ingestion pipeline (dedup, correlation, notifications, escalation)
CRUD management — Create, edit, delete, and monitor heartbeats from the Heartbeats tab
Slug-based ping endpoint — Dead-man pings use POST /api/v1/heartbeats/{slug}/ping with API key auth
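
The dead-man check itself is a deadline comparison, sketched here with a hypothetical helper name and arguments expressed in seconds.

```python
from datetime import datetime, timedelta


def heartbeat_is_overdue(last_ping_at, interval_seconds, grace_seconds, now):
    """True if no ping arrived within interval + grace period."""
    deadline = last_ping_at + timedelta(seconds=interval_seconds + grace_seconds)
    return now > deadline


last_ping = datetime(2024, 1, 15, 10, 0, 0)
# Expected every 60 s with a 30 s grace period: overdue after 10:01:30.
print(heartbeat_is_overdue(last_ping, 60, 30, datetime(2024, 1, 15, 10, 1, 0)))
print(heartbeat_is_overdue(last_ping, 60, 30, datetime(2024, 1, 15, 10, 2, 0)))
```

The grace period absorbs small scheduling delays on the client side, so a cron job that pings a few seconds late does not raise an alert.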

Dashboard & UI

Light and dark themes — Toggle between a high-contrast dark ops-console theme and a clean light theme; preference persisted in localStorage
Real-time updates — WebSocket connection with automatic reconnect and fallback polling
Keyboard shortcuts — j/k navigation, a acknowledge, r resolve, Esc close, ? help
Search & filter — Full-text search across name, service, host, tags with status/severity/service filters
Sortable columns — Sort by time, severity, name, service, duplicate count, or status
Pagination — Configurable page size with server-side pagination
Stats bar — Live counts of alerts by status/severity, incident counts, MTTA, and MTTR

API & Integration

Full REST API — Every feature is accessible via API (alerts, incidents, silences, notifications, on-call, stats, settings)
OpenAPI docs — Auto-generated Swagger UI at /docs
Health checks — Liveness (/health) and readiness (/health/ready) endpoints for Kubernetes probes
WebSocket events — Real-time event stream for alert.created, incident.updated, incident_created, severity_changed, incident_resolved
Dual auth — JWT Bearer tokens for user sessions, X-API-Key header for webhook ingestion and external integrations

Architecture

Prometheus ──┐
Grafana ─────┤   ┌─────────────┐     ┌─────────────┐     ┌────────────┐
Datadog ─────┼─▶ │ Webhook API │ ──▶ │ Normalizer  │ ──▶ │   Dedup    │
Splunk ──────┤   │ (X-API-Key) │     │ (pluggable) │     │   Engine   │
Email ───────┤   └─────────────┘     └─────────────┘     └─────┬──────┘
Custom ──────┘                                                 │
                                                         ┌─────▼──────┐
                                                         │  Silence   │
                                                         │   Check    │
                                                         └─────┬──────┘
                                                               │
                                                         ┌─────▼──────┐     ┌──────────────┐
                                                         │Correlation │ ──▶ │Notifications │
                                                         │   Engine   │     └──────┬───────┘
                                                         └─────┬──────┘            │
                                                               │            ┌──────▼───────┐
                                                               │            │  Escalation  │
                                                               │            │    Engine    │
                                                               │            └──────┬───────┘
                                                               │                   │
                                                  ┌────────────▼───────────────────▼┐
                                                  │       PostgreSQL + Redis        │
                                                  └────────────┬────────────────────┘
                                                               │
                                                  ┌────────────▼────────────┐
                                                  │  React Dashboard (WS)   │
                                                  │     JWT Auth + RBAC     │
                                                  │    (Vite + Tailwind)    │
                                                  └─────────────────────────┘

Quick Start

Docker Compose (recommended)

git clone https://github.com/springdom/solace.git
cd solace
docker compose up --build

Dashboard: http://localhost:3000
API: http://localhost:8000
API Docs: http://localhost:8000/docs

Default login: admin / admin (you'll be prompted to change the password on first login)

Send a test alert

# Generic webhook
curl -X POST http://localhost:8000/api/v1/webhooks/generic \
  -H "Content-Type: application/json" \
  -d '{
    "name": "HighCPU",
    "severity": "critical",
    "service": "payment-api",
    "host": "web-01",
    "description": "CPU usage above 95% for 10 minutes",
    "tags": ["production", "us-east-1"]
  }'

# Prometheus Alertmanager
curl -X POST http://localhost:8000/api/v1/webhooks/prometheus \
  -H "Content-Type: application/json" \
  -d '{
    "version": "4",
    "status": "firing",
    "alerts": [{
      "status": "firing",
      "labels": {
        "alertname": "DiskFull",
        "instance": "db-01:9090",
        "job": "postgres",
        "severity": "critical"
      },
      "annotations": {
        "summary": "Disk 95% full on db-01"
      },
      "startsAt": "2024-01-15T10:00:00.000Z",
      "endsAt": "0001-01-01T00:00:00Z"
    }]
  }'

# Grafana unified alerting
curl -X POST http://localhost:8000/api/v1/webhooks/grafana \
  -H "Content-Type: application/json" \
  -d '{
    "alerts": [{
      "status": "firing",
      "labels": { "alertname": "HighMemory", "grafana_folder": "Infrastructure" },
      "annotations": { "summary": "Memory above 90%", "severity": "high" },
      "startsAt": "2024-01-15T10:00:00.000Z",
      "endsAt": "0001-01-01T00:00:00Z",
      "values": { "B": 92.5 }
    }]
  }'

# Datadog monitor webhook
curl -X POST http://localhost:8000/api/v1/webhooks/datadog \
  -H "Content-Type: application/json" \
  -d '{
    "id": "123456789",
    "title": "CPU is high on web-01",
    "text": "CPU utilization above threshold",
    "alert_status": "triggered",
    "priority": "P1",
    "hostname": "web-01",
    "org": { "name": "MyOrg" },
    "tags": "env:production,service:payment-api"
  }'

# Splunk webhook alert
curl -X POST http://localhost:8000/api/v1/webhooks/splunk \
  -H "Content-Type: application/json" \
  -d '{
    "result": {
      "host": "web-01",
      "severity": "critical",
      "service": "payment-api",
      "message": "CPU usage above 95% for 10 minutes"
    },
    "sid": "scheduler_admin_HighCPU_at_17000000_132",
    "search_name": "High CPU Usage Alert"
  }'

Test incident correlation

Alerts from the same service auto-group into a single incident:

# These two alerts will be correlated into ONE incident
curl -X POST http://localhost:8000/api/v1/webhooks/generic \
  -H "Content-Type: application/json" \
  -d '{"name":"HighCPU","severity":"critical","service":"payment-api","host":"web-01"}'

curl -X POST http://localhost:8000/api/v1/webhooks/generic \
  -H "Content-Type: application/json" \
  -d '{"name":"HighMemory","severity":"high","service":"payment-api","host":"web-02"}'

# This creates a SEPARATE incident (different service)
curl -X POST http://localhost:8000/api/v1/webhooks/generic \
  -H "Content-Type: application/json" \
  -d '{"name":"HighErrorRate","severity":"warning","service":"auth-service"}'

Configure notification channels

# Slack
curl -X POST http://localhost:8000/api/v1/notifications/channels \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Ops Slack",
    "channel_type": "slack",
    "config": { "webhook_url": "https://hooks.slack.com/services/YOUR/HOOK/URL" },
    "filters": { "severity": ["critical", "high"] }
  }'

# Microsoft Teams
curl -X POST http://localhost:8000/api/v1/notifications/channels \
  -H "Content-Type: application/json" \
  -d '{
    "name": "DevOps Teams",
    "channel_type": "teams",
    "config": { "webhook_url": "https://your-org.webhook.office.com/..." },
    "filters": { "severity": ["critical"] }
  }'

# PagerDuty
curl -X POST http://localhost:8000/api/v1/notifications/channels \
  -H "Content-Type: application/json" \
  -d '{
    "name": "PagerDuty On-Call",
    "channel_type": "pagerduty",
    "config": { "routing_key": "YOUR_PAGERDUTY_INTEGRATION_KEY" },
    "filters": { "severity": ["critical"] }
  }'

# Generic outbound webhook
curl -X POST http://localhost:8000/api/v1/notifications/channels \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Automation Webhook",
    "channel_type": "webhook",
    "config": {
      "webhook_url": "https://your-service.com/hooks/solace",
      "secret": "optional-shared-secret",
      "headers": { "X-Custom-Header": "value" }
    }
  }'

API Endpoints

Authentication

Method	Endpoint	Description
`POST`	`/api/v1/auth/login`	Login with username/password, returns JWT
`GET`	`/api/v1/auth/me`	Get current user profile
`POST`	`/api/v1/auth/change-password`	Change password

Users (Admin only)

Method	Endpoint	Description
`GET`	`/api/v1/users`	List users
`POST`	`/api/v1/users`	Create user
`PUT`	`/api/v1/users/{id}`	Update user profile/role
`POST`	`/api/v1/users/{id}/reset-password`	Reset user password
`DELETE`	`/api/v1/users/{id}`	Deactivate user

Health

Method	Endpoint	Description
`GET`	`/health`	Liveness check
`GET`	`/health/ready`	Readiness check (DB + Redis)

Webhooks (Alert Ingestion)

Method	Endpoint	Description
`POST`	`/api/v1/webhooks/generic`	Generic webhook
`POST`	`/api/v1/webhooks/prometheus`	Prometheus Alertmanager
`POST`	`/api/v1/webhooks/grafana`	Grafana unified alerting
`POST`	`/api/v1/webhooks/datadog`	Datadog monitor webhook
`POST`	`/api/v1/webhooks/splunk`	Splunk saved search webhook
`POST`	`/api/v1/webhooks/email_ingest`	Email-based alert ingestion

Alerts

Method	Endpoint	Description
`GET`	`/api/v1/alerts`	List alerts (filterable, sortable, paginated)
`GET`	`/api/v1/alerts/{id}`	Get alert by ID
`POST`	`/api/v1/alerts/{id}/acknowledge`	Acknowledge alert
`POST`	`/api/v1/alerts/{id}/resolve`	Resolve alert
`PUT`	`/api/v1/alerts/{id}/tags`	Replace all tags
`POST`	`/api/v1/alerts/{id}/tags/{tag}`	Add a single tag
`DELETE`	`/api/v1/alerts/{id}/tags/{tag}`	Remove a tag
`GET`	`/api/v1/alerts/{id}/notes`	List investigation notes
`POST`	`/api/v1/alerts/{id}/notes`	Add a note
`PUT`	`/api/v1/alerts/notes/{id}`	Edit a note
`DELETE`	`/api/v1/alerts/notes/{id}`	Delete a note
`GET`	`/api/v1/alerts/{id}/history`	Get occurrence timeline
`PUT`	`/api/v1/alerts/{id}/ticket`	Set external ticket URL
`PUT`	`/api/v1/alerts/{id}/runbook`	Set runbook URL (optionally create rule)
`POST`	`/api/v1/alerts/bulk/acknowledge`	Bulk acknowledge
`POST`	`/api/v1/alerts/bulk/resolve`	Bulk resolve
`POST`	`/api/v1/alerts/archive`	Archive old resolved alerts

Incidents

Method	Endpoint	Description
`GET`	`/api/v1/incidents`	List incidents (filterable, sortable, paginated)
`GET`	`/api/v1/incidents/{id}`	Get incident with alerts + event timeline
`POST`	`/api/v1/incidents/{id}/acknowledge`	Acknowledge incident + all alerts
`POST`	`/api/v1/incidents/{id}/resolve`	Resolve incident + all alerts

Silences (Maintenance Windows)

Method	Endpoint	Description
`GET`	`/api/v1/silences`	List silence windows (filterable by state)
`POST`	`/api/v1/silences`	Create silence window
`GET`	`/api/v1/silences/{id}`	Get silence window
`PUT`	`/api/v1/silences/{id}`	Update silence window
`DELETE`	`/api/v1/silences/{id}`	Delete silence window

Notification Channels

Method	Endpoint	Description
`GET`	`/api/v1/notifications/channels`	List all channels
`POST`	`/api/v1/notifications/channels`	Create channel (slack/teams/email/webhook/pagerduty)
`GET`	`/api/v1/notifications/channels/{id}`	Get channel
`PUT`	`/api/v1/notifications/channels/{id}`	Update channel
`DELETE`	`/api/v1/notifications/channels/{id}`	Delete channel
`POST`	`/api/v1/notifications/channels/{id}/test`	Send test notification
`GET`	`/api/v1/notifications/logs`	List delivery logs

On-Call Schedules

Method	Endpoint	Description
`GET`	`/api/v1/oncall/schedules`	List schedules (paginated, `active_only` filter)
`POST`	`/api/v1/oncall/schedules`	Create schedule (admin)
`GET`	`/api/v1/oncall/schedules/{id}`	Get schedule
`PUT`	`/api/v1/oncall/schedules/{id}`	Update schedule (admin)
`DELETE`	`/api/v1/oncall/schedules/{id}`	Delete schedule (admin)
`GET`	`/api/v1/oncall/schedules/{id}/current`	Get who is currently on call
`POST`	`/api/v1/oncall/schedules/{id}/overrides`	Create temporary override (admin)
`DELETE`	`/api/v1/oncall/overrides/{id}`	Delete override (admin)

Escalation Policies

Method	Endpoint	Description
`GET`	`/api/v1/oncall/policies`	List escalation policies
`POST`	`/api/v1/oncall/policies`	Create policy (admin)
`GET`	`/api/v1/oncall/policies/{id}`	Get policy
`PUT`	`/api/v1/oncall/policies/{id}`	Update policy (admin)
`DELETE`	`/api/v1/oncall/policies/{id}`	Delete policy (admin)

Service Mappings

Method	Endpoint	Description
`GET`	`/api/v1/oncall/mappings`	List service-to-policy mappings
`POST`	`/api/v1/oncall/mappings`	Create mapping (admin)
`DELETE`	`/api/v1/oncall/mappings/{id}`	Delete mapping (admin)

Runbook Rules

Method	Endpoint	Description
`GET`	`/api/v1/runbooks/rules`	List runbook rules
`POST`	`/api/v1/runbooks/rules`	Create rule (admin)
`PUT`	`/api/v1/runbooks/rules/{id}`	Update rule (admin)
`DELETE`	`/api/v1/runbooks/rules/{id}`	Delete rule (admin)

Stats & Settings

Method	Endpoint	Description
`GET`	`/api/v1/stats`	Dashboard statistics (counts, MTTA, MTTR)
`GET`	`/api/v1/stats/trends`	Time-series analytics (alert volume, MTTA/MTTR daily, top services)
`GET`	`/api/v1/settings`	Application configuration (includes alert TTL)
`PUT`	`/api/v1/settings/alert-ttl`	Update alert auto-expire TTL (admin only)

Heartbeats

Method	Endpoint	Description
`GET`	`/api/v1/heartbeats`	List all heartbeats
`POST`	`/api/v1/heartbeats`	Create heartbeat (admin only)
`PUT`	`/api/v1/heartbeats/{id}`	Update heartbeat (admin only)
`DELETE`	`/api/v1/heartbeats/{id}`	Delete heartbeat (admin only)
`POST`	`/api/v1/heartbeats/{slug}/ping`	Record dead-man check-in (API key auth)

WebSocket

Endpoint	Description
`GET /api/v1/ws?token={jwt_or_api_key}`	Real-time event stream

Configuration

All settings are configurable via environment variables:

Variable	Default	Description
`DATABASE_URL`	`postgresql+asyncpg://solace:solace@localhost:5432/solace`	PostgreSQL connection
`REDIS_URL`	`redis://localhost:6379/0`	Redis connection
`API_KEY`	`""`	API key for webhook ingestion (empty = no auth in dev)
`SECRET_KEY`	`change-me-to-a-random-secret-key`	Secret for JWT signing
`ADMIN_USERNAME`	`admin`	Default admin username (created on first startup)
`ADMIN_PASSWORD`	`admin`	Default admin password
`ADMIN_EMAIL`	`admin@solace.local`	Default admin email
`JWT_EXPIRE_MINUTES`	`480`	JWT token expiry (8 hours)
`DEDUP_WINDOW_SECONDS`	`300`	Window for deduplicating identical alerts (5 min)
`CORRELATION_WINDOW_SECONDS`	`600`	Window for correlating alerts into incidents (10 min)
`NOTIFICATION_COOLDOWN_SECONDS`	`300`	Per-channel, per-incident notification cooldown (5 min)
`SOLACE_DASHBOARD_URL`	`http://localhost:3000`	Dashboard URL (used in notification links)
`APP_ENV`	`development`	Environment (`development` / `production`)
`LOG_LEVEL`	`INFO`	Logging level
`ALERT_TTL_SECONDS`	`86400`	Auto-expire firing alerts after N seconds (0 = disabled, default 24h)
`ALERT_EXPIRE_CHECK_INTERVAL_SECONDS`	`60`	How often to check for expired alerts
`HEARTBEAT_CHECK_INTERVAL_SECONDS`	`30`	How often to run heartbeat monitoring checks
`SMTP_HOST`	`""`	SMTP server for email notifications
`SMTP_PORT`	`587`	SMTP port
`SMTP_USER`	`""`	SMTP username
`SMTP_PASSWORD`	`""`	SMTP password
`SMTP_USE_TLS`	`true`	Enable STARTTLS
`SMTP_FROM_ADDRESS`	`solace@localhost`	Sender address for email notifications
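
To show how these variables and defaults fit together, here is a standard-library sketch that reads a handful of them. It is not Solace's actual settings loader; the `Settings` class, `load_settings` function, and the accepted boolean spellings are assumptions for the example.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    database_url: str
    dedup_window_seconds: int
    correlation_window_seconds: int
    alert_ttl_seconds: int
    smtp_use_tls: bool


def _as_bool(value):
    return value.strip().lower() in {"1", "true", "yes", "on"}


def load_settings(env=None):
    """Read a subset of the variables above, falling back to the documented defaults."""
    env = os.environ if env is None else env
    return Settings(
        database_url=env.get(
            "DATABASE_URL", "postgresql+asyncpg://solace:solace@localhost:5432/solace"
        ),
        dedup_window_seconds=int(env.get("DEDUP_WINDOW_SECONDS", "300")),
        correlation_window_seconds=int(env.get("CORRELATION_WINDOW_SECONDS", "600")),
        alert_ttl_seconds=int(env.get("ALERT_TTL_SECONDS", "86400")),
        smtp_use_tls=_as_bool(env.get("SMTP_USE_TLS", "true")),
    )


# Passing a dict instead of os.environ keeps the example deterministic.
print(load_settings({"ALERT_TTL_SECONDS": "0"}))
```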

Tech Stack

Backend: Python 3.12+, FastAPI, async SQLAlchemy (asyncpg), Alembic, PostgreSQL, Redis, python-jose (JWT), passlib (bcrypt)

Frontend: React 18, TypeScript, Vite, Tailwind CSS, Zustand

Deployment: Docker Compose, Kubernetes-ready health probes

Development

Run tests

pip install -e ".[dev]"
pytest tests/ -v

Lint

ruff check backend/

Local development (without Docker)

# Start PostgreSQL and Redis
# Create database: CREATE DATABASE solace;
# Run migrations
alembic upgrade head
# Start API server
uvicorn backend.main:app --reload --port 8000
# Start frontend (separate terminal)
cd frontend && npm install && npm run dev

Roadmap

Completed

Multi-source webhook ingestion (Generic, Prometheus, Grafana, Datadog, Splunk, Email)
Fingerprint-based deduplication with configurable window
Service-based automatic incident correlation
Full alert lifecycle (firing, acknowledged, resolved, suppressed, archived)
Incident management with event audit trail
Notification channels: Slack, Microsoft Teams, Email, Webhook (outbound), PagerDuty
Notification filters, rate limiting, delivery logs, and test button
Silence / maintenance windows with flexible matchers
Alert tagging and investigation notes
External ticket URL linking (Jira, GitHub, etc.)
Runbook URL support with editable UI and auto-attach rules
Bulk acknowledge/resolve operations
Archive old resolved alerts
Dashboard stats (MTTA, MTTR, counts by status/severity)
Real-time WebSocket updates with fallback polling
Keyboard shortcuts for fast navigation
Light and dark theme toggle
JWT authentication with default admin account
Role-based access control (admin, user, viewer)
User management (create, edit, deactivate)
On-call scheduling (hourly/daily/weekly/custom rotations)
Temporary on-call overrides
Escalation policies with multi-level targets
Service-to-policy mapping with glob patterns and priority ordering
Alert auto-expire with configurable TTL (admin-controlled)
Analytics dashboard with time-series trends (alert volume, MTTA/MTTR, top services)
Heartbeat / dead-man monitoring with HTTP health checks

Next Up

Background escalation checker (auto-escalate if not ack'd in N minutes)
SSO integration (Google, GitHub, SAML)
SMS and voice call notifications (Twilio)
Status pages (public incident status)
Plugin system (custom normalizers, notification channels, enrichment hooks)
Alert pattern detection and noise scoring
Post-incident review and retrospectives
Topology-aware correlation (service dependency graph)

License

Apache 2.0
