Releases: timescale/rsigma

v0.17.0

23 Jun 10:23
@mostafa mostafa
5883af7
This commit was created on GitHub.com and signed with GitHub’s verified signature.
GPG key ID: B5690EEEBB952194

TL;DR
RSigma v0.17.0 is the "detection-engineering toolkit" release: the rule-side reporting suite that closes the program loop, plus the daemon output-delivery layer and live daemon introspection.

  • Detection-engineering reports: rule backtest replays an event corpus against a ruleset and diffs per-rule fire counts against declared expectations (#216); rule coverage maps a ruleset onto MITRE ATT&CK, exports a Navigator layer, and reports coverage gaps (#221); rule visibility turns the field-observability signal into a DeTT&CT administration pair and a visibility Navigator layer (#242); rule scorecard fuses backtest precision/recall, coverage, and fire volume into per-rule keep/tune/retire verdicts (#243).
  • Output delivery: detection results now flow through a per-sink async delivery layer with bounded queues, retry/backoff, batching, and an at-least-once ack-join across fan-out (#222); an OTLP output sink exports detections over OTLP/HTTP and OTLP/gRPC (#223); a generic, template-driven webhook sink delivers to Slack, Teams, Discord, PagerDuty, or any HTTP endpoint (#227).
  • Daemon introspection: engine status queries a running daemon from the command line (#237), engine tap records a redactable, replayable live event fixture (#238), and engine tail streams live detections to the terminal (#239).
  • Conversion reach: backend convert resolves targets native-first and delegates anything without a native backend to an installed sigma-cli, reaching the full pySigma backend ecosystem with no new dependency (#241).
  • rstix: Phase 2 adds STIX meta objects (#213) and relationship/sighting objects (#220), thanks to @SecurityEnthusiast; the crate is not releasable on its own yet.
  • Fibratus conversion fixes: emit the required version field so converted rules load (#219), and map file_access/file_event/create_remote_thread to their idiomatic macros (#217), thanks to @rabbitstack.
  • Faster NATS and daemon integration tests: deterministic waits replace fixed sleeps and long-poll timeouts, cutting each suite's runtime by roughly 7x with no production code changes (#240).
  • Security: bump the transitive quinn-proto (via reqwest) to 0.11.15 to clear RUSTSEC-2026-0185, a high-severity remote memory exhaustion advisory.

rule scorecard: fuse the rule-side reports into per-rule keep/tune/retire verdicts (#243)

A new rsigma rule scorecard subcommand fuses the toolkit's existing rule-side outputs into the per-rule keep/tune/retire verdict table a detection program reviews on a cadence. It reads JSON the toolkit already emits, so it adds no new collection or evaluation: it is an offline fusion-and-verdict layer over already-aggregated reports.

  • Inputs and the join. Joins the rule backtest report (precision proxy, recall, the corpus false-positive signal, per-rule fire counts) and the rule coverage report (per-rule ATT&CK mapping and the per-technique rule count for sole-coverage analysis), both required, into a per-rule record keyed by rule_id. Optionally enriches it with a Prometheus production-volume snapshot or /metrics endpoint (--metrics, joined by rule_title with colliding titles summed and flagged), a Prometheus query-API range window (--metrics-window) for last-fired, and a triage disposition feed (--triage) for the live false-positive ratio and MTTD/MTTR. Every cell records which input supplied it, and a missing optional input degrades the verdict rather than blocking it.
  • Verdict model. Bands default to the SOC quality-metrics thresholds and are configurable through flags and the scorecard config section: retire on a precision proxy below the retire floor (0.10) or zero volume across the corpus and the metrics window (a dead rule), tune on the review band or a live false-positive ratio over the ceiling (0.50), keep on a healthy precision proxy (0.80) with enough volume and a recent fire. A retire candidate that is the sole coverage for an ATT&CK technique is downgraded to tune with a coverage-risk note, so the program never silently drops coverage.
  • Output and CI. Renders through the global --output-format layer (table on a TTY, json/ndjson/csv/tsv) plus a --report markdown or HTML program artifact grouped by verdict (extension dispatch, --report-format override). --fail-on <none|tune|retire> turns it into a CI gate. Exit codes follow the house scheme: 0 success or under policy, 1 verdicts hit --fail-on, 2 an input is missing or unfetchable, 3 a bad flag or a malformed/version-mismatched report.
  • Config. A scorecard config section follows the layered-config conventions: the verdict thresholds carry single-source defaults (pinned to the clap flags by a drift-guard test), and every input (including the two required reports, scorecard.backtest/scorecard.coverage) and the report path can be supplied from the config file. Relatedly, rule coverage now also accepts its rule paths from coverage.rules.
  • No new dependencies. The Prometheus exposition-snapshot parser is hand-rolled (the single new untrusted-input surface, fuzzed by fuzz_scorecard_promtext); the query-API path reuses the existing ureq client. The backtest and coverage reports deserialize through structs shared with their producers, so the consumer and producers cannot drift.
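The verdict bands above can be sketched as a small decision function. This is a minimal illustration, not rsigma's implementation: the `RuleRecord` type, its field names, and the reason strings are hypothetical, and the "enough volume and a recent fire" condition for keep is omitted because the notes do not give its thresholds. Only the three default values (0.10, 0.50, 0.80) and the sole-coverage downgrade come from the text.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Default bands quoted in the release notes; the constant names are illustrative.
RETIRE_FLOOR = 0.10   # precision proxy below this is a retire candidate
FP_CEILING = 0.50     # live false-positive ratio above this means tune
KEEP_FLOOR = 0.80     # precision proxy at or above this is healthy


@dataclass
class RuleRecord:
    """Hypothetical per-rule record after joining the input reports."""
    rule_id: str
    precision_proxy: Optional[float]        # from the backtest report
    volume: int                             # fires across corpus and metrics window
    live_fp_ratio: Optional[float] = None   # from the optional triage feed
    sole_coverage: bool = False             # only rule covering some technique


def verdict(rule: RuleRecord) -> Tuple[str, str]:
    """Return (verdict, reason) for one rule."""
    dead = rule.volume == 0
    low_precision = (
        rule.precision_proxy is not None and rule.precision_proxy < RETIRE_FLOOR
    )
    if dead or low_precision:
        reason = "dead rule (zero volume)" if dead else "precision proxy below retire floor"
        if rule.sole_coverage:
            # Never silently drop coverage: downgrade retire to tune.
            return "tune", reason + "; coverage risk: sole coverage for a technique"
        return "retire", reason
    if rule.live_fp_ratio is not None and rule.live_fp_ratio > FP_CEILING:
        return "tune", "live false-positive ratio over ceiling"
    if rule.precision_proxy is not None and rule.precision_proxy >= KEEP_FLOOR:
        return "keep", "healthy precision proxy"
    return "tune", "review band"
```

In this sketch a missing optional input (for example no triage feed) simply skips the check that depends on it, which mirrors the "degrades rather than blocks" behavior described above.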

rule visibility: DeTT&CT export and a visibility Navigator layer (#242)

A new rsigma rule visibility subcommand turns the shipped field-observability signal into the two artifacts blue teams consume for data-source maturity: a DeTT&CT administration pair and a visibility ATT&CK Navigator layer. Where rule coverage reports the detection axis ("which techniques your rules detect"), rule visibility reports the data axis ("which fields and logsources you actually see"), and the two Navigator layers stack to expose data-without-detection and detection-without-data cells.

  • Inputs and the join. Joins the rule logsource inventory and rule field set (from --rules) with the observed field signal (--observed <file|->: the engine eval --observe-fields JSON, a saved GET /api/v1/fields snapshot, or stdin; or --addr for a live daemon) through a bundled, overridable mapping table (--mapping[=<path|url>]). With no observed signal the command reports the rule-expected baseline.
  • Mapping table. A curated logsource/field -> ATT&CK data source/data component/technique table ships in-repo so the default invocation needs no network; --mapping reads a local JSON table or fetches a URL through the same 7-day cache the lint schema download uses. Rule logsources the table does not recognize are surfaced as a hygiene list.
  • Scoring. Visibility rides DeTT&CT's 0-to-4 scale, derived from the fraction of a data source's mapped rule fields that were observed. A data source whose mapped fields are all unobserved is a blind spot; an observed source no rule consumes is untapped. Scores are conservative seeds marked for analyst review, with data_quality dimensions carrying the seed value rather than fabricated precision.
  • Outputs. Writes a DeTT&CT data-source administration YAML (--dettect-data-sources), a technique-administration YAML (--dettect-techniques, visibility axis only), and a format 4.5 visibility Navigator layer (--navigator, scored 0-4). The report renders through the global --output-format layer (table/json/ndjson/csv/tsv).
  • CI signal. --fail-on-blind-spots exits 1 when a rule-expected data source has no observed telemetry. A visibility config section (mapping, fail_on_blind_spots) follows the layered-config conventions.
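The blind-spot and untapped classifications can be illustrated with a short sketch. The function name, the return shape, and above all the linear bucketing of the observed fraction onto the 0-to-4 scale are assumptions made for illustration; the notes say only that the score is derived from the fraction of mapped rule fields that were observed.

```python
from typing import Dict, Set


def classify(expected_fields: Set[str], observed_fields: Set[str]) -> Dict:
    """Classify one data source from its rule-expected and observed fields."""
    if not expected_fields:
        # Telemetry is present but no rule consumes it.
        status = "untapped" if observed_fields else "unknown"
        return {"status": status, "score": None}

    seen = expected_fields & observed_fields
    if not seen:
        # Every mapped rule field is unobserved.
        return {"status": "blind_spot", "score": 0, "fraction": 0.0}

    fraction = len(seen) / len(expected_fields)
    # ASSUMPTION: linear bucketing onto 1..4; the real derivation may differ.
    score = max(1, round(fraction * 4))
    return {"status": "visible", "score": score, "fraction": fraction}


def has_blind_spot(results) -> bool:
    """True when any data source is a blind spot (the --fail-on-blind-spots case)."""
    return any(r["status"] == "blind_spot" for r in results)
```

A caller emulating the CI signal would exit 1 when `has_blind_spot` returns true.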

Reuse pySigma backends through sigma-cli delegation (#241)

rsigma backend convert now resolves targets native-first: it uses a native rsigma backend when one exists and otherwise delegates the conversion to an external sigma-cli when one is installed, so the full pySigma backend ecosystem (splunk, elasticsearch, kusto, qradar, loki, crowdstrike, and 30+ more) is reachable from the same command. It is a light subprocess wrapper with no new dependencies; no Python runtime is required unless a delegated target is actually used.

  • Native-first dispatch. postgres/postgresql/pg, lynxdb, and fibratus keep converting natively and always win; any other target is delegated. A future native backend transparently supersedes its delegated path.
  • Discovery. sigma-cli is found via the RSIGMA_SIGMA_CLI path override or a bare sigma on PATH. When a target has no native backend and sigma-cli is absent, the command exits 3 with install guidance (pipx install sigma-cli, sigma plugin install <target>).
  • Flag mapping. -t, -f, -p, --without-pipeline, -s, and -O key=value pass through to sigma convert verbatim; -O correlation_method=<m> maps to sigma-cli's -c/--correlation-method. The original rule files are handed to sigma-cli, which parses, pipelines, and converts them.
  • Output. sigma-cli stdout is relayed through the normal output handling (stdout, -o <file>, and the --output-format json envelope). A non-zero sigma-cli exit maps to 2 with its stderr relayed; a missing binary or a directory --output in delegated mode maps to 3.
  • Listing. backend targets appends the installed sigma-cli targets, and backend formats <target> shows a delegated target's formats.
  • Scope. CLI backend convert only; the MCP convert tool and the rsigma_convert library API convert with native backends. rsigma builtin pipeline names (ecs_windows, sysmon) are not translated; pass sigma-cli pipeline names or YAML paths in delegated mode.
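The native-first resolution and discovery rules above can be sketched as follows. The function names and route labels are hypothetical; the native target list, the `RSIGMA_SIGMA_CLI` override, the bare `sigma` lookup on PATH, and exit code 3 for a missing sigma-cli are taken from the notes.

```python
import os
import shutil
from typing import Mapping, Optional, Tuple

# Targets the notes list as converting natively.
NATIVE_TARGETS = {"postgres", "postgresql", "pg", "lynxdb", "fibratus"}


def find_sigma_cli(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Locate sigma-cli: explicit path override first, then `sigma` on PATH."""
    override = env.get("RSIGMA_SIGMA_CLI")
    if override:
        return override
    return shutil.which("sigma", path=env.get("PATH"))


def resolve(target: str, sigma_cli: Optional[str]) -> Tuple[str, int]:
    """Return (route, exit_code) for a conversion target."""
    if target in NATIVE_TARGETS:
        return "native", 0             # a native backend always wins
    if sigma_cli is None:
        return "missing-sigma-cli", 3  # caller prints install guidance
    return "delegated", 0


if __name__ == "__main__":
    print(resolve("splunk", find_sigma_cli()))
```

Because the native check runs first, adding a target to the native set automatically supersedes its delegated path, as the notes describe.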

Faster NATS and daemon integration tests (#240)

The nats_integration, cli_daemon_nats, and cli_daemon_dynamic suites spent most of their wall time waiting on fixed sleeps and long-poll timeouts rather than doing ...


Contributors

SecurityEnthusiast and rabbitstack

v0.16.0

15 Jun 08:27
@mostafa mostafa
fa246de

TL;DR
RSigma v0.16.0 is the "MCP server" release:

  • MCP server: a new Model Context Protocol integration that exposes the Sigma toolchain to AI agents.
    • rsigma-mcp crate and rsigma mcp serve (opt-in mcp feature): typed tools (parse, lint, validate, evaluate, convert, fix) plus field/backend/pipeline introspection and reference resources, with enrichment-aware evaluation (#208).
    • Remote transport and config: Streamable HTTP (rsigma mcp serve --http), constant-time bearer-token auth, in-process TLS, and a new mcp config section wired through rsigma config and the environment layer (#209).
    • Smoke harness: scripts/mcp-smoke.py drives a built server end to end over stdio and HTTP across every tool and resource as a standard-library CI job (#210).
    • Prerequisite refactor: the auto-fix applier, modifier/MITRE reference data, and the 75-rule lint catalogue move into rsigma-parser so the CLI, the LSP, and the MCP server share one implementation, behavior unchanged (#207).
  • backend convert per-rule file output: point --output at a directory to write one file per converted rule, named from the rule title with the backend's native extension (#205).
  • Configurable correlation state caps: --max-state-entries exposes the global entry cap and a new --max-group-entries bounds a single group's window state, with matching config keys and a per-rule attribute (#200).
  • Fibratus conversion fixes: corrected process_creation/process_termination/create_remote_thread field mappings and registry_event scoping against the Fibratus 3.0.0 vocabulary, thanks to @rabbitstack (#202).
  • Correlation window-mode benchmarks: a throughput suite plus a non-Criterion peak-memory stress target for the sliding/tumbling/session modes shipped in v0.15.0 (#199).
  • rstix: data-model skeleton and common property containers for the STIX 2.1 library, with leaf-type serde, thanks to @SecurityEnthusiast (not yet releasable on its own) (#201).
  • Dependency and security bumps: rolls up six Dependabot PRs and patches three RustSec PostgreSQL advisories (#206).

Developer tooling: MCP smoke harness (#210)

scripts/mcp-smoke.py drives a built rsigma mcp serve binary end to end over stdio and Streamable HTTP (with bearer auth), exercising all 11 tools and 3 resources as a post-build sanity check, and runs as the MCP Smoke CI job. Standard-library only.

MCP server: Streamable HTTP transport, bearer auth, and mcp config keys (#209)

Adds a remote transport and configuration to the MCP server.

  • Streamable HTTP transport. rsigma mcp serve --http <addr> serves the MCP endpoint at /mcp over HTTP (stdio stays the default). Built on rmcp's StreamableHttpService mounted on axum.
  • Bearer-token auth. --auth-token <token> (or RSIGMA_MCP_AUTH_TOKEN) requires a static token on every request, compared in constant time; requests without it get 401. The token is flag/env-only and never read from config files.
  • TLS. --tls-cert/--tls-key terminate TLS in-process using the daemon's rustls loader (requires the daemon-tls feature). Plaintext binds on non-loopback addresses are refused unless --allow-plaintext.
  • Config keys. A new mcp config section (mcp.http_addr, mcp.lint_config, mcp.rules_dir) is wired through rsigma config init/validate/show/schema and the RSIGMA_MCP__* environment layer. The auth token stays flag/env-only by design.
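As an illustration of the constant-time comparison mentioned above, here is a minimal Python sketch. rsigma itself is a Rust project, so this is not its code; the `check_bearer` helper and the assumption that the token arrives in a standard `Authorization: Bearer <token>` header are illustrative.

```python
import hmac
from typing import Optional


def check_bearer(authorization_header: Optional[str], expected_token: str) -> int:
    """Return the HTTP status a request would receive: 200 or 401."""
    prefix = "Bearer "
    if not authorization_header or not authorization_header.startswith(prefix):
        return 401
    presented = authorization_header[len(prefix):]
    # compare_digest takes time independent of where the inputs first differ,
    # so response timing does not leak how much of the token was correct.
    ok = hmac.compare_digest(presented.encode(), expected_token.encode())
    return 200 if ok else 401
```

A plain `==` comparison can return early at the first mismatching byte, which is the timing side channel a constant-time comparison avoids.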

MCP server: rsigma mcp serve and the rsigma-mcp crate (#208)

A new Model Context Protocol server exposes the rsigma Sigma toolchain to AI agents (Cursor, Claude Code, ...) as structured tools. Instead of scraping CLI text, an agent calls typed tools and gets back JSON: ASTs, lint findings with spans and fix availability, evaluation matches, backend queries, and field inventories.

  • rsigma mcp serve. A new command group (Commands::Mcp) running the server over stdio, gated behind a new opt-in mcp Cargo feature (build with --features mcp; the prebuilt binaries and Docker image include it). Flags: --lint-config (applied by the lint tool) and --rules-dir (a default root for relative path-based tool calls).
  • rsigma-mcp crate. A new library crate built on rmcp 1.7 with the RsigmaMcp handler and serve_stdio. Ten core tools: parse_rule, parse_condition, lint_rules, validate_rules, evaluate_events, convert_rules, list_backends, list_fields, resolve_pipeline, and list_builtin_pipelines. Every tool accepts inline content (yaml/condition/events) xor a file path; stdout is reserved for the transport and diagnostics go to stderr.
  • fix_rules tool. Applies safe auto-fixes to Sigma YAML (lowercase keys, status/level typos, duplicate removal, ...) preserving comments and formatting, and returns the fixed YAML plus applied/failed/skipped-unsafe counts. Unsafe fixes are never auto-applied. write: true (only valid with a file path) persists the change to disk; an optional lint_rules filter restricts which lint rules are fixed.
  • MCP resources. rsigma://lint/catalogue (the 75-rule catalogue as JSON), rsigma://reference/modifiers, and rsigma://reference/mitre-tactics let agents ground themselves on the exact lint vocabulary and modifier semantics without spending tool calls.
  • Enrichment-aware evaluate_events. An optional enrichers (inline YAML/JSON) or enrichers_path parameter builds an enrichment pipeline and enriches results before returning; loader validation errors (including template-namespace checks) come back as structured errors, so the tool doubles as an enricher-config validator.
  • rsigma_runtime::enrichment::config. The enrichers YAML loader (load_enrichers_file, build_enrichers, build_enrichers_full, EnrichersFile) moves from the CLI daemon into rsigma-runtime so the daemon and the MCP server share one loader. The daemon is rewired to the moved loader with behavior and error text unchanged.
  • Docs. A new MCP server guide, the mcp serve CLI page, an rsigma-mcp library page, and the mcp feature entry in the feature-flags reference.

MCP server prerequisites: shared fix applier, reference data, and lint catalogue (#207)

Internal refactors that lift three pieces of lint and reference machinery into rsigma-parser so the CLI, the LSP, and the upcoming MCP server share one implementation. Behavior is unchanged for existing commands.

  • rsigma_parser::lint::fix. The string-level auto-fix applier (json_pointer_to_route, apply_single_fix_patch, apply_rename_key) moves from rsigma-cli into the parser, with a new apply_fixes_to_source(source, &[&LintWarning]) -> SourceFixOutcome entry point that applies every safe fix to a YAML string and reports applied/failed counts. The yamlpath/yamlpatch dependencies move with it. rsigma rule lint --fix keeps its file-on-disk behavior through a thin wrapper.
  • rsigma_parser::reference. The MODIFIERS and MITRE_TACTICS tables move out of the LSP binary (where they were unreachable cross-crate) into a public parser module; the LSP re-exports them so hover/completion are unchanged.
  • rsigma_parser::lint::catalogue. A new catalogue() returns per-rule metadata (id, default severity, fix disposition, one-line description) for all 75 lint rules, generated from a single list whose exhaustive match makes adding a rule without a catalogue entry a compile error.

backend convert: per-rule file output when --output is a directory (#205)

rsigma backend convert can now write one file per converted rule instead of a single concatenated stream. When --output points at a directory (an existing directory, or a path with a trailing separator that is created on demand), each rule is written to its own file named after a snake_case slug of the rule title, with the backend's native extension. This was prompted by Fibratus rule-deployment ergonomics: the engine loads one YAML rule per file from its Rules/ directory, so the split output drops straight in without hand-separating the stream joined by --- separators.

  • Naming. File stems are a slug of the rule title (Detect Whoami becomes detect_whoami), falling back to the rule id and then a rule literal when the title slugifies to nothing. Colliding names get a numeric suffix (same.yml, same_2.yml) so two rules never overwrite each other. A rule that converts to several documents (for example a temporal correlation expanded with -O temporal_permute=true) keeps them together in its one file, finalized through the backend so the format-aware separators land inside.
  • Extensions. A new Backend::output_file_extension hook picks the per-rule extension: yml for the Fibratus YAML envelope (txt for its bare-expression expr format), sql for PostgreSQL, and txt by default. Single-file and stdout output are unchanged.
  • Docs. The Fibratus backend reference, the rule-conversion guide, and the README document the directory-output workflow (rsigma backend convert rules/ -t fibratus -p fibratus_windows -o ./Rules/).
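The naming rules above can be sketched in a few lines. The helper names are hypothetical, the exact slug algorithm is an assumption (anything outside a-z and 0-9 collapses to a single underscore), and the rule id is used verbatim as the fallback stem, which may differ from the real behavior.

```python
import re
from typing import Dict, Iterable, List, Optional, Tuple


def slugify(title: str) -> str:
    """Lowercase and collapse every non-alphanumeric run into one underscore."""
    return re.sub(r"[^a-z0-9]+", "_", title.lower()).strip("_")


def file_names(
    rules: Iterable[Tuple[str, Optional[str]]], extension: str = "yml"
) -> List[str]:
    """Map (title, rule_id) pairs to unique per-rule file names."""
    used: Dict[str, int] = {}
    names: List[str] = []
    for title, rule_id in rules:
        # Fall back to the rule id, then to the literal "rule".
        stem = slugify(title) or rule_id or "rule"
        count = used.get(stem, 0) + 1
        used[stem] = count
        # The first use keeps the bare stem; later collisions get _2, _3, ...
        suffix = "" if count == 1 else f"_{count}"
        names.append(f"{stem}{suffix}.{extension}")
    return names
```

For example, `file_names([("Detect Whoami", None)])` yields `detect_whoami.yml`, matching the example in the notes.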

Fibratus conversion: corrected field mappings and registry event scoping (#202)

Three correctness fixes to the fibratus_windows pipeline shipped in #191, found while converting more of the upstream Fibratus rules library.

  • Process field coverage. process_creation and process_termination gain the field mappings they were missing against the Fibratus 3.0.0 vocabulary: OriginalFileName -> ps.pe.file.name, CurrentDirectory -> ps.cwd, ProcessGuid -> ps.uuid, ParentProcessGuid -> ps.parent.uuid, IntegrityLevel -> ps.token.integrity_level, Company -> ps.pe.company, Description -> ps.pe.description, Product -> ps.pe.product, and FileVersion -> `process.pe.fi...

Contributors

SecurityEnthusiast and rabbitstack

v0.15.0

11 Jun 08:48
@mostafa mostafa
9c6742d

TL;DR
RSigma v0.15.0 is the "new conversion target and Sigma extensions" release:

  • Fibratus conversion backend: convert Sigma rules into Fibratus rule YAML for the first endpoint-sensor target, with a fibratus_windows field-mapping pipeline, idiomatic macro recognition, ATT&CK label flattening, and sequence-DSL correlation lowering (#191).
  • Array matching: [any]/[all]/[all_or_empty]/[none] object-scope blocks, implicit any-member matching, and positional indexing (args[0], negative indices), evaluated in the engine and lowered to PostgreSQL JSONB (#159).
  • Declarable correlation window modes: sliding/tumbling/session windows plus a session gap, end to end across the parser, runtime evaluator, and PostgreSQL conversion, with pySigma-style correlation_method selection at convert time (#192).
  • sigma-version: an optional top-level spec-major attribute that gates breaking spec changes by the declared version (array matching now activates only at major 3), plus cross-document reference lints (#188).
  • rstix: a new STIX 2.1 + TAXII 2.1 library crate; Phase 1 lands the core foundation (validated typed IDs, timestamps, deterministic SCO IDs, controlled vocabularies) (#185), thanks to @SecurityEnthusiast.
  • Gated match-detail enrichment: a new MatchDetailLevel (off/summary/full) that explains why each field matched, off by default so the default wire shape is byte-for-byte unchanged (#186).
  • RFC 5424 syslog now strips a leading UTF-8 BOM by default, fixing corrupted _raw fields, broken anchored matchers, and BOM-blocked embedded-JSON detection (#187).
  • Daemon shutdown fix: SIGINT/SIGTERM handlers are now installed before the API listener is announced, closing a startup race that could hard-kill the process instead of draining cleanly.

Fixed

  • Daemon startup signal race. The daemon now installs its SIGINT/SIGTERM handlers eagerly, before the API listener is announced and reachable, and reuses those same streams for the serve task's graceful shutdown. Previously the handlers were installed lazily on the serve task's first poll, so a signal arriving in the window between the socket becoming connectable (the kernel completes handshakes from the listen backlog) and that first poll hit the default disposition and killed the process instead of draining cleanly.

Fibratus conversion backend (#191)

Convert Sigma rules into rule YAML for Fibratus, an Apache-2.0 kernel-event detection and EDR engine. Fibratus is the first conversion target aimed at an endpoint sensor rather than a centralized log store; rules emitted by rsigma backend convert -t fibratus drop into a Fibratus installation's Rules/ directory and load with the same parser as the upstream rules library.

Output formats. Four format names cover two output shapes. default (alias yaml, rule) emits a complete YAML rule document per Sigma rule (name, id, description, labels, condition, min-engine-version, optional action) with --- separators between multi-rule output so the whole stream is a valid YAML document set. expr strips the envelope and emits the bare filter expression only, for piping into ad-hoc Fibratus commands.

Modifier coverage. Sigma's case-insensitive default flips to Fibratus's case-insensitive operators (icontains/istartswith/iendswith); the |cased modifier or -O case_sensitive=true flips to the bare forms. Plain literal equality (no wildcards) uses the dedicated string-equality operators ~= (case-insensitive default) and = (|cased) rather than a wildcard match, which evaluates more efficiently and reads the way the upstream rules library writes literal equality; the evt.name event discriminator always uses the exact =. Wildcard-bearing values lower to imatches/matches. Multi-value OR lists collapse into a single Fibratus list-operator clause (field iin ('a', 'b'), field imatches ('a*', 'b?'), field icontains ('a', 'b'), ...); a |all list stays AND-joined because a list right-hand side is OR-only. Regex (|re) lowers to the regex(field, 'pat1', 'pat2', ...) = true filter function, with multi-value lists collapsing into a single call and negation expressed as a leading not; patterns that use lookarounds or backreferences are rejected with a structured UnsupportedModifier rather than emitting something Fibratus's RE2 engine would reject at load time. CIDR (|cidr) lowers to cidr_contains(field, '...'), with multi-value lists collapsing into a single variadic call. Numeric comparisons map to </<=/>/>=. exists lowers to field != false / field = false and a null value to field = '' (Fibratus has no null token). Field references are native (field1 = field2). Keywords return UnsupportedKeyword because Sigma keywords have no bound field and Fibratus operators require one.

Field naming. A new fibratus_windows builtin pipeline (registered alongside ecs_windows and sysmon) maps Sigma's PascalCase Windows fields to the lowercase-dotted Fibratus vocabulary and adds the right evt.name discriminator per logsource category (process_creation -> CreateProcess, network_connection -> Connect, dns_query -> QueryDns, registry_set -> RegSetValue, ...). Most categories map Image -> ps.exe, CommandLine -> ps.cmdline, TargetFilename -> file.path, TargetObject -> registry.path, DestinationIp -> net.dip, ImageLoaded -> module.path, QueryName -> dns.name. Field names target the Fibratus 3.0.0 registry: DNS fields live under dns.*, loaded executables/DLLs under module.* (the legacy image.* namespace is deprecated), and Sigma fields with no 3.0.0 equivalent (SignatureStatus/Hashes/Imphash under image_load/driver_load, DestinationHostname/Initiated under network_connection) are intentionally unmapped so a dependent rule fails conversion instead of emitting a field the loader rejects. The evt.name discriminator is injected as the first condition (the new add_condition prepend: true option), so the emitted rule leads with the cheapest, most selective predicate and Fibratus short-circuits before the rule body. On a Fibratus 3.0.0 process_creation (CreateProcess) event ps.* is the created (child) process, so Image/CommandLine/ProcessId/User -> ps.exe/ps.cmdline/ps.pid/ps.username and the spawning process is ParentImage/ParentCommandLine/ParentProcessId -> ps.parent.exe/ps.parent.cmdline/ps.parent.pid (Fibratus 3.0.0 decommissioned the legacy ps.sibling.* namespace and unified process attributes under ps.*). For process_access (OpenProcess) the caller is ps.* and the opened process is exposed as event arguments, so TargetImage/TargetProcessId -> evt.arg[exe]/evt.arg[pid] (matching the upstream LSASS-access rule) and GrantedAccess -> ps.access.mask.names. 
file_event (file creation) excludes the OPEN disposition (the create_file macro semantics) so it does not fire on plain file access, and registry_set/registry_event map Details -> registry.data. The pipe_created logsource is intentionally not mapped because Fibratus has no named-pipe visibility without a kernel driver. Use it whenever you convert SigmaHQ Windows rules: rsigma backend convert rules/windows/ -t fibratus -p fibratus_windows.

ATT&CK tags in tags: flatten into Fibratus's labels: block via a static MITRE lookup: attack.<tactic_short_name> -> tactic.id/tactic.name/tactic.ref, attack.t<NNNN> -> technique.id/technique.ref, and attack.t<NNNN>.<sub> -> subtechnique.id/subtechnique.ref (the base technique and sub-technique live in separate label namespaces matching the upstream Fibratus rules library convention). Unknown tags pass through as tag.<original>: <original>.
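The tag-to-label flattening can be sketched as below. This is an illustration under stated assumptions: the `flatten` function is hypothetical, the two-entry tactic table stands in for the full static lookup, and the reference URL shapes are assumed rather than taken from the notes. With several technique tags on one rule, this simplified version would overwrite earlier labels.

```python
import re
from typing import Dict, Iterable

# Tiny stand-in for the static MITRE lookup (real ATT&CK tactic IDs).
TACTICS = {
    "execution": ("TA0002", "Execution"),
    "persistence": ("TA0003", "Persistence"),
}
BASE = "https://attack.mitre.org"  # ASSUMPTION: shape of the emitted refs


def flatten(tags: Iterable[str]) -> Dict[str, str]:
    """Flatten Sigma tags into a Fibratus-style labels mapping."""
    labels: Dict[str, str] = {}
    for tag in tags:
        body = tag[len("attack."):] if tag.startswith("attack.") else ""
        tech = re.fullmatch(r"t(\d{4})(?:\.(\d{3}))?", body)
        if body in TACTICS:
            tactic_id, name = TACTICS[body]
            labels["tactic.id"] = tactic_id
            labels["tactic.name"] = name
            labels["tactic.ref"] = f"{BASE}/tactics/{tactic_id}/"
        elif tech and tech.group(2):
            # Sub-techniques live in their own label namespace.
            base_id, sub = tech.group(1), tech.group(2)
            labels["subtechnique.id"] = f"T{base_id}.{sub}"
            labels["subtechnique.ref"] = f"{BASE}/techniques/T{base_id}/{sub}/"
        elif tech:
            labels["technique.id"] = f"T{tech.group(1)}"
            labels["technique.ref"] = f"{BASE}/techniques/T{tech.group(1)}/"
        else:
            # Unknown tags pass through as tag.<original>: <original>.
            labels[f"tag.{tag}"] = tag
    return labels
```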

Correlation. Sigma correlation rules lower to Fibratus's inline sequence ... maxspan ... by <fields> | stage | | stage | DSL (the form Fibratus 1.10 introduced when it decommissioned policy: sequence). The group-by fields, shared across every referenced rule, are emitted once as a sequence-level by field1, field2, ... clause (the upstream rules-library style) instead of repeated per stage, so multi-field group-by needs no inline bindings. temporal_ordered and temporal (ordered fallback) emit one |...| stage per referenced rule; small-threshold event_count and value_count expand into N repeated or N distinct stages capped at -O max_repeated_slots (default 5), with value_count distinctness expressed via positional pattern bindings (field != $1.field and field != $2.field and ...). The four math-aggregate types (value_sum, value_avg, value_percentile, value_median), thresholds above the cap, range/equality predicates, and multi-rule event_count/value_count all return UnsupportedCorrelation with structured rationales the operator can act on; the coverage matrix in the new Fibratus backend reference is the source of truth.
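To make the value_count expansion concrete, the sketch below generates the distinctness predicate each of the N stages would carry. The `distinct_predicates` helper is hypothetical and produces only the binding clauses, not complete Fibratus stages; the positional `$1`, `$2` form and the default cap of 5 follow the description above.

```python
from typing import List


def distinct_predicates(field: str, n: int, max_repeated_slots: int = 5) -> List[str]:
    """Return, per stage, the clause forcing `field` to differ from all prior stages."""
    if n > max_repeated_slots:
        # Thresholds above the cap are reported as unsupported in the notes.
        raise ValueError("threshold above max_repeated_slots")
    stages = []
    for stage in range(1, n + 1):
        # Stage k must differ from stages 1..k-1; the first stage has no constraint.
        clauses = [f"{field} != ${prior}.{field}" for prior in range(1, stage)]
        stages.append(" and ".join(clauses))
    return stages
```

For a threshold of three distinct values, the third stage carries two inequality clauses, one per earlier stage, so the number of clauses grows quadratically with N. That growth is one plausible reason for keeping the cap small.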

Backend options. -O action=kill,isolate appends an action: block to every rule envelope. -O min_engine=3.0.0 sets min-engine-version:. -O emit_metadata=false drops the description: and labels: blocks for a minimal envelope. -O max_repeated_slots=N raises the correlation cap. -O case_sensitive=true forces the bare operators globally. -O temporal_permute=true expands a temporal (any-order) correlation into one ordered sequence document per permutation of the referenced rules (capped at N <= 3, so 1/2/6 documents per correlation; each permutation gets distinct title and id suffixes), so any matching order alerts; larger correlations return UnsupportedCorrelation. -O use_macros (default true) walks top-level and clauses and replaces recognized runs with idiomatic Fibratus macro calls (spawn_process, create_thread,...


Contributors

SecurityEnthusiast

v0.14.0

05 Jun 09:56
@mostafa mostafa
b5334eb

TL;DR
RSigma v0.14.0 is the "layered config, structured output, and correctness/hardening" release:

  • Layered YAML configuration with explicit precedence (flag > env > project > user > system > default) plus a new rsigma config group (init, validate, show, schema, path, reload).
  • Structured output everywhere: a global --output-format <json|ndjson|table|csv|tsv> selector with a TTY-aware default, plus global --color, --quiet, and --no-stats.
  • Custom linter tag namespaces via a repeatable --tag-namespace flag and a tag_namespaces config key, so organisation-specific tags no longer force disabling unknown_tag_namespace wholesale, thanks to @fwosar.
  • Sigma correctness: multi-field value_count composite keys, compile-time rejection of multi-field numeric aggregations, empty value_median returns None, cross-crate detection-name selector consistency, and convert-side rejection of modifiers it cannot express.
  • Runtime hardening: a category-based HTTP egress policy (SSRF/cloud-metadata defense applied at DNS resolution), a 10 MiB enricher response cap, hot-reload that preserves engine tuning, and fail-closed dynamic-source resolution.
  • Evaluator and parser robustness: compile-time rejection of conflicting detection-modifier combinations, allocation-free JsonEvent dot-path traversal, and CLI diagnostics that stop silently swallowing invalid status / level / related: metadata.
  • Detached dynamic sources: pipeline-embedded sources: now warns louder on stderr and through the daemon hot-reload path.
  • Release pipeline, CI, Docker, and supply-chain hardening before publish, two batched Dependabot rollups, and a docs-accuracy sweep across the site.
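The precedence order for layered configuration (flag > env > project > user > system > default) amounts to a first-match lookup across layers. The sketch below is illustrative only: the `resolve` function, the dictionary layout, and the example key are hypothetical and do not reflect rsigma's internal representation.

```python
from typing import Any, Dict, Tuple

# Highest precedence first, as listed in the release notes.
PRECEDENCE = ["flag", "env", "project", "user", "system", "default"]


def resolve(key: str, layers: Dict[str, Dict[str, Any]]) -> Tuple[Any, str]:
    """Return (value, winning_layer) for a config key."""
    for name in PRECEDENCE:
        layer = layers.get(name, {})
        if key in layer:
            return layer[key], name
    raise KeyError(key)


if __name__ == "__main__":
    layers = {
        "default": {"output_format": "table"},   # hypothetical key
        "user": {"output_format": "json"},
        "env": {"output_format": "ndjson"},
    }
    print(resolve("output_format", layers))
```

Returning the winning layer alongside the value is a design choice that makes it easy to show where each setting came from.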

Documentation accuracy: TLS, feature flags, metric and lint counts, CLI surface, endpoint inventory, benchmark freshness (#181)

A docs-only sweep that closes the accuracy gaps that accumulated over the v0.13.x line. No source code changes; every fix points the documentation at the actual behaviour that ships in the binary.

  • Daemon TLS is no longer described as roadmap. docs/reference/http-api.md and docs/reference/architecture.md previously told operators that in-process TLS termination was planned and linked to issue #128. The daemon-tls Cargo feature, the --tls-cert / --tls-key / --tls-client-ca / --tls-min-version flag set, and the SIGHUP cert hot-reload all shipped in the v0.14.0 release window; both pages now point at the existing security.md#tls-termination-for-the-api-listener write-up instead.
  • Feature flag catalogue matches the manifest again. docs/reference/feature-flags.md opened by claiming a workspace of seven crates (it has been six since the binary / rsigma-cli split). The daemon-tls row listed rustls-pemfile as a pulled-in dependency; the actual manifest pulls rustls, tokio-rustls, rustls-pki-types, x509-parser, hyper, hyper-util, and tower-service. The "per-feature CI matrix" section described a per-feature opt-in matrix that does not exist in .github/workflows/ci.yml today (CI runs --all-features plus the three-OS test matrix). All three drifts are corrected, and the production-recommended cargo install recipe now includes daemon-tls.
  • Metric counts agree across the three pages that publish them. docs/reference/metrics.md headlined "30 metric names across four concerns" while its own section headings summed to 37 rows; the actual registry in crates/rsigma-cli/src/daemon/metrics.rs exposes 38 metric names under --all-features (33 always-present plus 3 OTLP and 2 TLS gated on the matching build features), grouped into seven concerns. Engine core is 17 metrics, not 16. docs/guide/streaming-detection.md and docs/guide/observability.md propagated the stale "27" number; both are now aligned, and observability gains the previously missing enrichment (6) and TLS (2) rows.
  • Lint rule counts are honest. docs/reference/lint-rules.md claimed 66 built-in checks; one of them (empty_filter_rules) is enum-only and not emitted in production. Page now reads "65 built-in checks plus 1 reserved enum value". The "Filter rules (7)" heading was actually a table of 8 rows including the reserved variant -- relabelled "Filter rules (8 IDs, 7 emitted)". The "Detection-modifier hygiene (5)" heading listed 7 rows that are not duplicates of the detection section above -- relabelled "Detection-modifier hygiene (7)" with the misleading "subset of the detection rules above" wording removed.
  • CLI global flags are fully documented. docs/cli/index.md listed only --log-format and asserted "every subcommand accepts one global flag", missing the other four globals (--output-format, --color, --quiet, --no-stats) that have shipped alongside it. The overview now describes all five with their defaults, accepted values, effect, and the layered flag > env > config > default precedence model. The command tree gains the previously omitted rule migrate-sources entry, and docs/cli/rule/lint.md drops the stale command-local --color flag (color is global now) and documents the four machine renderers (json, ndjson, csv, tsv) the lint command honours when --output-format is set explicitly.
  • Command-group overviews list every group. docs/getting-started/concepts.md claimed "the five command groups" but the table only listed four (engine, rule, backend, pipeline); the missing config row is now added with its six subcommands (init, validate, show, schema, path, reload). The rule row picks up migrate-sources. docs/reference/output.md drops rule validate from the table output consumers (the command always prints its bespoke per-file summary regardless of --output-format) and spells that out so operators are not surprised when the selector does nothing on that command.
  • POST /api/v1/sources/resolve/{source_id} is in the HTTP API inventory. The daemon registers both the body variant (/api/v1/sources/resolve with a JSON body that names one source) and the path-parameter variant (/api/v1/sources/resolve/{source_id} with no body). Only the body variant was documented; the path variant now appears in both the summary table and a short body section with the success response (200 {"status":"resolve_triggered","source_id":"..."}) and the two failure responses (404 when no dynamic sources are configured, 429 when a refresh for the same source is still in flight).
  • Benchmark figures are labelled as captured on v0.9.0. BENCHMARKS.md (and the docs-site mirror docs/benchmarks.md that includes it) carried Date: 2026-05-07 / Version: 0.9.0 headers; the workspace has since shipped through v0.13.0 and parts of the hot path have moved. The headers are relabelled "Date captured" / "Captured on version", and a new one-paragraph freshness admonition asks anyone refreshing the numbers to update the metadata block in the same commit.
  • Site-level loose ends. The llmstxt plugin block in mkdocs.yml now lists rule/migrate-sources, every cli/config/* page, reference/output.md, reference/configuration.md, and guide/enrichers.md -- five public pages that an LLM consuming the generated llms.txt had no way to surface before. docs/developers/testing.md had a stale CLI E2E table ("12 files / 167 tests") that missed seven files added since (cli_config.rs, cli_daemon_enrichment.rs, cli_daemon_fields_observer.rs, cli_daemon_tls.rs, cli_migrate_sources.rs, cli_output_format.rs, cli_sources_deprecation.rs); the page now lists 19 files with their per-file test counts and asks readers to verify the exact total against their tree rather than copy a stale number forward.

Eval and convert internals: modifier validation, dot-path perf, golden routing (#180)

Three independent quality fixes for the evaluator and converter that all surface bugs the previous code silently swallowed or paid an avoidable allocation for.

Conflicting modifier combinations are now rejected at compile time. compile_detection_item previously turned the parsed modifier list into a flat boolean context and dispatched through compile_value in a fixed precedence order. Whichever flag the dispatch checked first won, so a rule declared as Field|cidr|contains silently produced a CIDR match with contains dropped, Field|re|contains produced a regex match with contains dropped, Field|gt|contains ran the numeric comparison and dropped contains, Field|exists|contains collapsed to an existence check that dropped both the substring matcher and the value, Field|wide|utf16 silently picked whichever UTF-16 dialect the dispatch implemented first, and Field|i with no |re silently became a no-op. The rules still compiled, still matched something, but the semantics were never what the author wrote. A new validate_modifiers pass runs before compile_value and rejects five categories of contradiction: more than one operator per item (the operator set spans contains / startswith / endswith / re / cidr / exists / fieldref / gt / gte / lt / lte and every timestamp part); more than one UTF-16 encoding from wide / utf16 / utf16be; base64 together with base64offset; any value transformation (base64 / base64offset / wide / utf16 / utf16be / windash / expand) on a field that also carries a non-string operator that does not consume the transformed value; and the regex flag modifiers (|i / |m / |s) without |re. Legal combinations stay legal: |re|i|m|s, |base64|wide, |contains|cased, |contains|all with multiple values, |contains|neq, |re|neq, and a single timestamp part all continue to compile. Errors flow through the existing EvalError::InvalidModifiers variant with a message that lists every offending modifier so the rule author can pick which one to drop. The full SigmaHQ corpus (rules/ plus rules-compliance/ plus rules-emerging-threats/ plus `rules-p...
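
A simplified sketch of this kind of pre-compile check follows. It is not the actual validate_modifiers implementation: the function and constant names are hypothetical, timestamp-part operators are left out, and the "value transformation combined with a non-string operator" category is not modelled.

```rust
/// Operators that are mutually exclusive on one detection item
/// (timestamp parts omitted in this sketch).
const OPERATORS: &[&str] = &[
    "contains", "startswith", "endswith", "re", "cidr", "exists",
    "fieldref", "gt", "gte", "lt", "lte",
];
const UTF16_ENCODINGS: &[&str] = &["wide", "utf16", "utf16be"];
const REGEX_FLAGS: &[&str] = &["i", "m", "s"];

/// Collect the modifiers in `mods` that belong to `set`.
fn present<'a>(mods: &[&'a str], set: &[&str]) -> Vec<&'a str> {
    mods.iter().copied().filter(|m| set.contains(m)).collect()
}

/// Return one message per contradiction; an empty vector means "accepted".
fn validate(mods: &[&str]) -> Vec<String> {
    let mut errors = Vec::new();

    let ops = present(mods, OPERATORS);
    if ops.len() > 1 {
        errors.push(format!("more than one operator: {}", ops.join(", ")));
    }

    let encodings = present(mods, UTF16_ENCODINGS);
    if encodings.len() > 1 {
        errors.push(format!("more than one UTF-16 encoding: {}", encodings.join(", ")));
    }

    if mods.contains(&"base64") && mods.contains(&"base64offset") {
        errors.push("base64 together with base64offset".to_string());
    }

    let flags = present(mods, REGEX_FLAGS);
    if !flags.is_empty() && !mods.contains(&"re") {
        errors.push(format!("regex flags without re: {}", flags.join(", ")));
    }

    errors
}

fn main() {
    for mods in [vec!["cidr", "contains"], vec!["re", "i", "m", "s"], vec!["i"]] {
        println!("{:?} -> {:?}", mods, validate(&mods));
    }
}
```

Each message lists every offending modifier, mirroring the stated goal of letting the rule author pick which one to drop.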


Contributors

fwosar

v0.13.0

26 May 14:47
@mostafa mostafa
v0.13.0
This tag was signed with the committer’s verified signature.
mostafa Mostafa Moradian
GPG key ID: 2E1771094B081787
Verified
c0df66a
This commit was created on GitHub.com and signed with GitHub’s verified signature.
GPG key ID: B5690EEEBB952194
Verified

TL;DR
RSigma v0.13.0 is the "post-evaluation enrichment, server-side TLS, and field observability" release:

  • Post-evaluation enrichment between engine.evaluate() and the sinks: four primitives (template, lookup, http, command), strict detection-vs-correlation kind separation, scope filters, on_error policies, six new Prometheus metrics, and a public register_builtin(name, factory) registry.
  • Server-side TLS on the daemon API listener (Axum REST + Prometheus + OTLP/HTTP + OTLP/gRPC sharing one socket via ALPN), gated by the new daemon-tls Cargo feature, with optional mutual TLS and cross-platform cert hot-reload via POST /api/v1/reload.
  • Field observability: opt-in --observe-fields on engine daemon and engine eval exposes the gap and broken-coverage signals via four /api/v1/fields/* endpoints and three Prometheus surfaces, sharing a RuleFieldSet + FieldCoverage join primitive across CLI and daemon.
  • Detached dynamic sources: declare sources in standalone YAML loaded via --source <file_or_dir>, with a unified DaemonSourceRegistry and a new rsigma rule migrate-sources helper. Pipeline-embedded sources: is visible-deprecated this release.
  • Library API: MatchResult and CorrelationResult collapse into a single EvaluationResult (RuleHeader + ResultBody), wire shape preserved. Deprecated CLI aliases are now hidden from rsigma --help. The reserved-but-empty attack subcommand group is removed.
  • Dependency bumps: jsonschema 0.46.5, jaq-core / jaq-std 1.x to 3.0 with jaq-json 2.0 (Radically Open Security audit fixes), assert_cmd 2.2.2, plus CI action bumps and two VS Code Dependabot security fixes (@azure/msal-node ^5.2.2, brace-expansion ^5.0.6).

Unknown-field discovery API (#149)

The engine daemon learns to surface two halves of detection coverage live from inside the process: which event fields are not referenced by any loaded rule (gap signal) and which rule fields have never appeared in an event (broken-coverage signal). RSigma owns both rule parsing and event ingestion end-to-end, so this view does not need an external pipeline.

Two new flags on rsigma engine daemon (off by default; zero overhead when not set):

| Flag | Default | Purpose |
| --- | --- | --- |
| --observe-fields | off | Enable the field observer. When enabled, every event evaluated by the engine task has its dotted field paths recorded. |
| --observe-fields-max-keys <N> | 10000 | Hard ceiling on distinct field names. Existing keys keep counting once the cap is hit; new keys are dropped and counted as overflow. |
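
The cap behaviour described for --observe-fields-max-keys can be sketched as a bounded counter. This is a minimal illustration, assuming a plain HashMap; the `BoundedCounter` type is hypothetical and omits the locking and Arc<str> key storage the release notes mention.

```rust
use std::collections::HashMap;

/// Counts field names up to a fixed number of distinct keys (hypothetical type).
struct BoundedCounter {
    max_keys: usize,
    counts: HashMap<String, u64>,
    overflow_dropped: u64,
}

impl BoundedCounter {
    fn new(max_keys: usize) -> Self {
        Self { max_keys, counts: HashMap::new(), overflow_dropped: 0 }
    }

    /// Record one sighting of `key`.
    fn observe(&mut self, key: &str) {
        if let Some(count) = self.counts.get_mut(key) {
            // Known keys keep counting even after the cap is reached.
            *count += 1;
        } else if self.counts.len() < self.max_keys {
            self.counts.insert(key.to_string(), 1);
        } else {
            // At capacity: the new key is dropped and counted as overflow.
            self.overflow_dropped += 1;
        }
    }
}

fn main() {
    let mut counter = BoundedCounter::new(2);
    for key in ["Image", "CommandLine", "Image", "User", "User"] {
        counter.observe(key);
    }
    println!("tracked={} overflow={}", counter.counts.len(), counter.overflow_dropped);
}
```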

Four new HTTP endpoints.

| Method | Path | Description |
| --- | --- | --- |
| GET | /api/v1/fields | Snapshot bundling summary + unknown + missing for a one-shot dashboard read. |
| GET | /api/v1/fields/unknown | Event fields not referenced by any rule. Sorted by descending count. |
| GET | /api/v1/fields/missing | Rule fields never seen in events. Each entry includes up to 10 rule titles with a truncated flag for fields that span more rules. |
| DELETE | /api/v1/fields/observer | Clear the observer's counters and return {previous_keys, previous_events}. |

Each list endpoint accepts ?limit=N&offset=M (default limit=100, cap 1000) and returns total + next_offset for deterministic pagination. All four return 503 Service Unavailable with {"error":"field observation disabled","hint":"..."} when --observe-fields is not set.
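
The limit/offset contract can be sketched as follows. Only the default of 100 and the cap of 1000 come from the text above; clamping an oversized limit (rather than rejecting it) and representing "no further page" as `None` are assumptions of this sketch, and `paginate` and `Page` are hypothetical names.

```rust
const DEFAULT_LIMIT: usize = 100;
const MAX_LIMIT: usize = 1000;

/// One page of results plus the bookkeeping needed to fetch the next one.
struct Page<'a, T> {
    items: &'a [T],
    total: usize,
    next_offset: Option<usize>,
}

/// Slice `all` according to optional `limit` and `offset` query parameters.
fn paginate<T>(all: &[T], limit: Option<usize>, offset: Option<usize>) -> Page<'_, T> {
    let total = all.len();
    // Assumption: an oversized limit is clamped to the cap.
    let limit = limit.unwrap_or(DEFAULT_LIMIT).min(MAX_LIMIT);
    let start = offset.unwrap_or(0).min(total);
    let end = start.saturating_add(limit).min(total);
    let next_offset = if end < total { Some(end) } else { None };
    Page { items: &all[start..end], total, next_offset }
}

fn main() {
    let fields: Vec<u32> = (0..250).collect();
    let page = paginate(&fields, None, None);
    println!("{} of {} items, next_offset={:?}", page.items.len(), page.total, page.next_offset);
}
```

Because the underlying list has a stable order, following next_offset until it is absent visits every entry exactly once.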

Three new Prometheus surfaces.

| Metric | Type | Description |
| --- | --- | --- |
| rsigma_fields_observed_total | counter | Total events scanned by the opt-in field observer. |
| rsigma_fields_observer_unique_keys | gauge | Distinct field names currently tracked. |
| rsigma_fields_observer_overflow_dropped_total | counter | New-key insert attempts dropped because the observer was at capacity. |

These metrics refresh on every /metrics scrape and after every successful /api/v1/fields/* call, so a Prometheus alert on rsigma_fields_observer_overflow_dropped_total fires the moment an operator's --observe-fields-max-keys choice is too low for the deployment.

Shared extraction with rsigma rule fields. The rule-field side of the join lives in a new rsigma_eval::fields module (RuleFieldSet) that both the CLI subcommand and the daemon import. The daemon caches the post-pipeline set on RuntimeEngine via ArcSwap and refreshes it on every successful load_rules(), so the HTTP handlers run lock-free against a stable view even during hot reloads.

Shared join primitive. FieldObservation::coverage(&RuleFieldSet) -> FieldCoverage lives in rsigma-eval and partitions an observation snapshot into the unknown / intersection / missing buckets in one pass. Both the daemon's HTTP handlers and the eval report consume this, so the partition semantics cannot drift across runtimes.
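
The three buckets can be illustrated with ordinary set operations. This sketch is not the rsigma-eval implementation: `Coverage` and `coverage` here are hypothetical stand-ins for FieldCoverage and FieldObservation::coverage, and it uses two loops where the real primitive is described as single-pass.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Partition of field names into the three coverage buckets (hypothetical type).
#[derive(Debug, Default, PartialEq)]
struct Coverage {
    /// Seen in events, referenced by no rule (gap signal).
    unknown: Vec<String>,
    /// Seen in events and referenced by at least one rule.
    intersection: Vec<String>,
    /// Referenced by rules, never seen in events (broken-coverage signal).
    missing: Vec<String>,
}

fn coverage(observed: &BTreeMap<String, u64>, rule_fields: &BTreeSet<String>) -> Coverage {
    let mut out = Coverage::default();
    for field in observed.keys() {
        if rule_fields.contains(field) {
            out.intersection.push(field.clone());
        } else {
            out.unknown.push(field.clone());
        }
    }
    for field in rule_fields {
        if !observed.contains_key(field) {
            out.missing.push(field.clone());
        }
    }
    out
}

fn main() {
    let observed: BTreeMap<String, u64> = [("CommandLine", 10), ("Image", 4), ("custom.tag", 1)]
        .into_iter()
        .map(|(k, v)| (k.to_string(), v))
        .collect();
    let rule_fields: BTreeSet<String> = ["CommandLine", "Image", "ParentImage"]
        .into_iter()
        .map(String::from)
        .collect();
    println!("{:#?}", coverage(&observed, &rule_fields));
}
```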

Implementation cost. Default-off; the engine task takes a single ArcSwap load per batch when no observer is attached and skips field iteration entirely. With --observe-fields set, the only added work is one Event::field_keys() walk per parsed event (one String allocation per leaf path, depth-capped at 64; flat formats like KvEvent return Cow::Borrowed) plus a short std::sync::Mutex lock to update counters. Memory is bounded by --observe-fields-max-keys (10k default ≈ a few hundred KB; keys stored as Arc<str> so snapshots refcount-bump rather than copy).

Offline coverage report. rsigma engine eval mirrors the daemon's field-observability surface with three new flags: --observe-fields enables observation; --observe-fields-max-keys <N> (default 10000, validated as NonZeroUsize so 0 is rejected at parse time); --observe-fields-report <PATH> writes the JSON report to a file (defaults to stderr if omitted so detections on stdout stay machine-consumable; clap-requires --observe-fields so the typo case fails fast). The report has the same shape as GET /api/v1/fields, so the same jq queries work against either runtime. To make this possible without coupling engine eval to the daemon Cargo feature, FieldObserver lives in rsigma-eval (which every consumer already links) and uses std::sync::Mutex to keep rsigma-eval dependency-light. rsigma-runtime keeps a pub use rsigma_eval::{FieldObserver, FieldObservation, FieldObservationEntry, FieldCoverage} re-export so existing imports continue to compile unchanged.

Docs. Endpoint reference under "Field observability" in docs/reference/http-api.md; flag rows in docs/cli/engine/daemon.md and docs/cli/engine/eval.md; metric rows in docs/reference/metrics.md; combined daemon/eval workflow in docs/guide/observability.md.

Server-side TLS for the daemon API listener (#128)

The engine daemon API listener now terminates TLS in-process for every protocol that already shares --api-addr: the Axum HTTP REST API (/healthz, /readyz, /metrics, /api/v1/*), OTLP/HTTP on POST /v1/logs, and OTLP/gRPC via LogsService/Export. Operators can drop the sidecar reverse proxy they previously needed for confidentiality, integrity, and agent-to-daemon pinning.

New Cargo feature. daemon-tls on rsigma-cli gates the TLS surface and pulls in rustls (with the aws-lc-rs provider, matching the NATS client TLS path and inheriting upstream FIPS-mode work), tokio-rustls, rustls-pemfile, rustls-pki-types, x509-parser, and hyper/hyper-util. The default build is unchanged.

Six new flags on rsigma engine daemon.

| Flag | Env | Default | Purpose |
| --- | --- | --- | --- |
| --tls-cert <PATH> | -- | -- | PEM-encoded leaf certificate (chain). Requires --tls-key. |
| --tls-key <PATH> | -- | -- | PEM-encoded private key (PKCS#8, PKCS#1, or SEC1). Requires --tls-cert. |
| --tls-key-password <PASS> | RSIGMA_TLS_KEY_PASSWORD | -- | Password for an encrypted --tls-key. Currently rejected with a clear hint pointing at openssl rsa for offline decryption; reserved for a future release. |
| --tls-client-ca <PATH> | -- | -- | PEM bundle of trusted CAs. Enables mutual TLS: clients without a CA-signed cert are rejected during the handshake. |
| --tls-min-version <1.2\|1.3> | -- | 1.3 | Minimum negotiated TLS protocol version. |
| --allow-plaintext | -- | off | Opt-in for plaintext on a non-loopback --api-addr. |

Plaintext refusal policy. When daemon-tls is built in, the daemon refuses to start on any non-loopback address unless either --tls-cert/--tls-key or --allow-plaintext is supplied. Loopback (127.0.0.0/8, ::1) always allows plaintext to keep local development friction-free.
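
The start-up decision reduces to a small predicate. The sketch below assumes a build with daemon-tls enabled; `may_start` is a hypothetical helper, and the addresses and port are made up for the example.

```rust
use std::net::SocketAddr;

/// Decide whether the listener may start on `addr`.
/// `tls_configured` stands for "--tls-cert and --tls-key were both supplied".
fn may_start(addr: SocketAddr, tls_configured: bool, allow_plaintext: bool) -> bool {
    // Loopback (127.0.0.0/8 and ::1) always allows plaintext.
    tls_configured || allow_plaintext || addr.ip().is_loopback()
}

fn main() {
    for (addr, tls, plain) in [
        ("127.0.0.1:9090", false, false),
        ("[::1]:9090", false, false),
        ("0.0.0.0:9090", false, false),
        ("0.0.0.0:9090", true, false),
        ("10.0.0.5:9090", false, true),
    ] {
        let parsed: SocketAddr = addr.parse().expect("valid socket address");
        println!("{addr} tls={tls} allow_plaintext={plain} -> {}", may_start(parsed, tls, plain));
    }
}
```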

Unified serving path. The implementation collapses the previous split between axum::serve (for plaintext non-OTLP) and tonic::transport::Server::serve_with_incoming_shutdown (for OTLP) into a single axum::Router built via tonic::service::Routes::into_axum_router. For TLS, a small custom axum::serve::Listener wraps the TcpListener and performs the tokio-rustls handshake on every accepted connection. ALPN advertises both h2 and http/1.1, so the same socket continues to serve REST + Prometheus + OTLP/HTTP + gRPC after TLS termination.

Cross-platform cert hot-reload. Cert rotation funnels through the daemon's central debounced reload task, which is triggered by POST /api/v1/reload (works on every platform, including Windows), SIGHUP (Unix), or a YAML change picked up by the file watcher. All three paths re-read the certificate and key from disk and atomically swap the active rustls::ServerConfig via Arc<ArcSwap<...>>. Inflight TLS connections are not dropped; failed reloads keep the previous certificate active, bump rsigma_reloads_failed_total, and log an error so a typo in the cert path cannot black-hole the listener. Encrypted-key...


v0.12.0

20 May 08:24
@mostafa mostafa
2336130
This commit was created on GitHub.com and signed with GitHub’s verified signature.
GPG key ID: B5690EEEBB952194
Verified

TL;DR
RSigma v0.12.0 is the "operability, performance, and documentation" release:

  • Comprehensive daemon and CLI observability: tower-http API access logs, per-request OTLP tracing, batch processing spans, source resolution spans, DLQ visibility, NATS and sink lifecycle events, correlation state eviction warnings, rule load diagnostics, daemon lifecycle logs, and a global --log-format flag for non-daemon subcommands.
  • Eval rule loading is no longer O(N²): Engine::add_rule is amortized O(1), and bulk loaders (Engine::add_rules, extend_compiled_rules, add_collection) rebuild indexes exactly once per batch. The full 3,120-rule SigmaHQ corpus that previously appeared to hang now loads in ~120 ms.
  • CLI subcommands reorganized into five noun-led groups (engine, rule, backend, pipeline, and the reserved attack group). Flat aliases continue to work as deprecated forwarders for one release.
  • Full documentation site live at https://timescale.github.io/rsigma/: 47 pages spanning Getting Started, User Guide, CLI Reference, Library API, Developers, Reference (including a 66-rule lint catalogue and a 27-metric Prometheus catalogue), Deployment, Editors, and Ecosystem. Built from docs/ on every merge to main via the new .github/workflows/docs.yml.
  • Test reliability: cli_daemon_http and cli_daemon_otlp E2E suites are now flake-free on macOS under load.
  • Dependency bumps: opentelemetry-proto 0.31.0 to 0.32.0, async-nats 0.47 to 0.48, yamlpath/yamlpatch 1.25.2 (with the serde_yaml cargo rename replaced by yaml_serde directly), tokio 1.52.3, jsonschema 0.46.4, tower-http 0.6.10, tonic 0.14.6.

Daemon and CLI observability (PR #107)

The daemon and CLI ship with structured logs, distributed tracing spans, and profiling hooks across the three observability pillars. All new instrumentation flows through the existing tracing-subscriber (JSON, env-filter) and is controlled via RUST_LOG. Spans are designed to be consumable by future tokio-console or tracing-opentelemetry exporters without code changes.

Phases. One commit per phase, in landing order:

| Phase | Scope |
| --- | --- |
| HTTP API access logs | tower-http::TraceLayer::new_for_http() on the Axum router; each request produces a span with method, URI, status, and latency |
| Event pipeline | Per-batch debug span (batch_size, input_format, match count, elapsed_ms); DLQ parse-failure debug events; checked DLQ channel send with warn-on-closed; DLQ task lifecycle logging |
| Source resolution | InstrumentedResolver debug span (source_id, source_type); cache hit / fetch boundary events; refresh scheduler cycle completion logs (sources, duration_ms) |
| Correlation memory pressure | Warn on hard-cap eviction (current count, max, evicted, target capacity) so high-cardinality traffic causing data loss is no longer silent |
| NATS, sinks, backpressure | NATS source/sink publish and ack events; spawn_source backpressure warn alongside the existing metric; Sink::FanOut per-sink labels (sink_index, sink_type, error) |
| Rule load diagnostics | load_rules info span (rules_path, duration_ms); first three parse error details when bad rules fail to compile |
| OTLP per-request tracing | otlp_ingest debug span on both HTTP and gRPC handlers; record_count event after decoding ExportLogsServiceRequest |
| Daemon lifecycle | Health state transitions; file watcher errors; reload-channel coalesce vs closed events; periodic state snapshot duration and serialized size; SQLite migration column events; per-task shutdown-join logs |
| --log-format for CLI | Global --log-format <json\|text> initializes a stderr subscriber on non-daemon subcommands. engine eval, rule validate, and rule lint emit info events on completion (rules loaded, validation totals, lint summary) when a subscriber is installed. The daemon always logs JSON, so the flag is a no-op there. |

Verbosity targets.

| RUST_LOG filter | Surfaces |
| --- | --- |
| info,tower_http=debug | HTTP API access logs |
| info,rsigma=debug | Batch processing spans, DLQ routing, OTLP per-request fields, snapshot save duration |
| info,rsigma_runtime::sources=debug | Dynamic source resolution and refresh scheduler |
| info,rsigma_eval=debug | Correlation engine internals |

Span correctness fix. Holding an EnteredSpan guard from Span::enter() across .await is an anti-pattern on the multi-threaded tokio runtime: when the task is suspended, the thread-local span context can leak into other tasks scheduled on the same thread, producing incorrect span nesting. InstrumentedResolver::resolve, the OTLP HTTP and gRPC handlers, and the engine batch loop now use .instrument() on async blocks instead. Span fields, event payloads, and runtime behavior are unchanged.

Documentation. A new Observability section in the root README and an updated Logging paragraph in the CLI README list the supported RUST_LOG filter targets and document the new --log-format flag.

Eval rule loading performance (PRs #119, #121, #122, #123)

Loading rules into an engine is no longer O(N²) in the rule count.

Batched loaders rebuild indexes exactly once. New Engine::add_rules (compiles each rule with the configured pipelines and collects per-rule compile errors without aborting the batch) and Engine::extend_compiled_rules (pre-compiled equivalent) rebuild the inverted index and per-field bloom exactly once at the end of the batch. Engine::add_collection, the rsigma rule validate path, and the rsigma engine eval rule load path now route through these APIs so the daemon and every RuntimeEngine caller share the one-rebuild fast path. Loading the SigmaHQ corpus (~3,120 rules) used to pay around 3K full index rebuilds and appeared to hang; it now completes in roughly 120 ms.

Single-rule add path is amortized O(1). Engine::add_rule and Engine::add_compiled_rule no longer rebuild the indexes from scratch on every push. They fold the new rule into the inverted index incrementally via the new RuleIndex::append_rule(rule_idx, rule) primitive, and into the per-field bloom via FieldBloomIndex::append_rule(rule). The bloom uses a doubling watermark with a 64-rule floor to schedule full rebuilds when the rule count has at least doubled past the last rebuild, capping false-positive-rate drift while keeping the amortized per-rule cost O(1). Rules that introduce a brand-new indexed field get a fresh bloom on the fly.
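
One way to read the doubling watermark is sketched below: a full rebuild is scheduled once the rule count reaches the larger of 64 and twice the count at the previous rebuild. The exact rule inside FieldBloomIndex may differ; `RebuildSchedule` and `rebuilds_after` are hypothetical names used only for this illustration.

```rust
const REBUILD_FLOOR: usize = 64;

/// Tracks when an incrementally updated index is due for a full rebuild.
#[derive(Default)]
struct RebuildSchedule {
    rules: usize,
    rules_at_last_rebuild: usize,
    rebuilds: usize,
}

impl RebuildSchedule {
    /// Record one appended rule; returns true when a full rebuild is due.
    fn append_rule(&mut self) -> bool {
        self.rules += 1;
        let watermark = (self.rules_at_last_rebuild * 2).max(REBUILD_FLOOR);
        if self.rules >= watermark {
            self.rules_at_last_rebuild = self.rules;
            self.rebuilds += 1;
            true
        } else {
            false
        }
    }
}

/// Number of full rebuilds triggered while appending `n` rules one by one.
fn rebuilds_after(n: usize) -> usize {
    let mut schedule = RebuildSchedule::default();
    for _ in 0..n {
        schedule.append_rule();
    }
    schedule.rebuilds
}

fn main() {
    for n in [63, 64, 1_000, 100_000] {
        println!("{n} rules -> {} full rebuilds", rebuilds_after(n));
    }
}
```

Under this reading, rebuilds happen at 64, 128, 256, ... rules, so the number of full rebuilds grows logarithmically with the rule count and the total rebuild work stays linear, which is what keeps the per-rule cost amortized O(1).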

| Rules | add_collection | add_rules | add_rule loop |
| --- | --- | --- | --- |
| 1,000 | 1.15 ms | 1.17 ms | 1.64 ms |
| 10,000 | 11.82 ms | 11.85 ms | 17.23 ms |
| 100,000 | 121.65 ms | 122.13 ms | 166.07 ms |

(M4 Pro, release build. Run via cargo bench -p rsigma-eval --bench eval -- rule_load.)

When cross_rule_ac_enabled is on, the daachorse cross-rule index has no incremental update story, so the single-rule add path falls back to a full Engine::rebuild_index. Bulk loaders are unaffected.

Correctness. Between bloom rebuilds, probes may answer MaybeMatch where the batched-rebuild path would answer DefinitelyNoMatch. Both verdicts are correct (MaybeMatch is always safe); the engine just evaluates the rule directly instead of short-circuiting. The new differential test append_rule_matches_build_verdicts pins this property by checking that positive verdicts match exactly and that disjoint haystacks are still rejected at >= 90% under incremental builds.

Benchmarks. A new rule_load Criterion group compares the three load entry points at 1K / 10K / 100K rules. Numbers recorded in BENCHMARKS.md under the Rule Load Paths (0.11.x) subsection.

CLI command groups (PR #124)

The 12 flat top-level subcommands are reorganized into five noun-led command groups so the CLI scales as more subcommands arrive. The flat aliases continue to work for one release as visible-deprecated forwarders, are hidden in the next release, and are removed in v1.0. Every existing invocation keeps working, so there is no breaking change in this release.

$ rsigma
Parse, validate, and evaluate Sigma detection rules
Usage: rsigma [OPTIONS] <COMMAND>
Commands:
 engine Run rules against events (eval / daemon)
 rule Inspect and operate on Sigma rule files
 backend Convert Sigma rules to backend-native queries
 pipeline Pipeline tooling (resolve dynamic sources, ...)
 attack MITRE ATT&CK tooling (reserved; populated by the ATT&CK contributor PR)
 eval [deprecated] Use `rsigma engine eval` instead
 daemon [deprecated] Use `rsigma engine daemon` instead
 parse [deprecated] Use `rsigma rule parse` instead
 validate [deprecated] Use `rsigma rule validate` instead
 lint [deprecated] Use `rsigma rule lint` instead
 fields [deprecated] Use `rsigma rule fields` instead
 condition [deprecated] Use `rsigma rule condition` instead
 stdin [deprecated] Use `rsigma rule stdin` instead
 convert [deprecated] Use `rsigma backend convert` instead
 list-targets [deprecated] Use `rsigma backend targets` instead
 list-formats [deprecated] Use `rsigma backend formats` instead
 resolve [deprecated] Use `rsigma pipeline resolve` instead
 help Print this message or the help of the given subcommand(s)
Options:
 --log-format <LOG_FORMAT> Emit structured diagnostic logs to stderr (for CI / log aggregation) [possible values: json, text]
 -h, --help Print help (see more with '--help')
 -V, --version Print version

Migration:

Old (flat) New (grouped)
rsigma eval ... rsigma engine eval ...
rsigma daemon ... rsigma engine daemon ...
rsigma parse ... rsigma rule parse ...
rsigma validate ... `r...

v0.11.0

14 May 09:21
@mostafa mostafa
84b46ed
This commit was signed with the committer’s verified signature.
mostafa Mostafa Moradian
GPG key ID: 2E1771094B081787
Verified

TL;DR
RSigma v0.11.0 is the "eval performance" release:

  • Matcher optimizer: batches |contains lists into Aho-Corasick automata, groups sibling regex matchers into RegexSet DFAs, and eliminates redundant to_lowercase() calls via shared case-folding groups.
  • Opt-in bloom filter pre-filtering for substring matchers, skipping entire detection items when trigrams cannot match.
  • Opt-in cross-rule Aho-Corasick prefilter via daachorse (behind the daachorse-index feature flag), pruning entire rules before evaluation with up to ~100x speedup on substring-heavy workloads.
  • Security hardening for dynamic pipeline sources: 10 MB body/payload caps on HTTP, command stdout, and NATS; 30-second command execution timeout; 1-second refresh interval floor. Closes all v0.10.0 Known Limitations.
  • Parser fix: the unsupported |not modifier is now rejected with guidance toward condition-level negation.
  • Dependency bumps: criterion 0.5.1 to 0.8.2, jsonschema 0.42.2 to 0.46.3.

What's New

Matcher optimizer (PRs #99, #100, #101, #105)

The compiler now includes an optimization pass that restructures AnyOf matcher trees for better runtime performance. The optimizer is always on and preserves evaluation semantics exactly. Three transformations are applied in order:

Aho-Corasick batching. When an AnyOf node contains 8 or more plain |contains children with the same case sensitivity, they are collapsed into a single Aho-Corasick automaton (AhoCorasickSet). Instead of N sequential substring scans, the engine makes one linear pass over the haystack. The threshold of 8 was chosen empirically from a benchmark sweep: below 8 patterns, sequential str::contains with SIMD acceleration (memchr / Two-Way) is faster; at 8 and above, throughput flattens because the AC automaton scans once regardless of pattern count.

| Patterns | h=100 B | h=1 KB | h=8 KB | h=64 KB |
| --- | --- | --- | --- | --- |
| 1 | 13.0 Melem/s | 7.77 Melem/s | 1.85 Melem/s | 248 Kelem/s |
| 4 | 9.08 Melem/s | 2.03 Melem/s | 293 Kelem/s | 35.6 Kelem/s |
| 8 | 5.17 Melem/s | 620 Kelem/s | 79.0 Kelem/s | 9.76 Kelem/s |
| 16 | 5.19 Melem/s | 628 Kelem/s | 78.6 Kelem/s | 9.67 Kelem/s |
| 32 | 4.99 Melem/s | 607 Kelem/s | 76.4 Kelem/s | 8.88 Kelem/s |

RegexSet batching. When an AnyOf node contains 3 or more |re children, they are collapsed into a single RegexSet DFA. One DFA pass replaces N independent regex evaluations. Falls back to individual matchers if set construction fails.

Case-insensitive grouping. After AC and RegexSet restructuring, if 2 or more surviving children are all case-insensitive and "pre-lowerable," they are wrapped in a CaseInsensitiveGroup. The haystack is lowered once via ascii_lowercase_cow (borrow-if-already-lower fast path), and all children use matches_pre_lowered against the shared lowered string, eliminating repeated allocation.
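
The borrow-if-already-lower idea can be sketched with std::borrow::Cow. The function name ascii_lowercase_cow comes from the text above, but this body is an assumed implementation, and `any_contains_ci` is a hypothetical stand-in for a group of pre-lowered matchers.

```rust
use std::borrow::Cow;

/// Lower ASCII letters, allocating only when the input contains an uppercase byte.
fn ascii_lowercase_cow(s: &str) -> Cow<'_, str> {
    if s.bytes().any(|b| b.is_ascii_uppercase()) {
        Cow::Owned(s.to_ascii_lowercase())
    } else {
        Cow::Borrowed(s)
    }
}

/// Case-insensitive "any needle is a substring" check.
/// The needles are expected to be lowercase already.
fn any_contains_ci(haystack: &str, lowered_needles: &[&str]) -> bool {
    // The haystack is lowered once and shared by every child matcher.
    let lowered = ascii_lowercase_cow(haystack);
    lowered_needles.iter().any(|needle| lowered.contains(*needle))
}

fn main() {
    let needles = ["powershell", "cmd.exe"];
    println!("{}", any_contains_ci(r"C:\Windows\PowerShell.EXE", &needles));
    println!("{}", any_contains_ci("notepad", &needles));
}
```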

The optimizer only applies to AnyOf (OR) groups, never to AllOf (AND). This is a correctness constraint: collapsing AND-of-contains into AC with any-match semantics would change the logic.

Correctness guarantee. A new differential fuzz target (fuzz_eval_matcher_diff) asserts that optimize_any_of(matchers) produces identical match results to AnyOf(matchers) for arbitrary needle sets, haystacks, and case sensitivity.

Bloom filter pre-filtering (PRs #102, #104)

An opt-in trigram-based bloom index that can skip expensive substring matching before it starts. The bloom filter operates at the detection-item level, inside evaluate_rule.

How it works. At rule load time, the engine extracts positive substring needles (|contains, |startswith, |endswith, and AhoCorasickSet needles) from all compiled rules and inserts every 3-byte trigram into a per-field bloom filter (double hashing from AHash-derived pairs). At eval time, for each string field value, the engine slides trigrams over the lowered haystack; if no trigram from any pattern is present in the bloom, the item returns DefinitelyNoMatch and the matcher is skipped entirely.
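
A toy version of the trigram probe is sketched below. It is not the engine's index: it swaps AHash for the standard library's DefaultHasher, uses one fixed-size filter instead of per-field filters under a byte budget, and `TrigramBloom` with its methods are hypothetical names.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

const BITS: usize = 1 << 16;
const PROBES: u64 = 2;

struct TrigramBloom {
    bits: Vec<bool>,
    /// Needles shorter than 3 bytes have no trigram, so the filter must stay conservative.
    has_short_needle: bool,
}

impl TrigramBloom {
    fn new() -> Self {
        Self { bits: vec![false; BITS], has_short_needle: false }
    }

    /// Double hashing: position_i = (h1 + i * h2) mod BITS.
    fn positions(trigram: &[u8]) -> impl Iterator<Item = usize> {
        let mut hasher = DefaultHasher::new();
        trigram.hash(&mut hasher);
        let h1 = hasher.finish();
        let h2 = (h1 >> 32) | 1; // forced odd so the probes differ
        (0..PROBES).map(move |i| (h1.wrapping_add(i.wrapping_mul(h2)) % BITS as u64) as usize)
    }

    fn insert_needle(&mut self, needle: &str) {
        let lowered = needle.to_ascii_lowercase();
        if lowered.len() < 3 {
            self.has_short_needle = true;
            return;
        }
        for trigram in lowered.as_bytes().windows(3) {
            for pos in Self::positions(trigram) {
                self.bits[pos] = true;
            }
        }
    }

    /// false means "definitely no match"; true means "run the full matcher".
    fn maybe_match(&self, haystack: &str) -> bool {
        if self.has_short_needle || haystack.len() < 3 {
            return true; // short values conservatively count as a possible match
        }
        let lowered = haystack.to_ascii_lowercase();
        lowered
            .as_bytes()
            .windows(3)
            .any(|trigram| Self::positions(trigram).all(|pos| self.bits[pos]))
    }
}

fn main() {
    let mut bloom = TrigramBloom::new();
    bloom.insert_needle("mimikatz");
    bloom.insert_needle("Invoke-Expression");
    println!("{}", bloom.maybe_match(r"C:\tools\MIMIKATZ.exe"));
    println!("{}", bloom.maybe_match("zzzzzzzz"));
}
```

If a needle of three or more bytes occurs in the haystack, all of its trigrams occur there too, so at least one haystack trigram hits the filter and the sketch can never skip a real match.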

One-sided correctness. The bloom filter has no false negatives for "definitely no match." If it says MaybeMatch, the full matcher runs as usual. Negated branches, non-string fields, and short/huge values conservatively return MaybeMatch.

Memory budget. Default total budget is 1 MiB (DEFAULT_MAX_TOTAL_BYTES), with a 64 KiB per-field cap. If the total exceeds the budget, fields with the worst bits-per-pattern density are dropped first. The budget is configurable via Engine::set_bloom_max_bytes.

CLI flags.

rsigma eval -r rules/ -e @events.json --bloom-prefilter
rsigma eval -r rules/ -e @events.json --bloom-prefilter --bloom-max-bytes 131072
rsigma daemon -r rules/ --bloom-prefilter
rsigma daemon -r rules/ --bloom-prefilter --bloom-max-bytes 2097152

When to enable. The bloom index adds approximately 1 microsecond of per-event trigram probing overhead. It pays off when you have many substring-heavy rules and most events do not match (the common case for threat intel feeds against high-volume telemetry). Benchmark with your own data before enabling in production.

Cross-rule Aho-Corasick prefilter (PR #106)

An opt-in whole-rule prefilter that prunes entire rules before evaluate_rule runs. This is distinct from the per-item matcher optimizer and the per-item bloom filter: it operates at the rule level.

How it works. At index build time, the engine collects all positive substring needles (lowered) from every rule and builds one DoubleArrayAhoCorasick<u32> automaton per field using the daachorse crate. Pattern IDs map back to rule indices. At eval time, for each indexed field with a string value, one overlapping scan on the lowered haystack marks which rules had at least one pattern hit. Rules that are "AC-prunable" (all detections consist exclusively of positive substring matchers, no negation in conditions, no field-less keywords) and received zero hits are skipped entirely.
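The pruning decision can be sketched as follows. To stay dependency-free, this Python illustration uses a naive substring loop in place of the daachorse double-array automaton, so it shows the bookkeeping (pattern to rule index, hit set, prunable flag) rather than the scan performance. The Rule class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    name: str
    field: str
    needles: list
    ac_prunable: bool = True  # False for rules with negation, keywords, etc.


def build_index(rules):
    """Map each field to (lowercased needle, rule index) pairs."""
    index = {}
    for rule_idx, rule in enumerate(rules):
        for needle in rule.needles:
            index.setdefault(rule.field, []).append((needle.lower(), rule_idx))
    return index


def candidate_rules(rules, index, event):
    """Return indices of rules that still need full evaluation."""
    hit = set()
    for field, patterns in index.items():
        value = event.get(field)
        if not isinstance(value, str):
            continue
        hay = value.lower()
        # Naive stand-in for one overlapping Aho-Corasick scan over `hay`.
        for needle, rule_idx in patterns:
            if needle in hay:
                hit.add(rule_idx)
    # Prunable rules with zero hits are skipped; everything else is kept.
    return [i for i, r in enumerate(rules) if not r.ac_prunable or i in hit]


rules = [
    Rule("mimikatz", "CommandLine", ["mimikatz"]),
    Rule("recon", "CommandLine", ["whoami"]),
    Rule("mixed", "Image", ["cmd.exe"], ac_prunable=False),
]
index = build_index(rules)
survivors = candidate_rules(rules, index, {"CommandLine": "C:\\x\\Mimikatz.exe"})
```

Rules that are not AC-prunable always survive the prefilter, which is what keeps the optimization safe for rules with negation or field-less keywords.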

Benchmark results. 200 non-matching events against N pure-substring rules (best-case workload):

| Rules | Off (default) | On (--cross-rule-ac) | Speedup |
|-------|---------------|----------------------|---------|
| 1,000 | 17.34 ms (11.5 Kelem/s) | 253.0 us (790 Kelem/s) | ~68x |
| 5,000 | 85.51 ms (2.34 Kelem/s) | 883.0 us (226 Kelem/s) | ~97x |
| 10,000 | 173.37 ms (1.15 Kelem/s) | 1.71 ms (117 Kelem/s) | ~101x |

The cross-rule index replaces O(rules × patterns) matcher work per event with one O(haystack_length) AC scan per indexed field, so the speedup grows with rule count: from ~68x at 1,000 rules to ~101x at 10,000 in the table above.

Feature flag. The daachorse dependency is optional and gated behind the daachorse-index Cargo feature. Build with:

cargo install rsigma --features daachorse-index
# or
cargo build --release --features daachorse-index

CLI flags.

rsigma eval -r rules/ -e @events.json --cross-rule-ac
rsigma daemon -r rules/ --cross-rule-ac

When to enable. This is off by default. For typical mixed workloads (substring + exact + regex rules, events that hit multiple fields, smaller rule sets), the index adds build-time and lookup overhead for little or no gain and can be a net slowdown. Enable it for large (5K+ rules), substring-heavy, shared-pattern packs where most events do not match. Always benchmark against representative data first.

Composition. The three prefilter layers stack: the rule index narrows by exact field values, the cross-rule AC narrows by substring patterns, and the bloom filter skips individual detection items. All three can be enabled simultaneously; regression tests assert that the combined output matches the no-prefilter baseline.

Security hardening for dynamic pipeline sources (PR #96)

This release closes all four items listed under "Known Limitations" in the v0.10.0 release notes. Dynamic pipeline sources that fetch from HTTP, command, or NATS now enforce resource limits.

HTTP response body size limit. Responses are capped at 10 MB (MAX_SOURCE_RESPONSE_BYTES). If the server advertises a Content-Length exceeding the limit, the response is rejected without buffering the body. During streaming, if the accumulated body exceeds the limit, the connection is dropped. A 30-second client timeout is also enforced.
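A minimal sketch of the two-stage check in Python: reject on the advertised length first, then enforce the cap while streaming. The function and exception names are hypothetical, the chunk size is arbitrary, and reading "10 MB" as 10 * 1024 * 1024 bytes is an assumption.

```python
import io

# Assumption: "10 MB" is read here as 10 * 1024 * 1024 bytes.
MAX_SOURCE_RESPONSE_BYTES = 10 * 1024 * 1024


class ResourceLimitError(Exception):
    """Stand-in for the SourceErrorKind::ResourceLimit variant."""


def read_capped(stream, content_length=None, limit=MAX_SOURCE_RESPONSE_BYTES, chunk_size=8192):
    # Stage 1: refuse before buffering anything if the server already
    # advertises a body larger than the limit.
    if content_length is not None and content_length > limit:
        raise ResourceLimitError(f"Content-Length {content_length} exceeds {limit} bytes")
    # Stage 2: the header may be absent or wrong, so count while streaming.
    body = bytearray()
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            return bytes(body)
        body.extend(chunk)
        if len(body) > limit:
            raise ResourceLimitError(f"response body exceeds {limit} bytes")


small = read_capped(io.BytesIO(b"x" * 100), content_length=100, limit=1000)
```

The streaming check matters because Content-Length is optional and untrusted; the same counting loop applies to any byte stream with a size cap.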

Command execution timeout and stdout size limit. Command sources are killed after 30 seconds (DEFAULT_COMMAND_TIMEOUT). Stdout is read in 8 KB chunks and capped at 10 MB; exceeding the limit kills the child process. Stderr is separately capped at 64 KB to prevent a chatty failing command from exhausting memory.

NATS message payload size limit. NATS messages exceeding 10 MB are rejected before parsing.

Refresh interval floor. Source refresh intervals below 1 second are clamped to 1 second with a structured warning log. This prevents config mistakes or hostile configs from causing tight polling loops.

All limits use a new SourceErrorKind::ResourceLimit variant with descriptive messages. Integration tests validate timeout killing, stdout size rejection, and NATS payload rejection.

Parser: reject |not modifier (PR #103)

Writing field|not: value in a Sigma rule is a common mistake. The not keyword is a condition-level operator, not a value modifier. Previously this would produce a generic "unknown modifier" error. Now the parser returns a dedicated NotIsNotAModifier error with guidance:

not is not a value modifier in Sigma; express negation in the condition (e.g. not selection) or move the inverted check into a separate detection used as a filter (e.g. selection and not other)
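The shape of that check, sketched in Python under stated assumptions: the modifier set below is a partial list for illustration, and the function and the UnknownModifier class are hypothetical. Only the NotIsNotAModifier name comes from the release notes.

```python
# Partial list of Sigma value modifiers, for illustration only.
KNOWN_MODIFIERS = {"contains", "startswith", "endswith", "all", "re", "cidr", "base64"}


class UnknownModifier(ValueError):
    pass


class NotIsNotAModifier(ValueError):
    pass


def parse_field_key(key):
    """Split 'Field|mod1|mod2' into (field, [modifiers])."""
    field, *modifiers = key.split("|")
    for modifier in modifiers:
        if modifier == "not":
            # Dedicated error with guidance instead of a generic one.
            raise NotIsNotAModifier(
                "not is not a value modifier in Sigma; "
                "express negation in the condition (e.g. not selection)"
            )
        if modifier not in KNOWN_MODIFIERS:
            raise UnknownModifier(f"unknown modifier: {modifier}")
    return field, modifiers


parsed = parse_field_key("CommandLine|contains|all")
```

Checking for not before the generic lookup is what lets the parser attach targeted guidance to this one common mistake.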

Regression test suite (PRs #105, #106)

A new regression_eval.rs test file (459 lines) locks down optimizer and prefilter correctness with differential tests:

| Test | What it validates |
|------|---------...


v0.10.0

08 May 09:45
@mostafa mostafa
12388fa
This commit was signed with the committer’s verified signature.
mostafa Mostafa Moradian
GPG key ID: 2E1771094B081787

TL;DR
RSigma v0.10.0 is the "dynamic pipelines" release:

  • Dynamic Sigma Pipelines: declare HTTP, command, file, and NATS sources inside pipeline YAML, with template expansion, include directives, TTL caching, background refresh, and three extract languages (jq, JSONPath, CEL).
  • A new rsigma resolve CLI command and full daemon integration with Prometheus instrumentation.
  • Native EVTX input: evaluate Sigma rules directly against Windows Event Log binary files.
  • Pipeline hot-reload: the daemon now watches pipeline files alongside rules.
  • Builtin pipelines: ecs_windows and sysmon embedded at compile time.
  • Comprehensive fuzz testing: 14 cargo-fuzz harnesses covering all untrusted input surfaces.
  • Security hardening: SQL injection prevention, recursion limits, condition DoS caps, SIGTERM handler, and event size limits.
  • CI and supply chain: MSRV enforcement, cargo-deny, serde_yaml migration, Dependabot, SECURITY.md, and CONTRIBUTING.md.

What's New

Dynamic Sigma Pipelines (PRs #86-#93)

Pipelines can now declare external data sources that are resolved at runtime and injected into pipeline fields via template expansion. This is a capability unique to RSigma: no other Sigma engine supports dynamic processing pipelines.

Four source types. A new sources section in pipeline YAML declares named data sources:

sources:
  threat_intel:
    type: http
    url: https://feeds.example.com/iocs.json
    format: json
    extract:
      expr: ".indicators[].value"
      type: jsonpath
    refresh:
      interval: 300
    on_error: use_cached
    required: false

| Source type | Description |
|-------------|-------------|
| http | Fetch from a URL (GET/POST) with optional headers |
| command | Execute a local command and capture stdout |
| file | Read from a local file path |
| nats | Subscribe to a NATS subject for push-based updates |

Template expansion. Pipeline field values reference resolved source data via ${source.threat_intel} syntax. Templates are expanded after all sources resolve, before the pipeline is applied to rules.
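A minimal Python sketch of the expansion step, assuming sources have already been resolved into a dictionary. The regular expression, the function name, and the rule that a value consisting of a single template is replaced by the resolved data itself (rather than its string form) are assumptions made for illustration; the release notes only specify the ${source.<id>} syntax.

```python
import re

# Hypothetical pattern for the ${source.<id>} syntax.
TEMPLATE = re.compile(r"\$\{source\.([A-Za-z0-9_]+)\}")


def expand(value, resolved):
    """Expand ${source.<id>} references in one pipeline field value."""
    whole = TEMPLATE.fullmatch(value)
    if whole:
        # Assumption: a lone template is replaced by the data itself,
        # so a resolved list stays a list.
        return resolved[whole.group(1)]
    # Otherwise interpolate each reference as text.
    return TEMPLATE.sub(lambda m: str(resolved[m.group(1)]), value)


resolved = {"threat_intel": ["203.0.113.7", "198.51.100.9"], "env": "prod"}
iocs = expand("${source.threat_intel}", resolved)
label = expand("feed-${source.env}", resolved)
```

A reference to an unknown source raises KeyError in this sketch, which mirrors the ordering in the text: expansion runs only after all sources have resolved.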

Three extract languages. Source responses can be filtered before injection:

| Type | Engine | Example |
|------|--------|---------|
| jq (default) | jaq | .records[] \| .ip |
| jsonpath | jsonpath-rust | $.indicators[*].value |
| cel | cel-interpreter | data.filter(x, x.severity > 3) |

Include directives. Pipelines can include other pipeline fragments via include sources, with a recursive depth limit of 1. Remote includes (HTTP, NATS) require the --allow-remote-include daemon flag.

TTL-based caching. Resolved source data is cached in SQLite with configurable TTL. A cache invalidation API allows on-demand refresh without waiting for expiry.

Background refresh. After startup, sources refresh on their configured interval in the background. Failures for non-required sources do not block the pipeline; the last cached value is used (configurable via on_error: use_cached | fail | ignore).
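How TTL caching and the on_error policy interact can be sketched as below. This Python illustration keeps the cache in a dictionary rather than SQLite, and the exact semantics of fail and ignore are assumptions (fail re-raises the fetch error, ignore yields no value); the class and method names are hypothetical.

```python
import time


class SourceCache:
    """In-memory stand-in for the SQLite-backed source cache."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self.entries = {}  # source_id -> (value, stored_at)

    def resolve(self, source_id, fetch, on_error="use_cached"):
        entry = self.entries.get(source_id)
        now = self.clock()
        if entry is not None and now - entry[1] < self.ttl:
            return entry[0]  # fresh cache hit, no fetch needed
        try:
            value = fetch()
        except Exception:
            if on_error == "use_cached" and entry is not None:
                return entry[0]  # expired, but the last good value is kept
            if on_error == "ignore":
                return None  # assumption: the source contributes nothing
            raise  # "fail", or "use_cached" with nothing cached yet
        self.entries[source_id] = (value, now)
        return value


fake_now = [0.0]
cache = SourceCache(ttl_seconds=300, clock=lambda: fake_now[0])
first = cache.resolve("threat_intel", lambda: ["203.0.113.7"])
fake_now[0] = 301.0  # the entry has expired
fallback = cache.resolve("threat_intel", lambda: 1 / 0)  # fetch fails
```

Injecting the clock keeps the sketch testable without sleeping; a cache invalidation API would amount to deleting the entry so the next resolve fetches again.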

SIGHUP re-resolution. Sending SIGHUP to the daemon triggers both a rule reload and a full source re-resolution cycle.

NATS control subject. A NATS message on a configurable control subject triggers source re-resolution, enabling external orchestration of pipeline updates.

rsigma resolve command (PR #88). A new CLI subcommand resolves dynamic sources and prints results:

# Resolve all sources in a pipeline
rsigma resolve -p pipelines/dynamic_threat_intel.yml
# Resolve a specific source by ID
rsigma resolve -p pipelines/dynamic_threat_intel.yml -s threat_intel --pretty
# Dry-run: show source metadata without resolving
rsigma resolve -p pipelines/dynamic_threat_intel.yml --dry-run

rsigma validate --resolve-sources (PR #88). Validate that pipeline sources can be resolved successfully alongside rule validation.

Prometheus metrics (PR #88). Five new metrics track source resolution in the daemon:

| Metric | Labels | Description |
|--------|--------|-------------|
| rsigma_source_resolves_total | source_id, source_type | Total source resolution attempts |
| rsigma_source_resolve_errors_total | source_id, error_kind | Resolution errors by kind (Fetch, Parse, Extract, Timeout) |
| rsigma_source_resolve_seconds | | Resolution latency histogram |
| rsigma_source_cache_hits_total | | Cache hit counter |
| rsigma_source_last_resolved_timestamp | source_id | Unix timestamp of last successful resolution |

/api/v1/status extension (PR #88). The status endpoint now includes a dynamic_sources summary when sources are configured:

{
  "status": "running",
  "dynamic_sources": {
    "total": 3,
    "resolves_total": 42,
    "errors_total": 1,
    "cache_hits": 38
  }
}

Full test coverage. Integration and E2E tests validate the entire dynamic pipeline lifecycle against real daemon instances (PR #90). Criterion benchmarks measure resolution throughput and template expansion overhead (PR #91). Seven dedicated fuzz targets cover source YAML parsing, template expansion, extract expressions, include parsing, and HTTP response handling (PR #92). SigmaHQ corpus regression validates that dynamic pipelines do not regress existing static pipeline behavior (PR #93).

EVTX input adapter (PR #85)

RSigma can now evaluate Sigma rules directly against Windows Event Log binary files (.evtx). The adapter uses the evtx crate to parse the binary format and yield JSON records that feed directly into the detection engine.

# Evaluate rules against a Windows Event Log file
rsigma eval -r rules/windows/ -e @Security.evtx
# Works with pipelines
rsigma eval -r rules/ -p sysmon -e @Microsoft-Windows-Sysmon%4Operational.evtx

Auto-detection is extension-based: any @path argument ending in .evtx (case-insensitive) is routed through the EVTX parser. The feature is compile-time gated behind the evtx feature flag (included in default features).

Pipeline hot-reload (PR #68)

The daemon file watcher now monitors pipeline YAML files alongside the rules directory. Changes to any referenced pipeline file trigger the same debounced reload cycle as rule changes. A reload can start from any of three triggers:

  1. Filesystem events on watched .yml/.yaml files (500 ms debounce)
  2. SIGHUP signal (Unix)
  3. POST /api/v1/reload endpoint

If a pipeline file fails to parse during reload, the old engine configuration is preserved and rsigma_reloads_failed_total is incremented.

Builtin pipelines (ecs_windows, sysmon) are embedded at compile time and excluded from the file watcher.

Bundled pipelines (PR #69)

Two processing pipelines are now embedded in the binary via include_str!():

| Name | Description |
|------|-------------|
| ecs_windows | Sigma/Sysmon field names to Elastic Common Schema (process creation, network, file, registry, DNS, pipe, driver, remote thread, process access) |
| sysmon | Adds EventID conditions for logsource-to-Sysmon-event routing |

Reference them by name instead of a file path:

rsigma eval -r rules/ -p ecs_windows -e @events.json
rsigma daemon -r rules/ -p sysmon
rsigma convert -r rules/ -t postgres -p ecs_windows

Fuzz testing (PR #70, PR #92)

Fourteen cargo-fuzz harnesses now cover every untrusted input surface:

| Target | Surface |
|--------|---------|
| fuzz_parse_yaml | Sigma YAML parser |
| fuzz_condition | Condition expression parser |
| fuzz_field_modifiers | Field modifier parsing |
| fuzz_eval_matching | Event evaluation engine |
| fuzz_regex_compile | Regex pattern compilation |
| fuzz_pipeline_yaml | Pipeline YAML parsing |
| fuzz_input_formats | Input format auto-detection (JSON, syslog, logfmt, CEF) |
| fuzz_pipeline_sources_yaml | Dynamic source YAML parsing |
| fuzz_extract_jq | jq extract expression evaluation |
| fuzz_extract_jsonpath | JSONPath extract expression evaluation |
| fuzz_extract_cel | CEL extract expression evaluation |
| fuzz_template_expand | Template ${source.*} expansion |
| fuzz_include_parse | Include directive parsing |
| fuzz_http_response | HTTP response body handling |

Seed corpora include real SigmaHQ rules, handcrafted adversarial inputs, and valid pipeline examples. A weekly scheduled CI job runs all targets with per-target --max_len limits. Crashes upload as artifacts.

Security hardening (PRs #71-#76)

Six PRs address security, robustness, and code quality:

SQL injection prevention (PR #71). The PostgreSQL backend now validates all identifiers (table, schema, field segments) against ^[A-Za-z_][A-Za-z0-9_$]*$ before embedding them in SQL. Malicious inputs are rejected with ConvertError::InvalidIdentifier instead of being interpolated.
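The allow-list check can be illustrated in Python with the same pattern. The function and exception names are hypothetical stand-ins (the Rust error is ConvertError::InvalidIdentifier). Note that the sketch uses fullmatch: in Python's re module a trailing $ also matches just before a final newline, so fullmatch is the stricter way to apply this pattern.

```python
import re

# Same character classes as the pattern quoted above; anchoring is
# provided by fullmatch instead of ^ and $.
IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_$]*")


class InvalidIdentifier(ValueError):
    """Stand-in for ConvertError::InvalidIdentifier."""


def qualified_table(schema, table):
    """Validate both identifiers before they are embedded in SQL text."""
    for part in (schema, table):
        if not IDENTIFIER.fullmatch(part):
            raise InvalidIdentifier(repr(part))
    return f"{schema}.{table}"


target = qualified_table("public", "okta_events")
```

Rejecting anything outside the allow-list is simpler to reason about than escaping, since identifiers cannot be passed as bound query parameters.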

Unbounded recursion limits (PR #71). YAML deep-merge is capped at 64 levels (MAX_DEPTH). Exceeding the limit returns SigmaParserError::MergeTooDeep.

Condition DoS caps (PR #71). Condition expressions are limited to 64 KiB (MAX_CONDITION_LEN) and 64 nesting levels (MAX_CONDITION_DEPTH). Exceeding either limit returns a descriptive parse error instead of overflowing the stack.
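A minimal sketch of both caps as a pre-parse check in Python. The two constants come from the text; the function, the exception names, and the choice to count only parentheses as nesting are assumptions (the real parser may also count other constructs).

```python
MAX_CONDITION_LEN = 64 * 1024  # bytes
MAX_CONDITION_DEPTH = 64


class ConditionTooLong(ValueError):
    pass


class ConditionTooDeep(ValueError):
    pass


def check_condition(expr):
    """Reject oversized or over-nested conditions; return the max depth seen."""
    if len(expr.encode()) > MAX_CONDITION_LEN:
        raise ConditionTooLong(f"condition exceeds {MAX_CONDITION_LEN} bytes")
    depth = max_depth = 0
    for ch in expr:
        if ch == "(":
            depth += 1
            if depth > MAX_CONDITION_DEPTH:
                # Fail before any recursive descent can exhaust the stack.
                raise ConditionTooDeep(f"nesting exceeds {MAX_CONDITION_DEPTH} levels")
            max_depth = max(max_depth, depth)
        elif ch == ")":
            depth = max(depth - 1, 0)
    return max_depth


depth = check_condition("selection and not (filter1 or (filter2))")
```

Both checks run in a single linear pass, so a hostile condition is rejected at a cost proportional to its length.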

SIGTERM handler (PR #74). The daemon now handles SIGTERM with the same graceful shutdown path as Ctrl+C: drain the pipeline within --drain-timeout, persist correlation state, and exit cleanly.

parking_lot mutexes (PR #74). Internal mutexes migrated from std::sync::Mutex to parking_lot::Mutex for fairer scheduling and no poisoning.

Event size cap (PR #74). HTTP ingestion rejects individual lines exceeding 1 MiB with 413 Payload Too Large.

Code quality (PR #75). KEY_CACHE completeness test ensures all modifier keys are cached. partial_cmp replaced with total_cmp for deterministic float comparisons.

Testing gaps (PR #76). Runtime integration tests and parser AST snapshot tests added to cover previously untested paths.

CI and supply chain (PRs #72-#73)

MSRV enforcement. A dedicated CI job runs cargo check --workspace --all-features --locked on the declared MSRV (...


v0.9.0

04 May 09:30
@mostafa mostafa
v0.9.0
This tag was signed with the committer’s verified signature.
mostafa Mostafa Moradian
GPG key ID: 2E1771094B081787
ca41e3e
This commit was signed with the committer’s verified signature.

TL;DR
RSigma v0.9.0 is one of the largest releases yet:

  • Production-grade NATS JetStream with at-least-once delivery, authentication and TLS, dead-letter queues, replay from offset or timestamp, consumer groups, and sequence-aware correlation state restoration
  • Native OpenTelemetry log ingestion over HTTP (protobuf + JSON) and gRPC
  • A new LynxDB conversion backend for SPL2-compatible queries
  • The rsigma fields field catalog
  • Structured exit codes for CI/CD scripting
  • Per-rule Prometheus metric labels
  • The entire codebase restructured into directory-based modules
  • And a comprehensive E2E test suite validating every I/O path against real Postgres and NATS instances via testcontainers

What's New

NATS production hardening (PR #59)

Five features bring the NATS pipeline from development-grade to production-ready.

At-least-once delivery with deferred ack. The streaming pipeline has been refactored from at-most-once to at-least-once delivery. Messages are now held in an AckToken until the sink confirms delivery. A new RawEvent struct bundles each payload with its ack token, and a dedicated ack task resolves tokens after sink confirmation. If the daemon crashes before ack, NATS redelivers the message after ack_wait expires. The EventSource trait now returns Option<RawEvent> instead of Option<String>, and NatsSink has been upgraded from core NATS publish to JetStream publish with server-confirmed persistence.
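The deferred-ack idea, reduced to its core in Python: the acknowledgment travels with the event and fires only after the sink accepts the payload. This is a sketch, not the rsigma pipeline; the field names on RawEvent, the pump function, and the callback-based ack are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class RawEvent:
    payload: str
    ack: Callable[[], None]  # stands in for the AckToken


def pump(events, sink):
    """Deliver events; acknowledge each one only after the sink confirms."""
    acked = 0
    for event in events:
        try:
            sink(event.payload)
        except Exception:
            # No ack: the broker would redeliver after ack_wait expires.
            continue
        event.ack()
        acked += 1
    return acked


confirmed = []


def flaky_sink(payload):
    if payload == "bad":
        raise IOError("sink unavailable")


events = [RawEvent(p, (lambda p=p: confirmed.append(p))) for p in ["ok1", "bad", "ok2"]]
count = pump(events, flaky_sink)
```

Acknowledging before delivery would give at-most-once semantics; moving the ack after the sink call is what makes redelivery possible, at the cost of possible duplicates downstream.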

Authentication and TLS. A new NatsConnectConfig struct supports credentials file, token, username/password, NKey, mutual TLS (client cert + key), and require-TLS. Auth methods are mutually exclusive; the first configured one wins. Sensitive values can also be read from environment variables.

| CLI flag | Environment variable | Description |
|----------|----------------------|-------------|
| --nats-creds | NATS_CREDS | Credentials file path |
| --nats-token | NATS_TOKEN | Authentication token |
| --nats-user / --nats-password | NATS_USER / NATS_PASSWORD | Username and password |
| --nats-nkey | NATS_NKEY | NKey seed |
| --nats-tls-cert / --nats-tls-key | | Client certificate and key for mutual TLS |
| --nats-require-tls | | Require TLS on the connection |

Dead-letter queue. Events that fail processing are routed to a configurable DLQ instead of being silently discarded. The --dlq flag accepts the same URL schemes as --output (stdout://, file://, nats://). Each DLQ entry is a JSON object containing original_event, error, and timestamp. Events reach the DLQ from two integration points: parse errors detected before engine processing, and sink delivery failures. A new rsigma_dlq_events_total Prometheus counter tracks DLQ volume.

# Route failed events to a file
rsigma daemon -r rules/ --input nats://localhost:4222/events.> --dlq file:///var/log/rsigma-dlq.ndjson
# Route failed events to a NATS subject
rsigma daemon -r rules/ --input nats://localhost:4222/events.> --dlq nats://localhost:4222/dlq.rsigma

Replay from offset or timestamp. A ReplayPolicy enum (Resume, FromSequence, FromTime, Latest) controls the JetStream consumer's starting position. Three mutually exclusive CLI flags set the policy. Correlation state restoration is handled intelligently based on the replay direction (see "Smart correlation state restoration" below).

# Replay from a specific stream sequence
rsigma daemon -r rules/ --input nats://localhost:4222/events.> --replay-from-sequence 42
# Replay from a point in time
rsigma daemon -r rules/ --input nats://localhost:4222/events.> --replay-from-time 2026-04-30T00:00:00Z
# Start from the latest message, ignoring history
rsigma daemon -r rules/ --input nats://localhost:4222/events.> --replay-from-latest

Consumer groups for horizontal scaling. The --consumer-group flag sets a shared durable consumer name across multiple daemon instances. All instances using the same group name pull from a single JetStream consumer, and NATS automatically distributes messages for load balancing. When not specified, the consumer name is auto-derived from the subject (existing behavior).

# Two daemon instances sharing a consumer group
rsigma daemon -r rules/ --input nats://localhost:4222/events.> --consumer-group detection-workers

Smart correlation state restoration (PR #61)

The daemon now makes intelligent decisions about whether to restore correlation state from SQLite when restarting with a replay flag. Previously, any non-Resume replay policy unconditionally cleared correlation state to avoid double-counting. This was correct for forensic replay but overly conservative for forward catch-up scenarios where the daemon shuts down and restarts with --replay-from-sequence pointing after the last processed event.

Sequence-aware auto-restore. The daemon now tracks the NATS JetStream stream sequence and published timestamp of the last acknowledged message. This SourcePosition is stored alongside the correlation snapshot in SQLite (two new columns added via automatic schema migration). On restart, the decide_state_restore function compares the replay start point against the stored position: if the replay starts after the stored position (forward catch-up), state is restored safely; if at or before (backward replay), state is cleared to prevent double-counting.
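The comparison at the heart of that decision fits in a few lines. This Python sketch borrows the function name from the text but is not the Rust implementation; the return values and the handling of a missing stored position (treated as "clear") are assumptions. Positions can be stream sequences or timestamps, as long as both arguments are of the same comparable type.

```python
def decide_state_restore(replay_start, stored_position):
    """Return "restore" for forward catch-up, "clear" for backward replay."""
    if stored_position is None:
        # Assumption: with no recorded position there is nothing to compare.
        return "clear"
    if replay_start > stored_position:
        return "restore"  # replay begins after the last acknowledged message
    return "clear"  # at or before: replayed events would be double-counted


forward = decide_state_restore(replay_start=1001, stored_position=1000)
backward = decide_state_restore(replay_start=1, stored_position=1000)
```

The "at or before" case includes equality on purpose: replaying the last acknowledged message itself would count it twice.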

Explicit overrides. Two new mutually exclusive CLI flags give operators direct control when the automatic decision is not appropriate:

| Flag | Behavior |
|------|----------|
| --keep-state | Always restore correlation state, regardless of replay policy |
| --clear-state | Always clear correlation state and start fresh |
| (neither) | Automatic decision based on replay direction and stored position |

# Forward catch-up: state is auto-restored (replay starts after stored position)
rsigma daemon -r rules/ --input nats://localhost:4222/events.> --replay-from-sequence 1001 --state-db state.db
# Forensic replay: state is auto-cleared (replay starts before stored position)
rsigma daemon -r rules/ --input nats://localhost:4222/events.> --replay-from-sequence 1 --state-db state.db
# Force restore regardless of replay direction
rsigma daemon -r rules/ --input nats://localhost:4222/events.> --replay-from-sequence 1 --state-db state.db --keep-state

Timestamp fallback control. A new --timestamp-fallback flag (wallclock or skip) controls how correlation windows handle events without parseable timestamp fields. The default wallclock substitutes the current time (existing behavior). The new skip mode causes detections to still fire but omits the event from correlation state updates, preventing wall-clock times from corrupting temporal windows during forensic replay of historical logs.

# Skip events without timestamps for correlation (detections still fire)
rsigma daemon -r rules/ --input nats://localhost:4222/events.> --timestamp-fallback skip

Automatic schema migration. Existing SQLite state databases are transparently migrated on first open. The migration adds the source_sequence and source_timestamp columns without losing the existing correlation snapshot.

Codebase modularization (PRs #46-#58)

Thirteen PRs systematically split 12 large single-file modules into directory-based module structures across all six crates, improving navigability and reducing merge conflicts. The refactoring is purely structural with no behavioral changes.

| PR | File | Result |
|----|------|--------|
| #46 | lint.rs (4,991 lines) | lint/{mod,rules/{metadata,detection,correlation,filter,shared}}.rs |
| #47 | main.rs (2,221 lines) | commands/{parse,validate,lint,eval,convert}.rs |
| #48 | postgres.rs (3,183 lines) | postgres/{mod,correlation,tests}.rs |
| #49 | correlation_engine.rs (4,395 lines) | correlation_engine/{mod,types,tests}.rs |
| #50 | transformations.rs (3,379 lines) | pipeline/transformations/{mod,helpers,tests}.rs |
| #51 | parser.rs (2,276 lines) | parser/{mod,detection,correlation,filter,tests}.rs |
| #52 | pipeline/mod.rs (2,235 lines) | pipeline/{mod,parsing}.rs |
| #53 | compiler.rs (1,824 lines) | compiler/{mod,helpers,tests}.rs |
| #54 | correlation.rs (1,781 lines) | correlation/{mod,types,buffers,compiler,keys,window,tests}.rs |
| #55 | engine.rs (1,656 lines) | engine/{mod,filters,tests}.rs |
| #56 | matcher.rs (1,118 lines) | matcher/{mod,matching,helpers}.rs |
| #57 | event.rs (758 lines) | event/{mod,json,kv,plain,map}.rs |
| #58 | cli/tests/cli.rs (1,745 lines) | tests/{cli_parse,cli_validate,cli_lint,cli_eval,cli_daemon,common/mod}.rs |

Additional cleanup: is_valid_uuid was de-duplicated across lint rule modules, and pipeline parsing logic was extracted from mod.rs into its own submodule.

E2E test suite (PR #60)

A comprehensive end-to-end test suite validates every major I/O path against real infrastructure. All container-based tests use testcontainers and are automatically skipped when Docker is unavailable.

PostgreSQL integration tests. Convert Sigma rules to SQL and execute the generated queries against a real PostgreSQL instance. Uses the Okta cross-tenant impersonation scenario with JSONB schema, 6 sample events, and 4 SigmaHQ detection rules. Tests cover default format, VIEW creation, multi-rule conversion, event_count correlation, and the no-match case.

NATS E2E tests (binary-level). Spawn the rsigma daemon as a child process with --input/--output NATS URLs pointed at a testcontainers NATS instance. Four tests cover single detection, no-match silence, event_count correlation, and fan-out to multiple output subjects.

NATS E2E tests (library-level). Additional integration tests in rsigma-runtime covering JetStream publish/subscribe, detection routing, and the article scenarios from the companion blog ...


v0.8.1

29 Apr 08:38
@mostafa mostafa
v0.8.1
This tag was signed with the committer’s verified signature.
mostafa Mostafa Moradian
GPG key ID: 2E1771094B081787
fb84d6c
This commit was signed with the committer’s verified signature.

TL;DR
RSigma v0.8.1 is a patch release for the PostgreSQL backend. Dotted Sigma field names (like securityContext.isProxy) now generate correct chained JSONB operators when using -O json_field=....

What's New

Nested JSONB field paths (#45)

When json_field is set (e.g. -O json_field=data), the PostgreSQL backend now generates chained -> / ->> operators for dotted Sigma field names instead of treating the entire dotted string as a single flat key.

Before (v0.8.0):

-- securityContext.isProxy treated as a literal top-level key (incorrect)
SELECT * FROM okta_events WHERE data->>'securityContext.isProxy' = 'true'

After (v0.8.1):

-- Nested traversal into the securityContext object (correct)
SELECT * FROM okta_events WHERE data->'securityContext'->>'isProxy' = 'true'

Deeply nested paths work as expected:

| Sigma field | Generated SQL |
|-------------|---------------|
| eventType | data->>'eventType' (unchanged) |
| securityContext.isProxy | data->'securityContext'->>'isProxy' |
| actor.detail.sub.field | data->'actor'->'detail'->'sub'->>'field' |

All intermediate segments use -> (returns jsonb), and the final segment uses ->> (returns text). Flat field names without dots are unaffected. NULL propagation works correctly for existence checks: data->'nonexistent'->>'child' returns NULL, so IS NOT NULL behaves as expected on nested paths.
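The operator chaining rule can be reproduced with a short Python function. This is a sketch of the rule as described, not the backend's Rust code; the function name is hypothetical, and it assumes the field segments have already passed identifier validation, so no quoting or escaping is attempted here.

```python
def jsonb_path(json_field, sigma_field):
    """Build a JSONB accessor: -> for intermediate segments, ->> for the last."""
    *parents, leaf = sigma_field.split(".")
    # Intermediate hops return jsonb so the traversal can continue.
    path = "".join(f"->'{segment}'" for segment in parents)
    # The final hop returns text, ready for comparison.
    return f"{json_field}{path}->>'{leaf}'"


flat = jsonb_path("data", "eventType")
nested = jsonb_path("data", "securityContext.isProxy")
deep = jsonb_path("data", "actor.detail.sub.field")
```

A flat field name has no parents, so the function degenerates to the single ->> accessor, which is why existing flat-field queries are unchanged.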

This is particularly important for Okta System Log rules from SigmaHQ, where fields like securityContext.isProxy and client.ipAddress reference nested JSON objects.

Upgrade

cargo install rsigma
# or
docker pull ghcr.io/timescale/rsigma:0.8.1

Full Changelog

v0.8.0...v0.8.1
