-
Notifications
You must be signed in to change notification settings - Fork 6
Lint: AuditLogger.Emit inside per-invocation scope should be EmitFromContext #175
Description
Context
PR #173 fixed three audit-emit sites that were using `AuditLogger.Emit(...)` inside per-A2A-invocation scope, where they should have been using `AuditLogger.EmitFromContext(ctx, ...)`. The bug was operator-visible: the emitted events silently dropped `seq`, `trace_id`, `span_id`, and workflow-correlation fields the FWS-8 / FWS-4-Phase-4 contracts promise.
This issue is about catching the drift at PR time, not after a user reports a missing `seq` field in production.
The pattern to catch
```go
// BAD when ctx is in scope:
auditLogger.Emit(coreruntime.AuditEvent{
Event: coreruntime.AuditToolExec,
CorrelationID: hctx.CorrelationID,
TaskID: hctx.TaskID,
Fields: fields,
})
// GOOD:
auditLogger.EmitFromContext(ctx, coreruntime.AuditEvent{
Event: coreruntime.AuditToolExec,
CorrelationID: hctx.CorrelationID,
TaskID: hctx.TaskID,
Fields: fields,
})
```
The `Emit` form is genuinely the right call outside an invocation scope (auth callback, scheduler tick, MCP startup, subprocess egress proxy). The catch is: inside a function that takes a `context.Context` parameter (or has a ctx local in scope), `Emit` is almost always a bug.
Three options, increasing complexity
Option A — custom analyzer using `go/analysis` framework
Write a small `golang.org/x/tools/go/analysis` Analyzer that flags any call to `*coreruntime.AuditLogger.Emit` inside a function where a `context.Context`-typed value is in scope. Plug into the existing `golangci-lint` config via the `gocritic` or `custom` mechanism.
Pros: catches the exact pattern. Author-time feedback.
Cons: ~50-100 lines of analyzer code. New maintenance surface.
Option B — runtime assertion in dev / test mode
Add a debug-build assertion: `AuditLogger.Emit` panics (or logs a loud warning) when called inside a goroutine that holds a context with a `SequenceCounter` installed. Build tag `-tags=auditstrict` or env `FORGE_AUDIT_STRICT=1`.
Pros: no analyzer to maintain. Catches at test time.
Cons: only catches code paths that the test suite exercises. The user's specific `tool_exec` bug WAS reachable via the test suite (BeforeToolExec hook is fired in existing tests) but the test suite didn't assert on `seq`.
Option C — review checklist + grep guard in CI
Add a pre-commit/CI grep that flags `auditLogger.Emit\(` (note: not `EmitFromContext`) and requires a `//nolint:audit-emit` annotation with justification. Match the four documented out-of-scope sites in PR #173 by annotating them inline.
```bash
.github/workflows/lint.yaml (or scripts/lint-audit-emit.sh)
matches=$(grep -rn 'auditLogger.Emit(' forge-cli/ forge-core/ \
| grep -v 'EmitFromContext' \
| grep -v '//nolint:audit-emit')
if [ -n "$matches" ]; then
echo "audit emit without justification:"
echo "$matches"
exit 1
fi
```
Pros: ~10 lines. Zero new tooling. Forces every `Emit` call to be justified in code.
Cons: false positives (the four legit sites need annotations). Lower precision than a real analyzer.
Recommendation: Option C first, then Option A if it pays off
Start with Option C — annotate the four documented out-of-scope sites with `//nolint:audit-emit` + a brief comment explaining why, add the grep guard to CI. That gets us the regression protection with ~10 lines of CI + 4 inline annotations.
If the grep grows brittle (e.g., when channel plugins or forge-skills start emitting audit events with their own AuditLogger embedding), revisit Option A and write the real analyzer.
Files that would change with Option C
| File | Change |
|---|---|
| `scripts/lint-audit-emit.sh` (new) or `.github/workflows/lint.yaml` | The grep guard |
| `forge-cli/runtime/runner.go` | Inline annotations on the four legit `Emit` call sites (auth callback, egress proxy, MCP conflict, scheduler dispatcher) |
| `docs/security/audit-logging.md` | One-paragraph note describing the contract ("Emit inside invocation scope is a bug — use EmitFromContext") |
Annotation format
```go
//nolint:audit-emit // Auth middleware runs before the seq counter is
// installed on ctx; using EmitFromContext here would still produce a
// seq-less event. See issue #174 for the planned fix.
auditLogger.Emit(coreruntime.AuditEvent{
Event: coreruntime.EventAuthVerify,
...
})
```
Related
- PR fix(audit): stamp seq on tool_exec + session_end events #173 — the fix that motivated this
- Audit emit: stamp seq on auth_verify / auth_fail by installing the sequence counter before the auth middleware #174 — the structural fix for auth_verify / auth_fail (when that lands, one of the four annotated sites disappears)
- FWS-8 (FWS-8 — Hardened audit emission (sequence numbers, payload stripping, optional signing) #91 ) — the seq + schema_version contract this lint enforces