So score-candidate checks tenant equality itself. score-candidates filters candidates through candidate-eligible?, which repeats tenant, status, counterparty, and time-window checks before scoring.
This is defensive duplication with a purpose. The route should pass tenant-scoped collections. The domain should still reject anything outside the tenant boundary. In a single-tenant fixture, this looks redundant. In a two-tenant test, it is the difference between "the route behaved" and "the invariant held."
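To make the duplication concrete, here is a minimal Python sketch of the shape described above. It is not the Workbench code. The function names are underscore analogues of score-candidate, score-candidates, and candidate-eligible?, while the record fields, the "open" status value, the 30-day window, and the placeholder scoring formula are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Hypothetical record shapes; the real domain model is not shown in the post.
@dataclass(frozen=True)
class Dispute:
    tenant_id: str
    counterparty: str
    opened_at: datetime

@dataclass(frozen=True)
class Candidate:
    tenant_id: str
    status: str
    counterparty: str
    posted_at: datetime

WINDOW = timedelta(days=30)  # assumed time window, for illustration only

def candidate_eligible(dispute: Dispute, candidate: Candidate) -> bool:
    """Repeat the tenant, status, counterparty, and time-window checks."""
    return (
        candidate.tenant_id == dispute.tenant_id
        and candidate.status == "open"  # assumed status value
        and candidate.counterparty == dispute.counterparty
        and abs(candidate.posted_at - dispute.opened_at) <= WINDOW
    )

def score_candidate(dispute: Dispute, candidate: Candidate) -> float:
    """Check tenant equality again, even if the caller already filtered."""
    if candidate.tenant_id != dispute.tenant_id:
        raise ValueError("candidate belongs to a different tenant")
    # Placeholder score: a smaller time gap gives a higher score.
    gap = abs(candidate.posted_at - dispute.opened_at)
    return max(0.0, 1.0 - gap / WINDOW)

def score_candidates(dispute: Dispute, candidates: list[Candidate]):
    """Filter through the eligibility predicate, then score what remains."""
    return [
        (c, score_candidate(dispute, c))
        for c in candidates
        if candidate_eligible(dispute, c)
    ]
```

In a two-tenant test, a candidate from the other tenant is dropped by the filter and rejected outright if it is passed to `score_candidate` directly, so the invariant holds even when the caller gets the scoping wrong.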
The same philosophy appears in reports. The audit PDF renderer captures a tenant snapshot and renders with strict token lookup. The setup check then renders two tenants to make sure one tenant's identity literals never appear in the other tenant's output.
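A rough Python sketch of that idea follows, rendering plain text instead of a PDF. The `{{token}}` syntax, the snapshot keys, and both function names are hypothetical. I am also assuming "strict" means an unknown token raises an error instead of rendering as blank.

```python
import re

# Hypothetical token syntax: {{name}}. The real template format is not shown.
TOKEN = re.compile(r"\{\{(\w+)\}\}")

def render_strict(template: str, snapshot: dict) -> str:
    """Render against a tenant snapshot; an unknown token is an error."""
    def lookup(match: re.Match) -> str:
        key = match.group(1)
        if key not in snapshot:
            raise KeyError(f"unknown template token: {key}")
        return str(snapshot[key])
    return TOKEN.sub(lookup, template)

def check_no_cross_tenant_literals(template: str, snap_a: dict, snap_b: dict):
    """Render both tenants and fail if either output contains the other's values."""
    # Copy each snapshot so later changes to tenant data cannot alter the render.
    out_a = render_strict(template, dict(snap_a))
    out_b = render_strict(template, dict(snap_b))
    for value in snap_a.values():
        if str(value) in out_b:
            raise AssertionError(f"tenant A literal leaked into B: {value!r}")
    for value in snap_b.values():
        if str(value) in out_a:
            raise AssertionError(f"tenant B literal leaked into A: {value!r}")
    return out_a, out_b
```

A check like this only works when the two fixture tenants use distinct identity values. If both tenants shared a name or ID, the check would report a leak that is not one.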
The theme is the same: do not trust a boundary because a previous layer probably handled it.
The Result
The finished local build passed 160 tests with 869 assertions. The full-flow E2E drives invoice adapter polling into exception creation, assignment, investigation, Workflow Engine resolution polling, Notification Hub event capture, and audit PDF generation.
The number I care about most is smaller: the dashboard guard. It caught a practical operations failure. The first dashboard shape rendered every dispute link in the tenant fixture. The fix capped the overview at 50 open disputes and kept totals intact.
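The guard itself is small. As a sketch in Python, with a hypothetical `Overview` shape and function name (only the cap of 50 comes from the build), the count comes from the full collection while the rendered list is truncated:

```python
from dataclasses import dataclass

OVERVIEW_LIMIT = 50  # the cap described above

@dataclass(frozen=True)
class Overview:
    total_open: int   # count over every open dispute, not just the ones shown
    shown: list       # at most OVERVIEW_LIMIT entries to render as links

def build_overview(open_disputes: list, limit: int = OVERVIEW_LIMIT) -> Overview:
    """Cap the rendered list but compute the total from the full input."""
    return Overview(
        total_open=len(open_disputes),
        shown=list(open_disputes[:limit]),
    )
```

With 120 open disputes, this reports a total of 120 and renders 50 links. The operator still sees the true backlog without the page carrying every row.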
That is the Workbench in miniature. Preserve the facts. Limit the surface. Make the operator decide when the machine only has a probability.
Confidence is useful. Ownership is a human or policy decision.