Releases: SponsioLabs/Sponsio

v0.2.0a3: redirect-to-safe fails closed (alpha)

08 Jun 22:38
@yfxiao16 yfxiao16

Sponsio 0.2.0a3: redirect-to-safe fails closed

Released: 2026-06-08 · Status: alpha · pip install --pre sponsio==0.2.0a3

If you are on 0.2.0a2 and use redirect_to_safe with any adapter other than LangGraph, upgrade. A review of the v0.2.0a2 release surfaced a fail-open bug: the guard correctly rolled the unsafe call out of the trace, but six adapters then ran the original unsafe tool anyway. This release closes that hole.

The 0.2.0a3 release is one safety-relevant fix on top of the 0.2.0a2 "runtime-value comparisons" work, plus three smaller fixes from the same review pass. Nothing behavior-breaking; the change is "the unsafe call now actually does not execute" everywhere.


What's fixed

1. redirect_to_safe now fails closed everywhere

What happened. When a contract using redirect_to_safe(unsafe, safe) fired, the guard returned action="redirected" with blocked=False. LangGraph's adapter checked for .redirected first and substituted the safe tool transparently, as designed. But every OTHER adapter (CrewAI, OpenAI Agents SDK, Vercel AI, Claude Agent SDK, Google ADK, MCP, and the base custom-loop helper) only checked if check.blocked before executing the call — and blocked is False on a redirect. So:

  1. The guard rolled the unsafe event back from the trace (correct).
  2. The adapter, seeing blocked=False, fell through to run_tool(unsafe, args) and executed the original unsafe call anyway (wrong).

This is a fail-OPEN, the worst outcome for an enforcement layer. The contract fired; the safety control silently degraded to a no-op.

The fix. A new CheckResult.stop_original property folds blocked and redirected together. Every non-substituting adapter now gates execution on stop_original, so a redirect refuses the unsafe call:

# Old (fail-OPEN on redirect)
if check.blocked:
    return refusal_message(check)
return run_tool(name, args)

# New (fail-CLOSED on redirect)
if check.stop_original:  # blocked OR redirected
    return refusal_message(check)
return run_tool(name, args)

LangGraph is unchanged: it branches on .redirected first and performs the substitution, so it never reaches the stop_original gate. Existing tests against blocked / redirected / allowed still hold.
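
As an illustration of how a derived property like this can fold the two states together, here is a minimal sketch. The class below is a toy stand-in, not Sponsio's actual CheckResult: the `action` field and its three string values are assumptions taken from the description above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CheckResult:
    """Toy stand-in for the guard's result object (field names are assumed)."""

    action: str  # "allowed", "blocked", or "redirected"

    @property
    def blocked(self) -> bool:
        return self.action == "blocked"

    @property
    def redirected(self) -> bool:
        return self.action == "redirected"

    @property
    def stop_original(self) -> bool:
        # The original call must not run if it was blocked OR redirected.
        return self.blocked or self.redirected


for action in ("allowed", "blocked", "redirected"):
    r = CheckResult(action)
    print(f"{action:<10} blocked={r.blocked!s:<5} stop_original={r.stop_original}")
```

The redirected row is the one that matters: blocked stays False there, which is why gating on blocked alone let the unsafe call through.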

Adapters with stop_original gating: base.py (covers most), crewai.py, agents.py, claude_agent.py, google_adk.py, vercel_ai.py, mcp.py.

Tracked follow-up: the Cursor adapter takes a separate evaluate_event outcome path and is not wired through stop_original yet.

2. TS Eq matches Python value equality for composite types

Eq(ArgValue("tool", "field"), CtxValue("expected")) is a v0.2 surface (the Term abstraction made composite-value equality reachable). The TS evaluator used ===, which is reference equality for arrays and objects:

// Python: True. TS: False (different refs).
[1, 2] == [1, 2]

So a contract that the Python guard let through could fire on TS for the same trace. New valuesEqual does element- and key-wise deep comparison, restoring parity. Regression test at ts/packages/sdk/src/__tests__/parity.test.ts.
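
To make the difference concrete, the sketch below renders an element- and key-wise comparison in Python. It is not the TS valuesEqual source; the function name `values_equal` and its handling of lists and dicts are assumptions based on the description above. Python's `is` plays the role of `===` on TS arrays and objects.

```python
def values_equal(a, b):
    """Element- and key-wise deep comparison (illustrative, not the TS source)."""
    if isinstance(a, list) and isinstance(b, list):
        return len(a) == len(b) and all(values_equal(x, y) for x, y in zip(a, b))
    if isinstance(a, dict) and isinstance(b, dict):
        return a.keys() == b.keys() and all(values_equal(a[k], b[k]) for k in a)
    return a == b


left = [1, {"k": [2, 3]}]
right = [1, {"k": [2, 3]}]

print(left is right)              # identity check, analogous to === on TS arrays
print(values_equal(left, right))  # structural check, what the contract author meant
```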

3. TS SDK no longer crashes Cloudflare Workers at import

The YAML loader called createRequire(import.meta.url) eagerly at module top level. On Cloudflare Workers import.meta.url is undefined, and createRequire(undefined) threw, taking the whole bundle down even when YAML was never loaded.

Now the require instance is built lazily on first YAML load, with a ?? "file:///sponsio-noop.js" fallback. Workers that never touch YAML never call createRequire. (The sponsio-demo repo had patched this via patch-package; with 0.2.0a3 you can drop the patch.)

4. Pytest setup errors cleared up

A pre-existing autouse fixture in tests/conftest.py (the rich-style cache reset) called isinstance(obj, Style) on every live object. Optional SDK lazy proxies raised from their __class__ getter (OpenAI's voice helpers try to pull sounddevice), erroring 1684 of 2312 test setups. The check now swallows introspection failures.
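
A rough sketch of the failure and of a tolerant check follows. `Style` and `LazyProxy` here are hypothetical stand-ins (not rich's Style or any real SDK proxy), and `safe_isinstance` is an assumed helper name rather than the one used in tests/conftest.py.

```python
class Style:
    """Stand-in for the class the fixture looks for."""


class LazyProxy:
    """Hypothetical proxy whose __class__ getter fails, like a missing optional import."""

    @property
    def __class__(self):
        raise ImportError("optional dependency is not installed")


def safe_isinstance(obj, cls):
    try:
        return isinstance(obj, cls)
    except Exception:
        # Introspection itself failed; treat the object as "not a match".
        return False


objects = [Style(), LazyProxy(), 42]
matches = [obj for obj in objects if safe_isinstance(obj, Style)]
print(len(matches))
```

isinstance falls back to reading `__class__` when the direct type check does not match, so an error raised from that getter propagates unless the caller catches it.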


Documentation repairs

  • filter_tools documents its O(candidates × trace_length) re-grounding cost.
  • workflow_step documents the end-of-trace weak-next vacuity caveat (matters only for batch verify / replay; live enforce mode self-corrects on the next event).
  • Var.__eq__ documents that it builds AST nodes, not booleans.
  • _warned_missing_vars and arg_value retention get explicit footgun notes.
  • Several docstring first-lines repaired (artifacts of the earlier em-dash sweep that surfaced in help() and IDE hover popups).

Upgrading

pip install --pre sponsio==0.2.0a3

No CLI, config, or runtime API changes. CheckResult.stop_original is a new derived property; everything you wrote against blocked / redirected / allowed keeps working.

If you have a custom adapter that calls guard.guard_before(...) and gates on if check.blocked, switch to if check.stop_original to pick up the fail-closed behavior for redirects.
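
For a custom loop, the switch might look like the sketch below. `StubGuard` is a hypothetical stand-in so the example runs on its own; the real `guard_before` arguments are not shown in these notes, so the `(name, args)` signature is an assumption.

```python
from types import SimpleNamespace

executed = []  # names of tools that actually ran


def run_tool(name, args):
    executed.append(name)
    return f"{name} ok"


class StubGuard:
    """Hypothetical guard: redirects rm_rf, allows everything else."""

    def guard_before(self, name, args):
        redirected = name == "rm_rf"
        return SimpleNamespace(
            blocked=False,
            redirected=redirected,
            stop_original=redirected,  # blocked OR redirected
        )


def call_tool(guard, name, args):
    check = guard.guard_before(name, args)
    if check.stop_original:  # was: if check.blocked
        return f"refused: {name}"
    return run_tool(name, args)


guard = StubGuard()
print(call_tool(guard, "rm_rf", {"path": "/tmp/x"}))
print(call_tool(guard, "list_dir", {"path": "/tmp"}))
print(executed)
```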

Compatibility

  • No breaking API changes.
  • TS users on Cloudflare Workers can remove any patch-package workaround for the createRequire crash.
  • TS Eq semantics change from reference- to value-equality for arrays and objects. The Python-side semantics are unchanged; the TS side now matches Python. Contracts that relied on TS reference-equality (i.e. the bug) will see different verdicts.

Credits

Thanks to @donalddellapietra for the review pass that surfaced the fail-open bug, the TS Eq parity gap, and the Worker runtime crash. Full PR: #78.

What's next

  • Wire stop_original through the Cursor adapter's evaluate_event path.
  • TS NL parser port for workflow_step and the Term comparison forms (still factories-only on TS).
  • TS DFA-compiled evaluator port.

If you are using 0.2.0a3 and hit something we did not predict, open an issue.

v0.2.0a2: runtime-value comparisons + benchmark libraries (alpha)

07 Jun 06:47
@yfxiao16 yfxiao16

Sponsio 0.2.0a2: runtime-value comparisons + benchmark libraries

Released: 2026-06-07 · Status: alpha · pip install --pre sponsio==0.2.0a2

The 0.2.0a1 "softer landings" release made contracts more graceful when they fire. 0.2.0a2 makes them more expressive: contracts can now read runtime values out of tool arguments and context facts, compare them against each other, and prescribe the next action instead of only forbidding it.

It also ships the five hand-curated benchmark contract libraries that produce Sponsio's published RedCode-Exec, ODCV-Bench, τ2-bench, AgentDojo, and SWE-bench headline numbers, plus brings the TypeScript SDK to parity on the new deterministic core.


What's new

1. Term abstraction: compare runtime values

What it is. The arithmetic comparison family (Eq, Le, Lt, Ge, Gt) now accepts any Term, not just Var or Const. Four runtime-bound term subclasses ship with this release:

  • ArgValue(tool, field): raw value of args[field] when the current event is a call to tool.
  • CtxValue(key): raw value of an externally pushed context fact (guard.observe_context).
  • ArgLength(tool, field): len(args[field]) shorthand.
  • UnaryFn(fn, term): apply a Python callable to another term's value.

from sponsio.formulas.formula import ArgValue, CtxValue, Eq, G, Implies, Atom

# "If we issue a refund, the amount must equal what the supervisor approved."
contract("refund matches approval").guarantees(
    G(Implies(
        Atom("called", "issue_refund"),
        Eq(ArgValue("issue_refund", "amount"), CtxValue("approved_amount")),
    ))
)

Why it exists. Until 0.2.0a2 the only way to compare a runtime arg against an out-of-band fact was to push the comparison up into Python and use a custom strategy callback. The Term abstraction lets the comparison live inside the contract, so it shows up in sponsio validate, in audit logs, and in the DFA-compiled fast path.

Why it's good for users.

  • Audit-friendly. The constraint is declarative, not buried in callback code. A security reviewer reads the contract and sees what's being compared.
  • Cheap. Polymorphic dispatch is microseconds; no per-event Python callback overhead.
  • Composable. UnaryFn(len, ArgValue(...)) and ArgLength(...) cover length caps; UnaryFn(str.lower, ...) covers case-insensitive matches; arbitrary callables cover the rest.
  • Safe on missing data. Either operand resolving to None evaluates the comparison to false (the comparison cannot decide) rather than raising. Wrap fragile comparisons in Implies(scope_predicate, comparison) to suppress them where the relevant arg is not applicable.
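
The missing-data rule can be sketched with toy terms. The classes below are simplified re-implementations for illustration only, not the sponsio.formulas.formula classes; the `resolve(event, ctx)` method, the event shape, and the `toy_eq` helper are all assumptions.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class ToyArgValue:
    tool: str
    field: str

    def resolve(self, event: dict, ctx: dict) -> Optional[Any]:
        # Only defined when the current event is a call to `tool`.
        if event.get("tool") != self.tool:
            return None
        return event.get("args", {}).get(self.field)


@dataclass(frozen=True)
class ToyCtxValue:
    key: str

    def resolve(self, event: dict, ctx: dict) -> Optional[Any]:
        return ctx.get(self.key)


def toy_eq(left, right, event: dict, ctx: dict) -> bool:
    a, b = left.resolve(event, ctx), right.resolve(event, ctx)
    if a is None or b is None:
        return False  # cannot decide, so the comparison is false
    return a == b


event = {"tool": "issue_refund", "args": {"amount": 40}}
amount = ToyArgValue("issue_refund", "amount")
approved = ToyCtxValue("approved_amount")

print(toy_eq(amount, approved, event, {"approved_amount": 40}))
print(toy_eq(amount, approved, event, {}))  # no approval pushed yet
```

Because an undecidable comparison is false, an unscoped Eq would fire on every event that is not the relevant tool call, which is the reason for the Implies wrapper recommended above.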

2. workflow_step(trigger, next_action): prescriptive next-step

What it is. A new pattern that says "when trigger holds at the current event, the next event must satisfy next_action". Compiles to G(trigger -> X(next_action)).

from sponsio.patterns import workflow_step
from sponsio.formulas.formula import Atom

contract("toggle roaming on disabled status").guarantees(
    workflow_step(
        Atom("ctx", "roaming_status", "disabled"),
        Atom("called", "toggle_roaming"),
    )
)

Why it exists. Sponsio's existing patterns are all block-style: "you must not do X", "X requires Y first". workflow_step is the prescriptive counterpart: "you must do X next". Workflow-style policies ("if you observe X, the next step is Y") map directly onto the pattern without bending the contract into an awkward never-followed-by.

Why it's good for users.

  • Both arguments are arbitrary atoms. called(...), ctx(k, v), arg_field_has(...) all work in either position, so the same factory covers tool ordering, ctx-driven remediation, and arg-conditional follow-ups.
  • One-step bounded. Unlike the F-style always_followed_by, workflow_step decides after a single event. No liveness obligation hanging at session end.
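
One way to read G(trigger -> X(next_action)) over a finite trace is sketched below. This is not Sponsio's evaluator: the event shape and predicate style are assumptions, and the treatment of a trigger on the last event (nothing to check yet) is one interpretation of a weak next operator.

```python
from typing import Callable, Sequence

Event = dict
Predicate = Callable[[Event], bool]


def holds_workflow_step(
    trace: Sequence[Event], trigger: Predicate, next_action: Predicate
) -> bool:
    """Check G(trigger -> X(next_action)) over a finite trace."""
    for i, event in enumerate(trace):
        if not trigger(event):
            continue
        if i + 1 == len(trace):
            continue  # weak next: no successor yet, so nothing to violate
        if not next_action(trace[i + 1]):
            return False
    return True


def roaming_disabled(event: Event) -> bool:
    return event.get("ctx", {}).get("roaming_status") == "disabled"


def calls_toggle(event: Event) -> bool:
    return event.get("tool") == "toggle_roaming"


good = [
    {"tool": "check_status", "ctx": {"roaming_status": "disabled"}},
    {"tool": "toggle_roaming", "ctx": {}},
]
bad = [
    {"tool": "check_status", "ctx": {"roaming_status": "disabled"}},
    {"tool": "send_sms", "ctx": {}},
]

print(holds_workflow_step(good, roaming_disabled, calls_toggle))
print(holds_workflow_step(bad, roaming_disabled, calls_toggle))
```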

3. Five benchmark contract libraries

What they are. Hand-curated YAML libraries that reproduce Sponsio's published benchmark headline numbers:

| Library | Benchmark | Contracts |
| --- | --- | --- |
| sponsio:benchmark/redcode_exec | RedCode-Exec dangerous-snippet detection | 26 |
| sponsio:benchmark/odcv_bench | ODCV-Bench KPI-pressure protection | 19 + per-scenario LLM-scan cache |
| sponsio:benchmark/tau2_bench | τ2-bench procedural-correctness | 120 materialised contracts |
| sponsio:benchmark/agentdojo | AgentDojo prompt-injection / lethal-trifecta defence | 31 |
| sponsio:benchmark/swebench | SWE-bench Verified procedural-correctness | ~20 per instance |

Load like a capability pack:

agents:
  my_bot:
    include:
      - sponsio:benchmark/redcode_exec
      - sponsio:benchmark/odcv_bench

Why they exist. The numbers in the benchmark documents (95.6% on ODCV-Bench, 92% combined on RedCode, 0.746 AUC on τ2-bench) are reproducible only if the exact contracts are available. The libraries are the documentation-of-record for those results.

Why they're good for users.

  • Reproducibility. The published numbers stop being "trust us" and become "run this script on this YAML".
  • Forks-as-starting-points. Most rules tagged code-execution or code-quality generalise; a handful are calibrated to dataset-specific markers. The library is meant to be forked, edited, and pruned, not used verbatim in production.
  • Cross-runtime. The YAML loads identically on the Python guard and on the TypeScript SDK. Both runtimes ship the same five files.

4. TypeScript SDK reaches parity on the deterministic core

The TS SDK (@sponsio/sdk) now mirrors:

  • The Term abstraction and all four runtime-bound term classes (ArgValue, CtxValue, UnaryFn, ArgLength).
  • The workflowStep(trigger, nextAction, desc?) pattern factory.
  • The five benchmark contract YAML libraries under ts/packages/sdk/contracts/benchmark/.
  • Grounding that emits arg_value(tool, field) and ctx_value(key) on every event.
  • The textual (formula, trace) -> verdict round-trip parser, which now accepts the three new term tokens.

Verdicts agree on both runtimes for any contract built from primitives that exist in both: the same (formula, trace) pair always produces the same outcome.


Upgrading

pip install --pre sponsio==0.2.0a2

No breaking changes vs 0.2.0a1. Existing contracts continue to compile and behave identically. The new primitives are additive.

Compatibility

  • Var and Const are now Term subclasses. The ArithExpr type is an alias for Term, so existing type hints keep working.
  • Valuation (TS) is now Record<string, unknown>. If your TypeScript code stored boolean / number atoms with an explicit Record<string, boolean | number> typing, narrow at the call site or upcast as needed.
  • No CLI or config schema changes. sponsio validate, sponsio onboard, sponsio.yaml all unchanged.

Known limitations

  • TS's parseNl() does not yet recognise workflow_step or the Term comparison forms as natural-language strings. The factories ARE available for direct construction; only the NL parser is behind. See docs/reference/ts-sdk-parity.md.
  • TypeScript SDK still does not ship a DFA-compiled evaluator (only the recursive one). Verdicts agree, but the DFA path is faster on long traces. This stays on the roadmap.

What's next

  • TS NL parser port for workflow_step and the Term forms.
  • TS DFA-compiled evaluator port.
  • Continue closing the v0.2 strategy system gap on TS (RedirectToSafe dispatch in @sponsio/sdk/langchain, EscalateToHuman.notify callback hooks).

If you are using 0.2.0a2 and hit something we did not predict, open an issue.

v0.2.0a1: softer landings (alpha)

07 Jun 01:24
@yfxiao16 yfxiao16

Pre-release

Sponsio 0.2.0a1: softer landings

Released: 2026-06-06 · Status: alpha · pip install --pre sponsio==0.2.0a1

Note on the version. The "softer landings" work was developed against 0.2.0a0; the alpha that actually shipped to PyPI is 0.2.0a1. The bump exists because the 0.2.0a0 upload to TestPyPI had relative image paths in its README that PyPI's renderer does not resolve, and PyPI does not allow re-uploading a version even after deletion. No runtime changes between 0.2.0a0 and 0.2.0a1.

Until 0.2, every Sponsio contract had effectively one failure mode: block the call and let the agent figure it out. That worked for the "AI tried to rm -rf /" demo, but in production it meant brittle agent loops bouncing off refusals every time the policy fired.

0.2 ships three softer landings that keep the agent making progress while still gating the unsafe behavior, plus a few smaller fixes that round out the failure-strategy surface.


What's new

1. tool_policy: default-deny tool access

What it is. A declarative YAML block (or inline kwarg) that says "the agent can only call tools in approved:. Anything else is denied."

tool_policy:
  default: deny
  approved: [search, read_file, list_dir]

Why it exists. Adding a new tool to your agent framework would silently expand the agent's authority. With tool_policy, the policy is the single source of truth for what the agent can reach. Adding a tool to your codebase is a deliberate act of trust; you have to put its name in approved: to make it callable.

Why it's good for users.

  • Audit-friendly. The allowlist is the artifact you show in a security review. One file, one list, one source of truth.
  • Prompt-injection-resistant. Combined with enforcement: proactive (below), denied tools never reach the agent's prompt. An attacker who tricks the model into asking for shell_exec finds that shell_exec does not exist in the model's available tools.
  • Backwards-compatible. Default is allow, so existing yaml files keep working byte-for-byte. Users opt in to deny.
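
The decision rule itself is small. The sketch below applies it to a plain dict shaped like the YAML block above; `is_tool_allowed` is a hypothetical helper, not part of the Sponsio API.

```python
def is_tool_allowed(tool: str, policy: dict) -> bool:
    """Decide tool access from a tool_policy-shaped mapping."""
    if tool in policy.get("approved", []):
        return True
    # With no explicit default, fall back to "allow" (the documented default).
    return policy.get("default", "allow") == "allow"


deny_policy = {"default": "deny", "approved": ["search", "read_file", "list_dir"]}

print(is_tool_allowed("search", deny_policy))
print(is_tool_allowed("shell_exec", deny_policy))
print(is_tool_allowed("shell_exec", {}))  # no policy block: existing behavior
```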

2. enforcement: proactive + filter_tools: proactive tool filtering

What it is. Two paths to the same outcome: shrink the tool menu the agent sees down to the subset that is currently legal.

  • enforcement: proactive (wrap-time). Set on tool_policy. The LangGraph, CrewAI, OpenAI Agents SDK, and Google ADK adapters strip denied tools from the bound toolset at wrap() time. The model literally never sees them.
  • filter_tools(candidates) (per-turn). Pure-probe API on the guard. Returns the subset of tool names that will not be blocked given the live trace. Useful in custom loops where the application owns the LLM call site.

Why it exists. Reactive blocking (the agent tries, gets refused, tries again) wastes tokens and turns. For static rules (default-deny allowlist) the answer does not change between turns; for temporal rules (must_precede(A, B) only allows B after A) the answer changes per turn. Both should be reflected in what the agent sees, not what gets refused on the back end.

Why it's good for users.

  • No wasted attempts. The model does not burn turns on tools it cannot actually call.
  • Cleaner prompts. Fewer tools in the prompt means fewer distractors and a smaller token bill.
  • Works with any framework that supports custom loops. filter_tools is the universal hook; the proactive wrap-time variant is the zero-configuration version for the four adapters above.
  • Side-effect free. filter_tools is a pure probe: no log entry, no callback fanout, no perf sample contamination. Safe to call before every model turn.
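
To show why the per-turn answer changes under a temporal rule, here is a simplified pure probe. It is not the guard's filter_tools: the explicit `trace` and `must_precede` parameters, and the trace being a list of tool names, are assumptions made so the example is self-contained.

```python
def filter_tools(candidates, trace, must_precede):
    """Return the candidates that would not be blocked, without touching the trace.

    `must_precede` maps a tool to the tool that has to appear earlier.
    """
    seen = set(trace)
    legal = []
    for tool in candidates:
        prerequisite = must_precede.get(tool)
        if prerequisite is None or prerequisite in seen:
            legal.append(tool)
    return legal


rules = {"issue_refund": "check_policy"}
menu = ["check_policy", "issue_refund"]

print(filter_tools(menu, [], rules))                # before check_policy has run
print(filter_tools(menu, ["check_policy"], rules))  # after it has run
```

The function only reads its inputs, which is the property that makes it safe to call before every model turn.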

3. redirect_to_safe: substitute, do not block

What it is. A pattern + strategy combo that, on violation, substitutes the model's chosen tool with a pre-declared safe alternative.

contract("trash instead of rm").guarantees(redirect_to_safe("rm_rf", "trash"))

The model calls rm_rf; Sponsio rolls that event back from the trace, the LangGraph adapter invokes trash with the same arguments, the trace records the substitute call. From the model's perspective, the call succeeded.

Why it exists. A hard block forces the agent to bail out of the current task. A redirect keeps it making progress on a safer path. Most "destructive vs recoverable" tool pairs (rm_rf vs trash, issue_refund vs log_refund_request, force_push vs open_pull_request) are good candidates for this.

Why it's good for users.

  • Agent does not have to learn to recover from policy violations. The recovery is built into the policy.
  • Audit trail reflects what actually executed. The trace records the safe substitute, not the attempted-and-blocked unsafe call. Counters (rate_limit(unsafe, N)) do not tick on the rollback.
  • Composes with conditional contracts. assume(...).guarantees(redirect_to_safe(...)) makes the substitution conditional on a precondition (for example, redirect refunds over $10k while letting smaller ones through).
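
The rollback-then-substitute flow can be sketched as follows. This is a toy dispatcher, not the LangGraph adapter: the `redirects` mapping, the `tools` registry, and the trace as a list of tool names are all assumptions.

```python
def dispatch(name, args, tools, redirects, trace):
    """Run `name`, or its declared safe substitute, and record what actually ran."""
    trace.append(name)          # the attempted call is recorded first
    safe = redirects.get(name)
    if safe is not None:
        trace.pop()             # roll the unsafe event back
        trace.append(safe)      # record the substitute instead
        name = safe
    return tools[name](**args)


tools = {
    "rm_rf": lambda path: f"deleted {path}",
    "trash": lambda path: f"trashed {path}",
}
trace = []
result = dispatch("rm_rf", {"path": "/tmp/x"}, tools, {"rm_rf": "trash"}, trace)

print(result)
print(trace)
```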

4. EscalateToHuman(notify=[...]): notifier hooks

What it is. The escalate strategy now accepts a callable or list of callables (Slack webhook, email sender, oncall pager) that fire synchronously when the contract trips.

EscalateToHuman(
    reason="refund > $10k requires CFO approval",
    notify=[slack_oncall, email_finance_lead],
)

Why it exists. Until 0.2, EscalateToHuman differed from DetBlock only in the action literal and the agent-facing message. No actual side effect, no notification, no out-of-band reach to a human. 0.2 makes the notification real.

Why it's good for users.

  • Isolated failures. A broken Slack webhook does not crash the agent loop and does not silence the remaining notifiers; the exception becomes a RuntimeWarning naming the offending callable.
  • Composable with DetBlock for hard refuse + notify. If you want the call gated AND the page fired, pair DetBlock with monitor.register_callback. The case study at examples/integrations/python/v0_2_finance_escalate_vanilla.py shows the pattern.
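
The isolation behavior can be approximated with a loop like the one below. It is a sketch, not the EscalateToHuman implementation: `fire_notifiers` is a hypothetical helper, and passing the reason string to each callable is an assumption about the notifier signature.

```python
import warnings


def fire_notifiers(notify, reason):
    """Call every notifier; a failure in one must not stop the others."""
    callables = notify if isinstance(notify, (list, tuple)) else [notify]
    for fn in callables:
        try:
            fn(reason)
        except Exception as exc:
            name = getattr(fn, "__name__", repr(fn))
            warnings.warn(f"notifier {name} failed: {exc}", RuntimeWarning, stacklevel=2)


delivered = []


def broken_slack(reason):
    raise ConnectionError("webhook unreachable")


def email_finance_lead(reason):
    delivered.append(reason)


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    fire_notifiers([broken_slack, email_finance_lead], "refund needs approval")

print(delivered)
print([str(w.message) for w in caught])
```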

Smaller fixes

  • sponsio mode <observe|enforce> CLI is now parent-aware. Prefers updating runtime.mode (the only line the TS loader reads), falls back to defaults.mode, refuses to append a fresh enforce block when neither exists. CI scripts that relied on the old exit-1 behavior for malformed configs keep working.
  • LangGraph adapter rejects chained redirects and self-redirects. A contract that says "redirect A to B" combined with another saying "redirect B to C" no longer silently executes B; both raise ToolCallBlocked with a clear chain-naming error.
  • Pattern factories uniformly accept desc=. Including redirect_to_safe, which previously did not and silently broke LLM-extracted rules.
  • TS SDK gets redirectToSafe (formula side; runtime strategy bundle is Python-only for now).
  • Discovery replay_formula now passes content_atoms to grounding. Historical-trace replay against contracts referencing contains(pii) / arg_has(...) no longer silently returns false negatives.
  • render/components.contracts_table wraps the name column in Text(name). Rich was eating bracketed contract descriptions (only [search, read_file] approved) as malformed markup.
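
As an illustration of the chained-redirect and self-redirect rejection listed above, a check over a set of redirect pairs could look like this. The sketch validates a plain mapping up front and raises ValueError; the adapter itself is described as raising ToolCallBlocked, and `validate_redirects` is a hypothetical name.

```python
def validate_redirects(redirects):
    """Reject self-redirects and chains in an unsafe -> safe mapping."""
    for unsafe, safe in redirects.items():
        if unsafe == safe:
            raise ValueError(f"self-redirect: {unsafe} -> {safe}")
        if safe in redirects:
            raise ValueError(
                f"chained redirect: {unsafe} -> {safe} -> {redirects[safe]}"
            )


validate_redirects({"rm_rf": "trash"})  # a single hop passes silently

for candidate in ({"a": "b", "b": "c"}, {"a": "a"}):
    try:
        validate_redirects(candidate)
    except ValueError as exc:
        print(exc)
```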

Upgrading

This is an alpha, so pip install sponsio still pulls 0.1.1. To try 0.2.0a1:

pip install --pre sponsio==0.2.0a1

Run the verification script to confirm:

python scripts/verify_v0_2.py

15 checks across core runtime + four adapters. Adapters with the SDK not installed are skipped rather than failed.

Compatibility

  • No breaking changes to the 0.1.x API. Every yaml file, every Sponsio(...) call, every contract factory call from 0.1.1 still works.
  • tool_policy.default is allow by default. You opt into deny.
  • enforcement is reactive by default. You opt into proactive.
  • EscalateToHuman() with no notify= argument behaves exactly as in 0.1.x.

Real-LLM verification

The v0.2 surface was end-to-end verified against Gemini 2.5 Flash through a LangGraph react agent (not just scripted tool calls). See examples/integrations/python/v0_2_real_llm_refund_langgraph.py for the runnable script.

What the verification confirmed under a real model:

  • enforcement: proactive strips the bound tool set in the prompt. The model saw 3 tools (check_policy, issue_refund, log_refund_request), not 4. delete_customer was completely absent. Prompt-injection attempts to call it have nothing to bind to.
  • redirect_to_safe is transparent to the model. Gemini called issue_refund(customer_id="C-42", amount=5000), the LangGraph adapter substituted log_refund_request, the model read back the ticket-opened result, and adapted its final reply to "Your refund for $5,000 has been submitted and is currently under review". The model did not claim a successful refund. It described what actually ran (the substitute call), not the original unsafe call.
  • Trace integrity. Only log_refund_request events recorded; zero issue_refund survived. Downstream counters and rate limits would see only the substitute call.

The script auto-loads .env from the repo root, so a GOOGLE_API_KEY=AIza... line is all you need:

GOOGLE_API_KEY=AIza... python examples/integrations/python/v0_2_real_llm_refund_langgraph.py

Cross-check with the verification harness for cross-integration sanity:

python scripts/verify_v0_2.py

15 checks across the core runtime and four adapters. Adapters with the SDK not installed skip rather than fail.

Known limitations

  • redirect_to_safe runtime dispatch is implemented only in the LangGraph adapter. CrewAI / Agent...

v0.1.1 — fix missing pyyaml core dependency

23 May 00:27
@yfxiao16 yfxiao16

Patch release: pyyaml is now a core dependency (closes #61). A base pip/pipx install shipped without it, crashing 'sponsio host install' with ModuleNotFoundError. Upgrade: pipx upgrade sponsio


v0.1.0 — first stable open-source release

06 May 14:47
@yfxiao16 yfxiao16

Open-source launch build. Closes the missing-implementation gap in 0.1.0a3
(CLI imported sponsio.daemon / sponsio.plugin.append_ops but the wheel
shipped without them) and tunes the bundled capability rules.

Added

  • sponsio.daemon — Unix-socket IPC server + client + handlers; powers
    the privileged-process side of sponsio plugin append so a system install
    can give kernel-level (separate-UID) self-modify protection.
  • sponsio plugin append — structurally-additive merge from a staging
    YAML into a host bucket library; the only blessed write path through the
    self-modify pack.

Changed

  • Capability/shell pack — drop session-wide rate_limit(exec, 50) and
    loop_detection(exec, 20). The 24-hour cross-session trace store turned
    these into rolling caps that false-positived heavy interactive work; the
    targeted arg_blacklist and confirm-gate rules already cover the real
    attacks.
  • Capability/self-modify pack — extend protection to the upstream
    sponsio package (contract bundles + engine .py) so an editable / --user
    / venv install can't be used as an "edit the bundle to silence the rule"
    bypass. Maintainer workflow: override with customized: {match: {source: "library:tier1.self-modify"}, disabled: true}.
  • Onboard wizard — drop redundant trailing "mode flip" hint (axis 3
    already asks); language-aware bare-loop guard API hint
    (guardBefore/guardAfter for TS, guard_before/guard_after for Python).

Fixed

  • sponsio --version was hardcoded to "0.2.0a0" in the Click
    version_option; now reads sponsio.__version__ so it tracks
    pyproject.toml automatically.
  • 0.1.0a3 wheel was missing sponsio/daemon/ and sponsio/plugin/append_ops.py,
    causing sponsio plugin append and sponsio daemon ... to ImportError on a
    fresh pip install. 0.1.0 ships them.