Verifiable Labs

Clean feedback and promotion gates for increasingly general AI agents.

Verifiable Labs builds clean feedback and promotion gates for increasingly general AI agents.

AI agents are getting better at passing the tests they were tuned on. Verifiable Labs helps agents improve through generated clean feedback loops, then verifies, before promotion, whether those improvements truly generalize to hidden, out-of-distribution (OOD), and adversarial scenarios the agent has never seen.

What we build

  • Evaluate — compile an evaluation contract from your agent's goal, generate public / hidden / OOD / adversarial scenarios, and score clean performance with contamination and hack-risk analysis.
  • Gate — a contamination-resistant promotion gate: clean verified-generalization score (CleanVGS), generalization gap, and an ACCEPT / REJECT / LIMITED_ROLLOUT decision with an assurance card.
  • Improve — human-reviewed improvement suggestions and candidate agent configs, re-verified by the gate. Improvements are never auto-applied.
  • Substrate — clean feedback records, transfer metrics, failure memory, and generated curriculum for teams building increasingly general agents.
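As a rough illustration of the Gate step, the sketch below maps a clean score and a generalization gap to one of the three decisions named above. Everything in it is an assumption made for illustration: the `GateInput` fields, the use of a plain mean score as a stand-in for CleanVGS, the definition of the gap as public score minus clean score, and the threshold values. It is not the Verifiable Labs implementation or the `vlabs-sdk` API.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    ACCEPT = "ACCEPT"
    LIMITED_ROLLOUT = "LIMITED_ROLLOUT"
    REJECT = "REJECT"


@dataclass(frozen=True)
class GateInput:
    # Hypothetical inputs, both assumed to lie in [0, 1].
    public_score: float  # mean score on scenarios the agent may have been tuned on
    clean_score: float   # mean score on hidden / OOD / adversarial scenarios


def generalization_gap(x: GateInput) -> float:
    """Assumed definition: how much better the agent does on public scenarios."""
    return x.public_score - x.clean_score


def decide(
    x: GateInput,
    accept_score: float = 0.8,   # hypothetical threshold
    limited_score: float = 0.6,  # hypothetical threshold
    max_gap: float = 0.1,        # hypothetical threshold
) -> Decision:
    """Toy promotion gate: accept only if the clean score is high AND the gap is small."""
    gap = generalization_gap(x)
    if x.clean_score >= accept_score and gap <= max_gap:
        return Decision.ACCEPT
    if x.clean_score >= limited_score:
        return Decision.LIMITED_ROLLOUT
    return Decision.REJECT


if __name__ == "__main__":
    for candidate in (GateInput(0.90, 0.85), GateInput(0.99, 0.82), GateInput(0.95, 0.40)):
        print(candidate, decide(candidate).value)
```

In this toy version, an agent that scores well on clean scenarios but shows a large gap is held at LIMITED_ROLLOUT rather than accepted, which is one plausible way to treat suspected overfitting to the public set.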

The privacy-preserving default is evaluate-only: nothing is exported, nothing is reused for training, and human review is required.
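One way to picture an evaluate-only default is a policy object whose fields all start in their most restrictive state. The class and field names below are hypothetical and chosen only to mirror the sentence above; they are not taken from the SDK.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunPolicy:
    # Hypothetical fields; defaults mirror the evaluate-only description.
    export_records: bool = False        # nothing is exported
    reuse_for_training: bool = False    # nothing is reused for training
    require_human_review: bool = True   # human review is required


def is_evaluate_only(policy: RunPolicy) -> bool:
    """True when the policy is in the most restrictive (default) state."""
    return (
        not policy.export_records
        and not policy.reuse_for_training
        and policy.require_human_review
    )


if __name__ == "__main__":
    print(is_evaluate_only(RunPolicy()))
    print(is_evaluate_only(RunPolicy(export_records=True)))
```

Making the dataclass frozen means a less restrictive policy has to be constructed explicitly rather than reached by mutating the default.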

Formal foundation

Selected mathematical properties behind the contamination-resistant promotion gate are machine-verified in Lean 4. The implementation is property-tested against the formal specification.

The Lean 4 development and its Python property-test mirror are open source in verifiable-labs-envs (formal/ and src/verifiable_labs_envs/formal_spec/).
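To give a feel for what a property-test mirror can look like, the sketch below checks one candidate property on random inputs: with the public score held fixed, raising the clean score never produces a worse decision. The toy `decide` function, its thresholds, and the choice of property are all assumptions for illustration. The properties actually proved in the Lean 4 development are not listed here, and the sketch uses only the standard library rather than whatever test tooling the repository uses.

```python
import random

# Ordering of decisions from worst to best, used to compare outcomes.
RANK = {"REJECT": 0, "LIMITED_ROLLOUT": 1, "ACCEPT": 2}


def decide(public_score, clean_score, accept_score=0.8, limited_score=0.6, max_gap=0.1):
    """Toy gate with hypothetical thresholds (not the real implementation)."""
    gap = public_score - clean_score
    if clean_score >= accept_score and gap <= max_gap:
        return "ACCEPT"
    if clean_score >= limited_score:
        return "LIMITED_ROLLOUT"
    return "REJECT"


def check_monotone_in_clean_score(trials=10_000, seed=0):
    """Search for a counterexample to monotonicity in the clean score.

    Returns None if no counterexample is found, otherwise the offending
    (public_score, lower_clean_score, higher_clean_score) triple.
    """
    rng = random.Random(seed)
    for _ in range(trials):
        public = rng.random()
        lower, higher = sorted((rng.random(), rng.random()))
        if RANK[decide(public, lower)] > RANK[decide(public, higher)]:
            return (public, lower, higher)
    return None


if __name__ == "__main__":
    print(check_monotone_in_clean_score())
```

A random search like this can only fail to find a counterexample. The appeal of pairing it with a machine-checked proof is that the proof covers all inputs while the test ties the running code to the stated property.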

Open core

Repository            What it holds
verifiable-labs-envs  SDK, 25 procedurally generated environments, formal track, CLI (Apache-2.0)
vlabs-sdk             SDK contracts: run modes, provider interface, schemas (pointer)
vlabs-formal          Lean 4 formal track + property-test mirror (pointer)
vlabs-examples        Public-safe examples and quickstarts
vlabs-evidence        Redacted sample assurance cards and aggregate metrics
vlabs-docs            Product and positioning documentation

The evaluation platform (scenario generation, contamination firewall, anti-hack engine, billing, API) is private. Hidden evaluation content, gold answers, detection details, customer data, and raw traces are never published — that separation is what keeps the feedback clean.

Published evidence

Public, synthetic / redacted demo evidence:

All published evidence is synthetic / redacted and is not a training dataset. It contains no customer data, hidden evaluations, gold answers, raw traces, private anti-hack traps, or private engine internals.

Install the SDK: pip install "vlabs-sdk==0.0.2"
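After installing, a quick way to confirm that the pinned version is the one in the environment is to query the package metadata with the standard library. The sketch below assumes only the distribution name and version from the install command above; the helper names are made up for this example.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


def matches_pin(dist_name="vlabs-sdk", pinned="0.0.2"):
    """True only if the distribution is installed at exactly the pinned version."""
    return installed_version(dist_name) == pinned


if __name__ == "__main__":
    found = installed_version("vlabs-sdk")
    if found is None:
        print("vlabs-sdk is not installed")
    else:
        print(f"vlabs-sdk {found} installed; matches pin: {matches_pin()}")
```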

Popular repositories

  1. verifiable-labs-envs (Public)

    Open-source SDK (Apache-2.0): RL environments, conformal calibration, a TRL-compatible reward function, the Lean 4 formal track, and the verifiable/vlabs CLI.

    Python

  2. .github (Public)

    Verifiable Labs organization profile

  3. vlabs-sdk (Public)

    Public SDK and CLI for clean promotion gates, typed assurance cards, provider interfaces, and formal-spec mirrors.

    Python

  4. vlabs-formal (Public)

    Lean 4 proofs and property-tested formal-spec mirrors for selected promotion-gate properties.

    Python

  5. vlabs-examples (Public)

    Runnable synthetic examples for Verifiable Labs clean-gate workflows.

    Python

  6. vlabs-evidence (Public)

    Synthetic/redacted evidence artifacts, assurance cards, and public reproducibility notes.

    Python
