MathisWellmann/symbiont


Symbiont Agent Harness


An agentic feedback loop that evolves Rust functions: an LLM generates type-safe code, the harness compiles and hot-swaps it into your running binary, evaluates the results, and feeds performance back to the LLM for the next iteration — bare-metal execution, zero interpreter overhead.

How it works

flowchart LR
 A["LLM writes\nRust function"] --> B["Constrained\nGeneration"]
 B -->|"validate + compile"| C["Native .so"]
 C -->|"hot-swap"| D["Running\nBinary"]
 D -->|"bare-metal\nexecution"| E["Evaluate"]
 E -->|"feedback"| A
 style A fill:#1a1a2e,stroke:#e94560,color:#eee
 style B fill:#16213e,stroke:#e94560,color:#eee
 style C fill:#0f3460,stroke:#e94560,color:#eee
 style D fill:#0f3460,stroke:#e94560,color:#eee
 style E fill:#1a1a2e,stroke:#e94560,color:#eee

Declare function signatures with the evolvable! macro and provide an evaluation function. The agent autonomously implements and refines the code each iteration; the harness validates, compiles, and hot-swaps the native code into the running process without a restart.

Constrained generation is what makes this reliable: the harness enforces that LLM output is valid Rust, matches the declared function signature, and compiles successfully. When any check fails, the specific error (parse failure, signature mismatch, or compiler diagnostics) is appended to the prompt and the LLM retries automatically until it produces correct code.
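The retry behaviour described above can be sketched in plain Rust. This is a minimal illustration, not the harness's actual code: Rejection, generate_until_valid, the generate and check closures, and the max_attempts cap are all hypothetical names and choices introduced here. The real harness drives an LLM and the Rust compiler where this sketch takes closures.

```rust
/// Why a candidate was rejected. The variants mirror the three checks named
/// in the text; the type itself is hypothetical.
#[derive(Debug, Clone, PartialEq)]
enum Rejection {
    Parse(String),
    SignatureMismatch { expected: String, found: String },
    Compile(String),
}

impl Rejection {
    /// Renders the error as text that can be appended to the prompt.
    fn as_feedback(&self) -> String {
        match self {
            Rejection::Parse(msg) => format!("Parse failure: {}", msg),
            Rejection::SignatureMismatch { expected, found } => {
                format!("Signature mismatch: expected `{}`, found `{}`", expected, found)
            }
            Rejection::Compile(diag) => format!("Compiler diagnostics:\n{}", diag),
        }
    }
}

/// Calls `generate` until `check` accepts the output or `max_attempts` is
/// exhausted. Every rejection is appended to the prompt for the next attempt.
/// The attempt cap is an assumption of this sketch.
fn generate_until_valid(
    base_prompt: &str,
    max_attempts: usize,
    mut generate: impl FnMut(&str) -> String,
    check: impl Fn(&str) -> Result<(), Rejection>,
) -> Result<String, Rejection> {
    let mut prompt = base_prompt.to_string();
    let mut last = Rejection::Parse("no attempts were made".to_string());
    for _ in 0..max_attempts {
        let code = generate(&prompt);
        match check(&code) {
            Ok(()) => return Ok(code),
            Err(rejection) => {
                prompt.push_str("\n\nThe previous attempt was rejected.\n");
                prompt.push_str(&rejection.as_feedback());
                last = rejection;
            }
        }
    }
    Err(last)
}
```

Passing the generator and the checker as closures keeps the loop testable without a model or a compiler: a stub generator that only produces the right signature once the prompt contains "Signature mismatch" succeeds on its second attempt.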

Quick start

symbiont::evolvable! {
    fn step(counter: &mut usize) {
        // default implementation body, entirely evolved by the LLM
        *counter += 1;
        println!("doing stuff in iteration {}", counter);
    }
}

#[tokio::main]
async fn main() -> symbiont::Result<()> {
    let runtime = symbiont::Runtime::new(SYMBIONT_DECLS, SYMBIONT_PRELUDE, symbiont::Profile::Debug).await?;
    let agent = symbiont::inference::init_agent(None)?;
    let fn_sigs = runtime.fn_sigs();
    let base_prompt = format!(
        "Give a concise implementation for this function signature: ```{}```, \
         that increments the counter by a constant in the range (5..20). \
         Give Rust Code Only.",
        fn_sigs[0]
    );
    let mut counter = 0;
    let mut last_evolution = std::time::Instant::now();
    loop {
        step(&mut counter); // bare-metal: calls into the hot-loaded native dylib
        println!("counter: {counter}");
        if last_evolution.elapsed() >= std::time::Duration::from_secs(10) {
            // LLM rewrites the function, harness validates + compiles + hot-swaps
            runtime.evolve(&agent, &base_prompt).await?;
            last_evolution = std::time::Instant::now();
            // The new agent-written code runs natively the next time `step` is called.
        }
    }
}

The example shows a basic counter function whose implementation the agent evolves from a user-defined prompt. The compiled dylib containing the function is hot-swapped between iterations of the evaluation loop, so every call runs as native code. This is agentic code mode in action: the harness provides constrained generation and adjusts the LLM prompt whenever a check fails.

[Video: symbiont-counter-example.mp4]

See the Development setup section and the examples/ directory for more.

Showcase: evolving a trading strategy

The evolving-trader-example is the most complete demonstration of what symbiont can do. An LLM evolves a futures trading strategy as compiled Rust against a realistic exchange simulation:

  • ~1M raw BitMEX XBTUSD trades are aggregated into information-driven volume candles with trade_aggregation.
  • Executions are simulated with the leveraged futures exchange lfest — taker fees, bid-ask spread and margin requirements included.
  • The evolvable fn decide(candles: &[Candle], account: &AccountState) -> Action receives a sliding window of candles plus the account state and returns a market-order action.
  • Each round, the backtest report (return, buy & hold benchmark, drawdown, Sharpe, fees, rejected orders) is fed back to the LLM; the best strategy is evaluated on a held-out test segment it has never seen.

The LLM must discover features (momentum, volatility, order flow), position sizing and fee-awareness — quantitative reasoning expressed as hot-swapped native code.
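To make the feedback step more concrete, here is a rough sketch of how a few of the report numbers could be derived from an equity curve (account value after each step). Report, summarize and feedback_line are hypothetical names. The Sharpe definition used here (mean over standard deviation of per-step returns, risk-free rate of zero, not annualized) is an assumption, and the example's actual report may compute these values differently.

```rust
/// Summary numbers derived from an equity curve. Hypothetical type.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Report {
    /// final / initial - 1
    total_return: f64,
    /// Largest peak-to-trough decline, as a fraction of the peak.
    max_drawdown: f64,
    /// Mean / standard deviation of per-step returns (assumed definition).
    sharpe: f64,
}

/// Returns `None` if the curve has fewer than two points or a non-positive value.
fn summarize(equity: &[f64]) -> Option<Report> {
    if equity.len() < 2 || equity.iter().any(|v| *v <= 0.0) {
        return None;
    }
    let total_return = equity[equity.len() - 1] / equity[0] - 1.0;

    // Track the running peak and the deepest fall below it.
    let mut peak = equity[0];
    let mut max_drawdown = 0.0_f64;
    for &value in equity {
        peak = peak.max(value);
        max_drawdown = max_drawdown.max((peak - value) / peak);
    }

    // Simple per-step returns and their population standard deviation.
    let returns: Vec<f64> = equity.windows(2).map(|w| w[1] / w[0] - 1.0).collect();
    let n = returns.len() as f64;
    let mean = returns.iter().sum::<f64>() / n;
    let variance = returns.iter().map(|r| (r - mean).powi(2)).sum::<f64>() / n;
    let sharpe = if variance > 0.0 { mean / variance.sqrt() } else { 0.0 };

    Some(Report { total_return, max_drawdown, sharpe })
}

/// One line of text that could be appended to the next evolution prompt.
fn feedback_line(report: &Report) -> String {
    format!(
        "return {:.2}%, max drawdown {:.2}%, sharpe {:.3}",
        report.total_return * 100.0,
        report.max_drawdown * 100.0,
        report.sharpe
    )
}
```

For the curve [100, 120, 90, 110] this gives a total return of about 10% and a maximum drawdown of 25% (the fall from 120 to 90).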

cargo run -p evolving-trader-example

Showcase: evolving a live fractal shader

The fractal-studio-example is an interactive egui window whose per-pixel shader is written by the LLM and hot-swapped into the running binary as optimized native code. Type a prompt — "an animated Julia set, c orbiting the main cardioid, with a glowing sunset palette" — and the agent implements fn shade(x: f64, y: f64, t: f64) -> u32; the live animation morphs in place, no restart:

[Video: fractal-studio-example.mp4]

  • shade is called once per pixel (~0.5M calls/frame at ×540), parallelized over all cores with rayon; an interpreted agent-code loop would be orders of magnitude too slow to animate.
  • The user is the evaluator: the runtime keeps the chat history, so follow-up prompts refine the current shader.
  • Agent code panics are caught inside the dylib (rendered as black pixels) and fed back into the next evolution prompt.
cargo run -p fractal-studio-example --release
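
For a sense of what the agent might write, here is a hand-written sketch of a body for the shade signature quoted above. Only the signature comes from the text. The coordinate range, the meaning of t as seconds, the 0x00RRGGBB colour packing, and the choice of moving the Julia constant on a plain circle (simpler than orbiting the cardioid) are all assumptions of this sketch.

```rust
/// Hypothetical shader body for the signature named in the text.
/// Assumptions: `x` and `y` are plane coordinates of roughly [-1.5, 1.5],
/// `t` is elapsed time in seconds, and the result is packed as 0x00RRGGBB.
fn shade(x: f64, y: f64, t: f64) -> u32 {
    const MAX_ITER: u32 = 64;

    // Julia constant c moving on a circle of radius 0.7885 as time advances.
    let (cr, ci) = (0.7885 * t.cos(), 0.7885 * t.sin());

    // Iterate z -> z^2 + c until |z| exceeds 2 or the budget runs out.
    let (mut zr, mut zi) = (x, y);
    let mut i = 0;
    while i < MAX_ITER && zr * zr + zi * zi <= 4.0 {
        let next_r = zr * zr - zi * zi + cr;
        zi = 2.0 * zr * zi + ci;
        zr = next_r;
        i += 1;
    }

    // Points that never escape are drawn black.
    if i == MAX_ITER {
        return 0x0000_0000;
    }

    // Map the escape time to a warm palette: red rises fastest, blue slowest.
    let v = i as f64 / MAX_ITER as f64;
    let r = (255.0 * v.sqrt()) as u32;
    let g = (255.0 * v) as u32;
    let b = (255.0 * v * v) as u32;
    (r << 16) | (g << 8) | b
}
```

The function is pure and has no shared state, which is what makes it safe to call from many threads at once, as the per-pixel parallel loop requires.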

Core highlights

  • Type-safe agentic code: Agents express intent as Rust functions with enforced signatures.
  • Constrained generation: Parse errors, signature mismatches, and compiler diagnostics steer the LLM until it produces valid code.
  • Hot-swap dylibs: Functions are compiled to native shared libraries and swapped in-place via libloading — no process restart.
  • Bare-metal performance: Evolved functions run as native compiled code. The dispatch overhead is ~1 ns per call (a single atomic pointer load + indirect call). The hot path is fully lock-free and multi-thread safe.
  • Plug-in inference: Any inference provider is supported via rig.
  • Tool calling: Register any rig Tool on the agent and it becomes available during evolution — rig drives the multi-turn tool-calling loop internally while the harness consumes only the final code. This lets the agent gather information (run tests, probe black-box systems, query data) before committing to an implementation. See the tool-calling-example.
  • Tiny Core: Only ~1000 LOC for the Agent harness and constrained generation part.
  • Catches agent code panics: Any LLM-generated code that panics at runtime is caught with catch_unwind, and the panic message is fed back into the prompt as backpressure. See unwind.rs for details.
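
The panic-catching highlight can be illustrated with the standard library alone. The sketch below is not the contents of unwind.rs; run_guarded is a hypothetical helper that shows how std::panic::catch_unwind turns a panic into a message string that could then be placed in a prompt. It relies on the default panic=unwind setting; with panic=abort nothing can be caught.

```rust
use std::panic::{self, AssertUnwindSafe};

/// Runs `f` and converts a panic into its message text.
/// Hypothetical helper; the name and return type are choices of this sketch.
fn run_guarded<T>(f: impl FnOnce() -> T) -> Result<T, String> {
    // AssertUnwindSafe: the caller promises not to rely on state that the
    // panicking closure may have left half-updated.
    panic::catch_unwind(AssertUnwindSafe(f)).map_err(|payload| {
        // Panic payloads are usually `&'static str` (literal message)
        // or `String` (formatted message).
        if let Some(message) = payload.downcast_ref::<&str>() {
            message.to_string()
        } else if let Some(message) = payload.downcast_ref::<String>() {
            message.clone()
        } else {
            "panic with a non-string payload".to_string()
        }
    })
}
```

A call such as run_guarded(|| evolved_function(input)) then yields either the function's result or a message like "index out of bounds: ..." that can be quoted back to the LLM. The default panic hook still prints the message to stderr unless it is replaced.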

When Symbiont wins

[Figure: positioning quadrant showing when Symbiont wins]

Use cases

  • Quantitative strategy evolution against a realistic market simulation.
  • Typed function body search (e.g., find an implementation that satisfies a test suite).
  • Performance optimization under functional equivalence.
  • Game AI / strategic reasoning through evolved code.
  • Interactive, human-in-the-loop visual evolution where the user is the evaluator.
  • Tool-augmented evolution, where the agent must discover the specification through tool calls before writing code.
  • Auto-research workflows with native-speed evaluation.
  • Black-box optimization of inputs that produce desired outputs, e.g. parameter search.
  • Self-evolving feature processing pipelines.
  • Agentic code evolution generally.

Development setup

The project uses Nix for reproducible builds and devenv to manage a local inference server.

Prerequisites: Nix with flakes enabled.

Set up your .env file like this for the next steps (or point it at your preferred inference provider):

export API_KEY=""
export BASE_URL="http://127.0.0.1:8321/v1"
export MODEL="google/gemma-4-E2B-it"

Then execute the following:

# Enter the development shell (provides Rust nightly, cargo tools, formatters)
nix develop
# Start a local llama-cpp server with gemma-4-E2B-it (auto-downloads on first run)
devenv up
# In another terminal, run the counter example
cargo run -p counter-example

Dispatch overhead

Function pointers are cached in AtomicPtr statics after each load — callers never touch a lock or perform a symbol lookup.

                        Time per call
Direct function call    0.91 ns
evolvable! dispatch     1.64 ns

Benchmark: cargo bench -p symbiont --bench dispatch_overhead

On reload, the runtime updates the atomic pointers and drops the old library. This is safe because the feedback loop contract guarantees no evolvable functions are executing during evolution — only one .so is loaded at any time.
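
The dispatch path described in this section (one atomic pointer load followed by an indirect call) can be sketched with the standard library. This is an illustration under assumptions, not the code the evolvable! macro generates: StepFn, STEP_IMPL, install, step_v1 and step_v2 are hypothetical names, and two ordinary functions stand in for symbols that the real harness would obtain from a freshly loaded dylib.

```rust
use std::sync::atomic::{AtomicPtr, Ordering};

/// Signature shared by every implementation of `step`.
type StepFn = fn(&mut usize);

// Two stand-ins for "old" and "new" compiled implementations.
fn step_v1(counter: &mut usize) {
    *counter += 1;
}

fn step_v2(counter: &mut usize) {
    *counter += 7;
}

/// Cached pointer to the current implementation, stored type-erased.
static STEP_IMPL: AtomicPtr<()> = AtomicPtr::new(step_v1 as *mut ());

/// Hot path: one atomic load plus one indirect call, no lock, no symbol lookup.
#[inline]
fn step(counter: &mut usize) {
    let raw = STEP_IMPL.load(Ordering::Acquire);
    // SAFETY: STEP_IMPL only ever holds pointers that were created from a
    // `StepFn`, so converting back to that exact type is sound.
    let current: StepFn = unsafe { std::mem::transmute::<*mut (), StepFn>(raw) };
    current(counter)
}

/// Reload path: publish a new implementation for all later calls.
fn install(new_impl: StepFn) {
    STEP_IMPL.store(new_impl as *mut (), Ordering::Release);
}
```

Calling step, then install(step_v2), then step again moves a counter from 0 to 1 to 8. The unsafe transmute is the same kind of invariant the Limitations section mentions: the pointer is only valid as long as the signature matches and the code it points to is still loaded.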

Per-evolution timings

The duration of a typical evolution cycle (LLM inference → constrained generation → compilation → function evaluation) depends heavily on:

  • The model being used (inference latency).
  • The size of the generated Rust code.
  • The optimization level of the compiled dylib.
  • Whether the LLM makes a mistake, in which case the cycle repeats with a new steering prompt.

Example timings for fizzbuzz-example using unsloth/Qwen3.6-35B-A3B-GGUF:UD-Q4_K_M on an RTX Pro 6000 Blackwell and llama-cpp (~150TPS):

Stage            Time
LLM inference    4852 ms
Harness checks   0 ms
Compilation      118 ms
Function eval    < 3 ns

Design the function evaluation pipeline with these proportions in mind. The fizzbuzz-example can be solved in one shot and its function evaluation is very cheap, so it is a toy example rather than a representative workload.
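
To put the proportions in numbers, the short calculation below works through the table's figures. The helper names are hypothetical, and the 3 ns evaluation cost is taken as the upper bound from the table, so the resulting call count is a lower bound.

```rust
use std::time::Duration;

/// How many evaluations of cost `per_eval` fit into one evolution cycle.
/// Returns `None` when `per_eval` is zero.
fn evals_per_cycle(cycle: Duration, per_eval: Duration) -> Option<u128> {
    cycle.as_nanos().checked_div(per_eval.as_nanos())
}

/// Fraction of the cycle spent waiting for the LLM.
fn inference_share(inference: Duration, cycle: Duration) -> f64 {
    inference.as_secs_f64() / cycle.as_secs_f64()
}

/// Stage times from the fizzbuzz table: returns (inference, whole cycle).
/// Harness checks are listed as 0 ms and therefore omitted.
fn fizzbuzz_cycle() -> (Duration, Duration) {
    let inference = Duration::from_millis(4852);
    let compilation = Duration::from_millis(118);
    (inference, inference + compilation)
}
```

With these inputs one cycle takes 4970 ms, about 97.6% of which is inference, and at 3 ns per call roughly 1.66 billion function evaluations would fit into the same wall-clock time. An evaluation that does substantially less work than that per cycle leaves the loop dominated by LLM latency.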

Limitations

These constraints arise from the binary/dylib interaction boundary. The harness mitigates most of them, but users should be aware:

  • Static function signatures: The LLM can only rewrite function bodies. The signature declared in evolvable! is fixed at compile time and enforced on every evolution. This is by design (it is what makes constrained generation possible), but it means the agent cannot add parameters, change return types, or introduce new functions at runtime. Hot-swapping a function with a different signature would be UB, because the main binary expects a specific memory layout.
  • Sequential feedback loop: All evolvable function calls must have returned before evolve() is called. The old library is dropped on reload, so in-flight calls through stale pointers would be UB. This matches the intended usage pattern (run functions, collect results, evolve, repeat) and is enforced with an assertion in debug builds at zero cost in release. Multi-threading is possible, but requires extra care.
  • Same toolchain required: Rust has no stable ABI. The binary and dylib must be compiled with the same rustc version to guarantee matching calling conventions and memory layouts. The harness ensures this by compiling the dylib on the same machine with the same toolchain.
  • Shared API crates for custom types: Evolvable signatures may use custom or upstream dependency types when the generated dylib is configured with matching Cargo dependencies. For single-package applications, move boundary types and methods into the package library target, re-export them from prelude, and initialize with DylibConfig::host_package(...) so the dylib depends on the host crate as host and imports host::prelude::*.
  • unsafe at the boundary: Dynamic symbol lookup is inherently unsafe. The harness validates function signatures against the evolvable! declaration and only loads code that parses, type-checks, and compiles — but the extern "Rust" pointer cast remains an unsafe invariant.
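
The sequential feedback loop constraint mentions an assertion that is active in debug builds only. One way such a check could be built is sketched below. This is a guess at the general shape, not the harness's implementation: IN_FLIGHT, CallGuard, call_evolvable and assert_quiescent are hypothetical names.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Number of evolvable calls currently executing (tracked in debug builds).
static IN_FLIGHT: AtomicUsize = AtomicUsize::new(0);

/// RAII guard: increments on entry, decrements when dropped, even on unwind.
struct CallGuard;

impl CallGuard {
    fn enter() -> Self {
        IN_FLIGHT.fetch_add(1, Ordering::SeqCst);
        CallGuard
    }
}

impl Drop for CallGuard {
    fn drop(&mut self) {
        IN_FLIGHT.fetch_sub(1, Ordering::SeqCst);
    }
}

fn in_flight() -> usize {
    IN_FLIGHT.load(Ordering::SeqCst)
}

/// Wraps a call into evolved code. `cfg!(debug_assertions)` is a constant,
/// so release builds can drop the bookkeeping entirely.
fn call_evolvable<T>(f: impl FnOnce() -> T) -> T {
    let _guard = if cfg!(debug_assertions) {
        Some(CallGuard::enter())
    } else {
        None
    };
    f()
}

/// To be called right before the old library is dropped on reload.
fn assert_quiescent() {
    debug_assert_eq!(in_flight(), 0, "evolve() called while evolvable calls are in flight");
}
```

In a debug build, calling assert_quiescent from inside call_evolvable panics with the message above; in a release build both the counter updates and the assertion compile away. A counter like this only detects the violation, it does not prevent it, so multi-threaded callers still need their own synchronization around the reload.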

See CAVEATS.md for more details.

See also:

  • slopc for function body implementations at compile time, but no evolution or feedback cycles there.
  • hot-lib-reloader for the idea of hot-swapping functions at runtime.
  • GEPA for optimizing any system with textual parameters against any evaluation metric. But not Rust :(
  • Agentica for a Python Agent SDK, providing persistent REPL and sub-agents.

Also check out the TODOs file for what might come next for symbiont. Stay tuned!

License

Copyright (C) 2026 MathisWellmann

This project is licensed under the Mozilla Public License 2.0; see LICENSE for details.
