Agentune

Open-source framework for continuously improving AI agents.

Agentune helps teams analyze, improve, and evaluate customer-facing AI agents through measurable, data-driven iterations — not guesswork.

Instead of tweaking prompts and hoping for the best, Agentune connects real conversations, context data, and simulations into a repeatable optimization loop that drives actual KPI improvements such as conversion, CSAT, and retention.

Why Agentune

Most agents are launched and left to stagnate — tuned by intuition, not evidence.

Agentune enables continuous agent improvement by combining analytics, optimization, and simulation in a single open framework:

Analyze – uncover what drives your agent’s KPIs up or down
Improve – generate actionable recommendations to lift performance
Simulate – safely test and benchmark improvements before deployment

The result: agents that don’t just respond — they learn what works.

The agentune-simulate library

Agentune Simulate is a separately installable library that enables you to create customer simulations to test and benchmark your agent's behavior before production.

Together with agentune, it forms the Analyze → Improve → Simulate loop — a disciplined framework for building smarter, higher-performing AI agents.

A future release will merge agentune-simulate into the main agentune package.

Real-World Use Cases

Agentune is built for teams who want to move beyond trial-and-error:

AI platform / infra teams managing production-grade agents across multiple domains or use cases
ML / data teams accountable for KPI impact, not just model accuracy
Product / ops teams who need to measure and harden conversational behavior before it reaches users

Common scenarios:

Diagnose why conversion or CSAT is dropping
Quantify which behaviors, intents, or flows impact KPIs
Test new prompt or policy versions safely
Continuously improve deployed agents over time

Agentune Analyze & Improve

Turn real conversations into insights that measurably improve your AI agents.

Agentune Analyze & Improve helps teams discover what drives an agent’s KPIs up or down — and generate concrete recommendations to enhance performance.
It transforms messy operational data into interpretable, data-driven actions that actually move business metrics.

Why It Matters

Most AI agents are optimized by intuition: a few sample chats, some prompt edits, and best guesses.

Agentune replaces guesswork with evidence.
Using structured and unstructured data from real conversations, it:

Identifies patterns that correlate with KPI outcomes
Surfaces interpretable insights (not opaque scores)
Recommends targeted changes to prompts, policies, and logic

No more trial-and-error tuning — just measurable improvement grounded in data.

For example, suppose you built a sales agent and now have a dataset of conversations, each labeled with an outcome of win, undecided, or lost. Agentune Analyze & Improve can surface which patterns or intents correlate with those outcomes and produce concrete recommendations for refining the agent’s playbook, such as how it handles discounts, competitor mentions, or shipping questions.
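To make the sales example concrete, here is a minimal sketch of what such a labeled dataset could look like, with a naive keyword check standing in for real pattern discovery. The `Conversation` record and `outcome_rates_by_keyword` helper are hypothetical names invented for illustration; they are not part of the Agentune API.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Conversation:
    """Hypothetical record for illustration; not an Agentune type."""
    conversation_id: str
    messages: tuple[str, ...]
    outcome: str  # "win", "undecided", or "lost"


def outcome_rates_by_keyword(conversations, keyword):
    """Return outcome shares for conversations that do / do not mention keyword."""
    counts = {True: Counter(), False: Counter()}
    for conv in conversations:
        mentioned = any(keyword.lower() in m.lower() for m in conv.messages)
        counts[mentioned][conv.outcome] += 1
    rates = {}
    for mentioned, counter in counts.items():
        total = sum(counter.values())
        rates[mentioned] = {o: n / total for o, n in counter.items()} if total else {}
    return rates


# Toy data, made up for this sketch.
conversations = [
    Conversation("c1", ("Do you offer a discount?", "Yes, 10% this week."), "win"),
    Conversation("c2", ("How long is shipping?", "About five days."), "lost"),
    Conversation("c3", ("Any discount for students?", "Not currently."), "lost"),
    Conversation("c4", ("Is there a discount?", "Yes."), "win"),
]
rates = outcome_rates_by_keyword(conversations, "discount")
```

Here `rates[True]` holds the outcome shares for conversations that mention a discount and `rates[False]` the shares for those that do not. A real analysis would need far more data and richer features than a keyword match.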

How It Works

Agentune Analyze & Improve follows a transparent, two-step process:

1. Analyze

Ingests conversations, outcomes, and optional context data (e.g., product, policy, CRM).
Generates semantic and structural features that capture patterns in language, behavior, or flow.
Selects statistically significant features correlated with KPI changes — these become your drivers of performance.

Example insights:

"Mentions of competitors early in chat increase conversion probability."
"Discount discussion combined with shipping-time questions lowers CSAT."
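The selection step above can be illustrated with a small, self-contained sketch. This README does not specify which statistical test Agentune uses, so the example below swaps in a two-sided Fisher exact test on binary features and a binary KPI as a stand-in. The function names, feature names, and the 0.05 threshold are all assumptions made for illustration.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of seeing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    p_observed = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is at most as likely as the observed one.
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_observed * (1 + 1e-9))


def select_drivers(features, outcomes, alpha=0.05):
    """Keep binary features whose association with the outcome has p < alpha."""
    selected = []
    for name, flags in features.items():
        pairs = list(zip(flags, outcomes))
        a = sum(f and y for f, y in pairs)              # feature present, converted
        b = sum(f and not y for f, y in pairs)          # feature present, not converted
        c = sum((not f) and y for f, y in pairs)        # feature absent, converted
        d = sum((not f) and (not y) for f, y in pairs)  # feature absent, not converted
        p_value = fisher_exact_two_sided(a, b, c, d)
        if p_value < alpha:
            selected.append((name, p_value))
    return sorted(selected, key=lambda item: item[1])


# Toy data: 10 converted conversations followed by 10 that did not convert.
outcomes = [True] * 10 + [False] * 10
features = {
    "competitor_mentioned_early": [True] * 9 + [False] + [False] * 9 + [True],
    "message_count_is_even": [True, False] * 10,
}
drivers = select_drivers(features, outcomes)
```

On this toy data only `competitor_mentioned_early` survives the threshold, because the other feature is split evenly across both outcomes. When many features are tested at once, a multiple-comparison correction would also be needed; it is omitted here for brevity.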

2. Improve

Maps the discovered drivers into actionable recommendations — changes to prompts, tool usage, escalation logic, or playbooks.
Outputs a ranked list of improvement opportunities, each linked to its supporting data.

These recommendations can then be validated using Agentune Simulate before deployment.
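As a rough picture of what a ranked list of improvement opportunities linked to supporting data might look like, consider the sketch below. The `Recommendation` structure, its field names, and the ranking rule are hypothetical; the actual output format of Agentune may differ.

```python
from dataclasses import dataclass, field


@dataclass
class Recommendation:
    """Hypothetical shape of one improvement opportunity; not an Agentune type."""
    driver: str            # the insight that motivates the change
    action: str            # proposed change to a prompt, tool, or playbook
    estimated_lift: float  # assumed effect on the KPI, in percentage points
    supporting_conversation_ids: list[str] = field(default_factory=list)


def rank_recommendations(recommendations):
    """Order recommendations by absolute estimated effect, largest first."""
    return sorted(recommendations, key=lambda r: abs(r.estimated_lift), reverse=True)


# Made-up values for illustration only.
recommendations = [
    Recommendation(
        driver="Discount discussion combined with shipping-time questions lowers CSAT",
        action="State shipping times before discussing discounts",
        estimated_lift=1.5,
        supporting_conversation_ids=["c2", "c3"],
    ),
    Recommendation(
        driver="Mentions of competitors early in chat increase conversion probability",
        action="Add a competitor-comparison step to the opening playbook",
        estimated_lift=4.0,
        supporting_conversation_ids=["c1", "c4", "c7"],
    ),
]
ranked = rank_recommendations(recommendations)
```

Keeping the supporting conversation IDs on each record lets a reviewer trace a recommendation back to the evidence behind it before acting on it.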

Example Usage

Getting Started - 01_getting_started.ipynb for an introductory walkthrough of library fundamentals
End-to-End Script Example - e2e_script_example.md - a runnable example executing the entire analysis workflow
Advanced Examples - advanced_examples.md for customizing components, caching LLM requests, and advanced workflows

Testing & Costs

We've tested Agentune Analyze with the combination of OpenAI o3 and gpt-4o-mini. In our tests, the cost was approximately 5-10 cents per conversation.
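For budgeting, the per-conversation figure above can be turned into a rough range for a whole dataset. This is a back-of-the-envelope sketch that assumes cost scales linearly with the number of conversations; actual costs depend on conversation length, model choice, and current pricing. The function name is made up for illustration.

```python
def estimate_cost_usd(n_conversations: int, low_cents: int = 5, high_cents: int = 10):
    """Return a (low, high) cost range in USD, assuming linear scaling."""
    return n_conversations * low_cents / 100, n_conversations * high_cents / 100


low, high = estimate_cost_usd(2000)
print(f"Estimated cost for 2000 conversations: ${low:.2f} to ${high:.2f}")
```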

Installation

pip install agentune

Requirements

Python ≥ 3.12
Note for Mac users: If you encounter errors related to lightgbm, you may need to install OpenMP first: brew install libomp. See the LightGBM macOS installation guide for details.
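If you are unsure whether your environment meets the Python requirement, a quick check like the following can be run before installing. The helper name is made up for illustration and is not part of Agentune.

```python
import sys


def meets_python_requirement(version_info=None, minimum=(3, 12)) -> bool:
    """Return True if the given (or current) interpreter version is at least minimum."""
    current = tuple(version_info or sys.version_info)[:2]
    return current >= minimum


if not meets_python_requirement():
    print(f"Python 3.12 or newer is required; found {sys.version.split()[0]}")
```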

Key Features

🧩 Feature Generation – semantic, structural, and behavioral signals derived from real interactions
📈 Feature Selection – statistical and semantic correlation with target KPIs
💡 Actionable Insights – interpretable drivers with examples and metrics
🧠 Context Awareness (upcoming) – integrates CRM, product, and policy metadata for deeper understanding

Roadmap

Current focus: advancing Analyze & Improve with structured, context-aware optimization.

Planned milestones:

Context-aware feature generation and insight discovery
Integration of context features into the recommendation layer for targeted improvement actions
Expanded evaluation tooling for Analyze & Improve results
Visualization tools for insight exploration
Seamless flow into agentune-simulate for validating improvements

Longer-term:

Multi-KPI analytics: understand how improving one KPI affects others, and account for those trade-offs in the recommendations
Optional multi-agent analytics and cross-agent benchmarking

Contributing

We welcome contributions from engineers who care about robust, measurable agents.

Open issues for bugs, integrations, or feature proposals
Early adopters: reach us at agentune-dev@sparkbeyond.com
💬 Join our community on Discord to connect with maintainers, share ideas, and get support
