AgentFlow

AgentFlow: In-the-Flow Agentic System Optimization


πŸ“£ News

  • [2025-10-26] πŸ“š Our project introduction has been featured on DeepWiki!
  • [2025-10-16] πŸ† Our paper has been accepted by the NeurIPS 2025 Efficient Reasoning Workshop!
  • [2025-10-13] πŸ“Έ Excited that Discover AI covered AgentFlow in a tutorial video on YouTube!
  • [2025-10-10] πŸš€ Our X post received 1K+ likes! Feel free to check out the post and join the discussion! πŸ’¬
  • [2025-10-08] πŸ”₯ We are honored to be featured as πŸ€— HuggingFace Daily Paper #2.

🌟 Why AgentFlow?

AgentFlow is a trainable, tool-integrated agentic framework designed to overcome the scalability and generalization limits of today’s tool-augmented reasoning approaches.

Unlike prevailing approaches such as Search-R1, which train a single LLM to interleave reasoning steps with tool calls, AgentFlow introduces a modular agentic system with four specialized modules: 🧭 Planner, πŸ›  Executor, βœ… Verifier, and ✍️ Generator.

[Figure: AgentFlow framework overview]

For effective planning and tool use, the framework directly optimizes the planner agent within the system in an online fashion using Flow-based Group Refined Policy Optimization (Flow-GRPO), achieving superior performance across diverse domains with improved tool-calling reliability and long-horizon reasoning capabilities.

[Figure: Flow-GRPO training overview]
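Flow-GRPO's precise objective is defined in the paper; purely as intuition for the "Group Refined" part, the sketch below shows a GRPO-style group-normalized advantage, where each rollout's sparse outcome reward is centered and scaled against the other rollouts for the same query. This is a minimal illustration under that assumption, not AgentFlow's actual training code, and all names here are ours.

import numpy as np

def group_normalized_advantages(rewards: np.ndarray, eps: float = 1e-6) -> np.ndarray:
    """Illustrative GRPO-style advantage: normalize each rollout's reward
    by the mean/std of its rollout group for the same query.
    A conceptual sketch, not AgentFlow's implementation of Flow-GRPO."""
    return (rewards - rewards.mean()) / (rewards.std() + eps)

# Example: four planner rollouts for one query, scored by a sparse
# final-outcome reward (1.0 = correct final answer, 0.0 = incorrect).
rewards = np.array([1.0, 0.0, 0.0, 1.0])
print(group_normalized_advantages(rewards))  # positive for successes, negative for failures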

πŸ“Ί YouTube Tutorial

We're excited that Discover AI covered AgentFlow in a tutorial video on YouTube!

πŸš€ Key Features

  • 🧩 Modular Agentic System – Four specialized agent modules (Planner, Executor, Verifier, Generator) that coordinate via evolving memory and integrated tools across multiple turns.
  • πŸ”— Multi-Tool Integration – Seamlessly connect with diverse tool ecosystems, including base_generator, python_coder, google_search, wikipedia_search, web_search, and more.
  • 🎯 Flow-GRPO Algorithm – Enables in-the-flow agent optimization for long-horizon reasoning tasks with sparse rewards.
  • πŸ“ˆ Proven Results – AgentFlow (7B backbone) beats top baselines on 10 benchmarks, with average gains of +14.9% on search, +14.0% on agentic, +14.5% on math, and +4.1% on science tasks, even outperforming the ~200B-parameter GPT-4o.

πŸ† Experiments

πŸ“Š Main Results

AgentFlow (Qwen-2.5-7B-Instruct Backbone) outperforms top baselines on 10 benchmarks:

  • +14.9% on search
  • +14.0% on agentic reasoning
  • +14.5% on math
  • +4.1% on science

πŸ’‘ It even surpasses larger proprietary models such as GPT-4o (~200B parameters).

[Tables 1 & 2: main benchmark results]

πŸ” In-Depth Analysis

  • Improved planning and decision-making
  • Enhanced tool-calling reliability
  • Positive scaling trends with model size & reasoning turns

Explore more in our paper or project page.

[Figure: tool-call analysis]



βš™οΈ Setup

Prerequisites

  • Python 3.11 (recommended)

Installation

bash setup.sh
source .venv/bin/activate
# (Optional) Install `parallel` for running benchmark experiments in parallel:
sudo apt-get update
sudo apt-get install parallel

Setup Environment Variables

Copy agentflow/.env.template to agentflow/.env, then update the following variables with your own API keys:

  • OPENAI_API_KEY (for judging responses)
  • GOOGLE_API_KEY (for the Google Search tool)
  • DASHSCOPE_API_KEY (for calling Qwen-2.5-7B-Instruct as the engine for agents and tools)
  • TOGETHER_API_KEY (an alternative for calling Qwen-2.5-7B-Instruct as the engine for agents and tools; recommended for international users)
  • Alternatively, serve the Qwen2.5-7B-Instruct model locally with vLLM (see serve_vllm_local.md for details).

Please check the API Key Setup Guide for detailed instructions on how to obtain these keys.

cp agentflow/.env.template agentflow/.env
# Then edit agentflow/.env with your API keys
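Before running anything, it can help to verify that the keys are actually picked up. The snippet below is a hedged check using the python-dotenv package (pip install python-dotenv); AgentFlow may load the .env file differently internally.

# Sanity-check that agentflow/.env is readable and the keys are set.
# Assumes python-dotenv; AgentFlow's own loading mechanism may differ.
import os
from dotenv import load_dotenv

load_dotenv("agentflow/.env")
for key in ("OPENAI_API_KEY", "GOOGLE_API_KEY", "DASHSCOPE_API_KEY", "TOGETHER_API_KEY"):
    print(f"{key}: {'set' if os.getenv(key) else 'MISSING'}")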

⚑ Quick Start on AgentFlow Inference

AgentFlow provides a modular agentic system with four specialized modules (planner, executor, verifier, generator) that coordinate through evolving memory and a toolkit over multiple turns to solve complex reasoning tasks.

To quickly experience the system in action, run the command below (don’t forget to set up your API key):

python quick_start.py

Here is the content of quick_start.py:

# Import the solver
from agentflow.agentflow.solver import construct_solver
# Set the LLM engine name
llm_engine_name = "dashscope"
# Construct the solver
solver = construct_solver(llm_engine_name=llm_engine_name)
# Solve the user query
output = solver.solve("What is the capital of France?")
print(output["direct_output"])
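Under the hood, the solver runs the multi-turn loop described above. Purely to make the control flow concrete, here is a conceptual sketch of how the four modules might coordinate through an evolving memory; the class and method names are illustrative, not AgentFlow's actual interfaces (see agentflow/agentflow/solver.py for those).

# Conceptual sketch of the four-module loop (illustrative names only).
def solve(query, planner, executor, verifier, generator, max_turns=10):
    memory = []  # evolving memory shared across turns
    for _ in range(max_turns):
        # Planner: pick the next sub-goal and tool given the query + memory
        action = planner.plan(query, memory)
        # Executor: run the chosen tool (e.g. python_coder, google_search)
        result = executor.execute(action)
        memory.append((action, result))
        # Verifier: decide whether the accumulated evidence suffices
        if verifier.is_solved(query, memory):
            break
    # Generator: synthesize the final answer from the memory
    return generator.generate(query, memory)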

πŸ’₯ Quick Start on AgentFlow Flow-GRPO Training

For effective planning and tool use, the framework directly optimizes the planner agent within the system in an online fashion using Flow-GRPO. Below is a quick start for training.

(Optional) Test Your Environment

Before diving in, we recommend verifying that AgentFlow's tools, LLM engines, and network configuration are properly set up. See test_env.md for detailed testing instructions.

Dataset Preparation

We mix two datasets for training: NQ (Natural Questions) for agentic search and DeepMath-103K for mathematical reasoning.

# train data
python data/get_train_data.py
# validation data
python data/aime24_data.py

After running these scripts, the data directory should look like this:

data/
β”œβ”€β”€ train/
β”‚   └── combined_train.parquet (182,190 samples)
β”œβ”€β”€ val/
β”‚   └── aime24.parquet (30 samples)
β”œβ”€β”€ aime24_data.py
└── get_train_data.py
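Before launching training, you can sanity-check the prepared files. The snippet below is a hedged example using pandas (with pyarrow installed for parquet support); the column names depend on what the preparation scripts emit.

# Inspect the prepared datasets (assumes pandas + pyarrow are installed).
import pandas as pd

train = pd.read_parquet("data/train/combined_train.parquet")
val = pd.read_parquet("data/val/aime24.parquet")
print(len(train), "training samples")   # expected: 182,190
print(len(val), "validation samples")   # expected: 30
print(train.columns.tolist())           # fields produced by the scripts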

Flow-GRPO Training

Start AgentFlow training with Flow-GRPO inside tmux:

# Create tmux session and start agentflow service (Window 0)
tmux new-session -s agentflow
bash train/serve_with_logs.sh
# Create new window (Ctrl+B then C) and start training (Window 1)
bash train/train_with_logs.sh

Configuration: All training hyperparameters are in train/config.yaml (model settings, tools, RL parameters, resources, etc.).
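If you prefer to inspect or override hyperparameters programmatically rather than editing the file by hand, here is a hedged sketch using PyYAML; the actual key names are whatever train/config.yaml defines.

# Load training hyperparameters (assumes PyYAML: pip install pyyaml).
# The key names depend entirely on what train/config.yaml defines.
import yaml

with open("train/config.yaml") as f:
    config = yaml.safe_load(f)
print(sorted(config))  # top-level sections, e.g. model/tools/RL/resources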

Logging: We provide comprehensive logging to monitor training. See logs.md for more details.

🎯 AgentFlow Benchmark

Serve the trained planner model with vLLM (here we deploy our 7B Flow-GRPO planner model):

bash scripts/serve_vllm.sh

Run inference on benchmark tasks:

cd test
bash exp/run_all_models_all_datasets.sh

You can find more benchmarking details in benchmark.md.

🧩 Use Your Own Model in AgentFlow

AgentFlow supports different LLM engines for each agent module. See llm_engine.md for supported models and factory.py for the corresponding model_string configuration:

Planner Agent:

Other Agents (Executor, Verifier, Generator):

self.llm_engine_fixed = create_llm_engine(model_string="your-engine", is_multimodal=False, temperature=temperature)

and

# Instantiate Executor
executor = Executor(
    # llm_engine_name=llm_engine_name,
    llm_engine_name="dashscope",
    root_cache_dir=root_cache_dir,
    verbose=verbose,
    # base_url=base_url,
    temperature=temperature
)
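As a concrete (hedged) example, swapping an agent's engine might look like the following; the import path is our assumption based on the repo layout, and "your-engine" is a placeholder for a real model_string from llm_engine.md.

# Hedged sketch: build an engine for an agent module. The import path is
# an assumption (create_llm_engine is defined in factory.py); valid
# model_string values are listed in llm_engine.md.
from agentflow.agentflow.engine.factory import create_llm_engine  # path assumed

engine = create_llm_engine(
    model_string="your-engine",   # placeholder, e.g. a DashScope, Together, or local vLLM model id
    is_multimodal=False,
    temperature=0.7,
)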

🀝 Core Contributors

πŸŽ“ Advisors

πŸ™ Acknowledgements

We thank the following open-source projects:

  • verl for the excellent RL framework design.
  • vLLM for fast LLM inference support.
  • Verl-Tool and agent-lightning for their early-stage exploration in agentic RL training.

We thank Lambda for GPU support!

πŸš€ Contributing

We welcome open-source contributions to AgentFlow! If you're interested in contributing, collaborating, or reporting issues, please open an issue or submit a pull request (PR). You can also reach us at zhuofengli12345@gmail.com, isaacpfino@gmail.com, or lupantech@gmail.com, or join our Slack community: AgentFlow.

We are also looking forward to your feedback and suggestions!

πŸ“š Citation

@article{li2025flow,
  title={In-the-Flow Agentic System Optimization for Effective Planning and Tool Use},
  author={Li, Zhuofeng and Zhang, Haoxiang and Han, Seungju and Liu, Sheng and Xie, Jianwen and Zhang, Yu and Choi, Yejin and Zou, James and Lu, Pan},
  journal={arXiv preprint arXiv:2510.05592},
  year={2025}
}

⭐ Star History

Star History Chart

