# agentic-rl

Here are 24 public repositories matching this topic...

walkinglabs / hands-on-modern-rl

🚀 An open-source, hands-on curriculum bridging the gap from basic RL concepts to LLM alignment, RLVR, and advanced Agentic systems.

agent tutorial pytorch dpo reinforcemen llm rlhf agentic agentic-ai grpo llm-alignment agentic-rl

Updated Jun 12, 2026
Python

AgentR1 / Agent-R1

Agent-R1: Training Powerful LLM Agents with End-to-End Reinforcement Learning

agent llm agentic-rl

Updated Jun 10, 2026
Python

rlops / rlix

Run more RL experiments. Wait less for GPUs.

reinforcement-learning rl lora tinker mlops ml-systems gpu-scheduling llm-training agentic-rl

Updated May 24, 2026
Python

InternLM / ARM-Thinker

[CVPR 2026] Official Code for "ARM-Thinker: Reinforcing Multimodal Generative Reward Models with Agentic Tool Use and Visual Reasoning"

vlm llm vision-language-model reward-modeling agentic-rl think-with-image

Updated Feb 13, 2026
Python

AgentR1 / Claw-R1

Claw-R1: Empowering OpenClaw with Advanced Agentic RL.

agent agentic-rl openclaw

Updated Jun 9, 2026
Python

AMAP-ML / Thinking-with-Map

[ACL 2026 Findings] Thinking with Map: Reinforced Parallel Map-Augmented Agent for Geolocalization

agent reasoning geo-localization mllm agentic-rl

Updated Mar 9, 2026
Python

0bserver07 / Study-Reinforcement-Learning

RL study guide — foundations through RLHF, DPO, GRPO, RLVR, agentic RL, and offline RL. Hand-written CS294 notes, 19 lecture drafts, 5 tested exercises, citations that resolve.

machine-learning reinforcement-learning deep-learning q-learning policy-gradient study-notes lecture-notes ppo dpo rlhf constitutional-ai deepseek-r1 grpo llm-alignment rlvr sutton-barto agentic-rl

Updated May 15, 2026
Python

Computer-use-agents / dart-gui

DART-GUI: Efficient Multi-turn RL for GUI Agents via Decoupled Training and Adaptive Data Curation

gui-agent computer-use-agent agentic-rl

Updated Feb 26, 2026
Python

hscspring / rl-llm-nlp

Curated, opinionated index of post-R1 LLM × Reinforcement Learning. Many deep-dive blog posts cross-linked to many papers: GRPO, DAPO, DPO, PPO, RLHF, GSPO, CISPO, VAPO, Reward Modeling, MoE RL stability, Verifier-Free RL, Training-Free RL, Agentic RL, DeepSeek-R1 reproduction.

awesome reinforcement-learning alignment moe awesome-list curated-list reasoning post-training ppo paper-list dpo llm rlhf llm-training llm-reasoning reward-modeling deepseek-r1 grpo agentic-rl rl-from-human-feedback

Updated Apr 25, 2026

strands-rl / strands-sglang

SGLang model provider for Strands Agents for on-policy agentic RL training.

ai-agents sglang strands-agents agentic-rl

Updated Jun 13, 2026
Python

FlyTune / ProxMO-RL

Proximity-based Multi-turn Optimization (ProxMO) - Official Implementation

efficiency rl llm agentic-rl

Updated Mar 29, 2026
Python

horizon-llm / AlphaQuanter

[ACL2026] AlphaQuanter: An End-to-End Tool-Orchestrated Agentic Reinforcement Learning Framework for Stock Trading.

agent agentic-rl

Updated Oct 17, 2025
Python

X-PLUG / ToolCUA

ToolCUA: Towards Optimal GUI-Tool Path Orchestration for Computer Use Agents

sandbox-environment mllm gui-agent computer-use-agent agentic-rl

Updated May 13, 2026
Python

strands-rl / strands-env

A gym-like framework for building agent environments for RL training.

ai-agents strands-agents agentic-rl agent-environments

Updated Jun 13, 2026
Python

EvolvingLMMs-Lab / ParaVT

ParaVT: Taming the Tool Prior Paradox for Parallel Tool Use in Agentic Video Reinforcement Learning

reinforcement-learning tool-use long-video-understanding video-llm grpo agentic-rl multimodal-rl

Updated Jun 2, 2026
Python

thu-unicorn / Doctor-R1

This is the official repository for our paper "Doctor-R1: Mastering Clinical Inquiry with Experiential Agentic Reinforcement Learning" published in ICLR 2026.

experience medical-ai agentic-rl

Updated Apr 11, 2026
Python

WxxShirley / Agent-STAR

Official implementation for paper "Demystifying Reinforcement Learning for Long-Horizon Tool-Using Agents: A Comprehensive Recipe"

agent reinforcement-learning reinforcement-learning-agent agentic-rl

Updated May 12, 2026
Python

little1d / MolAct

Official Code of Paper: MolAct: An Agentic RL Framework for Molecular Editing and Property Optimization

agent ai drug-discovery drug-design llms molecule-editing molecule-optimization tool-augmented-agents agentic-rl

Updated Apr 13, 2026
Python

XiaoRed5 / Agentic-RL-Most-Detailed-Intro

The most detailed introduction to Agentic RL

tutorial reinforcement-learning credit-assignment llm-agents agentic-rl

Updated Jun 10, 2026
HTML

scitix / Agent-Sandbox

Fast, Multi-Cloud Sandbox Engine for AI Agents

kubernetes reinforcement-learning sandbox agents e2b agent-sandbox rlvr swe-bench terminal-bench agentic-rl e2b-compatible swe-rex

Updated Jun 11, 2026
Go
