Unakar/Logic-RL

Logic-RL

Logic-RL: Unleashing LLM Reasoning with Rule-Based Reinforcement Learning

News

[2025/03/20] We release ADORA: A Scalable Paradigm for Steering Learning Trajectories.

[2025/03/19] For stable length control, refer to https://github.com/lblankl/Short-RL

Teaser Image

Main results

Benchmark

Model	2ppl	3ppl	4ppl	5ppl	6ppl	7ppl	8ppl
o3-mini-high	0.99	0.98	0.97	0.95	0.94	0.89	0.83
o1-2024-12-17	0.83	0.51	0.38	0.38	0.35	0.30	0.20
GPT-4o	0.68	0.57	0.49	0.32	0.23	0.21	0.11
Deepseek-Math-7b	0.35	0.21	0.08	0.06	0.02	0.00	0.00
Qwen2.5-7B-Instruct-1M	0.49	0.40	0.25	0.11	0.02	0.06	0.01
Qwen2.5-7B-Logic-RL (ours)	0.99	0.99	0.94	0.92	0.91	0.80	0.67

Installation

conda create -n logic python=3.9
conda activate logic
pip install torch==2.4.0 --index-url https://download.pytorch.org/whl/cu121
pip3 install vllm==0.6.3 ray
pip3 install flash-attn --no-build-isolation
pip install -e . # For verl integration
pip install wandb IPython matplotlib
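
After installation, it can help to confirm that the pinned packages resolved to the versions listed above. The following is a minimal sketch using only the Python standard library. The helper name `check_versions` and the `EXPECTED` mapping are illustrative assumptions and are not part of this repository.

```python
from importlib.metadata import PackageNotFoundError, version

# Pins copied from the install commands above; unpinned packages map to None.
# This mapping is an illustrative assumption, not a file shipped with the repo.
EXPECTED = {"torch": "2.4.0", "vllm": "0.6.3", "ray": None, "flash-attn": None}


def check_versions(expected):
    """Return {package: status}; status is 'ok', 'missing', or a mismatch note."""
    report = {}
    for name, pinned in expected.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            report[name] = "missing"
            continue
        # Local build tags such as '+cu121' are ignored when comparing to the pin.
        base = installed.split("+")[0]
        if pinned is None or base == pinned:
            report[name] = "ok"
        else:
            report[name] = f"expected {pinned}, found {installed}"
    return report


if __name__ == "__main__":
    for package, status in check_versions(EXPECTED).items():
        print(f"{package}: {status}")
```

Run it inside the `logic` environment; any line that does not end in `ok` points at a package to reinstall.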

Data Preparation

You can use the prepared data in the repository's data/ directory directly (for example, data/kk/instruct).

For your own data generation, here's a demo:

Base Model

python ./examples/data_preprocess/kk.py \
 --local_dir {processed_data_path} \
 --data_path {raw_data_path}

Instruct Model

python ./examples/data_preprocess/kk.py \
 --template_type=qwen-instruct \
 --local_dir {processed_data_path} \
 --data_path {raw_data_path}

Training Execution

conda activate logic
bash main_grpo.sh # A100 80G GPUs

⚙️ Implementation Details

Component	Location
Reward Modeling	`verl/utils/reward_score/kk.py`
Data Preprocessing	`examples/data_preprocess/kk.py`
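
To give a sense of what a rule-based reward for Knights and Knaves puzzles can look like, here is a minimal sketch. It is not the contents of `verl/utils/reward_score/kk.py`: the tag layout, the answer sentence format, the function names, and every numeric score below are assumptions chosen for illustration only.

```python
import re

# Assumed layout: reasoning in <think>...</think>, then the verdict in <answer>...</answer>.
FORMAT_PATTERN = re.compile(
    r"^\s*<think>.*?</think>\s*<answer>(.*?)</answer>\s*$", re.DOTALL
)
# Assumed answer phrasing: "Alice is a knight" / "Bob is a knave".
ROLE_PATTERN = re.compile(r"(\w+) is a (knight|knave)", re.IGNORECASE)


def parse_roles(text):
    """Map each lower-cased name to 'knight' or 'knave' as stated in text."""
    return {name.lower(): role.lower() for name, role in ROLE_PATTERN.findall(text)}


def rule_based_reward(response, solution):
    """Score a response with fixed rules; the numeric values are illustrative."""
    match = FORMAT_PATTERN.match(response)
    if match is None:
        return -1.0  # malformed output: no answer can be extracted
    predicted = parse_roles(match.group(1))
    expected = parse_roles(solution)
    if predicted and predicted == expected:
        return 3.0  # format bonus (1.0) plus correct-answer bonus (2.0)
    return -0.5  # format bonus (1.0) minus wrong-answer penalty (1.5)


if __name__ == "__main__":
    reply = (
        "<think>Bob's claim cannot be true.</think>\n"
        "<answer>Alice is a knight, and Bob is a knave.</answer>"
    )
    print(rule_based_reward(reply, "Alice is a knight. Bob is a knave."))
```

Because the score comes from deterministic string checks rather than a learned reward model, the same response always receives the same reward, which is the property that "rule-based" refers to here.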

Citation

@misc{xie2025logicrlunleashingllmreasoning,
 title={Logic-RL: Unleashing LLM Reasoning with Rule-Based Reinforcement Learning}, 
 author={Tian Xie and Zitian Gao and Qingnan Ren and Haoming Luo and Yuqian Hong and Bryan Dai and Joey Zhou and Kai Qiu and Zhirong Wu and Chong Luo},
 year={2025},
 eprint={2502.14768},
 archivePrefix={arXiv},
 primaryClass={cs.CL},
 url={https://arxiv.org/abs/2502.14768}, 
}

Acknowledgements

Verl 🔗
TinyZero 🔗
Knights and Knaves (K&K) puzzles dataset 🔗

Star History

Star History Chart

About

Reproduce R1 Zero on Logic Puzzle
