| Title | Venue | Date |
| --- | --- | --- |
| On the Convergence Rate of MCTS for the Optimal Value Estimation in Markov Decision Processes | IEEE TAC | 2025-01 |
| Search-o1: Agentic Search-Enhanced Large Reasoning Models | arXiv | 2025-01 |
| rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking | arXiv | 2025-01 |
| ReST-MCTS*: LLM Self-Training via Process Reward Guided Tree Search | NeurIPS | 2024-12 |
| Forest-of-Thought: Scaling Test-Time Compute for Enhancing LLM Reasoning | arXiv | 2024-12 |
| HuatuoGPT-o1, Towards Medical Complex Reasoning with LLMs | arXiv | 2024-12 |
| Mulberry: Empowering MLLM with o1-like Reasoning and Reflection via Collective Monte Carlo Tree Search | arXiv | 2024-12 |
| Proposing and solving olympiad geometry with guided tree search | arXiv | 2024-12 |
| SPaR: Self-Play with Tree-Search Refinement to Improve Instruction-Following in Large Language Models | arXiv | 2024-12 |
| Towards Intrinsic Self-Correction Enhancement in Monte Carlo Tree Search Boosted Reasoning via Iterative Preference Learning | arXiv | 2024-12 |
| CodeTree: Agent-guided Tree Search for Code Generation with Large Language Models | arXiv | 2024-11 |
| GPT-Guided Monte Carlo Tree Search for Symbolic Regression in Financial Fraud Detection | arXiv | 2024-11 |
| MC-NEST -- Enhancing Mathematical Reasoning in Large Language Models with a Monte Carlo Nash Equilibrium Self-Refine Tree | arXiv | 2024-11 |
| Marco-o1: Towards Open Reasoning Models for Open-Ended Solutions | arXiv | 2024-11 |
| SRA-MCTS: Self-driven Reasoning Augmentation with Monte Carlo Tree Search for Code Generation | arXiv | 2024-11 |
| Don't throw away your value model! Generating more preferable text with Value-Guided Monte-Carlo Tree Search decoding | CoLM | 2024-10 |
| AFlow: Automating Agentic Workflow Generation | arXiv | 2024-10 |
| Interpretable Contrastive Monte Carlo Tree Search Reasoning | arXiv | 2024-10 |
| LLaMA-Berry: Pairwise Optimization for O1-like Olympiad-Level Mathematical Reasoning | arXiv | 2024-10 |
| Towards Self-Improvement of LLMs via MCTS: Leveraging Stepwise Knowledge with Curriculum Preference Learning | arXiv | 2024-10 |
| TreeBoN: Enhancing Inference-Time Alignment with Speculative Tree-Search and Best-of-N Sampling | arXiv | 2024-10 |
| Understanding When Tree of Thoughts Succeeds: Larger Models Excel in Generation, Not Discrimination | arXiv | 2024-10 |
| RethinkMCTS: Refining Erroneous Thoughts in Monte Carlo Tree Search for Code Generation | arXiv | 2024-09 |
| Strategist: Learning Strategic Skills by LLMs via Bi-Level Tree Search | arXiv | 2024-08 |
| LiteSearch: Efficacious Tree Search for LLM | arXiv | 2024-07 |
| Tree Search for Language Model Agents | arXiv | 2024-07 |
| Uncertainty-Guided Optimization on Large Language Model Search Trees | arXiv | 2024-07 |
| Accessing GPT-4 level Mathematical Olympiad Solutions via Monte Carlo Tree Self-refine with LLaMa-3 8B | arXiv | 2024-06 |
| Beyond A*: Better Planning with Transformers via Search Dynamics Bootstrapping | ICLR Workshop | 2024-05 |
| LLM Reasoners: New Evaluation, Library, and Analysis of Step-by-Step Reasoning with Large Language Models | ICLR Workshop | 2024-05 |
| AlphaMath Almost Zero: Process Supervision without Process | arXiv | 2024-05 |
| Generating Code World Models with Large Language Models Guided by Monte Carlo Tree Search | arXiv | 2024-05 |
| MindStar: Enhancing Math Reasoning in Pre-trained LLMs at Inference Time | arXiv | 2024-05 |
| Monte Carlo Tree Search Boosts Reasoning via Iterative Preference Learning | arXiv | 2024-05 |
| Stream of Search (SoS): Learning to Search in Language | arXiv | 2024-04 |
| Toward Self-Improvement of LLMs via Imagination, Searching, and Criticizing | arXiv | 2024-04 |
| Reasoning with Language Model is Planning with World Model | EMNLP | 2023-12 |
| Large Language Models as Commonsense Knowledge for Large-Scale Task Planning | NeurIPS | 2023-12 |
| Alphazero-like Tree-Search can Guide Large Language Model Decoding and Training | NeurIPS Workshop | 2023-12 |
| Making PPO Even Better: Value-Guided Monte-Carlo Tree Search Decoding | arXiv | 2023-09 |