SIM-CoT: Supervised Implicit Chain-of-Thought
[ICLR 2026 🔥]

Introduction

🌈 SIM-CoT (Supervised Implicit Chain-of-Thought) is a training framework for implicit reasoning that makes latent (implicit) CoT stable, scalable, and interpretable.

While implicit CoT can greatly reduce inference-time token cost compared to explicit chain-of-thought, prior approaches often suffer from latent instability when scaling the number of implicit tokens—leading to semantic homogenization, operator information loss, and even training collapse.

SIM-CoT addresses this by introducing step-level supervision for implicit latents. During training, we attach a lightweight auxiliary decoder to align each implicit latent token with a corresponding reasoning step, enforcing structured semantics in the latent space and improving optimization stability. Importantly, the auxiliary decoder is removed at inference time, so SIM-CoT preserves the efficiency advantages of implicit reasoning without adding any extra inference overhead.
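
To make the mechanism concrete, below is a minimal PyTorch sketch of step-level supervision. It is not the repository's code: the toy shapes, the AuxStepDecoder class, and the aux_weight coefficient are illustrative stand-ins. Each implicit latent is trained to reconstruct the tokens of its aligned reasoning step, and the auxiliary loss is added to the usual answer loss.

import torch
import torch.nn as nn
import torch.nn.functional as F

# Toy sizes (hypothetical): B examples, K implicit latents, hidden size H,
# T tokens per reasoning step, vocab size V.
B, K, H, T, V = 4, 6, 64, 8, 1000

latent_states = torch.randn(B, K, H)             # implicit CoT latents from the backbone
step_token_ids = torch.randint(0, V, (B, K, T))  # tokenized explicit reasoning steps

class AuxStepDecoder(nn.Module):
    """Lightweight auxiliary decoder: maps each latent to logits over the
    tokens of its aligned reasoning step. (A real implementation would use an
    autoregressive head; a linear projection keeps the sketch small.)"""
    def __init__(self, hidden, vocab, step_len):
        super().__init__()
        self.step_len, self.vocab = step_len, vocab
        self.proj = nn.Linear(hidden, step_len * vocab)

    def forward(self, latents):                   # (B, K, H)
        logits = self.proj(latents)               # (B, K, T*V)
        return logits.view(*latents.shape[:2], self.step_len, self.vocab)

aux_decoder = AuxStepDecoder(H, V, T)
logits = aux_decoder(latent_states)               # (B, K, T, V)

# Step-level supervision: each latent must reconstruct its own reasoning step.
aux_loss = F.cross_entropy(logits.reshape(-1, V), step_token_ids.reshape(-1))

# Joint objective (sketch): total_loss = answer_loss + aux_weight * aux_loss.
# At inference time, aux_decoder is simply dropped, so no extra cost is added.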

💡 Highlights

  • 🔥 Latent Instability in Implicit CoT: We systematically analyze the limitations of implicit Chain-of-Thought methods and reveal a latent instability issue—as the number of implicit tokens increases, models tend to collapse into homogeneous latent states that lose operator semantics.

  • 🔥 Step-Level Supervision with SIM-CoT: We propose Supervised IMplicit-CoT (SIM-CoT), a plug-and-play module that introduces step-level supervision via an auxiliary decoder. This stabilizes optimization, prevents collapse, and ensures that latent tokens capture meaningful reasoning steps.

  • 🔥 Strong and Consistent Performance: SIM-CoT consistently outperforms both explicit and implicit baselines. On GPT-2, it exceeds supervised CoT by +2.1%, Coconut by +8.2%, and CODI by +4.3%. Across larger LLaMA models (1B/3B/8B), it delivers +1.5% to +9.0% gains, and remains stable even with 8–16 implicit tokens, where prior methods collapse.

  • 🔥 Efficiency and Interpretability: SIM-CoT adds no extra inference cost since the auxiliary decoder is discarded after training. It also provides interpretability, allowing each latent token to be decoded into a human-readable reasoning step (see the sketch after this list).
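
As an illustration of that interpretability, the following self-contained sketch (again with toy shapes and a stand-in linear decoder rather than the repository's actual module) greedily decodes captured latents back into step tokens. This is an offline analysis pass; inference itself never runs the auxiliary decoder.

import torch
import torch.nn as nn

# Toy sizes (hypothetical); aux_decoder stands in for the trained auxiliary
# decoder, whose weights you would load in practice.
K, H, T, V = 6, 64, 8, 1000
aux_decoder = nn.Linear(H, T * V)
latents = torch.randn(1, K, H)  # implicit latents captured from a forward pass

with torch.no_grad():
    step_logits = aux_decoder(latents).view(1, K, T, V)
    step_ids = step_logits.argmax(dim=-1)  # greedy tokens, shape (1, K, T)

print(step_ids.shape)  # torch.Size([1, 6, 8])
# With a real tokenizer, each row decodes into a readable reasoning step:
# for k in range(K):
#     print(f"latent {k}: {tokenizer.decode(step_ids[0, k])}")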

📜 News

[January 26, 2026] 🎉 Our paper has been accepted to ICLR 2026!

[September 24, 2025] Code and paper are released!

👨‍💻 Todo

  • Code Release
  • Checkpoint Release
  • Usage Instructions Release

🛠️ Usage

1. Clone the repository

git clone https://github.com/InternLM/SIM-CoT.git
cd SIM-CoT

2. Install dependencies

pip install -r requirements.txt

3. Training with Coconut + SIM-CoT

Step 1: Train the Coconut baseline

cd Coconut
torchrun --nnodes 1 --nproc_per_node 8 run.py args/gsm_coconut.yaml

Step 2: Continue training with SIM-CoT

Select a Coconut checkpoint that has already been expanded to the predefined number of implicit tokens, then continue training with SIM-CoT:

torchrun --nnodes 1 --nproc_per_node 8 run.py args/gsm_simcot.yaml
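
If you prefer to set the checkpoint path programmatically rather than editing the YAML by hand, a hedged sketch is below. The key name load_model_path and the checkpoint path are assumptions for illustration; verify the actual field names in args/gsm_simcot.yaml before use.

import yaml  # pip install pyyaml

CONFIG = "args/gsm_simcot.yaml"
with open(CONFIG) as f:
    cfg = yaml.safe_load(f)

# ASSUMPTION: "load_model_path" is an illustrative key name, and the
# checkpoint path is hypothetical; check the repository's config schema.
cfg["load_model_path"] = "checkpoints/gsm_coconut/checkpoint_epoch_50"

with open(CONFIG, "w") as f:
    yaml.safe_dump(cfg, f)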

4. Evaluation with Coconut + SIM-CoT

torchrun --nnodes 1 --nproc_per_node 8 run.py args/gsm_simcot_eval.yaml

5. Training with CODI + SIM-CoT

cd CODI
bash scripts/train_llama3b_gsm8k-aug-decoder-2.sh

6. Evaluation with CODI + SIM-CoT

bash CODI/scripts/test_llama3b-copy.sh

✒️ Citation

If you find our work helpful for your research, please consider giving the repository a star ⭐ and citing our paper 📝:

@inproceedings{wei2025simcot,
  title={{SIM-CoT}: Supervised Implicit Chain-of-Thought},
  author={Wei, Xilin and Liu, Xiaoran and Zang, Yuhang and Dong, Xiaoyi and Cao, Yuhang and Wang, Jiaqi and Qiu, Xipeng and Lin, Dahua},
  booktitle={International Conference on Learning Representations},
  year={2026}
}

❤️ Acknowledgments

  • Coconut: The codebase we built upon. Thanks for their wonderful work.
  • CODI: Our work is based on this codebase; we are grateful for their valuable contribution.
  • LLaMA series: The amazing family of open-source large language models!
  • GPT2: An impressive open-source large language model!
