SIM-CoT: Supervised Implicit Chain-of-Thought
[ICLR 2026 🔥]

Introduction

🌈 SIM-CoT (Supervised Implicit Chain-of-Thought) is a training framework for implicit reasoning that makes latent (implicit) CoT stable, scalable, and interpretable.

While implicit CoT can greatly reduce inference-time token cost compared to explicit chain-of-thought, prior approaches often suffer from latent instability when scaling the number of implicit tokens—leading to semantic homogenization, operator information loss, and even training collapse.

SIM-CoT addresses this by introducing step-level supervision for implicit latents. During training, we attach a lightweight auxiliary decoder to align each implicit latent token with a corresponding reasoning step, enforcing structured semantics in the latent space and improving optimization stability. Importantly, the auxiliary decoder is removed at inference time, so SIM-CoT preserves the efficiency advantages of implicit reasoning without adding any extra inference overhead.
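
To make the mechanism concrete, below is a minimal PyTorch sketch of step-level supervision. It is not the repository's code: the toy shapes, the AuxStepDecoder class, and the aux_weight coefficient are illustrative stand-ins. Each implicit latent is trained to reconstruct the tokens of its aligned reasoning step, and the auxiliary loss is added to the usual answer loss.

import torch
import torch.nn as nn
import torch.nn.functional as F

# Toy sizes (hypothetical): B examples, K implicit latents, hidden size H,
# T tokens per reasoning step, vocab size V.
B, K, H, T, V = 4, 6, 64, 8, 1000

latent_states = torch.randn(B, K, H)             # implicit CoT latents from the backbone
step_token_ids = torch.randint(0, V, (B, K, T))  # tokenized explicit reasoning steps

class AuxStepDecoder(nn.Module):
    """Lightweight auxiliary decoder: maps each latent to logits over the
    tokens of its aligned reasoning step. (A real implementation would use an
    autoregressive head; a linear projection keeps the sketch small.)"""
    def __init__(self, hidden, vocab, step_len):
        super().__init__()
        self.step_len, self.vocab = step_len, vocab
        self.proj = nn.Linear(hidden, step_len * vocab)

    def forward(self, latents):                   # (B, K, H)
        logits = self.proj(latents)               # (B, K, T*V)
        return logits.view(*latents.shape[:2], self.step_len, self.vocab)

aux_decoder = AuxStepDecoder(H, V, T)
logits = aux_decoder(latent_states)               # (B, K, T, V)

# Step-level supervision: each latent must reconstruct its own reasoning step.
aux_loss = F.cross_entropy(logits.reshape(-1, V), step_token_ids.reshape(-1))

# Joint objective (sketch): total_loss = answer_loss + aux_weight * aux_loss.
# At inference time, aux_decoder is simply dropped, so no extra cost is added.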

💡 Highlights

  • 🔥 Latent Instability in Implicit CoT: We systematically analyze the limitations of implicit Chain-of-Thought methods and reveal a latent instability issue—as the number of implicit tokens increases, models tend to collapse into homogeneous latent states that lose operator semantics.

  • 🔥 Step-Level Supervision with SIM-CoT: We propose Supervised IMplicit-CoT (SIM-CoT), a plug-and-play module that introduces step-level supervision via an auxiliary decoder. This stabilizes optimization, prevents collapse, and ensures that latent tokens capture meaningful reasoning steps.

  • 🔥 Strong and Consistent Performance: SIM-CoT consistently outperforms both explicit and implicit baselines. On GPT-2, it exceeds supervised CoT by +2.1%, Coconut by +8.2%, and CODI by +4.3%. Across larger LLaMA models (1B/3B/8B), it delivers +1.5% to +9.0% gains, and remains stable even with 8–16 implicit tokens, where prior methods collapse.

  • 🔥 Efficiency and Interpretability: SIM-CoT adds no extra inference cost since the auxiliary decoder is discarded after training. It also provides interpretability, allowing each latent token to be decoded into a human-readable reasoning step (see the sketch after this list).
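
As an illustration of that interpretability, the following self-contained sketch (again with toy shapes and a stand-in linear decoder rather than the repository's actual module) greedily decodes captured latents back into step tokens. This is an offline analysis pass; inference itself never runs the auxiliary decoder.

import torch
import torch.nn as nn

# Toy sizes (hypothetical); aux_decoder stands in for the trained auxiliary
# decoder, whose weights you would load in practice.
K, H, T, V = 6, 64, 8, 1000
aux_decoder = nn.Linear(H, T * V)
latents = torch.randn(1, K, H)  # implicit latents captured from a forward pass

with torch.no_grad():
    step_logits = aux_decoder(latents).view(1, K, T, V)
    step_ids = step_logits.argmax(dim=-1)  # greedy tokens, shape (1, K, T)

print(step_ids.shape)  # torch.Size([1, 6, 8])
# With a real tokenizer, each row decodes into a readable reasoning step:
# for k in range(K):
#     print(f"latent {k}: {tokenizer.decode(step_ids[0, k])}")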

📜 News

[January 26, 2026] 🎉 Our paper has been accepted to ICLR 2026!

[September 24, 2025] Code and paper are released!

👨‍💻 Todo

  • Code Release
  • Checkpoint Release
  • Usage Instructions Release

🛠️ Usage

1. Clone the repository

git clone https://github.com/InternLM/SIM-CoT.git
cd SIM-CoT

2. Install dependencies

pip install -r requirements.txt

3. Training with Coconut + SIM-CoT

Step 1: Train the Coconut baseline

cd Coconut
torchrun --nnodes 1 --nproc_per_node 8 run.py args/gsm_coconut.yaml

Step 2: Continue training with SIM-CoT

Select a Coconut checkpoint that has already been expanded to the predefined number of implicit tokens, then continue training with SIM-CoT:

torchrun --nnodes 1 --nproc_per_node 8 run.py args/gsm_simcot.yaml
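
If you prefer to set the checkpoint path programmatically rather than editing the YAML by hand, a hedged sketch is below. The key name load_model_path and the checkpoint path are assumptions for illustration; verify the actual field names in args/gsm_simcot.yaml before use.

import yaml  # pip install pyyaml

CONFIG = "args/gsm_simcot.yaml"
with open(CONFIG) as f:
    cfg = yaml.safe_load(f)

# ASSUMPTION: "load_model_path" is an illustrative key name, and the
# checkpoint path is hypothetical; check the repository's config schema.
cfg["load_model_path"] = "checkpoints/gsm_coconut/checkpoint_epoch_50"

with open(CONFIG, "w") as f:
    yaml.safe_dump(cfg, f)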

4. Evaluation with Coconut + SIM-CoT

torchrun --nnodes 1 --nproc_per_node 8 run.py args/gsm_simcot_eval.yaml

5. Training with CODI + SIM-CoT

cd CODI
bash scripts/train_llama3b_gsm8k-aug-decoder-2.sh

6. Evaluation with CODI + SIM-CoT

bash CODI/scripts/test_llama3b-copy.sh

✒️ Citation

If you find our work helpful for your research, please consider giving the repository a star ⭐ and citing our paper 📝:

@inproceedings{wei2025simcot,
  title={{SIM-CoT}: Supervised Implicit Chain-of-Thought},
  author={Wei, Xilin and Liu, Xiaoran and Zang, Yuhang and Dong, Xiaoyi and Cao, Yuhang and Wang, Jiaqi and Qiu, Xipeng and Lin, Dahua},
  booktitle={International Conference on Learning Representations},
  year={2026}
}

❤️ Acknowledgments

  • Coconut: The codebase we built upon. Thanks for their wonderful work.
  • CODI: Our work is based on this codebase; we are grateful for their valuable contribution.
  • LLaMA series: The amazing family of open-source large language models!
  • GPT2: An impressive open-source large language model!
