Releases: DeepLink-org/LightRFT

v0.1.0: Initial Release

06 Jan 06:13
@sallyjunjun

This is the initial release of LightRFT, a light, efficient, omni-modal, and reward-model-driven reinforcement fine-tuning framework.

🧠 Rich Algorithm Ecosystem

  • Implemented PPO and GRPO algorithms for Large Language Models.
  • Added comprehensive interfaces for Reward Model training and inference.

🎯 Innovative Resource Collaboration

  • Introduced the "Colocate Anything" strategy, which maximizes GPU memory efficiency by colocating the Actor, Critic, and Reward models.

🔧 Flexible Training Strategies

  • Integrated DeepSpeed ZeRO and FSDP for scalable distributed training.
  • Added PEFT (LoRA) integration for lightweight fine-tuning.

🌐 Environments & Models

  • Added support for GSM8K (Math reasoning) and Geo3K (Multimodal) environments and datasets.
  • Enabled support for Qwen and DeepSeek model families.

📚 Documentation & Toolkit

  • Integrated Weights & Biases (W&B) for training metric logging.
  • Released initial Quick Start guide, architecture overview, and reproduction scripts.

Full Changelog: https://github.com/DeepLink-org/LightRFT/commits/v0.1.0

Contributors: DeepLink Team, OpenDILab, and the System Platform Center and Safe and Trustworthy AI Center at Shanghai AI Laboratory.
