Releases · codefuse-ai/MFTCoder
MFTCoder v0.4.3: Bugfix
chencyudel · cc55b06
Bugfix: removed the default TensorBoard writer, which could cause permission problems.
P.S. If you hit a "permission denied" error on "/home/admin", please try the fixed release v0.4.3.
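The fix boils down to not constructing a TensorBoard writer at a hardcoded default path. A minimal sketch of that pattern, assuming a hypothetical `make_writer` helper (illustrative only, not the actual commit):

```python
import os
from typing import Optional

from torch.utils.tensorboard import SummaryWriter


def make_writer(log_dir: Optional[str]) -> Optional[SummaryWriter]:
    """Create a TensorBoard writer only for an explicitly supplied,
    writable directory; return None (i.e., skip logging) otherwise."""
    if log_dir is None:
        # No implicit default: writing to a fixed location such as
        # /home/admin can fail with "permission denied" for other users.
        return None
    os.makedirs(log_dir, exist_ok=True)
    if not os.access(log_dir, os.W_OK):
        return None  # avoid crashing the run on an unwritable directory
    return SummaryWriter(log_dir=log_dir)
```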
MFTCoder v0.4.2: Support more open-source models; support QLoRA + DeepSpeed ZeRO3 / FSDP
chencyudel · d0b8457
Support more open-source models such as Qwen2, Qwen2-MoE, StarCoder2, etc.
Support QLoRA + DeepSpeed ZeRO3 / FSDP, which is efficient for very large models.
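For context, QLoRA loads the base model with 4-bit quantized weights and trains only LoRA adapters on top; the sharding then comes from DeepSpeed ZeRO3 or FSDP via Accelerate. A minimal sketch using the standard transformers/peft/bitsandbytes APIs (generic usage, not MFTCoder's own entry point; the model name and hyperparameters are placeholders):

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

# 4-bit NF4 quantization of the frozen base weights (the "Q" in QLoRA).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2-7B",  # placeholder: any of the newly supported models
    quantization_config=bnb_config,
)

# Small trainable LoRA adapters on top of the quantized base model.
lora_config = LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05, task_type="CAUSAL_LM"
)
model = get_peft_model(model, lora_config)
```

Because the quantized base weights stay frozen, gradients and optimizer state exist only for the small set of LoRA parameters, which is what keeps the combination memory-efficient for very large models.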
MFTCoder v0.3.0: Support more open-source models, Self-Paced Loss, and FSDP
chencyudel · e5243da
Updates:
- Mainly for MFTCoder-accelerate.
- It now supports more open-source models such as Mistral, Mixtral (MoE), DeepSeek-Coder, and ChatGLM3.
- It supports FSDP as an option.
- It also supports Self-Paced Loss as a solution for convergence balance in multitask fine-tuning; see the sketch after this list.
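A minimal sketch of the Self-Paced Loss idea (illustrative only; the exact formulation is defined in the MFTCoder repo and paper, and the inputs here are hypothetical):

```python
import torch


def self_paced_loss(task_losses: torch.Tensor,
                    ref_losses: torch.Tensor) -> torch.Tensor:
    """task_losses: current per-task training losses, shape [T].
    ref_losses: recent per-task validation losses used as a
    convergence signal, shape [T] (hypothetical inputs)."""
    # Tasks whose reference loss is still high are treated as less
    # converged and receive proportionally larger weights, pushing
    # all tasks toward converging at a similar pace.
    weights = (ref_losses / ref_losses.sum()).detach()
    return (weights * task_losses).sum()
```

The intent is convergence balance: tasks that are still far from converging get larger weights, so no single task dominates or lags behind in multitask fine-tuning.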
v0.1.0 release: Multi-Task Fine-tuning Framework for Multiple Base Models
chencyudel · 7946e4f
- We released MFTCoder, which supports fine-tuning Code Llama, Llama, Llama 2, StarCoder, ChatGLM2, CodeGeeX2, Qwen, and GPT-NeoX models with LoRA/QLoRA.
- mft_peft_hf is based on HuggingFace Accelerate and DeepSpeed.
- mft_atorch is based on ATorch, a fast distributed training framework for LLMs.