Commit ae61b4d

committed

readme

1 parent 2cd7712 commit ae61b4d

File tree

4 files changed

+19

-19

lines changed

README.md
README_cn.md
mftcoder_accelerate
- README.md
- README_cn.md

`‎README.md‎`

Lines changed: 2 additions & 2 deletions

Original file line number	Diff line number	Diff line change
`@@ -46,9 +46,9 @@`
`46`	`46`
`47`	`47`
`48`	`48`	`## News`
`49`		`-🔥🔥🔥 [2024/11/01] We released MFTCoder v0.5 mainly for MFTCoder-accelerate, which is now supporting preference alignment methods like DPO/RPO/ORPO in the new xxpo module, adding full-parameter continue-training in the additional mpt module along with its offline_tokenization module, updating selfpaced method to new convergence balance(CoBa) method for MFT in the original pefts module.`
	`49`	`+🔥🔥🔥 [2024/10/31] We released MFTCoder v0.5 mainly for MFTCoder-accelerate, which is now supporting preference alignment methods like DPO/RPO/ORPO in the new xxpo module, adding full-parameter continue-training in the additional mpt module along with its offline_tokenization module, updating selfpaced method to new convergence balance(CoBa) method for MFT in the original pefts module.`
`50`	`50`
`51`		`-🔥🔥🔥 [2024/11/01] Our paper [CoBa: Convergence Balancer for Multitask Finetuning of Large Language Models](https://arxiv.org/abs/2410.06741) has been accepted by EMNLP-2024, which achieves balanced convergence across various tasks.`
	`51`	`+🔥🔥🔥 [2024/10/31] Our paper [CoBa: Convergence Balancer for Multitask Finetuning of Large Language Models](https://arxiv.org/abs/2410.06741) has been accepted by EMNLP-2024, which achieves balanced convergence across various tasks.`
`52`	`52`
`53`	`53`	`🔥🔥🔥 [2024/05/20] We released MFTCoder v0.4, mainly for MFTCoder-accelerate. It supports QLoRA + DeepSpeed Zero3 and QLoRA + FSDP as options allowing you training very large models. It now supports new models like Qwen2, Qwen2-MoE, Starcoder2, Gemma, etc.`
`54`	`54`

`‎README_cn.md‎`

Lines changed: 2 additions & 2 deletions

Original file line number	Diff line number	Diff line change
`@@ -45,9 +45,9 @@`
`45`	`45`
`46`	`46`
`47`	`47`	`## 新闻`
`48`		`-🔥🔥🔥 [2024/11/01] MFTCoder-v0.5发布,新增xxpo模块支持偏好对齐DPO/RPO/ORPO;新增mpt和offline_tokenization模块支持全量参数的加训;在原本的pefts模块(MFT)更新selfpaced收敛均衡技术并更名CoBa。`
	`48`	`+🔥🔥🔥 [2024/10/31] MFTCoder-v0.5发布,新增xxpo模块支持偏好对齐DPO/RPO/ORPO;新增mpt和offline_tokenization模块支持全量参数的加训;在原本的pefts模块(MFT)更新selfpaced收敛均衡技术并更名CoBa。`
`49`	`49`
`50`		`-🔥🔥🔥 [2024/11/01] 我们的论文 [CoBa: Convergence Balancer for Multitask Finetuning of Large Language Models](https://arxiv.org/abs/2410.06741) 已被 EMNLP 2024 接收,可以实现多任务收敛均衡。`
	`50`	`+🔥🔥🔥 [2024/10/31] 我们的论文 [CoBa: Convergence Balancer for Multitask Finetuning of Large Language Models](https://arxiv.org/abs/2410.06741) 已被 EMNLP 2024 接收,可以实现多任务收敛均衡。`
`51`	`51`
`52`	`52`	`🔥🔥🔥 [2024/05/20] MFTCoder-v0.4发布。新增支持QLoRA+ DeepSpeed Zero3, QLoRA + FSDP训练模式,可以更好的支持微调更大的模型,比如Qwen1.5-70B等。新增对Qwen2, Qwen2-MoE, Starcoder2, Gemma等模型的支持。`
`53`	`53`

`‎mftcoder_accelerate/README.md‎`

Lines changed: 5 additions & 5 deletions

Original file line number	Diff line number	Diff line change
`@@ -15,13 +15,13 @@`
`15`	`15`
`16`	`16`	`🔥 MFTCoder-accelerate now support these modes: QLoRA/LoRA + DeepSpeed ZeRO2, QLoRA + DeepSpeed ZeRO3, Full-parameter + DeepSpeed ZeRO3, QLoRA + FSDP, Full-parameter + FSDP.`
`17`	`17`
`18`		`-🔥 MFTCoder-accelerate supports QLoRA + DeepSpeed ZeRO3 and QLoRA + FSDP, which both work for larger models;`
	`18`	`+🔥 MFTCoder-accelerate supports QLoRA + DeepSpeed ZeRO3 and QLoRA + FSDP, which both work for larger models.`
`19`	`19`
`20`		`-🔥 MFTCoder-accelerate supports MFT/SFT on more new mainstream open-source base models: mistral, mixtral-8x7b(Mixture of Experts), deepseek, chatglm3;`
	`20`	`+🔥 MFTCoder-accelerate supports MFT/SFT on more new mainstream open-source base models: mistral, mixtral-8x7b(Mixture of Experts), deepseek, chatglm3.`
`21`	`21`
`22`		`-🔥 MFTCoder-accelerate supports Self-Paced Loss for Convergence Balance;`
	`22`	`+🔥 MFTCoder-accelerate supports Self-Paced Loss for Convergence Balance.`
`23`	`23`
`24`		`-🔥 MFTCoder-accelerate supports Full-parameters/QLoRA/LoRA using accelerate + DeepSpeed Framework;`
	`24`	`+🔥 MFTCoder-accelerate supports Full-parameters/QLoRA/LoRA using accelerate + DeepSpeed Framework.`
`25`	`25`
`26`	`26`	`🔥 MFTCoder-accelerate supports Multitask Fine-Tuning(MFT), which is able to balance diffenrent tasks in data level.`
`27`	`27`
`@@ -94,7 +94,7 @@ User nth round input`
`94`	`94`	When applying inference, you always make your input string end with ```<s>bot\n``` to request the model generating answers.
`95`	`95`
`96`	`96`	`### 2.3 DPO训练数据格式`
`97`		-The training data is required to be a uniformed JSONL format, in which each line of data has the following JSON format. The "chosen" and "rejected" fields are required as ```chosen``` and ```rejected``` in DPO training and both includes "chatml-style" contents.
	`97`	+The training data is required to be a uniformed JSONL format, in which each line of data has the following JSON format. The "chosen" and "rejected" fields are required as ```chosen``` and ```rejected``` in DPO training and both includes "chatml-style" contents(only last content of bot differs).
`98`	`98`	```json
`99`	`99`	`{`
`100`	`100`	`"chosen":[`
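
The constraint added in this hunk (the `chosen` and `rejected` conversations differ only in the last bot content) can be checked mechanically before training. The sketch below is not part of MFTCoder. It assumes each turn is an object with `role` and `content` keys and that the final turn uses the role name `bot`; those key and role names are assumptions based on the chatml-style format the README describes, and `check_dpo_record` is a hypothetical helper.

```python
import json


def check_dpo_record(line: str) -> bool:
    """Return True if one JSONL line looks like a valid preference pair.

    Assumed schema (not confirmed by the diff): "chosen" and "rejected"
    are lists of {"role": ..., "content": ...} turns.
    """
    record = json.loads(line)
    chosen, rejected = record["chosen"], record["rejected"]

    # Both conversations must be non-empty and have the same number of turns.
    if not chosen or len(chosen) != len(rejected):
        return False

    # Every turn before the last one must be identical in both lists.
    if chosen[:-1] != rejected[:-1]:
        return False

    # The last turn must be a bot turn in both lists, with different content.
    last_chosen, last_rejected = chosen[-1], rejected[-1]
    return (
        last_chosen["role"] == last_rejected["role"] == "bot"
        and last_chosen["content"] != last_rejected["content"]
    )


# Illustrative record; the "human" role name is an assumption.
example = json.dumps({
    "chosen": [
        {"role": "human", "content": "Write a function that adds two numbers."},
        {"role": "bot", "content": "def add(a, b):\n    return a + b"},
    ],
    "rejected": [
        {"role": "human", "content": "Write a function that adds two numbers."},
        {"role": "bot", "content": "def add(a, b):\n    return a - b"},
    ],
})
print(check_dpo_record(example))
```

Running such a check over every line of the JSONL file before training would surface pairs whose prompts drifted apart, which the data format described here does not allow.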

`‎mftcoder_accelerate/README_cn.md‎`

Lines changed: 10 additions & 10 deletions

Original file line number	Diff line number	Diff line change
`@@ -15,19 +15,19 @@`
`15`	`15`
`16`	`16`	`🔥 MFTCoder-accelerate 最新支持的训练模式包括: QLoRA/LoRA + DeepSpeed ZeRO2, QLoRA + DeepSpeed ZeRO3, 全量 + DeepSpeed ZeRO3, QLoRA + FSDP, 全量 + FSDP。`
`17`	`17`
`18`		`-🔥 MFTCoder-accelerate 新增支持QLoRA + DeepSpeed ZeRO3, 支持QLoRA + FSDP, 可以训练更大的模型;`
	`18`	`+🔥 MFTCoder-accelerate 新增支持QLoRA + DeepSpeed ZeRO3, 支持QLoRA + FSDP, 可以训练更大的模型。`
`19`	`19`
`20`		`-🔥 MFTCoder-accelerate 新增支持accelerate + FSDP框架, 支持全量微调和LoRA;`
	`20`	`+🔥 MFTCoder-accelerate 新增支持accelerate + FSDP框架, 支持全量微调和LoRA。`
`21`	`21`
`22`		`-🔥 MFTCoder-accelerate 支持最新更多主流开源模型: mistral, mixtral-8x7b(Mixture of Experts), deepseek, chatglm3;`
	`22`	`+🔥 MFTCoder-accelerate 支持最新更多主流开源模型: mistral, mixtral-8x7b(Mixture of Experts), deepseek, chatglm3。`
`23`	`23`
`24`		`-🔥 MFTCoder-accelerate 新增self-paced Loss, 用于收敛均衡;`
	`24`	`+🔥 MFTCoder-accelerate 新增self-paced Loss, 用于收敛均衡。`
`25`	`25`
`26`		`-🔥 MFTCoder-accelerate 支持使用accelerate + DeepSpeed框架下支持全量参数/QLoRA/LoRA微调;`
	`26`	`+🔥 MFTCoder-accelerate 支持使用accelerate + DeepSpeed框架下支持全量参数/QLoRA/LoRA微调。`
`27`	`27`
`28`		`-🔥 MFTCoder-accelerate 在训练中支持了多任务微调MFT, 可以同时平衡多个任务的训练,训练的模型支持多任务推理;`
	`28`	`+🔥 MFTCoder-accelerate 在训练中支持了多任务微调MFT, 可以同时平衡多个任务的训练,训练的模型支持多任务推理。`
`29`	`29`
`30`		`-🔥 MFTCoder-accelerate 在训练中支持多种模型基座: codellama, llama2, llama, starcoder, codegeex2, chatglm2, qwen等`
	`30`	`+🔥 MFTCoder-accelerate 在训练中支持多种模型基座: codellama, llama2, llama, starcoder, codegeex2, chatglm2, qwen等。`
`31`	`31`
`32`	`32`	`## 2. 数据格式`
`33`	`33`	`### 2.1 MFT训练数据格式`
`@@ -87,7 +87,7 @@`
`87`	`87`	```
`88`	`88`
`89`	`89`	`### 2.3 DPO训练数据格式`
`90`		-训练数据为jsonl格式,每一行的数据格式如下,其中chosen字段和rejected字段分别代表偏好对齐中的```chosen```和```rejected```,其内部依然是MFT的chatml格式。
	`90`	+训练数据为jsonl格式,每一行的数据格式如下,其中chosen字段和rejected字段分别代表偏好对齐中的```chosen```和```rejected```,其内部依然是MFT的chatml格式,并且只有最后一轮对话的bot content不同。
`91`	`91`	```json
`92`	`92`	`{`
`93`	`93`	`"chosen":[`
@@ -292,8 +292,8 @@ _*训练需要的参数配置在```configs/_train_config```中,主要参数
`292`	`292`	`- coba_sample_valid_num: CoBa每一步要取的valid batch数。理论上当该值等于valid batch总数量时,拟合出的收敛斜率最逼近真实情况,但考虑到计算需求,建议设置为1。`
`293`	`293`
`294`	`294`	`#### DPO 相关参数配置`
`295`		`-- xxpo: 偏好对齐方法, "dpo" 或者 "orpo".`
`296`		`-- beta: DPO beta, beta 越小,允许对齐后的dpo模型与ref模型的距离越远`
	`295`	`+- xxpo: 偏好对齐方法, "dpo" 或者 "orpo"。`
	`296`	`+- beta: DPO beta, beta 越小,允许对齐后的dpo模型与ref模型的距离越远。`
`297`	`297`	- rpo_alpha: 加到dop损失的```chosen``` NLL损失的系数,0的话就是原始DPO。
`298`	`298`	`-`
`299`	`299`	`## 4. 模型使用`
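
As a rough illustration of how the `beta` and `rpo_alpha` settings described in this hunk interact, the sketch below computes the standard DPO loss for a single preference pair from sequence log-probabilities, then adds the `chosen` NLL term scaled by `rpo_alpha`. It is not MFTCoder's implementation: `preference_loss` and its arguments are hypothetical names, and the actual xxpo module may normalize or weight the terms differently.

```python
import math


def preference_loss(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    beta: float = 0.1,
    rpo_alpha: float = 0.0,
    chosen_nll: float = 0.0,
) -> float:
    """DPO loss for one pair, plus an optional NLL term on the chosen answer."""
    # How much more the policy prefers chosen over rejected than the reference does.
    margin = (policy_chosen_logp - ref_chosen_logp) - (
        policy_rejected_logp - ref_rejected_logp
    )
    x = beta * margin

    # -log(sigmoid(x)), written in a numerically stable form.
    if x >= 0:
        dpo = math.log1p(math.exp(-x))
    else:
        dpo = -x + math.log1p(math.exp(x))

    # rpo_alpha = 0 leaves the original DPO loss unchanged.
    return dpo + rpo_alpha * chosen_nll


# With no preference margin the DPO term equals log(2).
print(preference_loss(0.0, 0.0, 0.0, 0.0))
```

Because `beta` multiplies the margin, a smaller `beta` penalizes a given divergence from the reference model less, which matches the note that a smaller value lets the aligned model move further from the ref model.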

0 commit comments