✏️ Hands-on LLM Fine-Tuning Project

Train a legal LLM on Colab step by step, based on microsoft/phi-1_5 and ChatGLM3-6B. Through this project you can get hands-on fine-tuning experience at zero cost. If you want to see how LLM fine-tuning is implemented in code, see the my_finetune project 🤓.

| Name | Colab | Datasets |
| --- | --- | --- |
| Self-cognition LoRA-SFT fine-tuning | Colab | self_cognition.json |
| Legal Q&A LoRA-SFT fine-tuning | Colab | DISC-LawLLM |
| Legal Q&A full-parameter SFT fine-tuning* | Colab | DISC-LawLLM |
| ChatGLM3-6B self-cognition LoRA-SFT fine-tuning* | Colab | self_cognition.json |

*If you are a Colab Pro subscriber, you can try full-parameter SFT fine-tuning on the high-RAM + T4 runtime; 1000 samples take roughly 20+ hours.
*If you are a Colab Pro subscriber, the ChatGLM3-6B self-cognition LoRA-SFT fine-tuning on high-RAM + T4 takes only a few minutes and gives fairly good results.

Goal

Use Colab's free T4 GPU to run instruction-supervised fine-tuning (SFT) of the microsoft/phi-1_5 model on legal Q&A data.

Self-cognition fine-tuning

Self-cognition data source: self_cognition.json

80 samples; LoRA fine-tuning of phi-1_5 on a T4 finishes in a few minutes.
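The dataset uses LLaMA-Factory's Alpaca-style fields; a hedged sketch of what a single record roughly looks like (field names follow the Alpaca format, and the <NAME>/<AUTHOR> placeholders are assumptions that you replace with your own identity before training):

{
  "instruction": "你好,请介绍一下你自己",
  "input": "",
  "output": "您好,我是 <NAME>,一个由 <AUTHOR> 训练的法律问答助手。"
}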

Fine-tuning parameters; see the Colab notebook for the detailed steps:

python src/train_bash.py \
 --stage sft \
 --model_name_or_path microsoft/phi-1_5 \
 --do_train True \
 --finetuning_type lora \
 --template vanilla \
 --flash_attn False \
 --shift_attn False \
 --dataset_dir data \
 --dataset self_cognition \
 --cutoff_len 1024 \
 --learning_rate 2e-04 \
 --num_train_epochs 20.0 \
 --max_samples 1000 \
 --per_device_train_batch_size 6 \
 --per_device_eval_batch_size 6 \
 --gradient_accumulation_steps 1 \
 --lr_scheduler_type cosine \
 --max_grad_norm 1.0 \
 --logging_steps 5 \
 --save_steps 100 \
 --warmup_steps 0 \
 --neft_alpha 0 \
 --train_on_prompt False \
 --upcast_layernorm False \
 --lora_rank 8 \
 --lora_dropout 0.1 \
 --lora_target Wqkv \
 --resume_lora_training True \
 --output_dir saves/Phi1.5-1.3B/lora/my \
 --fp16 True \
 --plot_loss True
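
To try the result outside LLaMA-Factory, the adapter can be loaded with transformers + peft; a minimal sketch, assuming the adapter was written to saves/Phi1.5-1.3B/lora/my as in the command above:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = "microsoft/phi-1_5"
adapter = "saves/Phi1.5-1.3B/lora/my"  # output_dir from the command above

tokenizer = AutoTokenizer.from_pretrained(base, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    base, torch_dtype=torch.float16, trust_remote_code=True, device_map="auto"
)
model = PeftModel.from_pretrained(model, adapter)  # attach the LoRA weights

# The template is "vanilla", so a plain prompt is enough
inputs = tokenizer("你是谁?", return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(out[0], skip_special_tokens=True))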

Results

Legal Q&A fine-tuning

Legal Q&A data source: DISC-LawLLM
To save GPU memory, DeepSpeed ZeRO stage 2 is used; cutoff_len can go up to 1792 at most, any longer and the GPU runs out of memory.
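
Before training, the DISC-LawLLM data has to be registered in LLaMA-Factory's data/dataset_info.json under the law_sft_triplet name used below; a hedged sketch of such an entry (the file name and column mapping are assumptions, adjust them to the actual exported file):

{
  "law_sft_triplet": {
    "file_name": "law_sft_triplet.json",
    "columns": {
      "prompt": "instruction",
      "query": "input",
      "response": "output"
    }
  }
}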

DeepSpeed configuration

{
  "train_batch_size": "auto",
  "train_micro_batch_size_per_gpu": "auto",
  "gradient_accumulation_steps": "auto",
  "gradient_clipping": "auto",
  "zero_allow_untested_optimizer": true,
  "fp16": {
    "enabled": "auto",
    "loss_scale": 0,
    "initial_scale_power": 16,
    "loss_scale_window": 1000,
    "hysteresis": 2,
    "min_loss_scale": 1
  },
  "zero_optimization": {
    "stage": 2,
    "offload_optimizer": {
      "device": "cpu",
      "pin_memory": true
    },
    "allgather_partitions": true,
    "allgather_bucket_size": 2e8,
    "reduce_scatter": true,
    "reduce_bucket_size": 2e8,
    "overlap_comm": false,
    "contiguous_gradients": true
  }
}

Fine-tuning parameters

1000 samples take roughly 60 minutes on a T4.

deepspeed --num_gpus 1 --master_port=9901 src/train_bash.py \
 --deepspeed ds_config.json \
 --stage sft \
 --model_name_or_path microsoft/phi-1_5 \
 --do_train True \
 --finetuning_type lora \
 --template vanilla \
 --flash_attn False \
 --shift_attn False \
 --dataset_dir data \
 --dataset self_cognition,law_sft_triplet \
 --cutoff_len 1792 \
 --learning_rate 2e-04 \
 --num_train_epochs 5.0 \
 --max_samples 1000 \
 --per_device_train_batch_size 1 \
 --per_device_eval_batch_size 1 \
 --gradient_accumulation_steps 1 \
 --lr_scheduler_type cosine \
 --max_grad_norm 1.0 \
 --logging_steps 5 \
 --save_steps 1000 \
 --warmup_steps 0 \
 --neft_alpha 0 \
 --train_on_prompt False \
 --upcast_layernorm False \
 --lora_rank 8 \
 --lora_dropout 0.1 \
 --lora_target Wqkv \
 --resume_lora_training True \
 --output_dir saves/Phi1.5-1.3B/lora/law \
 --fp16 True \
 --plot_loss True
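
For deployment, the legal-QA adapter can be merged back into the base weights so the result loads as a plain transformers model; a minimal sketch with peft, assuming the adapter landed in saves/Phi1.5-1.3B/lora/law as above (the merged output directory name is arbitrary):

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = "microsoft/phi-1_5"
adapter = "saves/Phi1.5-1.3B/lora/law"  # output_dir from the command above

model = AutoModelForCausalLM.from_pretrained(base, torch_dtype=torch.float16, trust_remote_code=True)
model = PeftModel.from_pretrained(model, adapter)
merged = model.merge_and_unload()  # fold the LoRA deltas into the base weights
merged.save_pretrained("phi-1_5-law-merged")
AutoTokenizer.from_pretrained(base, trust_remote_code=True).save_pretrained("phi-1_5-law-merged")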

Full-parameter fine-tuning

You can use estimate_zero3_model_states_mem_needs_all_live to check how much memory the different DeepSpeed ZeRO configurations need.

# Estimate the CPU/GPU memory needed for ZeRO-3 under the different offload options
from transformers import AutoModelForCausalLM
from deepspeed.runtime.zero.stage3 import estimate_zero3_model_states_mem_needs_all_live

model_name = "microsoft/phi-1_5"
model = AutoModelForCausalLM.from_pretrained(model_name, trust_remote_code=True)
estimate_zero3_model_states_mem_needs_all_live(model, num_gpus_per_node=1, num_nodes=1)

As the estimator output shows, with offload_optimizer -> cpu, microsoft/phi-1_5 needs about 32 GB of CPU RAM; Colab's high-RAM runtime provides 52 GB, which is enough.
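
The configuration below actually runs ZeRO stage 2 with optimizer offload, and DeepSpeed ships a matching stage-2 estimator; a hedged sketch (the import path follows DeepSpeed's memory-estimation docs, verify it against your installed version):

from transformers import AutoModelForCausalLM
from deepspeed.runtime.zero.stage_1_and_2 import estimate_zero2_model_states_mem_needs_all_live

# Prints CPU/GPU memory estimates for ZeRO-2, with and without optimizer offload
model = AutoModelForCausalLM.from_pretrained("microsoft/phi-1_5", trust_remote_code=True)
estimate_zero2_model_states_mem_needs_all_live(model, num_gpus_per_node=1, num_nodes=1)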

DeepSpeed configuration

{
  "train_batch_size": "auto",
  "train_micro_batch_size_per_gpu": "auto",
  "gradient_accumulation_steps": "auto",
  "gradient_clipping": "auto",
  "zero_allow_untested_optimizer": true,
  "fp16": {
    "enabled": "auto",
    "loss_scale": 0,
    "initial_scale_power": 16,
    "loss_scale_window": 1000,
    "hysteresis": 2,
    "min_loss_scale": 1
  },
  "zero_optimization": {
    "stage": 2,
    "offload_optimizer": {
      "device": "cpu",
      "pin_memory": true
    },
    "allgather_partitions": true,
    "allgather_bucket_size": 2e8,
    "reduce_scatter": true,
    "reduce_bucket_size": 2e8,
    "overlap_comm": false,
    "contiguous_gradients": true
  }
}
Fine-tuning parameters

deepspeed --num_gpus 1 --master_port=9901 src/train_bash.py \
 --deepspeed ds_config.json \
 --stage sft \
 --model_name_or_path microsoft/phi-1_5 \
 --do_train True \
 --finetuning_type full \
 --template vanilla \
 --flash_attn False \
 --shift_attn False \
 --dataset_dir data \
 --dataset self_cognition,law_sft_triplet \
 --cutoff_len 1024 \
 --learning_rate 2e-04 \
 --num_train_epochs 10.0 \
 --max_samples 1000 \
 --per_device_train_batch_size 1 \
 --per_device_eval_batch_size 1 \
 --gradient_accumulation_steps 1 \
 --lr_scheduler_type cosine \
 --max_grad_norm 1.0 \
 --logging_steps 5 \
 --save_steps 1000 \
 --warmup_steps 0 \
 --neft_alpha 0 \
 --train_on_prompt False \
 --upcast_layernorm False \
 --lora_rank 8 \
 --lora_dropout 0.1 \
 --lora_target Wqkv \
 --resume_lora_training True \
 --output_dir saves/Phi1.5-1.3B/lora/law_full \
 --fp16 True \
 --plot_loss True

You can also consider Kaggle, which offers about 30 GPU hours per week and lets you attach 2× T4 GPUs, enough to run full-parameter fine-tuning with ZeRO stage 3.

DeepSpeed configuration

{
  "train_batch_size": "auto",
  "train_micro_batch_size_per_gpu": "auto",
  "gradient_accumulation_steps": "auto",
  "gradient_clipping": "auto",
  "zero_allow_untested_optimizer": true,
  "fp16": {
    "enabled": "auto",
    "loss_scale": 0,
    "initial_scale_power": 16,
    "loss_scale_window": 1000,
    "hysteresis": 2,
    "min_loss_scale": 1
  },
  "zero_optimization": {
    "stage": 3,
    "overlap_comm": false,
    "contiguous_gradients": true,
    "sub_group_size": 5e7,
    "reduce_bucket_size": "auto",
    "stage3_prefetch_bucket_size": "auto",
    "stage3_param_persistence_threshold": "auto",
    "stage3_max_live_parameters": 5e7,
    "stage3_max_reuse_distance": 5e7,
    "stage3_gather_16bit_weights_on_model_save": true
  }
}
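
On Kaggle the launch command is essentially the same as the full-parameter run above, just with two GPUs; a hedged sketch (ds_config_zero3.json and the output_dir are illustrative names, the remaining flags mirror the command above):

deepspeed --num_gpus 2 --master_port=9901 src/train_bash.py \
 --deepspeed ds_config_zero3.json \
 --stage sft \
 --model_name_or_path microsoft/phi-1_5 \
 --do_train True \
 --finetuning_type full \
 --template vanilla \
 --dataset_dir data \
 --dataset self_cognition,law_sft_triplet \
 --cutoff_len 1024 \
 --learning_rate 2e-04 \
 --num_train_epochs 10.0 \
 --max_samples 1000 \
 --per_device_train_batch_size 1 \
 --per_device_eval_batch_size 1 \
 --gradient_accumulation_steps 1 \
 --lr_scheduler_type cosine \
 --output_dir saves/Phi1.5-1.3B/full/law_kaggle \
 --fp16 True \
 --plot_loss True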
