Welcome to Minami-su's GitHub home.
GitHub: github.com/Minami-su
Huggingface: huggingface.co/Minami-su
Generate multi-round conversation roleplay data based on self-instruct and evol-instruct.
This repository, deepspeed-grpo-qlora-vllm, provides a complete framework for fine-tuning LLMs using Group Relative Policy Optimization (GRPO) on 4-bit quantized models (QLoRA). It utilizes DeepSpe...
Python 13
Forked from tomaarsen/attention_sinks
attention_sinks with AutoGPTQ support; works with every model that AutoGPTQ supports, such as Qwen and Baichuan.
Python 1