Open
@a32543254
Description
Your current environment
The output of python collect_env.py
Your output of `python collect_env.py` here
🐛 Describe the bug
The vLLM DeepSeek-V3 RoPE implementation is here:
https://github.com/vllm-project/vllm/blob/main/vllm/model_executor/layers/rotary_embedding/deepseek_scaling_rope.py
However, I checked both the Transformers DeepSeek-V3 modeling code and the DeepSeek modeling code on the Hugging Face Hub,
and both of them transpose q_rope and k_rope.
Please see the links below:
Transformers DeepSeek-V3 modeling:
https://github.com/huggingface/transformers/blob/e20df45bf676d80bdddb9757eeeafe6c0c81ecfa/src/transformers/models/deepseek_v3/modeling_deepseek_v3.py#L283
Hugging Face Hub DeepSeek modeling:
https://huggingface.co/deepseek-ai/DeepSeek-R1/blob/main/modeling_deepseek.py#L339
Here is the code that transposes q in the DeepSeek RoPE:
q = q.view(b, h, s, d // 2, 2).transpose(4, 3).reshape(b, h, s, d)
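To make the effect of that line concrete, here is a minimal pure-Python sketch of what the transpose does to the last dimension: it regroups interleaved pairs `[x0, y0, x1, y1, ...]` into a half-split layout `[x0, x1, ..., y0, y1, ...]`. The helper names (`deinterleave`, `rope_interleaved`, `rope_half_split`) and the `base=10000.0` default are assumptions made for illustration and are not taken from vLLM or Transformers. The sketch also checks that rotating adjacent pairs and rotating half-split pairs agree up to this same permutation; it does not check which convention vLLM actually uses.

```python
import math

def deinterleave(x):
    """List equivalent of x.view(d // 2, 2).transpose(1, 0).reshape(d) on the last dim."""
    return x[0::2] + x[1::2]

def rope_interleaved(x, pos, base=10000.0):
    """Rotate adjacent pairs (x[2i], x[2i+1]) by the angle pos * base**(-2i/d)."""
    d = len(x)
    out = [0.0] * d
    for i in range(d // 2):
        theta = pos * base ** (-2.0 * i / d)
        c, s = math.cos(theta), math.sin(theta)
        a, b = x[2 * i], x[2 * i + 1]
        out[2 * i] = a * c - b * s
        out[2 * i + 1] = a * s + b * c
    return out

def rope_half_split(x, pos, base=10000.0):
    """Rotate pairs (x[i], x[i + d/2]) by the same angles (rotate_half convention)."""
    d = len(x)
    half = d // 2
    out = [0.0] * d
    for i in range(half):
        theta = pos * base ** (-2.0 * i / d)
        c, s = math.cos(theta), math.sin(theta)
        a, b = x[i], x[i + half]
        out[i] = a * c - b * s
        out[i + half] = a * s + b * c
    return out

x = [float(v) for v in range(8)]
print(deinterleave(x))  # [0.0, 2.0, 4.0, 6.0, 1.0, 3.0, 5.0, 7.0]

# Transposing first and then applying the half-split rotation gives the same
# values as the interleaved rotation, only stored in the permuted order.
lhs = rope_half_split(deinterleave(x), pos=3)
rhs = deinterleave(rope_interleaved(x, pos=3))
assert all(math.isclose(a, b, abs_tol=1e-9) for a, b in zip(lhs, rhs))
```

Under these assumptions the two conventions produce the same rotated values in a different element order, so any mismatch would come from applying the half-split rotation to a layout that was never transposed, or from comparing outputs that use different orderings.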