-
Notifications
You must be signed in to change notification settings - Fork 15
Open
@joez17
Description
It seems that in Octivius, lora-moe uses conversation[0]['value'] to obtain the soft_gate value.
image
There are 2 questions:
1 Where are the system message and modality embedding introduced into gate activation?
image
2. In the case of multi-turn dialogues, incorporating only the initial question for gate computation throughout the entire conversation seems illogical.
Could there be aspects I'm misunderstanding? Please help clarify my confusion. Thanks!
Metadata
Metadata
Assignees
Labels
No labels