This repository was archived by the owner on Sep 10, 2025. It is now read-only.
 
 
 
 
Slimming down torchchat: Replace replace_attention_with_custom_sdpa_attention() with ET's implementation #1058
Open
@Jack-Khuu 
Description
🚀 The feature, motivation and pitch
First surfaced in #1057: the replace_attention_with_custom_sdpa_attention() function, which torchchat uses when exporting models, can be replaced with the equivalent API provided by ExecuTorch: https://github.com/pytorch/executorch/blob/main/examples/models/llama2/source_transformation/sdpa.py
Task: Swap the torchchat implementation for ExecuTorch's, then delete the now-defunct code from torchchat.
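For readers unfamiliar with this kind of source transformation, here is a minimal sketch of the general pattern such a function follows: walk the module tree and swap each attention layer for an export-friendly variant. It is an illustration under stated assumptions, not torchchat's or ExecuTorch's actual code. `Node`, `Attention`, `CustomSDPAAttention`, `Block`, `Model`, and `replace_attention` are all hypothetical stand-ins, written in plain Python so the sketch runs without PyTorch or ExecuTorch installed. `Node.named_children()` only mimics the method of the same name on `torch.nn.Module`.

```python
# Sketch only: every class and function here is a hypothetical stand-in,
# not the torchchat or ExecuTorch API.


class Node:
    """Minimal stand-in for torch.nn.Module: child nodes live as attributes."""

    def named_children(self):
        # Yield (attribute name, child) pairs for direct children only.
        # Snapshot the items so callers may reassign attributes while iterating.
        for name, value in list(vars(self).items()):
            if isinstance(value, Node):
                yield name, value


class Attention(Node):
    """Stand-in for the attention layer used before export."""

    def __init__(self, dim):
        self.dim = dim


class CustomSDPAAttention(Node):
    """Stand-in for the export-friendly replacement layer."""

    def __init__(self, original):
        # Carry over the configuration of the layer being replaced.
        self.dim = original.dim


class Block(Node):
    def __init__(self, dim):
        self.attention = Attention(dim)


class Model(Node):
    def __init__(self, dim, n_layers):
        for i in range(n_layers):
            setattr(self, f"layer_{i}", Block(dim))


def replace_attention(module):
    """Recursively swap every Attention child for CustomSDPAAttention, in place."""
    for name, child in module.named_children():
        if isinstance(child, Attention):
            setattr(module, name, CustomSDPAAttention(child))
        else:
            replace_attention(child)
    return module


model = replace_attention(Model(dim=8, n_layers=2))
print(type(model.layer_0.attention).__name__)  # CustomSDPAAttention
```

If torchchat and ExecuTorch both follow this shape, the swap proposed here would mostly consist of importing the transformation from the linked sdpa.py, calling it at the same point in the export path, and removing the local copy. The exact entry point and its signature should be checked against the ExecuTorch source.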
Alternatives
No response
Additional context
No response
RFC (Optional)
No response