Releases: turboderp-org/exllamav2
0.3.2
Actions: Remove sentencepiece from other workflows
Signed-off-by: kingbri <8082010+kingbri1@users.noreply.github.com>
0.3.1
v0.3.1 Merge branch 'dev'
0.3.0
- Add Qwen3 and Qwen3MoE support
Full Changelog: v0.2.9...v0.3.0
0.2.9
- Add Torch 2.7.0 wheels (big thanks to @kingbri1 for unborking the build action)
- Support Gemma3, text + vision
- Support Mistral 3.1, text + vision
- Support partial_rotary_factor (Phi-4 mini etc.)
- Support GLM4 (32B model still broken)
- Various fixes
Full Changelog: v0.2.8...v0.2.9
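For readers unfamiliar with partial_rotary_factor, the idea can be sketched roughly as follows: rotary position embedding is applied only to the leading fraction of each head's dimensions, and the remaining dimensions pass through unchanged. This is a minimal illustration in plain Python, not exllamav2's implementation. The function name `apply_partial_rope`, the default `base`, and the half-split pairing of dimensions are all assumptions made for the example.

```python
import math

def apply_partial_rope(x, position, partial_rotary_factor=0.5, base=10000.0):
    """Rotate only the leading part of one head vector `x` (hypothetical helper).

    Assumes the half-split pairing convention: element i is paired with
    element i + rotary_dim // 2. Real models may pair dimensions differently.
    """
    head_dim = len(x)
    rotary_dim = int(head_dim * partial_rotary_factor)
    rotary_dim -= rotary_dim % 2          # rotation works on pairs, so keep it even
    half = rotary_dim // 2

    out = list(x)                         # dimensions >= rotary_dim are left untouched
    for i in range(half):
        theta = position * base ** (-2.0 * i / rotary_dim)
        c, s = math.cos(theta), math.sin(theta)
        a, b = x[i], x[i + half]
        out[i] = a * c - b * s
        out[i + half] = a * s + b * c
    return out

# Example: an 8-dim head with factor 0.5 rotates dims 0..3 and keeps dims 4..7.
rotated = apply_partial_rope([1.0, 0.0, 0.0, 1.0, 5.0, 6.0, 7.0, 8.0], position=3)
```

With `partial_rotary_factor=1.0` the sketch reduces to ordinary rotary embedding over the whole head.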
0.2.8
- Support Qwen2.5-VL
- Minor bugfixes
Full Changelog: v0.2.7...v0.2.8
0.2.7
- Basic video support for Qwen2-VL
- Support Cohere2 arch
- Support Granite3 arch
- Couple of bugfixes
Full Changelog: v0.2.6...v0.2.7
0.2.6
- Some small fixes, most notably for Qwen2-VL inference on Windows
Full Changelog: v0.2.5...v0.2.6
0.2.5
- Initial support for Qwen2-VL (images for now, no video)
- Some bugfixes
Full Changelog: v0.2.4...v0.2.5
0.2.4
- Support Pixtral
- Refactoring for more multimodal support
- Faster filter evaluation
- Various optimizations and bugfixes
- Various quality of life improvements
Full Changelog: v0.2.3...v0.2.4
0.2.3
- No longer use safetensors for loading weights (fix virtual memory issues on Windows especially)
- Disable fasttensors option (now redundant)
- Prioritize HF Tokenizers model when both HF and SPM models available
- Add XTC sampler
- Add YaRN support
- Various fixes and QoL improvements
Full Changelog: v0.2.2...v0.2.3
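As background on the XTC ("Exclude Top Choices") sampler mentioned above, the general technique can be sketched like this: on a random fraction of sampling steps, every token whose probability meets a threshold is removed except the least likely of them, and the distribution is renormalized. The sketch below is a plain-Python illustration of that idea only. The function name `xtc_filter` and its parameter names and defaults are hypothetical and do not reflect exllamav2's actual sampler settings or code.

```python
import random

def xtc_filter(probs, threshold=0.1, probability=0.5, rng=random):
    """Return a renormalized copy of `probs` with XTC applied (hypothetical helper)."""
    # XTC only triggers on a random fraction of steps.
    if rng.random() >= probability:
        return list(probs)

    # Tokens considered "top choices" are those at or above the threshold.
    above = [i for i, p in enumerate(probs) if p >= threshold]
    if len(above) < 2:
        return list(probs)                # nothing to exclude with fewer than two

    # Keep the least probable top choice, drop the others.
    keep = min(above, key=lambda i: probs[i])
    dropped = set(above) - {keep}
    out = [0.0 if i in dropped else p for i, p in enumerate(probs)]

    total = sum(out)
    return [p / total for p in out]

# Example: with probability=1.0 the filter always triggers.
filtered = xtc_filter([0.5, 0.3, 0.15, 0.05], threshold=0.1, probability=1.0)
```

In the example, the tokens at 0.5 and 0.3 are excluded, and the remaining mass is shared between the 0.15 and 0.05 tokens after renormalization.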