Releases: DeepLink-org/dlinfer
bump version to 0.2.8
@jinminxi104 released this 16 Apr 11:24, commit 1a391de
This commit was created on GitHub.com and signed with GitHub’s verified signature.
What's Changed
- rm unused lmdeploy patch by @jinminxi104 in #310
- fix prefix caching by @yao-fengchen in #307
- fix PA param list by @jinminxi104 in #314
- [Ascend] support qwen3next by @wanfengcxz in #304
- remove unused import by @jinminxi104 in #317
- fix ci by @jinminxi104 in #318
- add mix_prefill test case by @yao-fengchen in #320
- fix re-capture in RL by @jinminxi104 in #319
- optimize rope and remove unused l2norm_fwd by @wanfengcxz in #316
- fix state dtype and refactor gdn kernel by @wanfengcxz in #321
- Bump028 by @jinminxi104 in #323
Full Changelog: v0.2.7...v0.2.8
bump version to 0.2.7
@jinminxi104 released this 02 Apr 06:19, commit 689f5a0
What's Changed
- support ep by @yao-fengchen in #237
- adapt for s1-pro dp*tp_ep by @yao-fengchen in #305
- Copilot/add all routed experts support by @jinminxi104 in #293
- fix padding by @jinminxi104 in #306
- Bump version to 0.2.7 by @jinminxi104 in #309
Full Changelog: v0.2.6...v0.2.7
v0.2.6
@jinminxi104 released this 05 Feb 06:16, commit e19b12c
What's Changed
- [ascend] fix paged_prefill and use max_batchsize as max capture size by @tangzhiyi11 in #294
- [ascend] update multinode doc by @tangzhiyi11 in #296
- Fix refact by @jinminxi104 in #297
- Patch rotary_embedding builder and DlinferSoftmaxTopKImpl by @tangzhiyi11 in #301
- patch profile npu by @jinminxi104 in #300
- [fix]fix camb run qwen vl by @huaxiaofen in #298
- fix neg seqlen by @jinminxi104 in #302
- bump to 0.2.6 by @jinminxi104 in #303
New Contributors
- @huaxiaofen made their first contribution in #298
Full Changelog: v0.2.5...v0.2.6
bump version to 0.2.5
@jinminxi104 released this 06 Jan 14:53, commit dcad072
What's Changed
- [Maca] Fix qwenvl and internvl using mcoplib by @wanfengcxz in #279
- refactor code by @yao-fengchen in #282
- [ascend] patch dptp moe by @tangzhiyi11 in #285
- fix qwen vl mask temporarily by @yao-fengchen in #286
- [ascend] support w8a8 using torch_npu ops by @wanfengcxz in #284
- [ascend] disable compile dicp and use torch_npu replay by @tangzhiyi11 in #287
- fix qwen vl vision part flash-attention by @yao-fengchen in #288
- [ascend] add multinodes doc by @tangzhiyi11 in #290
- Copilot/add patch for ascend in lmdeploy by @jinminxi104 in #291
- Copilot/update capture batch size logic by @jinminxi104 in #292
- Copilot/bump version to 025 by @jinminxi104 in #289
New Contributors
- @wanfengcxz made their first contribution in #279
Full Changelog: v0.2.4...v0.2.5
Contributors
tangzhiyi11, jinminxi104, and 2 other contributors
bump version to 0.2.4
@jinminxi104 released this 25 Dec 11:44, commit 1c6abaf
v0.2.4 bump version to 0.2.4 (#283)
dlinfer release v0.2.3.post2
@jinminxi104 released this 04 Nov 11:42, commit 4745c92
- Support Huawei A3
- Support Huawei aclgraph
dlinfer release v0.2.2
@jinminxi104 released this 09 Sep 08:16, commit a99f77e
- Huawei
  - Support the Qwen series on A3
  - Fix a memory leak in graph mode
- MetaX
  - Refine the code and adapt to the latest software stack
dlinfer release v0.2.1.post2
@jinminxi104 released this 14 Jun 09:48, commit 5a703d6
Huawei
- Use ray by default for multi-card runs; fix stability issues
- Other bug fixes
v0.1.8
@jinminxi104 released this 16 Apr 10:18, commit 8784ce6
Huawei
- MoE optimization
What's Changed
- revert ci by @jinminxi104 in #198
- [ascend]remove attention patch in lmdeploy by @yao-fengchen in #194
- [feat]support ascend w8a8 graph_mode by @yao-fengchen in #191
- Refactor rope by @yao-fengchen in #199
- change ci setting by @jinminxi104 in #202
- fix ascend awq by @yao-fengchen in #204
- [ascend]Optimize moe by @yao-fengchen in #203
- Bump018 by @jinminxi104 in #209
Full Changelog: v0.1.7...v0.1.8
dlinfer release v0.1.7
@jinminxi104 released this 16 Apr 10:16, commit fafa8eb
Huawei & MetaX
- Support multi-node
- Support mla and dsv2 (DeepSeek-V2)
What's Changed
- [maca] support deepseekv2 for maca backend. by @Reinerzhou in #133
- [camb]fix fused_moe param order by @JackWeiw in #173
- [camb]add w8a8 support by @JackWeiw in #176
- [ascend] add new_empty op and opt reshape ops by @tangzhiyi11 in #178
- fix glm-4v by @jinminxi104 in #184
- revert ci by @jinminxi104 in #187
- default enable npu launch by @jinminxi104 in #188
- [ascend] support multi nodes by @tangzhiyi11 in #190
- [maca] support multinode for maca. by @Reinerzhou in #192
- [ascend] remove allreduce group in eager mode by @tangzhiyi11 in #193
- [ascend]support mla by @yao-fengchen in #182
- bump version to 0.1.7 by @jinminxi104 in #195
Full Changelog: v0.1.6...v0.1.7
Contributors
tangzhiyi11, jinminxi104, and 3 other contributors