Releases: modelscope/mcore-bridge

v1.4.3

07 Jun 14:57

@Jintao-Huang Jintao-Huang

v1.4.3

a302407

Latest

New Features

Added model_type support for gemma4_unified; added multimodal support for kimi_k25.
Added the language_model_only parameter. When enabled, only the language model component is created, and only language model weights are loaded and saved.
Fixed several bugs.
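
One way to picture the effect of language_model_only on weight loading and saving is as a filter over parameter names. The sketch below is illustrative only: the key prefixes (`language_model.`, `visual.`) and the helper `filter_language_model_weights` are assumptions made for this example, not mcore-bridge's actual naming or API.

```python
from typing import Any, Dict

# Hypothetical prefix; the real parameter naming in mcore-bridge may differ.
LANGUAGE_MODEL_PREFIX = "language_model."


def filter_language_model_weights(state_dict: Dict[str, Any],
                                  language_model_only: bool) -> Dict[str, Any]:
    """Return the subset of weights that would be loaded or saved."""
    if not language_model_only:
        # Default behaviour: keep every component (vision tower, projector, ...).
        return dict(state_dict)
    # Keep only entries that belong to the language model.
    return {
        name: value
        for name, value in state_dict.items()
        if name.startswith(LANGUAGE_MODEL_PREFIX)
    }


# Toy state dict with made-up keys and values.
weights = {
    "language_model.embedding.weight": [0.1, 0.2],
    "language_model.decoder.layers.0.mlp.weight": [0.3],
    "visual.patch_embed.weight": [0.4],
}
kept = filter_language_model_weights(weights, language_model_only=True)
print(sorted(kept))
```

With the flag off, the same helper returns all three entries, so the multimodal weights are only skipped when the option is explicitly enabled.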

What's Changed

[bugfix] fix: clamp num_tokens=0 in MTP loss & add normalized scale for MTP per token loss by @YaoweiFan in #104
[bugfix] fix tie_word_embeddings by @Jintao-Huang in #105
[bugfix] fix deepseek-v4 dev branch by @Jintao-Huang in #107
[model] support gemma4_unified by @Jintao-Huang in #108
update batch_p2p_comm by @Jintao-Huang in #111
support language_model_only by @Jintao-Huang in #112
support kimi_k25 mm by @Jintao-Huang in #113
update mla rope mcore>=0.18 (0.15-0.18 compat) by @Jintao-Huang in #114
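
The num_tokens=0 clamp mentioned in #104 addresses a common failure mode: when every token in a batch is masked out, a per-token average divides by zero. The following is a minimal sketch of that guard in plain Python, not the code from the PR. The function name `masked_mean_loss` is hypothetical, and the normalized scale added in the same PR is not modeled here.

```python
from typing import Sequence


def masked_mean_loss(per_token_loss: Sequence[float],
                     loss_mask: Sequence[int]) -> float:
    """Average per-token losses over unmasked tokens (mask value 1 = counted)."""
    total = sum(loss * mask for loss, mask in zip(per_token_loss, loss_mask))
    num_tokens = sum(loss_mask)
    # Clamp the denominator to at least 1 so a fully masked batch
    # returns 0.0 instead of raising ZeroDivisionError.
    return total / max(num_tokens, 1)


print(masked_mean_loss([2.0, 4.0, 6.0], [1, 0, 1]))
print(masked_mean_loss([2.0, 4.0], [0, 0]))
```

The first call averages the two unmasked losses; the second shows the clamped case, where the sum is zero and the denominator falls back to 1.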

New Contributors

@YaoweiFan made their first contribution in #104

Full Changelog: v1.4.2...v1.4.3

Contributors

Jintao-Huang and YaoweiFan

v1.4.2

31 May 12:05

@Jintao-Huang Jintao-Huang

v1.4.2

dfb4663

New Features

Added model_type support for bailing_hybrid.
Fixed abnormal loss for olmoe/bailing_moe when TP > 1.

What's Changed

[bugfix] fix bug by @Jintao-Huang in #99
[bugfix] fix qwen3_next norm sp by @Jintao-Huang in #100
[model] Support bailing_hybrid by @Jintao-Huang in #85
refactor olmoe by @Jintao-Huang in #101
[bugfix] fix npu GDN by @Jintao-Huang in #103

Full Changelog: v1.4.1...v1.4.2

Contributors

Jintao-Huang

v1.4.1

27 May 15:23

@Jintao-Huang Jintao-Huang

v1.4.1

14cc51b

New Features

Added model_type support for gemma4 and deepseek_v4.
Added a minimal example to the README showing how to create a model with Mcore-Bridge, run a forward pass, and compute the loss.
Added compatibility with both the main and dev branches of megatron-core.

What's Changed

[model] Support gemma4 by @Jintao-Huang in #56
[docs] update readme by @Jintao-Huang in #84
compat megatron dev branch by @Jintao-Huang in #87
[model] support gemma4 padding_free by @Jintao-Huang in #88
[docs] update docs by @Jintao-Huang in #89
update gemma4 rope by @Jintao-Huang in #90
refactor MLA by @Jintao-Huang in #91
compat mtp megatron_core main branch by @Jintao-Huang in #92
[model] Support deepseek-v4 by @Jintao-Huang in #86
[bugfix] fix bugs by @Jintao-Huang in #95
[model] support deepseek v4 mtp by @Jintao-Huang in #93
Support fp4 blockwise load by @Jintao-Huang in #96
[bugfix] fix gdn conv1d by @Jintao-Huang in #97
update lora add by @Jintao-Huang in #98

Full Changelog: v1.4.0...v1.4.1

Contributors

Jintao-Huang

v1.4.0

17 May 15:50

@Jintao-Huang Jintao-Huang

v1.4.0

6a39584

New Features

Added model_type support for bailing_moe and qwen3_asr.
Qwen3-Next now runs with Mcore-GDN by default, which enables sequence packing, FP8, and CP.
Refactored transformer_block / transformer_layer with an inheritable design to simplify the integration of new models.
Added compatibility with Python 3.13.
Added LoRA weight saving and loading for MoE models whose experts are organized in grouped form in transformers. (Note: these LoRA weights cannot be loaded directly via transformers, but can be loaded via Megatron for continued training.)
Added padding_mask support, fixing an issue where moe_aux_loss incorrectly computed routing loss on padding tokens when padding_free=False.
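
To illustrate why padding tokens distort the routing loss, here is a minimal sketch of a Switch-style load-balancing loss that skips padded positions. It is written in plain Python with top-1 routing for brevity and is not the mcore-bridge or Megatron implementation. The function name, the loss formula chosen, and the convention that `padding_mask[t]` is True for padding are all assumptions.

```python
from typing import Sequence


def load_balancing_loss(router_probs: Sequence[Sequence[float]],
                        padding_mask: Sequence[bool]) -> float:
    """Auxiliary loss E * sum_e(f_e * P_e), computed over real tokens only.

    f_e is the fraction of real tokens routed to expert e (top-1), and
    P_e is the mean router probability for expert e over real tokens.
    """
    num_experts = len(router_probs[0])
    # Drop padded positions before computing any statistics.
    real = [probs for probs, is_pad in zip(router_probs, padding_mask) if not is_pad]
    if not real:
        return 0.0
    n = len(real)
    token_fraction = [0.0] * num_experts
    mean_prob = [0.0] * num_experts
    for probs in real:
        top1 = max(range(num_experts), key=lambda e: probs[e])
        token_fraction[top1] += 1.0 / n
        for e in range(num_experts):
            mean_prob[e] += probs[e] / n
    return num_experts * sum(f * p for f, p in zip(token_fraction, mean_prob))


probs = [[0.9, 0.1], [0.1, 0.9], [1.0, 0.0]]  # the last row is a padding token
masked = load_balancing_loss(probs, [False, False, True])
unmasked = load_balancing_loss(probs, [False, False, False])
print(masked, unmasked)
```

In this toy input the two real tokens are perfectly balanced, so the masked loss sits at its minimum of about 1.0. Counting the padding token makes the router look skewed toward expert 0 and raises the loss, which is the kind of spurious penalty the padding mask avoids.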

What's Changed

[bugfix] fix MTP & mcore 0.15 (NPU) by @Jintao-Huang in #67
compat python 3.13 by @Jintao-Huang in #68
compat lint py313 by @Jintao-Huang in #69
compat lint py3.13 by @Jintao-Huang in #70
[model] support bailing by @Jintao-Huang in #55
update gpt_model by @Jintao-Huang in #71
refactor transformer_block by @Jintao-Huang in #72
[bugfix] fix tie_word_embeddings by @Jintao-Huang in #74
[bugfix] fix qwen3_vl by @Jintao-Huang in #73
remove hf_grouped lora error by @Jintao-Huang in #75
[model] support qwen3_next gdn by @Jintao-Huang in #76
compat megatron.core 0.18 by @Jintao-Huang in #77
[model] support qwen3_asr by @Jintao-Huang in #78
Support padding mask by @Jintao-Huang in #79
compat peft 0.19 by @Jintao-Huang in #80
[readme] Update readme by @Jintao-Huang in #81
[docs] update readme by @Jintao-Huang in #82
[bugfix] fix minimax qk_norm sp by @Jintao-Huang in #83

Full Changelog: v1.3.0...v1.4.0

Contributors

Jintao-Huang

Patch release v1.3.2

12 May 14:41

@Jintao-Huang Jintao-Huang

v1.3.2

1878000

Full Changelog: v1.3.1...v1.3.2

Patch release v1.3.1

10 May 05:29

@Jintao-Huang Jintao-Huang

v1.3.1

a974770

Full Changelog: v1.3.0...v1.3.1

v1.3.0

07 May 02:51

@Jintao-Huang Jintao-Huang

v1.3.0

2cf4483

New Features

Added model_type support for kimi_k25, hy_v3, and llava_onevision.
mlp_padding_free is now compatible with Sequence Parallelism.
Dropped support for megatron-core versions 0.12 through 0.14.
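
For downstream code that needs to fail early on an unsupported megatron-core, a small version gate can help. The sketch below is a hedged illustration, not part of mcore-bridge: the helper names are hypothetical, and the minimum of 0.15 is an assumption inferred from the dropped 0.12 through 0.14 range. It parses the version string by hand so it has no third-party dependencies.

```python
import re
from typing import Tuple

# Assumed floor, inferred from the dropped 0.12 - 0.14 range.
MIN_SUPPORTED: Tuple[int, int] = (0, 15)


def parse_major_minor(version: str) -> Tuple[int, int]:
    """Extract (major, minor) from strings such as '0.15.0' or '0.18.0rc1'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"Unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def is_supported(version: str) -> bool:
    """Return True when the version is at or above the assumed minimum."""
    return parse_major_minor(version) >= MIN_SUPPORTED


for candidate in ("0.14.0", "0.15.0rc1", "0.18.0"):
    print(candidate, is_supported(candidate))
```

In practice the version string would come from the installed package, for example via `importlib.metadata.version("megatron-core")`, assuming that is the distribution name in use.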

What's Changed

[docs] update readme by @Jintao-Huang in #49
update requirements by @Jintao-Huang in #51
npu qwen3.5 megatron padding_free fix by @addsubmuldiv in #50
[model] support kimi_k25 by @Jintao-Huang in #52
[model] support hy_v3 by @Jintao-Huang in #53
Add support for LLaVA-OneVision-1.5 model by @randydl in #54
[bugfix] fix torch_dtype by @Jintao-Huang in #57
fix qwen3_next by @Jintao-Huang in #58
remove mcore0.12-mcore0.14 by @Jintao-Huang in #59
fix kwargs by @Jintao-Huang in #61
[megatron] support mlp_padding_free & sp; refactor TransformerLayer by @Jintao-Huang in #62
[bugfix] fix gather_from_sp by @Jintao-Huang in #63
update transformers by @Jintao-Huang in #65
update requirements by @Jintao-Huang in #66

New Contributors

@randydl made their first contribution in #54

Full Changelog: v1.2.0...v1.3.0

Contributors

addsubmuldiv, randydl, and Jintao-Huang

Patch release v1.2.3

05 May 13:51

@Jintao-Huang Jintao-Huang

v1.2.3

f0c59b2

Full Changelog: v1.2.2...v1.2.3

Patch release v1.2.2

04 May 09:52

@Jintao-Huang Jintao-Huang

v1.2.2

a1b6973

Full Changelog: v1.2.1...v1.2.2

Patch release v1.2.1

25 Apr 06:46

@Jintao-Huang Jintao-Huang

v1.2.1

9b6b69c

Full Changelog: v1.2.0...v1.2.1
