Releases: modelscope/mcore-bridge

v1.4.3

07 Jun 14:57
@Jintao-Huang Jintao-Huang

New Features

  1. Added model_type support for gemma4_unified; added multimodal support for kimi_k25.
  2. Added the language_model_only parameter. When enabled, only the language model component is created, and only language-model weights are loaded and saved.
  3. Fixed several bugs.
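
To illustrate the idea behind language_model_only, here is a minimal sketch of keeping only language-model entries from a weight dictionary. It is not Mcore-Bridge's implementation: the function name `select_language_model_weights`, the `language_model.` key prefix, and the example keys are all hypothetical and chosen only for illustration.

```python
def select_language_model_weights(state_dict, prefix="language_model."):
    """Return only the entries whose key starts with `prefix`.

    Both the function and the default prefix are hypothetical; real
    checkpoints may name their language-model weights differently.
    """
    return {key: value for key, value in state_dict.items() if key.startswith(prefix)}


# Toy checkpoint with made-up keys: two language-model tensors, one vision tensor.
weights = {
    "language_model.embed.weight": [0.1, 0.2],
    "language_model.layers.0.mlp.weight": [0.3],
    "vision_tower.patch_embed.weight": [0.4],
}

lm_only = select_language_model_weights(weights)
print(sorted(lm_only))
```

Filtering by key prefix is just one way to separate components; the release note itself does not say how the library decides which weights belong to the language model.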

Full Changelog: v1.4.2...v1.4.3

Contributors

Jintao-Huang and YaoweiFan

v1.4.2

31 May 12:05
@Jintao-Huang Jintao-Huang

New Features

  1. Added model_type support for bailing_hybrid.
  2. Fixed abnormal loss values for olmoe/bailing_moe when TP > 1.

Full Changelog: v1.4.1...v1.4.2

Contributors

Jintao-Huang

v1.4.1

27 May 15:23
@Jintao-Huang Jintao-Huang

New Features

  1. Added model_type support for gemma4 and deepseek_v4.
  2. Added a minimal example to the README showing how to create a model with Mcore-Bridge, run a forward pass, and compute the loss.
  3. Compatible with both the main and dev branches of megatron-core.

Full Changelog: v1.4.0...v1.4.1

Contributors

Jintao-Huang

v1.4.0

17 May 15:50
@Jintao-Huang Jintao-Huang

New Features

  1. Added model_type support for bailing_moe and qwen3_asr.
  2. Qwen3-Next now runs with Mcore-GDN by default, which enables sequence packing, FP8, and CP.
  3. Refactored transformer_block / transformer_layer into an inheritable design to simplify the integration of new models.
  4. Added compatibility with Python 3.13.
  5. Added support for saving and loading LoRA weights for MoE models whose experts are organized in grouped mode in transformers. (Note: these LoRA weights cannot be loaded directly via transformers, but they can be loaded via Megatron for continued training.)
  6. Added padding_mask support, fixing an issue where moe_aux_loss incorrectly computed routing loss on padding tokens when padding_free=False.
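
The padding_mask fix in item 6 can be pictured with a small, self-contained sketch of a load-balancing auxiliary loss that skips padding tokens. This is not the library's code: it assumes a Switch-style loss (number of experts times the sum over experts of token fraction times mean router probability) with top-1 routing, and it assumes a mask convention where True marks a padding token. The function name `load_balancing_loss` is hypothetical.

```python
def load_balancing_loss(router_probs, padding_mask=None):
    """Switch-style auxiliary loss: num_experts * sum_e(f_e * p_e).

    router_probs: one list of per-expert probabilities for each token.
    padding_mask: optional list of bools; True marks a padding token
                  (assumed convention, not taken from the library).
    """
    num_experts = len(router_probs[0])
    if padding_mask is None:
        padding_mask = [False] * len(router_probs)

    # Drop padding tokens so they contribute to neither f_e nor p_e.
    real = [p for p, is_pad in zip(router_probs, padding_mask) if not is_pad]
    if not real:
        return 0.0

    # Top-1 expert index for every real token.
    top1 = [max(range(num_experts), key=p.__getitem__) for p in real]

    loss = 0.0
    for e in range(num_experts):
        f_e = top1.count(e) / len(real)           # fraction of tokens routed to expert e
        p_e = sum(p[e] for p in real) / len(real)  # mean router probability for expert e
        loss += f_e * p_e
    return num_experts * loss


# Two real tokens, evenly split, followed by two padding tokens that all favor expert 0.
probs = [[0.9, 0.1], [0.1, 0.9], [0.9, 0.1], [0.9, 0.1]]
mask = [False, False, True, True]

masked = load_balancing_loss(probs, mask)
unmasked = load_balancing_loss(probs)
print(masked, unmasked)
```

In this toy batch the real tokens are perfectly balanced, yet the unmasked loss is higher because the padding rows make expert 0 look overloaded. That is the kind of distortion a padding mask is meant to remove when sequences are padded rather than packed.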

Full Changelog: v1.3.0...v1.4.0

Contributors

Jintao-Huang

Patch release v1.3.2

12 May 14:41
@Jintao-Huang Jintao-Huang


Patch release v1.3.1

10 May 05:29
@Jintao-Huang Jintao-Huang


v1.3.0

07 May 02:51
@Jintao-Huang Jintao-Huang

New Features

  1. Added model_type support for kimi_k25, hy_v3, and llava_onevision.
  2. mlp_padding_free is now compatible with Sequence Parallelism.
  3. Dropped support for megatron-core versions 0.12 through 0.14.

Full Changelog: v1.2.0...v1.3.0

Contributors

addsubmuldiv, randydl, and Jintao-Huang

Patch release v1.2.3

05 May 13:51
@Jintao-Huang Jintao-Huang


Patch release v1.2.2

04 May 09:52
@Jintao-Huang Jintao-Huang


Patch release v1.2.1

25 Apr 06:46
@Jintao-Huang Jintao-Huang
