tests/quantization/utils.py · Samle/diffusers

utils.py 1.55 KB

from diffusers.utils import is_torch_available

from ..testing_utils import (
    backend_empty_cache,
    backend_max_memory_allocated,
    backend_reset_peak_memory_stats,
    torch_device,
)


if is_torch_available():
    import torch
    import torch.nn as nn

    class LoRALayer(nn.Module):
        """Wraps a linear layer with LoRA-like adapter - Used for testing purposes only

        Taken from
        https://github.com/huggingface/transformers/blob/566302686a71de14125717dea9a6a45b24d42b37/tests/quantization/bnb/test_4bit.py#L62C5-L78C77
        """

        def __init__(self, module: nn.Module, rank: int):
            super().__init__()
            self.module = module
            self.adapter = nn.Sequential(
                nn.Linear(module.in_features, rank, bias=False),
                nn.Linear(rank, module.out_features, bias=False),
            )
            small_std = (2.0 / (5 * min(module.in_features, module.out_features))) ** 0.5
            nn.init.normal_(self.adapter[0].weight, std=small_std)
            nn.init.zeros_(self.adapter[1].weight)
            self.adapter.to(module.weight.device)

        def forward(self, input, *args, **kwargs):
            return self.module(input, *args, **kwargs) + self.adapter(input)

    @torch.no_grad()
    @torch.inference_mode()
    def get_memory_consumption_stat(model, inputs):
        backend_reset_peak_memory_stats(torch_device)
        backend_empty_cache(torch_device)

        model(**inputs)
        max_mem_allocated = backend_max_memory_allocated(torch_device)
        return max_mem_allocated
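
A minimal usage sketch for LoRALayer, not part of the original file. It repeats the class so that it runs on its own, and the layer sizes (16, 32), the rank (4), and the variable names are hypothetical values chosen for illustration. Because the second adapter layer is zero-initialized, the wrapped layer should reproduce the base layer's output until the adapter is trained.

```python
import torch
import torch.nn as nn


class LoRALayer(nn.Module):
    """Copy of the test helper above, repeated so this sketch is self-contained."""

    def __init__(self, module: nn.Module, rank: int):
        super().__init__()
        self.module = module
        self.adapter = nn.Sequential(
            nn.Linear(module.in_features, rank, bias=False),
            nn.Linear(rank, module.out_features, bias=False),
        )
        small_std = (2.0 / (5 * min(module.in_features, module.out_features))) ** 0.5
        nn.init.normal_(self.adapter[0].weight, std=small_std)
        nn.init.zeros_(self.adapter[1].weight)
        self.adapter.to(module.weight.device)

    def forward(self, input, *args, **kwargs):
        return self.module(input, *args, **kwargs) + self.adapter(input)


torch.manual_seed(0)

# Hypothetical sizes and rank, chosen only for illustration.
base = nn.Linear(16, 32)
wrapped = LoRALayer(base, rank=4)

x = torch.randn(2, 16)
with torch.no_grad():
    base_out = base(x)
    wrapped_out = wrapped(x)

# The up-projection starts at zero, so the adapter adds nothing at init.
same_at_init = torch.allclose(base_out, wrapped_out)

# Adapter parameter count: in_features * rank + rank * out_features.
adapter_param_count = sum(p.numel() for p in wrapped.adapter.parameters())
```

In a quantization test, the same wrapper would typically be placed around a quantized linear layer so that only the small adapter carries trainable full-precision weights; that usage is an assumption based on the docstring's reference to the transformers bitsandbytes tests.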


Dhruv Nair committed on 2025-08-28 22:23 +08:00: [Refactor] Move testing utils out of src (#12238)
