@laobadao
Description
weight:
    bit: 8
    symmetric: True
    granularity: per_channel
    group_size: -1
    calib_algo: mse
act:
    bit: 8
    symmetric: True
    granularity: per_token
    calib_algo: minmax
special:
    actorder: True
    static_groups: False
    percdamp: 0.01
    blocksize: 128
    chunk_num: 4
    true_sequential: True
    online_rotate: False
    fp32_had: True
With GPTQ configured in the yaml as above, granularity is per_channel and group_size is -1, yet blocksize is 128. Is the optimization then actually carried out in blocks of 128 columns? Or would it be more reasonable to also change blocksize to -1 and adjust the related code accordingly? The change I have in mind would look something like this:
def weight_transform(self, W, Hinv, Losses, tmp):
    # NOTE:
    curr_blocksize = self.blocksize
    if curr_blocksize <= 0:
        logger.info('Blocksize < 0 , using perchannel GPTQ')
        curr_blocksize = W.shape[1]
    print(f'=== W.shape:{W.shape}')
    for i1 in range(0, self.n_nonout, curr_blocksize):
        # for i1 in range(0, self.n_nonout, self.blocksize):
        i2 = min(i1 + curr_blocksize, self.n_nonout)
        count = i2 - i1
        W1, Hinv1 = W[:, i1:i2].clone(), Hinv[i1:i2, i1:i2]
        tmp1, Err1, Losses1 = (
            torch.zeros_like(W1),
            torch.zeros_like(W1),
            torch.zeros_like(W1),
        )
        for i in range(count):
            w, d = W1[:, i], Hinv1[i, i]
            if self.wquantizer.granularity == 'per_group':
                idx = i1 + i
                if not self.static_groups:
                    if (i1 + i) % self.wquantizer.group_size == 0:
                        column_tensors = W[
                            :,
                            (i1 + i): min(
                                (i1 + i + self.wquantizer.group_size),
                                (self.columns - self.n_out),
                            ),
                        ]
                        self.search_column_qparams(column_tensors, idx)
                else:
                    if self.actorder:
                        idx = self.perm[idx]
                    self.qparams = self.groups[idx // self.wquantizer.group_size]
            q = self.wquantizer.quant_dequant(
                w.unsqueeze(1),
                self.qparams['scale'],
                self.qparams['zero'],
                self.qparams['qmax'],
                self.qparams['qmin'],
            ).squeeze(1)
            tmp1[:, i] = w
            Losses1[:, i] = ((w - q) ** 2) / (2 * d**2)
            err1 = (w - q) / d
            W1[:, i:] -= err1.unsqueeze(1).matmul(Hinv1[i, i:].unsqueeze(0))
            Err1[:, i] = err1
        tmp[:, i1:i2], Losses[:, i1:i2] = tmp1, Losses1
        W[:, i2:] -= Err1.matmul(Hinv[i1:i2, i2:])
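My understanding of the granularity settings, as a standalone sketch (not llmc's code; the helper names below are hypothetical): granularity: per_channel with group_size: -1 should mean one scale per output channel computed over the entire input dimension, while per_group would compute one scale per group_size-column slice.

import torch

def per_channel_scale(W: torch.Tensor, n_bits: int = 8) -> torch.Tensor:
    # group_size == -1: one symmetric scale per output channel over the whole row
    qmax = 2 ** (n_bits - 1) - 1
    return W.abs().amax(dim=1, keepdim=True) / qmax          # [out_ch, 1]

def per_group_scale(W: torch.Tensor, group_size: int, n_bits: int = 8) -> torch.Tensor:
    # one symmetric scale per (output channel, column group) pair
    qmax = 2 ** (n_bits - 1) - 1
    out_ch, in_ch = W.shape
    Wg = W.reshape(out_ch, in_ch // group_size, group_size)
    return Wg.abs().amax(dim=-1, keepdim=True) / qmax        # [out_ch, n_groups, 1]

W = torch.randn(4096, 11008)
print(per_channel_scale(W).shape)      # torch.Size([4096, 1])
print(per_group_scale(W, 128).shape)   # torch.Size([4096, 86, 1])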
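And here is a self-contained sketch of the blocked GPTQ column loop (simplified; again not llmc's implementation) to illustrate what I think blocksize does: it only sets the lazy-batch window for the error-compensation updates, so with per-channel qparams fixed up front the result should be the same for blocksize: 128 and for a single block spanning all columns, up to floating-point error.

import torch

def quant_dequant(w, scale, qmax=127):
    # symmetric 8-bit quant-dequant with a fixed per-channel scale
    return torch.clamp(torch.round(w / scale), -qmax - 1, qmax) * scale

def gptq_per_channel(W, Hinv, blocksize):
    # Hinv: upper Cholesky factor of the inverse Hessian, as in GPTQ
    W = W.clone()
    rows, cols = W.shape
    scale = W.abs().amax(dim=1, keepdim=True) / 127.0   # per-channel qparams, fixed up front
    Q = torch.zeros_like(W)
    for i1 in range(0, cols, blocksize):
        i2 = min(i1 + blocksize, cols)
        W1 = W[:, i1:i2].clone()
        Hinv1 = Hinv[i1:i2, i1:i2]
        Err1 = torch.zeros_like(W1)
        for i in range(i2 - i1):
            w, d = W1[:, i], Hinv1[i, i]
            q = quant_dequant(w.unsqueeze(1), scale).squeeze(1)
            Q[:, i1 + i] = q
            err = (w - q) / d
            W1[:, i:] -= err.unsqueeze(1) * Hinv1[i, i:].unsqueeze(0)   # in-block update
            Err1[:, i] = err
        W[:, i2:] -= Err1.matmul(Hinv[i1:i2, i2:])                      # deferred (lazy-batch) update
    return Q

torch.manual_seed(0)
cols = 512
W = torch.randn(16, cols, dtype=torch.float64)
X = torch.randn(2048, cols, dtype=torch.float64)
H = X.T @ X / X.shape[0] + 0.01 * torch.eye(cols, dtype=torch.float64)
L = torch.linalg.cholesky(H)
Hinv = torch.linalg.cholesky(torch.cholesky_inverse(L), upper=True)

Q_block = gptq_per_channel(W, Hinv, blocksize=128)
Q_full = gptq_per_channel(W, Hinv, blocksize=cols)   # the "blocksize = -1" behaviour
print((Q_block - Q_full).abs().max())                 # expected ~0

If that equivalence also holds in llmc, then blocksize: 128 would only be a speed/memory trade-off and would not change the per_channel result, but please correct me if the repo's update path differs.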