CuPyTorch is a tiny PyTorch, whose name comes from two sources:
- Unlike several existing open-source projects that implement PyTorch on top of NumPy, this project supports CUDA computation via CuPy
- It sounds like "Cool PyTorch", because implementing PyTorch in under 1000 lines of pure Python is indeed cool
CuPyTorch supports both numpy and cupy as compute backends and implements a large set of commonly used PyTorch features, aiming for 99% compatibility with PyTorch syntax and semantics while remaining easy to extend. The completed features are listed below:
- tensor:
  - tensor: create a tensor
  - arange: evenly spaced tensor over an interval
  - stack: stack tensors
  - ones/zeros, ones/zeros_like: all-ones/all-zeros tensors
  - rand/randn, rand/randn_like: uniform [0, 1) / Gaussian tensors
  - +, -, *, /, @, **: binary numeric operations, including their reflected and in-place variants
  - >, <, ==, >=, <=, !=: comparison operations
  - &, |, ^: binary logical operations
  - ~, -: logical not / negation
  - []: basic and fancy indexing and slicing
  - abs, exp, log, sqrt: numeric functions
  - sum, mean: reductions
  - max/min, amax/amin, argmax/argmin: max/min values and their indices
- autograd: automatic differentiation for all of the above operations that are not integer-only
- nn:
  - Module: model base class; manages parameters and supports formatted printing
  - activation: ReLU, GeLU, Sigmoid, Tanh, Softmax, LogSoftmax
  - loss: L1Loss, MSELoss, NLLLoss, CrossEntropyLoss
  - layer: Linear, Dropout, LSTM
- optim:
  - Optimizer: optimizer base class; manages parameters and supports formatted printing
  - SGD, Adam: the two most common optimizers
  - lr_scheduler: LambdaLR and StepLR learning-rate schedulers
- utils.data:
  - DataLoader: batched iteration over Tensor data, with optional shuffling
  - Dataset: dataset base class, meant to be subclassed
  - TensorDataset: dataset composed entirely of Tensors
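A dual numpy/cupy backend is practical because CuPy deliberately mirrors the NumPy API. A minimal sketch of the idea (hypothetical code, not CuPyTorch's actual backend module; the alias name `xp` is an assumption borrowed from common CuPy convention):

```python
import numpy as np

# Hypothetical backend selection: CuPy mirrors the NumPy API,
# so downstream code can be written once against the alias `xp`.
try:
    import cupy as xp  # GPU backend, if CuPy is installed
    backend = 'cupy'
except ImportError:
    xp = np            # fall back to the CPU backend
    backend = 'numpy'

# The same call works unchanged on either backend.
t = xp.arange(6).reshape(2, 3)
print(backend, t.sum())
```

With this pattern, tensor operations dispatch to the GPU automatically when CuPy is available and silently degrade to NumPy otherwise.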
Code statistics from cloc:
| Language | files | blank | comment | code |
|---|---|---|---|---|
| Python | 22 | 353 | 27 | 1000 |
Autograd example:
```python
import cupytorch as ct

a = ct.tensor([[-1., 2], [-3., 4.]], requires_grad=True)
b = ct.tensor([[4., 3.], [2., 1.]], requires_grad=True)
c = ct.tensor([[1., 2.], [0., 2.]], requires_grad=True)
d = ct.tensor([1., -2.], requires_grad=True)

e = a @ b.T
f = (c.max(1)[0].exp() + e[:, 0] + b.pow(2) + 2 * d.reshape(2, 1).abs()).mean()
print(f)

f.backward()
print(a.grad)
print(b.grad)
print(c.grad)
print(d.grad)

# tensor(18.889057, grad_fn=<MeanBackward>)
# tensor([[2.  1.5]
#         [2.  1.5]])
# tensor([[0.  4.5]
#         [1.  0.5]])
# tensor([[0.  3.694528]
#         [0.  3.694528]])
# tensor([ 1. -1.])
```
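The mechanics behind `backward` can be illustrated with a tiny reverse-mode autodiff sketch in pure Python (a hypothetical illustration of the general technique, not CuPyTorch's internals; the `Scalar` class and its fields are invented for this example):

```python
# Each operation records, for every input, a closure-free
# (input, local_gradient) pair; backward() walks these pairs
# and accumulates upstream * local_gradient via the chain rule.
class Scalar:
    def __init__(self, value, parents=()):
        self.value = value
        self.grad = 0.0
        self._parents = parents  # (input, local gradient) pairs

    def __add__(self, other):
        # d(a+b)/da = 1, d(a+b)/db = 1
        return Scalar(self.value + other.value,
                      parents=[(self, 1.0), (other, 1.0)])

    def __mul__(self, other):
        # d(a*b)/da = b, d(a*b)/db = a
        return Scalar(self.value * other.value,
                      parents=[(self, other.value), (other, self.value)])

    def backward(self, upstream=1.0):
        self.grad += upstream
        for parent, local in self._parents:
            parent.backward(upstream * local)

x = Scalar(3.0)
y = Scalar(4.0)
z = x * y + x  # dz/dx = y + 1 = 5, dz/dy = x = 3
z.backward()
print(x.grad, y.grad)
```

A real implementation additionally handles tensors, integer-only operations, and graphs where a node is reused, but the gradient-accumulation idea is the same.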
Handwritten digit recognition example:
```python
from pathlib import Path

import cupytorch as ct
from cupytorch import nn
from cupytorch.optim import SGD
from cupytorch.optim.lr_scheduler import StepLR
from cupytorch.utils.data import TensorDataset, DataLoader


class Net(nn.Module):

    def __init__(self, num_pixel: int, num_class: int):
        super().__init__()
        self.num_pixel = num_pixel
        self.fc1 = nn.Linear(num_pixel, 256)
        self.fc2 = nn.Linear(256, 64)
        self.fc3 = nn.Linear(64, num_class)
        self.act = nn.ReLU()
        self.drop = nn.Dropout(0.1)

    def forward(self, input: ct.Tensor) -> ct.Tensor:
        output = input.view(-1, self.num_pixel)
        output = self.drop(self.act(self.fc1(output)))
        output = self.drop(self.act(self.fc2(output)))
        return self.fc3(output)


def load(path: Path):
    # define how to load data as tensor
    pass


path = Path('../datasets/MNIST')
train_dl = DataLoader(TensorDataset(load(path / 'train-images-idx3-ubyte.gz'),
                                    load(path / 'train-labels-idx1-ubyte.gz')),
                      batch_size=20, shuffle=True)
test_dl = DataLoader(TensorDataset(load(path / 't10k-images-idx3-ubyte.gz'),
                                   load(path / 't10k-labels-idx1-ubyte.gz')),
                     batch_size=20, shuffle=False)

model = Net(28 * 28, 10)
criterion = nn.CrossEntropyLoss()
optimizer = SGD(model.parameters(), lr=1e-3, momentum=0.9)
scheduler = StepLR(optimizer, 5, 0.5)
print(model)
print(optimizer)
print(criterion)

for epoch in range(10):
    losses = 0
    for step, (x, y) in enumerate(train_dl, 1):
        optimizer.zero_grad()
        z = model(x)
        loss = criterion(z, y)
        loss.backward()
        optimizer.step()
        losses += loss.item()
        if step % 500 == 0:
            losses /= 500
            print(f'Epoch: {epoch}, Train Step: {step}, Train Loss: {losses:.6f}')
            losses = 0
    scheduler.step()
```
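The `load` stub above is left to the reader. For reference, MNIST's `.gz` files use the standard big-endian IDX format, which can be decoded along these lines (a hedged sketch; `parse_idx` and `load_idx_gz` are names invented here, and the conversion of the resulting array into a cupytorch tensor is omitted):

```python
import gzip
import struct

import numpy as np


def parse_idx(raw: bytes) -> np.ndarray:
    # IDX header: two zero bytes, a dtype code, the number of dimensions,
    # then one big-endian 4-byte size per dimension, then the raw data.
    zeros, dtype_code, ndim = struct.unpack('>HBB', raw[:4])
    assert zeros == 0 and dtype_code == 0x08  # 0x08 = unsigned byte data
    dims = struct.unpack('>' + 'I' * ndim, raw[4:4 + 4 * ndim])
    data = np.frombuffer(raw, dtype=np.uint8, offset=4 + 4 * ndim)
    return data.reshape(dims)


def load_idx_gz(path) -> np.ndarray:
    with gzip.open(path, 'rb') as f:
        return parse_idx(f.read())
```

In the training script the decoded arrays would still need to be wrapped into cupytorch tensors (with the images typically cast to float and normalized) before being handed to `TensorDataset`.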
Two complete examples are provided in the examples folder:
References: