CIFAR-10 classification with PyTorch

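Contents
  1. Process
    1. Loading the data
    2. Defining the network
    3. Optimizer
    4. Training
    5. Testing
    6. Computing accuracy
    7. Saving the model
    8. Loading the model
  2. Some details
  3. Errors encountered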

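This post follows the tutorial on the official PyTorch website (original tutorial). The model reaches 56 % accuracy on the test set.

Because my Pillow version was too new (7.0.0), I ran into an error; see the end of the post for details.

Environment: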

import torch
import torchvision
print(torch.__version__)        # 1.3.1+cpu
print(torchvision.__version__)  # 0.4.2+cpu

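Process

Loading the data

The data had already been downloaded, so download can be set to False to avoid downloading it again. The script lives in the same folder as the data files, so root is set to the parent directory.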

import torch
import torchvision
import torchvision.transforms as transforms

# Convert PIL images to tensors in [0, 1], then normalize each channel to [-1, 1]
transform = transforms.Compose(
    [transforms.ToTensor(),
     transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])

trainset = torchvision.datasets.CIFAR10(root='../', train=True,
                                        download=False, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=4,
                                          shuffle=True, num_workers=2)

testset = torchvision.datasets.CIFAR10(root='../', train=False,
                                       download=False, transform=transform)
testloader = torch.utils.data.DataLoader(testset, batch_size=4,
                                         shuffle=False, num_workers=2)

classes = ('plane', 'car', 'bird', 'cat',
           'deer', 'dog', 'frog', 'horse', 'ship', 'truck')

Defining the network

import torch.nn as nn
import torch.nn.functional as F


class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(3, 6, 5)    # 3 input channels, 6 output channels, 5x5 kernel
        self.pool = nn.MaxPool2d(2, 2)     # 2x2 max pooling, stride 2
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)       # one output per class

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(-1, 16 * 5 * 5)         # flatten to (batch, 400)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x


net = Net()
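
To see where the 16 * 5 * 5 input size of fc1 comes from, the spatial size can be traced through the layers. This is only a sketch: the helper names conv_out and pool_out are made up for illustration, and it assumes stride-1 convolutions without padding (the nn.Conv2d defaults) plus 2x2 max pooling with stride 2, as in the network above.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output width/height of a convolution (hypothetical helper)."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_out(size, kernel=2, stride=2):
    """Output width/height of a max-pooling layer (hypothetical helper)."""
    return (size - kernel) // stride + 1


size = 32                      # CIFAR-10 images are 32 x 32
size = conv_out(size, 5)       # conv1: 32 -> 28
size = pool_out(size)          # pool:  28 -> 14
size = conv_out(size, 5)       # conv2: 14 -> 10
size = pool_out(size)          # pool:  10 -> 5

flat_features = 16 * size * size   # conv2 produces 16 channels
print(size, flat_features)         # 5 400
```

So each image leaves the second pooling layer as 16 feature maps of 5 x 5, which x.view flattens into 400 values.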

Optimizer

import torch.optim as optim

criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(net.parameters(), lr=0.001, momentum=0.9)
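
As a rough picture of what optimizer.step() does with these settings, here is a plain-Python sketch of SGD with momentum for a single scalar parameter. It follows the update rule described in the torch.optim.SGD documentation (velocity = momentum * velocity + grad, then param = param - lr * velocity), with dampening and weight decay left at zero. The function name sgd_momentum_step and the constant gradient are assumptions for illustration; this is not the library's actual code.

```python
def sgd_momentum_step(param, grad, velocity, lr=0.001, momentum=0.9):
    """One SGD-with-momentum update for a scalar (illustrative only)."""
    velocity = momentum * velocity + grad
    param = param - lr * velocity
    return param, velocity


param, velocity = 1.0, 0.0
for step in range(2):
    grad = 2.0  # pretend the gradient stays constant
    param, velocity = sgd_momentum_step(param, grad, velocity)
    print(step, round(param, 6), round(velocity, 6))
```

With a constant gradient the velocity grows from step to step, so the second update moves the parameter further than the first.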

Training

for epoch in range(2):  # loop over the dataset multiple times

    running_loss = 0.0
    for i, data in enumerate(trainloader, 0):
        # get the inputs; data is a list of [inputs, labels]
        inputs, labels = data

        # zero the parameter gradients
        optimizer.zero_grad()

        # forward + backward + optimize
        outputs = net(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

        # print statistics
        running_loss += loss.item()
        if i % 2000 == 1999:    # print every 2000 mini-batches
            print('[%d, %5d] loss: %.3f' %
                  (epoch + 1, i + 1, running_loss / 2000))
            running_loss = 0.0

print('Finished Training')

Output:

[1,  2000] loss: 2.183
[1,  4000] loss: 1.861
[1,  6000] loss: 1.679
[1,  8000] loss: 1.565
[1, 10000] loss: 1.494
[1, 12000] loss: 1.462
[2,  2000] loss: 1.383
[2,  4000] loss: 1.368
[2,  6000] loss: 1.338
[2,  8000] loss: 1.309
[2, 10000] loss: 1.266
[2, 12000] loss: 1.260
Finished Training
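
Two quick sanity checks on this log, written as a small sketch with variable names of my own. With 50000 training images and batch_size=4 there are 12500 mini-batches per epoch, which is why the last report in each epoch is at 12000. And a classifier that assigns equal probability to all 10 classes would have a cross-entropy of ln(10), so the first reported value of 2.183 is only slightly better than guessing.

```python
import math

num_train_images = 50000   # size of the CIFAR-10 training set
batch_size = 4
print_every = 2000

batches_per_epoch = num_train_images // batch_size
reports_per_epoch = batches_per_epoch // print_every
last_report = reports_per_epoch * print_every

# Cross-entropy of a uniform prediction over 10 classes: -ln(1/10)
chance_loss = -math.log(1.0 / 10)

print(batches_per_epoch, reports_per_epoch, last_report)  # 12500 6 12000
print(round(chance_loss, 3))                              # 2.303
```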

Testing

Ground-truth labels

import matplotlib.pyplot as plt
import numpy as np

def imshow(img):
    img = img / 2 + 0.5     # unnormalize
    npimg = img.numpy()
    plt.imshow(np.transpose(npimg, (1, 2, 0)))  # CHW -> HWC for matplotlib
    plt.show()

dataiter = iter(testloader)
# next(dataiter) instead of dataiter.next(), which newer PyTorch releases no longer provide
images, labels = next(dataiter)

# print images
imshow(torchvision.utils.make_grid(images))
print('GroundTruth: ', ' '.join('%5s' % classes[labels[j]] for j in range(4)))
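
imshow undoes the normalization (img / 2 + 0.5 is the inverse of (x - 0.5) / 0.5) and moves the channel axis last, because matplotlib expects height x width x channel while the tensors are channel x height x width. A tiny pure-Python illustration on a made-up 3 x 2 x 2 image, with nested lists standing in for tensors; the helper names are hypothetical:

```python
def unnormalize(value):
    """Inverse of Normalize(mean=0.5, std=0.5): maps [-1, 1] back to [0, 1]."""
    return value / 2 + 0.5


def chw_to_hwc(img):
    """Reorder a nested list from [C][H][W] to [H][W][C],
    like np.transpose(img, (1, 2, 0))."""
    channels, height, width = len(img), len(img[0]), len(img[0][0])
    return [[[img[c][h][w] for c in range(channels)]
             for w in range(width)]
            for h in range(height)]


# Made-up normalized image: 3 channels, 2 x 2 pixels
img = [
    [[-1.0, 0.0], [1.0, 0.5]],    # red
    [[0.0, 0.0], [0.0, 0.0]],     # green
    [[1.0, 1.0], [1.0, 1.0]],     # blue
]
restored = [[[unnormalize(v) for v in row] for row in channel] for channel in img]
hwc = chw_to_hwc(restored)
print(hwc[0][0])   # [0.0, 0.5, 1.0]  (R, G, B of the top-left pixel)
```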

Predicted labels:

outputs = net(images)

# index of the largest score in each row is the predicted class
_, predicted = torch.max(outputs, 1)

print('Predicted: ', ' '.join('%5s' % classes[predicted[j]]
                              for j in range(4)))
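
torch.max(outputs, 1) returns, for each row, the largest value and its index along dimension 1; only the index is kept as the predicted class. The same idea in plain Python, with made-up scores for two images (the numbers are illustrative, not real network outputs):

```python
classes = ('plane', 'car', 'bird', 'cat',
           'deer', 'dog', 'frog', 'horse', 'ship', 'truck')

# Hypothetical raw scores (logits) for two images, one score per class
outputs = [
    [0.1, -1.2, 0.3, 2.5, 0.0, 1.9, 0.4, -0.3, -0.8, -1.0],
    [1.1, 0.9, -0.5, -0.7, -1.3, -2.0, -0.9, -1.1, 3.2, 0.6],
]

# argmax over each row: the position of the largest score
predicted = [max(range(len(row)), key=row.__getitem__) for row in outputs]
print(predicted)                        # [3, 8]
print([classes[i] for i in predicted])  # ['cat', 'ship']
```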

Computing accuracy

correct = 0
total = 0
with torch.no_grad():
    for data in testloader:
        images, labels = data
        outputs = net(images)
        _, predicted = torch.max(outputs.data, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

print('Accuracy of the network on the 10000 test images: %d %%' % (
    100 * correct / total))

Output: Accuracy of the network on the 10000 test images: 56 %

# Per-class accuracy
class_correct = list(0. for i in range(10))
class_total = list(0. for i in range(10))
with torch.no_grad():
    for data in testloader:
        images, labels = data
        outputs = net(images)
        _, predicted = torch.max(outputs, 1)
        c = (predicted == labels).squeeze()
        for i in range(4):  # 4 = batch_size
            label = labels[i]
            class_correct[label] += c[i].item()
            class_total[label] += 1


for i in range(10):
    print('Accuracy of %5s : %2d %%' % (
        classes[i], 100 * class_correct[i] / class_total[i]))

Output:

Accuracy of plane : 64 %
Accuracy of   car : 48 %
Accuracy of  bird : 44 %
Accuracy of   cat : 38 %
Accuracy of  deer : 41 %
Accuracy of   dog : 60 %
Accuracy of  frog : 64 %
Accuracy of horse : 62 %
Accuracy of  ship : 64 %
Accuracy of truck : 72 %
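
Because the CIFAR-10 test set contains 1000 images of each class, the overall accuracy should equal the mean of the per-class accuracies. A quick check on the printed numbers, which are truncated to whole percents by %2d, so the true mean lies at most one point above the value computed here:

```python
per_class = {
    'plane': 64, 'car': 48, 'bird': 44, 'cat': 38, 'deer': 41,
    'dog': 60, 'frog': 64, 'horse': 62, 'ship': 64, 'truck': 72,
}

mean_acc = sum(per_class.values()) / len(per_class)
worst = min(per_class, key=per_class.get)
best = max(per_class, key=per_class.get)

print(mean_acc)      # 55.7
print(worst, best)   # cat truck
```

A mean of 55.7 is consistent with the 56 % reported for the whole test set.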

Saving the model

PATH = './cifar_net.pth'
torch.save(net.state_dict(), PATH)

Loading the model

net = Net()
net.load_state_dict(torch.load(PATH))

Some details

The original image format of the CIFAR-10 dataset was covered in a previous post. Each image is stored as an array of length 3072.
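
In the Python version of the dataset, the 3072 values are laid out as 1024 red values, then 1024 green, then 1024 blue, with each colour plane stored row by row. A small sketch of how a (channel, row, column) position maps to an index in the flat array; the function name flat_index is made up for this example:

```python
CHANNELS, HEIGHT, WIDTH = 3, 32, 32


def flat_index(channel, row, col):
    """Position of pixel (channel, row, col) in the flat 3072-long array."""
    return channel * HEIGHT * WIDTH + row * WIDTH + col


print(CHANNELS * HEIGHT * WIDTH)   # 3072
print(flat_index(0, 0, 0))         # 0     first red value
print(flat_index(1, 0, 0))         # 1024  first green value
print(flat_index(2, 31, 31))       # 3071  last blue value
```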

Here the raw data is processed with torchvision.transforms:

transform = transforms.Compose(
    [transforms.ToTensor(),
     transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])

it = iter(trainset)
x = next(it)          # x is an (image, label) pair
print(x[0].shape)     # torch.Size([3, 32, 32])
print(x[0])
tensor([[[-0.5373, -0.6627, -0.6078,  ...,  0.2392,  0.1922,  0.1608],
         [-0.8745, -1.0000, -0.8588,  ..., -0.0353, -0.0667, -0.0431],
         [-0.8039, -0.8745, -0.6157,  ..., -0.0745, -0.0588, -0.1451],
         ...,
         [ 0.6314,  0.5765,  0.5529,  ...,  0.2549, -0.5608, -0.5843],
         [ 0.4118,  0.3569,  0.4588,  ...,  0.4431, -0.2392, -0.3490],
         [ 0.3882,  0.3176,  0.4039,  ...,  0.6941,  0.1843, -0.0353]],

        [[-0.5137, -0.6392, -0.6235,  ...,  0.0353, -0.0196, -0.0275],
         [-0.8431, -1.0000, -0.9373,  ..., -0.3098, -0.3490, -0.3176],
         [-0.8118, -0.9451, -0.7882,  ..., -0.3412, -0.3412, -0.4275],
         ...,
         [ 0.3333,  0.2000,  0.2627,  ...,  0.0431, -0.7569, -0.7333],
         [ 0.0902, -0.0353,  0.1294,  ...,  0.1608, -0.5137, -0.5843],
         [ 0.1294,  0.0118,  0.1137,  ...,  0.4431, -0.0745, -0.2784]],

        [[-0.5059, -0.6471, -0.6627,  ..., -0.1529, -0.2000, -0.1922],
         [-0.8431, -1.0000, -1.0000,  ..., -0.5686, -0.6078, -0.5529],
         [-0.8353, -1.0000, -0.9373,  ..., -0.6078, -0.6078, -0.6706],
         ...,
         [-0.2471, -0.7333, -0.7961,  ..., -0.4510, -0.9451, -0.8431],
         [-0.2471, -0.6706, -0.7647,  ..., -0.2627, -0.7333, -0.7333],
         [-0.0902, -0.2627, -0.3176,  ...,  0.0980, -0.3412, -0.4353]]])

As you can see, each image now has shape 3 × 32 × 32.
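
The values lie in [-1, 1] because ToTensor scales the 0..255 bytes to [0, 1] and Normalize then computes (x - mean) / std per channel, here with mean = std = 0.5. As a worked example, assume the top-left red pixel of this image has the raw value 59; that number is inferred from the printed -0.5373, not read from the data file, and the helper names are hypothetical.

```python
def to_tensor_value(byte):
    """What ToTensor does to a single 8-bit pixel value: scale to [0, 1]."""
    return byte / 255


def normalize(x, mean=0.5, std=0.5):
    """What Normalize does to a single value: (x - mean) / std."""
    return (x - mean) / std


raw = 59  # assumed raw value, see the note above
value = normalize(to_tensor_value(raw))
print(round(value, 4))                 # -0.5373
print(normalize(to_tensor_value(0)))   # -1.0
print(normalize(to_tensor_value(255))) # 1.0
```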

Errors encountered

import torchvision fails with:

ImportError: cannot import name 'PILLOW_VERSION' from 'PIL' (/usr/local/lib/python3.7/site-packages/PIL/__init__.py)

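It turns out that Pillow 7.0 and later no longer provide the PILLOW_VERSION attribute. Tutorials online suggest installing an older Pillow:

pip3 install 'pillow<7.0.0'

I wanted to try adding the attribute by hand instead:

sudo vi /usr/local/lib/python3.7/site-packages/PIL/__init__.py

Below the line __version__ = _version.__version__, add one line: PILLOW_VERSION = __version__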

The edited file looks like this:

"""Pillow (Fork of the Python Imaging Library)

Pillow is the friendly PIL fork by Alex Clark and Contributors.
    https://github.com/python-pillow/Pillow/

Pillow is forked from PIL 1.1.7.

PIL is the Python Imaging Library by Fredrik Lundh and Contributors.
Copyright (c) 1999 by Secret Labs AB.

Use PIL.__version__ for this Pillow version.

;-)
"""

from . import _version

# VERSION was removed in Pillow 6.0.0.
# PILLOW_VERSION was removed in Pillow 7.0.0.
# Use __version__ instead.
__version__ = _version.__version__
PILLOW_VERSION = __version__

After that, import torchvision worked.
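
Editing a file inside site-packages works, but the change is lost whenever Pillow is reinstalled or upgraded. A less invasive variant of the same idea, sketched here under the assumption that torchvision only needs the name PILLOW_VERSION to be importable from PIL, is to set the attribute at runtime before torchvision is imported. The helper name ensure_pillow_version is hypothetical.

```python
import importlib


def ensure_pillow_version():
    """Add PIL.PILLOW_VERSION at runtime if this Pillow release lacks it.

    Returns True if PIL is importable, False otherwise.
    """
    try:
        pil = importlib.import_module("PIL")
    except ImportError:
        return False
    if not hasattr(pil, "PILLOW_VERSION"):
        # Same assignment as the manual edit, but only for this process
        pil.PILLOW_VERSION = pil.__version__
    return True


if ensure_pillow_version():
    print("PILLOW_VERSION is set; import torchvision after this point")
else:
    print("Pillow is not installed")
```

The call has to run before the first import torchvision in the program, since the failing lookup happens at import time.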
