【深度学习】PyTorch深度学习实践 - Lecture_10_Basic_CNN
创始人
2024-04-22 09:05:42
0

文章目录

  • 一、CNN基础流程图
  • 二、CNN的两个阶段
  • 三、卷积的基本知识
    • 3.1 单信道的卷积
    • 3.2 三信道的卷积
    • 3.3 N信道卷积
    • 3.4 输入N信道-输出M信道卷积
    • 3.5 卷积层的常见参数
      • 3.5.1 Padding
      • 3.5.2 Stride
      • 3.5.3 下采样(MaxPooling)
  • 四、实现一个简单的CNN
    • 4.1 网络结构图
    • 4.2 PyTorch代码-CPU
    • 4.3 PyTorch代码-GPU
    • 4.4 课后作业(尝试更复杂的CNN)


一、CNN基础流程图

卷积:对数据升维
下采样:对数据降维

在这里插入图片描述

二、CNN的两个阶段

特征提取阶段:通过卷积下采样对图像数据进行升维和降维,从而实现特征提取
分类器:将特征提取阶段输出的数据展平成一维,再通过全连接层进行分类

在这里插入图片描述

三、卷积的基本知识

卷积后数据维度会变(比如信道数增加、宽度和高度减小)

在这里插入图片描述

3.1 单信道的卷积

在这里插入图片描述

然后通过滑动卷积框,多次进行卷积计算,直到卷积框移动到Input最下最右,此时就将输出矩阵的信息补充完整了

在这里插入图片描述

3.2 三信道的卷积

在这里插入图片描述
在这里插入图片描述
在这里插入图片描述

3.3 N信道卷积

在这里插入图片描述

3.4 输入N信道-输出M信道卷积

原理:采用M个卷积核对Input进行卷积,这样就得到了M层的输出

在这里插入图片描述
在这里插入图片描述

PyTorch代码实现卷积:

import torchin_channels, out_channels = 5, 10  # 输入信道数、输出信道数
w, h = 100, 100  # 输入图片尺寸
kernel_size = (3, 3)  # 卷积核大小3*3
batch_size = 1  # 每次参与模型更新的数据数量
input = torch.randn(batch_size, in_channels, w, h)  # torch.randn 随机产生输入数据
conv_layer = torch.nn.Conv2d(in_channels, out_channels, kernel_size=kernel_size)  # 定义卷积层(输入信道数,输出信道数,卷积核大小)
output = conv_layer(input)  # 进行卷积操作
# 输出输入、输出和卷积层权重的形状
print(input.shape)
print(output.shape)
print(conv_layer.weight.shape)

输出:

torch.Size([1, 5, 100, 100])
torch.Size([1, 10, 98, 98])
torch.Size([10, 5, 3, 3])

3.5 卷积层的常见参数

3.5.1 Padding

Padding:在输入数据的四周进行扩展的数量,比如Padding = (1,1)代表在输入数据的四周一圈扩展并填充0
一般地:
当卷积核大小为3×3时:Padding=(1,1)
当卷积核大小为5×5时:Padding=(2,2)

当卷积核大小为n×n时:Padding=(m,m),其中 m =(n-1)/2,n为奇数

在这里插入图片描述

Padding默认填充0

在这里插入图片描述

Padding的意义:

  • 为了不丢弃原图信息
  • 为了保持feature map 的大小与原图一致
  • 为了让更深层的layer的input依旧保持有足够大的信息量
  • 为了实现上述目的,且不做多余的事情,padding出来的pixel的值都是0,不存在噪音问题。

PyTorch代码体会Padding:

import torchinput = [3, 4, 6, 5, 7,2, 4, 6, 8, 2,1, 6, 7, 8, 4,9, 7, 4, 6, 2,3, 7, 5, 4, 1]
input = torch.Tensor(input).view(1, 1, 5, 5)
conv_layer = torch.nn.Conv2d(1, 1, kernel_size=(3, 3), padding=(1, 1), bias=False)
kernel = torch.Tensor([1, 2, 3, 4, 5, 6, 7, 8, 9]).view(1, 1, 3, 3)
conv_layer.weight.data = kernel.data
output = conv_layer(input)
print(output)

输出:

tensor([[[[ 91., 168., 224., 215., 127.],[114., 211., 295., 262., 149.],[192., 259., 282., 214., 122.],[194., 251., 253., 169.,  86.],[ 96., 112., 110.,  68.,  31.]]]], grad_fn=)

3.5.2 Stride

Stride:卷积核滑动的步长,为一个元组。如Stride =(2,2)分别表示宽和高方向上的滑动步长

在这里插入图片描述

PyTorch代码体会Stride:

import torchinput = [3, 4, 6, 5, 7,2, 4, 6, 8, 2,1, 6, 7, 8, 4,9, 7, 4, 6, 2,3, 7, 5, 4, 1]
input = torch.Tensor(input).view(1, 1, 5, 5)
conv_layer = torch.nn.Conv2d(1, 1, kernel_size=(3, 3), stride=(2, 2), bias=False)
kernel = torch.Tensor([1, 2, 3, 4, 5, 6, 7, 8, 9]).view(1, 1, 3, 3)
conv_layer.weight.data = kernel.data
output = conv_layer(input)
print(output)

输出:

tensor([[[[211., 262.],[251., 169.]]]], grad_fn=)

3.5.3 下采样(MaxPooling)

最常用的下采样方法是最大池化(Max Pooling)
如下图所示,MaxPooling=(2,2),即将输入数据分成四个(2,2)无交集的区域,然后选取每个区域的最大值作为输出数据

在这里插入图片描述

由MaxPooling的计算特性可知,MaxPooling不会对信道的数量产生影响

PyTorch代码体会MaxPooling:

import torchinput = [3, 4, 6, 5,2, 4, 6, 8,1, 6, 7, 5,9, 7, 4, 6]
input = torch.Tensor(input).view(1, 1, 4, 4)
max_pooling_layer = torch.nn.MaxPool2d(kernel_size=(2, 2))
output = max_pooling_layer(input)
print(output)

输出:

tensor([[[[4., 8.],[9., 7.]]]])

四、实现一个简单的CNN

4.1 网络结构图

在这里插入图片描述
在这里插入图片描述

4.2 PyTorch代码-CPU

构建网络模型:

class Net(torch.nn.Module):def __init__(self):super(Net, self).__init__()self.conv1 = torch.nn.Conv2d(1, 10, kernel_size=(5, 5))self.conv2 = torch.nn.Conv2d(10, 20, kernel_size=(5, 5))self.pooling = torch.nn.MaxPool2d((2, 2))self.fc = torch.nn.Linear(320, 10)def forward(self, x):batch_size = x.size(0)x = torch.relu(self.pooling(self.conv1(x)))x = torch.relu(self.pooling(self.conv2(x)))x = x.view(batch_size, -1)x = self.fc(x)return x

完整代码:

import torch
from torchvision import transforms
from torchvision import datasets
from torch.utils.data import DataLoader
import torch.optim as optimclass Net(torch.nn.Module):def __init__(self):super(Net, self).__init__()self.conv1 = torch.nn.Conv2d(1, 10, kernel_size=(5, 5))self.conv2 = torch.nn.Conv2d(10, 20, kernel_size=(5, 5))self.pooling = torch.nn.MaxPool2d((2, 2))self.fc = torch.nn.Linear(320, 10)def forward(self, x):batch_size = x.size(0)x = torch.relu(self.pooling(self.conv1(x)))x = torch.relu(self.pooling(self.conv2(x)))x = x.view(batch_size, -1)x = self.fc(x)return x# 单次训练函数
def train(epoch, criterion):running_loss = 0.0for batch_idx, data in enumerate(train_loader, 0):inputs, target = dataoptimizer.zero_grad()outputs = model(inputs)loss = criterion(outputs, target)loss.backward()optimizer.step()running_loss += loss.item()if batch_idx % 300 == 299:print('[%d, %5d] loss: %.3f' % (epoch + 1, batch_idx + 1, running_loss / 300))running_loss = 0.0# 单次测试函数
def ttt():correct = 0.0total = 0.0with torch.no_grad():for data in test_loader:images, labels = dataoutputs = model(images)_, predicted = torch.max(outputs.data, dim=1)total += labels.size(0)correct += (predicted == labels).sum().item()print('Accuracy on test set: %d %% [%d/%d]' % (100 * correct / total, correct, total))if __name__ == '__main__':batch_size = 64transform = transforms.Compose([transforms.ToTensor(), transforms.Normalize((0.1307,), (0.3081,))])train_dataset = datasets.MNIST(root='../dataset/', train=True, download=True, transform=transform)train_loader = DataLoader(train_dataset, shuffle=True, batch_size=batch_size)test_dataset = datasets.MNIST(root='../dataset/', train=False, download=True, transform=transform)test_loader = DataLoader(test_dataset, shuffle=False, batch_size=batch_size)# 声明模型model = Net()# 定义损失函数criterion = torch.nn.CrossEntropyLoss()# 定义优化器optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.5)for epoch in range(10):train(epoch, criterion)ttt()

输出:

[1,   300] loss: 0.564
[1,   600] loss: 0.191
[1,   900] loss: 0.141
Accuracy on test set: 96 %
[2,   300] loss: 0.110
[2,   600] loss: 0.092
[2,   900] loss: 0.088
Accuracy on test set: 97 %
[3,   300] loss: 0.079
[3,   600] loss: 0.072
[3,   900] loss: 0.069
Accuracy on test set: 98 %
[4,   300] loss: 0.064
[4,   600] loss: 0.056
[4,   900] loss: 0.063
Accuracy on test set: 98 %
[5,   300] loss: 0.052
[5,   600] loss: 0.053
[5,   900] loss: 0.055
Accuracy on test set: 98 %
[6,   300] loss: 0.048
[6,   600] loss: 0.050
[6,   900] loss: 0.045
Accuracy on test set: 98 %
[7,   300] loss: 0.048
[7,   600] loss: 0.042
[7,   900] loss: 0.041
Accuracy on test set: 98 %
[8,   300] loss: 0.039
[8,   600] loss: 0.042
[8,   900] loss: 0.040
Accuracy on test set: 98 %
[9,   300] loss: 0.039
[9,   600] loss: 0.034
[9,   900] loss: 0.039
Accuracy on test set: 98 %
[10,   300] loss: 0.034
[10,   600] loss: 0.036
[10,   900] loss: 0.035
Accuracy on test set: 98 %

4.3 PyTorch代码-GPU

import torch
from torchvision import transforms
from torchvision import datasets
from torch.utils.data import DataLoader
import torch.optim as optimclass Net(torch.nn.Module):def __init__(self):super(Net, self).__init__()self.conv1 = torch.nn.Conv2d(1, 10, kernel_size=(5, 5))self.conv2 = torch.nn.Conv2d(10, 20, kernel_size=(5, 5))self.pooling = torch.nn.MaxPool2d((2, 2))self.fc = torch.nn.Linear(320, 10)def forward(self, x):batch_size = x.size(0)x = torch.relu(self.pooling(self.conv1(x)))x = torch.relu(self.pooling(self.conv2(x)))x = x.view(batch_size, -1)x = self.fc(x)return x# 单次训练函数
def train(epoch, criterion):running_loss = 0.0for batch_idx, data in enumerate(train_loader, 0):inputs, target = data# 将inputs, target转移到Gpu或者Cpu上inputs, target = inputs.to(device), target.to(device)optimizer.zero_grad()outputs = model(inputs)loss = criterion(outputs, target)loss.backward()optimizer.step()running_loss += loss.item()if batch_idx % 300 == 299:print('[%d, %5d] loss: %.3f' % (epoch + 1, batch_idx + 1, running_loss / 300))running_loss = 0.0# 单次测试函数
def ttt():correct = 0.0total = 0.0with torch.no_grad():for data in test_loader:images, labels = data# 将images, labels转移到Gpu或者Cpu上images, labels = images.to(device), labels.to(device)outputs = model(images)_, predicted = torch.max(outputs.data, dim=1)total += labels.size(0)correct += (predicted == labels).sum().item()print('Accuracy on test set: %d %% [%d/%d]' % (100 * correct / total, correct, total))if __name__ == '__main__':batch_size = 64transform = transforms.Compose([transforms.ToTensor(), transforms.Normalize((0.1307,), (0.3081,))])train_dataset = datasets.MNIST(root='../dataset/', train=True, download=True, transform=transform)train_loader = DataLoader(train_dataset, shuffle=True, batch_size=batch_size)test_dataset = datasets.MNIST(root='../dataset/', train=False, download=True, transform=transform)test_loader = DataLoader(test_dataset, shuffle=False, batch_size=batch_size)device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")# 声明模型model = Net()# 将模型转移道Gpu或者Cpu上model.to(device)# 定义损失函数criterion = torch.nn.CrossEntropyLoss()# 定义优化器optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.5)for epoch in range(10):train(epoch, criterion)ttt()

输出:

[1,   300] loss: 0.666
[1,   600] loss: 0.206
[1,   900] loss: 0.148
Accuracy on test set: 96 % [9652/10000]
[2,   300] loss: 0.117
[2,   600] loss: 0.109
[2,   900] loss: 0.097
Accuracy on test set: 97 % [9749/10000]
[3,   300] loss: 0.080
[3,   600] loss: 0.082
[3,   900] loss: 0.077
Accuracy on test set: 98 % [9804/10000]
[4,   300] loss: 0.067
[4,   600] loss: 0.070
[4,   900] loss: 0.062
Accuracy on test set: 98 % [9814/10000]
[5,   300] loss: 0.056
[5,   600] loss: 0.060
[5,   900] loss: 0.059
Accuracy on test set: 98 % [9845/10000]
[6,   300] loss: 0.053
[6,   600] loss: 0.047
[6,   900] loss: 0.054
Accuracy on test set: 98 % [9855/10000]
[7,   300] loss: 0.047
[7,   600] loss: 0.048
[7,   900] loss: 0.044
Accuracy on test set: 98 % [9845/10000]
[8,   300] loss: 0.044
[8,   600] loss: 0.041
[8,   900] loss: 0.042
Accuracy on test set: 98 % [9883/10000]
[9,   300] loss: 0.040
[9,   600] loss: 0.039
[9,   900] loss: 0.041
Accuracy on test set: 98 % [9867/10000]
[10,   300] loss: 0.038
[10,   600] loss: 0.036
[10,   900] loss: 0.038
Accuracy on test set: 98 % [9872/10000]

4.4 课后作业(尝试更复杂的CNN)

在这里插入图片描述

PyTorch-GPU代码实现:

除了上图所示的改变外,我还做了以下2点改变:

  • 为了获得更多的特征。增加了每个层的信道数量
  • 为了获得更多的特征,并考虑到计算量开销,采取了卷积核大小逐层递减的策略,第一层卷积核大小设置为(9,9),第二层卷积核大小设置为(7,7),第三层卷积核大小设置为(5,5),为了保证足够多的特征数量,每一层都采用了padding
import torch
from torchvision import transforms
from torchvision import datasets
from torch.utils.data import DataLoader
import torch.optim as optimclass Net(torch.nn.Module):def __init__(self):super(Net, self).__init__()self.conv1 = torch.nn.Conv2d(1, 20, padding=(4, 4), kernel_size=(9, 9))self.conv2 = torch.nn.Conv2d(20, 40, padding=(3, 3), kernel_size=(7, 7))self.conv3 = torch.nn.Conv2d(40, 60, padding=(2, 2), kernel_size=(5, 5))self.pooling = torch.nn.MaxPool2d((2, 2))self.fc1 = torch.nn.Linear(540, 256)self.fc2 = torch.nn.Linear(256, 10)def forward(self, x):batch_size = x.size(0)x = torch.relu(self.pooling(self.conv1(x)))x = torch.relu(self.pooling(self.conv2(x)))x = torch.relu(self.pooling(self.conv3(x)))x = x.view(batch_size, -1)x = torch.relu(self.fc1(x))x = self.fc2(x)return x# 单次训练函数
def train(epoch, criterion):running_loss = 0.0for batch_idx, data in enumerate(train_loader, 0):inputs, target = data# 将inputs, target转移到Gpu或者Cpu上inputs, target = inputs.to(device), target.to(device)optimizer.zero_grad()outputs = model(inputs)loss = criterion(outputs, target)loss.backward()optimizer.step()running_loss += loss.item()if batch_idx % 300 == 299:print('[%d, %5d] loss: %.3f' % (epoch + 1, batch_idx + 1, running_loss / 300))running_loss = 0.0# 单次测试函数
def ttt():correct = 0.0total = 0.0with torch.no_grad():for data in test_loader:images, labels = data# 将images, labels转移到Gpu或者Cpu上images, labels = images.to(device), labels.to(device)outputs = model(images)_, predicted = torch.max(outputs.data, dim=1)total += labels.size(0)correct += (predicted == labels).sum().item()print('Accuracy on test set: %d %% [%d/%d]' % (100 * correct / total, correct, total))if __name__ == '__main__':batch_size = 64transform = transforms.Compose([transforms.ToTensor(), transforms.Normalize((0.1307,), (0.3081,))])train_dataset = datasets.MNIST(root='../dataset/', train=True, download=True, transform=transform)train_loader = DataLoader(train_dataset, shuffle=True, batch_size=batch_size)test_dataset = datasets.MNIST(root='../dataset/', train=False, download=True, transform=transform)test_loader = DataLoader(test_dataset, shuffle=False, batch_size=batch_size)device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")# 声明模型model = Net()# 将模型转移道Gpu或者Cpu上model.to(device)# 定义损失函数criterion = torch.nn.CrossEntropyLoss()# 定义优化器optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.5)for epoch in range(10):train(epoch, criterion)ttt()

输出:

[1,   300] loss: 1.350
[1,   600] loss: 0.215
[1,   900] loss: 0.139
Accuracy on test set: 96 % [9684/10000]
[2,   300] loss: 0.103
[2,   600] loss: 0.090
[2,   900] loss: 0.079
Accuracy on test set: 98 % [9800/10000]
[3,   300] loss: 0.070
[3,   600] loss: 0.064
[3,   900] loss: 0.060
Accuracy on test set: 98 % [9856/10000]
[4,   300] loss: 0.048
[4,   600] loss: 0.053
[4,   900] loss: 0.051
Accuracy on test set: 98 % [9833/10000]
[5,   300] loss: 0.044
[5,   600] loss: 0.040
[5,   900] loss: 0.041
Accuracy on test set: 98 % [9850/10000]
[6,   300] loss: 0.032
[6,   600] loss: 0.039
[6,   900] loss: 0.036
Accuracy on test set: 98 % [9889/10000]
[7,   300] loss: 0.027
[7,   600] loss: 0.033
[7,   900] loss: 0.031
Accuracy on test set: 99 % [9906/10000]
[8,   300] loss: 0.023
[8,   600] loss: 0.028
[8,   900] loss: 0.027
Accuracy on test set: 99 % [9903/10000]
[9,   300] loss: 0.022
[9,   600] loss: 0.025
[9,   900] loss: 0.023
Accuracy on test set: 98 % [9873/10000]
[10,   300] loss: 0.020
[10,   600] loss: 0.019
[10,   900] loss: 0.021
Accuracy on test set: 99 % [9910/10000]

从输出结果可以看出,正确率从之前的98%提升至了99%,说明本节建立的三层CNN模型比之前的两层CNN模型具有更好的特征提取能力和分类能力。

相关内容

热门资讯

喜欢穿一身黑的男生性格(喜欢穿... 今天百科达人给各位分享喜欢穿一身黑的男生性格的知识,其中也会对喜欢穿一身黑衣服的男人人好相处吗进行解...
发春是什么意思(思春和发春是什... 本篇文章极速百科给大家谈谈发春是什么意思,以及思春和发春是什么意思对应的知识点,希望对各位有所帮助,...
网络用语zl是什么意思(zl是... 今天给各位分享网络用语zl是什么意思的知识,其中也会对zl是啥意思是什么网络用语进行解释,如果能碰巧...
为什么酷狗音乐自己唱的歌不能下... 本篇文章极速百科小编给大家谈谈为什么酷狗音乐自己唱的歌不能下载到本地?,以及为什么酷狗下载的歌曲不是...
家里可以做假山养金鱼吗(假山能... 今天百科达人给各位分享家里可以做假山养金鱼吗的知识,其中也会对假山能放鱼缸里吗进行解释,如果能碰巧解...
华为下载未安装的文件去哪找(华... 今天百科达人给各位分享华为下载未安装的文件去哪找的知识,其中也会对华为下载未安装的文件去哪找到进行解...
四分五裂是什么生肖什么动物(四... 本篇文章极速百科小编给大家谈谈四分五裂是什么生肖什么动物,以及四分五裂打一生肖是什么对应的知识点,希...
怎么往应用助手里添加应用(应用... 今天百科达人给各位分享怎么往应用助手里添加应用的知识,其中也会对应用助手怎么添加微信进行解释,如果能...
客厅放八骏马摆件可以吗(家里摆... 今天给各位分享客厅放八骏马摆件可以吗的知识,其中也会对家里摆八骏马摆件好吗进行解释,如果能碰巧解决你...
苏州离哪个飞机场近(苏州离哪个... 本篇文章极速百科小编给大家谈谈苏州离哪个飞机场近,以及苏州离哪个飞机场近点对应的知识点,希望对各位有...