Machine Learning Algorithms and Implementation: Python Programming and Application Examples -- Post 8: Closing Remarks and ResNet in Practice

传媒学子


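Closing remarks:

This is the last post of this review activity. First, thanks to the forum for the opportunity to take part in this reading review. Through it I learned many basic machine learning concepts, which helped me connect a lot of foundational AI knowledge and gave me a deeper understanding of how neural networks work, from building the objective function to computing gradients, and from traditional object detection models to current models such as RNNs and CNNs. Thanks to the rapid development of computing technology, with the steadily rising compute of heterogeneous CPU and GPU hardware and increasing memory speeds, I believe AI will develop even faster.

Thanks to the forum and to the authors of the book. Some parts of the book are not described very clearly; I hope the authors can explain more clearly the relationship between neural networks and the deep convolutional networks that interest most readers today, and how one evolved into the other. Of course, the book is mainly about machine learning, and in that sense it is already very good. Thanks to Mr. Bu and the other authors for their hard work, and thanks again to the forum for organizing this kind of reading review, which is very worthwhile. I hope the 8 posts I have shared are of some use to everyone.

ResNet in practice -- introduction: in the other part of this post I try to get the PyTorch implementation of ResNet (residual network) described in the book running locally. ResNet comes up repeatedly in my work: software engineers commonly use ResNet50 to benchmark the compute and other performance figures of the AI chips we build. ResNet50 is somewhat dated by now, but it remains a representative AI network.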

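Environment:

Hardware:

I ran the experiments on a PC with the following configuration:

Processor: 13th Gen Intel(R) Core(TM) i7-13700KF 3.40 GHz

Installed RAM: 16.0 GB (15.7 GB usable)

System type: 64-bit operating system, x64-based processor

GPU: RTX 4060 Ti 8GB

Software:

Windows 11

PyCharm Community Edition

Anaconda (free edition)

PyTorch (torch)

CUDA

Git

pip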

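Note: it is advisable to manage Python environments with conda. In PyCharm, different projects may need different Python libraries installed, and per-project conda environments make this easier to manage.

In addition, you can switch conda and pip to a mirror hosted in China, such as the Tsinghua or USTC mirror; otherwise, because of network restrictions, downloading some Python packages can be very slow.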
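Before running the examples, it may be worth confirming which interpreter the active conda environment provides and which packages it can see. The following is a minimal sketch using only the standard library; the helper name describe_env and the default package list are my own assumptions, not something from the book.

```python
import importlib.util
import platform
import sys


def describe_env(packages=("torch", "torchvision", "matplotlib", "numpy")):
    """Report the interpreter in use and which packages it can see."""
    info = {
        "python": platform.python_version(),
        "executable": sys.executable,  # shows which conda env is active
    }
    for name in packages:
        # find_spec returns None when the package is not installed
        info[name] = importlib.util.find_spec(name) is not None
    return info


if __name__ == "__main__":
    for key, value in describe_env().items():
        print(f"{key}: {value}")
```

If torch is reported as installed, calling torch.cuda.is_available() then tells you whether the GPU build is usable on this machine.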

The book also describes the environment setup in detail; see:

[link hidden in the original post]

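One more thing to note: many of the book's examples use from utils import train. Do not mistake this utils for a Python library. It is a local module implemented by the book's authors, so do not try to install it with pip; what pip installs is not what you want. I found many explanations of this online and most of them were wrong; only a few bloggers got it right. The utils module can be downloaded from

[link hidden in the original post]

Each chapter has its own utils, so make sure you use the one that matches the chapter.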
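One way to check which utils Python will actually import is to ask the import system where the name resolves. This is a small sketch using the standard library only; locate_module is a hypothetical helper name of mine.

```python
import importlib.util


def locate_module(name):
    """Return the file a module name would be imported from, or None."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    return spec.origin


if __name__ == "__main__":
    # For the book's examples this path should point at the chapter's own
    # utils.py, not at something inside site-packages.
    print("utils resolves to:", locate_module("utils"))
```

If the printed path is inside site-packages, a pip-installed package named utils is shadowing the book's file; if it prints None, the chapter's utils.py is not in the working directory or on the Python path.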

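Introduction to ResNet (Residual Network):

When a deep neural network has many layers, it can suffer from vanishing gradients, which makes the network hard or even impossible to train: the farther a layer is from the loss layer, the smaller its gradient during backpropagation and the harder it is to update. ResNet addresses this vanishing of back-propagated gradients by introducing cross-layer (skip) connections.

With this innovation, ResNet achieved the lowest top-5 error rate in the 2015 ImageNet competition.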
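To see why a skip connection helps, consider a toy scalar model. If each plain layer has local derivative w, the gradient through n layers is the product of the w values. A residual layer y = x + f(x) has local derivative 1 + w instead, because the identity path contributes the 1. The sketch below works under that simplifying assumption (scalar layers, a made-up derivative of 0.1 per layer); it illustrates the idea and is not a model of the real network.

```python
def plain_gradient(local_derivs):
    """Gradient through stacked plain layers: product of local derivatives."""
    grad = 1.0
    for w in local_derivs:
        grad *= w
    return grad


def residual_gradient(local_derivs):
    """Gradient through stacked residual layers y = x + f(x).

    Each layer contributes (1 + w): the 1 comes from the identity path.
    """
    grad = 1.0
    for w in local_derivs:
        grad *= 1.0 + w
    return grad


if __name__ == "__main__":
    derivs = [0.1] * 20  # hypothetical: 20 layers, each with derivative 0.1
    print("plain:   ", plain_gradient(derivs))
    print("residual:", residual_gradient(derivs))
```

With small local derivatives the plain product shrinks toward zero as depth grows, while the identity path keeps the residual product from collapsing.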

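A basic unit of a residual network:

The role of batch normalization

In machine learning, data is usually normalized before training so that its distribution is consistent. A deep neural network is typically trained on one batch at a time rather than on the whole dataset. Each batch has a different distribution, and the parameters of earlier layers keep changing, so the distribution of the inputs to each layer shifts during training (the internal covariate shift problem), which makes learning harder for the next layer. Batch Normalization normalizes the data to zero mean and unit variance, which on one hand keeps the distributions consistent and on the other helps mitigate vanishing gradients.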
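The normalization step can be sketched in plain Python for a single feature across one mini-batch. This is a simplified illustration, not the book's code: gamma and beta stand in for the learnable scale and shift, eps is the usual small constant for numerical stability, and the sample numbers are made up. nn.BatchNorm2d applies the same idea per channel, and additionally keeps running statistics for use at evaluation time.

```python
import math


def batch_norm_1d(values, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize one feature across a mini-batch, then scale and shift."""
    n = len(values)
    mean = sum(values) / n
    # Biased (population) variance, computed over the batch
    var = sum((v - mean) ** 2 for v in values) / n
    return [gamma * (v - mean) / math.sqrt(var + eps) + beta for v in values]


if __name__ == "__main__":
    batch = [2.0, 4.0, 6.0, 8.0]  # one feature, four samples (made-up numbers)
    print(batch_norm_1d(batch))
```

After the call, the batch has a mean of approximately zero and a variance of approximately one, whatever the scale of the inputs was.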

Dataset download:

http://www.cs.toronto.edu/~kriz/cifar.html

The CIFAR-10 and CIFAR-100 are labeled subsets of the 80 million tiny images dataset. They were collected by Alex Krizhevsky, Vinod Nair, and Geoffrey Hinton.

The CIFAR-10 dataset consists of 60000 32x32 colour images in 10 classes, with 6000 images per class. There are 50000 training images and 10000 test images.

The dataset is divided into five training batches and one test batch, each with 10000 images. The test batch contains exactly 1000 randomly-selected images from each class. The training batches contain the remaining images in random order, but some training batches may contain more images from one class than another. Between them, the training batches contain exactly 5000 images from each class.

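Alternatively, change the dataset line to: train_set = CIFAR10('../../data', train=True, transform=data_tf, download=True)

and the dataset will be downloaded automatically.

In the figure above, /2 means stride=2; if no stride is specified, it defaults to 1.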
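The effect of each /2 on the feature map size can be checked with the standard output-size formula for convolution and pooling layers, floor((size + 2*padding - kernel) / stride) + 1. The sketch below applies it to the layer parameters used in resnet.py; the function names are mine, and residual blocks with stride 1 are skipped because they keep the size unchanged.

```python
def conv_out_size(size, kernel, stride=1, padding=0):
    """Output height/width of a conv or pooling layer (floor division)."""
    return (size + 2 * padding - kernel) // stride + 1


def trace_resnet_sizes(size=96):
    """Follow the spatial size through the downsampling layers of resnet.py."""
    sizes = {}
    size = conv_out_size(size, kernel=7, stride=2)  # block1: 7x7 conv, /2
    sizes["block1"] = size
    size = conv_out_size(size, kernel=3, stride=2)  # block2: 3x3 max pool, /2
    sizes["block2"] = size
    for name in ("block3", "block4", "block5"):
        # First residual block of each stage: 3x3 conv, padding 1, /2
        size = conv_out_size(size, kernel=3, stride=2, padding=1)
        sizes[name] = size  # for block5 this is the size before AvgPool2d
    # AvgPool2d(3): kernel 3, stride defaults to the kernel size
    sizes["avgpool"] = conv_out_size(size, kernel=3, stride=3)
    return sizes


if __name__ == "__main__":
    print(trace_resnet_sizes(96))
```

For a 96x96 input the sizes go 45, 22, 11, 6, 3 and finally 1 after the average pool, which is why the classifier is nn.Linear(512, num_classes). With the native 32x32 CIFAR-10 size the same arithmetic leaves only a 1x1 map before AvgPool2d(3), too small for a 3x3 window; this is presumably why the script resizes the images to 96.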

Code:

There are two files, utils.py and resnet.py; just run resnet.py.

utils.py

from datetime import datetime

import torch
import torch.nn.functional as F
from torch import nn
from torch.autograd import Variable


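def get_acc(output, label):
    """Return the fraction of samples in the batch that are classified correctly."""
    total = output.shape[0]
    _, pred_label = output.max(1)  # index of the largest logit per sample
    num_correct = (pred_label == label).sum().item()
    return num_correct / total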


def train(net, train_data, valid_data, num_epochs, optimizer, criterion, use_cuda=True):
    # Pick the device once; fall back to the CPU when CUDA is unavailable.
    device = torch.device("cuda" if use_cuda and torch.cuda.is_available() else "cpu")
    net = net.to(device)

    l_train_loss = []
    l_train_acc = []
    l_valid_loss = []
    l_valid_acc = []

    prev_time = datetime.now()
    for epoch in range(num_epochs):
        train_loss = 0
        train_acc = 0
        net.train()
        for im, label in train_data:
            im = im.to(device)        # (bs, 3, h, w)
            label = label.to(device)  # (bs,)

            # forward
            output = net(im)
            loss = criterion(output, label)

            # backward
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()

            train_loss += loss.item()
            train_acc += get_acc(output, label)

        if valid_data is not None:
            valid_loss = 0
            valid_acc = 0
            net.eval()
            # Validation needs no gradients; disabling them saves memory.
            with torch.no_grad():
                for im, label in valid_data:
                    im = im.to(device)
                    label = label.to(device)
                    output = net(im)
                    loss = criterion(output, label)
                    valid_loss += loss.item()
                    valid_acc += get_acc(output, label)
            epoch_str = (
                    "[%2d] Train:(L=%f, Acc=%f), Valid:(L=%f, Acc=%f), "
                    % (epoch, train_loss / len(train_data),
                       train_acc / len(train_data), valid_loss / len(valid_data),
                       valid_acc / len(valid_data)))

            l_valid_acc.append(valid_acc / len(valid_data))
            l_valid_loss.append(valid_loss / len(valid_data))
        else:
            epoch_str = ("[%2d] Train:(L=%f, Acc=%f), " %
                         (epoch, train_loss / len(train_data),
                          train_acc / len(train_data)))

        l_train_acc.append(train_acc / len(train_data))
        l_train_loss.append(train_loss / len(train_data))

        # Wall-clock time spent on this epoch
        cur_time = datetime.now()
        elapsed = int((cur_time - prev_time).total_seconds())
        h, remainder = divmod(elapsed, 3600)
        m, s = divmod(remainder, 60)
        time_str = "T: %02d:%02d:%02d" % (h, m, s)

        prev_time = cur_time
        print(epoch_str + time_str)

    return (l_train_loss, l_train_acc, l_valid_loss, l_valid_acc)


def conv3x3(in_channel, out_channel, stride=1):
    return nn.Conv2d(
        in_channel, out_channel, 3, stride=stride, padding=1, bias=False)


class residual_block(nn.Module):
    def __init__(self, in_channel, out_channel, same_shape=True):
        super(residual_block, self).__init__()
        self.same_shape = same_shape
        stride = 1 if self.same_shape else 2

        self.conv1 = conv3x3(in_channel, out_channel, stride=stride)
        self.bn1 = nn.BatchNorm2d(out_channel)

        self.conv2 = conv3x3(out_channel, out_channel)
        self.bn2 = nn.BatchNorm2d(out_channel)
        if not self.same_shape:
            self.conv3 = nn.Conv2d(in_channel, out_channel, 1, stride=stride)

    def forward(self, x):
        out = self.conv1(x)
        out = F.relu(self.bn1(out), True)
        out = self.conv2(out)
        out = F.relu(self.bn2(out), True)

        if not self.same_shape:
            x = self.conv3(x)
        return F.relu(x + out, True)


class resnet(nn.Module):
    def __init__(self, in_channel, num_classes, verbose=False):
        super(resnet, self).__init__()
        self.verbose = verbose

        self.block1 = nn.Conv2d(in_channel, 64, 7, 2)

        self.block2 = nn.Sequential(
            nn.MaxPool2d(3, 2), residual_block(64, 64), residual_block(64, 64))

        self.block3 = nn.Sequential(
            residual_block(64, 128, False), residual_block(128, 128))

        self.block4 = nn.Sequential(
            residual_block(128, 256, False), residual_block(256, 256))

        self.block5 = nn.Sequential(
            residual_block(256, 512, False),
            residual_block(512, 512), nn.AvgPool2d(3))

        self.classifier = nn.Linear(512, num_classes)

    def forward(self, x):
        x = self.block1(x)
        if self.verbose:
            print('block 1 output: {}'.format(x.shape))
        x = self.block2(x)
        if self.verbose:
            print('block 2 output: {}'.format(x.shape))
        x = self.block3(x)
        if self.verbose:
            print('block 3 output: {}'.format(x.shape))
        x = self.block4(x)
        if self.verbose:
            print('block 4 output: {}'.format(x.shape))
        x = self.block5(x)
        if self.verbose:
            print('block 5 output: {}'.format(x.shape))
        x = x.view(x.shape[0], -1)
        x = self.classifier(x)
        return x

resnet.py

import numpy as np
import torch
from torch import nn
import torch.nn.functional as F
from torch.autograd import Variable
from torchvision.datasets import CIFAR10
from torchvision import transforms as tfs

def conv3x3(in_channel, out_channel, stride=1):
    return nn.Conv2d(in_channel, out_channel, 3,
                     stride=stride, padding=1, bias=False)


class Residual_Block(nn.Module):
    def __init__(self, in_channel, out_channel, same_shape=True):
        super(Residual_Block, self).__init__()
        self.same_shape = same_shape
        stride = 1 if self.same_shape else 2

        self.conv1 = conv3x3(in_channel, out_channel, stride=stride)
        self.bn1 = nn.BatchNorm2d(out_channel)

        self.conv2 = conv3x3(out_channel, out_channel)
        self.bn2 = nn.BatchNorm2d(out_channel)
        if not self.same_shape:
            self.conv3 = nn.Conv2d(in_channel, out_channel, 1,
                                   stride=stride)

    def forward(self, x):
        out = self.conv1(x)
        out = F.relu(self.bn1(out), True)
        out = self.conv2(out)
        out = F.relu(self.bn2(out), True)

        if not self.same_shape:
            x = self.conv3(x)
        return F.relu(x + out, True)


# Input and output have the same shape
test_net = Residual_Block(32, 32)
test_x = torch.zeros(1, 32, 96, 96)
print('input: {}'.format(test_x.shape))
test_y = test_net(test_x)
print('output: {}'.format(test_y.shape))

# Input and output have different shapes (stride 2 halves height and width)
test_net = Residual_Block(3, 32, False)
test_x = torch.zeros(1, 3, 96, 96)
print('input: {}'.format(test_x.shape))
test_y = test_net(test_x)
print('output: {}'.format(test_y.shape))


class ResNet(nn.Module):
    def __init__(self, in_channel, num_classes, verbose=False):
        super(ResNet, self).__init__()
        self.verbose = verbose

        self.block1 = nn.Conv2d(in_channel, 64, 7, 2)

        self.block2 = nn.Sequential(
            nn.MaxPool2d(3, 2),
            Residual_Block(64, 64),
            Residual_Block(64, 64)
        )

        self.block3 = nn.Sequential(
            Residual_Block(64, 128, False),
            Residual_Block(128, 128)
        )

        self.block4 = nn.Sequential(
            Residual_Block(128, 256, False),
            Residual_Block(256, 256)
        )

        self.block5 = nn.Sequential(
            Residual_Block(256, 512, False),
            Residual_Block(512, 512),
            nn.AvgPool2d(3)
        )

        self.classifier = nn.Linear(512, num_classes)

    def forward(self, x):
        x = self.block1(x)
        if self.verbose:
            print('block 1 output: {}'.format(x.shape))
        x = self.block2(x)
        if self.verbose:
            print('block 2 output: {}'.format(x.shape))
        x = self.block3(x)
        if self.verbose:
            print('block 3 output: {}'.format(x.shape))
        x = self.block4(x)
        if self.verbose:
            print('block 4 output: {}'.format(x.shape))
        x = self.block5(x)
        if self.verbose:
            print('block 5 output: {}'.format(x.shape))
        x = x.view(x.shape[0], -1)
        x = self.classifier(x)
        return x
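

# Print the output shape of every block for one 96x96 RGB input
test_net = ResNet(3, 10, True)
test_x = torch.zeros(1, 3, 96, 96)
test_y = test_net(test_x)
print('output: {}'.format(test_y.shape))

# utils is the chapter's local utils.py (same directory), not a pip package
from utils import train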

def data_tf(x):
    im_aug = tfs.Compose([
        tfs.Resize(96),
        tfs.ToTensor(),
        tfs.Normalize([0.5, 0.5, 0.5], [0.5, 0.5, 0.5])
    ])
    x = im_aug(x)
    return x


train_set = CIFAR10('../../data', train=True, transform=data_tf, download=True)
train_data = torch.utils.data.DataLoader(train_set, batch_size=64, shuffle=True)
test_set = CIFAR10('../../data', train=False, transform=data_tf)
test_data = torch.utils.data.DataLoader(test_set, batch_size=128, shuffle=False)

net = ResNet(3, 10)
optimizer = torch.optim.Adam(net.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()
res = train(net, train_data, test_data, 100, optimizer, criterion)

import matplotlib.pyplot as plt
#%matplotlib inline

plt.plot(res[0], label='train')
plt.plot(res[2], label='valid')
plt.xlabel('epoch')
plt.ylabel('Loss')
plt.legend(loc='best')
plt.savefig('fig-res-resnet-train-validate-loss.pdf')
plt.show()

plt.plot(res[1], label='train')
plt.plot(res[3], label='valid')
plt.xlabel('epoch')
plt.ylabel('Acc')
plt.legend(loc='best')
plt.savefig('fig-res-resnet-train-validate-acc.pdf')
plt.show()

# save raw data (numpy is already imported as np at the top of the file)
np.save('fig-res-resnet_data.npy', res)

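Results:

PC hardware utilization during training:

The network does not perform very well: accuracy stayed at around 80%. My guess is that the network is not deep enough. The book's explanation is that the dataset is small and no data augmentation was used.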
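The book's point about data augmentation refers to randomly perturbing each training image so the network sees slightly different inputs every epoch. Two common perturbations, a random horizontal flip and a random crop after padding, can be sketched in plain Python on an H x W image stored as a list of rows. This is only an illustration with hypothetical helper names; I did not try augmentation in this post, so whether it raises the accuracy here is untested.

```python
import random


def random_horizontal_flip(image, p=0.5, rng=random):
    """Flip an H x W image (list of rows) left-to-right with probability p."""
    if rng.random() < p:
        return [row[::-1] for row in image]
    return [row[:] for row in image]


def random_crop(image, size, padding=0, fill=0, rng=random):
    """Pad the image on every side, then cut out a random size x size window."""
    width = len(image[0])
    blank = [fill] * (width + 2 * padding)
    padded = (
        [blank[:] for _ in range(padding)]
        + [[fill] * padding + row[:] + [fill] * padding for row in image]
        + [blank[:] for _ in range(padding)]
    )
    top = rng.randint(0, len(padded) - size)      # randint is inclusive
    left = rng.randint(0, len(padded[0]) - size)
    return [row[left:left + size] for row in padded[top:top + size]]


if __name__ == "__main__":
    img = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    print(random_crop(random_horizontal_flip(img), size=3, padding=1))
```

In a real pipeline one would more likely use the ready-made torchvision transforms RandomCrop and RandomHorizontalFlip inside the Compose in data_tf, applied to the training set only so that the test images stay unchanged.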

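Conclusion: machine learning is an interdisciplinary field, with mathematics as its foundation and the computer as its tool. With sustained effort most people can grasp the basic ideas, but creating an elegant network model takes both the foundations laid by predecessors and persistent work. Hardware is also a necessary driver of technical progress. AI has gone through several cycles of boom and bust, and today's boom owes much to advances in computing: the arrival of large amounts of high-performance hardware has made complex computation practical. As someone working in the AI chip industry, I find it useful to have some understanding of AI applications; it strengthens my overall skills and raises my own ceiling, so that I know not only what works but also why. I believe AI will contribute even more to human development in the future. Let us wait and see.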
