# PyTorch – VGG network construction

Time: 2022-1-16

Contents

VGG introduction

What makes VGG powerful?

What is a receptive field?

Calculation formula

Question 1

Question 2

Network diagram

Building the VGG network in PyTorch

🚓1. model.py

The clever part

🚓2. train.py

3. predict.py

Notes

# VGG introduction

VGG was proposed in 2014 by the Visual Geometry Group (VGG), a well-known research group at the University of Oxford.

In that year's ImageNet competition it took first place in the localization task and second place in the classification task, which is a very strong result.

## What makes VGG powerful?

By stacking several small convolution kernels in place of one large convolution kernel, VGG reduces the number of trainable parameters while keeping the same receptive field.

## What is a receptive field?

The size of the region on the input layer that determines one element in a layer's output is called the receptive field.

Simply put, it is the size of the area on the input layer that one cell of the output feature map corresponds to. For example, in the figure, the receptive field of maxpool 1 is 2 (one cell of its output corresponds to a 2 × 2 region of its input).

The receptive field of Conv1 is 5.

## Calculation formula

The receptive field is calculated as:

$$F(i) = (F(i + 1) - 1) \times \text{Stride} + \text{Ksize}$$

where F(i) is the receptive field of layer i, F(i + 1) is the receptive field of layer i + 1, Stride is the stride of layer i, and Ksize is the size of the convolution kernel or pooling kernel.

## Question 1:

Two stacked 3 × 3 convolution kernels are used in place of one 5 × 5 kernel, and three stacked 3 × 3 kernels in place of one 7 × 7 kernel.

(In the VGG network, the convolution stride defaults to 1.)

Are the receptive fields the same before and after the substitution?

According to the formula, working back from the output feature map:

Feature map: F = 1

Conv3x3(3): F = (1 − 1) × 1 + 3 = 3

Conv3x3(2): F = (3 − 1) × 1 + 3 = 5 (the receptive field of a 5 × 5 convolution kernel)

Conv3x3(1): F = (5 − 1) × 1 + 3 = 7 (the receptive field of a 7 × 7 convolution kernel)

Two stacked 3 × 3 convolution kernels have the same receptive field as one 5 × 5 kernel, and three stacked 3 × 3 kernels have the same receptive field as one 7 × 7 kernel.

This shows that two stacked 3 × 3 kernels can replace a 5 × 5 kernel, and three stacked 3 × 3 kernels can replace a 7 × 7 kernel.
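As a quick check of this calculation, here is a minimal Python sketch that applies the recurrence F(i) = (F(i + 1) − 1) × Stride + Ksize from the output back towards the input. The function name `receptive_field` and the `(ksize, stride)` tuple format are illustrative choices, not part of the original code.

```python
def receptive_field(layers):
    """Receptive field of one output cell with respect to the network input.

    `layers` lists (ksize, stride) pairs from the first layer to the last.
    Applies F(i) = (F(i + 1) - 1) * stride + ksize, starting from F = 1
    at the output and walking back towards the input.
    """
    f = 1
    for ksize, stride in reversed(layers):
        f = (f - 1) * stride + ksize
    return f


conv3x3 = (3, 1)  # 3x3 kernel, stride 1 (the VGG default)
print(receptive_field([conv3x3]))      # 3
print(receptive_field([conv3x3] * 2))  # 5, same as one 5x5 kernel
print(receptive_field([conv3x3] * 3))  # 7, same as one 7x7 kernel
```

The same function also handles layers with different strides, since each layer contributes its own `(ksize, stride)` pair.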

## Question 2:

Does stacking 3 × 3 convolution kernels really reduce the number of training parameters?

Note: number of CNN parameters = kernel size (height × width) × kernel depth × number of kernels = kernel size × depth of the input feature matrix × depth of the output feature matrix.
Assume that the depth of the input feature matrix = the depth of the output feature matrix = C.

Parameters needed for one 7 × 7 convolution kernel: 7 × 7 × C × C = 49C²

Parameters needed for three stacked 3 × 3 convolution kernels: 3 × (3 × 3 × C × C) = 27C²

Clearly 27C² is less than 49C², so the stacked version needs fewer parameters.
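The comparison can be reproduced with a few lines of Python. This is only a sketch of the counting rule stated in the note (biases are ignored); the helper name `conv_params` and the value chosen for `C` are assumptions made for illustration.

```python
def conv_params(ksize, in_depth, out_depth):
    """Weights of one convolution layer, ignoring biases:
    kernel height * kernel width * input depth * output depth."""
    return ksize * ksize * in_depth * out_depth


C = 64  # hypothetical channel depth; the ratio is the same for any C

one_7x7 = conv_params(7, C, C)        # 49 * C^2
three_3x3 = 3 * conv_params(3, C, C)  # 27 * C^2

print(one_7x7, three_3x3)
print(three_3x3 / one_7x7)  # 27 / 49, a little over half
```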

## network diagram

The VGG network comes in several versions.

We generally use VGG16 (16 means 16 weight layers = 13 convolution layers + 3 fully connected layers).

The network structure is shown in the figure. From the figure and a quick calculation, we can see that a 3 × 3 convolution (stride 1, padding 1) does not change the size of the feature matrix:

$$\text{out} = \frac{\text{in} - F + 2P}{S} + 1 = \frac{\text{in} - 3 + 2}{1} + 1 = \text{in}$$

That is, out has the same size as in.
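The output-size formula out = (in − F + 2P) / S + 1 is easy to check numerically. The sketch below uses a hypothetical helper `conv_out_size` and applies the same formula to the 2 × 2, stride 2 max pooling layers, which is where the 7 × 7 feature map that reaches the classifier comes from for a 224 × 224 input.

```python
def conv_out_size(in_size, kernel, padding, stride):
    """out = (in - F + 2P) / S + 1, using integer (floor) division."""
    return (in_size - kernel + 2 * padding) // stride + 1


size = 224
# A 3x3 convolution with padding 1 and stride 1 keeps the size unchanged.
assert conv_out_size(size, kernel=3, padding=1, stride=1) == size

# Each 2x2 max pooling with stride 2 (no padding) halves the size.
for _ in range(5):  # VGG16 has five pooling layers
    size = conv_out_size(size, kernel=2, padding=0, stride=2)
print(size)  # 7
```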

# Building the VGG network in PyTorch

The VGG network is divided into two modules: convolution layers for feature extraction and fully connected layers for classification.

## 🚓1. model.py

```python
import torch
import torch.nn as nn


class VGG(nn.Module):
    def __init__(self, features, num_classes=1000, init_weights=False):
        super(VGG, self).__init__()
        self.features = features  # convolution layers: feature extraction
        self.classifier = nn.Sequential(  # fully connected layers: classification
            nn.Dropout(p=0.5),
            nn.Linear(512 * 7 * 7, 2048),
            nn.ReLU(True),
            nn.Dropout(p=0.5),
            nn.Linear(2048, 2048),
            nn.ReLU(True),
            nn.Linear(2048, num_classes)
        )
        if init_weights:
            self._initialize_weights()  # initialize weights

    def forward(self, x):
        # N x 3 x 224 x 224
        x = self.features(x)
        # N x 512 x 7 x 7
        x = torch.flatten(x, start_dim=1)
        # N x 512*7*7
        x = self.classifier(x)
        return x

    def _initialize_weights(self):
        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                # nn.init.kaiming_normal_(m.weight, mode='fan_out', nonlinearity='relu')
                nn.init.xavier_uniform_(m.weight)
                if m.bias is not None:
                    nn.init.constant_(m.bias, 0)
            elif isinstance(m, nn.Linear):
                nn.init.xavier_uniform_(m.weight)
                # nn.init.normal_(m.weight, 0, 0.01)
                nn.init.constant_(m.bias, 0)
```

### The clever part

```python
# VGG model configuration lists: a number is the number of convolution kernels
# (output channels) of a layer, 'M' is a max pooling layer
cfgs = {
    'vgg11': [64, 'M', 128, 'M', 256, 256, 'M', 512, 512, 'M', 512, 512, 'M'],  # Model A
    'vgg13': [64, 64, 'M', 128, 128, 'M', 256, 256, 'M', 512, 512, 'M', 512, 512, 'M'],  # Model B
    'vgg16': [64, 64, 'M', 128, 128, 'M', 256, 256, 256, 'M', 512, 512, 512, 'M', 512, 512, 512, 'M'],  # Model D
    'vgg19': [64, 64, 'M', 128, 128, 'M', 256, 256, 256, 256, 'M', 512, 512, 512, 512, 'M', 512, 512, 512, 512, 'M'],  # Model E
}


# Convolution layers: feature extraction
def make_features(cfg: list):  # takes the configuration list of one specific model
    layers = []
    in_channels = 3  # the input image has three channels (RGB)
    for v in cfg:
        # 'M' adds a max pooling layer
        if v == "M":
            layers += [nn.MaxPool2d(kernel_size=2, stride=2)]
        # anything else adds a convolution layer followed by ReLU
        else:
            conv2d = nn.Conv2d(in_channels, v, kernel_size=3, padding=1)
            layers += [conv2d, nn.ReLU(True)]
            in_channels = v
    return nn.Sequential(*layers)  # the single asterisk (*) unpacks the list into positional arguments


def vgg(model_name="vgg16", **kwargs):  # the double asterisk (**) collects keyword arguments into a dict
    try:
        cfg = cfgs[model_name]
    except KeyError:
        print("Warning: model number {} not in cfgs dict!".format(model_name))
        exit(-1)
    model = VGG(make_features(cfg), **kwargs)  # **kwargs forwards the keyword arguments you passed in
    return model
```
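To see how the configuration lists relate to the model names, the following sketch counts the layers each list produces. It repeats the lists so that it runs on its own; the helper `summarize` and the constant `FC_LAYERS` are names introduced here for illustration, and the value 3 reflects the three `nn.Linear` layers in the classifier above.

```python
cfgs = {
    "vgg11": [64, "M", 128, "M", 256, 256, "M", 512, 512, "M", 512, 512, "M"],
    "vgg13": [64, 64, "M", 128, 128, "M", 256, 256, "M", 512, 512, "M", 512, 512, "M"],
    "vgg16": [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
              512, 512, 512, "M", 512, 512, 512, "M"],
    "vgg19": [64, 64, "M", 128, 128, "M", 256, 256, 256, 256, "M",
              512, 512, 512, 512, "M", 512, 512, 512, 512, "M"],
}

FC_LAYERS = 3  # number of nn.Linear layers in the classifier


def summarize(cfg):
    """Return (number of convolution layers, number of pooling layers)."""
    convs = sum(1 for v in cfg if v != "M")
    pools = cfg.count("M")
    return convs, pools


for name, cfg in cfgs.items():
    convs, pools = summarize(cfg)
    print("{}: {} conv + {} fc = {} weight layers, {} pooling layers".format(
        name, convs, FC_LAYERS, convs + FC_LAYERS, pools))
```

For every configuration the number of convolution layers plus the three fully connected layers matches the number in the model name.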

## 🚓2. train.py

This is the same as in "PyTorch – AlexNet – training on the flower classification data set" (the data is still the flower data).

```python
import os
import json

import torch
import torch.nn as nn
from torchvision import transforms, datasets
import torch.optim as optim
from tqdm import tqdm

from model import vgg


def main():
    device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
    print("using {} device.".format(device))

    data_transform = {
        "train": transforms.Compose([transforms.RandomResizedCrop(224),
                                     transforms.RandomHorizontalFlip(),
                                     transforms.ToTensor(),
                                     transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))]),
        "val": transforms.Compose([transforms.Resize((224, 224)),
                                   transforms.ToTensor(),
                                   transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])}

    data_root = os.path.abspath(os.path.join(os.getcwd(), "../.."))  # get data root path
    image_path = os.path.join(data_root, "data_set", "flower_data")  # flower data set path
    assert os.path.exists(image_path), "{} path does not exist.".format(image_path)
    train_dataset = datasets.ImageFolder(root=os.path.join(image_path, "train"),
                                         transform=data_transform["train"])
    train_num = len(train_dataset)

    # {'daisy':0, 'dandelion':1, 'roses':2, 'sunflower':3, 'tulips':4}
    flower_list = train_dataset.class_to_idx
    cla_dict = dict((val, key) for key, val in flower_list.items())
    # write dict into json file
    json_str = json.dumps(cla_dict, indent=4)
    with open('class_indices.json', 'w') as json_file:
        json_file.write(json_str)

    batch_size = 32
    nw = min([os.cpu_count(), batch_size if batch_size > 1 else 0, 8])  # number of workers
    print('Using {} dataloader workers every process'.format(nw))

    # num_workers=0 loads data in the main process; pass nw to use worker processes
    train_loader = torch.utils.data.DataLoader(train_dataset,
                                               batch_size=batch_size, shuffle=True,
                                               num_workers=0)

    validate_dataset = datasets.ImageFolder(root=os.path.join(image_path, "val"),
                                            transform=data_transform["val"])
    val_num = len(validate_dataset)
    validate_loader = torch.utils.data.DataLoader(validate_dataset,
                                                  batch_size=batch_size, shuffle=False,
                                                  num_workers=0)
    print("using {} images for training, {} images for validation.".format(train_num,
                                                                           val_num))

    # test_image, test_label = next(iter(validate_loader))

    model_name = "vgg16"
    net = vgg(model_name=model_name, num_classes=5, init_weights=True)
    net.to(device)
    loss_function = nn.CrossEntropyLoss()
    # Assumption: the optimizer line is missing from the original listing;
    # Adam with lr=0.0001 is used here as a placeholder.
    optimizer = optim.Adam(net.parameters(), lr=0.0001)

    epochs = 30
    best_acc = 0.0
    save_path = './{}Net.pth'.format(model_name)
    train_steps = len(train_loader)  # batches per epoch, used to average the loss
    for epoch in range(epochs):
        # train
        net.train()
        running_loss = 0.0
        train_bar = tqdm(train_loader)
        for step, data in enumerate(train_bar):
            images, labels = data
            optimizer.zero_grad()  # clear gradients left over from the previous step
            outputs = net(images.to(device))
            loss = loss_function(outputs, labels.to(device))
            loss.backward()
            optimizer.step()

            # print statistics
            running_loss += loss.item()

            train_bar.desc = "train epoch[{}/{}] loss:{:.3f}".format(epoch + 1,
                                                                     epochs,
                                                                     loss)

        # validate
        net.eval()
        acc = 0.0  # accumulate accurate number / epoch
        with torch.no_grad():  # no gradients are needed during validation
            val_bar = tqdm(validate_loader)
            for val_data in val_bar:
                val_images, val_labels = val_data
                outputs = net(val_images.to(device))
                # torch.max returns (values, indices); the indices are the predicted classes
                predict_y = torch.max(outputs, dim=1)[1]
                acc += torch.eq(predict_y, val_labels.to(device)).sum().item()

        val_accurate = acc / val_num
        print('[epoch %d] train_loss: %.3f  val_accuracy: %.3f' %
              (epoch + 1, running_loss / train_steps, val_accurate))

        # keep only the weights with the best validation accuracy
        if val_accurate > best_acc:
            best_acc = val_accurate
            torch.save(net.state_dict(), save_path)

    print('Finished Training')


if __name__ == '__main__':
    main()
```

## 3. predict.py

```python
import os
import json

import torch
from PIL import Image
from torchvision import transforms
import matplotlib.pyplot as plt

from model import vgg


def main():
    device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

    data_transform = transforms.Compose(
        [transforms.Resize((224, 224)),
         transforms.ToTensor(),
         transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])

    # load image
    img_path = "../tulip.jpg"
    assert os.path.exists(img_path), "file: '{}' does not exist.".format(img_path)
    img = Image.open(img_path)
    plt.imshow(img)
    # [C, H, W]
    img = data_transform(img)
    # expand batch dimension: [N, C, H, W]
    img = torch.unsqueeze(img, dim=0)

    # read the index -> class name mapping written by train.py
    json_path = './class_indices.json'
    assert os.path.exists(json_path), "file: '{}' does not exist.".format(json_path)

    with open(json_path, "r") as json_file:
        class_indict = json.load(json_file)

    # create model
    model = vgg(model_name="vgg16", num_classes=5).to(device)
    # load the weights saved by train.py
    weights_path = "./vgg16Net.pth"
    assert os.path.exists(weights_path), "file: '{}' does not exist.".format(weights_path)
    model.load_state_dict(torch.load(weights_path, map_location=device))

    model.eval()
    with torch.no_grad():  # no gradients are needed for prediction
        # predict class
        output = torch.squeeze(model(img.to(device))).cpu()
        predict = torch.softmax(output, dim=0)
        predict_cla = torch.argmax(predict).numpy()

    print_res = "class: {}   prob: {:.3}".format(class_indict[str(predict_cla)],
                                                 predict[predict_cla].numpy())
    plt.title(print_res)
    for i in range(len(predict)):
        print("class: {:10}   prob: {:.3}".format(class_indict[str(i)],
                                                  predict[i].numpy()))
    plt.show()


if __name__ == '__main__':
    main()
```
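One detail worth spelling out: train.py writes a dictionary with integer keys to class_indices.json, but JSON object keys are always strings, which is why predict.py looks classes up with `class_indict[str(i)]`. A small sketch, with the class names taken from the comment in train.py and an in-memory round trip standing in for the file:

```python
import json

# Same shape as train_dataset.class_to_idx in train.py
flower_list = {"daisy": 0, "dandelion": 1, "roses": 2, "sunflower": 3, "tulips": 4}

# Invert it to index -> class name, as train.py does
cla_dict = dict((val, key) for key, val in flower_list.items())

# Round trip through JSON (train.py writes a file; a string is used here)
class_indict = json.loads(json.dumps(cla_dict, indent=4))

print(list(class_indict.keys()))  # the keys come back as strings
print(class_indict[str(4)])       # tulips
```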

# Notes

The VGG network is very deep, so training it calls for a powerful GPU with plenty of memory. My 3050 can't run it: PyTorch reports an error that GPU memory is insufficient.

You can also try reducing batch_size.
