Convolutional neural networks, part 1


[Task 1] Video learning notes and open questions

Write a summary of the three videos below, then list any points you did not understand.

[Task 2] Code practice

Complete the code exercises on Google Colab, take screenshots of the key steps, and add your own thoughts and interpretations.

[Task 3] Further study

Study Google's Inception V1 to V4 and MobileNets on your own, then write up what you learned.

[Task 4] Hands-on practice

AI Institute has launched a new competition, “Industrial Surface Texture Defect Detection”, with a prize of 3000 yuan. Competition link:

The competition runs from July 24 to August 22 and comes from an image processing cooperation project at the University of Heidelberg. Any submission that scores above the baseline of 40 earns a bonus.


1. Video learning

● Mathematical foundations of deep learning

(download address: )

● Convolutional neural networks

(download address: )

The main contents include:

  • The basic structure of a CNN: convolution, pooling, and fully connected layers
  • Typical network structures: AlexNet, VGG, GoogLeNet, ResNet

● Jingdong experts explain ResNet with Python code

Bilibili link:

The main contents include: the basic principle of residual learning, identity mapping and shortcut connections, and an implementation of ResNet152 in Python

All video download links expire automatically on July 30, 2020, so please download the videos soon.


[Task 1 answer] Video learning notes and open questions

Video learning notes:

● Mathematical foundations of deep learning

1、 Mathematical foundations of machine learning (deep learning)

2、 Three elements of machine learning: model, strategy and algorithm

1. Unifying the probabilistic and functional forms

2. “Optimal” strategy design

3. Loss function

3、 Frequentist school & Bayesian school

4、 Beyond deep learning

1. Causal inference

2. Swarm intelligence


● Convolutional neural networks

1、 Introduction

1. Applications of convolutional neural networks

(1) Basic application

Classification, retrieval, detection and segmentation


(2) Specific application

      Face recognition, expression recognition, image-to-image generation, text-to-image generation, image style transfer, autonomous driving


2. Traditional neural networks vs. convolutional neural networks

(1) The three steps of deep learning

       Step 1. Build the neural network structure

       Step 2. Find a suitable loss function

                  For example: cross-entropy loss, MSE

       Step 3. Find a suitable optimization method and update the parameters

                  For example: back propagation (BP), stochastic gradient descent (SGD)
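As a rough illustration of step 3, the sketch below fits a single weight w in the model y = w * x by stochastic gradient descent on a squared-error loss. The toy data, the learning rate, and the function name `sgd_fit` are assumptions made for this example. In a real network the gradient would come from back propagation rather than a hand-derived formula.

```python
import random


def sgd_fit(samples, lr=0.05, epochs=200, seed=0):
    """Fit y = w * x with per-sample gradient steps (hypothetical helper)."""
    rng = random.Random(seed)
    data = list(samples)
    w = 0.0
    for _ in range(epochs):
        rng.shuffle(data)  # visit the samples in random order each epoch
        for x, y in data:
            grad = 2 * (w * x - y) * x  # derivative of (w*x - y)^2 w.r.t. w
            w -= lr * grad              # parameter update
    return w


# Toy data generated from y = 2 * x, so the fitted weight should approach 2.
w = sgd_fit([(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)])
print(round(w, 4))
```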



       In a one-hot label, the position of the true class is 1 and every other position is 0.
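A minimal sketch of this encoding in plain Python follows. The helper name `one_hot` and the 10-class example are assumptions for illustration, not part of the course material.

```python
def one_hot(label, num_classes):
    """Return a list with 1 at index `label` and 0 everywhere else."""
    if not 0 <= label < num_classes:
        raise ValueError("label must be in [0, num_classes)")
    return [1 if i == label else 0 for i in range(num_classes)]


# Encoding the digit 3 for a 10-class problem such as MNIST.
print(one_hot(3, 10))  # [0, 0, 0, 1, 0, 0, 0, 0, 0, 0]
```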


(3) Loss function

In classification loss: y_i is the true class and y_i^p is the predicted class (in the alternative notation, y is the true class and f(x) is the prediction).

In regression loss: y_i is the true position and y_i^p is the predicted position.
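To make the two kinds of loss concrete, here is a small sketch of cross-entropy (classification) and mean squared error (regression) in plain Python. The function names, the `eps` guard against log(0), and the sample numbers are assumptions chosen for this example.

```python
import math


def cross_entropy(y_true, y_pred, eps=1e-12):
    """Cross-entropy between a one-hot label and predicted probabilities."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(y_true, y_pred))


def mse(y_true, y_pred):
    """Mean squared error between true and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Classification: the true class is index 1 and the model assigns it 0.8.
ce = cross_entropy([0, 1, 0], [0.1, 0.8, 0.1])
# Regression: each prediction is off by exactly 1.
reg = mse([2.0, 4.0], [1.0, 5.0])
print(round(ce, 4), reg)  # 0.2231 1.0
```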


(4) The difference between traditional and convolutional neural networks

Every layer of a traditional neural network is fully connected, which leads to too many parameters and to overfitting;

A convolutional neural network uses convolution kernels to exploit local correlation and parameter sharing. Its intermediate layers are convolution layers, activation layers, pooling layers, and fully connected layers.
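The effect of parameter sharing can be estimated with a quick count. The sketch below compares a fully connected layer and a 3 × 3 convolution layer that both map a 28 × 28 single-channel image to 32 feature maps of the same size. The layer sizes are hypothetical and the helper names are made up for this example.

```python
def fc_params(in_features, out_features):
    """Weights plus biases of a fully connected layer."""
    return in_features * out_features + out_features


def conv_params(in_channels, out_channels, kernel_size):
    """Weights plus biases of a convolution layer with square kernels."""
    return in_channels * out_channels * kernel_size * kernel_size + out_channels


# Hypothetical setting: 28x28x1 input, 28x28x32 output.
fc = fc_params(28 * 28 * 1, 28 * 28 * 32)
conv = conv_params(in_channels=1, out_channels=32, kernel_size=3)
print(fc, conv)  # 19694080 320
```

The convolution layer reuses the same 3 × 3 kernel at every position, so its parameter count does not depend on the image size.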


2、 Basic structure

1. Convolution

(1) Basic concepts involved in convolution


(2) Calculation formula of feature map size


output size = (input size + 2 × padding − kernel size) / stride + 1
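The formula can be checked with a few lines of Python. This is a sketch that assumes the division is rounded down when it is not exact, which is the usual convention in deep learning frameworks; the function name is hypothetical.

```python
def conv_output_size(input_size, kernel_size, padding=0, stride=1):
    """Feature map size along one dimension (floor division assumed)."""
    return (input_size + 2 * padding - kernel_size) // stride + 1


print(conv_output_size(32, 5))                        # 28
print(conv_output_size(28, 3, padding=1))             # 28 (size preserved)
print(conv_output_size(224, 7, padding=3, stride=2))  # 112
```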


(3) Convolution example


(4) Visual understanding of convolution

Convolution kernels in shallow layers attend more to overall information, while kernels in deep layers attend more to specific local features; different kernels attend to different information.


2. Pooling

(1) Characteristics of pooling

       a. Pooling acts as downscaling. It keeps the main features while reducing the number of parameters and the amount of computation, which helps prevent overfitting and improves the generalization ability of the model;

       b. It is generally placed between convolution layers, or between fully connected layers.


(2) Types of pooling

Max pooling: takes the maximum value in each window

Average pooling: takes the average value in each window


(3) Pooling parameters commonly used in practice

a. Max pooling is the usual choice in classification tasks

b. The filter size is generally set to 2 × 2 or 3 × 3

c. The stride is generally set to 2
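The following sketch applies 2 × 2 pooling with stride 2 to a small feature map in plain Python, using both max and average reduction. The function names and the 4 × 4 input values are assumptions made for illustration.

```python
def pool2d(image, size=2, stride=2, reduce=max):
    """Slide a size x size window over a 2D list and reduce each window."""
    rows = (len(image) - size) // stride + 1
    cols = (len(image[0]) - size) // stride + 1
    output = []
    for r in range(rows):
        row = []
        for c in range(cols):
            window = [image[r * stride + i][c * stride + j]
                      for i in range(size) for j in range(size)]
            row.append(reduce(window))
        output.append(row)
    return output


def mean(values):
    return sum(values) / len(values)


feature_map = [
    [1, 3, 2, 4],
    [5, 6, 1, 2],
    [7, 2, 9, 0],
    [1, 8, 3, 4],
]
max_pooled = pool2d(feature_map)               # [[6, 4], [8, 9]]
avg_pooled = pool2d(feature_map, reduce=mean)  # [[3.75, 2.25], [4.5, 4.0]]
print(max_pooled, avg_pooled)
```

The 4 × 4 input shrinks to 2 × 2, which is the downscaling effect described above.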


3. Fully connected layer

a. Every neuron in one layer has a weighted connection to every neuron in the next layer

b. Fully connected layers generally sit at the tail of a convolutional neural network

c. Fully connected layers usually hold the largest share of the parameters
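Point c can be checked by counting parameters. The sketch below assumes the standard ImageNet VGG16 layout (thirteen 3 × 3 convolution layers, five 2 × 2 poolings, three fully connected layers, 224 × 224 RGB input); the variants used in the course notebooks may differ.

```python
# "M" marks a 2x2 max pooling step, which has no parameters.
VGG16_CONV = [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
              512, 512, 512, "M", 512, 512, 512, "M"]
VGG16_FC = [4096, 4096, 1000]


def count_vgg16_params(input_channels=3, input_size=224):
    """Return (convolution parameters, fully connected parameters)."""
    conv_total = 0
    channels, size = input_channels, input_size
    for item in VGG16_CONV:
        if item == "M":
            size //= 2  # pooling halves the spatial size
        else:
            conv_total += channels * item * 3 * 3 + item  # weights + biases
            channels = item
    fc_total = 0
    features = channels * size * size  # flattened input to the first FC layer
    for out_features in VGG16_FC:
        fc_total += features * out_features + out_features
        features = out_features
    return conv_total, fc_total


conv_total, fc_total = count_vgg16_params()
share = fc_total / (conv_total + fc_total)
print(conv_total, fc_total, round(share, 2))  # 14714688 123642856 0.89
```

Under these assumptions, roughly 89% of the parameters sit in the three fully connected layers.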


3、 Typical structures of convolutional neural networks







4、 Hands-on code: TensorFlow CNN


● Jingdong experts explain ResNet with Python code


2. Code practice

The code exercises run on Google Colab.

● MNIST dataset classification

Build a simple CNN to classify the MNIST dataset, and learn the basic roles of pooling and convolution along the way.

Link: demo/blob/master/05_01_ConvNet.ipynb

Requirement: enter the code in Colab and run it online to observe the results

● CIFAR10 dataset classification

Use a CNN to classify the CIFAR10 dataset

Link: demo/blob/master/05_02_CNN_CIFAR10.ipynb

Requirement: enter the code in Colab and run it online to observe the results

● Use VGG16 to classify CIFAR10

Link: demo/blob/master/05_03_VGG_CIFAR10.ipynb

Requirement: enter the code in Colab and run it online to observe the results

● Transfer learning with a VGG model for the cats vs. dogs competition

Link: demo/blob/master/05_04_Transfer_VGG_for_dogs_Vs_cats.ipynb

Requirements: this part uses the cats vs. dogs competition held by Kaggle in 2013 and tests a VGG network pre-trained on ImageNet. Because the original network classifies into 1000 classes, transfer learning is used here to fine-tune it.

Read the requirements of the AI Institute cats vs. dogs competition carefully (the competition has ended, but it still accepts daily test submissions as a practice competition).

Download the competition test set (2000 pictures), run the fine-tuned VGG model on it, write the output in the competition's format, and upload the results for evaluation.


[Task 2 answer] Code practice

● MNIST dataset classification

● CIFAR10 dataset classification

● Use VGG16 to classify CIFAR10

● Transfer learning with a VGG model for the cats vs. dogs competition


3. Further study

  • Read the article “Big Talk on Classic CNN Models”, link:

The Google team's Inception network:

1. To capture information at several receptive field sizes, convolution kernels of different sizes are applied in parallel;

2. Large convolution kernels can be replaced by stacked small kernels (for example, one 5×5 convolution can be replaced by two 3×3 convolutions);

3. Asymmetric convolutions are used (for example, an n×n convolution is replaced by a 1×n convolution followed by an n×1 convolution).
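Points 2 and 3 can be checked with simple arithmetic. The sketch below counts weights (biases ignored) and computes the receptive field of stacked stride-1 convolutions. The channel count of 64 and the helper names are assumptions for illustration.

```python
def conv_weights(kernel_h, kernel_w, in_channels, out_channels):
    """Number of weights in one convolution layer, ignoring biases."""
    return kernel_h * kernel_w * in_channels * out_channels


def stacked_receptive_field(kernel_sizes):
    """Receptive field of stride-1 convolutions applied one after another."""
    field = 1
    for k in kernel_sizes:
        field += k - 1
    return field


channels = 64  # hypothetical channel count, kept equal for input and output

one_5x5 = conv_weights(5, 5, channels, channels)
two_3x3 = 2 * conv_weights(3, 3, channels, channels)
one_3x3 = conv_weights(3, 3, channels, channels)
asymmetric = (conv_weights(1, 3, channels, channels)
              + conv_weights(3, 1, channels, channels))

print(stacked_receptive_field([3, 3]))  # 5, same coverage as one 5x5 kernel
print(one_5x5, two_3x3)                 # 102400 73728
print(one_3x3, asymmetric)              # 36864 24576
```

With these numbers, two 3×3 layers need 72% of the weights of one 5×5 layer, and the 1×3 plus 3×1 pair needs two thirds of the weights of one 3×3 layer.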


  • Read Google's 2017 paper: “MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications”

       MobileNet: depthwise separable convolution factorizes a standard convolution into a depthwise convolution and a pointwise convolution. For details, please refer to the Zhihu article:
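The saving from this factorization can be estimated by counting weights. This is a sketch with biases ignored; the 3 × 3 kernel and the 32 and 64 channel counts are hypothetical values chosen for the example.

```python
def standard_conv_weights(kernel_size, in_channels, out_channels):
    """One kernel_size x kernel_size filter per (input, output) channel pair."""
    return kernel_size * kernel_size * in_channels * out_channels


def depthwise_separable_weights(kernel_size, in_channels, out_channels):
    """Depthwise: one spatial filter per input channel.
    Pointwise: a 1x1 convolution that mixes the channels."""
    depthwise = kernel_size * kernel_size * in_channels
    pointwise = in_channels * out_channels
    return depthwise + pointwise


standard = standard_conv_weights(3, 32, 64)
separable = depthwise_separable_weights(3, 32, 64)
ratio = separable / standard
print(standard, separable, round(ratio, 4))  # 18432 2336 0.1267
```

The ratio equals 1/out_channels + 1/kernel_size², so with 3 × 3 kernels the separable form needs roughly one eighth to one ninth of the weights.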


  • Read “HybridSN: Exploring 3-D–2-D CNN Feature Hierarchy for Hyperspectral Image Classification”

       HybridSN: consider the difference between two-dimensional and three-dimensional convolution.
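One way to approach this question is to compare output shapes. In the sketch below, a 2D convolution treats the spectral bands as input channels and sums over all of them, while a 3D convolution also slides along the spectral axis and keeps it in the output. The patch size, band count, and kernel sizes are hypothetical, and stride 1 with no padding is assumed.

```python
def conv2d_output_shape(in_shape, out_channels, kernel):
    """in_shape = (channels, height, width); kernel = (kh, kw)."""
    _, h, w = in_shape
    kh, kw = kernel
    return (out_channels, h - kh + 1, w - kw + 1)


def conv3d_output_shape(in_shape, out_channels, kernel):
    """in_shape = (channels, depth, height, width); kernel = (kd, kh, kw)."""
    _, d, h, w = in_shape
    kd, kh, kw = kernel
    return (out_channels, d - kd + 1, h - kh + 1, w - kw + 1)


# Hypothetical hyperspectral patch: 25 x 25 pixels with 30 spectral bands.
# 2D view: the 30 bands are channels, so each filter spans all of them.
shape_2d = conv2d_output_shape((30, 25, 25), out_channels=8, kernel=(3, 3))
# 3D view: one input channel, the bands form a depth axis the kernel slides over.
shape_3d = conv3d_output_shape((1, 30, 25, 25), out_channels=8, kernel=(7, 3, 3))

print(shape_2d)  # (8, 23, 23): the spectral axis is gone
print(shape_3d)  # (8, 24, 23, 23): the spectral axis is preserved
```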


[Task 3 answer] Further study

● GoogLeNet (evolution from Inception V1 to V4)

● MobileNets

● HybridSN


4. Hands-on practice
