What is transfer learning? (with TensorFlow code implementation)

Time: 2020-11-24

1、 The concept of transfer learning

What is transfer learning? Transfer learning can be represented by the following diagram:

 

 

The leftmost part of this diagram shows transfer learning: the trained model and its weights are carried over directly to the new dataset for training, and we only replace the classifier of the previous model (the fully connected layer and the softmax/sigmoid output). This saves the time of training a whole new model from scratch!

But why is it possible to do so?

2、 Why can transfer learning be used?

Generally, in image classification problems, the front layers of a convolutional neural network identify the most basic features of an image, such as outlines, colors, and textures, while the later layers extract the more abstract features of the image. Therefore, the best approach is to keep the weights of the bottom layers of the convolutional neural network and train only the top layers and the new classifier. So how do we use transfer learning in image classification? In general, the steps for using transfer learning, i.e. a pretrained neural network, are as follows:

1. Freeze the convolutional-layer weights of the pretrained network

2. Replace the old fully connected layer with a new fully connected layer and classifier

3. Unfreeze some of the top convolutional layers, keeping the weights of the bottom convolutional layers

4. Jointly train the unfrozen top convolutional layers and the new fully connected layers to obtain the new network weights
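The four steps above can be sketched in Keras as follows. This is a minimal sketch only: `weights=None` is used here just to avoid downloading the ImageNet weights, and the classifier head is simplified to a single sigmoid unit.

```python
import tensorflow as tf

# Steps 1-2: take a pretrained conv base (weights omitted here for speed)
# and put a new classifier head on top of it
conv_base = tf.keras.applications.VGG16(weights=None, include_top=False)
model = tf.keras.Sequential([
    conv_base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(1, activation='sigmoid'),
])

# Step 3: unfreeze only the top 3 layers of the conv base,
# keeping the weights of the lower layers frozen
conv_base.trainable = True
for layer in conv_base.layers[:-3]:
    layer.trainable = False

# Step 4: compiling and fitting the model would now jointly train
# the unfrozen top conv layers and the new classifier head
print(sum(layer.trainable for layer in conv_base.layers))  # → 3
```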

Now that we know the basic characteristics of transfer learning, why not give it a try?

3、 Code implementation of transfer learning

We use transfer learning to classify and recognize images of cats and dogs. The cat and dog images are arranged in my folder as shown in the following figure:

 

 

Then import the packages:

import tensorflow as tf
from tensorflow import keras
import matplotlib.pyplot as plt
import numpy as np
import glob
import os

Get the image paths and labels, and make batched data. The images are stored in the train folder on drive F, at the path F:/university study/AI/dataset/CatDog/train/.

The code is as follows:

keras = tf.keras
layers = tf.keras.layers
# Get all the image paths (folder layout assumed: one subfolder per class)
train_image_path = glob.glob("F:/university study/AI/dataset/CatDog/train/*/*.jpg")
# Get all the image labels: 1 for cat, 0 for dog
train_image_label = [int(p.split("\\")[1] == 'cat') for p in train_image_path]
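As a quick sanity check on the labelling expression, here are two hypothetical Windows-style paths of the kind glob returns (forward slashes come from the pattern, backslashes are added by glob), showing that the component after the first backslash decides the label:

```python
# Hypothetical example paths; only the subfolder name after the
# first backslash ('cat' or 'dog') matters for the label
sample_paths = [
    "F:/university study/AI/dataset/CatDog/train\\cat\\cat.0.jpg",
    "F:/university study/AI/dataset/CatDog/train\\dog\\dog.0.jpg",
]
labels = [int(p.split("\\")[1] == 'cat') for p in sample_paths]
print(labels)  # → [1, 0]
```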


# Now decode the JPEG file into a three-dimensional matrix
def load_preprosess_image(path, label):
    # Read the file at the path
    image = tf.io.read_file(path)
    # Decode; a color image has 3 channels
    image = tf.image.decode_jpeg(image, channels=3)
    # Resize the image to a common size, by cropping or distorting;
    # distortion is applied here
    image = tf.image.resize(image, [360, 360])
    # Randomly crop the image
    image = tf.image.random_crop(image, [256, 256, 3])
    # Randomly flip the image left-right
    image = tf.image.random_flip_left_right(image)
    # Randomly flip the image up-down
    image = tf.image.random_flip_up_down(image)
    # Randomly change the brightness of the image
    image = tf.image.random_brightness(image, 0.5)
    # Randomly change the contrast
    image = tf.image.random_contrast(image, 0, 1)
    # Change the data type
    image = tf.cast(image, tf.float32)
    # Normalize the image to [0, 1]
    image = image / 255
    # Now process the label: the labels come in as scalars like [1, 2, 3]
    # and need to become shape-[1] tensors like [[1], [2], [3]]
    label = tf.reshape(label, [1])
    return image, label

train_image_ds = tf.data.Dataset.from_tensor_slices((train_image_path, train_image_label))
AUTOTUNE = tf.data.experimental.AUTOTUNE  # tune automatically according to the machine's performance
train_image_ds = train_image_ds.map(load_preprosess_image, num_parallel_calls=AUTOTUNE)
# train_image_ds will now be read in; specify the shuffling and the batch size
BATCH_SIZE = 32
train_count = len(train_image_path)
# Now set shuffling and batching
train_image_ds = train_image_ds.shuffle(train_count).batch(BATCH_SIZE)
train_image_ds = train_image_ds.prefetch(AUTOTUNE)  # prefetch part of the data, ready to be read

imags, labels = next(iter(train_image_ds))  # take the images and labels out of one batch
plt.imshow(imags[30])

A cat image from the batch data is displayed:
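To verify the shapes that the augmentation steps above produce, here is a small sketch that uses a random dummy tensor in place of a decoded JPEG (the random values are an assumption, not real image data):

```python
import tensorflow as tf

# Dummy stand-in for a decoded-and-resized image
image = tf.random.uniform([360, 360, 3], maxval=255.0, dtype=tf.float32)
image = tf.image.random_crop(image, [256, 256, 3])
image = tf.image.random_flip_left_right(image)
image = tf.image.random_flip_up_down(image)
image = tf.cast(image, tf.float32) / 255
print(image.shape)  # → (256, 256, 3)
```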

 

Build the network architecture: introduce the classic image-classification model VGG16 and load the weights of the VGG16 pretrained network. Finally, set the last three convolutional layers to be trainable, so that the top of the convolutional network can be trained jointly with the fully connected classifier.

conv_base = keras.applications.VGG16(weights='imagenet', include_top=False)
# weights='imagenet' means using the weights trained on ImageNet; None means
# no pretrained weights, just the network architecture. include_top indicates
# whether to include the fully connected classification layers.
# We add our own fully connected layer and output layer on top of the conv base
model = keras.Sequential()
model.add(conv_base)
model.add(layers.GlobalAveragePooling2D())
model.add(layers.Dense(512, activation='relu'))
model.add(layers.Dense(1, activation='sigmoid'))

conv_base.trainable = True  # the conv base has 19 layers
for layer in conv_base.layers[:-3]:
    layer.trainable = False
# Everything from the first layer up to the third-from-last is frozen;
# the top conv layers are now unfrozen and joint training can begin

# Compile the network
model.compile(optimizer=keras.optimizers.Adam(lr=0.001),
              loss='binary_crossentropy',
              metrics=['acc'])

history = model.fit(
    train_image_ds,
    steps_per_epoch=train_count // BATCH_SIZE,
    epochs=1
)

The results of training for just one epoch are as follows:

Train for 62 steps
62/62 [==============================] - 469s 8s/step - loss: 0.6323 - acc: 0.6159

After a single epoch the accuracy has already reached over 60%. How about it? Do you now have a feel for transfer learning?