1、 The concept of transfer learning
What is transfer learning? Transfer learning can be represented by the following diagram:
The leftmost part of this diagram shows transfer learning: a trained model and its weights are reused directly on a new dataset, and only the classifier of the original model (the fully connected layers and the softmax/sigmoid output) is replaced. This saves the time of training a new model from scratch!
But why is it possible to do so?
2、 Why can transfer learning be used?
Generally, in image classification, the front layers of a convolutional neural network identify the most basic features of an image, such as outlines, colors, and textures, while the later layers extract the more abstract features. The best strategy is therefore to keep the weights of the lower convolutional layers and train only the top layers together with a new classifier. So how do we use transfer learning in image classification? In general, the steps for using a pre-trained network are as follows:
1. Freeze the convolutional-layer weights of the pre-trained network
2. Replace the old fully connected layers and classifier with new ones
3. Unfreeze the top few convolutional layers while keeping the weights of the lower convolutional layers frozen
4. Jointly train the unfrozen top convolutional layers and the new fully connected layers to obtain the new network weights
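The four steps above can be sketched in Keras. This is a minimal illustration using a tiny hand-built convolutional base as a stand-in for a real pre-trained model (the layer sizes here are arbitrary, chosen only to keep the example small):

```python
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

# A tiny stand-in for a pre-trained convolutional base
conv_base = keras.Sequential([
    layers.Conv2D(8, 3, activation='relu', input_shape=(64, 64, 3)),
    layers.Conv2D(8, 3, activation='relu'),
    layers.Conv2D(8, 3, activation='relu'),
], name='conv_base')

# Step 1: freeze all convolutional weights of the "pre-trained" base
conv_base.trainable = False

# Step 2: attach a new fully connected head and classifier
model = keras.Sequential([
    conv_base,
    layers.GlobalAveragePooling2D(),
    layers.Dense(16, activation='relu'),
    layers.Dense(1, activation='sigmoid'),
])

# Step 3: unfreeze only the topmost conv layer, keep the lower layers frozen
conv_base.trainable = True
for layer in conv_base.layers[:-1]:
    layer.trainable = False

# Step 4: compile, so the top conv layer and the new head train jointly
model.compile(optimizer=keras.optimizers.Adam(1e-3),
              loss='binary_crossentropy', metrics=['acc'])
```

Calling `model.fit` on labeled data would then update only the unfrozen top conv layer and the new head.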
Now that we know the basic characteristics of transfer learning, why not give it a try?
3、 Code implementation of transfer learning
We use transfer learning to classify and recognize images of cats and dogs. The cat and dog images are organized in my folder as shown in the following figure:
First, import the packages:
import tensorflow as tf
from tensorflow import keras
import matplotlib.pyplot as plt
import numpy as np
import glob
import os
Next, get the image paths and labels and build batched data. The images are stored in the train folder on drive F, at the path F:/university study/AI/dataset/CatDog/train/.
The code is as follows:
keras = tf.keras
layers = tf.keras.layers

# Get the paths of all training images
train_image_path = glob.glob('F:/university study/AI/dataset/CatDog/train/*.jpg')
# Get the label of each image from its file name: 1 for cat, 0 for dog
train_image_label = [int(os.path.basename(p).split('.')[0] == 'cat')
                     for p in train_image_path]

# Decode a JPG file into a 3-D tensor and augment it
def load_preprosess_image(path, label):
    # Read the file at the given path
    image = tf.io.read_file(path)
    # Decode it; a color image has 3 channels
    image = tf.image.decode_jpeg(image, channels=3)
    # Resize every image to the same size
    image = tf.image.resize(image, [360, 360])
    # Randomly crop the image
    image = tf.image.random_crop(image, [256, 256, 3])
    # Randomly flip the image left-right
    image = tf.image.random_flip_left_right(image)
    # Randomly flip the image up-down
    image = tf.image.random_flip_up_down(image)
    # Randomly change the brightness of the image
    image = tf.image.random_brightness(image, 0.5)
    # Randomly change the contrast
    image = tf.image.random_contrast(image, 0, 1)
    # Change the data type
    image = tf.cast(image, tf.float32)
    # Normalize the image to [0, 1]
    image = image / 255
    # Reshape each scalar label into a length-1 vector,
    # so the labels [1, 2, 3] become [[1], [2], [3]]
    label = tf.reshape(label, [1])
    return image, label

train_image_ds = tf.data.Dataset.from_tensor_slices((train_image_path, train_image_label))
# Let tf.data tune the degree of parallelism to the machine's performance
AUTOTUNE = tf.data.experimental.AUTOTUNE
train_image_ds = train_image_ds.map(load_preprosess_image, num_parallel_calls=AUTOTUNE)

# Now specify the shuffle order and batch size for train_image_ds
BATCH_SIZE = 32
train_count = len(train_image_path)
# Shuffle and batch the dataset
train_image_ds = train_image_ds.shuffle(train_count).batch(BATCH_SIZE)
# Prefetch so preprocessing overlaps with training
train_image_ds = train_image_ds.prefetch(AUTOTUNE)

# Take one batch and display its first image
imags, labels = next(iter(train_image_ds))
plt.imshow(imags[0])
A cat image from the batch is displayed.
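The shuffle/batch/prefetch pattern used above can be exercised on toy data without any image files; this small sketch uses random arrays as stand-ins for decoded images:

```python
import numpy as np
import tensorflow as tf

# Toy stand-in data: 10 random "images" with binary labels
images = np.random.rand(10, 8, 8, 3).astype('float32')
labels = np.random.randint(0, 2, size=(10, 1))

# Same pipeline shape as in the cat/dog code: shuffle, batch, prefetch
ds = tf.data.Dataset.from_tensor_slices((images, labels))
ds = ds.shuffle(10).batch(4).prefetch(tf.data.experimental.AUTOTUNE)

# Each element of the dataset is now a (batch_images, batch_labels) pair
batch_images, batch_labels = next(iter(ds))
```

The first batch contains 4 images of shape (8, 8, 3) and 4 length-1 labels, matching the structure the training loop consumes.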
Now build the network architecture: introduce the classic image-classification model VGG16 and load its pre-trained weights. Finally, set the last three convolutional layers to be trainable, so that the top of the convolutional network can be trained jointly with the fully connected classifier.
conv_base = keras.applications.VGG16(weights='imagenet', include_top=False)
# weights='imagenet' loads the weights trained on ImageNet; weights=None would
# use the architecture only, without pre-trained weights. include_top indicates
# whether to keep VGG16's own fully connected classification layers.

# Add our own fully connected layers and output layer on top of the conv base
model = keras.Sequential()
model.add(conv_base)
model.add(layers.GlobalAveragePooling2D())
model.add(layers.Dense(512, activation='relu'))
model.add(layers.Dense(1, activation='sigmoid'))

conv_base.trainable = True  # the base has 19 layers in total
for layer in conv_base.layers[:-3]:
    layer.trainable = False
# Everything up to the last three layers is frozen; the top convolutional
# layers are now unfrozen and will be trained jointly with the new classifier

# Compile the network
model.compile(optimizer=keras.optimizers.Adam(lr=0.001),
              loss='binary_crossentropy',
              metrics=['acc'])
history = model.fit(
    train_image_ds,
    steps_per_epoch=train_count // BATCH_SIZE,
    epochs=1
)
The result of training for only one epoch is as follows:
Train for 62 steps 62/62 [==============================] - 469s 8s/step - loss: 0.6323 - acc: 0.6159
After a single epoch the accuracy has already passed 60%. How about it? Do you have a feel for transfer learning now?
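Once trained, the model can be used on new images. This is a hypothetical inference sketch (the helper name and the demo file are illustrative, not from the original code): it mirrors the training-time preprocessing but drops the random augmentations, which are only useful during training.

```python
import tensorflow as tf

# Hypothetical helper mirroring the training preprocessing, minus augmentation:
# decode, resize to the model's input size, scale to [0, 1], add a batch axis
def preprocess_for_predict(path):
    img = tf.io.read_file(path)
    img = tf.image.decode_jpeg(img, channels=3)
    img = tf.image.resize(img, [256, 256])
    img = tf.cast(img, tf.float32) / 255.0
    return img[tf.newaxis, ...]  # shape (1, 256, 256, 3)

# Write a synthetic JPEG just to exercise the function
fake = tf.cast(tf.random.uniform([300, 300, 3], maxval=255), tf.uint8)
tf.io.write_file('demo.jpg', tf.io.encode_jpeg(fake))
x = preprocess_for_predict('demo.jpg')

# With the trained model: score = model.predict(x); a sigmoid score > 0.5
# means label 1, i.e. 'cat' under the labelling used above (1 = cat, 0 = dog)
```

Keeping inference preprocessing consistent with training (same resize and normalization) matters: feeding unnormalized pixels to a model trained on [0, 1] inputs would degrade its predictions.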