Book: come on, it’s time to prove you love me!

Date: 2020-06-16

Live demo: http://at.iunitv.cn/

Effect preview: see the live demo linked above.

Trivia:

A lot of folks say they just can't get into studying. Well, at least they're honest.

After all, reading still has many benefits: for example, it can make your brain radiate the light of wisdom, and it gives you a ready-made excuse that you have no girlfriend because you are too busy reading, and so on. So on this special day, we are covering your books for the year. Not for anything else, just to help everyone here meet a better self in 2020!


Today's theme is simply giving away books, but we also want to take this special opportunity to spread some TensorFlow knowledge. We will use TensorFlow.js to build a book recognition model, run it in a Vue application, and give a web page the ability to recognize books.

This article covers the relevant AI concepts and shows how to use the SSD MobileNet V1 model for transfer learning, helping you build a book recognition model that can run in a web page.

[There is a giveaway at the end of the article.]

Main text:

What is transfer learning

Transfer learning and domain adaptation refer to the situation where what has been learned in one setting is exploited to improve generalization in another setting. (Deep Learning, p. 526)

Let's understand it simply. Taking today's book recognition model as an example: we take an AI model that others have already trained for image recognition, keep its image feature extraction capability, and then train it again so that it gains the ability to recognize books.

Transfer learning can greatly speed up model training while still achieving reasonably good accuracy.

The model we will transfer from today is SSD MobileNet V1. If you are new to neural networks, you can think of it as a small, lightweight AI model for image recognition that runs efficiently on mobile devices; if you are interested in its specific network architecture, you can look it up yourself.

Now that we know the basic concepts, let's get started! We will build our own AI model on top of SSD MobileNet and run it in a Vue application.

Object detection

This project trains an object detection model, that is, a model that can recognize an object in a picture and draw a box around it.


Preparation

Sync the development environment

To spare everyone the various pitfalls caused by environment differences, let's align our running environments before we start. If you want to follow along hands-on, try to keep your setup consistent with ours; it will save you from stepping in pits and from the "from getting started to giving up" phenomenon.

Development environment

  • System: macOS
  • Python version: 3.7.3
  • TensorFlow version: 1.15.2
  • TensorFlow.js version: 1.7.2
  • Development tools: PyCharm and WebStorm

Download the projects

After syncing the development environment, it's finally time to get started. First, we need to download a few projects from GitHub:

  • The transfer model training project (the tensorflow/models repository)
  • The image format conversion project
  • The image annotation tool labelImg

Prepare the image material

We can collect book images through a search engine.

Next, clone the labelImg project from GitHub, then install and run it by following its GitHub instructions for your environment.


Then follow the steps below to annotate the images and save the selected regions as XML files:

  1. Open the directory where the images are stored
  2. Choose the directory where the annotations will be saved
  3. Draw a box around the target area in the image
  4. Label the boxed area
  5. Save as XML

After saving, we will see many XML files in the chosen directory. Each file records the image's file name and size, the box coordinates, and the label; these are used for the subsequent model training.
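
For reference, a labelImg annotation is a Pascal VOC-style XML file roughly like the sketch below (the file name and coordinates are made-up examples). Note the filename, size, name, and bndbox fields; these are exactly what the conversion script in a later section reads:

<annotation>
    <folder>images</folder>
    <filename>book_001.jpg</filename>
    <size>
        <width>800</width>
        <height>600</height>
        <depth>3</depth>
    </size>
    <object>
        <name>book</name>
        <bndbox>
            <xmin>120</xmin>
            <ymin>85</ymin>
            <xmax>560</xmax>
            <ymax>470</ymax>
        </bndbox>
    </object>
</annotation>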

Configure and install the object detection environment

Clone the transfer model training project from GitHub, take care to work on the r1.5 branch, and open the project with PyCharm.


First, we need to install TensorFlow 1.15.2:

pip install tensorflow==1.15.2

Second, install the dependency packages:

sudo pip install pillow
sudo pip install lxml
sudo pip install jupyter
sudo pip install matplotlib

Then switch to the research directory in the terminal and run a few configuration commands. For details, refer to the instructions on GitHub:

cd ./research
protoc object_detection/protos/*.proto --python_out=.
export PYTHONPATH=$PYTHONPATH:`pwd`:`pwd`/slim

Finally, we run the model_builder_test.py file. If you see "OK" in the terminal, the configuration succeeded:

python object_detection/builders/model_builder_test.py

Convert the XML files into the TFRecord format required by TensorFlow

Clone and open the image format conversion project; then we will make a few small modifications to it.

Modify the directories:

  1. Delete the contents of the annotations, data, and training directories
  2. Add an xmls directory to store the XML files

Modify the files:

Next, we will modify the following two files and add a new one, to make the format conversion easier.

  1. Rewrite xml_to_csv.py as:

    import os
    import glob
    import random
    import time
    import shutil
    import xml.etree.ElementTree as ET

    import pandas as pd


    class Xml2Cvs:
        def __init__(self):
            self.xml_filepath = r'./xmls'
            self.save_basepath = r"./annotations"
            self.trainval_percent = 0.9  # share of samples used for train + validation
            self.train_percent = 0.85    # share of trainval used for train

        def xml_split_train(self):
            # Randomly split the XML annotations into train / validation / test sets
            total_xml = os.listdir(self.xml_filepath)
            num = len(total_xml)
            indices = range(num)
            tv = int(num * self.trainval_percent)
            tr = int(tv * self.train_percent)
            trainval = random.sample(indices, tv)
            train = random.sample(trainval, tr)
            print("train and val size", tv)
            print("train size", tr)
            start = time.time()
            test_num = 0
            val_num = 0
            train_num = 0
            for i in indices:
                name = total_xml[i]
                if i in trainval:
                    if i in train:
                        directory = "train"
                        train_num += 1
                    else:
                        directory = "validation"
                        val_num += 1
                else:
                    directory = "test"
                    test_num += 1
                # Copy the XML file into annotations/<train|validation|test>
                xml_path = os.path.join(os.getcwd(), 'annotations/{}'.format(directory))
                if not os.path.exists(xml_path):
                    os.mkdir(xml_path)
                file_path = os.path.join(self.xml_filepath, name)
                new_file = os.path.join(self.save_basepath, os.path.join(directory, name))
                shutil.copyfile(file_path, new_file)

            seconds = time.time() - start
            print("train total : " + str(train_num))
            print("validation total : " + str(val_num))
            print("test total : " + str(test_num))
            print("total number : " + str(train_num + val_num + test_num))
            print("Time taken : {0} seconds".format(seconds))

        def xml_to_csv(self, path):
            # Flatten every <object> in every XML file into one CSV row
            xml_list = []
            for xml_file in glob.glob(path + '/*.xml'):
                tree = ET.parse(xml_file)
                root = tree.getroot()
                print(root.find('filename').text)
                for obj in root.findall('object'):
                    value = (root.find('filename').text,
                             int(root.find('size').find('width').text),
                             int(root.find('size').find('height').text),
                             obj.find('name').text,
                             int(obj.find('bndbox').find('xmin').text),
                             int(obj.find('bndbox').find('ymin').text),
                             int(obj.find('bndbox').find('xmax').text),
                             int(obj.find('bndbox').find('ymax').text))
                    xml_list.append(value)
            column_name = ['filename', 'width', 'height', 'class', 'xmin', 'ymin', 'xmax', 'ymax']
            return pd.DataFrame(xml_list, columns=column_name)

        def main(self):
            # Write one CSV per split into the data directory
            for directory in ['train', 'test', 'validation']:
                xml_path = os.path.join(os.getcwd(), 'annotations/{}'.format(directory))
                xml_df = self.xml_to_csv(xml_path)
                xml_df.to_csv('data/mask_{}_labels.csv'.format(directory), index=None)
                print('Successfully converted xml to csv.')


    if __name__ == '__main__':
        converter = Xml2Cvs()
        converter.xml_split_train()
        converter.main()
  2. Modify generate_tfrecord.py, which converts the CSV files into the record format required by TensorFlow:


Change row_label here to the label name we used in labelImg; since we only have one label, we simply change it to book (a minimal sketch of this change follows this list).

  3. Add a new generate_tfrecord.sh script, to make it easy to run generate_tfrecord.py for each split:

    #!/usr/bin/env bash
    python generate_tfrecord.py --csv_input=data/mask_train_labels.csv  --output_path=data/mask_train.record --image_dir=images
    python generate_tfrecord.py --csv_input=data/mask_test_labels.csv  --output_path=data/mask_test.record --image_dir=images
    python generate_tfrecord.py --csv_input=data/mask_validation_labels.csv  --output_path=data/mask_validation.record --image_dir=images
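
For reference, in common versions of generate_tfrecord.py the label mapping lives in a class_text_to_int helper. A minimal sketch of the change, assuming that layout (the function name may differ in your copy):

def class_text_to_int(row_label):
    # Map each labelImg label to the numeric class id declared in book.pbtxt.
    # We only have one label, so 'book' maps to id 1.
    if row_label == 'book':
        return 1
    return None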
    

Configure the environment for object detection

Before converting, export PYTHONPATH so the object detection modules can be found (use the full directory path of your models/research/slim):

export PYTHONPATH=$PYTHONPATH:/full/path/to/your/models/research/slim

Finally, we copy the image files into the images directory and the XML files into the xmls directory, then run xml_to_csv.py; we will see several CSV files generated in the data directory. Then we run generate_tfrecord.sh, and the data files in the format TensorFlow needs are ready.
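
A minimal run, assuming the terminal is at the root of the conversion project:

python xml_to_csv.py
sh generate_tfrecord.sh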


Transfer training the model

In this phase, we need to do the following:

  • Put the newly generated record files in the corresponding directory
  • Download the SSD MobileNet V1 model files
  • Configure the book.pbtxt and book.config files

Place the record files and the SSD MobileNet V1 model

For convenience, I empty the models/research/object_detection/test_data directory and put the transfer training files there.

First, we download the SSD MobileNet V1 model files.


We download the first one, ssd_mobilenet_v1_coco. After the download completes, unpack the model archive, put the model files into the model directory under test_data, and put the record files we just generated directly under test_data.
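
After this step, test_data should look roughly like this (a sketch pieced together from the paths used later in book.config; your checkpoint file names may differ):

test_data/
    model/                  # unpacked ssd_mobilenet_v1_coco files (model.ckpt.*)
    mask_train.record
    mask_test.record
    mask_validation.record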

Complete the pbtxt and config configuration files

In the test_data directory, create a new book.pbtxt file and fill in the configuration:

item {
  id: 1
  name: 'book'
}

Since we only have one label, we simply configure a single item object with id 1 and name 'book'.

Because we are doing transfer learning from the SSD MobileNet V1 model, we copy the ssd_mobilenet_v1_coco.config file from the samples/configs directory and rename it book.config.


Then we edit the book.config file.

Set num_classes to the current number of labels:


Since we only have one book label, we change it to 1.

Next, replace every PATH_TO_BE_CONFIGURED placeholder with a real path.

We set the model checkpoint path here to the full path of test_data/model/model.ckpt.


We set input_path under train_input_reader to the full path of mask_train.record and label_map_path to the full path of book.pbtxt; likewise, we set input_path under eval_input_reader to the full path of mask_test.record.
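
Pieced together, the edited fragments of book.config look roughly like this (a sketch: the ... marks omitted defaults, and every /full/path/... is a placeholder for your own absolute path):

model {
  ssd {
    num_classes: 1
    ...
  }
}
train_config: {
  fine_tune_checkpoint: "/full/path/to/test_data/model/model.ckpt"
  ...
}
train_input_reader: {
  tf_record_input_reader {
    input_path: "/full/path/to/test_data/mask_train.record"
  }
  label_map_path: "/full/path/to/test_data/book.pbtxt"
}
eval_input_reader: {
  tf_record_input_reader {
    input_path: "/full/path/to/test_data/mask_test.record"
  }
  label_map_path: "/full/path/to/test_data/book.pbtxt"
}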

At this point, all of our configuration is complete. Next comes the exciting moment of training the model.

Run train.py to train the model

We run train.py in the terminal to start the transfer learning training:

python3 train.py --logtostderr --train_dir=./test_data/training/ --pipeline_config_path=./test_data/book.config

Here train_dir is the directory where our trained model will be stored, and pipeline_config_path is the relative path to the book.config file.

After running the command, we can see that the model is training step by step:


The trained model files are stored in the /test_data/training directory; it will contain TF1 checkpoint files such as model.ckpt-NNNN.data-00000-of-00001, model.ckpt-NNNN.index, and model.ckpt-NNNN.meta.

Convert the ckpt files into a pb file

Using export_inference_graph.py, we convert the trained model into pb format, which we will later convert into a file format that TensorFlow.js can load. TensorFlow.js finally makes its appearance.


We run export_inference_graph.py from the command line:

python export_inference_graph.py --input_type image_tensor --pipeline_config_path ./test_data/book.config --trained_checkpoint_prefix ./test_data/training/model.ckpt-1989 --output_directory ./test_data/training/book_model_test

Here pipeline_config_path is the relative path of book.config; trained_checkpoint_prefix is the path of the model checkpoint (for example, we pick the checkpoint saved at step 1989); and output_directory is the target directory for the pb output.

After it runs, we can see a generated book_model_test directory:


Convert the pb file into a TensorFlow.js model

First, install the tensorflowjs dependency package:

pip install tensorflowjs

Then convert the generated saved model from the command line:

tensorflowjs_converter --input_format=tf_saved_model --output_node_names='detection_boxes,detection_classes,detection_features,detection_multiclass_scores,detection_scores,num_detections,raw_detection_boxes,raw_detection_scores' --saved_model_tags=serve --output_format=tfjs_graph_model ./saved_model ./web_model

The last two arguments are the saved_model directory and the output directory for the TensorFlow.js model.

After it runs, we can see a newly generated web_model directory containing our transfer-trained model (a model.json file plus binary weight shards).

Here, the model training phase is finally over.


Running the model in Vue

Preparation

Create a new Vue project and put our trained model, the web_model directory, into the public directory of the Vue project.


Then add the TensorFlow.js dependency packages to the dependencies section of package.json:

"@tensorflow/tfjs": "^1.7.2",
"@tensorflow/tfjs-core": "^1.7.2",
"@tensorflow/tfjs-node": "^1.7.2",

Then install the dependency packages with npm:
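
That is, from the Vue project root:

npm install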

Load the model

Import the TensorFlow dependency packages in our JS code:

import * as tf from '@tensorflow/tfjs';
import {loadGraphModel} from '@tensorflow/tfjs-converter';

Next, the first step is to load the model.json file:

const MODEL_URL = process.env.BASE_URL+"web_model/model.json";
this.model = await loadGraphModel(MODEL_URL);

We load the trained model via the loadGraphModel method, and then print out the model object:


From the printed object we can see that the model outputs an array of length 4:

  • detection_scores: the confidence of each detection; the higher the score, the more likely the model considers the corresponding region to be a book
  • detection_classes: the label id of each detected region; in this case, book
  • num_detections: the number of target objects detected by the model
  • detection_boxes: the region of each detected object; every detection contributes four numbers [y_min, x_min, y_max, x_max], normalized to 0-1 relative to the image (the buildDetectedObjects method below converts them into a pixel-space [x, y, width, height] box)

Model recognition

Knowing the output values, we can feed an image into the model and get its predictions:

const img = document.getElementById('img');
let modelres = await this.model.executeAsync(tf.browser.fromPixels(img).expandDims(0));

Through the model.executeAsync method, we feed the image into the model and get the model's output.

The result is the length-4 array we mentioned earlier. We then use a custom method to organize the results into the format we want:

buildDetectedObjects: function(scores, threshold, imageWidth, imageHeight, boxes, classes, classesDir) {
    const detectionObjects = [];
    scores.forEach((score, i) => {
        if (score > threshold) {
            // detection_boxes holds normalized [y_min, x_min, y_max, x_max] values,
            // so scale them back up to pixel coordinates
            const minY = boxes[i * 4] * imageHeight;
            const minX = boxes[i * 4 + 1] * imageWidth;
            const maxY = boxes[i * 4 + 2] * imageHeight;
            const maxX = boxes[i * 4 + 3] * imageWidth;
            // Convert to an [x, y, width, height] box for canvas drawing
            const bbox = [minX, minY, maxX - minX, maxY - minY];
            detectionObjects.push({
                class: classes[i],
                label: classesDir[classes[i]].name,
                score: score.toFixed(4),
                bbox: bbox
            });
        }
    });

    return detectionObjects;
}

We call buildDetectedObjects to organize and return the final results.

  • scores: the detection_scores array
  • threshold: only results with score > threshold are pushed into the result array detectionObjects
  • imageWidth: the width of the image
  • imageHeight: the height of the image
  • boxes: the detection_boxes array
  • classes: the detection_classes array
  • classesDir: the model's label map object

An example of calling the buildDetectedObjects method:

let classesDir = {
    1: {
        name: 'book',
        id: 1
    }
};
let res = this.buildDetectedObjects(modelres[0].dataSync(), 0.20, img.width, img.height, modelres[3].dataSync(), modelres[1].dataSync(), classesDir);

We use modelres[0].dataSync() to extract the plain array from each result tensor, pass them into the method, and finally obtain the res result object.


Finally, we use the canvas API to draw the region described by each returned bbox onto the image. For reasons of length, that code is only sketched below; you can see the final effect in the live demo linked at the top of the article.
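
A minimal sketch of that drawing step, assuming a canvas element with id 'canvas' that has been sized to match the image (the element id and styling are made up for illustration):

const canvas = document.getElementById('canvas');
const ctx = canvas.getContext('2d');
// Draw the original image, then outline and label each detected book
ctx.drawImage(img, 0, 0, canvas.width, canvas.height);
res.forEach(item => {
    const [x, y, width, height] = item.bbox;
    ctx.strokeStyle = '#00ff00';
    ctx.lineWidth = 2;
    ctx.strokeRect(x, y, width, height);
    ctx.fillStyle = '#00ff00';
    ctx.fillText(`${item.label} ${item.score}`, x, y > 10 ? y - 5 : 10);
});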

Finally

The model in this case has shortcomings. Because the training time was short, and book covers come in many styles (portraits, landscapes, and so on), the model may occasionally misrecognize a few faces or landscape photos as book covers. When training your own model, you can consider optimizing for this.

The model will also be less accurate on non-book scenes. On one hand, the book samples collected from the web for this case are limited, while cover styles vary widely; on the other hand, this article is only meant to introduce front-end developers to TensorFlow.js and to provide an approach for training your own model, so relatively little time went into collecting samples and training. Interested readers can work out how to improve the samples and extend the training time without overfitting, and thereby improve the model's accuracy on the objects to be recognized.

We wrote this article simply to share TensorFlow.js knowledge with front-end developers and to offer an AI solution for reference.

On World Reading Day, we hope to learn new knowledge and make progress together with the developer community. The GeTui Technical College has specially prepared WeChat Reading cards for you; we hope every developer who loves learning can roam the sea of books and meet a better self!

Event prizes

First prize (1 winner): a Geek Time recharge card

Second prize (3 winners): an e-book VIP annual card

Third prize (5 winners): a copy of the book Deep Learning

How to enter the draw

Scan the Technical College official account QR code, follow the account, and reply "I love reading" to get the entry to the lucky draw.


Draw time: at 16:00 on April 27, 2020, the system will randomly select the lucky winners.

How to claim: please fill in your delivery information in the lottery assistant within 24 hours; we will send the prize within 7 working days.

Note: the right of final interpretation of this event belongs to GeTui.
