Building Deep Learning Applications with Raspberry Pi 4B (9): YOLO

Time:2020-10-30

Preface

In the last article, we installed the OpenVINO environment on the Raspberry Pi and ran several official demos. Model conversion is the key step in that workflow, so in this article we work through it using several versions of YOLO as examples.

Object detection is a mature field of AI application. It must not only identify the objects in an image but also locate their positions, which makes it a basic building block for scenarios such as autonomous driving. Detectors generally fall into two categories. Two-stage detectors, represented by R-CNN, Fast R-CNN, and Faster R-CNN, first generate region proposals and then run classification to obtain the category and regression to obtain the position. One-stage detectors, such as YOLO and SSD, use a CNN to predict categories and positions directly.

By comparison, the R-CNN family is more accurate but slower, while the YOLO family is faster but less accurate. In many CV applications, as long as classification accuracy is ensured, detection speed matters more than localization accuracy. One-stage models therefore have a natural advantage in deployment: they greatly reduce the computational load on edge devices with limited computing power.

YOLO is the representative speed-optimized object detector, and it can be accelerated further after conversion to OpenVINO. This article mainly introduces how to convert models from TensorFlow and PyTorch to OpenVINO.

First, the TensorFlow conversion process: the .weights file is converted into a frozen graph (.pb file), the .pb file is converted into an IR model (.bin and .xml), and the IR model is finally deployed to the Neural Compute Stick to run. Let's run a YOLOv4-tiny application to try it out.

YOLOv4 application

1. Download the source code

git clone https://github.com/TNTWEN/OpenVINO-YOLOV4.git
cd OpenVINO-YOLOV4

2. Convert the weights to a .pb file

Download yolov4.weights and yolov4-tiny.weights and put them in this directory.

python convert_weights_pb.py --class_names cfg/coco.names --weights_file yolov4-tiny.weights --data_format NHWC --tiny

If frozen_darknet_yolov4_model.pb appears in the directory, the conversion was successful.

Tip:

The data format must be specified as NHWC to match the corresponding format on the Intel NCS2.
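To make the layout difference concrete, here is a minimal pure-Python sketch of where one element lands in a flat buffer under each layout. The function names are illustrative and are not part of the OpenVINO-YOLOV4 repository; 416 is the YOLOv4 input size mentioned later in this article.

```python
def nchw_offset(n, c, h, w, C, H, W):
    """Flat index of element (n, c, h, w) when channels come before rows and columns."""
    return ((n * C + c) * H + h) * W + w


def nhwc_offset(n, h, w, c, C, H, W):
    """Flat index of the same element when channels are stored last."""
    return ((n * H + h) * W + w) * C + c


# A 416x416 three-channel image, batch size 1.
C, H, W = 3, 416, 416

# In NHWC the channel values of one pixel sit next to each other in memory...
nhwc_step = nhwc_offset(0, 0, 0, 1, C, H, W) - nhwc_offset(0, 0, 0, 0, C, H, W)
# ...while in NCHW they are a whole image plane (H * W elements) apart.
nchw_step = nchw_offset(0, 1, 0, 0, C, H, W) - nchw_offset(0, 0, 0, 0, C, H, W)

print(nhwc_step, nchw_step)
```

The same tensor values are present in both cases; only the order in memory changes, which is why the exporter and the target device have to agree on it.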

Tip:

If there is an error message about the “cloud” import, it is because the --nogcp flag was turned on when compiling TF, so tensorflow/contrib/cloud was not added to the pip installation package. You can fix this bug by commenting out two lines of code in the __init__.py file below. Also note that, with the data format specified as NHWC, --reverse_input_channels is required in the conversion step to flip the input channels.

/home/pi/my_envs/tensorflow/lib/python3.7/site-packages/tensorflow/contrib/__init__.py

3. Initialize the OpenVINO environment

To do the conversion on a Windows or Linux host, you need to install the OpenVINO development kit.

"C:\Program Files (x86)\IntelSWTools\openvino\bin\setupvars.bat"

4. Convert the .pb file to the IR model

Switch to the OpenVINO-YOLOV4 directory and convert the .pb file into .xml and .bin files.

python "C:\Program Files (x86)\IntelSWTools\openvino_2020.4.287\deployment_tools\model_optimizer\mo.py" --input_model frozen_darknet_yolov4_model.pb --transformations_config yolo_v4_tiny.json --batch 1 --reverse_input_channels
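The --reverse_input_channels flag in the command above tells the Model Optimizer to swap the channel order the model expects, for example so that BGR frames as delivered by OpenCV can be fed to a model trained on RGB. What that swap means for the pixel data can be sketched as follows; the helper name and the tiny one-row image are made up for illustration.

```python
def reverse_channels(image):
    """Return a copy of an H x W x C nested-list image with the channel order flipped."""
    return [[pixel[::-1] for pixel in row] for row in image]


# One row, two pixels, stored as (B, G, R).
bgr_image = [[(255, 0, 0), (0, 128, 64)]]
rgb_image = reverse_channels(bgr_image)

print(rgb_image)
```

With the flag set at conversion time, this reordering is folded into the IR model, so the inference script does not have to do it per frame.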

Tip:

Before converting to an IR model, check that every operator (op) in the network is supported and which data precisions the target platform supports. The details can be found on the following page. Many model conversion failures are caused by unsupported operators.

https://docs.openvinotoolkit.org/latest/openvino_docs_IE_DG_supported_plugins_Supported_Devices.html
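As a rough illustration of why data precision matters: the MYRIAD device is documented as an FP16 target, so FP32 weights are rounded when the model runs on the stick (check the page above for the exact support matrix of your release). The sketch below uses only the standard library struct module to round-trip a value through half precision; the helper name is illustrative.

```python
import struct


def to_fp16(value):
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", value))[0]


weight = 0.1
rounded = to_fp16(weight)

# The difference is small, but it is not zero.
print(weight, rounded, abs(weight - rounded))
```

Values that are exactly representable, such as 0.5, survive unchanged; most weights pick up a small rounding error, which is one reason confidences can differ slightly between devices.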

5. Run the model

Running the model on the Raspberry Pi, the frame rate was stable at about 6-7 FPS.

source ~/my_envs/tensorflow/bin/activate
source /opt/intel/openvino/bin/setupvars.sh
python object_detection_demo_yolov4_async.py -i cam -m frozen_darknet_yolov4_model.xml -d MYRIAD

YOLOv5 application

YOLOv5 mainly introduces mosaic data augmentation and adaptive anchor boxes. Structurally, it differs little from YOLOv4. However, the open-source version of v5 is implemented in PyTorch, which makes it easier to convert and deploy to various platforms than Darknet.

Workflow:

If a model from the other mainstream framework, PyTorch, is to be optimized with OpenVINO, the conversion process is as follows: first the PyTorch model file (.pt) is converted to ONNX format, then the ONNX file is converted into an IR model, which can be deployed on the Neural Compute Stick.

Let’s run through the process with the latest version of yolov5.

1. Install the OpenVINO development tools on Ubuntu

Download the 2020.4 Linux version of the installation package.

cd ~/Downloads/
wget http://registrationcenter-download.intel.com/akdlm/irc_nas/16803/l_openvino_toolkit_p_2020.4.287.tgz
tar -xvzf l_openvino_toolkit_p_2020.4.287.tgz

2. Install dependencies

pip3 install defusedxml
pip3 install networkx
pip3 install test-generator==0.1.1
# Here we only need the prerequisites for ONNX conversion
cd l_openvino_toolkit_p_2020.4.287
sudo ./install_prerequisites_onnx.sh

3. Set up a virtual environment

Requirements: Python >= 3.8, PyTorch == 1.5.1, ONNX >= 1.7.

conda activate openvino  # enter the virtual environment on Ubuntu
git clone https://github.com/ultralytics/yolov5.git
cd yolov5
pip3 install -r requirements.txt onnx
# Downgrade to the required versions
pip install torch==1.5.1 torchvision==0.6.1

4. Export the YOLOv5 model trained in PyTorch

Download yolov5s.pt (or a YOLOv5 model trained on your own dataset) and put it in the directory.

wget https://github.com/ultralytics/yolov5/releases/download/v2.0/yolov5s.pt
mv yolov5s.pt yolov5s_2.0.pt

Tip:

Download v2.0, which uses nn.LeakyReLU(0.1), because the nn.Hardswish used in v3.0 is not supported yet.

5. Modify the activation function

Since ONNX and OpenVINO do not support Hardswish yet, the Hardswish activation function should be changed to ReLU or Leaky ReLU.

# yolov5/models/common.py
# Line 26 in 5e0b90d
# self.act = nn.Hardswish() if act else nn.Identity()
self.act = nn.ReLU() if act else nn.Identity()
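For reference, the three activations mentioned in this step can be written out in plain Python. This is only a scalar sketch of the standard definitions (Hardswish as x * ReLU6(x + 3) / 6, Leaky ReLU with the 0.1 slope used by the v2.0 weights); the function names are illustrative and this is not the code PyTorch runs internally.

```python
def relu(x):
    """max(0, x)"""
    return max(0.0, x)


def leaky_relu(x, negative_slope=0.1):
    """Pass positives through, scale negatives by a small slope."""
    return x if x >= 0 else negative_slope * x


def hardswish(x):
    """x * ReLU6(x + 3) / 6, a piecewise approximation of swish."""
    return x * min(max(x + 3.0, 0.0), 6.0) / 6.0


for x in (-4.0, -1.0, 0.0, 1.0, 4.0):
    print(x, relu(x), leaky_relu(x), hardswish(x))
```

The functions agree for large positive inputs but differ around zero and for negative inputs, so the activation in the model definition should match the one the weights were trained with as closely as possible.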

6. Modify yolo.py

# yolov5/models/yolo.py
# Lines 49 to 53 in 5e0b90d
#    y[..., 0:2] = (y[..., 0:2] * 2. - 0.5 + self.grid[i].to(x[i].device)) * self.stride[i]  # xy 
#    y[..., 2:4] = (y[..., 2:4] * 2) ** 2 * self.anchor_grid[i]  # wh 
#    z.append(y.view(bs, -1, self.no)) 
#  
# return x if self.training else (torch.cat(z, 1), x) 

Modify how the output layers are stacked, and no longer return the input x alongside the predictions:

    c=(y[..., 0:2] * 2. - 0.5 + self.grid[i].to(x[i].device)) * self.stride[i]  # xy
    d=(y[..., 2:4] * 2) ** 2 * self.anchor_grid[i]  # wh
    e=y[..., 4:]
    f=torch.cat((c,d,e),4)
    z.append(f.view(bs, -1, self.no))

  return x if self.training else torch.cat(z, 1)
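The xy and wh expressions in the modified code can be checked by hand on a single prediction. The sketch below is a scalar version of the same arithmetic, assuming y already holds sigmoid outputs in the range 0 to 1; the function name and the grid cell, stride, and anchor values are made up for illustration.

```python
def decode_box(y, grid_xy, stride, anchor_wh):
    """Decode one prediction; y is (x, y, w, h) after the sigmoid."""
    cx = (y[0] * 2.0 - 0.5 + grid_xy[0]) * stride   # same as the "xy" line
    cy = (y[1] * 2.0 - 0.5 + grid_xy[1]) * stride
    w = (y[2] * 2.0) ** 2 * anchor_wh[0]            # same as the "wh" line
    h = (y[3] * 2.0) ** 2 * anchor_wh[1]
    return cx, cy, w, h


# Grid cell (3, 4) on a stride-8 feature map with a 10x13 anchor.
box = decode_box((0.5, 0.5, 0.5, 0.5), grid_xy=(3, 4), stride=8, anchor_wh=(10, 13))
print(box)
```

An output of 0.5 places the centre in the middle of its grid cell and leaves the anchor size unchanged, which is a convenient sanity check when comparing the IR model against the PyTorch model.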

7. Modify export.py

# yolov5/models/export.py
# Line 31 in 5e0b90d
# model.model[-1].export = True  # set Detect() layer export=True 
model.model[-1].export = False

Because opset version 10 supports the Resize operator, the opset version number should be changed to 10.

# yolov5/models/export.py
# Lines 51 to 52 in 5e0b90d
# torch.onnx.export(model, img, f, verbose=False, opset_version=12, input_names=['images'], 
torch.onnx.export(model, img, f, verbose=False, opset_version=10, input_names=['images'], 
                   output_names=['classes', 'boxes'] if y is None else ['output'])

Tip:

Make sure that torch==1.5.1, torchvision==0.6.1, onnx==1.7, and opset=10. The activation function must be ReLU, and the network inference layer must have been modified as described above.

8. Convert .pt to ONNX

export PYTHONPATH="$PWD"  
python models/export.py --weights yolov5s_2.0.pt --img 640 --batch 1  

The output shows that the model was exported to ONNX and TorchScript files.

ONNX export success, saved as ./yolov5s.onnx
Export complete. Visualize with https://github.com/lutzroeder/netron.

9. Convert ONNX to the IR model

python3 /opt/intel/openvino_2020.4.287/deployment_tools/model_optimizer/mo.py \
    --input_model yolov5s_2.0.onnx \
    --output_dir ./out \
    --input_shape [1,3,640,640]

If it goes well, the IR model of YOLOv5s is generated in the out directory. Then transfer the files to the Raspberry Pi.
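If you convert several models, it can be convenient to assemble the Model Optimizer call from step 9 in a script. The sketch below only builds the argument list from the same flags used above; build_mo_command is a hypothetical helper, not part of OpenVINO, and nothing is executed.

```python
def build_mo_command(mo_path, onnx_path, output_dir, input_shape):
    """Build the mo.py argument list for an ONNX model (does not run it)."""
    shape_arg = "[" + ",".join(str(dim) for dim in input_shape) + "]"
    return [
        "python3", mo_path,
        "--input_model", onnx_path,
        "--output_dir", output_dir,
        "--input_shape", shape_arg,
    ]


cmd = build_mo_command(
    "/opt/intel/openvino_2020.4.287/deployment_tools/model_optimizer/mo.py",
    "yolov5s_2.0.onnx",
    "./out",
    (1, 3, 640, 640),
)
print(" ".join(cmd))
```

Passing the list to subprocess.run(cmd, check=True) would run the conversion; keeping the arguments as a list avoids shell quoting problems with the bracketed shape.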

Tip:

Here, the input shape must match YOLOv5s, which is [1, 3, 640, 640].

10. Modify the parameters to match the trained model

git clone https://github.com/linhaoqi027/yolov5_openvino_sdk.git

Modify the inference device and input shape

# device = 'CPU'
# input_h, input_w, input_c, input_n = (480, 480, 3, 1)
device = 'MYRIAD'
input_h, input_w, input_c, input_n = (640, 640, 3, 1)

Modify the category information

# label_id_map = {
#     0: "fire",
# }
names=['person', 'bicycle', 'car', 'motorcycle', 'airplane', 'bus', 'train', 'truck', 'boat', 'traffic light',
       'fire hydrant', 'stop sign', 'parking meter', 'bench', 'bird', 'cat', 'dog', 'horse', 'sheep', 'cow',
       'elephant', 'bear', 'zebra', 'giraffe', 'backpack', 'umbrella', 'handbag', 'tie', 'suitcase', 'frisbee',
       'skis', 'snowboard', 'sports ball', 'kite', 'baseball bat', 'baseball glove', 'skateboard', 'surfboard',
       'tennis racket', 'bottle', 'wine glass', 'cup', 'fork', 'knife', 'spoon', 'bowl', 'banana', 'apple',
       'sandwich', 'orange', 'broccoli', 'carrot', 'hot dog', 'pizza', 'donut', 'cake', 'chair', 'couch',
       'potted plant', 'bed', 'dining table', 'toilet', 'tv', 'laptop', 'mouse', 'remote', 'keyboard', 'cell phone',
       'microwave', 'oven', 'toaster', 'sink', 'refrigerator', 'book', 'clock', 'vase', 'scissors', 'teddy bear',
       'hair drier', 'toothbrush']

label_id_map = {index: item for index, item in enumerate(names)}

Modify the output to support multiple categories

for idx, proposal in enumerate(data):
    if proposal[4] > 0:
        print(proposal)
        confidence = proposal[4]
        # Scale box corners from the 640x640 network input back to the image size
        xmin = int(iw * (proposal[0] / 640))
        ymin = int(ih * (proposal[1] / 640))
        xmax = int(iw * (proposal[2] / 640))
        ymax = int(ih * (proposal[3] / 640))
        idx = int(proposal[5])
        #if label not in label_id_map:
        #    log.warning(f'{label} does not in {label_id_map}')
        #    continue
        detect_objs.append({
            'name': label_id_map[idx],
            'xmin': int(xmin),
            'ymin': int(ymin),
            'xmax': int(xmax),
            'ymax': int(ymax),
            'confidence': float(confidence)
        })

11. Inference output

if __name__ == '__main__':
    # Test API
    img = cv2.imread('../inference/images/bus.jpg')
    predictor = init()
    import time
    t = time.time()
    n = 10
    for i in range(n):
        result = process_image(predictor, img)

    print("average inference time", (time.time() - t) / n)
    print("FPS", 1 / ((time.time() - t) / n))
    # log.info(result)
    for obj in json.loads(result)['objects']:
        print(obj)

python inference.py

YOLOv5s reaches only about 2 FPS, which is slow compared with YOLOv4-tiny. The input size of 640 is much larger than the 416 of YOLOv4. Most of the time is spent on inference on the Neural Compute Stick, which takes a considerable 377 ms.
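The relationship between the measured inference time and the frame rate is simple arithmetic, sketched below. The function name is illustrative; 377 ms is the figure reported above.

```python
def max_fps(latency_ms):
    """Upper bound on frames per second if each frame needs latency_ms of work."""
    return 1000.0 / latency_ms


inference_ms = 377  # inference time on the Neural Compute Stick
print(round(max_fps(inference_ms), 2))
```

Inference alone caps the rate at fewer than 3 frames per second; pre-processing and post-processing on the Raspberry Pi account for the rest of the gap down to about 2 FPS.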

In addition, the confidence scores of the converted IR model on both CPU and MYRIAD are very high. It seems that YOLOv5 still has plenty of room for optimization.

To pursue the highest possible detection speed, there are several directions to try:

  • Distribute inference across multiple Neural Compute Sticks;
  • Use multi-threaded inference;
  • Refresh the screen asynchronously;
  • Choose a smaller model;
  • Rewrite the inference code from Python to C++.
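The multi-threaded direction can be sketched with the standard library alone. In the example below, fake_infer is a stand-in that sleeps instead of calling a real device, so it only illustrates how overlapping blocking calls shortens the total time; how well this works with a real inference runtime depends on that runtime.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_infer(frame_id):
    """Stand-in for a blocking inference call; sleeps instead of using a device."""
    time.sleep(0.05)
    return frame_id, "detections for frame %d" % frame_id


frames = list(range(8))

start = time.time()
with ThreadPoolExecutor(max_workers=4) as pool:
    # map() returns results in submission order even though the calls overlap.
    results = list(pool.map(fake_infer, frames))
elapsed = time.time() - start

print(len(results), "frames in", round(elapsed, 2), "s")
```

Processed one after another, the eight calls would take about 0.4 s; with four workers they overlap. Keeping results in frame order matters when they are drawn back onto a video stream.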

In particular, a Yolo-fast version has recently been open-sourced. Its backbone is EfficientNet-Lite, which brings the trained model weights down to only 1.2 MB, and Yolo-fast XL is only 3.3 MB, which is very compact.

This MobileNet-SSD has run at over 40 FPS on a Raspberry Pi 4 with 5 NCS2 sticks, amazing!

This article mainly introduces how to convert TensorFlow and PyTorch models to OpenVINO models.

  • .weights -> .pb -> .xml .bin
  • .pt -> onnx -> .xml .bin

In fact, there are more AI frameworks, such as TensorRT and TensorFlow Lite, and models can be converted between them through intermediate formats such as .pb, ONNX, and IR.

Generally, as long as we understand the structure of the input and output layers, know which files hold each model's weights and network structure, and pay attention to operator support, we can use the pipeline provided by the framework to do the conversion.

The last resort is to rewrite the network topology in the target framework, import the weights with a method similar to load_weight, and then save the model in the target framework.
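The idea behind such a load_weight style import is to copy each tensor from the source model into the matching layer of the rewritten network. A minimal sketch, using plain dictionaries of lists in place of real tensors, might look like this; the function, the layer names, and the name mapping are all hypothetical.

```python
def load_weights(target, source, name_map):
    """Copy weights from source into target following name_map; return skipped names."""
    skipped = []
    for src_name, dst_name in name_map.items():
        same_size = (
            src_name in source
            and dst_name in target
            and len(source[src_name]) == len(target[dst_name])
        )
        if same_size:
            target[dst_name] = list(source[src_name])
        else:
            # Missing layer or shape mismatch: report it instead of guessing.
            skipped.append(src_name)
    return skipped


source = {"conv1.weight": [0.1, 0.2, 0.3], "conv1.bias": [0.5]}
target = {"layer0/kernel": [0.0, 0.0, 0.0], "layer0/bias": [0.0]}
name_map = {
    "conv1.weight": "layer0/kernel",
    "conv1.bias": "layer0/bias",
    "bn1.weight": "layer0/gamma",
}

skipped = load_weights(target, source, name_map)
print(target, skipped)
```

In a real conversion the tensors usually also need their axes reordered between frameworks, so the list of skipped names is only the first thing to check.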

Code download

To get the download link for the related files and materials, reply "rpi09" in the backend of the official account.


Coming up next

Next, we will introduce proxy software for the Raspberry Pi, deploy an OpenWrt soft router, and create a once-and-for-all network environment. Coming soon…
