This article first appeared on my personal blog: http://zuyunfei.com/2020/07/0… Welcome to read the latest posts there!
Object detection is more complex than a typical classification problem: the algorithm must not only detect a target and output its category, but also locate it in the image. The simple accuracy metric used in classification cannot reflect the quality of object detection results, so mAP (mean average precision) is the common metric used to compare object detection algorithms.
To understand what mAP is, we first need to clarify what precision and recall are.
Precision and recall
Precision and recall are performance metrics that often appear in information retrieval, web search, and similar applications. In machine learning they measure "what percentage of the predictions are actually of interest to users". For a binary classification problem, samples can be divided into four categories according to the combination of their true class and predicted class:
- TP (true positive): the predicted output is positive and the prediction is correct ("positive" refers to the predicted class, "true" to whether the prediction is right).
- TN (true negative): the predicted output is negative and the prediction is correct.
- FP (false positive): the predicted output is positive but the prediction is wrong (the sample is actually negative).
- FN (false negative): the predicted output is negative but the prediction is wrong (the sample is actually positive).
Precision: among all samples predicted positive, the proportion that are actually positive, i.e. TP / (TP + FP).
Recall: among all actually positive samples, the proportion correctly predicted, i.e. TP / (TP + FN).
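These two definitions translate directly into code. A minimal sketch in Python (the counts below are made up for illustration):

```python
def precision(tp, fp):
    # Of everything predicted positive, how much was actually positive?
    return tp / (tp + fp)

def recall(tp, fn):
    # Of everything actually positive, how much did we find?
    return tp / (tp + fn)

# Hypothetical counts: 8 true positives, 2 false positives, 4 false negatives.
print(precision(8, 2))           # 0.8
print(round(recall(8, 4), 3))    # 0.667
```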
How to calculate
In object detection, the algorithm usually outputs a bounding box identifying the location of each detected target. To measure how well the predicted box matches the actual location of the target, the IoU metric is used.
IoU (intersection over union)
The intersection over union (IoU) measures the degree of overlap between two regions: it is the area of their intersection divided by the area of their union (the overlapping area is counted only once).
In object detection, IoU is the intersection of the predicted box and the ground-truth box, divided by their union.
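A minimal Python sketch of IoU for axis-aligned boxes; the `(x1, y1, x2, y2)` corner convention is an assumption for illustration:

```python
def iou(box_a, box_b):
    # Boxes are (x1, y1, x2, y2) with x1 < x2 and y1 < y2.
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    # Clamp at zero: non-overlapping boxes have no intersection area.
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter  # overlap counted only once
    return inter / union

# Two 2x2 boxes overlapping in a 1x1 square: IoU = 1 / (4 + 4 - 1) = 1/7.
print(round(iou((0, 0, 2, 2), (1, 1, 3, 3)), 4))  # 0.1429
```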
We can set a threshold, usually 0.5, and classify the prediction results as follows:
If IoU ≥ 0.5:
- If the predicted category is also correct, it is considered a good prediction and counted as a TP.
- If the predicted category is wrong, it is considered a bad prediction and counted as an FP.
- If IoU < 0.5, it is considered a bad prediction and counted as an FP.
- If a target appears in the image but is not detected by the algorithm, it is counted as an FN.
- TN (regions of the image containing neither a ground-truth box nor a detection box) is usually not used in the calculation.
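The rules above can be sketched as a greedy matching routine. This is an illustrative sketch, not the exact matching protocol of any particular benchmark; the data layout (lists of `(box, label)` pairs) is an assumption:

```python
def classify_detections(preds, gts, iou_thresh=0.5):
    """Count TP/FP/FN for one image per the rules above.

    preds: list of ((x1, y1, x2, y2), label) predicted boxes.
    gts:   list of ((x1, y1, x2, y2), label) ground-truth boxes.
    """
    def iou(a, b):
        ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
        ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
        inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
        union = ((a[2] - a[0]) * (a[3] - a[1])
                 + (b[2] - b[0]) * (b[3] - b[1]) - inter)
        return inter / union if union else 0.0

    tp = fp = 0
    matched = set()  # each ground-truth box may be matched at most once
    for p_box, p_label in preds:
        best_iou, best_j = 0.0, None
        for j, (g_box, g_label) in enumerate(gts):
            if j in matched:
                continue
            overlap = iou(p_box, g_box)
            if overlap > best_iou:
                best_iou, best_j = overlap, j
        if best_j is not None and best_iou >= iou_thresh and gts[best_j][1] == p_label:
            tp += 1               # good box and correct class -> TP
            matched.add(best_j)
        else:
            fp += 1               # low IoU or wrong class -> FP
    fn = len(gts) - len(matched)  # undetected ground-truth boxes -> FN
    return tp, fp, fn
```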
AP and mAP
Recall and precision are usually a pair of conflicting measures. Generally speaking, when precision is high, recall tends to be low; when recall is high, precision tends to be low.
If we sort all the prediction results so that the samples "most likely" to be positive come first, we can draw a "P-R curve" with precision on the vertical axis and recall on the horizontal axis.
The P-R curve intuitively shows an algorithm's recall and precision over the whole sample set. If one algorithm's P-R curve is completely enclosed by another algorithm's curve, it can be asserted that the latter is better than the former. In practice, however, the P-R curves of different algorithms often cross each other, making it difficult to judge which is better by eye. In this case, summary measures such as the break-even point (BEP), the F1 score, and AP are usually used.
AP (average precision): AP is the average precision of the model for a single class. In object detection, precision and recall can be computed for each class, so each class yields its own P-R curve; the area under that curve is the AP value. If an algorithm's AP is large, i.e. the area under its P-R curve is relatively large, its precision and recall can be considered relatively high overall.
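Computing AP as the area under the P-R curve can be sketched as follows, assuming the detections for one class are already sorted by descending confidence. The monotonic-precision step mirrors the common "all-point" interpolation scheme; benchmarks differ in the exact details:

```python
def average_precision(is_tp, num_gt):
    # is_tp: for each detection (sorted by confidence), True if it is a TP.
    # num_gt: total number of ground-truth boxes for this class.
    tp = fp = 0
    precisions, recalls = [], []
    for hit in is_tp:
        tp += hit
        fp += (not hit)
        precisions.append(tp / (tp + fp))
        recalls.append(tp / num_gt)
    # Make precision monotonically non-increasing, scanning right to left,
    # so the curve becomes a staircase whose area is easy to sum.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    # Sum rectangles under the staircase.
    ap, prev_r = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += p * (r - prev_r)
        prev_r = r
    return ap
```

For example, with detections [TP, FP, TP] and 2 ground-truth boxes, the curve passes through (0.5, 1.0) and (1.0, 2/3), giving AP = 0.5 + 1/3 ≈ 0.833.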
mAP (mean average precision): the average of the AP values over all categories.
Pascal VOC (VOC2007 & VOC2012) is a common dataset for evaluating object detection algorithms. The VOC benchmark uses a fixed IoU threshold of 0.5 to compute AP, so mAP on VOC is usually written as mAP@IoU=0.5 or mAP_50. After 2014, the MS-COCO (Microsoft Common Objects in Context) dataset gradually took over. COCO pays more attention to the accuracy of the predicted box position: its AP value is the average of the AP over multiple IoU thresholds. Specifically, 10 IoU thresholds (0.5, 0.55, 0.6, ..., 0.9, 0.95) are taken between 0.5 and 0.95, and the result is written as mAP@IoU=0.5:0.05:0.95 or mAP@IoU=0.5:0.95.
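Given per-class AP values, mAP is simply their mean; the COCO-style threshold list can be generated the same way. The class names and AP numbers below are hypothetical:

```python
# Hypothetical per-class AP values; mAP is their mean.
ap_per_class = {"cat": 0.72, "dog": 0.65, "bird": 0.58}
map_value = sum(ap_per_class.values()) / len(ap_per_class)
print(round(map_value, 2))  # 0.65

# COCO-style IoU thresholds: 0.50, 0.55, ..., 0.95 (ten values).
thresholds = [0.5 + 0.05 * i for i in range(10)]
print(len(thresholds))  # 10
```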