FoveaBox: the difference is in the details, another DenseBox + FPN anchor-free scheme | IEEE TIP 2020

Time:2021-7-30

FoveaBox is an anchor-free paper from the same period as FCOS and FSAF, and its overall structure is likewise built on the DenseBox plus FPN strategy. The main differences are that FoveaBox uses only the central area of the target for prediction, that the regression output is a normalized offset value, and that several FPN layers can be selected for training according to the target size.

Source: Xiaofei’s algorithm Engineering Notes official account

Paper: FoveaBox: Beyond Anchor-based Object Detection

Introduction


The paper argues that anchors are not necessarily the best way to search for targets. Inspired by the fovea of the eye, where the center of the visual field has the highest visual acuity, it proposes FoveaBox, an anchor-free object detection method.

FoveaBox jointly predicts, for each valid position, the likelihood that it is a target center and the size of the corresponding target. Its outputs are the category confidence and the size information used to recover the target box. If you have seen many anchor-free detection schemes, the implementation may look very ordinary. It is in fact an early work from the period when anchor-free detectors were booming. The overall idea is very simple, and many researchers arrived at similar ideas at the time. Pay attention to the following details when reading:

  • Only the central area of the target is used for classification and regression prediction
  • The regression output is a normalized offset value
  • During training, a target can be assigned to several FPN layers at the same time
  • A feature alignment module is proposed, which uses the regression output to adjust the input features of the classification branch

FoveaBox


Object Occurrence Possibility

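Given a ground-truth (GT) target box $(x_1, y_1, x_2, y_2)$, it is first mapped onto the feature pyramid layer $P_l$:

$$
\begin{aligned}
x_1' &= \frac{x_1}{s_l}, \quad y_1' = \frac{y_1}{s_l}, \quad x_2' = \frac{x_2}{s_l}, \quad y_2' = \frac{y_2}{s_l}, \\
c_x' &= x_1' + 0.5\,(x_2' - x_1'), \quad c_y' = y_1' + 0.5\,(y_2' - y_1')
\end{aligned}
\tag{1}
$$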

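$s_l$ is the stride of the feature layer relative to the input, $(x_1', y_1', x_2', y_2')$ is the mapped box and $(c_x', c_y')$ is its center. The positive sample area $R^{pos}$ is roughly a shrunk version of the mapped box:

$$
\begin{aligned}
x_1'' &= c_x' - 0.5\,\sigma\,(x_2' - x_1'), \quad y_1'' = c_y' - 0.5\,\sigma\,(y_2' - y_1'), \\
x_2'' &= c_x' + 0.5\,\sigma\,(x_2' - x_1'), \quad y_2'' = c_y' + 0.5\,\sigma\,(y_2' - y_1')
\end{aligned}
\tag{2}
$$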

$\sigma$ is a manually set shrink factor. In the training stage, the feature points inside the positive sample area are labeled with the corresponding target category, and all other areas are negative sample areas. The output of each feature pyramid layer is $C \times H \times W$, where $C$ is the total number of categories.
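
The mapping and shrinking steps can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function names are hypothetical, `sigma=0.5` is an arbitrary example value, and treating a cell as positive when its center `(x + 0.5, y + 0.5)` lies inside the shrunk box is an assumed discretization rule.

```python
def positive_region(box, stride, sigma):
    """Shrunk positive region of a GT box on one pyramid level (feature-map coordinates)."""
    # Map the ground-truth box from image coordinates onto the feature map.
    x1, y1, x2, y2 = (v / stride for v in box)
    # Center of the mapped box.
    cx = x1 + 0.5 * (x2 - x1)
    cy = y1 + 0.5 * (y2 - y1)
    # Shrink the mapped box around its center by the factor sigma.
    half_w = 0.5 * sigma * (x2 - x1)
    half_h = 0.5 * sigma * (y2 - y1)
    return (cx - half_w, cy - half_h, cx + half_w, cy + half_h)


def positive_cells(box, stride, sigma, height, width):
    """Feature-map cells (x, y) labelled positive for this box."""
    rx1, ry1, rx2, ry2 = positive_region(box, stride, sigma)
    return [
        (x, y)
        for y in range(height)
        for x in range(width)
        # Assumption: a cell is positive when its center lies inside the region.
        if rx1 <= x + 0.5 <= rx2 and ry1 <= y + 0.5 <= ry2
    ]


region = positive_region((64, 64, 192, 128), stride=8, sigma=0.5)
cells = positive_cells((64, 64, 192, 128), stride=8, sigma=0.5, height=32, width=32)
print(region)      # (12.0, 10.0, 20.0, 14.0)
print(len(cells))  # 32
```

With stride 8, the 128 x 64 box maps to a 16 x 8 box on the feature map, and only the central 8 x 4 cells are labelled positive. Every other cell on that level counts as negative for this target.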

Scale Assignment

The network has to predict the target boundaries, but regressing them directly is unstable because target sizes span a very wide range. The size range is therefore split into several intervals, one per feature pyramid layer, and each layer is responsible for predicting targets within its own interval. Give each feature pyramid layer from $P_3$ to $P_7$ a basic size $r_l = 2^{l+2}$; the target size range of layer $l$ is then:

$$
\left[\, r_l / \eta,\; r_l \cdot \eta \,\right]
\tag{3}
$$

$\eta$ is a manually set parameter that controls the regression size range of each feature pyramid layer. Training targets outside a layer's size range are ignored on that layer. A target may fall within the size ranges of several layers, in which case it is used to train all of them. This multi-layer training has the following advantages:

  • Adjacent feature pyramid layers usually have similar semantic information, which can be optimized at the same time.
  • The number of training samples per layer is greatly increased to make the training process more stable.
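
As a rough illustration of the scale assignment, the sketch below lists the pyramid levels whose range $[r_l / \eta, r_l \cdot \eta]$ contains a given box. The function name is hypothetical, `eta=2` is only an example value, and measuring the target size as the square root of the box area with inclusive bounds is an assumption of this sketch.

```python
import math


def assigned_levels(box, eta, levels=range(3, 8)):
    """Pyramid levels whose size range [r_l / eta, r_l * eta] contains the box."""
    x1, y1, x2, y2 = box
    # Assumption: the target size is the square root of the box area.
    size = math.sqrt((x2 - x1) * (y2 - y1))
    selected = []
    for level in levels:
        r_l = 2 ** (level + 2)  # basic size of this level: 32, 64, ..., 512
        if r_l / eta <= size <= r_l * eta:
            selected.append(level)
    return selected


print(assigned_levels((0, 0, 100, 100), eta=2))  # [4, 5]
print(assigned_levels((0, 0, 10, 10), eta=2))    # []
```

With `eta=2` the ranges of neighbouring levels overlap, so the 100 x 100 box trains both $P_4$ and $P_5$, while the 10 x 10 box falls below every range and is ignored.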

Box Prediction

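When predicting the target size, FoveaBox directly regresses the normalized offsets from a position $(x, y)$ in the positive sample area to the four target boundaries:

$$
\begin{aligned}
t_{x_1} &= \log \frac{s_l (x + 0.5) - x_1}{r_l}, \quad t_{y_1} = \log \frac{s_l (y + 0.5) - y_1}{r_l}, \\
t_{x_2} &= \log \frac{x_2 - s_l (x + 0.5)}{r_l}, \quad t_{y_2} = \log \frac{y_2 - s_l (y + 0.5)}{r_l}
\end{aligned}
\tag{4}
$$

Formula 4 first maps the position on the feature pyramid layer back to the input image, then computes the offsets to the box boundaries, normalizes them by the layer's basic size $r_l$, and takes the logarithm. The regression branch is trained with the Smooth L1 loss.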
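
The encoding and its inverse can be sketched as follows, assuming the form of Formula 4 in the paper: the cell $(x, y)$ is projected to $s_l(x + 0.5)$ in the image, each distance to a box side is divided by the basic size $r_l$, and the logarithm is taken. The function names `encode` and `decode` are hypothetical, and this is not the authors' code.

```python
import math


def encode(x, y, box, stride, r_l):
    """Regression targets for feature cell (x, y); the cell must project strictly inside the box."""
    x1, y1, x2, y2 = box
    # Project the cell back onto the input image.
    px = stride * (x + 0.5)
    py = stride * (y + 0.5)
    # Distances to the four sides, normalized by r_l, in log space.
    return (
        math.log((px - x1) / r_l),
        math.log((py - y1) / r_l),
        math.log((x2 - px) / r_l),
        math.log((y2 - py) / r_l),
    )


def decode(x, y, targets, stride, r_l):
    """Invert encode(): recover the box from the predicted offsets."""
    t_x1, t_y1, t_x2, t_y2 = targets
    px = stride * (x + 0.5)
    py = stride * (y + 0.5)
    return (
        px - r_l * math.exp(t_x1),
        py - r_l * math.exp(t_y1),
        px + r_l * math.exp(t_x2),
        py + r_l * math.exp(t_y2),
    )


box = (64, 64, 192, 128)
targets = encode(15, 11, box, stride=8, r_l=32)
recovered = decode(15, 11, targets, stride=8, r_l=32)
print(targets)
print(recovered)
```

Decoding the targets of a cell reproduces the original box up to floating-point error, which is what the detector does at inference time with the predicted offsets.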

Network Architecture

The network structure is shown in Figure 4. The backbone takes the form of a feature pyramid, and each pyramid layer is connected to a prediction head consisting of a classification branch and a regression branch. The paper uses a fairly simple head structure; a more complex head could yield better performance.

Feature Alignment

The paper proposes a feature alignment trick, which mainly modifies the prediction head so that the regression output adjusts the features fed to the classification branch. The structure is shown in Figure 7.

Experiment


Comparison with SOTA methods.

Conclusion


FoveaBox is an anchor-free paper from the same period as FCOS and FSAF, and its overall structure is likewise built on the DenseBox plus FPN strategy. The main differences are that FoveaBox uses only the central area of the target for prediction, that the regression output is a normalized offset value, and that several FPN layers are selected for training according to the target size. Because the overall scheme is so simple and so similar to other anchor-free methods, the paper was only accepted after a long submission process, which cannot have been easy for the authors.


