A Beginner's Guide to Focal Loss in Object Detection

Date: 2021-07-21

By: Guest Blog
Compiled by: Flin
Source: analyticsvidhya

Introduction

Object detection is one of the most widely studied topics in the computer vision community. It has made its way into various industries, with use cases ranging from image security, surveillance, and automated vehicle systems to machine inspection.

At present, deep-learning-based object detectors can be broadly divided into two categories:

  1. Two-stage detectors, such as Region-based CNN (R-CNN) and its successors.

  2. One-stage detectors, such as the YOLO family of detectors and SSD.

One-stage detectors applied over a regular, dense sampling of anchor boxes (possible object locations) can be faster and simpler, but their accuracy has lagged behind that of two-stage detectors, largely because of the extreme class imbalance encountered during training.

FAIR (Facebook AI Research) published a paper in 2018 in which they introduced the concept of focal loss to deal with this imbalance, along with a one-stage detector called RetinaNet that uses it.

Before we get into the essence of focal loss, let's first understand what this imbalance problem is and what issues it can cause.

Table of Contents

  1. Why focal loss?

  2. What is focal loss?

  3. Cross-entropy loss

    1. The problem with cross-entropy
    2. Example
  4. Balanced cross-entropy loss

    1. The problem with balanced cross-entropy
    2. Example
  5. Focal loss explained

    1. Example
  6. Cross-entropy loss vs focal loss

    1. Easily correctly classified records
    2. Misclassified records
    3. Very easily classified records
  7. Final thoughts

Why focal loss?

Classic one-stage detection methods, such as boosted detectors and DPM, as well as more recent methods such as SSD, evaluate roughly 10^4 to 10^5 candidate locations per image, but only a few of these locations contain objects (the foreground); the rest are just background. This leads to class imbalance.

This imbalance causes two problems:

  1. Training is inefficient, because most locations are easy negatives (meaning the detector can easily classify them as background) that contribute no useful learning signal.

  2. The easy negatives (detections with high confidence) make up a large portion of the input. Although the loss and gradient computed for each one is small, collectively they can overwhelm the total loss and the computed gradients and lead to degenerate models.

What is focal loss?

In short, focal loss (FL) is an improved version of cross-entropy loss (CE). It deals with class imbalance by assigning more weight to hard or easily misclassified examples (for example, background with noisy texture, partial objects, or the objects we are interested in) and by down-weighting easy examples (such as plain background).

As a result, focal loss reduces the loss contribution from easy examples and increases the importance of correcting misclassified ones.

With that in mind, let's first understand the cross-entropy loss for binary classification.

Cross-entropy loss

The idea behind cross-entropy loss is to penalize wrong predictions rather than to reward correct ones.

The cross-entropy loss for binary classification is:

CE = -(y_act * ln(y_pred) + (1 - y_act) * ln(1 - y_pred))

where:

y_act = the actual value of y

y_pred = the predicted value of y

For notational convenience, let y_act = y and y_pred = p, where

y ∈ {0, 1} is the ground-truth class, and

p ∈ [0, 1] is the model's estimated probability for the class y = 1.

This gives:

CE = -ln(p), when y = 1
CE = -ln(1 - p), when y = 0

For further notational convenience, we can define p_t as:

p_t = p, when y = 1
p_t = 1 - p, when y = 0

so the loss can be rewritten as:

CE(p, y) = CE(p_t) = -ln(p_t)
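To make the notation concrete, here is a minimal NumPy sketch of the binary cross-entropy above (my own illustration, not code from the original post); `p` is the predicted probability of class 1 and `y` is the ground-truth label.

```python
import numpy as np

def binary_cross_entropy(p, y):
    """CE(p, y) = -ln(p_t), where p_t = p if y == 1, else 1 - p."""
    p = np.clip(p, 1e-12, 1 - 1e-12)   # guard against ln(0)
    p_t = np.where(y == 1, p, 1 - p)
    return -np.log(p_t)                # np.log is the natural log

print(binary_cross_entropy(np.array([0.95]), np.array([1])))  # ~[0.0513]
```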

The problem with cross-entropy

As the blue curve in the figure shows, even examples that are easily classified (p_t > 0.5, i.e. p very close to 1 when y = 1, or very close to 0 when y = 0) can incur a loss of considerable magnitude.

Let’s use the following example to understand it.

Example

Suppose the foreground (call it class 1) is correctly classified with p = 0.95:

CE(FG) = -ln(0.95) = 0.05

And the background (call it class 0) is correctly classified with p = 0.05:

CE(BG) = -ln(1 - 0.05) = 0.05

The problem is that, for an imbalanced dataset, these small losses, summed over the huge number of candidates in an image, can overwhelm the overall (total) loss and lead to degenerate models.
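A quick back-of-the-envelope check makes this concrete (my own illustration, with made-up but representative counts): with on the order of 10^5 easy background candidates per image and only a handful of foreground objects, the background's many tiny losses swamp the total.

```python
import numpy as np

# Hypothetical counts for one image: ~10^5 easy negatives, a few positives.
n_bg, n_fg = 100_000, 10
loss_bg = n_bg * -np.log(1 - 0.05)   # each easy negative contributes ~0.0513
loss_fg = n_fg * -np.log(0.95)       # each easy positive contributes ~0.0513

print(loss_bg, loss_fg)              # ~5129.3 vs ~0.51
print(loss_bg / (loss_bg + loss_fg)) # background: ~99.99% of the total loss
```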

Balanced cross-entropy loss

A common method to address class imbalance is to introduce a weighting factor α ∈ [0, 1] for class 1 and 1 - α for class 0.

For notational convenience, we can define α_t analogously to p_t (α_t = α when y = 1, and 1 - α when y = 0) and write the loss as:

CE(p_t) = -α_t * ln(p_t)

As you can see, this is just an extension of cross-entropy.
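As a minimal sketch of the formula above (my own, assuming α is the weight for the foreground class y = 1 and 1 - α for the background):

```python
import numpy as np

def balanced_cross_entropy(p, y, alpha=0.25):
    """CE(p_t) = -alpha_t * ln(p_t), alpha_t = alpha if y == 1, else 1 - alpha."""
    p = np.clip(p, 1e-12, 1 - 1e-12)
    p_t = np.where(y == 1, p, 1 - p)
    alpha_t = np.where(y == 1, alpha, 1 - alpha)
    return -alpha_t * np.log(p_t)

print(balanced_cross_entropy(np.array([0.95]), np.array([1])))  # ~[0.0128]
print(balanced_cross_entropy(np.array([0.05]), np.array([0])))  # ~[0.0385]
```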

The problem with balanced cross-entropy

As the experiments in the paper show, the large class imbalance encountered during the training of dense detectors overwhelms the cross-entropy loss.

Easily classified negatives make up the majority of the loss and dominate the gradient. While α balances the importance of positive and negative examples, it does not differentiate between easy and hard examples.

Let's understand this through an example.

Example

Suppose the foreground (call it class 1) is correctly classified with p = 0.95:

CE(FG) = -0.25 * ln(0.95) = 0.0128

And the background (call it class 0) is correctly classified with p = 0.05:

CE(BG) = -(1 - 0.25) * ln(1 - 0.05) = 0.0385

Although it correctly distinguishes the positive and negative classes, it still cannot distinguish easy from hard examples.

This is where focal loss (an extension of cross-entropy) comes into play.

Focal loss explained

Focal loss is just an extension of the cross-entropy loss function that down-weights easy examples and focuses training on hard negatives.

To achieve this, the researchers proposed adding a modulating factor (1 - p_t)^γ to the cross-entropy loss, with a tunable focusing parameter γ ≥ 0.

For object detection, RetinaNet uses an α-balanced variant of the focal loss, where α = 0.25 and γ = 2 work best.

Therefore, focal loss can be defined as:

FL(p_t) = -α_t * (1 - p_t)^γ * ln(p_t)

The focal loss is visualized for several values of γ ∈ [0, 5] in Figure 1.

Note the following properties of focal loss:

  1. When an example is misclassified and p_t is small, the modulating factor is close to 1 and the loss is unaffected.
  2. As p_t → 1, the factor goes to 0, and the loss for well-classified examples is down-weighted.
  3. The focusing parameter γ smoothly adjusts the rate at which easy examples are down-weighted.

The effect of the modulating factor also increases as γ increases (after extensive experimentation, the researchers found that γ = 2 works best).

Note: when γ = 0, FL is equivalent to CE. Refer to the blue curve in the figure.

Intuitively, the modulating factor reduces the loss contribution from easy examples and extends the range of probabilities over which an example receives low loss; the sketch below illustrates this.
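Here is a minimal NumPy sketch of the α-balanced focal loss defined above (my own implementation of the formula, not code from the paper):

```python
import numpy as np

def focal_loss(p, y, alpha=0.25, gamma=2.0):
    """FL(p_t) = -alpha_t * (1 - p_t)**gamma * ln(p_t)."""
    p = np.clip(p, 1e-12, 1 - 1e-12)
    p_t = np.where(y == 1, p, 1 - p)
    alpha_t = np.where(y == 1, alpha, 1 - alpha)
    return -alpha_t * (1 - p_t) ** gamma * np.log(p_t)

# gamma = 0 recovers the alpha-balanced cross-entropy; larger gamma
# down-weights easy examples (p_t close to 1) more aggressively.
print(focal_loss(np.array([0.95]), np.array([1]), gamma=0.0))  # ~[0.0128]
print(focal_loss(np.array([0.95]), np.array([1]), gamma=2.0))  # ~[3.21e-05]
```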

Let's understand the above properties of focal loss through an example.

Example

  1. When records (foreground or background) are correctly classified, suppose the foreground is predicted with probability p = 0.99 and the background with probability p = 0.01.

    p_t = 0.99 when y_act = 1, and p_t = 1 - 0.01 = 0.99 when y_act = 0

    Modulating factor (FG) = (1 - 0.99)^2 = 0.0001
    Modulating factor (BG) = (1 - (1 - 0.01))^2 = 0.0001

    As you can see, the modulating factor is close to 0, so these losses are down-weighted.

  2. When records are misclassified, suppose the foreground is predicted with probability p = 0.01 and the background with probability p = 0.99.

    p_t = 0.01 when y_act = 1, and p_t = 1 - 0.99 = 0.01 when y_act = 0

    Modulating factor (FG) = (1 - 0.01)^2 = 0.9801
    Modulating factor (BG) = (1 - (1 - 0.99))^2 = 0.9801

    As you can see, the modulating factor is close to 1, so the loss is unaffected.

Now, let's compare cross-entropy and focal loss with a few examples and look at the impact of focal loss during training.

Cross-entropy loss vs focal loss

Let's make the comparison by considering the following scenarios.

Easily correctly classified records

Suppose the foreground is correctly classified with prediction probability p = 0.95 and the background is correctly classified with prediction probability p = 0.05.

p_t = 0.95 when y_act = 1, and p_t = 1 - 0.05 = 0.95 when y_act = 0

CE(FG) = -ln(0.95) = 0.051293

CE(BG) = -ln(1 - 0.05) = 0.051293

Let's consider α = 0.25 and γ = 2.

FL(FG) = -0.25 * (1 - 0.95)^2 * ln(0.95) = 3.2058e-5

FL(BG) = -0.75 * (1 - (1 - 0.05))^2 * ln(1 - 0.05) = 9.6175e-5

Misclassified records

Suppose the foreground is misclassified with prediction probability p = 0.05 and the background is misclassified with prediction probability p = 0.95.

p_t = 0.05 when y_act = 1, and p_t = 1 - 0.95 = 0.05 when y_act = 0

CE(FG) = -ln(0.05) = 2.995732

CE(BG) = -ln(1 - 0.95) = 2.995732

Let's consider the same setting, i.e. α = 0.25 and γ = 2.

FL(FG) = -0.25 * (1 - 0.05)^2 * ln(0.05) = 0.675912

FL(BG) = -0.75 * (1 - (1 - 0.95))^2 * ln(1 - 0.95) = 2.027736

Very easily classified records

Suppose the foreground is correctly classified with prediction probability p = 0.99 and the background with prediction probability p = 0.01.

p_t = 0.99 when y_act = 1, and p_t = 1 - 0.01 = 0.99 when y_act = 0

CE(FG) = -ln(0.99) = 0.010050

CE(BG) = -ln(1 - 0.01) = 0.010050

Let's consider the same setting, i.e. α = 0.25 and γ = 2.

FL(FG) = -0.25 * (1 - 0.99)^2 * ln(0.99) = 2.5126e-7

FL(BG) = -0.75 * (1 - (1 - 0.01))^2 * ln(1 - 0.01) = 7.5378e-7
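The three scenarios above can be reproduced in a few lines (my own verification script, using the same formulas and the foreground values from this post):

```python
import numpy as np

def ce(p_t):
    return -np.log(p_t)

def fl(p_t, alpha_t=0.25, gamma=2.0):
    return -alpha_t * (1 - p_t) ** gamma * np.log(p_t)

# Foreground p_t for each scenario: easy, misclassified, very easy.
for name, p_t in [("easy", 0.95), ("misclassified", 0.05), ("very easy", 0.99)]:
    print(f"{name}: CE={ce(p_t):.6f}  FL={fl(p_t):.3e}  "
          f"CE/FL = {ce(p_t) / fl(p_t):,.0f}x")
# easy: CE=0.051293  FL=3.206e-05  CE/FL = 1,600x
# misclassified: CE=2.995732  FL=6.759e-01  CE/FL = 4x
# very easy: CE=0.010050  FL=2.513e-07  CE/FL = 40,000x
```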

Final thoughts

Comparing CE to FL across the three scenarios:

Scenario 1 (easily classified): 0.051293 / 3.2058e-5 ≈ 1,600 times smaller
Scenario 2 (misclassified): 2.995732 / 0.675912 ≈ 4.4 times smaller
Scenario 3 (very easily classified): 0.010050 / 2.5126e-7 ≈ 40,000 times smaller

These three scenarios clearly show how focal loss down-weights well-classified records while assigning greater weight to misclassified or hard records.

After extensive tests and experiments, the researchers found that α = 0.25 and γ = 2 work best.

Endnote

We have walked through the evolution from cross-entropy loss to focal loss in object detection, and I have tried my best to explain how focal loss works.

Thank you for reading!

References

Original article: https://www.analyticsvidhya.com/blog/2020/08/a-beginners-guide-to-focal-loss-in-object-detection/
