### Loss function of logistic regression

Linear regression uses squared loss. Logistic regression instead uses the logarithmic loss function (Log Loss), which is defined as follows:

$$
\text{LogLoss} = \sum_{(x,y)\in D} -y\log(y') - (1-y)\log(1-y')
$$

where:

1. $(x, y) \in D$ is the dataset containing many labeled samples, each an $(x, y)$ pair.

2. $y$ is the label in a labeled sample. Since this is logistic regression, every value of $y$ must be either 0 or 1.

3. $y'$ is the predicted value (between 0 and 1), given the feature set $x$.
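As a minimal sketch of the definition above, the following Python function sums the per-sample loss over a small dataset. The function name `log_loss`, the clipping constant `eps`, and the sample values are assumptions made for illustration; they are not part of the original text.

```python
import math

def log_loss(labels, predictions, eps=1e-15):
    """Sum of -y*log(y') - (1-y)*log(1-y') over all samples."""
    total = 0.0
    for y, y_pred in zip(labels, predictions):
        # Clip predictions away from exactly 0 or 1 so log() stays finite
        # (eps is an illustrative choice, not part of the definition).
        p = min(max(y_pred, eps), 1.0 - eps)
        total += -y * math.log(p) - (1 - y) * math.log(1.0 - p)
    return total

# Hypothetical labels (0 or 1) and predicted probabilities.
labels = [1, 0, 1]
predictions = [0.9, 0.2, 0.6]
loss = log_loss(labels, predictions)
print(loss)  # approximately 0.839
```

Note that only one of the two terms is active for any given sample: when $y = 1$ the loss is $-\log(y')$, and when $y = 0$ it is $-\log(1-y')$, so confident correct predictions contribute almost nothing and confident wrong ones are penalized heavily.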

The equation for the logarithmic loss function is closely related to the **entropy measure in Shannon's information theory**. It is also the negative logarithm of the **likelihood function** (assuming $y$ follows a **Bernoulli distribution**). In fact, minimizing the loss function yields the maximum likelihood estimate.
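The link to the likelihood can be made explicit with a short derivation. This is a sketch that assumes the samples are independent and that each label $y$ is Bernoulli-distributed with success probability $y'$; the symbol $\mathcal{L}$ for the likelihood is introduced here for illustration.

$$
\mathcal{L} = \prod_{(x,y)\in D} (y')^{y}\,(1-y')^{1-y}
$$

$$
-\log \mathcal{L} = \sum_{(x,y)\in D} -y\log(y') - (1-y)\log(1-y') = \text{LogLoss}
$$

Because $-\log$ is strictly decreasing, the parameters that minimize the Log Loss are exactly those that maximize $\mathcal{L}$.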

### Regularization in logistic regression

**Regularization** is extremely important in logistic regression modeling. Without regularization, the asymptotic nature of logistic regression keeps driving the loss toward 0 in high-dimensional spaces. Therefore, most logistic regression models use one of the following two strategies to reduce model complexity:

1. $L_2$ regularization.

2. Early stopping, that is, limiting the number of training steps or the learning rate.

(We will discuss a third strategy, $L_1$ regularization, in a later unit.)
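For reference, the first strategy is commonly written as a penalty added to the loss. The form below is a sketch under assumed notation: $w_j$ denotes the model weights and $\lambda \ge 0$ is a regularization rate, neither of which is named in the original text.

$$
\text{minimize}\quad \text{LogLoss} + \lambda \sum_{j} w_j^2
$$

A larger $\lambda$ penalizes large weights more strongly, which keeps the model from pushing weights to extreme values just to shave off a little more loss.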

Suppose you assign a unique ID to each sample and map each ID to its own feature. If you do not specify a regularization function, the model becomes completely overfit. This is because the model tries to drive the loss on every sample to 0 but can never get there, so the weight of each indicator feature is pushed toward positive or negative infinity. This can happen with high-dimensional data that contains feature crosses, when there is a huge mass of rare crosses that each occur in only one sample.

Fortunately, using $L_2$ regularization or early stopping prevents this problem.
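The unique-ID scenario can be reproduced in a few lines. The sketch below is an illustration under simplifying assumptions, not a reference implementation: each sample has exactly one indicator feature of its own, so its prediction is just the sigmoid of its own weight, and training is plain gradient descent. The names `train`, `l2`, and `learning_rate`, and all numeric values, are hypothetical choices.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train(labels, steps, learning_rate=1.0, l2=0.0):
    """Gradient descent on Log Loss where sample i has its own indicator feature.

    Because each sample activates only its own weight, the prediction for
    sample i is simply sigmoid(weights[i]).
    """
    weights = [0.0] * len(labels)
    for _ in range(steps):
        for i, y in enumerate(labels):
            # Gradient of the Log Loss is (y' - y); an L2 penalty of
            # l2 * w^2 adds 2 * l2 * w to it.
            grad = sigmoid(weights[i]) - y + 2.0 * l2 * weights[i]
            weights[i] -= learning_rate * grad
    return weights

labels = [1, 0, 1, 0]
unregularized_short = train(labels, steps=100)         # stopped early
unregularized_long = train(labels, steps=10000)        # no regularization
regularized_long = train(labels, steps=10000, l2=0.01) # L2 penalty

print(unregularized_short[0], unregularized_long[0], regularized_long[0])
```

Without regularization the weights keep growing in magnitude the longer training runs (positive for label 1, negative for label 0), while the loss only creeps toward 0. Stopping after fewer steps leaves the weights smaller, and the $L_2$ penalty makes them settle at a finite value regardless of how long training continues.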

Summary:

- Logistic regression models generate probabilities.
- The logarithmic loss function is the loss function of logistic regression.
- Logistic regression is widely used by many practitioners.

This work is licensed under a CC agreement; reprints must credit the author and include a link to this article.