# Logistic regression in machine learning: model training

Time: 2021-10-14

### Loss function of logistic regression

The loss function of linear regression is the squared loss. The loss function of logistic regression is the logarithmic loss (log loss), which is defined as follows:

$$
\text{LogLoss} = \sum_{(x,y)\in D} -y\log(y') - (1-y)\log(1-y')
$$

Where:

1. $(x, y) \in D$ is a dataset containing many labeled samples, each a pair $(x, y)$.

2. $y$ is the label in a labeled sample. Since this is logistic regression, every value of $y$ must be 0 or 1.

3. $y'$ is the predicted value (between 0 and 1) for the feature set $x$.

The equation of the log loss is closely related to the entropy measure in Shannon's information theory. It is also the negative logarithm of the likelihood function, assuming the label $y$ follows a Bernoulli distribution. In fact, minimizing the loss function yields the maximum likelihood estimate.
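As a minimal sketch of the definition above, the log loss can be computed directly from labels and predicted probabilities. The function name `log_loss`, the clipping constant `eps`, and the sample values are assumptions made for illustration; they do not come from the original text.

```python
import math

def log_loss(labels, predictions, eps=1e-15):
    """Sum of per-sample log loss; labels are 0 or 1, predictions are probabilities."""
    total = 0.0
    for y, y_pred in zip(labels, predictions):
        # Clip the prediction so that log(0) is never evaluated.
        p = min(max(y_pred, eps), 1.0 - eps)
        total += -y * math.log(p) - (1 - y) * math.log(1.0 - p)
    return total

# Hypothetical labels and predicted probabilities.
labels = [1, 0, 1]
predictions = [0.9, 0.2, 0.6]

loss = log_loss(labels, predictions)
# Bernoulli likelihood of the labels under the predictions: 0.9 * (1 - 0.2) * 0.6.
likelihood = math.exp(-loss)
print(loss, likelihood)
```

Because the loss is the negative logarithm of the likelihood, `math.exp(-loss)` recovers the product of the probabilities the model assigned to the observed labels. A lower loss therefore means a higher likelihood.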

### Regularization in logistic regression

Regularization is very important in logistic regression modeling. Without regularization, the asymptotic nature of logistic regression would keep driving the loss toward 0 in high-dimensional spaces. Therefore, most logistic regression models use one of the following two strategies to reduce model complexity:

1. $L_2$ regularization.

2. Early stopping, that is, limiting the number of training steps or the learning rate.

(We will discuss a third strategy, $L_1$ regularization, in a later unit.)
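For reference, one common way to write an $L_2$-regularized objective is sketched below. The symbols $w$ (the model weights) and $\lambda$ (the regularization rate) are assumptions introduced here, and some formulations scale the penalty by $\lambda/2$ instead:

$$
\text{minimize}\quad \sum_{(x,y)\in D} \bigl[-y\log(y') - (1-y)\log(1-y')\bigr] + \lambda \sum_{j} w_j^2
$$

The penalty term grows with the squared size of the weights, so very large weights are no longer free even when they would lower the log loss a little further.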

Suppose you assign a unique ID to each sample and map each ID to its own feature. If you do not specify a regularization function, the model will become completely overfit. This is because the model tries to drive the loss on every sample to 0 but never gets there, which pushes the weight of each indicator feature toward positive or negative infinity. This occurs when there are a large number of rare feature combinations, each appearing in only one sample.

Fortunately, using $L_2$ regularization or early stopping prevents this problem.
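The effect can be illustrated with a small sketch, assuming plain batch gradient descent and a penalty of the form `l2 * sum(w^2)`. The `train` function, its parameters, and the four-sample dataset are hypothetical and chosen only to mirror the unique-ID scenario described above.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def train(features, labels, l2=0.0, steps=1000, learning_rate=0.5):
    """Batch gradient descent on log loss + l2 * sum(w^2); returns the weights."""
    weights = [0.0] * len(features[0])
    for _ in range(steps):
        # Gradient of the L2 penalty (zero when l2 == 0).
        grads = [2.0 * l2 * w for w in weights]
        for x, y in zip(features, labels):
            error = sigmoid(sum(w * xi for w, xi in zip(weights, x))) - y
            for j, xi in enumerate(x):
                grads[j] += error * xi  # gradient of the log loss
        weights = [w - learning_rate * g for w, g in zip(weights, grads)]
    return weights

# Hypothetical data: four samples, each with its own one-hot ID feature.
features = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
labels = [1, 0, 1, 0]

runs = {
    "no regularization, 1000 steps": train(features, labels, steps=1000),
    "no regularization, 20000 steps": train(features, labels, steps=20000),
    "L2 (l2=0.05), 20000 steps": train(features, labels, l2=0.05, steps=20000),
    "early stopping, 20 steps": train(features, labels, steps=20),
}
for name, weights in runs.items():
    print(name, [round(w, 2) for w in weights])
```

In this sketch the unregularized weights keep growing in magnitude as the number of steps increases, positive for samples labeled 1 and negative for samples labeled 0. With the L2 penalty they settle at finite values, and stopping after a few steps simply keeps them small.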

Summary:

• Logistic regression models generate probabilities.
• The logarithmic loss function is the loss function of logistic regression.
• Logistic regression is widely used by many practitioners.