II Machine learning algorithms – linear regression (1)

Time: 2022-1-1

1. Idea of linear regression algorithm

Machine learning algorithms can be divided into supervised learning and unsupervised learning.

What is a supervised learning algorithm?
Supervised learning is the most commonly used machine learning approach: the task is to infer a model from a labeled training data set.

Regression is a supervised learning algorithm. From the machine learning perspective, a regression algorithm builds a model that captures the mapping relationship between the attributes (x) and the label (y).

Linear regression is a regression analysis that models the relationship between one or more independent variables and a dependent variable.
Its defining characteristic is that the model is a linear combination of the features, weighted by model parameters called regression coefficients.

2. Linear regression example

House area (m²)    Rent (yuan)
10                 800
15.5               1200
20.2               1600
35.0               2500
48.3               3300
58.9               3800
65.2               4500

Taking each row of the above data as one sample, we get the following relationship:

    x (house area)    y (rent)
0   10                800
1   15.5              1200
...
6   65.2              4500
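The indexed view above is just the samples laid out as a table; as a minimal sketch (assuming pandas, which the rest of the article does not actually use), it could be built like this:

import pandas as pd

# Load the samples into a DataFrame: one row per sample,
# x = house area (m^2), y = rent (yuan).
df = pd.DataFrame({
    "x (house area)": [10, 15.5, 20.2, 35.0, 48.3, 58.9, 65.2],
    "y (rent)": [800, 1200, 1600, 2500, 3300, 3800, 4500],
})
print(df)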

As shown in the figure below.

(Figure: scatter plot of the house area and rent samples.)

According to the above data, what is the rent for a house with an area of 80 square meters?
First, we need to find the mapping relationship between house area and rent, y = kx + b, as shown in the figure below.

(Figure: the sample points with the fitted line y = kx + b.)

Then, through the mapping y = f(x), the rent can be predicted.
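As a small sketch of that prediction step (the k and b values below are placeholders for illustration, not the fitted coefficients, which come later):

# Sketch of the prediction step: once k and b are known, the rent for any
# area follows from the line y = k*x + b. The k and b here are placeholders,
# not values fitted from the data.
def predict_rent(area, k, b):
    return k * area + b

print(predict_rent(80, k=60.0, b=200.0))  # hypothetical k and b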

That was the case of a single feature; what if there are two features? Then what we are looking for is a plane.

(Figure: a fitted plane for two features.)

Extending to more features, the mapping relationship we are looking for is

h_θ(x) = θ_0 + θ_1 x_1 + θ_2 x_2 + … + θ_n x_n

where x_1, …, x_n are the features and θ_0, …, θ_n are the model parameters.
If we introduce x_0 = 1, then θ_0 x_0 = θ_0, and the equation becomes

h_θ(x) = Σ_{i=0}^{n} θ_i x_i

Writing the above formula with vectors, we finally get

h_θ(x) = θ^T x
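A minimal NumPy sketch of this vectorized form, with placeholder parameters and a hypothetical two-feature sample rather than anything fitted from the data:

import numpy as np

# Vectorized hypothesis h_theta(x) = theta^T x, with x_0 = 1 prepended
# so that theta_0 acts as the intercept.
def hypothesis(theta, features):
    x = np.concatenate(([1.0], features))  # x_0 = 1
    return theta @ x

# Placeholder parameters and a hypothetical two-feature sample
# (e.g. area and number of rooms); not fitted values.
theta = np.array([200.0, 60.0, 150.0])      # [theta_0, theta_1, theta_2]
print(hypothesis(theta, np.array([80.0, 2.0])))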

3. Error

We now have this model, but there is obviously an error between the predicted value and the true value; we use ε to represent this error.
For each sample,

y^(i) = θ^T x^(i) + ε^(i)        (1)

By the central limit theorem, the errors ε^(i) are independent and identically distributed, following a Gaussian distribution with mean 0 and variance σ².

Therefore,

p(ε^(i)) = (1 / (√(2π) σ)) · exp(−(ε^(i))² / (2σ²))        (2)

Substituting equation (1) into equation (2) gives

p(y^(i) | x^(i); θ) = (1 / (√(2π) σ)) · exp(−(y^(i) − θ^T x^(i))² / (2σ²))

Then the likelihood function is

L(θ) = ∏_{i=1}^{m} p(y^(i) | x^(i); θ) = ∏_{i=1}^{m} (1 / (√(2π) σ)) · exp(−(y^(i) − θ^T x^(i))² / (2σ²))

For ease of solving, take the logarithm:

log L(θ) = log ∏_{i=1}^{m} (1 / (√(2π) σ)) · exp(−(y^(i) − θ^T x^(i))² / (2σ²))
         = Σ_{i=1}^{m} log [ (1 / (√(2π) σ)) · exp(−(y^(i) − θ^T x^(i))² / (2σ²)) ]
         = m log(1 / (√(2π) σ)) − (1/σ²) · (1/2) · Σ_{i=1}^{m} (y^(i) − θ^T x^(i))²

When

J(θ) = (1/2) · Σ_{i=1}^{m} (y^(i) − θ^T x^(i))²

reaches its minimum, log L(θ) reaches its maximum. This J(θ) is our loss function.
Further transformation into matrix form:

J(θ) = (1/2) (Xθ − y)^T (Xθ − y)

Then take its partial derivative with respect to θ:

∂J(θ)/∂θ = X^T X θ − X^T y

Setting the partial derivative to 0, we finally obtain

θ = (X^T X)^(−1) X^T y
This is the least squares (normal equation) solution, one of the methods for minimizing the linear regression loss function.
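As a minimal sketch of this normal-equation solution on the house-rent data (not part of the original article; it should agree with the sklearn fit below up to floating-point error):

import numpy as np

# House area (m^2) and rent (yuan) from the table above.
areas = np.array([10, 15.5, 20.2, 35.0, 48.3, 58.9, 65.2])
rents = np.array([800, 1200, 1600, 2500, 3300, 3800, 4500])

# Design matrix with x_0 = 1 for the intercept term.
X = np.column_stack([np.ones_like(areas), areas])
y = rents

# Normal equation: theta = (X^T X)^{-1} X^T y
theta = np.linalg.inv(X.T @ X) @ X.T @ y
print(theta)  # [intercept, slope]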

A code example for the house area and rent relationship above:

import numpy as np
from matplotlib import pyplot as plt
from sklearn.linear_model import LinearRegression as lr


# House area data (m^2)
x_list = [10, 15.5, 20.2, 35.0, 48.3, 58.9, 65.2]
# Corresponding rent data (yuan)
y_list = [800, 1200, 1600, 2500, 3300, 3800, 4500]

x = np.array(x_list).reshape(-1,1)
y = np.array(y_list).reshape(-1,1)

# Fit a linear regression model: rent = coef * area + intercept
model = lr()
model.fit(x, y)

# Predicted rent at each training point, used to draw the fitted line
y_plot = model.predict(x)

print(model.coef_)

plt.figure(figsize=(5,5),dpi=80, facecolor='w')

plt.scatter(x, y, color='red', linewidths=2)
plt.plot(x, y_plot, color='blue')

x_tick = list(range(5, 70, 5))

plt.grid(alpha=0.4)

plt.xticks(x_tick)

plt.show()

Result:

[[63.66780288]]

(Figure: scatter of the sample data with the fitted regression line.)
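The coefficient of about 63.7 means each additional square meter adds roughly 63.7 yuan of rent. To answer the earlier question about an 80 m² house, one could continue the script above with a prediction call (a small usage sketch, not part of the original code):

# Predict the rent of an 80 m^2 house with the fitted model.
area_new = np.array([[80]])
print(model.predict(area_new))
print(model.intercept_)  # the intercept b of the fitted line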

4. Example of Boston house price forecast

Obtain the dataset from sklearn.datasets, build a house price prediction model with ordinary linear regression, and draw scatter and line charts of the predicted house prices against the true house prices.
Code example:

# coding:utf-8
from sklearn.datasets import load_boston
from sklearn.linear_model import LinearRegression as lr
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

from matplotlib import pyplot as plt
from matplotlib import font_manager

font = font_manager.FontProperties(fname="/usr/share/fonts/wps-office/msyhbd.ttf")

def my_predic_fun():
    """
    Forecast house prices in Boston using linear regression
    :return:
    """
    lb = load_boston()

    x_train, x_test, y_train, y_test = train_test_split(lb.data, lb.target, test_size=0.2)

    x_std = StandardScaler()
    y_std = StandardScaler()

    x_train = x_std.fit_transform(x_train)
    x_test = x_std.transform(x_test)
    y_train = y_std.fit_transform(y_train.reshape(-1,1))
    y_test = y_std.transform(y_test.reshape(-1,1))


    model = lr()
    model.fit(x_train, y_train)

    y_predict = y_std.inverse_transform(model.predict(x_test))
    return y_predict, y_std.inverse_transform(y_test)


def draw_fun(y_predict, y_test):
    """
    Draw scatter and line charts of house price forecast and real value
    :param y_predict:
    :param y_test:
    :return:
    """
    x = range(1,len(y_predict)+1)
    plt.figure(figsize=(20, 8), dpi=80)
    plt.scatter(x, y_test, label="true value", color='blue')
    plt.scatter(x, y_predict, label="predicted value", color='red')
    plt.plot(x,y_test)
    plt.plot(x,y_predict)

    x_tick = list(x)
    y_tick = list(range(0,60,5))

    plt.legend(prop=font, loc='best')
    plt.xticks(list(x), x_tick)
    plt.yticks(y_tick)
    plt.grid(alpha=0.4)
    plt.show()


if __name__ == '__main__':
    y_predict, y_test = my_predic_fun()
    draw_fun(y_predict, y_test)

Result:

(Figure: scatter and line chart comparing the predicted and true house prices on the test set.)
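To quantify the fit beyond the chart, one might also report the mean squared error on the test set; a small follow-on sketch (not part of the original article), reusing the arrays returned by my_predic_fun:

from sklearn.metrics import mean_squared_error

# After obtaining y_predict and y_test from my_predic_fun(), report the
# mean squared error in the original price units (both arrays were already
# inverse-transformed back from the standardized scale).
y_predict, y_test = my_predic_fun()
print("MSE:", mean_squared_error(y_test, y_predict))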

Reference: https://blog.csdn.net/guoyunf…
