Machine learning algorithms can be broadly divided into supervised learning and unsupervised learning.
1. What is supervised learning?
Supervised learning is the most commonly used machine learning approach. It is the task of inferring a model from a labeled training data set.
Regression is a supervised learning task. From the machine learning perspective, a regression algorithm builds a model that captures the mapping relationship between features (x) and labels (y).
Linear regression is a regression analysis that models the relationship between one or more independent variables and a dependent variable.
It is characterized by the model being a linear combination of the features, with one or more model parameters called regression coefficients.
2. Linear regression example
Taking each row of the data below as one sample, we get the following relationship between house area (x) and rent (y):

| Index | House area (m²) | Rent (yuan) |
| --- | --- | --- |
| 0 | 10 | 800 |
| 1 | 15.5 | 1200 |
| ... | ... | ... |
| 5 | 65.2 | 4500 |
According to the above data, what is the rent for a house with an area of 80 square meters?
First, we need to find the mapping relationship between house area and rent: y = kx + b.
Then, using the mapping y = f(x), we can predict the rent.
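As a minimal sketch, we can fit k and b by least squares on the sample data above (here using NumPy's `polyfit`, not the method derived later) and answer the 80 m² question:

```python
import numpy as np

# Sample data from the table above (house area in m^2, rent in yuan)
x = np.array([10, 15.5, 20.2, 35.0, 48.3, 58.9, 65.2])
y = np.array([800, 1200, 1600, 2500, 3300, 3800, 4500])

# Fit y = k*x + b by least squares
k, b = np.polyfit(x, y, 1)

# Predict the rent for an 80 m^2 house
rent_80 = k * 80 + b
print(f"k = {k:.2f}, b = {b:.2f}, predicted rent for 80 m^2: {rent_80:.0f} yuan")
```

The fitted line gives a rent prediction somewhat above the largest sample, as expected for an area outside the observed range.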
That was a single feature. What if there are two features? Then what we are looking for is a plane instead of a line.
Extending to n features, the mapping relationship we are looking for is

hθ(x) = θ₀ + θ₁x₁ + θ₂x₂ + … + θₙxₙ

where the xᵢ are the features. If we introduce a constant feature x₀ = 1 so that the intercept θ₀ can be written as θ₀x₀, we get

hθ(x) = Σᵢ₌₀ⁿ θᵢxᵢ

Writing the above formula with vectors, we eventually get

hθ(x) = θᵀx
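The vector form is just a dot product. A tiny sketch with hypothetical numbers (theta and x below are made up purely for illustration):

```python
import numpy as np

# Hypothetical parameters and features to illustrate h_theta(x) = theta^T x;
# x[0] = 1 is the constant feature carrying the intercept theta[0]
theta = np.array([2.0, 0.5, -1.0, 3.0])
x = np.array([1.0, 4.0, 2.0, 0.5])

h = theta @ x  # theta^T x
print(h)  # 2 + 0.5*4 - 1.0*2 + 3.0*0.5 = 3.5
```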
We now have the model, but obviously there is an error between the predicted value and the real value; we use ε to denote this error.
For each sample i,

y⁽ⁱ⁾ = θᵀx⁽ⁱ⁾ + ε⁽ⁱ⁾    (1)

By the central limit theorem, the errors ε⁽ⁱ⁾ can be assumed to be independent and identically distributed, following a Gaussian distribution with mean 0 and variance σ²:

p(ε⁽ⁱ⁾) = 1/(√(2π)·σ) · exp(−(ε⁽ⁱ⁾)² / (2σ²))    (2)

Substituting equation (1) into equation (2):

p(y⁽ⁱ⁾ | x⁽ⁱ⁾; θ) = 1/(√(2π)·σ) · exp(−(y⁽ⁱ⁾ − θᵀx⁽ⁱ⁾)² / (2σ²))
Then the likelihood function over the m samples is

L(θ) = ∏ᵢ₌₁ᵐ p(y⁽ⁱ⁾ | x⁽ⁱ⁾; θ)

For ease of solution, take the logarithm:

log L(θ) = m·log(1/(√(2π)·σ)) − (1/σ²) · (1/2)·Σᵢ₌₁ᵐ (y⁽ⁱ⁾ − θᵀx⁽ⁱ⁾)²

The first term is a constant, so maximizing log L(θ) is equivalent to minimizing

J(θ) = (1/2)·Σᵢ₌₁ᵐ (y⁽ⁱ⁾ − θᵀx⁽ⁱ⁾)²

which is our loss function.
Writing J(θ) in matrix form and taking its partial derivative with respect to θ gives

∇θ J(θ) = Xᵀ(Xθ − y)

Setting the partial derivative to 0, we finally obtain

θ = (XᵀX)⁻¹ Xᵀy

This is the least squares (normal equation) solution, one of the ways to minimize the linear regression loss function.
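The closed-form solution can be sketched directly in NumPy on the house-area data from the example above:

```python
import numpy as np

# Normal-equation solution theta = (X^T X)^{-1} X^T y
# on the house-area data from the example above
x = np.array([10, 15.5, 20.2, 35.0, 48.3, 58.9, 65.2])
y = np.array([800, 1200, 1600, 2500, 3300, 3800, 4500])

# Prepend the constant column x0 = 1, so theta[0] is the intercept b
X = np.column_stack([np.ones_like(x), x])
theta = np.linalg.inv(X.T @ X) @ X.T @ y
print(theta)  # [intercept b, slope k]
```

In practice `np.linalg.lstsq` (or sklearn) is preferred over explicitly inverting XᵀX, which is numerically fragile when features are correlated.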
The relationship between house area and rent above can be fitted with the following code:
```python
import numpy as np
from matplotlib import pyplot as plt
from sklearn.linear_model import LinearRegression as lr

# Housing area data
x_list = [10, 15.5, 20.2, 35.0, 48.3, 58.9, 65.2]
# Corresponding rent data
y_list = [800, 1200, 1600, 2500, 3300, 3800, 4500]

x = np.array(x_list).reshape(-1, 1)
y = np.array(y_list).reshape(-1, 1)

model = lr()
model.fit(x, y)
y_plot = model.predict(x)
print(model.coef_)

plt.figure(figsize=(5, 5), dpi=80, facecolor='w')
plt.scatter(x, y, color='red', linewidths=2)
plt.plot(x, y_plot, color='blue')
x_tick = list(range(5, 70, 5))
plt.grid(alpha=0.4)
plt.xticks(x_tick)
plt.show()
```
3. Example of Boston house price forecast
Load the dataset from sklearn.datasets, build a house price prediction model with standard linear regression, and draw scatter and line charts of the predicted and true house prices.
```python
# coding:utf-8
from sklearn.datasets import load_boston
from sklearn.linear_model import LinearRegression as lr
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from matplotlib import pyplot as plt
from matplotlib import font_manager

font = font_manager.FontProperties(fname="/usr/share/fonts/wps-office/msyhbd.ttf")


def my_predic_fun():
    """
    Forecast house prices in Boston using linear regression
    :return: predicted prices and true prices, both on the original scale
    """
    lb = load_boston()
    x_train, x_test, y_train, y_test = train_test_split(lb.data, lb.target, test_size=0.2)
    x_std = StandardScaler()
    y_std = StandardScaler()
    x_train = x_std.fit_transform(x_train)
    x_test = x_std.transform(x_test)
    y_train = y_std.fit_transform(y_train.reshape(-1, 1))
    y_test = y_std.transform(y_test.reshape(-1, 1))
    model = lr()
    model.fit(x_train, y_train)
    y_predict = y_std.inverse_transform(model.predict(x_test))
    return y_predict, y_std.inverse_transform(y_test)


def draw_fun(y_predict, y_test):
    """
    Draw scatter and line charts of predicted and true house prices
    :param y_predict: predicted prices
    :param y_test: true prices
    :return:
    """
    x = range(1, len(y_predict) + 1)
    plt.figure(figsize=(20, 8), dpi=80)
    plt.scatter(x, y_test, label="true value", color='blue')
    plt.scatter(x, y_predict, label="predicted value", color='red')
    plt.plot(x, y_test)
    plt.plot(x, y_predict)
    x_tick = list(x)
    y_tick = list(range(0, 60, 5))
    plt.legend(prop=font, loc='best')
    plt.xticks(list(x), x_tick)
    plt.yticks(y_tick)
    plt.grid(alpha=0.4)
    plt.show()


if __name__ == '__main__':
    y_predict, y_test = my_predic_fun()
    draw_fun(y_predict, y_test)
```
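Note that `load_boston` was removed in scikit-learn 1.2, so the script above requires an older scikit-learn. The same train/test/fit workflow can be sketched on a synthetic dataset of the same shape; the `make_regression` parameters below are arbitrary choices for this sketch, not a replacement for the Boston data:

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic data with the same shape as the Boston set (506 samples, 13 features);
# noise and random_state are arbitrary choices for this sketch
X, y = make_regression(n_samples=506, n_features=13, noise=10.0, random_state=0)
x_train, x_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

model = LinearRegression().fit(x_train, y_train)
print(round(model.score(x_test, y_test), 3))  # R^2 on the held-out split
```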