# Python machine learning (I) – linear regression

Time: 2022-1-12
• Linear regression is a classical statistical model. It is used to predict a continuous numerical variable (the dependent variable) from known variables (the independent variables). Typical applications include stock price forecasting, revenue forecasting, advertising effect prediction, and sales performance forecasting.
## Univariate linear regression:

### Basic concepts:

• Univariate linear regression analyzes the linear relationship between a single independent variable x and a dependent variable y. The value of an economic indicator is often affected by many factors; if only one of them is the main factor and plays a decisive role, univariate linear regression can be used for prediction and analysis. The data set can be expressed as {(x1, y1), (x2, y2), …, (xn, yn)}, where xi is the i-th value of the independent variable x, yi is the i-th value of the dependent variable y, and n is the sample size of the data set. Once the model is built, the value of the dependent variable y can be predicted from a new value of the independent variable x. The mathematical formula of the model can be expressed as $y = ax + b$, where $a$ is the slope and $b$ is the intercept.
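
As a rough illustration of how the slope and intercept of such a line can be estimated, the sketch below applies the standard least-squares formulas with NumPy. The names `a` (slope) and `b` (intercept) are labels chosen here for illustration, and the height/weight values are the sample data used later in this article.

```python
import numpy as np

# Sample data: heights (x) and weights (y) from the example in this article
x = np.array([152, 156, 160, 164, 168, 172, 176, 180, 184, 188], dtype=float)
y = np.array([51, 53, 54, 55, 57, 60, 62, 65, 69, 72], dtype=float)

x_mean, y_mean = x.mean(), y.mean()

# Least-squares estimates for a line y = a * x + b:
#   a = sum((x - x_mean) * (y - y_mean)) / sum((x - x_mean)^2)
#   b = y_mean - a * x_mean
a = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
b = y_mean - a * x_mean

print(f"fitted line: y = {a:.4f} * x + {b:.4f}")
```

This fits all ten rows at once, so the coefficients can differ somewhat from a model trained on a random 80% subset.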

### Implementation in Python:

• Import the packages and libraries we need:

```python
# Import the datasets and linear_model modules from scikit-learn
from sklearn import datasets, linear_model
# train_test_split divides the data set into a training set and a test set
from sklearn.model_selection import train_test_split
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
```

For example, suppose we have 10 rows and 2 columns of data: the first column is height and the second column is weight. A common practice when splitting raw data is to use 80% of it as training data to fit the model and the remaining 20% as test data. The test data gives a direct measure of how well the model performs, so the model can be improved before it is used in a real environment.

```python
data = np.array([[152, 51], [156, 53], [160, 54], [164, 55],
                 [168, 57], [172, 60], [176, 62], [180, 65],
                 [184, 69], [188, 72]])

# X holds the feature (height) and y holds the label (weight).
# data[:, 0] is a one-dimensional array, but the model expects a 2-D matrix,
# so reshape(-1, 1) turns it into a single column.
X, y = data[:, 0].reshape(-1, 1), data[:, 1]
# Split the data into a training set and a test set.
# train_size=0.8 means that 80% of the rows are randomly drawn as training data.
X_train, X_test, y_train, y_test = train_test_split(X, y, train_size=0.8)

# Linear regression model
regr = linear_model.LinearRegression()
# Fit the model to the training data
regr.fit(X_train, y_train)
# score returns the coefficient of determination R^2
regr.score(X_train, y_train)
```
• The coefficient of determination is $R^2 = 1 - u / v$
• $u = \sum_i (y_i - \hat{y}_i)^2$ is the sum of squared differences between the actual values of y and the predicted values of y
• $v = \sum_i (y_i - \bar{y})^2$ is the sum of squared differences between the actual values of y and their mean
• Output for this run: $R^2$ = 0.963944147932503
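
To make the definition of the coefficient of determination concrete, the following sketch computes u, v and R² by hand and compares the result with `score`. It is an illustration only: it fits on all ten rows rather than on a random 80% split, so its value will not exactly match the figure quoted above. The names `model`, `r2_manual` and `r2_sklearn` are chosen here for the example.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Height (feature) and weight (label) from the example data
X = np.array([152, 156, 160, 164, 168, 172, 176, 180, 184, 188]).reshape(-1, 1)
y = np.array([51, 53, 54, 55, 57, 60, 62, 65, 69, 72])

# Assumption: fit on every row so the result is deterministic
model = LinearRegression().fit(X, y)
y_pred = model.predict(X)

u = np.sum((y - y_pred) ** 2)    # residual sum of squares
v = np.sum((y - y.mean()) ** 2)  # total sum of squares
r2_manual = 1 - u / v
r2_sklearn = model.score(X, y)

print(r2_manual, r2_sklearn)
```

The two printed values should agree up to floating-point error, since `score` on a regressor returns this same quantity.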
```python
# Global font settings (SimHei is a Chinese typeface; substitute any installed font)
font = {'family': 'SimHei', 'size': 20}
plt.rc('font', **font)
# Training data
plt.scatter(X_train, y_train, color='r')
# Fitted line
plt.plot(X_train, regr.predict(X_train), color='b')
# Test data
plt.scatter(X_test, y_test, color='black')
plt.xlabel('height')
plt.ylabel('body weight')
plt.show()
```

Let's make a simple prediction: what is the weight of a person with a height of 170?

• ``np.round(regr.predict([[170]]), 1)``
`array([59.8])`: according to our prediction, a person with a height of 170 weighs 59.8 kg.
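
The prediction can also be reproduced from the fitted line itself. The sketch below reads the slope and intercept from the `coef_` and `intercept_` attributes of a fitted `LinearRegression` and evaluates the line at a height of 170. It assumes the model is fitted on all ten rows (no random split), so the numbers may differ slightly from a model trained on a random 80% subset. The names `slope`, `intercept`, `manual` and `predicted` are chosen here for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Height (feature) and weight (label) from the example data
X = np.array([152, 156, 160, 164, 168, 172, 176, 180, 184, 188]).reshape(-1, 1)
y = np.array([51, 53, 54, 55, 57, 60, 62, 65, 69, 72])

# Assumption: fit on every row instead of a random training subset
model = LinearRegression().fit(X, y)

slope = model.coef_[0]        # change in weight per unit of height
intercept = model.intercept_  # value of the line at height 0

# Evaluate the line by hand and through predict()
manual = slope * 170 + intercept
predicted = model.predict([[170]])[0]

print(np.round(manual, 1), np.round(predicted, 1))
```

Note that `predict` expects a 2-D input, which is why the single height is written as `[[170]]`.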
