# Machine learning (4): an intuitive understanding of SVM and code practice

Time: 2021-6-22

In the last article we introduced using logistic regression to handle classification problems. In this article we discuss a more powerful classification model. The focus is still on code practice; you will find that we have more and more ways to solve problems, and that handling them becomes simpler and simpler.

Support vector machine (SVM) is one of the most popular machine learning models. It is especially suitable for the classification of small and medium-sized complex data sets.

# 1、 What is a support vector machine

SVM searches for an optimal decision boundary among many candidates. The instances that lie on the margin are called support vectors; the model is called a support vector machine because these instances "support" (determine) the separating hyperplane.

So how do we ensure that the decision boundary we get is *optimal*?

As shown in the figure above, all three black lines segment the dataset perfectly, so using a single line gives us innumerable valid solutions. Which line, then, is the best?

As shown in the figure above, we compute the distance between the line and the closest instances of each class, so that our line stays *as far away as possible* from the data on both sides; then we obtain a unique solution. Our goal is to maximize the distance between the dotted lines in the figure. The highlighted instances are called support vectors.

This is the support vector machine.
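The intuition above can be written down as an optimization problem. For a linearly separable dataset, the standard hard-margin SVM (the textbook formulation, stated here for reference) solves:

```latex
\min_{\mathbf{w},\, b} \; \frac{1}{2}\lVert \mathbf{w} \rVert^2
\quad \text{s.t.} \quad y^{(i)}\left(\mathbf{w}^\top \mathbf{x}^{(i)} + b\right) \ge 1,
\quad i = 1, \dots, m
```

The margin (the distance between the two dotted lines) equals \(2 / \lVert \mathbf{w} \rVert\), so minimizing \(\lVert \mathbf{w} \rVert\) maximizes the margin.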

# 2、 Understanding the theory through code

## 2.1 importing data sets

```python
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
```

```python
df = pd.read_csv('https://blog.caiyongji.com/assets/mouse_viral_study.csv')
df.head()
```

```
   Med_1_mL  Med_2_mL  Virus Present
0   6.50823   8.58253              0
1   4.12612   3.07346              1
2   6.42787   6.36976              0
3   3.67295   4.90522              1
4   1.58032   2.44056              1
```

The dataset simulates a medical study in which mice infected with a virus were treated with two different drug doses, and after two weeks it was observed whether each mouse was still infected.

• features: the two drug doses, `Med_1_mL` and `Med_2_mL`
• label: whether the mouse is infected with the virus (`Virus Present`: 1 infected / 0 uninfected)

## 2.2 observation data

```python
sns.scatterplot(x='Med_1_mL', y='Med_2_mL', hue='Virus Present', data=df)
```

We use Seaborn to draw a scatter plot of the infection outcome under different doses of the two drugs.

```python
sns.pairplot(df, hue='Virus Present')
```

We use the pairplot method to draw the pairwise relationships between the features.

From the plots we can make a rough judgment: increasing the drug dose appears to prevent the mice from being infected.
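That visual impression can also be checked numerically by comparing the mean dose per outcome. The sketch below uses a small hand-made sample with the same columns as `mouse_viral_study.csv` (the values are illustrative, not the real data):

```python
import pandas as pd

# Illustrative stand-in for mouse_viral_study.csv (columns match, values invented)
df = pd.DataFrame({
    'Med_1_mL':      [6.5, 4.1, 6.4, 3.7, 1.6, 7.2, 2.9, 8.0],
    'Med_2_mL':      [8.6, 3.1, 6.4, 4.9, 2.4, 7.8, 3.3, 9.1],
    'Virus Present': [0,   1,   0,   1,   1,   0,   1,   0],
})

# Mean dose per outcome: uninfected mice (0) should show higher average doses
print(df.groupby('Virus Present')[['Med_1_mL', 'Med_2_mL']].mean())
```

If the hypothesis holds, both drug columns show a higher mean for class 0 than for class 1.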

## 2.3 using SVM to train data set

```python
# SVC: support vector classifier
from sklearn.svm import SVC

# Data preparation
y = df['Virus Present']
X = df.drop('Virus Present', axis=1)

# Define the model
model = SVC(kernel='linear', C=1000)

# Train the model
model.fit(X, y)

# Define a method that draws the SVM boundary
def plot_svm_boundary(model, X, y):
    X = X.values
    y = y.values

    # Scatter plot of the data
    plt.scatter(X[:, 0], X[:, 1], c=y, s=30, cmap='coolwarm')

    # Plot the decision function
    ax = plt.gca()
    xlim = ax.get_xlim()
    ylim = ax.get_ylim()

    # Create a grid to evaluate the model
    xx = np.linspace(xlim[0], xlim[1], 30)
    yy = np.linspace(ylim[0], ylim[1], 30)
    YY, XX = np.meshgrid(yy, xx)
    xy = np.vstack([XX.ravel(), YY.ravel()]).T
    Z = model.decision_function(xy).reshape(XX.shape)

    # Plot decision boundary and margins
    ax.contour(XX, YY, Z, colors='k', levels=[-1, 0, 1], alpha=0.5,
               linestyles=['--', '-', '--'])
    # Plot support vectors
    ax.scatter(model.support_vectors_[:, 0], model.support_vectors_[:, 1], s=100,
               linewidth=1, facecolors='none', edgecolors='k')
    plt.show()

plot_svm_boundary(model, X, y)
```

We import the `SVC` (support vector classifier) class from `sklearn`, which is an implementation of SVM.

## 2.4 SVC parameter C

The `C` parameter of the `SVC` method is the inverse of the L2 regularization strength: the larger `C` is, the weaker the regularization. It must be strictly positive.

```python
model = SVC(kernel='linear', C=0.05)
model.fit(X, y)
plot_svm_boundary(model, X, y)
```

When we reduce the value of C, we can see that the model fits the data less tightly.

## 2.5 The kernel trick

The `kernel` parameter of the `SVC` method can take the values `{'linear', 'poly', 'rbf', 'sigmoid', 'precomputed'}`. As above, we set `kernel='linear'` for linear classification. So what do we do for nonlinear classification?

### 2.5.1 Polynomial kernel

The polynomial kernel, `kernel='poly'`, in a nutshell *generates multiple features from a single feature in order to fit a curve*. For example, we can expand the correspondence from X to y as follows:

|   | X       | X^2        | X^3        | y |
|---|---------|------------|------------|---|
| 0 | 6.50823 | 6.50823**2 | 6.50823**3 | 0 |
| 1 | 4.12612 | 4.12612**2 | 4.12612**3 | 1 |
| 2 | 6.42787 | 6.42787**2 | 6.42787**3 | 0 |
| 3 | 3.67295 | 3.67295**2 | 3.67295**3 | 1 |
| 4 | 1.58032 | 1.58032**2 | 1.58032**3 | 1 |

So we can fit the data set with a curve.
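The expansion in the table can be produced explicitly with `sklearn`'s `PolynomialFeatures` (the `poly` kernel of `SVC` performs an equivalent expansion implicitly, without materializing the columns). A minimal sketch:

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

X = np.array([[6.50823], [4.12612], [6.42787]])  # one feature per row

# Expand x into [x, x^2, x^3] (include_bias=False drops the constant column)
poly = PolynomialFeatures(degree=3, include_bias=False)
X_poly = poly.fit_transform(X)
print(X_poly.shape)  # (3, 3): columns are x, x^2, x^3
```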

```python
model = SVC(kernel='poly', C=0.05, degree=5)
model.fit(X, y)
plot_svm_boundary(model, X, y)
```

We use the polynomial kernel, with `degree=5` setting the maximum degree of the polynomial to 5. We can see a certain curvature in the decision boundary.

### 2.5.2 Gaussian RBF kernel

The default kernel of the SVC method is the Gaussian `RBF`, the radial basis function. Here we need the `gamma` parameter to control the shape of the bell curve: increasing gamma makes the bell narrower, so each instance's range of influence shrinks and the decision boundary becomes more irregular; decreasing gamma makes the bell wider, so each instance's influence reaches further and the decision boundary becomes smoother.
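Concretely, the RBF kernel computes `K(a, b) = exp(-gamma * ||a - b||^2)`. A short sketch verifying the formula against sklearn's own `rbf_kernel` implementation:

```python
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel

a = np.array([[1.0, 2.0]])
b = np.array([[2.5, 0.5]])
gamma = 0.01

# Manual RBF: exp(-gamma * squared Euclidean distance)
manual = np.exp(-gamma * np.sum((a - b) ** 2))
library = rbf_kernel(a, b, gamma=gamma)[0, 0]
print(manual, library)
```

The two values agree; a larger `gamma` makes the kernel value decay faster with distance, which is the "narrower bell" described above.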

```python
model = SVC(kernel='rbf', C=1, gamma=0.01)
model.fit(X, y)
plot_svm_boundary(model, X, y)
```

## 2.6 Hyperparameter tuning technique: grid search

```python
from sklearn.model_selection import GridSearchCV

svm = SVC()
param_grid = {'C': [0.01, 0.1, 1], 'kernel': ['rbf', 'poly', 'linear', 'sigmoid'], 'gamma': [0.01, 0.1, 1]}
grid = GridSearchCV(svm, param_grid)
grid.fit(X, y)
print("grid.best_params_ = ", grid.best_params_, ", grid.best_score_ =", grid.best_score_)
```

With `GridSearchCV` we can enumerate the combinations of hyperparameters and find the optimal ones. This is brute-force tuning by sheer computation, so before applying it we must limit each parameter to a reasonable range of candidate values.

Because the dataset is so simple, we already reach 100% accuracy with the first combination tried. The output is as follows:

```
grid.best_params_ =  {'C': 0.01, 'gamma': 0.01, 'kernel': 'rbf'} , grid.best_score_ = 1.0
```
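After the search, `grid.best_estimator_` holds a model already refit with the winning parameters, so it can be used directly for prediction. A self-contained sketch on synthetic data (a stand-in for the mouse dataset):

```python
from sklearn.datasets import make_blobs
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = make_blobs(n_samples=60, centers=2, random_state=0)

param_grid = {'C': [0.01, 0.1, 1], 'kernel': ['rbf', 'linear']}
grid = GridSearchCV(SVC(), param_grid, cv=3)
grid.fit(X, y)

# best_estimator_ is refit on the full data with the best parameters
best = grid.best_estimator_
print(grid.best_params_, best.score(X, y))
```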

# Summary

When we deal with a linearly separable dataset, we can train with `SVC(kernel='linear')`; we can also use the faster `LinearSVC`, especially when the training set is very large or has many features.
When we deal with nonlinear SVM classification, we can fit the model with the Gaussian RBF kernel, the polynomial kernel, or the sigmoid kernel. And of course, we can find the optimal hyperparameters with `GridSearchCV`.
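The faster alternative mentioned above, `LinearSVC`, is based on the liblinear solver and scales better with sample count; a minimal sketch (note that it has no kernel trick and its defaults, such as squared hinge loss, differ slightly from `SVC(kernel='linear')`):

```python
from sklearn.datasets import make_blobs
from sklearn.svm import LinearSVC

# Synthetic separable data standing in for a large training set
X, y = make_blobs(n_samples=200, centers=2, random_state=7)

clf = LinearSVC(C=1.0).fit(X, y)
print(clf.score(X, y))
```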
