# Optimization of support vector machine (linear model) for machine learning

Time: 2021-9-11

Starting from a line that separates the two classes, move this middle line in parallel toward both sides until it passes through one or several training sample points.

We denote the hyperplane W^T X + b = 0 by (W, b).

Definitions:

1. Training data and labels: (X_1, y_1), (X_2, y_2), …, (X_N, y_N), where each X_i is a vector and y_i = +1 or -1.

2. A training set is linearly separable if there exists a hyperplane (W, b) such that y_i [W^T X_i + b] > 0 for every i = 1 \sim N.

Training set: \{(X_i, y_i)\}_{i = 1 \sim N}

3. Support vector: in a support vector machine, the training sample points that are closest to the hyperplane and meet certain conditions are called support vectors.

### Optimization of support vector machine

The optimization problem comes down to two parts, an objective and a constraint:

Minimize: \dfrac{1}{2} ||W||^2

Subject to: y_i [W^T X_i + b] \geq 1 \quad (i = 1 \sim N)
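As a minimal sketch of what the objective and the constraint mean in practice, the snippet below evaluates both for a candidate hyperplane on a small made-up 2D training set. The helper names `dot`, `objective`, `is_feasible` and the toy points are assumptions for illustration; they do not come from the original notes.

```python
# Toy illustration only: hypothetical helper names and made-up data.

def dot(w, x):
    """Inner product W^T X for plain Python sequences."""
    return sum(wi * xi for wi, xi in zip(w, x))

def objective(w):
    """Value of (1/2) * ||W||^2."""
    return 0.5 * dot(w, w)

def is_feasible(w, b, data):
    """True if y_i * (W^T X_i + b) >= 1 holds for every training pair."""
    return all(y * (dot(w, x) + b) >= 1 for x, y in data)

# Assumed toy training set: pairs (X_i, y_i) with y_i in {+1, -1}.
data = [((2.0, 2.0), +1), ((3.0, 3.0), +1), ((0.0, 0.0), -1), ((-1.0, 0.0), -1)]

w, b = (0.5, 0.5), -1.0
print(is_feasible(w, b, data))  # True: every constraint holds
print(objective(w))             # 0.25
```

A candidate such as W = (0.1, 0.1), b = -0.2 has a smaller objective value but violates the constraint at (2, 2), so it is not admissible. The problem asks for the shortest W among the admissible ones.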

Proof:

Fact 1: W^T X + b = 0 and aW^T X + ab = 0 describe the same plane, for any a \in R^+.

If (W, b) satisfies Formula 1, then (aW, ab) also satisfies Formula 1.
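Fact 1 can be checked numerically. The sketch below, using hypothetical helpers `decision` and `scale` and made-up points, shows that multiplying (W, b) by a positive a multiplies W^T X + b by the same a. Points on the plane stay on the plane, and every other point stays on its original side.

```python
# Illustrative check of Fact 1; names and numbers are assumptions.

def decision(w, b, x):
    """Value of W^T X + b for plain Python sequences."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def scale(w, b, a):
    """Return (aW, ab) for a positive scalar a."""
    return tuple(a * wi for wi in w), a * b

w, b = (1.0, 1.0), -2.0
a = 4.0
w_scaled, b_scaled = scale(w, b, a)

# Two points off the plane and two points on it.
points = [(2.0, 2.0), (0.0, 0.0), (1.0, 1.0), (3.0, -1.0)]
for x in points:
    # The second value is always a times the first, so the sign never changes.
    print(x, decision(w, b, x), decision(w_scaled, b_scaled, x))
```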

Fact 2: the distance from a point to a plane.

For the plane w_1 x + w_2 y + b = 0, the distance from the point (x_0, y_0) to this plane is:

d = \dfrac{|w_1 x_0 + w_2 y_0 + b|}{\sqrt{w_1^2 + w_2^2}}

The distance from a support vector X_0 to the hyperplane (W, b) can be written as

d = \dfrac{|\mathbf{W}^T \mathbf{X}_0 + b|}{||\mathbf{W}||}
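The distance formula translates directly into a few lines of code. This is a small sketch with an assumed function name `distance_to_hyperplane` and made-up numbers, written for vectors of any dimension.

```python
import math

def distance_to_hyperplane(w, b, x0):
    """Distance |W^T X_0 + b| / ||W|| from the point x0 to the hyperplane (W, b)."""
    numerator = abs(sum(wi * xi for wi, xi in zip(w, x0)) + b)
    norm_w = math.sqrt(sum(wi * wi for wi in w))
    return numerator / norm_w

# Plane 3x + 4y - 5 = 0 and the point (3, 4): |9 + 16 - 5| / 5
print(distance_to_hyperplane((3.0, 4.0), -5.0, (3.0, 4.0)))  # 4.0
```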

By Fact 1, we can rescale the hyperplane by a positive factor a without changing it:

(W, b) \xrightarrow{} (aW, ab)

Finally, choose a so that |W^T X_0 + b| = 1 at the support vector X_0 (the same holds at every support vector). The distance between the support vector and the plane is then

d = \dfrac{1}{||W||}

Therefore, maximizing the distance \dfrac{2}{||W||} is the same as minimizing \dfrac{1}{2} ||W||^2.
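The rescaling argument can be sketched in code. Given some separating hyperplane and a toy training set (both made up here), the hypothetical helper `normalize` picks a so that the closest points satisfy |W^T X + b| = 1, and `margin_width` then returns 2 / ||W||. The sketch assumes no training point lies exactly on the plane, and it only measures the margin of the given hyperplane; it does not search for the best one.

```python
import math

def decision(w, b, x):
    """Value of W^T X + b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def normalize(w, b, points):
    """Rescale (W, b) by a > 0 so that min_i |W^T X_i + b| = 1.

    Assumes no point lies exactly on the plane (otherwise a is undefined).
    """
    a = 1.0 / min(abs(decision(w, b, x)) for x in points)
    return tuple(a * wi for wi in w), a * b

def margin_width(w):
    """Width 2 / ||W|| of the margin, valid after normalization."""
    return 2.0 / math.sqrt(sum(wi * wi for wi in w))

# Assumed toy points; (2, 2) and (0, 0) are the closest to x + y - 2 = 0.
points = [(2.0, 2.0), (3.0, 3.0), (0.0, 0.0), (-1.0, 0.0)]
w, b = normalize((1.0, 1.0), -2.0, points)

print(w, b)             # (0.5, 0.5) -1.0
print(margin_width(w))  # about 2.828, i.e. 2 * sqrt(2)
```

In this example the margin width equals the distance between (0, 0) and (2, 2), the two closest points on opposite sides of the plane, which is what the formula 2 / ||W|| predicts.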