Optimization of support vector machine (linear model) for machine learning

Time: 2021-09-11


Translate the separating line in the middle parallel to itself toward both sides until it passes through one or several training sample points.

We denote the hyperplane as (W, b).

Definitions:

1. Training data and labels: (x_1, y_1), (x_2, y_2), \dots, (x_N, y_N), where x_i is a vector and y_i = +1 or -1.

2. A training set \{(x_i, y_i)\}_{i = 1\sim N} is linearly separable if there exists a hyperplane (W, b) such that y_i(W^T x_i + b) > 0 for all i = 1\sim N.

3. Support vectors: in a support vector machine, the training sample points closest to the hyperplane that satisfy certain conditions are called support vectors (see the sketch after this list).
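To make the definition concrete, here is a minimal sketch, assuming scikit-learn and a made-up toy dataset (neither appears in the original article), that fits a linear SVM and prints which samples end up as support vectors:

```python
import numpy as np
from sklearn.svm import SVC

# Hypothetical toy data: two linearly separable clusters
X = np.array([[2.0, 2.0], [3.0, 3.0], [-1.0, -1.0], [-2.0, -1.0]])
y = np.array([1, 1, -1, -1])

# A large C approximates the hard-margin SVM discussed in this article
clf = SVC(kernel="linear", C=1e6).fit(X, y)

print(clf.support_vectors_)  # the sample points closest to the hyperplane
```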

Optimization of support vector machine

It comes down to the following optimization problem:

Minimize: \dfrac{1}{2}||W||^2

Subject to: y_i[W^T x_i + b] \geq 1 \quad (i = 1\sim N)
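For illustration, this problem can be handed to a general-purpose solver. A minimal sketch, assuming SciPy's SLSQP method and the same hypothetical toy data as above (dedicated QP or SVM solvers would be used in practice):

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical toy data: two linearly separable clusters
X = np.array([[2.0, 2.0], [3.0, 3.0], [-1.0, -1.0], [-2.0, -1.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])

def objective(params):
    w = params[:-1]
    return 0.5 * w @ w                       # (1/2)||W||^2

def margin_constraints(params):
    w, b = params[:-1], params[-1]
    return y * (X @ w + b) - 1.0             # y_i[W^T x_i + b] - 1 >= 0

res = minimize(objective,
               x0=np.zeros(X.shape[1] + 1),  # params = (w_1, w_2, b)
               method="SLSQP",
               constraints=[{"type": "ineq", "fun": margin_constraints}])

w, b = res.x[:-1], res.x[-1]
print("W =", w, "b =", b)
```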

Proof:

Fact 1: W^T x + b = 0 describes the same plane as aW^T x + ab = 0, for any a \in \mathbb{R}^+.

If (W, b) satisfies the plane equation, then (aW, ab) describes the same plane.

Fact 2: distance formula from point to plane

Plane: w_1 x + w_2 y + b = 0. The distance from a point (x_0, y_0) to this plane is:

d = \dfrac{|w_1 x_0 + w_2 y_0 + b|}{\sqrt{w_1^2 + w_2^2}}
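A quick numeric check of Fact 2, with made-up coefficients and a made-up point (none of these numbers are from the original article):

```python
import numpy as np

# Plane w1*x + w2*y + b = 0 with hypothetical coefficients
w1, w2, b = 3.0, 4.0, -5.0
x0, y0 = 2.0, 1.0  # a hypothetical point

d = abs(w1 * x0 + w2 * y0 + b) / np.sqrt(w1**2 + w2**2)
print(d)  # |3*2 + 4*1 - 5| / sqrt(9 + 16) = 5 / 5 = 1.0
```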

The distance from a support vector x_0 to the hyperplane (W, b) can then be written as

d = \dfrac{|\mathbf{W}^T\mathbf{x}_0 + b|}{||\mathbf{W}||}

Using Fact 1, we can rescale the parameters by a factor a:

(W, b) \xrightarrow{} (aW, ab)

Finally, choose a = \dfrac{1}{|W^T x_0 + b|}, so that on the support vector x_0 (and on every support vector) we have |W^T x_0 + b| = 1. The distance between a support vector and the plane is then

d = \dfrac{1}{||W||}

Therefore, maximizing the margin \dfrac{2}{||W||} (twice the distance d, counting both sides of the plane) is equivalent to minimizing \dfrac{1}{2}||W||^2.
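As a sanity check, under the same assumptions as the earlier sketches (scikit-learn, hypothetical toy data), one can verify numerically that the trained model's support vectors satisfy |W^T x_0 + b| = 1 and sit at distance 1/||W|| from the plane:

```python
import numpy as np
from sklearn.svm import SVC

# Same hypothetical toy data as in the earlier sketches
X = np.array([[2.0, 2.0], [3.0, 3.0], [-1.0, -1.0], [-2.0, -1.0]])
y = np.array([1, 1, -1, -1])

clf = SVC(kernel="linear", C=1e6).fit(X, y)  # large C ~ hard margin
w, b = clf.coef_[0], clf.intercept_[0]

# After the rescaling, |W^T x_0 + b| should be ~1 on every support vector
print(np.abs(clf.support_vectors_ @ w + b))

# ... and each support vector lies at distance d = 1/||W|| from the plane
print(1.0 / np.linalg.norm(w))
```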

This work adopts the CC agreement; reprints must credit the author and link to this article.
