Feature combination of machine learning: coding multiple non-linear laws

Time:2021-10-6

In Figures 1 and 2, we make the following assumptions:

1. Blue dots represent sick trees

2. Orange dots represent healthy trees

Feature combination of machine learning: coding multiple non-linear laws

Figure 1. Is this a linear problem?

Can you draw a line to clearly separate the sick tree from the healthy tree? Sure. It’s a linear problem. This line is not perfect. One or two sick trees may be on the “healthy” side, but the line you draw can make a good prediction

Now let’s look at the following figure:

Feature combination of machine learning: coding multiple non-linear laws

Figure 2. Is this a linear problem?

Can you draw a straight line to clearly separate the sick tree from the healthy tree?

No, you can’t. It’s a nonlinear problem. Any line you draw can’t predict the health of the tree wellFeature combination of machine learning: coding multiple non-linear laws

Figure 3. One line cannot separate two data

To solve the nonlinear problem shown in Figure 2, you can create a feature combination. Feature combination refers to the synthetic feature that multiplies two or more input features to encode the nonlinear law in the feature space. The term “cross” comes from cross product. We create a feature combination named X3 by combining X1 and X2:

x3 = x1x2

We handle this new X3 feature combination like any other feature. The linetype formula becomes:

y = b + w1x1 + w2x2 + w3x3

In other words, although W3 represents nonlinear information, you do not need to change the training method of the linear model to determine the value of W3

Types of feature combinations

We can create many different kinds of feature combinations. For example:

  1. [a x b]: feature combination formed by multiplying two features
  2. [a x B x C x D x E]: feature combination formed by multiplying the values of five features
  3. [a x a]: feature combination formed by squaring the value of a single feature

    By using the random gradient descent method, the linear model can be trained effectively. Therefore, when using the linear extended linear model, supplemented by feature combination has always been an effective method for training large-scale data sets

This work adoptsCC agreement, reprint must indicate the author and the link to this article

Hacking