This article comes from a WeChat official account.

First of all, here is a picture of Kalman, as a mark of respect.

# Kalman filter

- English name: Kalman filter
- This article covers a simple, one-state filter
- The Kalman filter is often used in control systems and robotics, but here we mainly explain how to use it in AI and big-data analysis and prediction

# Why use Kalman filter

Suppose we have data for 100 time points, obtained by observation at each of those points.

There are two ways to obtain the data at each time point:

- The first is observation, but the measurement is not necessarily accurate; it may, for example, be limited by the accuracy of the measuring instrument.
- The second is to use all the data before this time point to predict the value at this time point. Of course, the predicted value is also inaccurate.
- Can we combine the two methods so that they improve each other, making the predicted values more accurate or bringing the observed values closer to the truth? This is what the Kalman filter does.

Some people will surely object: if the observed values are not necessarily accurate, how can you rely on the predicted values? (In fact, this is exactly how an interviewer at Ali challenged me. At the time I was really confused, because I had only used this method for feature construction; filters really belong to the control-systems specialty.)

【**Breaking the conceptual shackles: what does the Kalman filter do?**】

What the Kalman filter does is this: suppose we know the position of the aircraft at the previous moment and the position of the aircraft measured by the radar at the current moment. We use these two pieces of data to estimate the position of the aircraft now. In short, we know the state at the previous moment and the current measurement, and we fuse the two to find the current state.

**You must be asking:** now that we have the measurement at the current moment, isn't the current state simply the measurement? In other words, surely the position of the aircraft measured by the radar is the current position of the aircraft, so why use a Kalman filter to estimate it?

**Answer:** the position derived from the radar signal received at this moment is not necessarily the actual position of the plane. First, radar measurements contain errors. Second, think about it: the radar signal I receive is a signal I sent out that has come back. Doesn't that round trip take time? In that interval the plane may have been flying at more than twice the speed of sound, or may even have crashed; anything is possible. **That is to say, even once the measurement has been received, we still cannot be sure where the aircraft is.** So I need to estimate the current position from the position at the previous moment, and then combine that estimate with the measurement to work out where the aircraft is now. This is the job of the Kalman filter.

**Then you must ask:** how can we estimate the position of the aircraft at this moment from the position estimated at the previous moment?

**Answer:** the Kalman filter assumes that all state changes (position changes) are linear. What does linear mean? If the last position was 0.3 and the speed is 0.2, then I estimate the position at the next moment to be 0.5. That is linearity.
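The linear step described above can be written out as a tiny sketch. The function name `predict_position` and the unit time step `dt` are assumptions made for illustration only.

```python
def predict_position(position, velocity, dt=1.0):
    """Linear (constant-velocity) prediction: new position = old position + velocity * dt."""
    return position + velocity * dt


# The example from the text: position 0.3, speed 0.2, one time step.
next_position = predict_position(0.3, 0.2)
print(next_position)
```

Because the update is just a multiplication and an addition, the predicted state is a linear function of the previous state, which is the assumption the filter depends on.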

**Next you will ask:** but not all state changes are linear, are they? Wind speed, for example, is not linear.

**Answer:** congratulations, you have just come up with a new algorithm. In fact, others have already named it the extended Kalman filter. For now we are learning the Kalman filter; just remember that it assumes all changes are linear.

Now I know how to use the position of the aircraft at the previous moment to estimate its position at the current moment. I also know that I have to use the measurement received at the current moment to estimate the current position. So how do we combine the two? This comes down to a proportion: how much weight does each of the two pieces of data get? This is the core of the Kalman filter, and the algorithm adjusts the proportion dynamically. (There is a flavor of the golden mean here: the result relies neither solely on the measured data nor solely on the current position estimated from the previous position.)

# Let’s walk through the Kalman filter process

Strictly speaking, every observation should come with a deviation. For example, if the thermometer reads 26 degrees with a deviation of 0.5 degrees, the true value should lie in (25.5, 26.5), or $26 \pm 0.5$.

So we use the **predicted value** and the **observed value**, plus their **two respective deviations**, four known pieces of information in total, to infer the real, more essential value.

- Predicted value: calculated from the real value at the previous time point through a preset formula;
- Observed value: read directly from the measuring instrument;
- Deviation of the observed value: this can also be obtained directly;
- Deviation of the predicted value: calculated from the deviation of the predicted value at the previous time point through a given formula.

In the following formulas, the subscript $k$ denotes the current time point and $k-1$ the previous one. The capital letters A, B and C are constants set in advance; the capital letter H is a value that needs to be calculated.

- Predicted value: $x^{\text{predicted}}_k = A \, x^{\text{real}}_{k-1} + B \, u_{k-1}$
- Observed value: $x^{\text{observed}}_k$
- Deviation of observed value: $p^{\text{observed}}_k$
- Deviation of predicted value: $p^{\text{predicted}}_k = \sqrt{(1 - H_{k-1}) \, (p^{\text{predicted}}_{k-1})^2}$
- Kalman gain H: $H_k = \frac{(p^{\text{predicted}}_k)^2}{(p^{\text{predicted}}_k)^2 + (p^{\text{observed}}_k)^2}$
- Real value: $x^{\text{real}}_k = H_k \, x^{\text{observed}}_k + (1 - H_k) \, x^{\text{predicted}}_k$

As can be seen, the Kalman gain is the weight in a weighted average: it decides whether the observed value or the predicted value matters more. Their relative importance is determined by the size of their deviations, and the one with the smaller deviation matters more.
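To make the weighting concrete, here is a minimal sketch of the gain formula on its own. The function name `kalman_gain` and the sample deviations are illustrative assumptions, not values from the article.

```python
def kalman_gain(pred_dev, obs_dev):
    """Weight given to the observation, following the simplified gain formula."""
    return pred_dev ** 2 / (pred_dev ** 2 + obs_dev ** 2)


# Hypothetical deviations, chosen only to show the two extremes.
precise_sensor = kalman_gain(pred_dev=1.0, obs_dev=0.1)   # gain near 1: trust the observation
noisy_sensor = kalman_gain(pred_dev=1.0, obs_dev=10.0)    # gain near 0: trust the prediction
balanced = kalman_gain(pred_dev=1.0, obs_dev=1.0)         # equal deviations: equal weights
print(precise_sensor, noisy_sensor, balanced)
```

A gain close to 1 means the filtered value follows the observation; a gain close to 0 means it stays near the prediction.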

Here $u_{k-1}$ is the control signal at the previous time point. For a robot, for example, it would reflect the robot's own actions. In many cases, however, no control signal is considered. If we apply Kalman filtering to a stock-market time series, for instance, there is no control signal at all; the series simply evolves on its own.

**For example, the temperature of a room:**

Take three time points: morning, afternoon and evening (in practice the time interval should be very short; this is just an example). The observed temperature in the morning is 23 degrees, with a deviation of 0.5. Because the morning is the first time point, there is no predicted value.

In the afternoon, assuming A = 1 and B = 0, the predicted value is 23 degrees, and we assume an initial prediction deviation of 1. The observed value in the afternoon is 25 degrees, with an observation deviation of 0.5, so the Kalman gain is $H = \frac{1^2}{1^2 + 0.5^2} = 0.8$, and the real value in the afternoon is $0.8 \times 25 + (1 - 0.8) \times 23 = 24.6$.

In the evening, the predicted value is the real value from the previous time point, 24.6, and its deviation is $\sqrt{(1 - 0.8) \times 1^2} = 0.4472$. The observed value in the evening is 20 degrees, again with a deviation of 0.5, so the Kalman gain is $H = \frac{0.4472^2}{0.4472^2 + 0.5^2} = 0.4444$, and the real value at this time point is $0.4444 \times 20 + (1 - 0.4444) \times 24.6 = 22.56$.

In the end, what do we need? We need the observation deviation, 0.5, and the observations at the three time points, [23, 25, 20]; after applying the Kalman filter they become [23, 24.6, 22.56]. The effect is similar to smoothing.
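The room-temperature example can be checked with a short sketch of the simplified formulas. The function name `simple_kalman` and its parameter names are assumptions made for illustration; the control term is dropped (B = 0) and the initial prediction deviation of 1 is taken from the example.

```python
import math


def simple_kalman(observations, obs_dev, init_pred_dev=1.0, A=1.0):
    """One-state filter following the simplified formulas (no control signal)."""
    estimates = [observations[0]]   # first time point: no prediction, keep the observation
    pred_dev = init_pred_dev
    gain = None
    for z in observations[1:]:
        if gain is not None:
            # Deviation of the prediction, updated from the previous step's gain.
            pred_dev = math.sqrt((1 - gain) * pred_dev ** 2)
        x_pred = A * estimates[-1]                               # predicted value
        gain = pred_dev ** 2 / (pred_dev ** 2 + obs_dev ** 2)    # Kalman gain
        estimates.append(gain * z + (1 - gain) * x_pred)         # weighted average
    return estimates


smoothed = simple_kalman([23, 25, 20], obs_dev=0.5)
print([round(v, 2) for v in smoothed])
```

Rounded to two decimals, the afternoon and evening estimates come out as 24.6 and 22.56, matching the hand calculation.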

# How to implement it in Python?

```
from pykalman import KalmanFilter


def Kalman1D(observations, damping=1):
    """Return the smoothed time series; a larger damping smooths more."""
    observation_covariance = damping       # observation deviation
    initial_value_guess = observations[0]  # start from the first observation
    transition_matrix = 1                  # A in the formulas above
    transition_covariance = 0.1            # prediction deviation
    kf = KalmanFilter(
        initial_state_mean=initial_value_guess,
        initial_state_covariance=observation_covariance,
        observation_covariance=observation_covariance,
        transition_covariance=transition_covariance,
        transition_matrices=transition_matrix,
    )
    # smooth() returns the smoothed state means and their covariances
    pred_state, state_cov = kf.smooth(observations)
    return pred_state
```

The Kalman filter from the pykalman library is used here. The filter explained above is a simplified version, whereas the standard formulation relies on normal distributions, so the parameters here may not map one-to-one onto the formulas given above, and there is some mismatch between the two.

Here are the parameters:

- `initial_state_mean` and `initial_state_covariance`: in the formulas above, the initial value is simply the first observed value. In this method, however, the initial value is not the first observed value itself but is treated as drawn from a normal distribution whose mean is `initial_state_mean` and whose variance is `initial_state_covariance`;
- `observation_covariance` can be treated as the observation deviation;
- `transition_covariance` is the prediction deviation;
- `transition_matrices` is the capital letter A in the formulas above, which is 1 here.

# Running results

From the function above, you can see that `transition_covariance` is 0.1, i.e. the prediction deviation is 0.1. So if the observation deviation is very small, the filtered result should be very close to the observed values. Here, the observation deviation is set to 0.001:

Now suppose instead that the observation deviation is very large; you would then expect the smoothing to be very strong, and the result is as follows:
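The two cases can also be compared numerically with a small standalone sketch. This is not the pykalman smoother used above: it is only the forward pass of a standard one-state Kalman filter with A = 1, written in plain Python. The function name `scalar_kalman_filter`, the sample readings and the large variance of 100 are all illustrative assumptions.

```python
def scalar_kalman_filter(observations, obs_var, trans_var=0.1):
    """Forward pass of a standard one-state Kalman filter (A = 1, no control)."""
    mean, var = observations[0], obs_var    # start from the first observation
    filtered = [mean]
    for z in observations[1:]:
        var = var + trans_var               # predict: uncertainty grows
        gain = var / (var + obs_var)        # Kalman gain
        mean = mean + gain * (z - mean)     # update: blend prediction and observation
        var = (1 - gain) * var              # uncertainty shrinks after the update
        filtered.append(mean)
    return filtered


data = [23.0, 25.0, 20.0, 24.0, 21.0]                     # hypothetical readings
close = scalar_kalman_filter(data, obs_var=0.001)         # tiny observation variance
smooth = scalar_kalman_filter(data, obs_var=100.0)        # very large observation variance
print([round(v, 2) for v in close])
print([round(v, 2) for v in smooth])
```

With the tiny observation variance the output stays within a few hundredths of each reading, while with the large one the output varies far less than the raw data does.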

Finally, let’s compare the data from a competition before and after smoothing with the Kalman filter.