

Beyond linear: the Extended Kalman Filter

Alejandro Pérez

21/07/2021


Although linear systems are convenient at many levels, many real-world applications cannot rely on the linearity assumption. The Extended Kalman Filter deals with these nonlinearities in a simple way. Learn how in this post.

Introduction

In the 1960s, Rudolf E. Kálmán codeveloped one of the most important and widely used algorithms of the 20th century: the Kalman Filter [6][7]. The Kalman Filter is an optimal state estimation algorithm that can recover information about the state of a system from noisy and/or incomplete measurements. It is used in electronics, avionics, finance and numerous other fields.

Brief Review of the Kalman Filter

We’ve already talked about the Kalman Filter in this blog, but I’d like to briefly revisit the theory in order to set up the notation and the problem we are facing.

Given a linear dynamic system governed by the following equations:

$$
x_{k} = A_{k} x_{k-1} + B_{k} u_{k} + q_{k}
$$

$$
z_{k} = H_{k} x_{k} + r_{k}
$$

with \(q_{k} \sim \mathcal{N}(0, Q)\) and \(r_{k} \sim \mathcal{N}(0, R)\), the Kalman Filter estimates the state using a form of feedback-control loop with two stages: predict and update.

Predict

In the predict stage, the Kalman Filter computes the prior state and covariance estimates from the previous timestep's posterior.

$$
\hat{x}_{k}^{-} = A_{k} \hat{x}_{k-1} + B_{k} u_{k}
$$

$$
P_{k}^{-} = A_{k} P_{k-1} A_{k}^{T} + Q_{k}
$$
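A minimal NumPy sketch of these two formulas (the function and variable names are my own, not tied to any particular library):

```python
import numpy as np

def kf_predict(x_post, P_post, A, B, u, Q):
    """Prior state and covariance at time k from the posterior at k-1."""
    x_prior = A @ x_post + B @ u       # x_k^- = A_k x_{k-1} + B_k u_k
    P_prior = A @ P_post @ A.T + Q     # P_k^- = A_k P_{k-1} A_k^T + Q_k
    return x_prior, P_prior
```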

Update

In the update stage, the Kalman Filter combines the prediction with the observed variable \(z_{k}\), regulating the posterior estimates using the Kalman gain \(K_{k}\).

$$
K_{k} = P_{k}^{-} H_{k}^{T} (H_{k} P_{k}^{-} H_{k}^{T} + R_{k})^{-1}
$$

$$
\hat{x}_{k} = \hat{x}_{k}^{-} + K_{k} (z_{k} - H_{k} \hat{x}_{k}^{-})
$$

$$
P_{k} = (I - K_{k} H_{k}) P_{k}^{-}
$$
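And the matching update stage, continuing the same sketch:

```python
def kf_update(x_prior, P_prior, z, H, R):
    """Fold the measurement z_k into the prior estimate."""
    S = H @ P_prior @ H.T + R                 # innovation covariance
    K = P_prior @ H.T @ np.linalg.inv(S)      # Kalman gain K_k
    x_post = x_prior + K @ (z - H @ x_prior)  # correct the prior with the innovation
    P_post = (np.eye(P_prior.shape[0]) - K @ H) @ P_prior
    return x_post, P_post
```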

For a detailed guide on what each matrix represents and what its dimensions are, see [1], [4] and [5].

If the system we are modeling satisfies the Gaussian and linearity assumptions, the Kalman Filter gives us the optimal estimate: we do not need to search for other algorithms, since no alternative can output a better one.

The Extended Kalman Filter

Why do we need nonlinear estimates?

It is actually pretty easy to violate the linearity assumption. For instance, a vehicle turning in the plane has an angular component (sine and cosine functions) that makes the whole system nonlinear, and in that case the Kalman Filter has no way to produce the optimal state estimate. Let's see some examples.

If we pass a Gaussian distribution through a linear function, we can see how the distribution stays Gaussian [2][5].

Figure 1: Gaussian distribution passed through a linear function.

However, a nonlinear function does not preserve the Gaussian shape [2][5].

Figure 2: Gaussian distribution passed through a nonlinear function.

Given that the distribution does not stay Gaussian, the Kalman filter will not give us the best possible estimate.
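A quick way to check this numerically (a toy experiment of mine, not the source of the figures above): push Gaussian samples through an affine map and through \(\sin\), and compare a Gaussianity diagnostic such as excess kurtosis, which is approximately zero for a Gaussian.

```python
import numpy as np

def excess_kurtosis(s):
    """Excess kurtosis of a sample; ~0 if the sample is Gaussian."""
    s = s - s.mean()
    return (s**4).mean() / (s**2).mean() ** 2 - 3.0

rng = np.random.default_rng(42)
x = rng.normal(size=100_000)

print(excess_kurtosis(2.0 * x + 1.0))  # affine map: ~0, still Gaussian
print(excess_kurtosis(np.sin(x)))      # nonlinear map: ~-1.3, no longer Gaussian
```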

We need to change the original scheme to account for these nonlinearities: the system is now governed by the following nonlinear stochastic difference equations:

$$
x_{k} = f(x_{k-1}, u_{k}) + w_{k}
$$
$$
z_{k} = h(x_{k}) + v_{k}
$$

Here, \(w_{k}\) and \(v_{k}\) represent the process and measurement noises, with \(w_{k} \sim \mathcal{N}(0, Q)\) and \(v_{k} \sim \mathcal{N}(0, R)\), and \(f\) and \(h\) are nonlinear functions.
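For concreteness, here is one possible choice of \(f\) and \(h\) for the turning-vehicle example mentioned earlier; the state layout, the inputs and the timestep `DT` are my own assumptions, not the post's:

```python
import numpy as np

DT = 0.1  # timestep in seconds (assumed)

def f(x, u):
    """Nonlinear motion model: state x = (px, py, theta), input u = (v, omega)."""
    px, py, theta = x
    v, omega = u
    return np.array([
        px + v * np.cos(theta) * DT,  # the sine/cosine terms break linearity
        py + v * np.sin(theta) * DT,
        theta + omega * DT,
    ])

def h(x):
    """Measurement model: we observe the position but not the heading."""
    return x[:2]
```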

The question that remains is…

How can we linearize a function?

We’ve seen that nonlinear functions have some ugly properties that we do not want in our filter. But how can we linearize a given differentiable function? The answer lies in the Jacobian matrix.

The Jacobian matrix is the matrix of first-order partial derivatives of a vector-valued function.

$$
J = \begin{bmatrix}
\dfrac{\partial \mathbf{f}}{\partial x_1} & \cdots & \dfrac{\partial \mathbf{f}}{\partial x_n}
\end{bmatrix}
$$

By computing the Jacobian matrix at each time step \(k\), we can locally linearize our nonlinear function. It is a process somewhat akin to a Taylor series [3].
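For the hypothetical motion model sketched above, the Jacobians can be written down by hand:

```python
def jacobian_f(x, u):
    """Jacobian of the motion model f with respect to the state, at (x, u)."""
    _, _, theta = x
    v, _ = u
    return np.array([
        [1.0, 0.0, -v * np.sin(theta) * DT],
        [0.0, 1.0,  v * np.cos(theta) * DT],
        [0.0, 0.0,  1.0],
    ])

def jacobian_h(x):
    """Jacobian of h: position is observed directly, heading is not."""
    return np.array([
        [1.0, 0.0, 0.0],
        [0.0, 1.0, 0.0],
    ])
```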

Figure 3: Derivative of sin(x) at (0,0).

Once we have the tangent line, we can use it to transform our distribution without distorting the nice Gaussian shape [4].

Figure 4: Using the tangent line computed with the derivative, we keep the Gaussian shape (distribution in green).

Reframing the Kalman Filter

We are now able to extend (pun totally intended) the original Kalman Filter by modifying its equations. The feedback-control loop we saw previously now changes to:

Predict

$$
\hat{x}_{k}^{-} = f(\hat{x}_{k-1}, u_{k})
$$

$$
P_{k}^{-} = A_{k} P_{k-1} A_{k}^{T} + Q_{k}
$$

Update

$$
K_{k} = P_{k}^{-} H_{k}^{T} (H_{k} P_{k}^{-} H_{k}^{T} + R_{k})^{-1}
$$

$$
\hat{x}_{k} = \hat{x}_{k}^{-} + K_{k} (z_{k} - h(\hat{x}_{k}^{-}))
$$

$$
P_{k} = (I - K_{k} H_{k}) P_{k}^{-}
$$

Here, \(A_{k}\) is the Jacobian matrix of partial derivatives of \(f\) with respect to \(x\), and \(H_{k}\) is the Jacobian matrix of partial derivatives of \(h\) with respect to \(x\), both evaluated at the current estimate.

This new version of the filter is known as the Extended Kalman Filter (EKF).
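Putting the pieces together, here is a minimal EKF step; this is a sketch reusing the hypothetical `f`, `h`, `jacobian_f` and `jacobian_h` from earlier, not a production implementation:

```python
def ekf_step(x_post, P_post, u, z, Q, R):
    """One EKF iteration: linearize locally, then run the usual KF algebra."""
    # Predict: propagate the state through the nonlinear model,
    # but use the local linearization A_k for the covariance.
    x_prior = f(x_post, u)
    A = jacobian_f(x_post, u)
    P_prior = A @ P_post @ A.T + Q

    # Update: the innovation uses h itself; the gain uses its Jacobian H_k.
    H = jacobian_h(x_prior)
    S = H @ P_prior @ H.T + R
    K = P_prior @ H.T @ np.linalg.inv(S)
    x_post = x_prior + K @ (z - h(x_prior))
    P_post = (np.eye(len(x_post)) - K @ H) @ P_prior
    return x_post, P_post
```

Note how the nonlinear functions propagate the state and compute the innovation, while their Jacobians only enter the covariance and gain algebra.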

EKF is not optimal (mostly)

An important thing to notice is that the Extended Kalman Filter does not guarantee an optimal estimate; it only promises a better estimate than the vanilla Kalman Filter when the system is nonlinear. If the system is linear with Gaussian noises and the filter is configured identically to its standard counterpart, then the EKF is optimal.

This happens because the EKF approximates state transitions and measurements using linear Taylor expansions, making the goodness of the approximation dependent on the degree of nonlinearity of the functions being approximated and on the uncertainty of the Gaussian belief [2][5].

Conclusions

In this post we’ve seen how Kalman Filters work and their two main limitations: the assumptions that the equations governing a dynamic system are linear and that the noises are drawn from zero-mean Gaussian distributions.

In order to tackle the nonlinearity problem, we’ve reviewed the extended version of the filter, which uses the Jacobian matrix to perform a local linearization.

Though the Extended Kalman Filter does not guarantee an optimal estimate, it is certainly better than the standard version for nonlinear systems.

References

[1] Terejanu, G. A. (2008). Extended Kalman Filter Tutorial. University at Buffalo.

[2] Stachniss, C. Kalman Filter & EKF (lecture).

[3] Welch, G., & Bishop, G. (1995). An Introduction to the Kalman Filter. University of North Carolina at Chapel Hill.

[4] Rhudy, M. B., Salguero, R. A., & Holappa, K. (2017). A Kalman filtering tutorial for undergraduate students. International Journal of Computer Science & Engineering Survey, 8(1), 1-9.

[5] Thrun, S., Burgard, W., & Fox, D. (2005). Probabilistic Robotics. MIT Press.

[6] Kalman, R. E., & Bucy, R. S. (1961). New results in linear filtering and prediction theory. Journal of Basic Engineering, 83(1), 95-108.

[7] Kalman, R. E. (1960). A new approach to linear filtering and prediction problems. Journal of Basic Engineering, 82(1), 35-45.
