The Unscented Kalman Filter offers a different way of handling nonlinear systems than the Extended Kalman Filter. Find out how it works in this post.

This is not the first time we have talked about the Kalman Filter (and it probably won't be the last); I recommend you check this and this post to understand the standard and extended versions of the algorithm and the notation we are going to use.

## The Extended Kalman Filter

As we saw in the previous post, in the framework proposed by the Extended Kalman Filter we evaluate the Jacobian matrices at a single point (the mean of the Gaussian) to locally linearize the nonlinear function that breaks the Gaussian properties of our distribution.

We can take this idea further and use multiple points to approximate the distribution instead of only one.

## Sigma Points

Now, the key question is: **how many points?** Based on the law of large numbers, the greater the number of points, the better the approximation.

This is what we call **Monte Carlo simulation**, and it can be computationally very expensive, since we do not know in advance how many points we will need to approximate the distribution. **Can we do better?** Yes.

Instead of blindly sampling a lot of random points, we can **choose a fixed set of points** based on a specific, **deterministic algorithm**.

The paper [4] defines a set of \( 2 \cdot n + 1 \) **sigma points**, where \( n \) is the number of dimensions, such that $$ \mathcal{X}_{0} = \mu \\ \mathcal{X}_{i} = \mu + \left( \sqrt{(n + \kappa) \cdot \Sigma} \right)_{i} \ \text{for } i = 1, \dots, n \\ \mathcal{X}_{i+n} = \mu - \left( \sqrt{(n + \kappa) \cdot \Sigma} \right)_{i} \ \text{for } i = 1, \dots, n, $$ where \( \left( \cdot \right)_{i} \) denotes the \( i \)-th column of the matrix square root.

Here, \( \kappa \) is a scaling factor that controls the spread of the points around the mean, \( \mu \) is the mean and \( \Sigma \) is the covariance matrix.
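The construction above can be sketched in a few lines of NumPy. This is a minimal illustration (the function name `sigma_points` is my own); it uses a Cholesky factorization as the matrix square root, which is the usual choice in practice, and takes the columns of the resulting factor:

```python
import numpy as np

def sigma_points(mu, Sigma, kappa=1.0):
    """Compute the 2n+1 Julier sigma points for a Gaussian N(mu, Sigma).

    `kappa` controls the spread of the points around the mean.
    """
    n = mu.shape[0]
    # Matrix square root of (n + kappa) * Sigma via Cholesky factorization.
    S = np.linalg.cholesky((n + kappa) * Sigma)
    points = np.empty((2 * n + 1, n))
    points[0] = mu                            # X_0 = mu
    for i in range(n):
        points[i + 1] = mu + S[:, i]          # X_i     = mu + i-th column
        points[n + i + 1] = mu - S[:, i]      # X_{n+i} = mu - i-th column
    return points

# Example: sigma points for a 2-D Gaussian.
pts = sigma_points(np.array([0.0, 1.0]), np.eye(2), kappa=1.0)
```

Note that the points come in symmetric pairs around the mean, so averaging a \( + \) point with its \( - \) partner recovers \( \mu \) exactly.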

## The Unscented Transform

The *Unscented Transform* is a method for calculating the statistics of a random variable which undergoes a nonlinear transformation [4]. With the set of **sigma points** we already defined, we just need to **propagate them through the nonlinear function** to obtain our transformation

$$ \mathcal{Y}_{i} = f(\mathcal{X}_{i}), $$

where \( f \) is a non-linear function.

With the transformed points, we can now **extract the mean \( \hat{\mu} \) and covariance** \( \hat{\Sigma} \) of the unscented transform **to approximate a new Gaussian distribution**.

In this context, the **unscented transform is weighted** in order to control the effect of each sigma point. This is useful when the problem is very nonlinear [2].

$$ \hat{\mu} = \sum_{i=0}^{2n} w_{i}^{\mu} \cdot \mathcal{Y}_{i} \\ \hat{\Sigma} = \sum_{i=0}^{2n} w_{i}^{\Sigma} \cdot (\mathcal{Y}_{i} - \hat{\mu}) \cdot (\mathcal{Y}_{i} - \hat{\mu})^{T} $$
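A small sketch of the weighted transform, using the weights proposed in [4] (\( w_0 = \kappa / (n + \kappa) \), \( w_i = 1 / (2 (n + \kappa)) \) for the rest, with the same set used for mean and covariance); the function names are illustrative:

```python
import numpy as np

def julier_weights(n, kappa=1.0):
    """Weights for the 2n+1 Julier sigma points; they sum to 1."""
    w = np.full(2 * n + 1, 1.0 / (2.0 * (n + kappa)))
    w[0] = kappa / (n + kappa)
    return w

def unscented_transform(Y, w):
    """Weighted mean and covariance of transformed sigma points Y,
    given as an array of shape (2n+1, n)."""
    mu_hat = w @ Y                          # weighted mean
    diff = Y - mu_hat
    Sigma_hat = (w[:, None] * diff).T @ diff  # weighted outer products
    return mu_hat, Sigma_hat
```

As a sanity check, feeding the transform the untransformed sigma points of a Gaussian should recover that Gaussian's own mean and covariance.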

## The Unscented Kalman Filter

With all of the above, we can reframe the Kalman Filter using the unscented transform to get the Unscented Kalman Filter Algorithm. For each time step \( k \), we have the familiar Kalman feedback loop consisting of two steps: predict and update.

**Predict Step**

$$ \mathcal{X}_{k} = \sigma(\hat{x}_{k-1}, P_{k-1}) \\ w_{k}^{x}, w_{k}^{P} = \mathcal{W}(n) \\ \mathcal{Y}_{k} = f(\mathcal{X}_{k}) \\ \hat{x}_{k}^{-} = \sum_{i=0}^{2n} w_{k, i}^{x} \cdot \mathcal{Y}_{k, i} \\ P_{k}^{-} = \sum_{i=0}^{2n} w_{k, i}^{P} \cdot (\mathcal{Y}_{k, i} - \hat{x}_{k}^{-}) \cdot (\mathcal{Y}_{k, i} - \hat{x}_{k}^{-})^{T} + Q_{k}, $$

where \( \sigma \) is an arbitrary function that computes the sigma points, \( \mathcal{W} \) is an arbitrary function that computes the weights, \( n \) is the dimensionality and \( f \) is the nonlinear transition function.
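The predict step above can be sketched as follows, assuming Julier sigma points and weights as in the previous sections (one weight set for both mean and covariance); `ukf_predict` is an illustrative name, not a library function:

```python
import numpy as np

def ukf_predict(x, P, f, Q, kappa=1.0):
    """One UKF predict step: build sigma points from the previous
    estimate (x, P), propagate them through the transition function f,
    and re-estimate the prior mean and covariance."""
    n = x.shape[0]
    # Sigma points around the previous estimate.
    S = np.linalg.cholesky((n + kappa) * P)
    X = np.vstack([x, x + S.T, x - S.T])
    # Julier weights.
    w = np.full(2 * n + 1, 1.0 / (2.0 * (n + kappa)))
    w[0] = kappa / (n + kappa)
    # Propagate through the nonlinear transition function f.
    Y = np.array([f(xi) for xi in X])
    x_pred = w @ Y
    diff = Y - x_pred
    P_pred = (w[:, None] * diff).T @ diff + Q
    return x_pred, P_pred, Y, w
```

For a linear \( f \) the step reduces to the standard Kalman prediction, i.e. the predicted covariance is simply \( P + Q \).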

**Update Step**

$$ \mathcal{Z}_{k} = h(\mathcal{Y}_{k}) \\ \mathcal{M}_{k} = \sum_{i=0}^{2n} w_{k, i}^{x} \mathcal{Z}_{k, i} \\ \mathcal{C}_{k} = \sum_{i=0}^{2n} w_{k, i}^{P} (\mathcal{Z}_{k, i} - \mathcal{M}_{k}) (\mathcal{Z}_{k, i} - \mathcal{M}_{k})^{T} + R_{k} \\ K_k = \left( \sum_{i=0}^{2n} w_{k, i}^{P} (\mathcal{Y}_{k, i} - \hat{x}_{k}^{-}) (\mathcal{Z}_{k, i} - \mathcal{M}_{k})^{T} \right) \mathcal{C}_{k}^{-1} \\ \hat{x}_{k} = \hat{x}_{k}^{-} + K_k (z_k - \mathcal{M}_{k}) \\ P_k = P_{k}^{-} - K_{k} \mathcal{C}_{k} K_{k}^{T}$$

where \( h \) is the nonlinear observation function and \( z_{k} \) is the observation at each time step \( k \).
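The update step can be sketched in the same style, taking the predicted sigma points \( \mathcal{Y}_k \) and weights from the predict step (a single weight set is assumed here for simplicity; `ukf_update` is an illustrative name):

```python
import numpy as np

def ukf_update(x_pred, P_pred, Y, w, h, z, R):
    """One UKF update step. Y holds the predicted sigma points and w
    their weights, as produced by the predict step."""
    # Transform the predicted sigma points into measurement space.
    Z = np.array([h(yi) for yi in Y])
    M = w @ Z                               # predicted measurement mean
    dz = Z - M
    C = (w[:, None] * dz).T @ dz + R        # innovation covariance
    dy = Y - x_pred
    T = (w[:, None] * dy).T @ dz            # state/measurement cross-covariance
    K = T @ np.linalg.inv(C)                # Kalman gain
    x = x_pred + K @ (z - M)                # corrected state estimate
    P = P_pred - K @ C @ K.T                # corrected covariance
    return x, P
```

With a linear (identity) observation model, this reproduces the familiar scalar Kalman update \( K = P / (P + R) \).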

## Conclusions

In this post we have briefly reviewed the concepts of sigma points and the unscented transform, and we have applied them to the Kalman Filter to understand the Unscented Kalman Filter.

As for the question of which filter to choose between the Extended and the Unscented: the Extended Kalman Filter requires a linearization that introduces approximation errors, and it uses only one point instead of a set of sigma points. However, there seems to be no evidence strongly suggesting that the Extended Kalman Filter performs worse than the Unscented one [2], so I would try both and analyze the results thoroughly before selecting one version over the other.

## References

[1] Terejanu, G. A. (2008). Extended Kalman filter tutorial. *University at Buffalo*.

[2] Labbe, R. (2014). Kalman and Bayesian filters in Python. *Chap*, *7*(246), 4.

[3] Särkkä, S. (2013). *Bayesian filtering and smoothing* (No. 3). Cambridge University Press.

[4] Julier, S. J., & Uhlmann, J. K. (1997, July). New extension of the Kalman filter to nonlinear systems. In *Signal processing, sensor fusion, and target recognition VI* (Vol. 3068, pp. 182-193). SPIE.