Predicting a time series is often complicated and frustrating. The series may appear completely random: a volatile sequence showing no signs of predictability. But let’s not despair just yet; there are some options available to us.
Decomposition is a technique that can be used to separate a series into components and predict each one individually. Each part can be treated in the most appropriate way, thereby improving the overall prediction.
Time series are full of patterns and relationships. Decomposition aims to identify and separate them into distinct components, each with specific properties and behaviour. It is a tool mainly used for analysing and understanding historical time series, but it can also be useful in forecasting.
Depending on the original series, a number of components can be extracted: Trend τ, Cyclical c, Seasonal s and Irregular i.
The trend part reflects the long-term movement of the series, the seasonal part represents the changes with fixed and known periodicity, and the cyclical part explains the non-periodic fluctuations. Lastly, the irregular component contains the noisy or random movements and fluctuates around zero. It is the volatile part of the series and tends to be the least predictable of all the elements.
There are different functional forms of decomposition depending on the behaviour of the original series. The most basic is additive decomposition, in which the components are simply added together to form the original series. In additive decomposition the value of an original series y for each day t is:

$$y(t) = \tau(t) + c(t) + s(t) + i(t)$$

If there’s exponential growth in the series (typical in economic series) then a multiplicative form, in which the components are multiplied rather than added, is more appropriate. Alternatively, an additive form after a log transformation (log-additive) can be used for series with growth.
The decomposition process is carried out by sequentially identifying and separating the different components. There are several ways to extract each component. Once the seasonal part is identified and removed (if it exists) the next step is to use a smoothing procedure to define the trend part. The remaining components (cyclical and irregular) are what’s left over: the difference between the original (without seasonal) and the trend.
One possibility for ‘detrending’ a series is using a centred moving average. The order (window length) determines the smoothness of the trend. This method is only really useful for historical data analysis because the estimates at the beginning and end of the sample are undefined, making forecasting impossible. Methods like X-12-ARIMA address this problem by using an ARIMA model to extend the series with forecasts and backcasts, so that the end points can also be estimated. Another alternative is STL, which uses local regression (LOESS) to smooth the series.
Lastly, it is possible to estimate stochastic trends using the Hodrick-Prescott (HP) filter. It is a model-free approach, similar to a symmetric weighted moving average but with adjustments at the ends of the sample. Let’s look at this filter in more detail and see how it can be applied to the EURUSD exchange rate.
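The centred moving average described above can be sketched in a few lines. This is a minimal illustration, assuming an odd window length; the function name `centred_moving_average` and the sample values are hypothetical, not taken from the EURUSD data. It shows why the first and last estimates are undefined: the window does not fit there.

```python
def centred_moving_average(series, order):
    """Centred moving average with an odd window length `order`.

    Positions where the full window does not fit are returned as None.
    """
    if order < 1 or order % 2 == 0:
        raise ValueError("order must be a positive odd integer")
    half = order // 2
    trend = [None] * len(series)
    for t in range(half, len(series) - half):
        window = series[t - half : t + half + 1]
        trend[t] = sum(window) / order
    return trend


# Illustrative values only.
series = [1.0, 3.0, 2.0, 5.0, 4.0, 6.0, 5.0]
trend = centred_moving_average(series, order=3)

# What is left after removing the trend (undefined at the edges).
detrended = [
    y - tr if tr is not None else None
    for y, tr in zip(series, trend)
]
```

With `order=3` one estimate is lost at each end; a longer window gives a smoother trend but loses more end points.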
The HP filter is a technique commonly used with macro-economic series that have a trend (long-term movements), business cycle and irregular parts (short-term fluctuations). It constructs the trend component by solving an optimisation problem. It aims to form the smoothest trend estimate that minimises the squared distances to the original series. In other words, it has to find equilibrium between the smoothness of the trend and its closeness to the original.
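Let y(t) for t=1,2,..,T be the logarithm of an economic time series, and suppose that it can be divided into a trend component, τ, and an irregular/cyclical component, i, such that y(t) = τ(t) + i(t). The components can be found by solving the following minimisation problem, which has a unique solution:

$$\min_{\tau} \left\{ \sum_{t=1}^{T} \bigl(y(t) - \tau(t)\bigr)^2 + \lambda \sum_{t=2}^{T-1} \Bigl[\bigl(\tau(t+1) - \tau(t)\bigr) - \bigl(\tau(t) - \tau(t-1)\bigr)\Bigr]^2 \right\}$$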
The first term in the formula penalises the irregular component and the second penalises variations in the growth rate of the trend component. This trade-off between the goodness of fit and the smoothness is controlled by lambda, a (non-negative) multiplying parameter. It affects the sensitivity of the trend to short-term fluctuations. The larger the value of lambda, the smoother the trend.
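One way to solve this minimisation problem numerically is to note that the objective is quadratic in the trend. Setting its gradient to zero gives a linear system, (I + λ D'D) τ = y, where D is the second-difference operator. The sketch below assumes NumPy and uses a dense solve for clarity; the function name `hp_filter` and the synthetic random walk are illustrative assumptions, not the author's implementation or the EURUSD data.

```python
import numpy as np


def hp_filter(y, lam):
    """Two-sided HP filter. Returns (trend, irregular) for a 1-D series y."""
    y = np.asarray(y, dtype=float)
    T = y.size
    if T < 3:
        raise ValueError("need at least 3 observations")
    # Second-difference operator, shape (T-2, T):
    # row k picks out tau[k] - 2*tau[k+1] + tau[k+2].
    D = np.diff(np.eye(T), n=2, axis=0)
    # First-order condition of the quadratic objective:
    # (I + lam * D'D) tau = y
    trend = np.linalg.solve(np.eye(T) + lam * (D.T @ D), y)
    return trend, y - trend


# Synthetic random walk standing in for a log price series (not EURUSD data).
rng = np.random.default_rng(0)
y = np.cumsum(rng.normal(0.0, 0.01, size=200))

for lam in (10, 100, 1000):
    trend, irregular = hp_filter(y, lam)
    roughness = np.sum(np.diff(trend, n=2) ** 2)  # second term, without lambda
    misfit = np.sum(irregular ** 2)               # first term
    print(f"lambda={lam:5d}  roughness={roughness:.3e}  misfit={misfit:.3e}")
```

As lambda grows, the roughness of the trend falls and the size of the irregular component rises, which is the trade-off described above. The dense solve is adequate for about a thousand observations; for much longer series a sparse or banded solver would be the more natural choice.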
Example in EURUSD
We apply the decomposition with three values of lambda to 20 years of weekly EURUSD exchange rate history. The trend series are clearly smoother with a larger lambda, which in turn implies more volatile irregular components.
What if we wanted to forecast the EURUSD spot movements?
Applying a simple autoregressive (AR) model recursively (only using past observations to choose model parameters) directly to the EURUSD series has a poor outcome. The predictions greatly underestimate the real movements. The AR model is not able to forecast the sign of the movements, and thus the errors are large (the mean absolute error (MAE) is greater than 1%).
If, however, we apply autoregression to the components and sum their results, we get a much more accurate prediction of the movement. Analysing the separation achieved with lambda=100, the sign is correctly forecasted 73% of the time and the errors are reduced (a 26% reduction in MAE relative to predicting the series directly).
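The mechanics of forecasting each component and summing the results can be sketched as follows. The article does not specify the AR order, how the model was fitted, or whether levels or changes were modelled, so everything here is an assumption: an AR(1) with intercept fitted by least squares to the first differences of each component, run on a synthetic series rather than EURUSD. The helper names `hp_filter` and `ar1_forecast` are hypothetical.

```python
import numpy as np


def hp_filter(y, lam):
    """Two-sided HP filter via the linear system (I + lam * D'D) tau = y."""
    y = np.asarray(y, dtype=float)
    D = np.diff(np.eye(y.size), n=2, axis=0)  # second-difference operator
    trend = np.linalg.solve(np.eye(y.size) + lam * (D.T @ D), y)
    return trend, y - trend


def ar1_forecast(x):
    """One-step-ahead forecast from an AR(1) with intercept, fitted by OLS."""
    x = np.asarray(x, dtype=float)
    X = np.column_stack([np.ones(x.size - 1), x[:-1]])  # regress x[t] on x[t-1]
    (c, phi), *_ = np.linalg.lstsq(X, x[1:], rcond=None)
    return c + phi * x[-1]


# Synthetic log price series (illustrative only, not EURUSD data).
rng = np.random.default_rng(0)
log_price = np.cumsum(rng.normal(0.0, 0.01, size=300))

# Decompose, then forecast the next movement of each component separately.
trend, irregular = hp_filter(log_price, lam=100)
trend_change = ar1_forecast(np.diff(trend))
irregular_change = ar1_forecast(np.diff(irregular))

# The forecast for the series is the sum of the component forecasts.
forecast_change = trend_change + irregular_change

# Baseline for comparison: the same model applied to the series directly.
direct_change = ar1_forecast(np.diff(log_price))
```

This shows a single forecast from the end of the sample. A full evaluation would repeat the fit at every forecast origin and compare sign accuracy and MAE against the realised movements.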
Impact of the Future
Now before we get too excited about this result, there is a slight issue with the HP filter method. The decomposition algorithm makes use of observations that come both before and after the current estimate. This means that a particular day’s trend estimate can change when we add more data.
If we calculate the trend recursively (each day’s value estimated using only the observations available up to that day) the result is the one-sided HP filter. Using this separation the forecasts are poor. The MAE is as large as when estimating the original series directly, and the direction is correctly forecasted only half of the time.
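A simple way to obtain a one-sided estimate is to re-run the filter on an expanding window and keep only the last trend value each time. The sketch below takes that approach, assuming NumPy and a synthetic series; `hp_filter`, `one_sided_hp_trend` and the `min_obs` parameter are illustrative names, and this is not necessarily how the article's one-sided filter was computed.

```python
import numpy as np


def hp_filter(y, lam):
    """Two-sided HP filter via the linear system (I + lam * D'D) tau = y."""
    y = np.asarray(y, dtype=float)
    D = np.diff(np.eye(y.size), n=2, axis=0)  # second-difference operator
    trend = np.linalg.solve(np.eye(y.size) + lam * (D.T @ D), y)
    return trend, y - trend


def one_sided_hp_trend(y, lam, min_obs=3):
    """Trend at each t estimated from observations up to and including t."""
    y = np.asarray(y, dtype=float)
    trend = np.full(y.size, np.nan)  # undefined until min_obs points exist
    for t in range(min_obs - 1, y.size):
        # Filter only the data available at time t; keep the final value.
        trend[t] = hp_filter(y[: t + 1], lam)[0][-1]
    return trend


# Synthetic random walk (illustrative only, not EURUSD data).
rng = np.random.default_rng(1)
y = np.cumsum(rng.normal(0.0, 0.01, size=120))

two_sided, _ = hp_filter(y, lam=100)
one_sided = one_sided_hp_trend(y, lam=100)

# How far each estimate is revised once later observations are known.
revision = two_sided - one_sided
print("largest revision:", np.nanmax(np.abs(revision)))
```

The two estimates coincide at the final observation, where no future data exist, and differ elsewhere. That difference is the information from the future that the two-sided decomposition uses.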
The outcome of the decomposition hugely affects the final predictions. The AR model parameters were not estimated using future observations, but the way the series are separated uses future movements unknown at the time of estimation.
This large impact demonstrates the benefits an accurate decomposition can have. If the movements of a time series can be correctly categorised, the results are impressive.
In our example, the problem with the one-sided filter is the difficulty of separating the volatile movements from the long-term ones. The two-sided version improves this separation and achieves better predictions, but only by using information that is not available at the time of the forecast. The key to forecasting time series is finding a suitable but realistic decomposition.