Classifying the market through decision trees
Regardless of which trading system you use, its results will vary. That is normal, of course, since each system is designed to work best under certain conditions. Unfortunately, the market does not always conform to the assumptions built into a given system. This is a problem, but it does have a solution.
Here’s a working example to highlight our approach.
We’ve divided the EURUSD series according to market type, separating the trending moments from those in which the movement is unclear: the lateral moments, also known as periods of ranged market.
Betting adequately on the trend’s direction is relatively straightforward. We could, for example, take the sign of the slope of the linear regression and, from that, distinguish the bullish/bearish moments with a high probability of success.
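As a rough illustration of the slope-sign idea, here is a minimal Python sketch. The function names and the 20-day look-back window are assumptions made for the example, not settings taken from the system described here.

```python
def regression_slope(prices):
    """Least-squares slope of prices regressed on the index 0..n-1."""
    n = len(prices)
    if n < 2:
        raise ValueError("need at least two prices")
    x_mean = (n - 1) / 2
    y_mean = sum(prices) / n
    cov = sum((i - x_mean) * (p - y_mean) for i, p in enumerate(prices))
    var = sum((i - x_mean) ** 2 for i in range(n))
    return cov / var


def trend_signal(prices, window=20):
    """Return +1 (bullish), -1 (bearish) or 0 (flat) from the slope sign.

    `window` is a hypothetical look-back length; only the last `window`
    prices are used.
    """
    slope = regression_slope(prices[-window:])
    return (slope > 0) - (slope < 0)
```

Calling `trend_signal(closes)` on a list of closing prices gives the direction to bet on; a shorter window reacts faster but is noisier.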
What isn’t easy is getting that same regression, or any other trend-following method, to work when there’s no trend (when the market is lateral).
When conditions are opposite to those in which our strategy was successful, its outcome will be the reverse. We’d lose part of what we gained during the trend, and the initially successful strategy will become mediocre. Even if not everything is lost, it would still be a disastrous result.
Hmm, it seems as though it would be beneficial to deactivate the strategy during these lateral moments. Even better, what if I create a strategy to win during the lateral markets? That would mean I always win! Wait, wait… let’s start at the beginning. How can I identify those losing moments?
The previous ‘perfect’ separation was constructed a posteriori, when we already knew what was going to happen. In reality, when we look at a concrete, local problem, the differences in price returns between trending and lateral markets are not so pronounced.
Are the features of these returns really different enough that they can be identified as they occur?
To give an answer, it’s logical to start by determining the interesting characteristics, and to check if these factors really are related to the market type.
Is today a lateral or trend market? Should I keep my system active, or should I deactivate it? The automatic construction of classification trees serves as an alternative to manual methods of knowledge extraction. I can simply let a tree make the decision for me.
First, I gather all the characteristics I need to analyse. For example: volatility, efficiency, etc. Next, I set up a training set in which I label each day as ‘TR’ (‘Trend’) or ‘RM’ (‘Ranged Market’). Usually, this set covers only part of the available history; the remainder (the test set) is used to verify how the strategy performs over the rest of the period.
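The data preparation could be sketched as follows. The exact feature definitions are assumptions: here volatility is taken as the standard deviation of daily returns, and efficiency as the net price move divided by the total path travelled. The 20-day window and the 50% split are equally hypothetical, and the ‘TR’/‘RM’ labels are supposed to be supplied by hand.

```python
from statistics import pstdev


def efficiency(prices):
    """Net move divided by path length: 1 = straight line, near 0 = noise."""
    path = sum(abs(b - a) for a, b in zip(prices, prices[1:]))
    return abs(prices[-1] - prices[0]) / path if path else 0.0


def volatility(prices):
    """Standard deviation of simple returns (an assumed definition)."""
    returns = [b / a - 1 for a, b in zip(prices, prices[1:])]
    return pstdev(returns)


def build_dataset(prices, labels, window=20):
    """One ((volatility, efficiency), label) row per day with a full window."""
    rows = []
    for t in range(window, len(prices)):
        past = prices[t - window:t + 1]  # uses data up to day t only
        rows.append(((volatility(past), efficiency(past)), labels[t]))
    return rows


def chronological_split(rows, train_fraction=0.5):
    """Earliest rows form the training set, the remainder the test set."""
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]
```

Splitting chronologically rather than at random keeps the test set strictly after the training period, which matches how the tree would be used in practice.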
Constructing a decision tree consists of ranking the relevance of every value of every characteristic with respect to the fixed classification. This determines the most appropriate criteria for assessing new points, whose classification is unknown and must be estimated. Applying the tree is extremely simple: it’s enough to evaluate the known characteristics of the current moment and follow the path proposed by the tree until reaching a terminal point, either trend or ranged market.
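To make the construction and the traversal concrete, here is a small from-scratch sketch in Python. It assumes Gini impurity as the splitting criterion and a maximum depth of 3; neither is stated above, and a real project would more likely rely on a library implementation.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0 = pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(rows):
    """Return the (feature index, threshold) that most reduces impurity."""
    best, best_score = None, gini([y for _, y in rows])
    for f in range(len(rows[0][0])):
        values = sorted({x[f] for x, _ in rows})
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2  # candidate threshold between two values
            left = [y for x, y in rows if x[f] <= thr]
            right = [y for x, y in rows if x[f] > thr]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if score < best_score:
                best, best_score = (f, thr), score
    return best  # None when no split improves on the parent node


def build_tree(rows, max_depth=3):
    """Internal nodes are (feature, threshold, left, right); leaves are labels."""
    split = best_split(rows) if max_depth > 0 else None
    if split is None:
        return Counter(y for _, y in rows).most_common(1)[0][0]
    f, thr = split
    left = [r for r in rows if r[0][f] <= thr]
    right = [r for r in rows if r[0][f] > thr]
    return (f, thr, build_tree(left, max_depth - 1), build_tree(right, max_depth - 1))


def classify(tree, features):
    """Follow the path proposed by the tree until a terminal label is reached."""
    while isinstance(tree, tuple):
        f, thr, left, right = tree
        tree = left if features[f] <= thr else right
    return tree
```

Each row is a `(features, label)` pair, for example `((volatility, efficiency), "TR")`. After `tree = build_tree(training_rows)`, a new day is assessed with `classify(tree, todays_features)`.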
Each terminal node is assigned the probability of its corresponding class outcome: among the patterns from the training set (in this example, data up to 1997) that reach the node, we note the percentage that actually belong to its majority class.
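The figure attached to a terminal node is simple arithmetic, sketched here in Python. The helper name `leaf_summary` is hypothetical; it takes the training labels that ended up in one leaf.

```python
from collections import Counter


def leaf_summary(labels_at_leaf):
    """Return the majority class of a leaf and its share of the leaf's cases."""
    counts = Counter(labels_at_leaf)
    label, hits = counts.most_common(1)[0]
    return label, hits / len(labels_at_leaf)
```

For a leaf reached by three ‘TR’ days and one ‘RM’ day, `leaf_summary(["TR", "TR", "TR", "RM"])` returns `("TR", 0.75)`: the leaf predicts trend, and 75% of the training cases that reached it were indeed trend days.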
Using this percentage, we can make a gradual market division based on the estimation of each moment’s tendency:
This tool can be very useful as a complement to the initial system (the linear regression), as it can help to reduce the losses incurred in unfavourable situations, that is, those for which the system was not specifically designed.
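One possible way to combine the two pieces is to let the tree switch the trend-following signal on and off. The sketch below is only an illustration: the function names and the 0.6 confidence threshold are assumptions, and `signal` stands for the +1/-1 output of the regression-based system.

```python
def trend_probability(leaf_label, leaf_share):
    """Turn a leaf's (majority class, share) into an estimated P(trend)."""
    return leaf_share if leaf_label == "TR" else 1.0 - leaf_share


def filtered_position(signal, p_trend, threshold=0.6):
    """Keep the trend-following signal only when a trend is likely enough."""
    return signal if p_trend >= threshold else 0
```

With a leaf that is 80% ‘TR’, a bearish signal of -1 is kept; with a leaf that is 75% ‘RM’, the estimated trend probability is 0.25 and the system stays out of the market.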
You can find more theory and tools in the following post: Classification trees in MATLAB.