When talking about quantitative trading, there are a large number of indicators and operators we can use as a buy/sell rule. But apart from deciding what indicator to follow, the most important part is in setting the correct parameters. So, one method we can use to find adequate parameters without spending too much time in the simulation of a lot of combinations would be using a genetic algorithm.
A genetic algorithm is an optimisation method inspired by the evolution of species and natural selection. Although it does not strictly belong to the Machine Learning field, it can be a good base to build a ML algorithm upon (as we will mention below).
The process would be the following:
- Initialisation: the algorithm starts with an initial population, which may be generated totally randomly. Every possible solution, (i.e. every element in that population), is called a chromosome.
- Iterative process:
- Crossover: those chromosomes are combined, creating a new population – the offspring. Every chromosome in this new generation is formed by mixing pieces – genes by biological analogy – of their ‘ancestors’.
- Mutation: usually, a mutation factor is also introduced, to allow some variation in the genes apart from the combination of the already existing characteristics.
- Evaluation: last, we have to calculate the fitness value of every new individual.
The idea in the crossover process is to create a generation bigger than the first, because only the best qualified individuals would survive. That means we will select the chromosomes that achieve the best results to be the parents for the following ones.
- Stop Conditions: we can use two or three different criteria to stop the iteration process:
- Reaching a fixed number of generations.
- Getting a satisfactory fitness level.
- Convergence of the algorithm.
I’m a particular fan of the first and third criteria (I would only use the second when execution time is essential).
Implementing a genetic algorithm is in fact an easy task; the most challenging part is how to transform our problem into chromosomes. We need mutable variables that we can transform easily and that do not require a huge amount of memory, so the algorithm can be efficient.
The following code is an example of a simple implementation of a genetic algorithm using Python syntax, where:
max_iter: is the maximum number of iterations allowed before the algorithm stops.
n_repeats: is the maximum number of iterations allowed whose best fitness is worse than the overall best fitness achieved in general during all the process. This is an easy way to control the convergence of the method. If we don’t use this, we can spend time executing the code when the algorithm is already stagnant in local optima.
N: is the number of individuals selected in every iteration to become the parents of the following generation.
i=0 fitness = 0 counter = 0 while i < max_iter: if counter > n_repeats: STOP Y = [ ] fit = [ ] pairs = randompairsfrom (X) for j in pairs: y = crossover ( j ) y = randommutation ( y ) Y. append (y) fit.append (fitness(y)) Y, fit = ordermaxtomin (Y, fit ) bestfitness = fit[ 0 ] if bestfitness < fitness: counter = counter+1 else: counter = 0 X = Y [0: N] i=i+1 fitness = bestfitness
It’s also important to take into account that, as mentioned above, as GA are an optimisation method, it is easy to get a local optimum of the problem. When that occurs, it’s really difficult to evolve and find another optimums. One thing we could do to force the algorithm to ‘jump’ out of local optima is change the selection process.
When the algorithm is ‘working well’, we make the selection of the better solutions from the set formed by the last generation and the new one. If we find that the algorithm is ‘working badly’ -some iterations occur but the best fitness is not getting better – we force the jump by using the brand new set of chromosomes (the last offspring) without also selecting the best solutions from the last iteration. Of course, the best fitness will become worse than the last one, but as we are changing the parents, the method will find a different set of solutions (we hope better than the others!).
So, you already know how to program a genetic algorithm (well done!).
The next step is deciding how to transform parameters of an indicator into chromosomes. If we can decide a general way to do that, then we will be able to use the same genetic algorithm to optimise the parameters of several different indicators.
I’m really sorry, but you’ll have to wait for the next post to find an example of the implementation, and what type of results we can get using real price series.
If you can’t wait and want to try by yourself, here’s a starting point for you:
What are the advantages and disadvantages of using a genetic algorithm to find the optimal parameters of an indicator for a concrete time series?
The virtues of using a GA should be finding good parameters to our trading models without spending too much time. The disadvantages are clear: the greatest risk of using an optimisation method is overfitting. We have already covered some techniques to mitigate overfitting by adding a regularisation term in a previous post.
Another option would be, make an adaptive algorithm. Thus, answering the question: what if we design an algorithm capable of detecting when its parameters are no longer getting good results, and then look for others that suit our constraints?