Applying Genetic Algorithms to define a Trading System

aparra | 22/12/2016

When talking about quantitative trading, there is a large number of indicators and operators we can use as a buy/sell rule. But apart from deciding which indicator to follow, the most important part is setting the correct parameters.

One method we can use to find adequate parameters, without spending a lot of time simulating a huge number of combinations, is a genetic algorithm.

A genetic algorithm is an optimization method inspired by the evolution of species and natural selection. Although it does not strictly belong to the Machine Learning field, it can be a good base to build an ML algorithm upon (as we will mention below).
The process would be the following:

  • Initialization: the algorithm starts with an initial population, which may be generated totally at random. Every possible solution, i.e. every element in that population, is called a chromosome.
  • Iterative process:
    • Crossover: the chromosomes are combined, creating a new population (the offspring). Every chromosome in this new generation is formed by mixing pieces (genes, by biological analogy) of its 'ancestors'.
    • Mutation: usually, a mutation factor is also introduced, to allow some variation in the genes apart from the combination of already existing characteristics.
    • Evaluation: last, we have to calculate the fitness value of every new individual.
    The idea in the crossover process is to create a generation bigger than the first one, because only the best qualified individuals will survive. That means we will select the chromosomes that achieve the best results to be the parents of the following generation.

[Figure: the genetic algorithm loop]

  • Stop Conditions: we can use three different criteria to stop the iterative process:
    • Reaching a fixed number of generations.
    • Getting a satisfactory fitness level.
    • Convergence of the algorithm.

I am a particular fan of the first and third criteria (I would only use the second one when execution time is essential).

Implementing a genetic algorithm is in fact an easy task; the most challenging part is transforming our problem into chromosomes. We need mutable variables that we can transform easily and that do not require a huge amount of memory, so the algorithm can be efficient.
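For instance, here is a minimal sketch of one possible encoding, where a chromosome is simply the list of an indicator's parameters (the moving-average crossover rule and its ranges below are hypothetical, chosen only for illustration):

import random

# Hypothetical example: a moving-average crossover rule with two parameters.
# A chromosome is just the list [fast_window, slow_window].
FAST_RANGE = (5, 50)
SLOW_RANGE = (50, 250)

def random_chromosome():
    # Initialization: one random candidate solution within the allowed ranges
    return [random.randint(*FAST_RANGE), random.randint(*SLOW_RANGE)]

# the initial population: a list of random candidate parameter sets
X = [random_chromosome() for _ in range(20)]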


Pseudo algorithm

The following code is an example of a simple implementation of a genetic algorithm using Python syntax, where:

max_iter: the maximum number of iterations allowed before the algorithm stops.
n_repeats: the maximum number of consecutive iterations whose best fitness is worse than the best fitness achieved so far. This is an easy way to control the convergence of the method: without it, we could keep executing the code when the algorithm is already stagnant in a local optimum.
N: the number of individuals selected in every iteration to become the parents of the following generation.

# Problem-specific helpers assumed to be defined elsewhere:
# random_pairs_from(X), crossover(pair), random_mutation(y), fitness(y)

i = 0
best_fitness = float('-inf')
counter = 0
while i < max_iter:
    if counter > n_repeats:
        break  # the best fitness has stagnated: stop
    Y, fit = [], []
    pairs = random_pairs_from(X)   # draw random parent pairs
    for pair in pairs:
        y = crossover(pair)        # combine the parents' genes
        y = random_mutation(y)     # introduce random variation
        Y.append(y)
        fit.append(fitness(y))
    # order the offspring from best to worst fitness
    order = sorted(zip(fit, Y), key=lambda t: t[0], reverse=True)
    Y = [y for _, y in order]
    fit = [f for f, _ in order]
    best_of_generation = fit[0]
    if best_of_generation < best_fitness:
        counter += 1               # no improvement this generation
    else:
        counter = 0
        X = Y[:N]                  # the N best offspring become the new parents
        best_fitness = best_of_generation
    i += 1
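For concreteness, the assumed helpers could look something like this for a chromosome that is a list of numeric parameters (all of the names and choices below are illustrative, not part of the original post):

import random

def random_pairs_from(X, n_offspring=None):
    # draw random parent pairs; more offspring than parents, so selection can act
    n_offspring = n_offspring or 2 * len(X)
    return [random.sample(X, 2) for _ in range(n_offspring)]

def crossover(pair):
    # one-point crossover: first genes from one parent, the rest from the other
    a, b = pair
    cut = random.randint(1, len(a) - 1)
    return a[:cut] + b[cut:]

def random_mutation(y, rate=0.1, scale=0.2):
    # perturb each gene with a small probability
    return [g * (1 + random.uniform(-scale, scale)) if random.random() < rate else g
            for g in y]

The fitness function is the genuinely problem-specific part: for a trading rule it could be, for example, the return of a backtest of the strategy over a historical price series.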


Improvements

It is also important to take into account that, as mentioned above, since a GA is an optimization method, it is easy to get stuck in a local optimum of the problem; when that occurs, it is really difficult to evolve and find other optima. One thing we can do to force the algorithm to 'jump' out of local optima is change the selection process.

When the algorithm is 'working well', we select the best solutions from the set formed by the last generation and the new one. If we find that the algorithm is 'working badly' (iterations go by but the best fitness is not improving), we force the jump by using the brand new set of chromosomes (the latest offspring) without also keeping the best solutions from the last iteration. Of course, the best fitness will become worse than the last one, but since we are changing the parents, the method will find a different set of solutions (better ones, we hope!).
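A minimal sketch of that selection switch, reusing the names from the pseudo-algorithm above (jump_threshold is an assumed extra parameter):

# selection with a forced 'jump' out of local optima
if counter > jump_threshold:
    X = Y[:N]    # keep only the newest offspring, discarding the old parents
    counter = 0  # the best fitness will drop, but the search moves elsewhere
else:
    # normal selection: the best chromosomes among old parents and offspring
    X = sorted(X + Y, key=fitness, reverse=True)[:N]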

[Figure: jumping out of a local optimum]


Applications

So, you already know how to program a genetic algorithm (well done!).

The next step is deciding how to transform the parameters of an indicator into chromosomes. If we settle on a general way to do that, we will be able to use the same genetic algorithm to optimize the parameters of several different indicators.
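One simple general encoding (purely an illustration, extending the earlier sketch, with hypothetical indicators and ranges) is to describe each indicator by the list of ranges of its parameters, so the same GA machinery works for all of them:

import random

# each indicator maps to the list of (low, high) ranges of its parameters
INDICATOR_SPECS = {
    "moving_average_cross": [(5, 50), (50, 250)],
    "rsi_rule": [(5, 30), (10, 40), (60, 90)],
}

def random_chromosome(spec):
    # a chromosome is one random value per parameter range
    return [random.randint(lo, hi) for lo, hi in spec]

X = [random_chromosome(INDICATOR_SPECS["moving_average_cross"]) for _ in range(20)]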

I'm really sorry, but you will have to wait for the next post for an example of the implementation and the kind of results we can get using real price series.

If you cannot wait and want to try it by yourself, I have one last challenge for you:

What are the advantages and disadvantages of using a genetic algorithm to find the optimal parameters of an indicator for a concrete time series?

The main virtue of using a GA should be finding good parameters for our trading models without spending too much time. The disadvantages are clear: the greatest risk of using an optimization method is overfitting. We have already covered some techniques to mitigate overfitting, such as adding a regularization term, in a previous post.

Another option would be to make an adaptive algorithm, answering the question: what if we design an algorithm capable of detecting when its parameters are no longer getting good results, and then looking for others that suit our constraints?
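As a closing sketch of that idea (entirely an assumption, since the adaptive design is left here as an open question), such a wrapper could monitor recent performance and re-run the GA when it decays; recent_performance, run_ga and trade are hypothetical functions:

def adaptive_trading(prices, params, window=250, threshold=0.0):
    # re-optimize whenever the current parameters stop performing
    for t in range(window, len(prices)):
        history = prices[t - window:t]
        if recent_performance(history, params) < threshold:
            params = run_ga(history)  # look for parameters that suit us again
        trade(prices[t], params)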
