

# Applying Genetic Algorithms to define a Trading System

### aparra

#### 22/12/2016


When talking about quantitative trading, there is a large number of indicators and operators we can use as buy/sell rules. But beyond deciding which indicator to follow, the most important part is setting the right parameters.

So, one method we can use to find adequate parameters without spending a lot of time simulating a huge number of combinations is a genetic algorithm.

A genetic algorithm is an optimization method inspired by the evolution of species and natural selection. Although it does not strictly belong to the Machine Learning field, it can be a good base to build a ML algorithm upon (as we will mention below).
The process would be the following:

- **Initialization**: the algorithm starts with an initial population, which may be generated completely at random. Every possible solution, i.e. every element of that population, is called a chromosome.
- **Iterative process**:
  - **Crossover**: the chromosomes are combined, creating a new population, the offspring. Every chromosome in this new generation is formed by mixing pieces (genes, by biological analogy) of its 'ancestors'.
  - **Mutation**: usually a mutation factor is also introduced, to allow some variation in the genes beyond the recombination of already existing characteristics.
  - **Evaluation**: finally, we calculate the fitness value of every new individual.

  The idea behind the crossover process is to create a generation larger than the previous one, because only the best-qualified individuals survive: we select the chromosomes that achieve the best results to be the parents of the next generation.

- **Stop conditions**: we can use several different criteria to stop the iterative process:
  - Reaching a fixed number of generations.
  - Getting a satisfactory fitness level.
  - Convergence of the algorithm.

I am particularly fond of the first and third criteria (I would only use the second one when execution time is critical).

Implementing a genetic algorithm is in fact an easy task; the most challenging part is deciding how to encode our problem as chromosomes. We need variables that we can recombine and mutate easily and that do not require a huge amount of memory, so the algorithm stays efficient.

## Pseudo algorithm

The following code is an example of a simple implementation of a genetic algorithm using Python syntax, where:

- `max_iter` is the maximum number of iterations allowed before the algorithm stops.
- `n_repeats` is the maximum number of consecutive iterations whose best fitness is worse than the overall best fitness achieved so far. This is an easy way to control the convergence of the method: without it, we can waste time executing the code after the algorithm has already stagnated in a local optimum.
- `N` is the number of individuals selected in every iteration to become the parents of the following generation.

```python
i = 0
best_fitness = 0
counter = 0
while i < max_iter:
    if counter > n_repeats:
        break  # the algorithm has stagnated
    Y = []    # offspring
    fit = []  # fitness of every offspring chromosome
    pairs = random_pairs_from(X)
    for pair in pairs:
        y = crossover(pair)
        y = random_mutation(y)
        Y.append(y)
        fit.append(fitness(y))
    Y, fit = order_max_to_min(Y, fit)
    generation_best = fit[0]
    if generation_best < best_fitness:
        counter += 1   # no improvement this generation
    else:
        counter = 0
        best_fitness = generation_best
    X = Y[0:N]         # the N fittest become the next parents
    i += 1
```
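To make the sketch concrete, here is a minimal, self-contained version under illustrative assumptions: the toy fitness function (sum of the genes), one-point crossover, and gene-resampling mutation are my own choices for demonstration, not from the post.

```python
import random

N = 10          # parents kept in each generation
max_iter = 200  # maximum number of generations
n_repeats = 30  # stagnation limit before stopping

def fitness(chrom):
    # Toy objective: maximize the sum of the genes (optimum near 10.0).
    return sum(chrom)

def crossover(a, b):
    # One-point crossover: splice the two parents at a random gene.
    cut = random.randint(1, len(a) - 1)
    return a[:cut] + b[cut:]

def mutate(chrom, rate=0.1):
    # Replace each gene with a fresh random value with probability `rate`.
    return [random.random() if random.random() < rate else g for g in chrom]

# Initialization: a random population of 10-gene chromosomes.
X = [[random.random() for _ in range(10)] for _ in range(2 * N)]

best_fitness = float("-inf")
counter = 0
for i in range(max_iter):
    if counter > n_repeats:
        break  # convergence: no improvement for n_repeats generations
    # Crossover + mutation: build an offspring set larger than the parent set.
    offspring = []
    for _ in range(4 * N):
        a, b = random.sample(X, 2)
        offspring.append(mutate(crossover(a, b)))
    # Evaluation and selection: keep the N fittest individuals.
    offspring.sort(key=fitness, reverse=True)
    generation_best = fitness(offspring[0])
    if generation_best > best_fitness:
        best_fitness = generation_best
        counter = 0
    else:
        counter += 1
    X = offspring[:N]

print(f"best fitness: {best_fitness:.2f}")
```

Run repeatedly and the best fitness climbs toward the optimum; swapping in a backtest-based fitness function is all that separates this toy from a trading-parameter search.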


## Improvements

It is also important to take into account that, as mentioned above, since genetic algorithms are an optimization method, it is easy to end up in a local optimum of the problem; when that happens, it is really difficult to evolve further and find other optima. One thing we can do to force the algorithm to 'jump' out of a local optimum is to change the selection process.

When the algorithm is working well, we select the best solutions from the set formed by the last generation together with the new one. If we find that the algorithm is working badly (iterations pass but the best fitness does not improve) we force the jump by using only the brand-new set of chromosomes, the latest offspring, without also keeping the best solutions from the previous iteration. Of course, the best fitness will temporarily become worse, but since we are changing the parents, the method will explore a different set of solutions (hopefully better ones!).
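The regime switch described above can be sketched as a single selection function (the function name and toy data are illustrative, not from the post):

```python
def select_parents(parents, offspring, fitness, N, stagnant):
    # Normal regime: the offspring compete against the previous parents
    # (elitist selection). Stagnant regime: drop the old parents entirely,
    # accepting a temporarily worse best fitness so the search can leave
    # the local optimum.
    pool = offspring if stagnant else parents + offspring
    return sorted(pool, key=fitness, reverse=True)[:N]

# Toy illustration with fitness = sum of the genes:
parents = [[9], [8]]
offspring = [[7], [3], [1]]
print(select_parents(parents, offspring, sum, 2, stagnant=False))  # [[9], [8]]
print(select_parents(parents, offspring, sum, 2, stagnant=True))   # [[7], [3]]
```

In practice `stagnant` would be set from the same no-improvement counter used as a stop condition, just with a lower threshold.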

## Applications

So, you already know how to program a genetic algorithm (well done!).

The next step is deciding how to transform the parameters of an indicator into chromosomes. If we define a general way to do this, we will be able to use the same genetic algorithm to optimize the parameters of several different indicators.
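As a hypothetical example of such an encoding (the indicator, parameter names, and bounds are my own assumptions, not the post's): a moving-average crossover rule has two integer parameters, the fast and slow window lengths, and a chromosome can simply be the pair of values, each constrained to its own valid range.

```python
import random

# (min, max) bounds for the fast and slow moving-average windows.
BOUNDS = [(5, 50), (20, 200)]

def random_chromosome():
    # Initialization: draw every gene uniformly inside its own bounds.
    return [random.randint(lo, hi) for lo, hi in BOUNDS]

def mutate(chrom, rate=0.2):
    # Mutation: re-draw a gene inside its bounds with probability `rate`,
    # so mutated parameters are always valid for the indicator.
    return [random.randint(lo, hi) if random.random() < rate else gene
            for gene, (lo, hi) in zip(chrom, BOUNDS)]

fast, slow = random_chromosome()
print(f"fast window: {fast}, slow window: {slow}")
```

Because every gene carries its own bounds, the same scheme extends to any indicator whose parameters are numeric: only the `BOUNDS` list changes, not the algorithm.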

I’m really sorry, but you will have to wait for the next post for an example of the implementation and the kind of results we can get using real price series.

If you cannot wait and want to try it yourself, I have one last question for you:

What are the advantages and disadvantages of using a genetic algorithm to find the optimal parameters of an indicator for a concrete time series?

The main virtue of using a GA is finding good parameters for our trading models without spending too much time. The disadvantage is clear: the greatest risk of any optimization method is overfitting. We have already covered some techniques to mitigate overfitting by adding a regularization term in a previous post.

Another option would be to make an adaptive algorithm, which answers the question: what if we design an algorithm capable of detecting when its parameters are no longer producing good results and then looking for others that suit our constraints?