Sometimes it is too costly, even impossible, to explicitly optimize an equation. Today we will see how to optimize implicitly using genetic algorithms.
Sometimes, in finance as in other areas of life, a problem presents itself in the clearest of terms: an explicit equation that we must optimize. In these cases, we must either maximize or minimize its value, given some variables that we can set. But the number of those variables can be very large, and they can interact in complex ways, making it unfeasible to solve explicitly for the best solution. Fortunately, we have a very powerful tool for optimizing these equations: genetic algorithms.
We have already seen some approaches to solving these complex problems, most notably neural networks. But sometimes this approach may not be suitable for our needs: these algorithms are prone to over-fitting, specialized hardware is usually needed for training, and reaching the global optimum might depend on a specific initialization of the parameters (the lottery ticket hypothesis).
There is another approach, which we saw an overview of some time ago. This approach, called genetic algorithms, combines the power of the “gradient descent” used in neural networks (albeit implicitly, instead of explicitly) with the use of multiple agents with different initializations.
First, let’s define the different concepts used in genetic algorithms.
- Agent: individual solution to the problem.
- Population: set of agents. At the beginning, we will start with \( n \) agents, which will be our initial population.
- Fitness function: a mathematical expression which tells us how good or bad an agent’s solution is, so that we can compare our agents to each other.
- Selection: after comparing the fitness of our agents, we will keep the \( m \) best agents.
- Crossover: after selecting the best agents, we want to reach \( n \) agents again. To achieve this, we generate new agents by combining the surviving ones. We consider each of the variables/parameters as “genes”, and each offspring inherits a mixture of the “genes” from two “parent” agents. This crossover can have special restrictions, as we will see in our example.
- Mutation: after inheriting from two parents, we randomly produce mutations in some of the parameters, to broaden our search space and produce new solutions.
- Stopping condition: we must choose a condition which, when reached, stops the program. If we know the optimum value of our equation, the condition will be that one of the agents’ fitness reaches that value. If we do not know the optimum value, we can iterate for a fixed number of loops, until the fitness of the agents stops improving, or until whatever other condition seems appropriate.
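The crossover and mutation operators defined above can be sketched for generic real-valued agents. This is only an illustrative sketch (the function names and rates are my own choices, not fixed by the algorithm):

```python
import numpy as np

rng = np.random.default_rng(0)

def crossover(parent1, parent2):
    """Uniform crossover: each gene is inherited from a randomly chosen parent."""
    mask = rng.random(parent1.shape) < 0.5
    return np.where(mask, parent1, parent2)

def mutate(agent, rate=0.1, scale=0.05):
    """With probability `rate`, perturb each gene with Gaussian noise."""
    mask = rng.random(agent.shape) < rate
    return agent + mask * rng.normal(0, scale, agent.shape)

child = crossover(np.zeros(5), np.ones(5))  # each gene is a 0 or a 1
child = mutate(child)                       # some genes get a small perturbation
```

Many other operators exist (single-point crossover, swap mutation, etc.); which one works best depends on the problem.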
So our basic algorithm (in pseudo-code) will be:
```
create initial population
measure fitness of agents
stop condition = false
while stop condition = false:
    select best agents
    create new agents by crossover
    apply mutation to new agents
    measure fitness of agents
    check stop condition
```
When we exit the loop, we should have the optimum agent (in case we know the optimum value for the equation and used it as the stop condition), or the best agent yet (if we did not know the optimum value).
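To make the loop concrete, here is a minimal sketch of the whole algorithm on a toy problem: maximizing \( -(x-3)^2 \), whose optimum is \( x = 3 \). The averaging crossover and Gaussian mutation are arbitrary choices for illustration:

```python
import numpy as np

rng = np.random.default_rng(42)

def fitness(pop):
    # Toy objective: maximize -(x - 3)^2, optimum at x = 3
    return -(pop - 3.0) ** 2

n, m = 20, 5
population = rng.uniform(-10, 10, size=n)  # initial population

for _ in range(200):                       # fixed number of rounds as stop condition
    fit = fitness(population)
    best = population[np.argsort(fit)[-m:]]      # selection: keep the m best
    children = []
    while len(children) < n - m:
        p1, p2 = rng.choice(best, size=2)        # pick two parents
        child = (p1 + p2) / 2                    # crossover: average the genes
        child += rng.normal(0, 0.1)              # mutation: small Gaussian noise
        children.append(child)
    population = np.concatenate([best, np.array(children)])

print(population[fitness(population).argmax()])  # close to 3
```

Note that the best agents survive unchanged (elitism), so the best fitness never decreases between rounds.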
An example in finance
That is all fine and well, but let us see an example to fix this knowledge. Let’s say we could have invested from January 1, 2020 until September 1, 2021 in any of the FAANG (Facebook, Apple, Amazon, Netflix, and Google) stocks. For simplicity’s sake, we will invest a percentage of our portfolio in each asset, and rebalance every day until the end of the period, with no fees applied. Which asset allocation would have given us the best Sharpe ratio? Let’s optimize using genetic algorithms!
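As a refresher, the (annualized) Sharpe ratio of a daily return series is its mean divided by its standard deviation, scaled by \( \sqrt{252} \). A sketch of how the fitness of one allocation could be computed — the names `sharpe_ratio` and `portfolio_returns` are mine, and I assume a zero risk-free rate:

```python
import numpy as np

def sharpe_ratio(daily_returns, periods_per_year=252):
    """Annualized Sharpe ratio, assuming a zero risk-free rate."""
    return daily_returns.mean() / daily_returns.std() * np.sqrt(periods_per_year)

def portfolio_returns(asset_returns, weights):
    """Daily returns of a portfolio rebalanced to `weights` every day."""
    return asset_returns @ weights

# Hypothetical daily returns for 5 assets over 10 days (stand-in for real data)
returns = np.random.default_rng(1).normal(0.001, 0.02, size=(10, 5))
weights = np.full(5, 0.2)
print(sharpe_ratio(portfolio_returns(returns, weights)))
```

With real price data, the asset returns matrix would come from the FAANG price history over the chosen period.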
Before writing any code, we must make some considerations. First of all, our agents’ parameters represent percentages, so every time we create a new agent, we must normalize it so that the percentages add up to one. In this example we are simulating long-only positions, so all weights must be greater than or equal to zero, and smaller than or equal to one (take special care when applying the mutations!). Let’s see the –incomplete– Python code I have written:
```python
...
n = 20
m = 5
# Random initial allocations, normalized so each row sums to one
population = np.random.uniform(0, 10, size=(n, 5))
population = population / population.sum(axis=1)[:, np.newaxis]
fitness = fitness_pop(population)
stop_condition = False
rounds = 0
while stop_condition is False:
    # Selection: keep the m agents with the highest fitness
    best = fitness.argpartition(-m)[-m:]
    new_population = population[best]
    # Crossover and mutation until we are back to n agents
    while new_population.shape[0] < n:
        parent1, parent2 = choose_parents(best)
        new_agent = choose_genes_from_parents(population, parent1, parent2)
        new_agent = mutate_agent(new_agent)
        new_agent = new_agent / new_agent.sum()
        new_population = np.append(new_population, new_agent.reshape(1, -1), axis=0)
    population = new_population
    fitness = fitness_pop(population)
    rounds += 1
    if rounds == 1000:
        stop_condition = True
print(population[fitness.argmax()])
```
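To illustrate the “special care” mentioned above, here is one possible way (among many) to write a mutation that respects the long-only constraints; the function name matches `mutate_agent` from the code, but the noise model and rates are my own hypothetical choices:

```python
import numpy as np

rng = np.random.default_rng(0)

def mutate_agent(agent, rate=0.2, scale=0.1):
    """One possible mutation for long-only weights: add Gaussian noise to some
    genes, then clip to [0, 1] and renormalize so the weights sum to one."""
    mask = rng.random(agent.shape) < rate
    mutated = agent + mask * rng.normal(0, scale, agent.shape)
    mutated = np.clip(mutated, 0, 1)   # enforce 0 <= w <= 1
    return mutated / mutated.sum()     # re-normalize to sum to 1

w = mutate_agent(np.full(5, 0.2))      # still a valid long-only allocation
```

Clipping and renormalizing is a simple fix, but it biases the search slightly; other schemes (e.g. resampling invalid mutations) are also possible.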
My best agent, which has a Sharpe ratio of 3.51, turns out to be:
- Apple: 10.5%
- Amazon: 11.75%
- Google: 77.75%
Some work is left to the reader:
- Can you write the missing functions? (fitness_pop, choose_parents, choose_genes_from_parents, and mutate_agent)
- Can you come up with a better stopping condition?
- Can you guess what happens as you make \( m \) and \( n \) bigger or smaller? What if your investment universe is bigger? Could you implement it?
Of course, the example we have seen is very simple, and not very useful in practice. It would be more interesting to apply the same technique to a decision-making process; maybe next time we will do exactly that. Until then, I hope you found this post interesting. Please leave a comment sharing your thoughts, and thank you for reading!