In a previous post, we explained how self-organizing maps work, with a very simple example. In this post, we will explain how to implement self-organizing maps for an investment strategy.

Last time, we gave a simple example with a map of colors to explain in detail how self-organizing maps (SOM) work. As we saw, similar colors tend to stick together. As such, we may use this algorithm for some kind of clustering. In the conclusions, we speculated we could use it in finance, with **fundamental data** or with **price** behavior. In this post, we will use a SOM with fundamental data.

## Review: the basic algorithm

First, let’s review the basics of how the SOM algorithm works:

- Randomly initialize the neurons of the map.
- Repeat until convergence or until the maximum number of steps is reached:
  - Shuffle the examples.
  - For each example:
    - Find the **best matching unit** (BMU).
    - Update the BMU's and its neighbors' values.

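The loop above can be sketched in a few lines of Python. This is a minimal illustration, not the exact implementation from the series: the grid size, the Gaussian neighborhood, and the exponential decay schedules are all assumptions made for the sketch.

```python
import numpy as np

def train_som(data, rows=10, cols=10, epochs=20,
              lr0=0.5, sigma0=3.0, seed=0):
    """Train a small rectangular SOM on `data` (n_samples x n_features)."""
    rng = np.random.default_rng(seed)
    # Randomly initialize the neurons of the map.
    weights = rng.random((rows, cols, data.shape[1]))
    # Grid coordinates of each neuron, used for neighborhood distances.
    grid = np.stack(np.meshgrid(np.arange(rows), np.arange(cols),
                                indexing="ij"), axis=-1)
    for epoch in range(epochs):
        lr = lr0 * np.exp(-epoch / epochs)        # decaying learning rate
        sigma = sigma0 * np.exp(-epoch / epochs)  # shrinking neighborhood
        for x in rng.permutation(data):           # shuffle the examples
            # Find the best matching unit (BMU): the closest neuron to x.
            dists = np.linalg.norm(weights - x, axis=-1)
            bmu = np.unravel_index(dists.argmin(), dists.shape)
            # Update the BMU and its neighbors, weighted by grid distance.
            grid_d2 = ((grid - np.array(bmu)) ** 2).sum(axis=-1)
            h = np.exp(-grid_d2 / (2 * sigma ** 2))[..., None]
            weights += lr * h * (x - weights)
    return weights

def bmu_of(weights, x):
    """Grid coordinates of the BMU for a single example x."""
    d = np.linalg.norm(weights - x, axis=-1)
    return np.unravel_index(d.argmin(), d.shape)
```

With three features per example, this is exactly the RGB setup from the previous post; later we swap the colors for financial ratios.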
Last time, we used three values for each neuron and for each example, representing RGB values. We did this because it made the results easy to visualize. At the end, we had a beautiful, organized map in which similar colors clumped close together. The problem is that we were lacking a real application. This time around, we will finally see how to use self-organizing maps for an investment strategy.

## Real example

First, we will start with a basic **momentum strategy**, with the S&P 500 as the investment universe. Every day, we will rank the available assets by their 6-month cumulative return, and invest in the 10% with the best result. We will not take into account any transaction costs, or any other fees.
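The baseline ranking step might look like the following sketch. It is a simplified illustration, not the post's actual code: the function name, the price `DataFrame` layout, and the choice of 126 trading days as "6 months" are all assumptions.

```python
import pandas as pd

def momentum_portfolio(prices, lookback=126, top_frac=0.10):
    """Rank assets by trailing cumulative return and keep the top decile.

    `prices`: DataFrame of daily close prices (dates x tickers).
    `lookback`: number of trading days (~6 months by default).
    """
    # 6-month cumulative return for each asset as of the last row.
    cum_ret = prices.iloc[-1] / prices.iloc[-1 - lookback] - 1.0
    ranked = cum_ret.sort_values(ascending=False)
    n_select = max(1, int(len(ranked) * top_frac))
    return ranked.head(n_select).index.tolist()
```

In a full backtest this selection would be rerun every day over the S&P 500 universe; transaction costs are ignored, as stated above.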

As we can see, the results are relatively poor. Up until the beginning of 2019, the strategy keeps up with the benchmark, but afterwards it starts underperforming. Over the whole period, the strategy has an 8.34% annualized return and an annualized volatility of 19.90%, compared to the benchmark's 11.36% return and 18.91% volatility. Nevertheless, this baseline is the least important part: what we really want to test is whether a SOM can improve these results.

What will our algorithm be? It depends on what we want to achieve. We think diversification is always a worthy goal, and we want to use financial ratios. So we have settled on the following: we will use a SOM to cluster together similar assets in terms of their Debt to Equity, Return on Equity, and Price to Earnings. Once we have trained our SOM, we will pick assets one by one from our original ranking (remember, a 6-month momentum ranking). Every time we do so, we check which other assets fall close to it on the map (as determined by their respective BMUs) and penalize them in the ranking.
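Mapping assets onto the trained grid might look like this sketch. It assumes a trained `weights` array (as produced by any SOM trainer) and a per-asset vector of the three ratios; all names are illustrative, not the post's actual code.

```python
import numpy as np

def assign_bmus(weights, features):
    """Map each asset's (D/E, RoE, P/E) vector to its BMU grid coordinates.

    `weights`: trained SOM array of shape (rows, cols, 3).
    `features`: dict of ticker -> 3-element ratio vector.
    """
    bmus = {}
    for ticker, x in features.items():
        d = np.linalg.norm(weights - np.asarray(x), axis=-1)
        bmus[ticker] = np.unravel_index(d.argmin(), d.shape)
    return bmus

def map_distance(bmus, a, b):
    """Euclidean distance between two assets' BMUs on the grid."""
    (r1, c1), (r2, c2) = bmus[a], bmus[b]
    return float(np.hypot(r1 - r2, c1 - c2))
```

Assets whose BMUs lie close together on the grid have similar fundamentals, which is exactly what the penalty below exploits.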

In this case, we have two parameters to fix or study. First, we have to decide the **distance limit**: the maximum distance (between corresponding BMUs) within which we want to penalize other assets. Second, we must decide how, and how much, to **penalize** them. As a naive example, we have fixed the distance limit to 5, and we penalize the remaining positions in the momentum ranking as follows:

\(
rank_i := rank_i - \frac{20}{d(i, s)}
\)

Where *i* is the asset to penalize, and *s* is the selected asset.

When we finish doing this for the first selected asset, we select the second one (using our modified ranking, and of course excluding our already selected asset) and start over. As we have to train a new SOM every day, the simulation will take a little time. But, as it is a simple algorithm, it should not take very long.
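Putting the penalty formula and the greedy selection loop together, a hypothetical implementation could look like the sketch below. Note that the formula above is undefined when two assets share the same BMU (so that \(d(i, s) = 0\)); the fixed `zero_dist_penalty` cap used here for that case is our assumption, not something specified in the post.

```python
import numpy as np

def select_with_penalty(scores, bmus, n_select,
                        dist_limit=5.0, penalty=20.0, zero_dist_penalty=40.0):
    """Greedy selection from a momentum ranking with SOM-based penalties.

    `scores`: dict of ticker -> momentum score (higher is better).
    `bmus`: dict of ticker -> BMU grid coordinates.
    Each time an asset s is picked, every remaining asset i with BMU
    distance d(i, s) <= dist_limit is penalized: score_i -= penalty / d(i, s).
    `zero_dist_penalty` caps the penalty when d(i, s) == 0 (an assumption).
    """
    scores = dict(scores)  # work on a copy of the ranking
    selected = []
    while len(selected) < n_select and scores:
        s = max(scores, key=scores.get)  # best remaining asset
        selected.append(s)
        rs, cs = bmus[s]
        del scores[s]  # exclude already selected assets
        for i in scores:
            ri, ci = bmus[i]
            d = float(np.hypot(rs - ri, cs - ci))
            if d == 0.0:
                scores[i] -= zero_dist_penalty
            elif d <= dist_limit:
                scores[i] -= penalty / d
    return selected
```

A new SOM would be trained and this selection rerun for every rebalancing day, which is where the simulation spends most of its time.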

## Results

Now, let’s see if we have been able to improve our results.

Indeed, we seem to have improved our results in the studied period. The SOM-enhanced momentum strategy achieves an 11.51% annualized return, with a 19.05% annualized volatility. Still, we can see that until 2019 the SOM-enhanced strategy underperforms both the naive momentum and the benchmark. It is in 2021 and 2022 that it really shines, catching up to the benchmark.

A few questions are left as an exercise for the reader:

- How would you **implement** the algorithm?
- Using the provided penalizing method, what would be the **optimal parameters**? How would you make sure not to **overfit** the training data?
- Is there a **better** method of **penalization**? Or maybe a better set of **financial ratios** to use?