Hierarchical Risk Parity



Building profitable portfolios has been giving investment managers headaches for decades. Many approaches have been tried so far, two of the best known being Markowitz’s Efficient Frontier and Risk Parity.

Today, we present a new approach to this recurring problem, developed by Dr. Marcos López de Prado, that applies graph theory and machine learning techniques.

López de Prado is the chief executive officer of True Positive Technologies and one of the ten most-read authors in finance according to SSRN’s rankings. You can learn more about the author on his personal site.

Challenging the status quo

The development of this methodology is motivated by some issues regarding widely used strategies, such as:

  • Markowitz’s reliance on quadratic optimization of forecasted returns, which frequently produces unstable and highly concentrated solutions.
  • Traditional risk parity’s disregard for useful covariance information.

To address these issues, the methodology proposes to:

  • Drop forecasted returns and rely completely on covariance data.
  • Cluster assets based on correlation in order to allocate less weight to similar assets.

Hierarchical Risk Parity (HRP) in a nutshell

The HRP algorithm works in three stages:

  1. Tree clustering: group similar investments into clusters based on their correlation matrix. Working with a hierarchical structure avoids the stability issues that quadratic optimizers run into when inverting the covariance matrix.
    The dendrogram shows the top 13 companies by global market cap as clustered by the algorithm. Notice how financial stocks such as Bank of America and JPMorgan are clustered together. The same happens for Asian stocks (Tencent and Alibaba) and IT giants (Google, Amazon, Facebook).
  2. Quasi-diagonalization: reorder the rows and columns of the covariance matrix so that similar investments are placed next to each other. Since inverse-variance allocation is optimal for a diagonal covariance matrix, this quasi-diagonal form allows us to distribute weights following an inverse-variance allocation.
  3. Recursive bisection: split the ordered list of assets into halves recursively, dividing the allocation between the two halves of each split in inverse proportion to their cluster variances.
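
The three stages above can be sketched in a few lines of Python. This is only a minimal illustration and not the code linked in this post: it assumes NumPy, pandas and SciPy are installed, it clusters directly on the correlation distance sqrt((1 - rho) / 2) with single linkage (a simplification we chose), and it runs on synthetic returns. The function names and the asset names (bank_a, bank_b, tech_a, tech_b) are hypothetical.

```python
import numpy as np
import pandas as pd
from scipy.cluster.hierarchy import linkage, leaves_list
from scipy.spatial.distance import squareform


def correlation_distance(corr: pd.DataFrame) -> pd.DataFrame:
    """Map correlations in [-1, 1] to distances in [0, 1]."""
    return np.sqrt(0.5 * (1.0 - corr).clip(lower=0.0))


def quasi_diagonal_order(corr: pd.DataFrame) -> list:
    """Stages 1 and 2: cluster the assets, then read the dendrogram leaves
    left to right so that similar assets end up next to each other."""
    dist = correlation_distance(corr).to_numpy(copy=True)
    np.fill_diagonal(dist, 0.0)
    link = linkage(squareform(dist, checks=False), method="single")
    return [corr.index[i] for i in leaves_list(link)]


def cluster_variance(cov: pd.DataFrame, items: list) -> float:
    """Variance of a cluster when weighted by inverse variance."""
    sub = cov.loc[items, items].to_numpy()
    ivp = 1.0 / np.diag(sub)
    ivp /= ivp.sum()
    return float(ivp @ sub @ ivp)


def hrp_weights(cov: pd.DataFrame) -> pd.Series:
    """Stage 3: recursive bisection over the quasi-diagonal ordering."""
    std = np.sqrt(np.diag(cov))
    corr = cov / np.outer(std, std)
    order = quasi_diagonal_order(corr)
    weights = pd.Series(1.0, index=order)
    clusters = [order]
    while clusters:
        next_clusters = []
        for items in clusters:
            if len(items) < 2:
                continue  # a single asset cannot be split further
            half = len(items) // 2
            left, right = items[:half], items[half:]
            var_left = cluster_variance(cov, left)
            var_right = cluster_variance(cov, right)
            # The lower-variance half receives the larger share.
            alpha = 1.0 - var_left / (var_left + var_right)
            weights.loc[left] *= alpha
            weights.loc[right] *= 1.0 - alpha
            next_clusters += [left, right]
        clusters = next_clusters
    return weights


# Synthetic example: two highly correlated "banks" and two independent "techs".
rng = np.random.default_rng(0)
common = rng.normal(size=(500, 1))
returns = pd.DataFrame(
    np.hstack([common + 0.3 * rng.normal(size=(500, 2)),
               rng.normal(size=(500, 2))]),
    columns=["bank_a", "bank_b", "tech_a", "tech_b"],
)
weights = hrp_weights(returns.cov())
print(weights)
```

The weights are positive and sum to one by construction, because each split only divides the weight already assigned to a cluster between its two halves.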


Let’s try the algorithm (Python code here) to build a portfolio with stocks of the world’s 13 largest companies by market cap. To compare it with classic portfolio management methodologies, we are going to compute:

  1. Markowitz’s Minimum-Variance Portfolio (MVP).
  2. Traditional risk parity’s Inverse-Variance Portfolio (IVP).
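
For reference, both benchmarks can be computed from a covariance matrix alone. The sketch below is an illustration under our own assumptions: the minimum-variance weights use the unconstrained closed form (fully invested, short positions allowed), which is not necessarily the constrained optimizer behind the figures in this post, and the 3x3 covariance matrix is made up.

```python
import numpy as np


def min_variance_weights(cov: np.ndarray) -> np.ndarray:
    """Closed-form minimum-variance weights: inv(cov) @ 1, rescaled to sum to 1.
    Assumption: no long-only constraint, so weights may be negative."""
    raw = np.linalg.solve(cov, np.ones(cov.shape[0]))
    return raw / raw.sum()


def inverse_variance_weights(cov: np.ndarray) -> np.ndarray:
    """Weights proportional to 1 / variance; covariances are ignored."""
    inv_var = 1.0 / np.diag(cov)
    return inv_var / inv_var.sum()


# Made-up covariance matrix for three assets.
cov = np.array([
    [0.040, 0.006, 0.000],
    [0.006, 0.090, 0.000],
    [0.000, 0.000, 0.010],
])
mvp = min_variance_weights(cov)
ivp = inverse_variance_weights(cov)
print("MVP:", mvp.round(4), "variance:", mvp @ cov @ mvp)
print("IVP:", ivp.round(4), "variance:", ivp @ cov @ ivp)
```

IVP only looks at the diagonal of the matrix, while MVP uses every entry, which is what makes it sensitive to estimation errors in the covariances.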

We use each stock’s returns from January 1st 2015 to January 1st 2018. The resulting portfolios are shown in the table below:


Company                  MVP       IVP       HRP
Alibaba                1.73%     3.62%     4.56%
Alphabet               2.40%     6.76%     9.02%
Amazon                 1.88%     4.16%     3.93%
Apple                  3.40%     6.49%     9.28%
Bank of America        0.35%     4.72%     2.86%
Berkshire Hathaway    29.70%    17.65%    13.10%
ExxonMobil             5.82%    10.00%     7.42%
Facebook               2.55%     5.82%     6.09%
JPMorgan               0.55%     7.45%     4.51%
Johnson & Johnson     33.16%    17.48%    18.31%
Microsoft              1.34%     6.67%     6.30%
Samsung Electronics   16.15%     5.22%     9.62%
Tencent                0.97%     3.98%     5.01%



As the asset allocation shows, MVP concentrates around 80% of the weight in its top three holdings, while HRP concentrates only about 40%. IVP, on the other hand, spreads weights across all assets while ignoring the correlation structure.
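
The concentration figures can be checked directly from the table. The short sketch below assumes the three weight columns are, in order, MVP, IVP and HRP (our reading, based on the percentages quoted in the text), and the helper name top_n_share is ours.

```python
# Weights in percent, copied from the table: (MVP, IVP, HRP) per company.
weights = {
    "Alibaba": (1.73, 3.62, 4.56),
    "Alphabet": (2.40, 6.76, 9.02),
    "Amazon": (1.88, 4.16, 3.93),
    "Apple": (3.40, 6.49, 9.28),
    "Bank of America": (0.35, 4.72, 2.86),
    "Berkshire Hathaway": (29.70, 17.65, 13.10),
    "ExxonMobil": (5.82, 10.00, 7.42),
    "Facebook": (2.55, 5.82, 6.09),
    "JPMorgan": (0.55, 7.45, 4.51),
    "Johnson & Johnson": (33.16, 17.48, 18.31),
    "Microsoft": (1.34, 6.67, 6.30),
    "Samsung Electronics": (16.15, 5.22, 9.62),
    "Tencent": (0.97, 3.98, 5.01),
}
columns = ("MVP", "IVP", "HRP")


def top_n_share(column: int, n: int = 3) -> float:
    """Sum of the n largest weights in one column, in percent."""
    values = sorted((row[column] for row in weights.values()), reverse=True)
    return round(sum(values[:n]), 2)


top3 = {name: top_n_share(i) for i, name in enumerate(columns)}
print(top3)  # {'MVP': 79.01, 'IVP': 45.13, 'HRP': 41.03}
```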

The diversification that HRP achieves across uncorrelated assets makes the methodology more robust against shocks. Although MVP provides the optimal solution on in-sample data, evidence suggests that an HRP portfolio will outperform it out-of-sample, achieving better risk-adjusted returns than the traditional methods.

I hope this gave you some insight into how Machine Learning may improve well-established investment methodologies, and how squeezing the data may help us get the most out of it.

Thanks for reading!
