asset management

K-Means in investment solutions: fact or fiction

T. Fuertes

19/04/2017


We’ve spoken many times before about different clustering methods: K-Means, Hierarchical Clustering, and so on. However, the field does not end there. In this post, I will look at how K-Means clustering performs in an investment solution.

K-Means Clustering

The K-Means algorithm partitions the points in a data set into k clusters. The partition minimises the sum, across the clusters, of the within-cluster sums of point-to-cluster-centroid distances; in the standard formulation these are squared Euclidean distances (see the linked reference for further information).
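To make the objective concrete, here is a minimal sketch of the usual iterative procedure (Lloyd's algorithm) written with NumPy only. The function names, the random initialisation from data points and the toy data are my assumptions for illustration, not the code used in the tests of this post.

```python
import numpy as np

def sq_dists(X, centroids):
    """Squared Euclidean distance from every point to every centroid."""
    return ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)

def kmeans(X, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm: returns labels, centroids and the objective per iteration."""
    rng = np.random.default_rng(seed)
    # Assumption: start from k distinct data points picked at random.
    centroids = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    objective = []
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest centroid.
        d2 = sq_dists(X, centroids)
        labels = d2.argmin(axis=1)
        # Objective: sum of within-cluster squared distances for this assignment.
        objective.append(d2[np.arange(len(X)), labels].sum())
        # Update step: each centroid moves to the mean of its points
        # (an empty cluster keeps its previous centroid).
        new = centroids.copy()
        for j in range(k):
            if np.any(labels == j):
                new[j] = X[labels == j].mean(axis=0)
        if np.allclose(new, centroids):
            break
        centroids = new
    return sq_dists(X, centroids).argmin(axis=1), centroids, objective

# Toy data: two well-separated groups of 20 points each.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 0.1, (20, 2)), rng.normal(5, 0.1, (20, 2))])
labels, centroids, objective = kmeans(X, k=2)
```

Neither step can increase the objective, so the recorded values never go up from one iteration to the next. The procedure only finds a local minimum, though, which is why the starting centroids matter.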

As I did in a previous post, “Hierarchical clustering, using it to invest”, I will use this clustering method to invest in a set of assets. I only have to follow these straightforward steps:

  • Characterising each asset by its return and volatility over the previous six months.
  • Standardising the variables.
  • Applying the K-Means algorithm, looking for 4 clusters.
  • Selecting the cluster with the maximum performance and the minimum volatility (if two clusters rank equally, I select the one with the higher performance).
  • Investing in every asset in the selected cluster, equally weighted, every day.
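The steps around the clustering itself can be sketched as follows. This is only an illustration under stated assumptions: the 126-day window standing in for six months, the rank-sum rule used to combine "maximum performance" and "minimum volatility", the simulated prices and every function name are mine. The `labels` array is a hand-written stand-in for the output of any K-Means routine run on the standardised features.

```python
import numpy as np

def characterise(prices, window=126):
    """Return and volatility per asset over the last `window` trading days.

    `prices` is a (days x assets) array; 126 days for six months is an assumption.
    """
    recent = prices[-(window + 1):]
    daily = recent[1:] / recent[:-1] - 1.0
    total_return = recent[-1] / recent[0] - 1.0
    volatility = daily.std(axis=0, ddof=1)
    return np.column_stack([total_return, volatility])

def standardise(features):
    """Z-score each column so that return and volatility share a scale."""
    return (features - features.mean(axis=0)) / features.std(axis=0)

def select_cluster(labels, features):
    """Pick the cluster ranking best on high mean return and low mean volatility."""
    ids = np.unique(labels)
    mean_ret = np.array([features[labels == c, 0].mean() for c in ids])
    mean_vol = np.array([features[labels == c, 1].mean() for c in ids])
    ret_rank = (-mean_ret).argsort().argsort()  # 0 = highest return
    vol_rank = mean_vol.argsort().argsort()     # 0 = lowest volatility
    score = ret_rank + vol_rank                 # assumed way of combining both criteria
    best = np.flatnonzero(score == score.min())
    # Tie-break: among equally ranked clusters, take the one with the higher return.
    return ids[best[mean_ret[best].argmax()]]

def equal_weights(labels, chosen):
    """Equal weight on every asset in the chosen cluster, zero elsewhere."""
    members = labels == chosen
    return members / members.sum()

# Simulated prices for 8 hypothetical assets over 200 days.
rng = np.random.default_rng(0)
prices = 100 * np.cumprod(1 + rng.normal(0.0003, 0.01, size=(200, 8)), axis=0)

features = characterise(prices)
Z = standardise(features)                    # K-Means would be run on Z
labels = np.array([0, 0, 1, 1, 2, 2, 3, 3])  # stand-in for the K-Means labels
chosen = select_cluster(labels, features)
weights = equal_weights(labels, chosen)
```

In a daily simulation these functions would be called again on each date with the price history available up to that date, and the portfolio would be rebalanced to the new `weights`.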

The universe is composed of fixed income and equity from all countries, assuming a currency hedge, so a good benchmark could be the MSCI World Local Currency.

The result is not good: the strategy does not outperform the benchmark over the whole period. It offers some protection during the largest market losses, in 2008 and 2011 (marked with a blue circle), but it underperforms the benchmark in 2009 and 2015 (marked with a red circle).

Figure: K-Means clustering result graph

Helping K-Means

K-Means clustering does not always work, as we’ve just seen in the previous test. Moreover, the same point has already been made in another post.

In my previous post, I said that the result of Hierarchical clustering could be used as a robust way of initialising other clustering methods. Thus, I will use Hierarchical clustering to initialise the K-Means algorithm.

We repeat the previous simulation process, adding a new step:

  • Characterising each asset by its return and volatility over the previous six months.
  • Standardising the variables.
  • Applying the Hierarchical clustering algorithm (Ward) with Euclidean distance.
  • Applying the K-Means algorithm, looking for 4 clusters and starting from the clusters obtained with Ward clustering.
  • Selecting the cluster with the maximum performance and the minimum volatility (if two clusters rank equally, we select the one with the higher performance).
  • Investing in every asset in the selected cluster, equally weighted, every day.
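The new step can be illustrated with a small NumPy sketch: a naive Ward agglomeration is stopped at 4 clusters, the centroids of those clusters are computed, and K-Means iterations start from them instead of from random points. The function names and the toy data are assumptions for illustration, and the naive loop is only practical for small universes.

```python
import numpy as np

def ward_labels(X, k):
    """Naive agglomerative clustering with Ward's criterion, stopped at k clusters."""
    clusters = [[i] for i in range(len(X))]
    while len(clusters) > k:
        best, pair = np.inf, None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                ca, cb = X[clusters[a]], X[clusters[b]]
                na, nb = len(ca), len(cb)
                # Ward cost: increase in within-cluster sum of squares if a and b merge.
                cost = na * nb / (na + nb) * ((ca.mean(0) - cb.mean(0)) ** 2).sum()
                if cost < best:
                    best, pair = cost, (a, b)
        a, b = pair
        merged = clusters.pop(b)  # b > a, so index a is unaffected
        clusters[a] = clusters[a] + merged
    labels = np.empty(len(X), dtype=int)
    for j, members in enumerate(clusters):
        labels[members] = j
    return labels

def nearest(X, centroids):
    """Index of the closest centroid (squared Euclidean distance) for each point."""
    d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def kmeans_from(X, centroids, n_iter=100):
    """Lloyd's iterations starting from the given centroids instead of random ones."""
    centroids = np.asarray(centroids, dtype=float).copy()
    for _ in range(n_iter):
        labels = nearest(X, centroids)
        # An empty cluster keeps its previous centroid.
        new = np.array([X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
                        for j in range(len(centroids))])
        if np.allclose(new, centroids):
            break
        centroids = new
    return nearest(X, centroids), centroids

# Toy standardised features: four groups of 10 hypothetical assets.
rng = np.random.default_rng(0)
centres = np.array([[0, 0], [0, 5], [5, 0], [5, 5]])
X = np.vstack([c + rng.normal(0, 0.2, (10, 2)) for c in centres])

init_labels = ward_labels(X, k=4)
init_centroids = np.array([X[init_labels == j].mean(axis=0) for j in range(4)])
labels, centroids = kmeans_from(X, init_centroids)
```

In practice, library routines would probably replace these hand-written loops, for example SciPy's `linkage` with `method="ward"` and scikit-learn's `KMeans` with an array passed as `init`. The point of the sketch is only that the starting centroids become deterministic, so K-Means no longer depends on a random draw.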

In this case, the result is better over the whole period, but it still does not outperform the reference index.

Figure: K-Means initialised clustering result graph

Conclusion

K-Means clustering can be a good method for separating a data set into groups, but we need to look for better characteristics to describe the data. In addition, initialising the centroids, as done here with Ward clustering, improves the results.

In conclusion, K-Means is an easy and useful algorithm, but it needs help.
