There are a wide variety of Machine Learning techniques that help us to solve Big Data problems. In this post we talk about how to apply Lasso Regression in Portfolio Management. You may have heard of this technique in the past, and for that reason I’ll try to keep the introductory explanation brief.
Lasso definition
Least Absolute Shrinkage and Selection Operator or Lasso is a regression analysis method that helps us to fit a linear model given a set of input measurements x_{1}, …, x_{N} and an outcome measurement y.
The optimisation objective for Lasso is:
Where α is a tuning parameter.
Lasso in finance
Our real problem lies in dealing with a large set of series and a wide range of models that makes it really difficult to analyse. For that reason, Lasso helps us in the models analysis and provides us with their optimum linear combination.
Step by step
We draw from a wide variety of financial series that could be grouped in different families and regions (around 400 assets set up our universe):
This process is only for one day and one asset (we have to do a loop to create our simulator):

 Firstly, we have to decide which models are going to be used. In this example, I’m using window lengths from 5 to 100 with 5 as the step between them (a total of 20 windows). I created 40 Moving Average models:
 The first 20 models use these windows and the weights are equal for each return.
 The last 20 models use these windows and the weights are geometrically decreasing with 1.1 as the scale factor. The recent past is more important than the distant past.
 For each asset, we have to calculate the estimates of the last 4 years to achieve a matrix with 1044 rows and 40 columns. I trained the algorithm with the daily return estimation of the last 4 years to have a wide variety of examples. We have to make sure that we don’t use information early!
 This matrix and a vector of real returns are the inputs for Lasso model, and we have to fit it and predict the next value.
 Firstly, we have to decide which models are going to be used. In this example, I’m using window lengths from 5 to 100 with 5 as the step between them (a total of 20 windows). I created 40 Moving Average models:
Once we have calculated the estimates for each asset, we have to do the Asset Allocation process and assign weights to each serie. I did it in two different ways:
 Equally Weighted (EW): All assets with positive estimation have the same weight.
 Weighted according to estimates (W2Est): The weight of each asset with positive estimation is a normalisation of its estimate.
Results
The results that I got are worse than I had expected because the total return is not much better than an equal weighting of all assets regardless of the estimation (Benchmark). Moreover, the turnover would be unacceptable for a real product:
It’s a first approach to Lasso, but there’s one question that insistently lingers in my mind after seeing these results…