In the financial field, managers usually take advantage of the great developments in machine learning to improve their models and get the best performance out of their portfolios. These techniques range from clustering and neural networks to more traditional ones such as optimization algorithms.
In general, the existing algorithms suit most of the problems that managers have to face, and it is quick and easy to apply them directly with the established parameters. However, what happens if the problem a manager faces does not fit the current algorithms?
In that case, redefining the problem is usually enough to adapt it to those parameters, as this post will explain using an optimization problem as an example.
Optimization problem
An optimization problem finds the best solution from all feasible solutions, subject to some restrictions. Optimization problems can be divided into several categories, depending on whether the variables are continuous or discrete:
- Discrete optimization finds a solution from a countable set of discrete variables,
- Continuous optimization finds a solution over continuous variables, or
- Mixed optimization finds a solution over a mixture of discrete and continuous variables, as sketched below.
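As a quick illustration of these categories, here is a minimal sketch (not from the original text; the variable names are purely illustrative) of how each one maps onto a variable declaration in cvxpy, the library used later in this post:

import cvxpy as cvx

x_continuous = cvx.Variable(3)                              # continuous optimization
x_integer = cvx.Variable(3, integer=True)                   # discrete (integer) optimization
x_mixed = [cvx.Variable(2), cvx.Variable(2, integer=True)]  # mixed optimization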
Let f be an objective function to be minimised over the n-variable vector x, and let g and h be continuous functions. A common form of a continuous optimization problem is then the following:
\[\begin{array}{rll}
\min_{x} & f(x) & (f:\mathbb{R}^{n}\to\mathbb{R})\\
\text{subject to} & g_{i}(x)\le0, & i=1,\dots,m\ \ (m\ge0)\\
& h_{j}(x)=0, & j=1,\dots,p\ \ (p\ge0)\\
\end{array}\]
If m=p=0, the problem is unconstrained. Nevertheless, as the definition shows, there may be both inequality constraints, like g(·), and equality constraints, like h(·).
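As a sketch of how this generic form translates into code (this is not part of the original formulation; the objective and constraint functions below are illustrative choices only), SciPy's minimize accepts an objective together with inequality and equality constraints:

import numpy as np
from scipy.optimize import minimize

def f(x):
    return np.sum(x ** 2)          # objective f(x)

def g(x):
    return x[0] + x[1] - 1.0       # inequality constraint, g(x) <= 0

def h(x):
    return x[0] - 2.0 * x[1]       # equality constraint, h(x) = 0

result = minimize(
    f,
    x0=np.array([0.5, 0.5]),
    constraints=[
        {"type": "ineq", "fun": lambda x: -g(x)},  # SciPy expects fun(x) >= 0
        {"type": "eq", "fun": h},
    ],
)
print(result.x)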
At this point, a programming language like Python comes in handy to solve these problems, because a great number of solvers have been developed for it, covering a wide range of optimization problems.
For example, the Python solvers find solutions to problems subject to continuous restrictions like:
\[\begin{array}{l}
x\ge0\\
\sum_{i=1}^{n}x_{i}=0
\end{array}\]
However, managers' problems don't always fit this definition, as happens when they have to face non-continuous restrictions.
Continuous restrictions
Let's imagine a simple optimization problem that maximises the gains of a portfolio over a vector of weights, where each weight has to be no greater than 25% and the sum of the weights of asset a and asset b has to be no greater than 40%. Let r be the vector of returns; then the problem's definition may be the following:
\[\begin{array}{rll}
\min_{w} & -\sum_{i=1}^{N}w_{i}\cdot r_{i}\\
\text{subject to} & w_{i}\ge0, & \forall i=1,\dots,N\\
& w_{i}\le0.25, & \forall i=1,\dots,N\\
& w_{a}+w_{b}\le0.4, & a,b\in\{1,\dots,N\}\\
& \sum_{i=1}^{N}w_{i}=1\\
\end{array}\]
This is an optimization problem with continuous restrictions. Therefore, it is really easy to solve in Python with the library cvxpy, as it works under these circumstances:
import cvxpy as cvx
import numpy as np

returns = [0.15, 0.03, -0.15, 0.5, -0.04]
group_constraint_control = [1, 0, 0, 1, 0]
n_variables = len(returns)
minimum_values = np.zeros(len(returns))
maximum_values = 0.25 * np.ones(len(returns))

weights_cvx = cvx.Variable(n_variables)

gains = cvx.sum(returns @ weights_cvx)

weights_sum_equal_one = cvx.sum(weights_cvx) == 1
variable_greater_or_equal_zero = weights_cvx >= minimum_values
variable_less_or_equal_maximum = weights_cvx <= maximum_values
some_less_than_value = cvx.sum(group_constraint_control @ weights_cvx) <= 0.4

constraints = [
    weights_sum_equal_one,
    variable_greater_or_equal_zero,
    variable_less_or_equal_maximum,
    some_less_than_value
]

objective = cvx.Minimize(-gains)
prob = cvx.Problem(objective, constraints)
prob.solve(verbose=True)

print("weights: {}".format(
    ', '.join(['{:.2%}'.format(x) for x in weights_cvx.value])
))
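As an optional follow-up (a sketch, not part of the original example), a few assertions appended to the script above can confirm that the returned weights actually respect every restriction:

# Sanity check on the optimised weights (illustrative, not from the original post)
w = weights_cvx.value
assert abs(w.sum() - 1) < 1e-6                           # weights sum to one
assert (w >= -1e-6).all() and (w <= 0.25 + 1e-6).all()   # individual bounds
assert w[0] + w[3] <= 0.4 + 1e-6                         # group constraint on assets 1 and 4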
But what happens if a manager requires a non-continuous restriction like "w may be 0 or a value greater than or equal to 0.3"? Let the manager redefine the optimization problem.
Non-continuous restrictions
An integer optimization problem is the lifesaver when dealing with non-continuous restrictions. So, let's turn the problem into an integer optimization problem!
Think out of the box
The main characteristic of an integer optimization problem is that the available values have to be integers; therefore, the available weight values have to be defined in terms of integers too.
As weights are obviously real numbers, the available interval [0, 1] first has to be split into a set of discrete values by using a step of, for example, 0.1 (notice that this step may be smaller if necessary):
\[w_{i}\in\{0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1\}\]
This isn't a set of integer numbers yet, so the algorithm should find the position within the vector of available weights that minimises the objective function, because the position is, in fact, an integer number. As there is no solver in Python that deals directly with positions, it may be a lifesaver to use a boolean vector with the same shape as the vector of available weights, in which only one entry can be equal to 1. The entry equal to 1 indicates which weight value is selected. Now we have a vector of integer values!
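To make this encoding concrete, here is a tiny sketch (with illustrative names, separate from the full script below) of how a boolean position vector selects exactly one weight for a single asset:

import numpy as np

# Available weights for one asset after discretising [0, 1] with a 0.1 step
available_weights = np.arange(0, 1.1, 0.1)

# Boolean selector: only one entry may be 1, and it marks the chosen weight
positions = np.zeros(len(available_weights))
positions[3] = 1                                   # selects the weight 0.3

selected_weight = (positions * available_weights).sum()
print(selected_weight)                             # ~0.3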
Let’s see step by step how to redefine the problem.
Definition with non-continuous restrictions
Let W be the set of available weights of all the assets in the portfolio (that is, a set of discrete values, as shown before), and let W* be the set of positions, with values in {0,1}. If r refers to the return of each asset and N is the number of assets, the new definition of the problem will be the following:
\[\begin{array}{rll}
\min_{W^*} & -\sum_{i=1}^{N}\sum_{j=1}^{N_i}w_{i,j}^{*}\cdot w_{i,j}\cdot r_{i}\\
\text{subject to} & w_{i,j}^{*}\in\{0,1\}, & \forall i,j\\
& w_{i,j}^{*}=0, & \forall i,j \mid w_{i,j}>0.25\\
& \sum_{i=1}^{N}\sum_{j=1}^{N_i}w_{i,j}^{*}\cdot w_{i,j}=1\\
& \sum_{j=1}^{N_i}w_{i,j}^{*}=1, & \forall i\in\{1,\dots,N\}\\
& \sum_{j=1}^{N_a}w_{a,j}^{*}\cdot w_{a,j}+\sum_{j=1}^{N_b}w_{b,j}^{*}\cdot w_{b,j}\le0.4, & a,b\in\{1,\dots,N\}\\
\end{array}\]
where
\[\begin{array}{l}
N_{i}\equiv \text{number of available weights of asset } i\\
W=\{w_{1,1}, w_{1,2},\dots, w_{1,N_{1}}, w_{2,1}, w_{2,2},\dots, w_{2,N_{2}},\dots, w_{N,1}, w_{N,2},\dots, w_{N,N_{N}}\}\\
W^*=\{w_{1,1}^*, w_{1,2}^*,\dots, w_{1,N_{1}}^*, w_{2,1}^*, w_{2,2}^*,\dots, w_{2,N_{2}}^*,\dots, w_{N,1}^*, w_{N,2}^*,\dots, w_{N,N_{N}}^*\}\\
\end{array}\]
In these terms, it is easy to add the non-continuous restriction that one of the weights has to be "0 or a value greater than or equal to 0.3". For example, if the first asset has to meet that restriction, all that needs to be done is to remove the non-available weights from W and their corresponding positions from W*. In this case, the definition of W would be:
\[\begin{array}{rl}
W&=\{w_{1,1}, w_{1,2}, w_{1,3},\dots, w_{1,N_{1}}, w_{2,1}, w_{2,2},\dots, w_{2,N_{2}},\dots, w_{N,1}, w_{N,2},\dots, w_{N,N_{N}}\}\\
&=\{0, 0.3, 0.4,\dots, 1, 0, 0.1,\dots, 1,\dots, 0, 0.1,\dots, 1\}\\
\end{array}\]
where
\[N_{1}=9;\ \ \ N_{2}=N_{3}=\dots=N_{N}=11\]
Code with non-continuous restrictions
This problem is now a discrete optimization problem to be solved over integer values, specifically boolean values. Consequently, it is easy to solve with Mixed Integer Programming (MIP) solvers, once the problem is coded properly.
Here it goes in Python:
import cvxpy as cvx
import numpy as np

returns = [0.15, 0.03, -0.15, 0.5, -0.04]
n_variables = len(returns)
minimum_weight = 0
maximum_weight = 0.25
step = 0.005

available_weights = {}
for i in range(n_variables):
    weight_range = np.arange(minimum_weight, maximum_weight + step, step)
    if i == 0:
        idx = (weight_range == 0) | (weight_range >= 0.3)
        weight_range = weight_range[idx]
    available_weights[i] = list(weight_range)

weights_matrix = list(available_weights.values())
weights_matrix = [w for weight in weights_matrix for w in weight]

len_per_asset = [
    len(available_weights[symbol]) for symbol in available_weights.keys()
]

control_matrix_by_symbol = []
group_constraint_control_by_symbol = []
return_matrix = []
cumulative_len = np.hstack([0, np.cumsum(len_per_asset)])
for i in range(n_variables):
    control_matrix_by_symbol_i = np.zeros(len(weights_matrix))
    initial_idx = cumulative_len[i]
    last_idx = cumulative_len[i] + len_per_asset[i]
    control_matrix_by_symbol_i[initial_idx:last_idx] = 1
    control_matrix_by_symbol.append(list(control_matrix_by_symbol_i))

    control = 0
    if i in (0, 3):
        control = 1
    group_constraint_control_by_symbol.extend([control] * len_per_asset[i])

    return_matrix.extend([returns[i]] * len_per_asset[i])

positions_cvx = cvx.Variable(len(weights_matrix), boolean=True)

gains = cvx.sum(return_matrix @ cvx.multiply(weights_matrix, positions_cvx))

weights_sum_equal_one = cvx.sum(weights_matrix @ positions_cvx) == 1
some_less_than_value = cvx.sum(weights_matrix @ cvx.multiply(
    group_constraint_control_by_symbol, positions_cvx
)) <= 0.4

constraints = [
    weights_sum_equal_one,
    some_less_than_value
]
for i in range(n_variables):
    only_one_per_asset = cvx.sum(
        control_matrix_by_symbol[i] @ positions_cvx
    ) == 1
    constraints.extend([only_one_per_asset])

objective = cvx.Minimize(-gains)
prob = cvx.Problem(objective, constraints)
prob.solve(verbose=False, solver='GLPK_MI')

weights_aux = np.multiply(positions_cvx.value, weights_matrix)
optimised_weights = np.zeros((n_variables, ))
for i in range(n_variables):
    initial_idx = cumulative_len[i]
    last_idx = cumulative_len[i] + len_per_asset[i]
    optimised_weights[i] = weights_aux[initial_idx:last_idx].sum()

print("weights: {}".format(
    ', '.join(['{:.2%}'.format(x) for x in optimised_weights])
))
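Again, as an optional follow-up (a sketch, not part of the original script), a few assertions can verify that each asset selects exactly one of its available weights and that the portfolio restrictions still hold:

# Sanity check on the MIP solution (illustrative, not from the original post)
positions = np.round(positions_cvx.value)
for i in range(n_variables):
    start, end = cumulative_len[i], cumulative_len[i] + len_per_asset[i]
    assert positions[start:end].sum() == 1                        # one position per asset
assert abs(optimised_weights.sum() - 1) < 1e-6                    # weights sum to one
assert optimised_weights[0] + optimised_weights[3] <= 0.4 + 1e-6  # group constraint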
Conclusion
Even though there are several algorithms implemented in Python (or any other programming language), sometimes the problems that portfolio managers have to face don't fit those algorithms. Hence, thinking outside the box is vital to solving their problems.