# Graph Theory in portfolio analysis. Part I

### Pablo Sánchez

#### 03/07/2019

Have you ever thought about the bias of your portfolio to specific countries or asset types? Do you know that high concentration in one region implies a riskier path for your portfolio? If you want to know how to improve your portfolio using Graph Theory, first you’ll need to understand the basics.

We discussed this topic in previous posts:

Once the graph is created, there are different statistics and measures we can use to extract a lot of information and get a better understanding of our portfolio. In this first post, we are going to learn the theory behind these concepts and measures.  Let’s start defining the different terms:

• $$V$$ is a set of vertices or nodes. $$v_i \in V$$ is the i-th vertex.
• $$E$$ is a set of edges. $$e_{v_i, v_j} \in E$$ is the edge between the vertices $$v_i, v_j \in V$$.
• The graph is represented by $$G(V,E)$$.
• $$n = |V|$$ is the number of vertices of the graph.
• $$m = |E|$$ is the number of edges of the graph.

The adjacency matrix, $$A_G$$, represents the different connections between the vertices of the graph G(V,E). If the vertex $$v_i \in V$$ is connected with the vertex $$v_j \in V$$ through the edge $$e_{v_i, v_j} \in E$$, then the positions (i, j) and (j, i) of the matrix are 1; if not both are 0. The adjacency matrix is symmetric and it is defined as follows:

$$A_G = a_{i,j} = \left\{ \begin{matrix} 1 & if & e_{v_i, v_j} \in E \\ 0 & if & e_{v_i, v_j} \not\in E \end{matrix} \right. \quad \forall v_i,v_j \in V$$

In the next example we can see the adjacency matrix of a simple graph.

### Neighbours of a vertex

The neighbours of a vertex $$v_i \in V$$ is a subset of V, $$N_{v_i} \in V$$, including all the vertices connected with it.

$$N_{v_i} = \left\{ v_j \; : \; v_j \in V \wedge e_{v_i, v_j} \in E \right\} \quad \forall v_j \in V$$

### Degree of a vertex

The degree of a vertex $$v_i \in V$$ is the number of connected vertices or neighbours this vertex has.

$$deg\left( v_i \right) = {\delta}_{v_i} = \left| N_{v_i} \right|$$

The maximum and minimum degree of a graph G are represented by $$\Delta(G)$$ y $$\delta(G)$$, respectively. They could be defined as follows:

$$\Delta(G) = max\left({\delta}_{v_i}\right) \quad \forall v_i \in V$$

$$\delta(G) = min\left({\delta}_{v_i}\right) \quad \forall v_i \in V$$

## Degree matrix

The degree matrix, $${\Delta}_G$$, has 0 in all its elements except the diagonal in which it is shown the degree for each vertex.

$${\Delta}_G = d_{i,j} = \left\{ \begin{matrix} {\delta}_{v_i} & if & i=j \\ 0 & if & i \neq j \end{matrix} \right. \quad \forall v_i \in V$$

In the next example we can see the degree matrix of the same simple graph.

## Clustering coefficient

$$C_{v_i}$$ is the clustering coefficient of the vertex $$v_i \in V$$ and it is a real number in the interval $$\left[0, 1\right]$$. This measure quantifies the number of existing connections between the neighbours of that vertex $$v_i$$ in comparison to the number of posible connections, i.e., it shows if there are triple connections (triangles). It is said that there is a strong clustering around the vertex $$v_i$$ when this measure is close to 1.

$$C_{v_i}=\frac{2}{{\delta}_{v_i}\left({\delta}_{v_i}-1\right)}\left|t_{v_i}\right| \quad : \quad {\delta}_{v_i} > 1$$

Where $$t_{v_i}$$ is the set of edges that connect $$v_j$$ and $$v_k$$ and both vertices are neighbours of $$v_i$$ ($$v_j,v_k \in N_{v_i}, \, e_{v_j, v_k} \in E$$).

The clustering coefficient of a graph G, $$C(G)$$, is an equally-weighted mean of the clustering coefficients of each vertex of the graph. When the clustering coefficient, $$C(G)$$, is close to 1 it shows that all the vertices are inter-connected and all possibles triple connections exists. A high clustering coefficient shows a high robustness.

$$C(G) = \frac{1}{N} \sum_{v_i \in V; {\delta}_{v_i}>1} C_{v_i}$$

## Distance between two vertices

The distance between two vertices $$v_i, v_j \in V$$ is the minimum number of edges necessary to go from one vertex to the other. It is represented by $$d_{v_i, v_j}$$. The BFS or Breadth-first search algorithm calculates this measure in complex graphs.

### Distance matrix

The distance matrix of a graph G is symmetric and contains the minimum distances between each pair of vertices. It is represented by $$D_G$$ and the diagonal is 0.

$$D_G = d_{i,j} = \left\{ \begin{matrix} d_{v_i, v_j} & if & i \neq j \\ 0 & if & i=j \end{matrix} \right. \quad \forall v_i,v_j \in V$$

### Average distance

The average distance of a graph G, $$\bar{d}(G)$$, is defined as follows:

$$\bar{d}(G) = \frac{2}{n(n-1)}\sum_{i=1}^{n} \sum_{j=i+1}^{n} d_{v_i, v_j} \quad \forall v_i, v_j \in V$$

### Diameter

The diameter, $${d}^{max}(G)$$, is the maximum of the minimum existing distances between two vertices of the graph G:

$$d^{max}(G) = max \left( \left\{d_{v_i, v_j}, \, \forall v_i, v_j \in V, \, v_i \neq v_j \right\} \right)$$

The lower the value of these measures, the greater the robustness of the graph.

## Efficiency

The graph efficiency is a measure that shows the effectiveness in the information exchange between two vertices. The average graph efficiency of a graph is defined as follows:

$$E(G) = \frac{2}{n(n-1)}\sum_{i=1}^{n} \sum_{j=i+1}^{n} \frac{1}{d_{v_i, v_j}} \quad \forall v_i, v_j \in V$$

## Connection density or cost

The connection density or cost is the number of existing edges, m, in the graph G in relation to the total number of possible edges. It is the simplest estimator of the physical cost of a network.

$$D(G) = \frac{2m}{n(n-1)}$$

## Intermediation

The intermediation of a vertex $$v_i$$ or an edge $$e_{v_i, v_j}$$ is the number of shortest paths between two vertices $$v_k, v_l \in V$$ that include the vertex $$v_i$$ or the edge $$e_{v_i, v_j}$$.

### Average intermediation of vertices

The average intermediation of a vertex is defined as follows:

$$\bar{b}_v (G) = \frac{1}{2}(n-1)\left(\bar{d}(G)+1\right)$$

### Average intermediation of an edge

The average intermediation of an edge is defined as follows:

$$\bar{b}_e (G) = \frac{n(n-1)}{2m}\bar{d}(G)$$

In the next Graph Theory post we will see an example of how to apply these statistics and measures to a real case. They will be useful to analyse the universe of our portfolio and check its diversification.