We discussed this topic in previous posts:

- Graph theory: connections in the market. To understand what a graph is and the elements it has.
- World connections using financial indexes. In this other post, you can learn how to create a graph using the correlation matrix between financial indexes from different countries of the world.

Once the graph is created, there are different statistics and measures we can use to extract a lot of information and get a better understanding of our portfolio. In this first post, we are going to learn the theory behind these concepts and measures. Let’s start defining the different terms:

- \(V\) is a set of vertices or nodes. \(v_i \in V\) is the i-th vertex.
- \(E\) is a set of edges. \(e_{v_i, v_j} \in E\) is the edge between the vertices \(v_i, v_j \in V\).
- The graph is represented by \(G(V,E)\).
- \(n = |V|\) is the number of vertices of the graph.
- \(m = |E|\) is the number of edges of the graph.

## Adjacency matrix

The adjacency matrix, \(A_G\), represents the different connections between the vertices of the graph G(V,E). If the vertex \(v_i \in V\) is connected with the vertex \(v_j \in V\) through the edge \(e_{v_i, v_j} \in E\), then the positions (i, j) and (j, i) of the matrix are 1; if not both are 0. The adjacency matrix is symmetric and it is defined as follows:

\(A_G = a_{i,j} = \left\{ \begin{matrix} 1 & if & e_{v_i, v_j} \in E \\ 0 & if & e_{v_i, v_j} \not\in E \end{matrix} \right. \quad \forall v_i,v_j \in V\)

In the next example we can see the adjacency matrix of a simple graph.

### Neighbours of a vertex

The neighbours of a vertex \(v_i \in V\) is a subset of V, \(N_{v_i} \in V\), including all the vertices connected with it.

\(N_{v_i} = \left\{ v_j \; : \; v_j \in V \wedge e_{v_i, v_j} \in E \right\} \quad \forall v_j \in V \)

### Degree of a vertex

The degree of a vertex \(v_i \in V\) is the number of connected vertices or neighbours this vertex has.

\(deg\left( v_i \right) = {\delta}_{v_i} = \left| N_{v_i} \right| \)

The maximum and minimum degree of a graph G are represented by \(\Delta(G)\) y \(\delta(G)\), respectively. They could be defined as follows:

\(\Delta(G) = max\left({\delta}_{v_i}\right) \quad \forall v_i \in V\)

\(\delta(G) = min\left({\delta}_{v_i}\right) \quad \forall v_i \in V\)

## Degree matrix

The degree matrix, \({\Delta}_G\), has 0 in all its elements except the diagonal in which it is shown the degree for each vertex.

\({\Delta}_G = d_{i,j} = \left\{ \begin{matrix} {\delta}_{v_i} & if & i=j \\ 0 & if & i \neq j \end{matrix} \right. \quad \forall v_i \in V \)

In the next example we can see the degree matrix of the same simple graph.

## Clustering coefficient

\(C_{v_i}\) is the clustering coefficient of the vertex \(v_i \in V\) and it is a real number in the interval \(\left[0, 1\right]\). This measure quantifies the number of existing connections between the neighbours of that vertex \(v_i\) in comparison to the number of posible connections, i.e., it shows if there are triple connections (triangles). It is said that there is a strong clustering around the vertex \(v_i\) when this measure is close to 1.

\(C_{v_i}=\frac{2}{{\delta}_{v_i}\left({\delta}_{v_i}-1\right)}\left|t_{v_i}\right| \quad : \quad {\delta}_{v_i} > 1\)

Where \(t_{v_i}\) is the set of edges that connect \(v_j\) and \(v_k\) and both vertices are neighbours of \(v_i\) (\(v_j,v_k \in N_{v_i}, \, e_{v_j, v_k} \in E\)).

The clustering coefficient of a graph G, \(C(G)\), is an equally-weighted mean of the clustering coefficients of each vertex of the graph. When the clustering coefficient, \(C(G)\), is close to 1 it shows that all the vertices are inter-connected and all possibles triple connections exists. A high clustering coefficient shows a high robustness.

\(C(G) = \frac{1}{N} \sum_{v_i \in V; {\delta}_{v_i}>1} C_{v_i}\)

## Distance between two vertices

The distance between two vertices \(v_i, v_j \in V\) is the minimum number of edges necessary to go from one vertex to the other. It is represented by \(d_{v_i, v_j}\). The BFS or Breadth-first search algorithm calculates this measure in complex graphs.

### Distance matrix

The distance matrix of a graph G is symmetric and contains the minimum distances between each pair of vertices. It is represented by \(D_G\) and the diagonal is 0.

\(D_G = d_{i,j} = \left\{ \begin{matrix} d_{v_i, v_j} & if & i \neq j \\ 0 & if & i=j \end{matrix} \right. \quad \forall v_i,v_j \in V\)

### Average distance

The average distance of a graph G, \(\bar{d}(G)\), is defined as follows:

\(\bar{d}(G) = \frac{2}{n(n-1)}\sum_{i=1}^{n} \sum_{j=i+1}^{n} d_{v_i, v_j} \quad \forall v_i, v_j \in V \)

### Diameter

The diameter, \({d}^{max}(G)\), is the maximum of the minimum existing distances between two vertices of the graph G:

\(d^{max}(G) = max \left( \left\{d_{v_i, v_j}, \, \forall v_i, v_j \in V, \, v_i \neq v_j \right\} \right)\)

The lower the value of these measures, the greater the robustness of the graph.

## Efficiency

The graph efficiency is a measure that shows the effectiveness in the information exchange between two vertices. The average graph efficiency of a graph is defined as follows:

\(E(G) = \frac{2}{n(n-1)}\sum_{i=1}^{n} \sum_{j=i+1}^{n} \frac{1}{d_{v_i, v_j}} \quad \forall v_i, v_j \in V\)

## Connection density or cost

The connection density or cost is the number of existing edges, m, in the graph G in relation to the total number of possible edges. It is the simplest estimator of the physical cost of a network.

\(D(G) = \frac{2m}{n(n-1)}\)

## Intermediation

The intermediation of a vertex \(v_i\) or an edge \(e_{v_i, v_j}\) is the number of shortest paths between two vertices \(v_k, v_l \in V\) that include the vertex \(v_i\) or the edge \(e_{v_i, v_j}\).

### Average intermediation of vertices

The average intermediation of a vertex is defined as follows:

\(\bar{b}_v (G) = \frac{1}{2}(n-1)\left(\bar{d}(G)+1\right)\)

### Average intermediation of an edge

The average intermediation of an edge is defined as follows:

\(\bar{b}_e (G) = \frac{n(n-1)}{2m}\bar{d}(G)\)

In the next Graph Theory post we will see an example of how to apply these statistics and measures to a real case. They will be useful to analyse the universe of our portfolio and check its diversification.