When you work with large amounts of data, knowing how to handle them is as important as representing them properly. In the machine learning world there are many ways to represent data, and most of them are visually striking. In this post we introduce **graph theory** as a way of representing data.

This technique is useful for representing many points and the relationships between them. Those relationships can follow different rules: for example, a link can come from a single statistic that relates each point directly to the others, or from a combination of several statistics reduced to a single value.

In a graph you should distinguish the following parts:

**Node**. Each point to be represented.

**Link**. The union between two nodes. The stronger the union, the bigger the link.

Links can be drawn in different ways depending on what they show. If the union goes in only one direction, an arrow indicates the direction of the relationship, and the result is called a **directed graph**. In addition, the strength of the union can be represented by the thickness of the line.
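As a minimal sketch of these parts, the snippet below stores nodes and weighted links in plain Python. The `Graph` class, its method names and the node labels `"A"` and `"B"` are assumptions made for illustration; they do not come from any particular library.

```python
from dataclasses import dataclass, field


@dataclass
class Graph:
    """A tiny weighted graph stored as a dictionary of links."""

    directed: bool = False
    # Maps (source, target) -> weight, i.e. the strength of the union.
    links: dict = field(default_factory=dict)
    nodes: set = field(default_factory=set)

    def add_link(self, a, b, weight=1.0):
        self.nodes.update((a, b))
        self.links[(a, b)] = weight
        if not self.directed:
            # An undirected link is valid in both directions.
            self.links[(b, a)] = weight

    def weight(self, a, b):
        # Returns None when there is no link from a to b.
        return self.links.get((a, b))


undirected = Graph(directed=False)
undirected.add_link("A", "B", weight=0.8)

directed = Graph(directed=True)
directed.add_link("A", "B", weight=0.8)
```

In the undirected graph the link can be read from either end, while in the directed one it only exists from `"A"` to `"B"`. A drawing tool could map `weight` to line thickness.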

## A toy example

To illustrate graph theory, we start with a simple example. We take six MSCI indices: Asia, Europe, World, Emerging, Japan and U.S.A. Then we calculate the correlation between every pair of them to define their relationships. Note that in this example the union follows a rule based on a single statistic.

We represent these indices and their links as a graph. Note that this is not a directed graph because the union goes in both directions: correlation links one point with another and vice versa. To decide how strong each union is, we use the correlation calculated from 2005 until October 2016.
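The idea of using correlation as the link strength can be sketched as follows. This is not the code behind the graph in this post: the helper names `pearson` and `correlation_links` are hypothetical, and the return series are invented numbers, not real MSCI data.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def correlation_links(returns):
    """Return {frozenset({a, b}): correlation} for every pair of series."""
    return {
        frozenset((a, b)): pearson(returns[a], returns[b])
        for a, b in combinations(returns, 2)
    }


# Invented numbers for illustration only, not real MSCI returns.
returns = {
    "Europe": [0.01, -0.02, 0.015, 0.00, -0.01],
    "World": [0.012, -0.018, 0.01, 0.002, -0.008],
    "Japan": [-0.005, 0.01, 0.02, -0.015, 0.003],
}
links = correlation_links(returns)
```

Each link is keyed by a `frozenset` of two names, so the pair has no order, which matches the undirected nature of correlation.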

The following interactive graph shows the six indices and the strength of the unions between them. If you hover the mouse over a node, you will see the name of the index it represents. In addition, if you click on a node and drag it, you will see how the indices move while keeping their links.

## More data in a graph

When you have little data, graph theory is not very useful, or at least you do not take advantage of all its power.

Now we take the S&P 500 index and its components. We will use graph theory to show how the relationships between the components have changed over the years.

To simplify the illustration, we choose only 100 of the 500 components each year. This subset is selected by market capitalization, that is, we choose the 100 assets with the largest market cap. If the correlation between two assets is over 0.6, we link them; otherwise, there is no union between them.
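The selection and linking rule described above could look roughly like this. It is a sketch under stated assumptions: the function `thresholded_links`, the tickers `AAA` to `DDD`, and all the numbers are hypothetical, and the example keeps the top 3 assets instead of 100 so that it stays small.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def thresholded_links(returns, market_cap, top_n=100, threshold=0.6):
    """Keep the top_n assets by market cap and link pairs above threshold."""
    selected = sorted(market_cap, key=market_cap.get, reverse=True)[:top_n]
    links = {}
    for a, b in combinations(selected, 2):
        rho = pearson(returns[a], returns[b])
        if rho > threshold:
            # Only sufficiently correlated pairs get a link.
            links[frozenset((a, b))] = rho
    return selected, links


# Hypothetical tickers and invented numbers, for illustration only.
market_cap = {"AAA": 500.0, "BBB": 300.0, "CCC": 200.0, "DDD": 50.0}
returns = {
    "AAA": [0.01, -0.02, 0.015, 0.00, -0.01],
    "BBB": [0.012, -0.018, 0.01, 0.002, -0.008],
    "CCC": [-0.005, 0.01, 0.02, -0.015, 0.003],
    "DDD": [0.01, -0.02, 0.015, 0.00, -0.01],
}
selected, links = thresholded_links(returns, market_cap, top_n=3)
```

Running the same function once per year, with that year's market caps and returns, would give one graph per year to compare.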

As you can see in the following interactive graphs, the relationships between the most important companies in the U.S.A. have changed in recent years.

**Relationship in 2005**

**Relationship in 2015**

Although we have focused on the visualization side of graph theory, it can also be used in finance to analyse how strong the relationship between variables is. Moreover, it shows how the relationships between the assets are structured.