Artificial Intelligence

Geometrical evaluation of Generative Adversarial Networks

Gustavo Vargas

24/07/2019

Generative Adversarial Networks are a powerful tool for generating synthetic samples. Visual inspection has traditionally been used to measure their performance. However, it is hard to judge by eye whether a time series looks realistic or not! Which methodology can be used then? Topological tools can give us some insight into sample quality, since they let us compare the geometrical properties of the synthetic series with those of the original ones. In this post we will introduce one of the most relevant metrics: the Mean Relative Living Times.

The problem of measuring synthetic data

In a previous post, we saw how Generative Adversarial Networks (GANs) are a great Deep Learning tool for capturing the properties of a target distribution. For images in particular they have done a great job, generating images barely distinguishable from real samples.

This visual inspection is, indeed, the acid test for image generation. But with time series this kind of test is not enough, as our capacity to recognize linear shapes is more limited. An approach that has been proposed (here or here) consists of setting up a supervised classification problem, applying it to real data, and then checking whether performance improves when the synthetic samples generated by our GAN are added to the test set.

Topology

Another kind of approach could be considered. What we want is to check whether the samples our GAN has generated are similar to the real data, and similarity can be tackled with tools from topology. We can build a topological structure, the Čech complex, by placing a ball of diameter \(t\) at each point of the dataset. Wherever balls intersect, their centers are joined into a 1-simplex, a 2-simplex and so on, and holes in the resulting structure can be detected. As \(t\) grows, some holes will appear and some will disappear. One way to represent this is the persistence barcode, where we can see the birth and death of every hole.
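
To make the construction concrete, here is a minimal sketch of its first step only: finding which pairs of balls of diameter t overlap, and counting the connected pieces that result. For pairs, two balls of diameter t intersect exactly when their centers are at most t apart. Higher simplices and hole detection need more machinery and are not covered here. The function names and the toy square are illustrative assumptions, not part of the original post.

```python
from itertools import combinations
from math import dist


def edges_at_scale(points, t):
    """Pairs (i, j) whose balls of diameter t overlap, i.e. distance <= t."""
    return [(i, j) for i, j in combinations(range(len(points)), 2)
            if dist(points[i], points[j]) <= t]


def count_components(n_points, edges):
    """Number of connected components, via a small union-find."""
    parent = list(range(n_points))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in edges:
        parent[find(i)] = find(j)
    return len({find(i) for i in range(n_points)})


# Four corners of a unit square (toy data, not from the post).
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
for t in (0.5, 1.0, 1.5):
    edges = edges_at_scale(square, t)
    print(t, len(edges), count_components(len(square), edges))
```

At t = 1.0 the four sides of the square are connected and form a loop, which is the kind of hole a persistence barcode keeps track of.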

Figure 1. (a) Čech complex (b) Persistence barcode. Source: Quantdare

Mean Relative Living Times

The barcode gives us interesting information: some holes are more important than others because they “live longer”, that is, they are characteristic of the topological structure we are dealing with. For each number of holes, we can take the ratio between the total amount of time during which exactly that number of holes was present and \(t_{max}\), the largest value of \(t\) considered, obtaining the Relative Living Times (RLT). We can interpret this outcome as the confidence rate for a certain number of holes to be characteristic of the underlying data manifold.
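
As an illustration of this ratio, the following sketch computes the RLT from a barcode given as a list of (birth, death) pairs. It assumes a hole counts as alive on the half-open interval [birth, death) and that bars are clipped to [0, t_max]. The function name and the toy barcode are made up for the example.

```python
from collections import defaultdict


def relative_living_times(barcode, t_max):
    """Fraction of [0, t_max] during which exactly k holes are alive.

    `barcode` is a list of (birth, death) pairs; a hole is treated as
    alive on the half-open interval [birth, death).
    """
    # Clip each bar to [0, t_max] and drop bars that end up empty.
    bars = [(max(b, 0.0), min(d, t_max)) for b, d in barcode]
    bars = [(b, d) for b, d in bars if b < d]

    # Between consecutive breakpoints the number of live holes is constant.
    cuts = sorted({0.0, t_max, *(x for bar in bars for x in bar)})
    time_with = defaultdict(float)
    for left, right in zip(cuts, cuts[1:]):
        mid = (left + right) / 2
        alive = sum(b <= mid < d for b, d in bars)
        time_with[alive] += right - left

    return {k: time_with[k] / t_max for k in sorted(time_with)}


# Toy barcode (made up): one long-lived hole and one short-lived one.
toy_barcode = [(0.2, 0.9), (0.4, 0.5)]
print(relative_living_times(toy_barcode, t_max=1.0))
```

For this toy barcode, one hole is present for about 60% of the time, zero holes for about 30% and two holes for about 10%, so "one hole" is the most characteristic count.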

As we want a topological rather than a geometrical approximation, we won’t use all the points of the dataset. Instead, we randomly choose a fixed number of landmarks, calculate the RLT for this sample, and repeat the process several times. Finally, we average the results to obtain the Mean Relative Living Times (MRLT). As they add up to 1, we can interpret them as a probability distribution.
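
The sampling-and-averaging loop can be sketched as follows. This is only an outline under stated assumptions: `rlt_of_sample` is a hypothetical, caller-supplied function that turns a list of landmark points into an RLT dictionary, and landmarks are drawn uniformly without replacement. It is not the implementation by Khrulkov and Oseledets.

```python
import random
from collections import defaultdict


def mean_relative_living_times(points, rlt_of_sample, n_landmarks,
                               n_repeats, seed=0):
    """Average RLTs over repeated random landmark samples.

    `rlt_of_sample` is a caller-supplied (hypothetical) function mapping a
    list of landmark points to a dict {number of holes: fraction of time}.
    """
    rng = random.Random(seed)
    totals = defaultdict(float)
    for _ in range(n_repeats):
        landmarks = rng.sample(points, n_landmarks)  # without replacement
        for holes, fraction in rlt_of_sample(landmarks).items():
            totals[holes] += fraction
    return {holes: totals[holes] / n_repeats for holes in sorted(totals)}


# Stand-in RLT function for demonstration only: it ignores the topology.
def fake_rlt(landmarks):
    return {0: 0.25, 1: 0.75}


points = [(float(i), float(i % 7)) for i in range(100)]
mrlt = mean_relative_living_times(points, fake_rlt, n_landmarks=10, n_repeats=5)
print(mrlt)
```

Because every individual RLT sums to 1, so does their average, which is what allows the MRLT to be read as a probability distribution.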

But, does it work?

In figure 2, Khrulkov and Oseledets apply this method to five 2D datasets with 5000 points each. We can see that the resulting distributions correctly identify the number of holes. As a sanity check, this seems pretty good. But could this work on time series?

Figure 2. Mean Relative Living Times (MRLT) for various 2D datasets. The number of one-dimensional holes is correctly identified in all cases. Source: Khrulkov & Oseledets, 2018.

Experiment

For this experiment, we are going to use the daily returns of EUR/USD data, from 1999 to 2010. We train a WGAN-GP for 100,000 epochs and generate 100 samples of length 1000:

Figure 3. EUR/USD. Source: Yahoo! Finance.
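
For reference, daily returns can be derived from a series of closing prices as in the sketch below. It assumes simple returns rather than log returns, since the post does not say which definition was used, and the prices are made-up values rather than actual EUR/USD quotes.

```python
def daily_returns(prices):
    """Simple returns r_t = p_t / p_{t-1} - 1 for consecutive prices."""
    return [curr / prev - 1.0 for prev, curr in zip(prices, prices[1:])]


# Made-up closing prices, not actual EUR/USD quotes.
closes = [1.10, 1.21, 1.089]
print(daily_returns(closes))
```

A series of n prices yields n - 1 returns, so a sample of length 1000 corresponds to 1001 consecutive prices.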

Now we have two datasets: the original EUR/USD returns and the ones generated by the WGAN. Are they similar? Using the code implemented by Khrulkov and Oseledets, we are going to compute the MRLT for our data.

Figure 4. EURUSD Mean Relative Living Times. Original and synthetic.

We had already tested this synthetic data in a supervised classification problem, where it worked properly, and the close match between these two distributions confirms that the shape of our generated data resembles that of the real data!
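
The comparison in Figure 4 is visual. One simple way to put a number on it, assumed here for illustration, is the squared Euclidean distance between the two MRLT distributions, where 0 means identical distributions. The function name and the values below are hypothetical and are not the data behind Figure 4.

```python
def mrlt_distance(mrlt_a, mrlt_b):
    """Squared Euclidean distance between two MRLT dicts {holes: probability}."""
    keys = set(mrlt_a) | set(mrlt_b)
    return sum((mrlt_a.get(k, 0.0) - mrlt_b.get(k, 0.0)) ** 2 for k in keys)


# Made-up distributions for illustration, not the values behind Figure 4.
real = {0: 0.7, 1: 0.25, 2: 0.05}
synthetic = {0: 0.65, 1: 0.3}
print(mrlt_distance(real, synthetic))
```

Hole counts missing from one of the distributions are treated as having probability 0, so the two dictionaries do not need to share the same keys.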

Topological data analysis can help us measure the performance of Generative Adversarial Networks, but a good result on this metric does not by itself guarantee good-quality generated data. GANs are a relatively young field of research but a promising field of application, so we expect a lot of improvements in this area in the years to come. Stay tuned for GAN news!