
Mitigating overfitting on Financial Datasets with Generative Adversarial Networks

Fernando De Meer



What good is synthetic data in a financial setting? It is a valid question, given that data augmentation techniques can be hard to evaluate and the time series they produce are very complex. As we will see in this post, however, synthetic series can be very useful, especially for mitigating overfitting in financial settings. Following up on our latest posts about GANs (see here, here and here), we want to close the series by showing a key application of synthetic time series.

Mitigating overfitting step by step

The technique is called “Train on Synthetic and Test on Real”. It was initially proposed for time series data in this paper, and it consists of training a deep learning model on synthetic data and evaluating it on real data. In the paper, the authors use the technique to measure the quality of their generated samples (if synthetic data is realistic enough, it should be able to train a deep learning model as well as the real data does), but for us it will also be a valuable application in its own right. In financial settings, the time series we work with are complex: they are heavily non-stationary and seemingly chaotic. Because of this, when training a deep model in a financial setting, training and test data tend to have very different structures, and the model is prone to overfitting. If, however, we manage to train the model on a much larger number of training samples, it should find a weight configuration that accommodates a much wider variety of financial scenarios. That is indeed what happens: in the experiment we’re going to present, the model improves its accuracy and robustness when synthetic data is added to its training set.

First, we present the deep learning model we’re going to employ, the ResNet.


The ResNet (Residual Network) presented in [1] is a neural network designed for Time Series Classification (TSC). Its main characteristic is the shortcut residual connection between consecutive convolutional layers: a linear shortcut links the output of a residual block to its input, letting the gradient flow directly through these connections. This makes a ResNet much easier to train by reducing the vanishing gradient effect.
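To see why the shortcut helps, here is a short worked derivation. It assumes an identity shortcut and writes the stacked convolutions of one block as a function F; both the notation and the identity assumption are ours for illustration, not taken from [1].

```latex
y = x + F(x)
\qquad\Longrightarrow\qquad
\frac{\partial \mathcal{L}}{\partial x}
  = \frac{\partial \mathcal{L}}{\partial y}\left(I + \frac{\partial F}{\partial x}\right)
  = \frac{\partial \mathcal{L}}{\partial y}
  + \frac{\partial \mathcal{L}}{\partial y}\,\frac{\partial F}{\partial x}
```

The first term reaches x without passing through any convolution, so part of the gradient survives even when the Jacobian of F is small.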

The network is composed of three residual blocks followed by a Global Average Pooling layer and a final softmax classifier, whose number of neurons is equal to the number of classes in the dataset. Each residual block consists of three convolutions whose output is added to the residual block’s input and then fed to the next layer. All convolutions use 64 filters, and each is followed by a batch normalization operation and then a ReLU activation. Within each residual block, the filter lengths are 8, 5 and 3 for the first, second and third convolution, respectively.
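As a rough illustration of the description above, the following NumPy sketch runs a forward pass through such a network with random, untrained weights. It is not the code from our repository. All function names are ours, the batch normalization is simplified (no learned scale or shift), and the 1x1 convolution on the shortcut of the first block is an assumption we make so that a single-channel input can be added to a 64-channel output.

```python
import numpy as np

rng = np.random.default_rng(0)


def conv1d_same(x, w):
    """'Same'-padded 1D convolution.

    x: (batch, time, in_channels); w: (kernel, in_channels, out_channels).
    """
    k = w.shape[0]
    left = (k - 1) // 2
    xp = np.pad(x, ((0, 0), (left, k - 1 - left), (0, 0)))
    t = x.shape[1]
    # Sum the contribution of each kernel tap.
    return sum(xp[:, i:i + t, :] @ w[i] for i in range(k))


def batch_norm(x, eps=1e-5):
    """Simplified batch norm: per-channel standardization, no learned parameters."""
    mean = x.mean(axis=(0, 1), keepdims=True)
    var = x.var(axis=(0, 1), keepdims=True)
    return (x - mean) / np.sqrt(var + eps)


def init_conv(kernel, in_ch, out_ch):
    scale = np.sqrt(2.0 / (kernel * in_ch))
    return rng.normal(0.0, scale, size=(kernel, in_ch, out_ch))


def init_block(in_ch, n_filters=64, kernel_sizes=(8, 5, 3)):
    convs, ch = [], in_ch
    for k in kernel_sizes:
        convs.append(init_conv(k, ch, n_filters))
        ch = n_filters
    # Assumption: project the shortcut with a 1x1 convolution when channels differ.
    shortcut = init_conv(1, in_ch, n_filters) if in_ch != n_filters else None
    return convs, shortcut


def residual_block(x, convs, shortcut):
    h = x
    for w in convs:
        h = np.maximum(batch_norm(conv1d_same(h, w)), 0.0)  # conv -> BN -> ReLU
    skip = x if shortcut is None else conv1d_same(x, shortcut)
    return h + skip  # residual connection


def resnet_forward(x, blocks, w_out, b_out):
    h = x
    for convs, shortcut in blocks:
        h = residual_block(h, convs, shortcut)
    pooled = h.mean(axis=1)  # global average pooling over the time axis
    logits = pooled @ w_out + b_out
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    probs = np.exp(logits)
    return probs / probs.sum(axis=1, keepdims=True)


n_classes = 2
blocks = [init_block(1), init_block(64), init_block(64)]
w_out = rng.normal(0.0, 0.1, size=(64, n_classes))
b_out = np.zeros(n_classes)

x = rng.normal(size=(4, 30, 1))  # 4 toy series of 30 values each
probs = resnet_forward(x, blocks, w_out, b_out)
print(probs.shape)
```

Each row of `probs` holds the class probabilities for one input series. In practice you would build the same architecture in a deep learning framework so that the weights can be trained.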


ResNet architecture illustrated.

In a thorough empirical comparison of many TSC models (see [2]), the ResNet was found to be the best performer across a wide variety of datasets. For this reason, we choose it for our experiment.

Target Application: Forecasting VIX peaks

The VIX is already an old friend in this series of posts. We have already discussed its unique dynamics and how to generate synthetic VIX price paths, so it’s time to put those synthetic series to work.

The VIX tends to be mean-reverting and stays in the 10-15 point range for long periods of time, so options with strike prices in this range are very popular. Call options, in particular, are very competitively priced because of this mean-reverting behaviour, so being able to predict VIX peaks could be very valuable. Given the last 30 prices of the VIX, can the ResNet predict whether the VIX will be above or below 15 points 20 trading days from today?

First, we train in the classic way, splitting the dataset into a training set from 01/02/2004 to 10/20/2015 and a test set from 10/21/2015 to 04/04/2019. We split each set into periods of 30 days (non-overlapping in the training set and overlapping in the test set) and then divide the samples into two classes:

  1. Class 0: the VIX price 20 days after the last day of the period was below 15.
  2. Class 1: the VIX price 20 days after the last day of the period was above 15.
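The windowing and labelling just described can be sketched in a few lines of plain Python. This is a minimal illustration rather than the code from our repository. The function name `make_windows`, the toy price series, and the choice to send a price of exactly 15 to class 0 are all assumptions made for the example.

```python
def make_windows(prices, window=30, horizon=20, threshold=15.0, stride=30):
    """Cut a price series into fixed-length windows and label each one.

    The label looks `horizon` days past the last day of the window:
    1 if that price is above `threshold`, 0 otherwise (assumption:
    a price exactly at the threshold is assigned to class 0).
    """
    samples, labels = [], []
    last_start = len(prices) - window - horizon
    for start in range(0, last_start + 1, stride):
        end = start + window  # exclusive index of the window
        target = prices[end - 1 + horizon]
        samples.append(prices[start:end])
        labels.append(1 if target > threshold else 0)
    return samples, labels


# Toy series standing in for real VIX closes (made-up values).
prices = [10.0 + (i % 12) for i in range(100)]

# Non-overlapping windows for training, overlapping windows for testing.
train_x, train_y = make_windows(prices, stride=30)
test_x, test_y = make_windows(prices, stride=1)
print(len(train_x), len(test_x))
```

Setting `stride` equal to the window length gives non-overlapping periods, while a stride of 1 gives the maximally overlapping ones. In a real run the two calls would of course be applied to the separate training and test date ranges.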

We conduct training for 2000 iterations and obtain the following accuracy curves and confusion matrix.

Confusion matrix and accuracy plot resulting from training on real data only.

The accuracy stabilizes close to 60% and the curve suggests some overfitting to the training data. These results are consistent with what we discussed earlier: financial time series are ever-changing by nature, so overfitting is to be expected. The sudden drops in accuracy on the test set are, however, particularly worrisome, especially those that do not coincide with a drop in performance on the training set, as they hint that our model may not generalize well to unseen data.

To mitigate this overfitting, we now generate synthetic VIX price series so that the ResNet is presented with a larger number of scenarios during training. To do this, we train the WGAN-GP from our earlier post using training data only (meaning the GAN is never in contact with test data) and then generate 1000 synthetic scenarios to enlarge the training set. Finally, we train the ResNet on the enlarged training set and get the following accuracy curve and confusion matrix:
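The enlargement step itself is simple, and the sketch below shows one way to do it. It assumes the synthetic scenarios have already been turned into labelled samples in the same format as the real ones; the function name and the toy data are hypothetical. The important point is that only the training set grows, while the test set stays purely real.

```python
import random


def enlarge_training_set(real_x, real_y, synth_x, synth_y, seed=0):
    """Concatenate real and synthetic samples and shuffle them together."""
    x = list(real_x) + list(synth_x)
    y = list(real_y) + list(synth_y)
    order = list(range(len(x)))
    random.Random(seed).shuffle(order)  # keep each sample paired with its label
    return [x[i] for i in order], [y[i] for i in order]


# Toy stand-ins: 4 "real" and 6 "synthetic" samples (made-up values).
real_x = [[float(i)] * 3 for i in range(4)]
real_y = [0, 1, 0, 1]
synth_x = [[100.0 + i] * 3 for i in range(6)]
synth_y = [i % 2 for i in range(6)]

big_x, big_y = enlarge_training_set(real_x, real_y, synth_x, synth_y)
print(len(big_x))
```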

Confusion matrix and accuracy plot resulting from training on real and synthetic data.


We can see that the enlarged training set improves both the overall accuracy, which now stabilizes around 70%, and the robustness of the model, as the accuracy on the test set no longer shows sudden drops. The performance on class 1 improves significantly as well, rising from 42% to 53%. The synthetic scenarios produced by our GAN mitigate overfitting because they force the model to compromise during training in order to accommodate a much larger set of scenarios, which makes it generalize better. This technique does not depend on the structure of the time series we’re dealing with, so it is potentially applicable to a wide variety of assets.
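For readers who want to compute this kind of per-class figure themselves, here is a small sketch. It assumes that "performance on a class" means the fraction of samples of that class that were classified correctly, and that rows of the confusion matrix are true classes and columns are predictions. The counts below are made up for illustration and are not the results of our experiment.

```python
def per_class_accuracy(confusion):
    """Fraction of each true class (row) that was predicted correctly."""
    return [
        row[i] / sum(row) if sum(row) else 0.0
        for i, row in enumerate(confusion)
    ]


def overall_accuracy(confusion):
    correct = sum(row[i] for i, row in enumerate(confusion))
    total = sum(sum(row) for row in confusion)
    return correct / total


# Hypothetical counts: rows = true class 0 / 1, columns = predicted class 0 / 1.
confusion = [
    [90, 10],
    [30, 20],
]

class_acc = per_class_accuracy(confusion)
total_acc = overall_accuracy(confusion)
print(class_acc, total_acc)
```

With unbalanced classes, the overall accuracy can look healthy while the minority class is mostly missed, which is why it is worth reading the two numbers side by side.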

An article including this experiment has been submitted to The Journal of Financial Data Science. In it, we give a broader overview of the GAN methodology and argue why it is a useful technique. It is still under review, but hopefully it will be published soon. We have also uploaded all the code needed to reproduce this experiment to the following GitHub Repository. There you will find the ResNet along with the real and synthetic VIX time series, so feel free to play around with them. Any feedback is appreciated!

Closing up, I want to end this post with an artistic touch. If you recall, the Generator of the GAN is, at the end of the day, a continuous function, so if we move along a path in latent space we can generate series in a “continuous” manner. This is called a latent space interpolation and leads to videos as pretty as the following:

Interpolation video of VIX synthetic samples
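If you want to try a latent space interpolation yourself, the idea can be sketched as follows. The `toy_generator` below is only a placeholder for a trained Generator (any continuous function of the latent vector will do for the demonstration), and the straight-line path, the latent dimension and the number of frames are arbitrary choices of ours.

```python
import math
import random


def lerp(z0, z1, t):
    """Point at fraction t along the straight line from z0 to z1."""
    return [(1.0 - t) * a + t * b for a, b in zip(z0, z1)]


def toy_generator(z, length=30):
    """Placeholder for a trained Generator: a continuous map from z to a series."""
    return [
        12.0 + sum(zk * math.sin((k + 1) * math.pi * t / length)
                   for k, zk in enumerate(z))
        for t in range(length)
    ]


def interpolate(generator, z0, z1, n_frames=10):
    """Generate one series per step along the latent path from z0 to z1."""
    return [
        generator(lerp(z0, z1, i / (n_frames - 1)))
        for i in range(n_frames)
    ]


rnd = random.Random(0)
z_start = [rnd.gauss(0.0, 1.0) for _ in range(5)]
z_end = [rnd.gauss(0.0, 1.0) for _ in range(5)]

frames = interpolate(toy_generator, z_start, z_end, n_frames=10)
print(len(frames), len(frames[0]))
```

Plotting the frames one after another gives the animation. Because the generator is continuous, consecutive frames differ only slightly, which is what makes the series appear to morph smoothly.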

Thanks for reading!