Generating OHLC bars with Generative Adversarial Networks

Gustavo Vargas

30/01/2020

Open-High-Low-Close (OHLC) bars are a type of financial data typically used to represent daily movements in the price of a financial instrument. Compared with line charts, they give us more information about certain characteristics of the series, such as intraday volatility or daily momentum. Could Generative Adversarial Networks learn to generate series with the underlying structure of OHLC bars? If so, such series could help mitigate overfitting in a classification problem. That is what we will try to do in this post!

Open-High-Low-Close Bars

Each OHLC bar consists of four prices for a given period: the open, high, low, and close. We are going to try to generate synthetic OHLC bars using Generative Adversarial Networks (GANs). GANs are a deep learning approach in which two neural networks compete against each other in order to generate new data that can pass for real data, that is, data that appears to come from the original distribution.

In this article, we are going to use EURUSD OHLC prices. We first train a WGAN-GP on a base dimension and then a Relativistic GAN on the remaining dimensions (you can find more information about this approach in the paper “Enriching Financial Datasets with GANs” and in the post “Generating Financial Series with Generative Adversarial Networks. Part 2”).

First, the Open and Close prices are straightforward to generate together, since each Open price is just the closing price of the previous day. Next, we calculate the Close-Close, Close-High and Close-Low return series. We then train a WGAN-GP on the Close-Close series alone, which generates synthetic Close-Close return series. Now we can use our Relativistic GAN, with the synthetic Close-Close series as the base dimension and Close-High and Close-Low as secondary dimensions. Finally, we assemble the OHLC price bars from the generated synthetic return series.
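To make the first and last steps of this pipeline concrete, here is a minimal sketch of how the three return series could be computed from OHLC bars and how bars could be reassembled from returns. The post does not spell out the exact return definitions, so this sketch assumes simple returns measured against the previous day's close. The function names `ohlc_to_returns` and `returns_to_ohlc` are hypothetical, and the clamping of the high and low is an extra safeguard we add for illustration, not something taken from the original method.

```python
def ohlc_to_returns(bars):
    """Convert OHLC bars into Close-Close, Close-High and Close-Low returns.

    bars: list of (open, high, low, close) tuples, oldest first.
    Assumption: every return is a simple return relative to the previous close.
    """
    cc, ch, cl = [], [], []
    for prev, cur in zip(bars, bars[1:]):
        prev_close = prev[3]
        _, high, low, close = cur
        cc.append(close / prev_close - 1.0)
        ch.append(high / prev_close - 1.0)
        cl.append(low / prev_close - 1.0)
    return cc, ch, cl


def returns_to_ohlc(start_close, cc, ch, cl):
    """Rebuild OHLC bars from the three return series.

    start_close: closing price of the day before the first generated bar.
    Each open is set to the previous close, as described in the text.
    """
    bars = []
    prev_close = start_close
    for r_cc, r_ch, r_cl in zip(cc, ch, cl):
        open_ = prev_close
        close = prev_close * (1.0 + r_cc)
        # Assumed safeguard: generated returns may be inconsistent, so keep
        # the high above and the low below both the open and the close.
        high = max(prev_close * (1.0 + r_ch), open_, close)
        low = min(prev_close * (1.0 + r_cl), open_, close)
        bars.append((open_, high, low, close))
        prev_close = close
    return bars
```

Applied to real bars whose opens equal the previous closes, `returns_to_ohlc(bars[0][3], *ohlc_to_returns(bars))` should reproduce `bars[1:]` up to floating-point error, which is a handy sanity check before feeding synthetic returns through the same function.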

Quantitative Analysis

One way to verify the quality of the generated Open-High-Low-Close bars is the “Train on Synthetic, Test on Real” approach. In this spirit, we set out to forecast a certain condition of the EURUSD.

We want to predict, on a daily basis, whether the absolute return of the following day will exceed a certain threshold. As input for the predictor model we take the last 90 Close-Close, Close-High and Close-Low returns of the asset, and we try to predict whether the absolute value of the Close-Close return on day 91 will be higher or lower than a threshold of 0.005 (0.5%). For this, we use a ResNet as the classifier.
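The way inputs and labels are built from the return series can be sketched as follows. This is only an illustration of the windowing described above: the function name `make_dataset` and the choice of label 1 for "absolute return above the threshold" are our assumptions, and the ResNet itself is not shown.

```python
WINDOW = 90        # number of past days used as input
THRESHOLD = 0.005  # 0.5% threshold on the absolute Close-Close return


def make_dataset(cc, ch, cl, window=WINDOW, threshold=THRESHOLD):
    """Build (sample, label) pairs from three aligned return series.

    Each sample is a list of `window` (cc, ch, cl) triples.
    The label is 1 if the absolute Close-Close return of the next day
    exceeds `threshold`, and 0 otherwise (assumed label convention).
    """
    samples, labels = [], []
    for start in range(len(cc) - window):
        end = start + window
        features = [(cc[i], ch[i], cl[i]) for i in range(start, end)]
        samples.append(features)
        labels.append(1 if abs(cc[end]) > threshold else 0)
    return samples, labels
```

Each sample then has shape 90 x 3, which can be passed to a one-dimensional convolutional classifier with three input channels.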

Training the model on the real series gives the following accuracy curve:

Figure 1: Original accuracy curve

Training the model on an enlarged training dataset, which adds the return series obtained from our synthetic OHLC bars, gives the following accuracy curve:

Figure 2: Synthetic accuracy curve
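For readers who want to reproduce this kind of experiment, the enlarged training set could be assembled along these lines. This is a sketch under our own assumptions: the post does not state the split proportions, so the chronological 80/20 split, the shuffling seed and the function name `build_augmented_split` are hypothetical. The key point it illustrates is that synthetic samples are added to the training set only, while the test set stays purely real.

```python
import random


def build_augmented_split(real_x, real_y, synth_x, synth_y,
                          test_fraction=0.2, seed=0):
    """Return ((train_x, train_y), (test_x, test_y)).

    The last `test_fraction` of the real samples (in time order) is held
    out for testing. Synthetic samples are only ever added to training.
    """
    n_test = int(len(real_x) * test_fraction)
    split = len(real_x) - n_test

    train_x = list(real_x[:split]) + list(synth_x)
    train_y = list(real_y[:split]) + list(synth_y)
    test_x = list(real_x[split:])
    test_y = list(real_y[split:])

    # Shuffle the training pairs together so real and synthetic samples mix.
    order = list(range(len(train_x)))
    random.Random(seed).shuffle(order)
    train_x = [train_x[i] for i in order]
    train_y = [train_y[i] for i in order]
    return (train_x, train_y), (test_x, test_y)
```

Keeping the test set free of synthetic data is what lets the accuracy curve say something about how well the classifier handles real prices.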

Conclusions

We can observe in Figure 2 that our synthetic series add stability to the training of the ResNet, which in turn suggests that they are “realistic”. This stability of the training curve is a very desirable property, especially in a financial setting such as this one, because it suggests that our ResNet will generalize better to unseen data (i.e., the future). In upcoming posts, we are going to show how these kinds of predictors can lead to very interesting strategies. Stay tuned!