Artificial neural networks are mathematical models that try to reproduce the function of the nervous system: they capture information, process it and generate a response adapted to each situation. This computational technique is used today in many fields: medical applications, market and customer analysis, data mining, industrial process optimisation, financial and economic modelling, and more. Its appeal lies in its ability to find nonlinear relationships between any combination of market data, technical indicators and key factors.
Artificial neurons
As in our human nervous system, the network’s main element is the neuron.
It’s a simple processing element: from an input vector (input information) it produces a single output (output information).
The output produced by a neuron \(i\) at a given moment in time \(t\) can be written as: $$y_{i}(t)=F_{i}\bigl(f_{i}\bigl[a_{i}(t-1),\,\sigma_{i}\bigl(w_{ij},x_{j}(t)\bigr)\bigr]\bigr)$$
Set of inputs, \(x_j(t)\): from outside or from other neurons.
Synaptic Weights, \( w_{ij}\): degree of connection between neurons \(i, j\). It can be positive, negative or zero (no connection). By adjusting these weights the neuron is able to adapt to any environment.
Propagation Rule, \(\sigma_i(w_{ij},x_j(t))\): combines the inputs and determines the potential \(h_i(t)\) resulting from the interaction of neuron \(i\) with its neighbouring neurons. The most common and simplest rule is the sum of the inputs weighted by their corresponding synaptic weights: \(h_i(t)=\sum_j w_{ij}\cdot x_j(t)\).
Activation Function, \(f_i(a_i(t-1),h_i(t))\): determines the current activation state \(a_i(t)\) of neuron \(i\). In most models the neuron's previous state (\(a_i(t-1)\)) is ignored. The most commonly used activation functions are the identity function, the step function, piecewise linear functions, the sigmoid and the sinusoidal.
Output Function, \(F_i(a_i(t))\): the current output of neuron \(i\). In general, the identity function is used.
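Putting the elements above together, a single neuron can be sketched in a few lines of Python (a minimal illustration, assuming the weighted-sum propagation rule, a sigmoid activation and the identity output function, with the previous state ignored):

```python
import math

def neuron_output(weights, inputs):
    """One artificial neuron: weighted-sum propagation rule,
    sigmoid activation, identity output function."""
    # Propagation rule: h_i = sum_j w_ij * x_j
    h = sum(w * x for w, x in zip(weights, inputs))
    # Activation function: sigmoid (previous state a_i(t-1) ignored)
    a = 1.0 / (1.0 + math.exp(-h))
    # Output function: identity, y_i = a_i
    return a

y = neuron_output([0.5, -0.3, 0.8], [1.0, 2.0, 0.5])
```

Here the weights and inputs are arbitrary example values; the output is always between 0 and 1 because of the sigmoid.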
Union of Neurons = Neural Networks
A neural network is nothing more than a union of several neurons. These neurons are combined in structures called layers.
Network architecture has no limits: it depends on the desired utility and complexity. You can include as many layers as you want, and each layer can contain any number of neurons.
The simplest structure is the single-layer network, in which the same neurons act as both input and output.
Normally, structures have multiple layers: an input layer (whose neurons receive information directly from the outside), at least one hidden layer (whose neurons receive information from other neurons and process it), and an output layer (whose neurons receive the processed information and return a response).
The connections within networks are usually unidirectional (feedforward networks), but you can also use output information to feed back the network (feedback networks).
The primary advantage of networks is that they are able to adapt their operation to different environments by modifying their inter-neuron connections.
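As an illustration, a forward pass through a small multilayer feedforward network might look like this (the weights here are arbitrary placeholders, not trained values):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward(layers, inputs):
    """Feedforward pass: each layer is a list of neurons,
    each neuron a list of weights (one per incoming value)."""
    activations = inputs
    for layer in layers:
        activations = [sigmoid(sum(w * a for w, a in zip(neuron, activations)))
                       for neuron in layer]
    return activations

# 2 inputs -> hidden layer of 2 neurons -> 1 output neuron
network = [
    [[0.5, -0.4], [0.3, 0.8]],  # hidden layer weights
    [[1.2, -0.7]],              # output layer weights
]
out = forward(network, [1.0, 0.5])
```

Changing the nested lists of weights is all it takes to change the architecture, which is exactly the adaptability described above.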
How do they work?
There are two phases:
1. Learning/Training Phase: The network is trained to perform a processing type. It learns the relationship between the presented data to return a certain output in each situation.
Starting from a set of random synaptic weights, we look for a set of weights that allow the network to correctly perform a given task. This phase, therefore, includes the process of setting parameters. It’s an iterative process, in which the solution is refined until a sufficiently good level of operation is achieved. Most training methods consist of proposing an error function that measures the current performance of the network as a function of the synaptic weights. The goal of the method is to find the set of synaptic weights that minimise (or maximise) the function.
2. Operation/Execution Phase: This consists of evaluating the behaviour of the network against previously unseen patterns. This phase is essential to prevent over-learning (to read more about why over-learning – also called overfitting – is a problem, take a look at this video from Coursera). In most cases, the learning process is allowed to progress to a reasonable error level, periodically saving the different intermediate configurations and then selecting the one with the lowest evaluation error.
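As a toy version of the learning phase, the sketch below trains a single linear neuron by gradient descent on a squared-error function, starting from random synaptic weights and refining them iteratively (a deliberately simplified stand-in for the full training methods mentioned above):

```python
import random

def train(samples, n_weights, lr=0.1, epochs=200):
    """Minimise the squared error E = (y - target)^2 / 2
    over the synaptic weights, starting from random values."""
    random.seed(0)
    w = [random.uniform(-1, 1) for _ in range(n_weights)]
    for _ in range(epochs):
        for x, target in samples:
            y = sum(wi * xi for wi, xi in zip(w, x))   # linear neuron
            err = y - target
            # Gradient step: dE/dw_j = err * x_j
            w = [wi - lr * err * xj for wi, xj in zip(w, x)]
    return w

# Learn y = 2*x1 - x2 from a few examples
data = [([1.0, 0.0], 2.0), ([0.0, 1.0], -1.0), ([1.0, 1.0], 1.0)]
weights = train(data, 2)
```

After training, the weights converge close to the true values (2, -1): the iterative refinement of an error function that the text describes.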
The use of networks
Some important points while designing the network are:
- Input Data: must consist of a representative number of samples, with sufficiently varied information, to avoid over-optimisation (overfitting). It's common to use 70% of the data to train the network, 20% to test the result and 10% as out-of-sample validation.
- Control the number of neurons and layers: too few, and the model stays too general (underfitting); too many, and it over-adjusts to the training data (overfitting).
- Chosen Functions: start with the simplest function and add complexity only as observations and further requirements demand.
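The 70/20/10 split mentioned above can be expressed as a small helper (a sketch; in practice the data is usually shuffled before splitting):

```python
def split_data(data, train=0.7, test=0.2):
    """Split a dataset into training, test and out-of-sample
    validation sets (70/20/10 by default, as in the text)."""
    n = len(data)
    n_train = int(n * train)
    n_test = int(n * test)
    return (data[:n_train],
            data[n_train:n_train + n_test],
            data[n_train + n_test:])

train_set, test_set, validation_set = split_data(list(range(100)))
```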
How do we use them?
In the financial world, neural networks are primarily used in two ways:
1. Price Prediction: using the price series (or other relevant information) to predict, for example, the day's closing price, the week's closing price, the next resistance or support, etc.
2. Give Purchase or Sale Signals: in this case there are usually two output neurons, one for each signal. A threshold is set in the activation function, which returns 1 or 0 depending on whether or not the threshold is exceeded.
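A sketch of that thresholding step, assuming two output activations between 0 and 1 (the function name and the 0.5 threshold are illustrative choices, not prescribed by the text):

```python
def trade_signals(buy_activation, sell_activation, threshold=0.5):
    """Two output neurons: each activation is compared to a
    threshold and mapped to 1 (signal on) or 0 (signal off)."""
    buy = 1 if buy_activation > threshold else 0
    sell = 1 if sell_activation > threshold else 0
    return buy, sell

signal = trade_signals(0.82, 0.31)   # buy neuron fires, sell neuron does not
```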
The next step is to put this technique into practice!
Do you fancy reading this post in Spanish?