Is it possible to classify and **predict** (yes, predict!) whether market trends will be **bullish**, **bearish** or **ranged** by using a method called “naïve” and based on something as simple as Bayes’ theorem? Let’s see!

## Our main objective

Here, we’re looking to explore machine learning techniques that can help us not only to label a series in an a posteriori analysis, but also to predict the class to which a new value of the series belongs.

The **Naïve Bayesian Classifier** is a supervised learning method of **machine learning**, as well as a statistical method for classification.

Although this method includes the word “naïve” in its name, it will be our chosen tool for predicting the different trends of a market represented by an index. Bayesian classification provides practical learning algorithms in which prior knowledge and observed **data can be combined**. It calculates explicit probabilities for hypotheses, and theory maintains that it is robust to noise in the input data.

## An extremely brief history-of-statistics lesson…

This classifier is named after Thomas Bayes (1702–1761), who proposed the famous *Bayes’ Theorem*, which constitutes the heart of the matter:
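In standard notation, for two events A and B, the theorem reads:

```latex
P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)}
```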

This theorem simply tells us how to invert the conditional probability of two events, assuming that the conditioning event, B, has a non-null probability of occurrence; that is, P(B) != 0.

We can trivially prove that if {C_{j}} is a set of n exclusive and exhaustive events, the formula becomes:
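Expanding the denominator by the law of total probability over the n events gives:

```latex
P(C_j \mid A) = \frac{P(A \mid C_j)\, P(C_j)}{\sum_{k=1}^{n} P(A \mid C_k)\, P(C_k)}
```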

From now on, C will be referred to as the class of a variable (this is what we want to predict) and A will represent the different attributes or characteristics measured that can be known every day, unlike variable C.

We want to maximize the probability P(C_{i}|A) for each class i. Therefore, the Naïve Bayesian Classifier solves the following problem, which can be derived from the previous formula (take into account that the divisor of the formula is common to every class):
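Since the denominator is the same for every class, maximizing the posterior reduces to picking the class with the largest product of prior and per-attribute likelihoods:

```latex
\hat{C} = \arg\max_{i} \; P(C_i) \prod_{j} P(A_j \mid C_i)
```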

where C_{i} is the class (i denotes the possible label it can show) and A_{j} is the j-attribute.

But why “**naïve**”? The name comes from the fact that, to obtain the previous formula, we are assuming that:
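That is, the attributes are conditionally independent given the class:

```latex
P(A_1, A_2, \ldots, A_n \mid C_i) = \prod_{j=1}^{n} P(A_j \mid C_i)
```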

Keep in mind the basic idea:

*Find out the probability of the previously unseen instance belonging to each class, then simply pick the most probable class.*

## The key to all the work behind this post

We are going to classify and predict classes of trends (**Upward**, **Downward** and **Ranged Market**, the possible values of our variable C in the above formula) by using historical data that combines how the real market behaved in the past (variable C) with what some models (the attributes) were indicating to us. These models could be betting on a bullish market, on a bearish market, or positioning themselves out of the market.

Here are more details to help understand our application of this classifier:

- __Attributes A_{j}__: The attributes will be the models mentioned before. But which models, and how many? We have developed a program built on 30 models based on different philosophies, each of which quantifies different aspects and leads us to a position (-1, 0 or 1) in the market. These models are assumed to be independent. Therefore, we will have 30 different positions each day.
- __Variable C__: This variable has to be estimated carefully. We want a measure to predict trends, because today we don’t know which trend we are living through. We’re going to make predictions by using a window of historical data. The “real values” of variable C (**Upward**, **Downward**, **Ranged Market**) in the window have been collected using an algorithm which works with future information, so every day, when we estimate probabilities, we have to delete “advanced” (unknown at the time) information. This means that every day we have a dynamic historical window, the length of which depends on the information we have to throw out in order to make fair estimations.
- __Which probabilities do we have to estimate?__ If today we have a new 30-tuple of positions of our attributes (models), and the tuple shows k different values (k <= 3, because the possible values are -1, 0 or 1), we have to estimate (every day) the following probabilities **for each class i** (i = **Upward**, **Downward** or **Ranged Market**):
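Concretely, writing today’s observed attribute tuple as (a_1, …, a_30), the quantities to estimate for each class are the class prior and the per-attribute likelihoods:

```latex
P(C_i) \qquad \text{and} \qquad P(A_j = a_j \mid C_i), \quad j = 1, \ldots, 30
```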

These probabilities are estimated by simply counting the corresponding frequencies in the historical window.
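A minimal Python sketch of this counting scheme. The function name, data layout and the Laplace smoothing constant `alpha` are my own additions (the post describes plain frequency counting; smoothing simply avoids zero counts when a position never appeared for a class in the window):

```python
import math
from collections import Counter, defaultdict

def predict_trend(history, today, alpha=1.0):
    """Naive Bayes prediction of today's trend class.

    history : list of (positions, trend) pairs, where `positions` is a
              tuple of model positions in {-1, 0, 1} and `trend` is one
              of "Upward", "Downward", "Ranged".
    today   : tuple of today's model positions.
    alpha   : Laplace smoothing constant (avoids zero frequencies).
    """
    # Class frequencies -> estimates of the priors P(C_i)
    class_counts = Counter(trend for _, trend in history)
    # likelihood_counts[cls][j] counts the values of attribute j within class cls
    likelihood_counts = defaultdict(lambda: defaultdict(Counter))
    for positions, trend in history:
        for j, pos in enumerate(positions):
            likelihood_counts[trend][j][pos] += 1

    n = len(history)
    best_class, best_score = None, float("-inf")
    for cls, count in class_counts.items():
        # Work in log space to avoid underflow when multiplying 30 likelihoods
        score = math.log(count / n)
        for j, pos in enumerate(today):
            num = likelihood_counts[cls][j][pos] + alpha
            den = count + alpha * 3  # 3 possible positions: -1, 0, 1
            score += math.log(num / den)
        if score > best_score:
            best_class, best_score = cls, score
    return best_class
```

Each day, `history` would be rebuilt from the dynamic window described above, and the class returned is the day’s bet.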

__And, finally…__ of the values calculated for each class, the class that shows the maximum value will constitute our bet. In other words, our prediction.

We have applied this to guess the trends of the stock *Google Inc* from October 2006 onwards, and the results look like this (much better than we expected!):

Taking a quick look, we notice that some downward trends have been predicted as upward; but although **we are not magicians**, the predictions obtained should not be disregarded. We cannot dismiss this method yet, because the agreement between our predictions and the labels produced with future information is around 51%, which is not an insignificant result, since this application does not cheat, as the other method does.