Clustering data into groups that share common characteristics can be very useful, but relying on human experts to perform this grouping is costly, and in many cases their decisions are influenced by emotions.

That is why clustering is one of the main topics of **Unsupervised Machine Learning**, the family of algorithms that does not require labels to find patterns in data.

We have shown how to use clustering techniques to find groups in the S&P500 components and how to reduce data dimensionality using autoencoders.

Why not combine both?

Here I try to combine both by using a **Fully Convolutional Autoencoder** to reduce the dimensionality of the S&P500 components, and then applying a classical clustering method, **KMeans**, to generate the groups.

## Why Fully Convolutional?

When using **fully connected** or **convolutional** autoencoders, it is common to find a **flatten** operation that converts the features into a 1D vector. This operation discards the spatial information present in the features.

A fully convolutional autoencoder avoids flattening the feature maps by using only convolutional layers throughout the network structure.
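A quick Keras sketch illustrates the difference (layer sizes here are arbitrary, just for demonstration): a `Conv1D`/`MaxPooling1D` stack keeps the `(length, channels)` axes, while `Flatten` collapses them into a single vector.

```python
from tensorflow.keras import layers, models

# A small conv stack keeps the (length, channels) structure of the series.
inp = layers.Input(shape=(256, 1))
x = layers.Conv1D(8, 3, padding="same", activation="relu")(inp)
x = layers.MaxPooling1D(2)(x)
conv_only = models.Model(inp, x)
print(conv_only.output_shape)   # (None, 128, 8) — spatial axis preserved

# Flatten collapses the spatial axis into one long vector.
flat = layers.Flatten()(x)
flattened = models.Model(inp, flat)
print(flattened.output_shape)   # (None, 1024) — spatial axis gone
```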

## Experiment

In order to check the feasibility of the proposal, we followed these steps:

- Get the last **256** prices of the S&P500 components (from **2019-10-15** to **2020-10-06**).
- Compute the **cumulative returns** of all the components and scale them.
- Train a Fully Convolutional Autoencoder and extract the **encoded features**.
- Perform **KMeans** clustering over the encoded features.
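The first two steps can be sketched as follows. Since the article does not show the data source, synthetic random-walk prices stand in for the real S&P500 download; the shapes (500 components × 256 returns) match the experiment.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

rng = np.random.default_rng(0)

# Synthetic stand-in for the real download: 500 components x 257 daily
# prices (257 prices yield 256 daily returns). In practice these would
# come from a market-data provider for 2019-10-15 to 2020-10-06.
prices = 100 * np.exp(np.cumsum(rng.normal(0, 0.01, size=(500, 257)), axis=1))

# Daily returns, then cumulative returns over the window.
returns = prices[:, 1:] / prices[:, :-1] - 1
cum_returns = np.cumprod(1 + returns, axis=1) - 1

# Scale each series independently to [0, 1] so the autoencoder sees
# comparable magnitudes (transpose because MinMaxScaler works per column).
scaled = MinMaxScaler().fit_transform(cum_returns.T).T
print(scaled.shape)  # (500, 256)
```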

### Network Structure

The network architecture has two parts:

- The **encoder** reduces the input size (**500×256**) by consecutively applying convolutions and max-pooling operations until reaching a smaller version (**500×2**).
- The **decoder** increases the encoded size (**500×2**) by consecutively applying upsampling operations and convolutions until reaching the input size (**500×256**).

The input length is **256** so it can be **downsampled** and **upsampled** by a factor of 2, *n* times.
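A minimal Keras sketch of this architecture (the filter counts are assumptions, since the article does not list them): each series of 256 returns is a 1-channel 1D signal, seven pooling stages reduce its length from 256 to 2, and a symmetric upsampling path reconstructs it.

```python
from tensorflow.keras import layers, models

inp = layers.Input(shape=(256, 1))

# Encoder: seven conv + pool stages halve the length 256 -> 2.
x = inp
for filters in (16, 16, 8, 8, 4, 4, 2):
    x = layers.Conv1D(filters, 3, padding="same", activation="relu")(x)
    x = layers.MaxPooling1D(2)(x)
encoded = layers.Conv1D(1, 3, padding="same", name="encoded")(x)

# Decoder: seven conv + upsampling stages double the length 2 -> 256.
x = encoded
for filters in (2, 4, 4, 8, 8, 16, 16):
    x = layers.Conv1D(filters, 3, padding="same", activation="relu")(x)
    x = layers.UpSampling1D(2)(x)
decoded = layers.Conv1D(1, 3, padding="same")(x)

autoencoder = models.Model(inp, decoded)
autoencoder.compile(optimizer="adam", loss="mse")

# The encoder alone maps each series to its two encoded values.
encoder = models.Model(inp, encoded)
print(encoder.output_shape)      # (None, 2, 1)
print(autoencoder.output_shape)  # (None, 256, 1)
```

After training the autoencoder on the scaled series, calling `encoder.predict` yields the 500×2 encoded matrix used for clustering.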

### Network performance

As an example, the Apple stock price (**AAPL**) is fed forward through the network, generating the following results:

The encoder transforms the 256 daily returns of the AAPL component into two values (left graph) and, from those, the decoder does its best to reconstruct the original series (right graph).

### Clustering

Finally, we perform the clustering over the encoded samples. But how many clusters does our data contain? That is where the “**elbow**” method comes into action.

The elbow method consists of running the clustering algorithm with a different number of clusters each time and calculating a metric called *inertia* (the within-cluster sum of squared distances).

There is no right or wrong number of clusters, but we should look for the value where the inertia, after decreasing steeply, produces an “elbow” in the graph.
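The elbow search can be sketched with scikit-learn. Since the real encoded matrix is not reproduced here, four synthetic 2D blobs stand in for the 500 encoded components, so the inertia curve should flatten around k = 4.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(42)

# Synthetic stand-in for the 500 encoded components (2 values each):
# four well-separated blobs of 125 points.
centers = np.array([[0, 0], [5, 5], [0, 5], [5, 0]])
encoded = np.vstack([c + rng.normal(0, 0.5, size=(125, 2)) for c in centers])

# Run KMeans for k = 1..8 and record the inertia of each fit.
inertias = []
for k in range(1, 9):
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(encoded)
    inertias.append(km.inertia_)
    print(k, round(km.inertia_, 1))

# Inertia always decreases as k grows; the "elbow" is where the drop
# levels off — here, between k = 4 and k = 5.
```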

For this problem we select **4 clusters**, leading to the following grouping of encoded components:

In order to check whether the clusters correspond to similar behaviours, we show their original evolution over time:

**Cluster 0** and **1** look very similar on visual inspection. That happens because the instances of clusters 0 and 1 are **very close** in the encoded space (similar behaviour).

As one might guess, cluster 3 corresponds to the big **tech and pharmaceutical** companies of the USA. All the components belonging to cluster 3 are the following:

'AAPL', 'ADBE', 'ADSK', 'AMZN', 'CTXS', 'EBAY', 'FAST', 'BIIB', 'LRCX', 'MSFT', 'NVDA', 'QCOM', 'REGN', 'NLOK', 'TSCO', 'VRTX', 'MNST', 'NFLX', 'NDAQ', 'APD', 'BBY', 'CLX', 'CAG', 'DHR', 'DVA', 'LLY', 'FCX', 'KR', 'LB', 'LOW', 'SPGI', 'MCO', 'NEM', 'ROK', 'TER', 'TMO', 'TIF', 'UNH', 'VAR', 'FMC', 'PKI', 'CRM', 'FB', 'BLK', 'ABBV', 'CMG', 'EA', 'HUM', 'KSU', 'LDOS', 'URI', 'ALB', 'ANSS', 'BIO', 'CDNS', 'EQIX', 'JKHY', 'MTD', 'QRVO', 'RMD', 'ROL', 'SWKS', 'SNPS', 'CNC', 'DPZ', 'MSCI', 'FTNT', 'FBHS', 'INCY', 'KHC', 'PYPL', 'ATVI', 'CCI', 'CHTR', 'SBAC', 'NOW', 'TMUS', 'DXCM', 'OTIS', 'CARR'

## Conclusions

Using a Fully Convolutional Autoencoder as a preprocessing step before clustering time series is useful to **remove noise** and extract **key features**, but condensing 256 prices into 2 values may be too restrictive.

There is some future work that might lead to better clustering:

- Generate encodings with **higher dimensionality**.
- Use more daily returns to capture past information.
- Apply a different clustering technique such as **DBSCAN** or **Spectral Clustering**.