Volatility is one of the best known and most widely used concepts in finance. Given a price series of a financial instrument, its volatility is defined as the dispersion of the returns. This measure is used to compare securities in terms of risk. But in order to compare, sometimes it is necessary to scale. In this post we explain the mathematical foundation of the annualized volatility.

Anybody with basic financial knowledge will tell you how to calculate the volatility of an instrument: **as the standard deviation of the daily returns**. Recall that the standard deviation, \(\sigma\), is the square root of the variance, \(\sigma^2\). Both are statistical measures of dispersion, as they quantify how much the observations of a series deviate from the mean.

$$

\begin{align}

\sigma^2 &=\frac{1}{n} \sum_{i=1}^n (x_i - \mu)^2,\\

\mu &= \frac{1}{n} \sum_{i=1}^n x_i .

\end{align}

$$
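As a minimal sketch of the two formulas above, the snippet below computes the mean and the standard deviation with the \(1/n\) normalisation and compares the result with the standard library. The function name `population_std` and the sample returns are hypothetical, chosen only for illustration.

```python
import math
import statistics


def population_std(xs):
    """Standard deviation with the 1/n normalisation used in the formula above."""
    n = len(xs)
    mu = sum(xs) / n                                # mean of the observations
    variance = sum((x - mu) ** 2 for x in xs) / n   # average squared deviation
    return math.sqrt(variance)


# Hypothetical daily returns (in percent), for illustration only.
returns = [0.5, -0.3, 0.1, -0.7, 0.4]
sigma = population_std(returns)

# statistics.pstdev uses the same 1/n convention, so both values should agree.
assert math.isclose(sigma, statistics.pstdev(returns))
print(sigma)
```

Note that many libraries default to the \(1/(n-1)\) sample estimator instead; for long return series the difference is negligible.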

## Annualized volatility: why \(\sqrt{T}\)?

In fact, most people know the recipe to **annualize** this **daily volatility**. The formula for the annualized volatility is easy to remember:

$$

\sigma_{1Y} = \sqrt{261} \sigma.

$$

Here 261 is a convention for the number of business days in a year (not everybody uses this convention, but we had to choose one, and the choice is unimportant for the purpose of this post).

Of course, this formula can be generalised to any period \(T\): the volatility over that period is \(\sigma_T = \sqrt{T} \sigma\).
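The scaling rule is a one-liner in code. The sketch below assumes the 261-day convention from the text; the function name `scale_volatility` and the 1% daily volatility are hypothetical and only serve as an example.

```python
import math

BUSINESS_DAYS_PER_YEAR = 261  # the convention chosen in this post


def scale_volatility(daily_vol, periods):
    """Scale a daily volatility to a horizon of `periods` days: sqrt(T) * sigma."""
    return math.sqrt(periods) * daily_vol


daily_vol = 1.0  # hypothetical daily volatility, in percent
annual_vol = scale_volatility(daily_vol, BUSINESS_DAYS_PER_YEAR)
monthly_vol = scale_volatility(daily_vol, 21)  # assuming 21 business days per month
print(annual_vol, monthly_vol)
```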

But don’t be fooled by the simplicity of this expression: the concept underneath it, and the explanation of **why the factor \(\sqrt{T}\) appears**, are anything but trivial. Some mathematics is needed to answer this question. Intuitively, one could argue that it makes sense for the *annualized* volatility to be bigger than the daily one, and a first impulse might be to scale it by the number of days in a year. But why the square root? And what exactly is the annualized volatility? Let us start from the beginning and illustrate the process with an example.

## Back to basics

We start from a financial series \(\{ P_i\}\). In our chosen example we have the daily prices of a given security for the period between January 1985 and August 2019^{1}. The larger the sample, the more accurate the results will be, as the sample statistics will be closer to their expected values.

We can represent both the price series and the return series. Returns are obtained from prices as:

$$

r_i = \frac{P_i-P_{i-1}}{P_{i-1}} = \frac{P_i}{P_{i-1}} - 1.

$$
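In code, the return formula amounts to pairing each price with its predecessor. This is a minimal sketch; the function name `simple_returns` and the three prices are hypothetical.

```python
def simple_returns(prices):
    """Simple returns r_i = P_i / P_{i-1} - 1 for consecutive prices."""
    return [p1 / p0 - 1.0 for p0, p1 in zip(prices, prices[1:])]


# Hypothetical price series: up 2%, then down 2%.
prices = [100.0, 102.0, 99.96]
print(simple_returns(prices))
```

A series of \(n\) prices yields \(n - 1\) returns, since the first price has no predecessor.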

Now we want to calculate the volatility of this security. We could take the standard deviation of the returns over the whole sample period, which would produce a single number (the historic daily volatility); we could select a subperiod; or we could compute a rolling volatility over any window (261 days, 22 days…), which would give another time series. The important thing to notice, however, is that **since the returns we used to calculate volatility were daily returns, the results we obtain are daily volatility measures**.
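The rolling volatility mentioned above can be sketched as follows. The function name `rolling_volatility`, the window of 3 and the sample returns are hypothetical; whatever the window, each output value is still a *daily* volatility, because the inputs are daily returns.

```python
import statistics


def rolling_volatility(returns, window):
    """Population std of each run of `window` consecutive returns."""
    return [
        statistics.pstdev(returns[i - window:i])
        for i in range(window, len(returns) + 1)
    ]


# Hypothetical daily returns (in percent).
daily_returns = [0.5, -0.3, 0.1, -0.7, 0.4, 0.2]
print(rolling_volatility(daily_returns, window=3))
```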

We are going to take the standard deviation of the returns of the whole period as our measure of **daily volatility: \(\sigma_d \simeq 0.716 \)**. This figure, like the other volatilities quoted in this post, is expressed in percent.

## Sample period vs. volatility period

What is the annualized volatility? Well, it represents **the dispersion of the annual returns**. One could naively think that the annual volatility is obtained by computing the standard deviation of the daily returns over one year. This is wrong: our \(T\) has nothing to do with the sample period. It is the number of variations in the price, that is, the number of steps (days) between the initial and the final prices used to calculate each return.

However, to calculate the annual volatility directly we need a sample large enough to compute the standard deviation of a significant number of annual returns. If, for example, we had two years of data, we would have only two annual returns, and thus only two data points from which to calculate the mean and the standard deviation. That is not a very useful result, right?

This is why having a formula to obtain the annual volatility from the daily volatility comes in handy. If we have a sample of one year or so, we can compute the daily volatility of the period, that is, the standard deviation of the daily returns, \(\sigma_d\), and then estimate the volatility of longer periods, for example a year, as \(\sigma_T = \sqrt{T} \sigma_d\).

## Checking the formula with real data

In our example, we have data from 1985 to 2019, that is, data for 34 years. We can represent the yearly returns:

Thirty-four data points is not many; still, we can calculate the standard deviation of these yearly returns, obtaining \(\sigma_y \simeq 11.40\). We expect this number to be similar to the annualized daily volatility, that is, \(\sigma_{261} = \sqrt{261} \cdot \sigma_d \simeq 11.57\). Both numbers turn out to be quite close.

This can be checked for any period \(T\), for example, a month. Assuming a month has 21 business days, we obtain \(\sigma_m \simeq 3.27\) and \(\sigma_{21} = \sqrt{21} \cdot \sigma_d \simeq 3.28\), where the first number was calculated as the standard deviation of the **monthly returns**.

All indications are that the volatility is indeed proportional to the square root of \(T\), where \(T\) is the period over which we calculate the returns. For a more thorough analysis, we have calculated and represented the volatility computed in these two ways for \(T\) ranging from 1 to 261.

The column *%Volatility_direct* represents the volatility calculated directly as the standard deviation of the returns of prices separated by a period of \(T\) days, while the column *%Volatility_eq* represents the volatility calculated with the equation \(\sigma_T = \sqrt{T} \sigma_d\). If we plot these results together, we see that the blue dots fit the line \(\sigma_T = \sqrt{T} \sigma_d\) quite well.

As expected, the points corresponding to lower values of \(T\) fit the line better. This is because for low \(T\) we have more observations (returns), and hence their standard deviation is closer to its expected statistical value.
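The comparison described above can be reproduced on synthetic data. The sketch below is not the calculation behind the figures in this post: it simulates a hypothetical driftless price series with Gaussian log-steps (daily volatility 0.7%, an arbitrary choice) and assumes non-overlapping \(T\)-day returns. The names `simulate_prices` and `direct_volatility` are made up for the example.

```python
import math
import random
import statistics


def simulate_prices(n_days, daily_sigma, start=1.0, seed=0):
    """Hypothetical price path: log-price moves by independent N(0, sigma^2) steps."""
    rng = random.Random(seed)
    log_price = math.log(start)
    prices = [start]
    for _ in range(n_days):
        log_price += rng.gauss(0.0, daily_sigma)
        prices.append(math.exp(log_price))
    return prices


def direct_volatility(prices, T):
    """Std of simple returns between prices T days apart (non-overlapping)."""
    spaced = prices[::T]
    rets = [b / a - 1.0 for a, b in zip(spaced, spaced[1:])]
    return statistics.pstdev(rets)


prices = simulate_prices(n_days=200_000, daily_sigma=0.007)
sigma_d = direct_volatility(prices, 1)

# Compare the direct estimate with the sqrt(T) scaling for a few horizons.
for T in (1, 5, 21, 261):
    direct = direct_volatility(prices, T)
    scaled = math.sqrt(T) * sigma_d
    print(f"T={T:3d}  direct={direct:.5f}  scaled={scaled:.5f}")
```

With non-overlapping returns, the number of observations shrinks as \(T\) grows, which is one reason the direct estimate becomes noisier for large \(T\).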

This could be seen as an empirical confirmation that the equation \(\sigma_T = \sqrt{T} \sigma_d\) holds. But we should ask ourselves: why is volatility, or more precisely the standard deviation of a series of returns of a financial instrument, proportional to the square root of time?

## Mathematical explanation

In order to answer this question, we have to keep in mind the meaning of \(T\). This is the period (number of days) separating the prices we use to calculate returns. In other words, \(T\) is the number of days during which the price was moving, randomly, before we calculate the return.

The word random is important; we have to make a number of underlying assumptions about the pricing model that will allow us to derive the formula. Basically, we are assuming that the prices follow a **geometric random walk** without drift^{2} and with constant volatility:

A geometric random walk is a stochastic process in which the log of the randomly varying quantity follows a random walk.

The translation of this is that **each variation of the log price**

$$

\xi_t = \log(P_t) - \log(P_{t-1}),

$$

**is a random variable with mean 0 (without drift) and with constant variance** \(\sigma^2\), and all random variables \(\xi_1, \xi_2, \ldots\) are mutually independent^{3}.

Now we have that for a period of time \(T\) the price has moved \(T\) times:

$$

\log(P_{t+T}) - \log(P_t) = \sum_{i=t+1}^{t+T} \xi_i.

$$
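The identity above is a telescoping sum, which is easy to check numerically. The sketch below builds a hypothetical log-price path from independent Gaussian steps (the Gaussian choice, the value of `sigma`, and the indices `t` and `T` are assumptions made for illustration) and confirms that the change in log price over \(T\) days equals the sum of the \(T\) intermediate steps.

```python
import math
import random

rng = random.Random(42)
sigma = 0.007   # hypothetical daily volatility of the log-price steps
n_steps = 1000

# xi[k - 1] plays the role of xi_k: independent steps with mean 0 and variance sigma^2.
xi = [rng.gauss(0.0, sigma) for _ in range(n_steps)]

# log P_k = log P_0 + xi_1 + ... + xi_k, taking log P_0 = 0.
log_prices = [0.0]
for step in xi:
    log_prices.append(log_prices[-1] + step)

t, T = 100, 21
lhs = log_prices[t + T] - log_prices[t]
rhs = sum(xi[t:t + T])   # xi_{t+1}, ..., xi_{t+T}
assert math.isclose(lhs, rhs, abs_tol=1e-12)
print(lhs, rhs)
```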

## Deriving the equation from theoretical principles

Our goal is to compute the variance of the returns, \(\text{Var}(r_T)\), where we define \(r_T\) as:

$$

r_T = \frac{P_{t+T}-P_t}{P_t} = \frac{P_{t+T}}{P_t} - 1.

$$

We have not used logarithmic returns in our example, but it does not matter too much, since for small \(r_t\) we have:

$$

r_t \simeq \log (1 + r_t) = \log\left(\frac{P_t}{P_{t-1}}\right)= \log(P_t) - \log(P_{t-1}).

$$
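To get a feel for how good this approximation is, the sketch below prints the gap between a simple return and its logarithmic counterpart for a few hypothetical return sizes. The helper name `approximation_gap` is made up for the example. The gap behaves roughly like \(r^2/2\), so it is tiny for typical daily returns and grows for larger moves.

```python
import math


def approximation_gap(r):
    """Difference between a simple return r and the log return log(1 + r)."""
    return r - math.log1p(r)


# Hypothetical return sizes, from a small daily move to a large one.
for r in (0.001, 0.01, 0.05, 0.20):
    print(f"r={r:.3f}  log(1+r)={math.log1p(r):.5f}  gap={approximation_gap(r):.6f}")
```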

Altogether, we have that

$$

\begin{align}

\text{Var}(r_T) &\simeq \text{Var}(\log(r_T + 1)) = \text{Var}\left(\log\left(\frac{P_{t+T}}{P_t}\right)\right) =\text{Var}(\log(P_{t+T}) - \log(P_t)) \\

&=\text{Var}(\sum_{i=t+1}^{t+T} \xi_i) = \sum_{i=t+1}^{t+T}\text{Var}(\xi_i) = \sum_{i=t+1}^{t+T} \sigma^2 = T \sigma^2 .

\end{align}

$$

Here we have used the fact that, for independent random variables, the variance of a sum equals the sum of the variances. Finally, we have recovered our equation:

$$

\sigma_T =\sqrt{ \text{Var}(r_T)} = \sqrt{T} \sigma.

$$
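For completeness, the step that exchanges the variance and the sum can be spelled out. In general the variance of a sum also contains covariance terms; they vanish here because independent variables have zero covariance:

$$
\text{Var}\left(\sum_{i=t+1}^{t+T} \xi_i\right) = \sum_{i=t+1}^{t+T} \text{Var}(\xi_i) + 2 \sum_{t+1 \le i < j \le t+T} \text{Cov}(\xi_i, \xi_j) = T \sigma^2 + 0 .
$$

If the daily steps were correlated, for instance through momentum or mean reversion, the covariance terms would not vanish and the \(\sqrt{T}\) rule would no longer hold exactly.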

## Conclusions

We have studied the equation \(\sigma_T = \sqrt{T}\sigma\) both from an empirical and from a theoretical point of view, some might say in the reverse of the usual order. First, we showed that for our concrete example the observations followed the formula quite nicely, and then we explained how to derive it under certain hypotheses about the price series.

We may conclude that the good fit of our data to the equation suggests that the assumptions we made in the theoretical framework were reasonable, or at least that the price series we chose satisfies these hypotheses. Now you can check it for your own data!

## Notes

- The price series corresponds to the USDCHF currency cross rate.
- Without drift just means that there is no long-term trend. Each step (change in the log price) is a random variable **of mean 0**.
- In statistics, a collection of random variables with the same probability distribution and mutually independent is called **independent and identically distributed (i.i.d.)**.
