Calculating returns on a price series is one of the most basic calculations in finance, but it can become a headache when we want to do aggregations for weeks, months, years, etc. In Python, the Pandas library makes this aggregation very easy to do, but if we don’t pay attention we could still make mistakes. Assuming that we want the return of the whole month, and we are not interested, for example, in the returns accumulated so far. These latter returns require that they be normalized to be comparable with the other returns.
Given some prices on business days, you can get the trailing returns per month for the situation that we want to calculate. With this in mind, I’d like to describe how to avoid miscalculating monthly returns.
Before you start, you may need some dependencies to do this:
Let’s get to work
We’re going to work with the shares of “Banco do Brazil SA.” Our first step is to download yahoo finance data using pandas_datareader:
Let’s see what the data looks like:
If we plot the closing prices, we’ll see this:
Now we’ll work with closing prices. We’re going to calculate the monthly returns, so we can do the following*:
* At the end of this post you will find the auxiliary functions used in the code, such as “total_return”
The problem of this approximation is that it leaves out one day in the calculation of each monthly return, as it only takes into account the prices that belong to the month in question and completely omits all other information.
For the calculation to be correct, you must include the closing price on the day before the first day of the month, i. e. the last day of the previous month.
We can see it with an example: if we select month 8 of 2017, and see the prices that have been used to calculate returns, we will see that the series starts on August 1st and ends on August 31st. We have left out July 31st!, this happens every month.
It should also be mentioned that the last month (the current month) is not comparable with the rest of the months since it has not yet finished.
A second approach
Okay, as a second approach to incorporate the previous data, I could calculate the returns first, then group and calculate the total return with that series:
However, all that glitters is not gold; this approximation has a problem in the first value. We still don’t have the price before the one needed to make the calculation. The remaining months would be correctly calculated with the exception of the last return (current month), which is again not comparable with the rest.
The third approach
So, what can we do? The most correct result would come from first decimating the price series by taking only the last working day of the month, then grouping by year and month, and with the resulting series calculate, finally, the returns.
The good thing about this approximation is that it doesn’t return values for the first or last month with prices available, which can save us some calculation errors.
It’s unfortunately very easy to make mistakes with these kinds of calculations. Pandas makes things much simpler, but sometimes can also be a double-edged sword. Nothing like a quick reading to avoid those potential mistakes. As we can see on the plot, we can underestimate or overestimate the returns obtained.
- Approximation 1, gives us some miscalculations.
- Approximation 2, is a little closer to what we are looking for, but it has values that we should not use.
- Approximation 3, is the best method to use for this calculation.
Appendix: useful functions
I have prepared a notebook with the complete code so that it can be executed and played with: Check it out here!
Why not Df.resample(“M”).last().pct_change(1) ?