Python performance: a comparison

J. González



When coding in any language, performance is an important consideration, and in Python it can become crucial. In this post, we will see how the way we write a function, and whether or not we use a library, can change performance dramatically.

Let’s look at two possible implementations of a simple function that applies a different transformation depending on the input value:

def some_calcs1(x):
    # Piecewise function: only the branch whose condition holds is evaluated
    if x > 0.04:
        return 0.4
    elif x > 0.01:
        return 10 * x
    elif x > 0:
        return 10 * x / 3.0 + 1 / 15.0
    else:
        return 0


This second function returns the same result as the first one:

def some_calcs2(x):
    # Each comparison acts as a 0/1 mask, so every term is computed for every input
    return (x > 0.04) * 0.4 + \
           ((x > 0.01) & (x <= 0.04)) * x * 10.0 + \
           ((x > 0) & (x <= 0.01)) * (x * 10.0 / 3.0 + 1.0 / 15.0)
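
Before timing anything, it is worth checking that the two implementations really agree. The following is a minimal sketch, not part of the original benchmark: the sample values are arbitrary picks around the branch boundaries, and the comparison uses a tolerance because the two functions may round floating-point results slightly differently.

```python
import math


def some_calcs1(x):
    if x > 0.04:
        return 0.4
    elif x > 0.01:
        return 10 * x
    elif x > 0:
        return 10 * x / 3.0 + 1 / 15.0
    else:
        return 0


def some_calcs2(x):
    return ((x > 0.04) * 0.4
            + ((x > 0.01) & (x <= 0.04)) * x * 10.0
            + ((x > 0) & (x <= 0.01)) * (x * 10.0 / 3.0 + 1.0 / 15.0))


# Arbitrary sample points, chosen to hit every branch and each boundary
samples = [-1.0, 0.0, 0.005, 0.01, 0.02, 0.04, 0.5]

for x in samples:
    # Compare with a tolerance rather than ==, to allow for rounding differences
    assert math.isclose(some_calcs1(x), some_calcs2(x), abs_tol=1e-12)
```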


Let’s define three equivalent variables of different types: a list of lists, a Numpy array and a Pandas DataFrame:

import pandas as pd
import numpy as np

xarray = np.random.rand(1000,10)
xlist = xarray.tolist()
xdf = pd.DataFrame(xarray)



Applied to a list of lists, the first function is about 5 times faster than the second, purely as a consequence of how they are coded. In function 1 only the branch whose condition is met is executed, while in function 2 every term is computed for every value.

%timeit z1 = [list(map(some_calcs1, z)) for z in xlist]
%timeit z2 = [list(map(some_calcs2, z)) for z in xlist]
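
The %timeit magic is only available inside IPython or Jupyter. Outside of them, a comparable measurement can be sketched with the standard timeit module. The helper names run1 and run2 and the repeat counts below are illustrative choices, not part of the original benchmark, and the numbers you get will depend on your machine.

```python
import timeit

import numpy as np


def some_calcs1(x):
    if x > 0.04:
        return 0.4
    elif x > 0.01:
        return 10 * x
    elif x > 0:
        return 10 * x / 3.0 + 1 / 15.0
    else:
        return 0


def some_calcs2(x):
    return ((x > 0.04) * 0.4
            + ((x > 0.01) & (x <= 0.04)) * x * 10.0
            + ((x > 0) & (x <= 0.01)) * (x * 10.0 / 3.0 + 1.0 / 15.0))


xlist = np.random.rand(1000, 10).tolist()


def run1():
    # Hypothetical helper: one full pass of some_calcs1 over the list of lists
    return [list(map(some_calcs1, row)) for row in xlist]


def run2():
    # Hypothetical helper: one full pass of some_calcs2 over the list of lists
    return [list(map(some_calcs2, row)) for row in xlist]


number = 5  # passes per measurement (arbitrary)
# Take the best of 3 measurements and convert to seconds per pass
t1 = min(timeit.repeat(run1, number=number, repeat=3)) / number
t2 = min(timeit.repeat(run2, number=number, repeat=3)) / number

print(f"some_calcs1: {t1 * 1e3:.2f} ms per pass")
print(f"some_calcs2: {t2 * 1e3:.2f} ms per pass")
```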

Numpy arrays

But, what if we worked with Numpy arrays instead of lists? Can we expect the same behaviour?

First of all, to “map” the first function over a Numpy array we need to vectorize it (we will use a decorator); otherwise it would not work, because an if statement cannot evaluate a condition on a whole array. Vectorizing a function lets us apply it to the whole array instead of writing an explicit loop.

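import numpy as np

@np.vectorize
def some_calcs1_vec(x):
    if x > 0.04:
        return 0.4
    elif x > 0.01:
        return 10 * x
    elif x > 0:
        return 10 * x / 3.0 + 1 / 15.0
    else:
        # 0.0 rather than 0: np.vectorize infers the output dtype from the first result
        return 0.0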

At first sight, performance has improved a little for the first function and tremendously for the second. What is most surprising is that the second function is now much faster than the first one! But weren’t we saying that the first implementation was faster? Let’s explain what is going on here:

%timeit z1 = some_calcs1_vec(xarray)
%timeit z2 = some_calcs2(xarray)

Numpy has what it calls universal functions (ufuncs): functions that take array-like inputs and return arrays, operating element by element. This is much the same as what we do when vectorizing, but faster, since ufuncs loop over the elements at a lower level (in compiled C code), whereas np.vectorize still calls the Python function once per element. In addition, ufuncs broadcast (adjust) the input arrays when they have different shapes.
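
To make the broadcasting remark concrete, here is a small sketch. The arrays col_offsets and row_scale are hypothetical inputs invented for the illustration; only the shapes matter.

```python
import numpy as np

xarray = np.random.rand(1000, 10)

# np.add and np.multiply are ufuncs
assert isinstance(np.add, np.ufunc)
assert isinstance(np.multiply, np.ufunc)

# Hypothetical per-column offsets, shape (10,): stretched across all 1000 rows
col_offsets = np.arange(10.0)
shifted = np.add(xarray, col_offsets)      # (1000, 10) + (10,) -> (1000, 10)

# Hypothetical per-row factors, shape (1000, 1): stretched across all 10 columns
row_scale = np.random.rand(1000, 1)
scaled = np.multiply(xarray, row_scale)    # (1000, 10) * (1000, 1) -> (1000, 10)

print(shifted.shape, scaled.shape)
```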

So the first function is merely “generalized” to behave like a ufunc, while the second function actually uses ufuncs.

Alright, but I don’t see any ufunc at all! Well, the operators you see in the formulas, such as *, +, & and >, are overloaded to call the ufuncs multiply(), add(), bitwise_and() (which acts as a logical AND on boolean arrays) and greater(). For example, x + 4.0 is the same as np.add(x, 4.0).
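
The correspondence between operators and ufuncs can be checked directly. This is a small sketch on a random array. Note that on Numpy arrays the & operator dispatches to np.bitwise_and, which for boolean arrays gives the same result as np.logical_and.

```python
import numpy as np

x = np.random.rand(1000, 10)

# Arithmetic and comparison operators give the same arrays as the explicit ufunc calls
assert np.array_equal(x + 4.0, np.add(x, 4.0))
assert np.array_equal(x * 10.0, np.multiply(x, 10.0))
assert np.array_equal(x > 0.01, np.greater(x, 0.01))

# & on two boolean masks: bitwise_and and logical_and agree
mask = (x > 0.01) & (x <= 0.04)
assert np.array_equal(mask, np.bitwise_and(x > 0.01, x <= 0.04))
assert np.array_equal(mask, np.logical_and(x > 0.01, x <= 0.04))
```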

Pandas DataFrame

To conclude, we might wonder whether Pandas achieves performance similar to that of Numpy, given that Pandas uses Numpy arrays underneath.

If we apply a map operation with the functions defined above, performance is slower than when mapping over a list and, obviously, much slower than with Numpy (note that in pandas 2.1 and later, applymap is deprecated in favour of DataFrame.map):

%timeit z1 = xdf.applymap(some_calcs1)
%timeit z2 = xdf.applymap(some_calcs2)

If we apply the functions directly to a Pandas DataFrame, the vectorized first function is slower than with a Numpy array, and the second function loses all its advantage, performing about as it did with a list.

%timeit z1 = some_calcs1_vec(xdf)
%timeit z2 = some_calcs2(xdf)

Finally, to complicate matters even further, we could use the DataFrame “apply” method, which applies the given function to entire rows or columns; as we see below, the choice of axis makes a big difference in performance.

In general, it is advisable to avoid this kind of mapping in Pandas if you want acceptable performance.

%timeit z1 = xdf.apply(some_calcs1_vec, axis=0)
%timeit z2 = xdf.apply(some_calcs1_vec, axis=1)
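
One possible way around the slow element-wise mapping is sketched here, under the assumption that the DataFrame holds only numeric columns of a single dtype: pull out the underlying Numpy array with to_numpy(), run the ufunc-based function on it, and wrap the result back into a DataFrame. This is not something the benchmarks above measured, so time it with %timeit on your own data before relying on it.

```python
import numpy as np
import pandas as pd


def some_calcs2(x):
    return ((x > 0.04) * 0.4
            + ((x > 0.01) & (x <= 0.04)) * x * 10.0
            + ((x > 0) & (x <= 0.01)) * (x * 10.0 / 3.0 + 1.0 / 15.0))


xdf = pd.DataFrame(np.random.rand(1000, 10))

# Work on the raw Numpy array (assumes a homogeneous numeric DataFrame)
values = xdf.to_numpy()
computed = some_calcs2(values)

# Rebuild a DataFrame with the original row and column labels
result = pd.DataFrame(computed, index=xdf.index, columns=xdf.columns)

print(result.shape)
```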

So, be warned: the way you implement your code and the choice of the right libraries and functions can make your programs fly or be as slow as molasses in January.
