Rankings are everywhere. They are sometimes useful and, at other times, contradictory. When they conflict, we need to come up with a consensus ranking but… how do we *evaluate* ranking consensus?

The other day I was reading about something called **rank aggregation**, which is just a fancy name for combining preferences expressed through rankings.

I bet you know that rankings are everywhere. The page Google returns after a search is a ranking, as are the ratings we leave for a restaurant on TripAdvisor or for a movie on IMDb. Of course, they are not confined to the interwebs: when we vote in an election, we rank the candidates; when you were chosen for a job after that interview, you were sitting at the top of their ranking.

Almost every selection involves a ranking process, and selections are ubiquitous in finance as well. Nonetheless, one investor’s preferences (rankings) may differ from another’s. Even if none of them knows the absolute truth, since they are well-respected experts we could benefit from pooling their opinions and coming up with a new, hopefully more truthful, ranking. This is what rank aggregation is all about, and machine-learning people like to see it as a type of ensemble method.

A couple of weeks ago I stumbled on a great blog post about one of the best-known techniques for finding these consensus rankings: the so-called **Kemeny-Young optimal rank aggregation**. The goal of this technique is to come up with a new ranking that minimizes the sum of Kendall tau distances to all the input rankings. The post provided a Python function to compute the Kendall tau distance using the naive algorithm. The time complexity of this algorithm is quadratic in the length of the rankings but, given the similarity of the problem to that of sorting, I thought something close to linearithmic ought to exist (and, indeed, it does).
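For reference, here is a minimal sketch of that naive algorithm (my own version, not the code from that post): it examines every pair of items and counts the pairs whose relative order differs between the two rankings, which is quadratic in the number of items.

```python
from itertools import combinations

def kendall_tau_distance_naive(rank_a, rank_b):
    """Kendall tau distance between two ranking lists in O(n^2) time.

    rank_a[i] and rank_b[i] are the ranks the two experts assign
    to item i (ranking lists, not ranked lists of items).
    Assumes no tied ranks.
    """
    distance = 0
    for i, j in combinations(range(len(rank_a)), 2):
        # A pair (i, j) is discordant when the two rankings disagree
        # on which of the two items comes first.
        if (rank_a[i] - rank_a[j]) * (rank_b[i] - rank_b[j]) < 0:
            distance += 1
    return distance
```

Identical rankings give 0, and fully reversed rankings give the maximum, \(n(n-1)/2\).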

## Confusion on top of confusion

I turned to Google to look for a more efficient Python implementation and, to my surprise, I found that the existence of the closely related Kendall rank correlation coefficient, sometimes referred to as the Kendall tau coefficient (not distance), was causing a lot of confusion on the net. At the time of writing, this distinction seems clear, but confusion is still everywhere, even in the places where people usually go to shake confusion off, namely Stack Exchange questions and GitHub issues. The confusion has even led to a new function implementation but, worst of all, quoting the Wikipedia article about the Kendall tau distance, as done in this comment, only worsens the situation.

So, what’s going on? Is Wikipedia wrong? Are there two alternative definitions of the Kendall tau distance? Not really. After ordering my thoughts I came upon another source of confusion, which I will try to clarify for you through an example.

## A simple example

Let’s assume two experts must evaluate three planets (red, yellow, and blue) according to how favourable they are for establishing a future human colony. In order to collect their opinions, they are asked to fill in the following spreadsheet (yes, I know, spreadsheets are as ubiquitous as rankings):

We could say (special attention here) that each expert has been asked to provide a **ranking list**. An equivalent alternative could have been asking the experts for a **ranked list of items** (planets, in this case) by filling this spreadsheet:

As you have probably noticed, the **same preferences can be expressed in either format**:

and note that we could even change the order of the columns in Case 1 and it would not matter.

Now, see what happens when we compute the Kendall tau distance between the two experts’ rankings in each of the three cases, according to some of the equivalent definitions available:

*Number of pairwise disagreements between two ranking lists (number of pairs that are in a different order in the two rankings)*:

- **Case 1**: Both experts disagree on the order of a single pair, (red, blue), so the distance is 1.
- **Case 1b**: Column ordering does not matter and, again, both experts disagree on the order of the (red, blue) pair. Distance is 1.
- **Case 2**: (red, blue) are in a different order. Distance is 1.

*Number of swaps that the bubble sort algorithm would take to place one list in the same order as the other list (minimum number of adjacent transpositions needed to transform one ranking into the other)*:

- **Case 1**: Three swaps are required: (3<->1), (3<->2) and (1<->2). The distance *seems* to be 3.
- **Case 1b**: Swapping 3 and 2 suffices to match both rankings. Distance is 1.
- **Case 2**: Swapping red and blue suffices to match both rankings. Distance is 1.
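Since the spreadsheets are not reproduced here, let’s assume, purely for illustration, preferences consistent with the results above: Expert 1 ranks yellow first, red second, blue third (so the red/yellow/blue columns read 2, 1, 3), while Expert 2 ranks yellow first, blue second, red third (columns 3, 1, 2). A small sketch that counts the adjacent swaps bubble sort needs to turn one list into the other then reproduces the numbers above:

```python
def adjacent_swaps(seq, target):
    """Minimum adjacent transpositions needed to turn seq into target."""
    position = {value: i for i, value in enumerate(target)}
    a = [position[value] for value in seq]  # relabel seq by target order
    swaps = 0
    # Plain bubble sort, counting every swap it performs.
    for end in range(len(a), 1, -1):
        for j in range(end - 1):
            if a[j] > a[j + 1]:
                a[j], a[j + 1] = a[j + 1], a[j]
                swaps += 1
    return swaps

# Case 1: ranking lists (hypothetical columns: red, yellow, blue).
print(adjacent_swaps([2, 1, 3], [3, 1, 2]))  # 3 swaps, not the distance!
# Case 1b: same preferences, columns reordered so that Expert 1's
# ranks appear in natural order.
print(adjacent_swaps([1, 2, 3], [1, 3, 2]))  # 1 swap
# Case 2: ranked lists of items.
print(adjacent_swaps(["yellow", "red", "blue"],
                     ["yellow", "blue", "red"]))  # 1 swap
```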

From these results, it is clear that the second definition cannot be applied reliably to ranking lists (Case 1). The reason it works for Case 1b is that Expert 1’s preferences were already in their natural order. When that happens, the second definition also works and, indeed, gives rise to yet another commonly used definition: the number of inversions in the unordered ranking.
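This last definition is also the key to the linearithmic algorithm mentioned earlier: relabel the items by their order under the first ranking, then count inversions in the second ranking’s sequence with a merge-sort-style routine. A sketch, again assuming ranking lists with no ties:

```python
def count_inversions(seq):
    """Count inversions in seq via merge sort: O(n log n)."""
    if len(seq) <= 1:
        return 0, list(seq)
    mid = len(seq) // 2
    inv_left, left = count_inversions(seq[:mid])
    inv_right, right = count_inversions(seq[mid:])
    merged, inversions = [], inv_left + inv_right
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            # right[j] jumps ahead of every remaining element of the
            # left half, so it forms an inversion with each of them.
            inversions += len(left) - i
            merged.append(right[j])
            j += 1
    merged += left[i:] + right[j:]
    return inversions, merged

def kendall_tau_distance(rank_a, rank_b):
    """O(n log n) Kendall tau distance between two ranking lists."""
    # Sort items by the first ranking; discordant pairs then show up
    # as inversions in the second ranking's sequence.
    order = sorted(range(len(rank_a)), key=rank_a.__getitem__)
    return count_inversions([rank_b[i] for i in order])[0]
```

The sorting step handles exactly the Case 1 pitfall: it puts the first expert’s preferences in their natural order before counting inversions.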

## Computing distance from correlation coefficient

At this point, you may be thinking that the confusion I told you about at the beginning was actually understandable. I agree. But, fortunately, there is an easier way. To avoid column-ordering issues, just plot one ranking against the other and measure how monotonic this scatter plot looks by using the Kendall correlation coefficient, \(\rho_\tau\).

Once you have done that, since both the Kendall correlation coefficient and the Kendall tau distance depend on the number of discordant pairs in the ranking, you can rely on the definition of the correlation coefficient to work out the Kendall tau distance:

$$\rho_\tau = \frac{\text{number of concordant pairs}-\text{number of discordant pairs}}{n(n-1)/2},$$

where \(n\) is the number of items in the list. Then, the Kendall tau distance \(d_{\tau} \) can be computed as:

$$d_{\tau}=\text{number of discordant pairs}=\frac{n(n-1)}{2}\frac{(1-\rho_\tau)}{2}.$$

In our example, $$\frac{3(3-1)}{2}\frac{(1-1/3)}{2}=1.$$
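In Python, you don’t even need to implement the coefficient yourself. Here is a sketch using SciPy’s `scipy.stats.kendalltau` (which, in the absence of tied ranks, coincides with the coefficient above), applied to the same hypothetical pair of rankings with \(\rho_\tau = 1/3\):

```python
from scipy.stats import kendalltau

def kendall_tau_distance(ranking_a, ranking_b):
    """Kendall tau distance derived from the correlation coefficient.

    Assumes no tied ranks, so SciPy's tau-b equals the plain
    coefficient used in the formula above.
    """
    n = len(ranking_a)
    rho_tau, _ = kendalltau(ranking_a, ranking_b)
    # Invert the definition of the coefficient to recover the
    # number of discordant pairs.
    return round(n * (n - 1) / 2 * (1 - rho_tau) / 2)

print(kendall_tau_distance([2, 1, 3], [3, 1, 2]))  # 1
```

Note that the function receives ranking lists, so column ordering does not matter, exactly as in Case 1 and Case 1b above.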

Much simpler this way, right? See you next time.