Creating a list containing the rank of the elements in the original list

Question 1

I don't really know how to explain what I'm looking for in a way that makes sense, but here goes:

Say I have a list $$L=(4,7,9,10,6,11,3)$$ What I want to produce is a corresponding list $$ K = (1,3,4,5,2,6,0)$$

where element K[i] is the 'rank' of the element at the corresponding position in L: the higher the number, the higher the rank, starting from rank 0 for the smallest number in L.

The code I have written for this is:

x = [4,7,9,10,6,11,3]
index = [0]*len(x)
for i in range(len(x)):
    index[x.index(min(x))] = i
    x[x.index(min(x))] = max(x)+1

and it works, but I just feel it looks horrible, and was wondering if there might exist a more aesthetic way.

Question 2

Do we assume all list elements are distinct and hence you don't care about breaking ties?

Question 3

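If you can use NumPy, this is mostly implemented: np.array([4,7,9,10,6,11,3]).argsort() gives array([6, 0, 4, 1, 2, 3, 5]), the indices that would sort the array in increasing order. That is the inverse of the permutation you want; applying argsort a second time, np.array([4,7,9,10,6,11,3]).argsort().argsort(), gives array([1, 3, 4, 5, 2, 6, 0]). (To rank by decreasing order instead, negate the array before the first argsort.) NumPy is mostly implemented in C and is typically faster than native Python on large inputs.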

Question 4

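This is also implemented in SciPy (scipy.stats.rankdata) and in pandas (Series.rank()). Note that both return ranks starting from 1 and average tied ranks by default.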

Question 5

I believe your solution is \$O(n^2)\$, and we can reduce that to \$O(n \log n)\$. I'm not well-versed in python, but this is the general idea:

The output is a permutation of the indices \$0, 1, \ldots, n - 1\$, where \$n\$ is the length of the input.
We sort the indices based on the element at the specified index. In your example, we get \$6, 0, 4, 1, 2, 3, 5\$, i.e. the \$6\$th element (\$3\$) is smallest, so \$6\$ is first.
Seeing that \$6\$ is at index \$0\$, we know that the \$6\$th element of the output is \$0\$, and so on. This step is easier to explain in code than in words, sorry about that.

In code,

indices = list(range(len(input)))
indices.sort(key=lambda x: input[x])
output = [0] * len(indices)
for i, x in enumerate(indices):
    output[x] = i

Or the more terse

output = [0] * len(input)
for i, x in enumerate(sorted(range(len(input)), key=lambda y: input[y])):
    output[x] = i
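
As a rough sketch, the same idea can be wrapped in a function so that it does not depend on a global named `input` (which also shadows the built-in of that name). The name `rank_elements` is my own and not from the question; if there are ties, they are broken by original position because `sorted` is stable.

```python
def rank_elements(values):
    """Return a list where entry i is the 0-based rank of values[i]."""
    # Indices of `values`, ordered by the value they point at (an "argsort").
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    # The index stored at position `rank` of `order` points at the
    # rank-th smallest value, so that is the rank we record for it.
    for rank, index in enumerate(order):
        ranks[index] = rank
    return ranks


print(rank_elements([4, 7, 9, 10, 6, 11, 3]))  # [1, 3, 4, 5, 2, 6, 0]
```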

I used the timeit module to compare the running times of this version and the original for different input sizes. The functions were each called 1,000 times on randomly shuffled input of size \$n\$. Here are some of the results

   n  this version (s)  original version (s)
  10              0.02                  0.04
 100              0.17                  1.40
1000              1.81                133.35
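
A comparison along these lines could be set up roughly as follows. This is a sketch, not the script that produced the numbers above: the function names, the input sizes and the repetition count are my own choices, and absolute timings will differ from machine to machine.

```python
import random
import timeit


def ranks_original(values):
    """The question's quadratic approach, applied to a copy of the input."""
    x = list(values)
    index = [0] * len(x)
    for i in range(len(x)):
        smallest = x.index(min(x))
        index[smallest] = i
        x[smallest] = max(x) + 1  # push the used element past every other value
    return index


def ranks_sorted(values):
    """The sort-based approach from this answer."""
    output = [0] * len(values)
    for i, j in enumerate(sorted(range(len(values)), key=lambda k: values[k])):
        output[j] = i
    return output


for n in (10, 100):
    data = random.sample(range(n), n)  # shuffled, distinct values
    assert ranks_original(data) == ranks_sorted(data)
    t_sorted = timeit.timeit(lambda: ranks_sorted(data), number=100)
    t_original = timeit.timeit(lambda: ranks_original(data), number=100)
    print(f"n={n:4d}  sorted: {t_sorted:.4f}s  original: {t_original:.4f}s")
```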

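This is asymptotically optimal for a comparison-based approach, since if it were not, we would have a sub-linearithmic comparison sort: given the ranks, one extra linear pass sorts the input.

sorted_input = [0] * len(input)
for i, x in enumerate(output):
    sorted_input[x] = input[i]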
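
To make that reduction concrete, here is a small self-contained sketch. The name `sort_via_ranks` is hypothetical, and the function assumes `ranks` is a permutation of the positions, as produced by the code above.

```python
def sort_via_ranks(values, ranks):
    """Place each value at the position given by its rank (one O(n) pass)."""
    result = [None] * len(values)
    for i, rank in enumerate(ranks):
        result[rank] = values[i]
    return result


values = [4, 7, 9, 10, 6, 11, 3]
ranks = [1, 3, 4, 5, 2, 6, 0]
print(sort_via_ranks(values, ranks))  # [3, 4, 6, 7, 9, 10, 11]
```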

Question 6

If all of the numbers in x are unique, this works:

x = [4,7,9,10,6,11,3]
seq = sorted(x)
index = [seq.index(v) for v in x]

The technique is to sort the input list, then look up the position of each value from the original list in the sorted one, storing the results in a list via list comprehension.

It will have trouble if the numbers in x are non-unique, because the sorted list will then contain identical numbers next to each other and index() will always find the first one. This might be beneficial, as technically the numbers do share the same rank, but it also leaves a "hole" in the ranking order (for example, if two numbers are tied for third, the next rank will actually be numbered 5 because it is the 5th entry in the sorted list).

It also involves creating a sorted copy of the original list so may take up extra memory if the list is large.
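
Each `seq.index(v)` call is a linear scan, so the comprehension above is quadratic overall. One possible variation, sketched here under the assumption that the values are hashable, builds a value-to-rank dictionary once. Ranking over `sorted(set(x))` gives tied values the same rank without leaving a hole (often called "dense" ranking); the name `dense_ranks` is my own.

```python
def dense_ranks(x):
    """Rank values from 0; equal values share a rank and no rank is skipped."""
    # Map each distinct value to its position among the sorted distinct values.
    rank_of = {value: rank for rank, value in enumerate(sorted(set(x)))}
    return [rank_of[value] for value in x]


print(dense_ranks([4, 7, 9, 10, 6, 11, 3]))  # [1, 3, 4, 5, 2, 6, 0]
print(dense_ranks([5, 1, 5, 8]))             # [1, 0, 1, 2]

# For comparison, the index()-based version leaves a hole after the tie:
seq = sorted([5, 1, 5, 8])
print([seq.index(v) for v in [5, 1, 5, 8]])  # [1, 0, 1, 3]
```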

Question 7

You can do this easily (in Python) by sorting twice: first sort the elements together with their original indices (i.e. an argsort), then enumerate the sorted result and sort it back by original index.

This solution has the same complexity as your sorting algorithm, so you can make it \$O(n\log n)\$, or even \$O(n)\$ if you have small integers and use, for example, radix sort.

In this case, I use the built-in sorted, which is \$O(n\log n)\$, and zip to get back only the list with ranks.

Here's an example

L = [4, 7, 9, 10, 6, 11, 3]
K = (1, 3, 4, 5, 2, 6, 0)
g1 = lambda e: e[1]
g10 = lambda e: e[1][0]
ranks, _ = zip(*sorted(enumerate(sorted(enumerate(L), key=g1)), key=g10))
print(ranks == K) # True

Here's what is happening:

s1 = sorted(enumerate(L), key=lambda e: e[1])
print(s1)
# [(6, 3), (0, 4), (4, 6), (1, 7), (2, 9), (3, 10), (5, 11)]
s2 = sorted(enumerate(s1), key=lambda e: e[1][0])
print(s2)
# [(1, (0, 4)), (3, (1, 7)), (4, (2, 9)), (5, (3, 10)), (2, (4, 6)), (6, (5, 11)), (0, (6, 3))]
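
The linear-time remark above can be illustrated with a counting pass in place of a comparison sort. This is only a sketch under the assumption that the inputs are non-negative integers below a small known bound; `counting_ranks` and `bound` are hypothetical names, and ties are broken by original position.

```python
def counting_ranks(values, bound):
    """0-based ranks for integers in range(bound), in O(n + bound) time."""
    counts = [0] * bound
    for v in values:
        counts[v] += 1
    # start[v] = number of elements strictly smaller than v.
    start = [0] * bound
    running = 0
    for v in range(bound):
        start[v] = running
        running += counts[v]
    ranks = []
    for v in values:
        ranks.append(start[v])
        start[v] += 1  # the next equal value gets the next rank
    return ranks


print(counting_ranks([4, 7, 9, 10, 6, 11, 3], bound=12))  # [1, 3, 4, 5, 2, 6, 0]
```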

Accepted answer by mjolka · 2014-10-08 00:34:52Z
