List elements digit difference sort

Question 1

I am doing a code challenge on codesignal.com which asks for the following: Given an array of integers, sort its elements by the difference of their largest and smallest digits. In the case of a tie, the element with the larger index in the array should come first.

Example

For a = [152, 23, 7, 887, 243], the output should be
digitDifferenceSort(a) = [7, 887, 23, 243, 152].

Here are the differences of all the numbers:

152: difference = 5 - 1 = 4;
23: difference = 3 - 2 = 1;
7: difference = 7 - 7 = 0;
887: difference = 8 - 7 = 1;
243: difference = 4 - 2 = 2.

23 and 887 have the same difference, but 887 comes after 23 in a, so in the sorted array it comes first.

I wrote the following code in Python 3. It passes all the normal tests, but it fails the execution time tests. How can I improve my code to decrease its execution time for lists with a considerable number of elements?

def digitDifferenceSort(a):
    diff = []
    for i in a:
        i = list(str(i))
        diff.append(i)
    for i in range(len(diff)):
        for j in range(i+1, len(diff)):
            if int(max(diff[i])) - int(min(diff[i])) > int(max(diff[j])) - int(min(diff[j])):
                diff[j], diff[i] = diff[i], diff[j]
            elif int(max(diff[i])) - int(min(diff[i])) == int(max(diff[j])) - int(min(diff[j])):
                diff[i], diff[j] = diff[j], diff[i]
    new_list = []
    for i in diff:
        b = ''
        for j in i:
            b = b + j
        new_list.append(int(b))
    return new_list

Answer 1 (accepted)

Python has a built-in sorted function; you should use it. To sort according to some special criterion, it needs a key function:

def max_digit_diff(n):
    n_str = str(n)
    return int(max(n_str)) - int(min(n_str))

This uses the fact that "0" < "1" < ... < "9".
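As a quick check of that ordering claim, here is a small sketch. It assumes non-negative integers (a leading minus sign would break the int conversion) and uses the example array from the question:

```python
def max_digit_diff(n):
    # Assumes n is a non-negative integer.
    n_str = str(n)
    return int(max(n_str)) - int(min(n_str))

# Digit characters compare in the same order as the digits they represent.
digits = "0123456789"
assert sorted(digits) == list(digits)

# max/min on the string therefore pick the largest and smallest digit.
assert max("152") == "5" and min("152") == "1"

# Differences for the example array from the question.
print([max_digit_diff(n) for n in [152, 23, 7, 887, 243]])  # [4, 1, 0, 1, 2]
```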

However, the sorted function uses a stable sorting algorithm, so if two elements compare equal, the original order is preserved. But here we want the opposite order (later elements come first), so we just reverse the list first:

def digit_difference_sort(a):
    return sorted(reversed(a), key=max_digit_diff)

This should be vastly easier to read than your convoluted function. Note that the function names also follow Python's official style-guide, PEP8.
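To see the effect of the stable sort and of reversing the input, the two variants can be compared on the example from the question. This is only an illustrative check:

```python
def max_digit_diff(n):
    n_str = str(n)
    return int(max(n_str)) - int(min(n_str))

def digit_difference_sort(a):
    return sorted(reversed(a), key=max_digit_diff)

a = [152, 23, 7, 887, 243]

# Without reversing, the stable sort keeps 23 before 887 (both have difference 1).
print(sorted(a, key=max_digit_diff))  # [7, 23, 887, 243, 152]

# Reversing first lets the later element win the tie, as the challenge requires.
print(digit_difference_sort(a))       # [7, 887, 23, 243, 152]
```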

Like all (good) sorting functions, this is \$\mathcal{O}(n \log n)\$. Here is a timing comparison to your function with arrays up to length 10k (at which point your function takes more than a minute...).

[Figure: timing comparison of the function from the question and digit_difference_sort for arrays up to length 10k]
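Timings of this kind can be collected with the standard timeit module. The sketch below is an assumption about how one might do it, not the code behind the plot; time_sort, the input sizes and the random value range are made up for illustration. Any other sorting function, including the one from the question, could be passed in the same way.

```python
import random
import timeit

def max_digit_diff(n):
    n_str = str(n)
    return int(max(n_str)) - int(min(n_str))

def digit_difference_sort(a):
    return sorted(reversed(a), key=max_digit_diff)

def time_sort(func, sizes, repeats=5):
    """Return {size: best time in seconds} for func on random input (hypothetical helper)."""
    results = {}
    for size in sizes:
        data = [random.randint(0, 10**6) for _ in range(size)]
        # Taking the minimum over several runs reduces noise from other processes.
        results[size] = min(timeit.repeat(lambda: func(data), number=1, repeat=repeats))
    return results

timings = time_sort(digit_difference_sort, [1000, 2000, 4000])
for size, seconds in timings.items():
    print(f"{size:>6} elements: {seconds:.6f} s")
```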

Here is an implementation of the radix sort suggested by @JollyJoker in their answer:

from itertools import chain

def radix_sort(a):
    sub_a = [[] for _ in range(10)]
    for x in a:
        sub_a[max_digit_diff(x)].append(x)
    return list(chain.from_iterable(reversed(x) for x in sub_a))

In the timings, this seems to scale the same way as my approach; the implementation of max_digit_diff probably dominates the runtime:

[Figure: timing comparison including radix_sort]

Comment on the accepted answer

Thanks a lot. Could you please clarify how passing the max_digit_diff function as a parameter to the sorted function works?

Comment on the accepted answer

@IbrahimRahimi: sorted calls the function you specify as a key exactly once for each input and sorts according to that. Have a look at the official Sorting HOW TO for more information.
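A small way to observe this is to wrap the key function so that it records its calls. In this sketch, logged_key and calls are hypothetical names introduced only for the demonstration:

```python
def max_digit_diff(n):
    n_str = str(n)
    return int(max(n_str)) - int(min(n_str))

calls = []

def logged_key(n):
    # Hypothetical wrapper: records every value the key function is called with.
    calls.append(n)
    return max_digit_diff(n)

a = [152, 23, 7, 887, 243]
result = sorted(a, key=logged_key)

print(len(calls) == len(a))  # True: one key call per element
print(result)                # [7, 23, 887, 243, 152]
```

The elements are then ordered by the returned keys (4, 1, 0, 1, 2 here), not by the elements themselves.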

Comment on the accepted answer

Nice answer. Just curious: how did you generate the comparison graph? Is there a package for doing it?

Comment on the accepted answer

@gustavovelascoh: It is basically done with the code in my question here, with some input from the answers. One of these days I will finally make it look pretty and upload it to GitHub...

Answer 2 (radix sort)

You can do better than \$\mathcal{O}(n \log n)\$ using a Radix sort.

The differences can only have values 0-9, so you can sort the original array into a list of 10 lists while just going through the array once. Then, for each list 0-9, pop() the values into an output list until the list is empty.
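A minimal sketch of that description, reusing the max_digit_diff helper from the accepted answer; the name bucket_digit_sort is made up for illustration:

```python
def max_digit_diff(n):
    n_str = str(n)
    return int(max(n_str)) - int(min(n_str))

def bucket_digit_sort(a):
    # One bucket per possible digit difference, 0 through 9.
    buckets = [[] for _ in range(10)]
    for x in a:
        buckets[max_digit_diff(x)].append(x)
    out = []
    for bucket in buckets:
        # pop() removes from the end, so on ties the later element comes out first.
        while bucket:
            out.append(bucket.pop())
    return out

print(bucket_digit_sort([152, 23, 7, 887, 243]))  # [7, 887, 23, 243, 152]
```

Each element is appended once and popped once, so the work outside max_digit_diff is linear in the length of the array.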

Comment on the radix sort answer

Added an implementation of this to my answer and included it in the timings. Interestingly, it performs exactly the same as the built-in sorted, probably because both use max_digit_diff.

Comment on the radix sort answer

@Graipher The scaling seems odd. Could you add a timing check just before the return line to confirm that nothing in the last line is slow? Then again, maybe sorted just is that good.

Comment on the radix sort answer

Weirdly, it looks exactly the same when directly returning sub_a. Also, max_digit_diff is basically constant time, obviously, since even the longest numbers have only a few digits (fewer than a hundred).

Comment on the radix sort answer

I also tried arrays with up to 10k elements, still the same (without the OP's algorithm, obviously).

Comment on the radix sort answer

@Graipher You're probably right on Timsort. BTW, good job on turning my answer into actual code :) I wasn't at all certain my text was clear enough.

Graipher · Accepted Answer · 2019-03-11 10:30:07Z
