I am working on a code challenge from codesignal.com that asks for the following: Given an array of integers, sort its elements by the difference of their largest and smallest digits. In the case of a tie, the element with the larger index in the array should come first.
Example
For a = [152, 23, 7, 887, 243], the output should be
digitDifferenceSort(a) = [7, 887, 23, 243, 152].
Here are the differences of all the numbers:
152: difference = 5 - 1 = 4;
23: difference = 3 - 2 = 1;
7: difference = 7 - 7 = 0;
887: difference = 8 - 7 = 1;
243: difference = 4 - 2 = 2.
23 and 887 have the same difference, but 887 goes after 23 in a, so in the sorted array it comes first. I wrote the following code in Python 3. It passes all the normal tests, but it fails the execution time tests. How can I improve my code to decrease its execution time for lists with a considerable number of elements?
def digitDifferenceSort(a):
    diff = []
    for i in a:
        i = list(str(i))
        diff.append(i)
    for i in range(len(diff)):
        for j in range(i+1, len(diff)):
            if int(max(diff[i])) - int(min(diff[i])) > int(max(diff[j])) - int(min(diff[j])):
                diff[j], diff[i] = diff[i], diff[j]
            elif int(max(diff[i])) - int(min(diff[i])) == int(max(diff[j])) - int(min(diff[j])):
                diff[i], diff[j] = diff[j], diff[i]
    new_list = []
    for i in diff:
        b = ''
        for j in i:
            b = b + j
        new_list.append(int(b))
    return new_list
2 Answers
Python has a built-in `sorted` function; you should use it. To sort according to some special criterion, it needs a `key` function:
def max_digit_diff(n):
    n_str = str(n)
    return int(max(n_str)) - int(min(n_str))
This uses the fact that `"0" < "1" < ... < "9"`.
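As a quick check of that claim, here is a minimal sketch comparing the string-based key with an arithmetic version. The helper `max_digit_diff_arith` is a hypothetical name introduced only for this illustration, and both versions assume non-negative integers.

```python
def max_digit_diff(n):
    # String-based version: digit characters compare in numeric order.
    n_str = str(n)
    return int(max(n_str)) - int(min(n_str))


def max_digit_diff_arith(n):
    # Hypothetical helper: same result via integer arithmetic (assumes n >= 0).
    digits = []
    while True:
        n, d = divmod(n, 10)
        digits.append(d)
        if n == 0:
            break
    return max(digits) - min(digits)


# Digit characters are ordered like the digits they represent.
assert "0" < "1" < "9"
assert max("152") == "5" and min("152") == "1"

# Both versions agree on a few sample values.
for n in [152, 23, 7, 887, 243, 0, 1000]:
    assert max_digit_diff(n) == max_digit_diff_arith(n)
```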
However, the `sorted` function uses a stable sorting algorithm, so if two elements compare equal, their original order is preserved. Here we want the opposite order (later elements come first), so we just reverse the list first:
def digit_difference_sort(a):
    return sorted(reversed(a), key=max_digit_diff)
This should be vastly easier to read than your convoluted function. Note that the function names also follow Python's official style-guide, PEP8.
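To see the tie-breaking at work, here is a small self-contained sketch that runs the function on the example from the question. It also shows an alternative that spells the tie-break out in the key itself; `digit_difference_sort_explicit` is a hypothetical name used only for this illustration.

```python
def max_digit_diff(n):
    n_str = str(n)
    return int(max(n_str)) - int(min(n_str))


def digit_difference_sort(a):
    # Reversing first means the stable sort keeps later elements ahead on ties.
    return sorted(reversed(a), key=max_digit_diff)


def digit_difference_sort_explicit(a):
    # Hypothetical alternative: sort indices by (difference, -index),
    # so a larger index wins whenever the differences are equal.
    order = sorted(range(len(a)), key=lambda i: (max_digit_diff(a[i]), -i))
    return [a[i] for i in order]


a = [152, 23, 7, 887, 243]
print(digit_difference_sort(a))           # [7, 887, 23, 243, 152]
print(digit_difference_sort_explicit(a))  # [7, 887, 23, 243, 152]
```

The reversed version is shorter; the explicit key may be easier to follow if the reliance on sort stability feels too implicit.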
Like all (good) sorting functions, this is \$\mathcal{O}(n \log n)\$. Here is a timing comparison to your function with arrays up to length 10k (at which point your function takes more than a minute...).
Here is an implementation of the radix sort suggested by @JollyJoker in their answer:
from itertools import chain

def radix_sort(a):
    sub_a = [[] for _ in range(10)]
    for x in a:
        sub_a[max_digit_diff(x)].append(x)
    return list(chain.from_iterable(reversed(x) for x in sub_a))
This seems to have the same complexity as my approach, probably because the cost of `max_digit_diff` actually dominates the running time.
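One way to check whether the key function dominates is to time it separately from the two sorts. The following is only a rough sketch: the input size, value range, and repetition count are arbitrary assumptions, and the numbers will vary by machine.

```python
import random
from itertools import chain
from timeit import timeit


def max_digit_diff(n):
    n_str = str(n)
    return int(max(n_str)) - int(min(n_str))


def digit_difference_sort(a):
    return sorted(reversed(a), key=max_digit_diff)


def radix_sort(a):
    sub_a = [[] for _ in range(10)]
    for x in a:
        sub_a[max_digit_diff(x)].append(x)
    return list(chain.from_iterable(reversed(x) for x in sub_a))


random.seed(0)
# Assumed test data: 10,000 random integers below one million.
a = [random.randrange(10**6) for _ in range(10_000)]

# Both approaches should produce the same ordering.
assert digit_difference_sort(a) == radix_sort(a)

# Time the key computation alone, then each full sort.
t_key = timeit(lambda: [max_digit_diff(x) for x in a], number=10)
t_sorted = timeit(lambda: digit_difference_sort(a), number=10)
t_radix = timeit(lambda: radix_sort(a), number=10)

print(f"key only: {t_key:.4f}s")
print(f"sorted:   {t_sorted:.4f}s")
print(f"radix:    {t_radix:.4f}s")
```

If the "key only" time is close to both totals, the key computation accounts for most of the work in either approach.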
- Thanks a lot. Would you please clarify how passing the `max_digit_diff` function as a parameter to `sorted` works? – Ibrahim Rahimi, Mar 11, 2019 at 10:57
- @IbrahimRahimi: `sorted` calls the function you specify as a `key` exactly once for each input and sorts according to that. Have a look at the official Sorting HOW TO for more information. – Graipher, Mar 11, 2019 at 11:00
- Nice answer. Just curious, how did you generate the comparison graph? Is there a package for doing it? – gustavovelascoh, Mar 11, 2019 at 12:30
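To illustrate the point about `key` from the comments, here is a minimal sketch that counts how often the key function is called and compares the result with a hand-written "decorate, sort, undecorate" version. The names `counting_key`, `calls`, and `by_hand` are hypothetical and exist only for this demonstration.

```python
def max_digit_diff(n):
    n_str = str(n)
    return int(max(n_str)) - int(min(n_str))


calls = []


def counting_key(n):
    # Record every call so we can count them afterwards.
    calls.append(n)
    return max_digit_diff(n)


a = [152, 23, 7, 887, 243]
result = sorted(reversed(a), key=counting_key)

# The key function ran once per element, not once per comparison.
assert len(calls) == len(a)

# Roughly the same idea written by hand: pair each value with its key
# and its position, sort the tuples, then strip the extras again.
decorated = [(max_digit_diff(x), i, x) for i, x in enumerate(reversed(a))]
decorated.sort()
by_hand = [x for _, _, x in decorated]

assert by_hand == result
print(result)  # [7, 887, 23, 243, 152]
```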
You can do better than \$\mathcal{O}(n \log n)\$ using a Radix sort.
The differences can only have values 0-9, so you can sort the original array into a list of 10 lists while just going through the array once. Then, for each list 0-9, `pop()` the values into an output list until the list is empty.
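The description above can be sketched as follows. This is one possible reading of it, not a definitive implementation: it reuses the `max_digit_diff` key from the other answer, and `bucket_sort_with_pop` is a hypothetical name.

```python
def max_digit_diff(n):
    n_str = str(n)
    return int(max(n_str)) - int(min(n_str))


def bucket_sort_with_pop(a):
    # One bucket per possible difference, 0 through 9.
    buckets = [[] for _ in range(10)]
    for x in a:
        buckets[max_digit_diff(x)].append(x)

    out = []
    for bucket in buckets:
        # pop() takes from the end, so later elements come out first,
        # which is exactly the required tie-break.
        while bucket:
            out.append(bucket.pop())
    return out


print(bucket_sort_with_pop([152, 23, 7, 887, 243]))  # [7, 887, 23, 243, 152]
```

Each element is appended once and popped once, so the work beyond computing the differences is linear in the length of the input.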
- Added an implementation of this to my answer and included it in the timings. Interestingly it is exactly the same as using the built-in `sorted`. Probably due to the fact that both use `max_digit_diff`. – Graipher, Mar 11, 2019 at 14:30
- @Graipher The scaling seems odd. Could you add some timing check before the return row just to check there's nothing slow in the last line? Then again, maybe `sorted` just is that good. – JollyJoker, Mar 11, 2019 at 15:02
- Weirdly, it looks exactly the same when directly returning `sub_a`. Also, `max_digit_diff` is basically constant time, obviously, since even the longest numbers have only a few digits (less than a hundred). – Graipher, Mar 11, 2019 at 15:18
- I also tried arrays with up to 10k elements, still the same (without the OP's algorithm, obviously). – Graipher, Mar 11, 2019 at 15:27
- @Graipher You're probably right on Timsort. BTW, good job on turning my answer into actual code :) I wasn't at all certain my text was clear enough. – JollyJoker, Mar 11, 2019 at 15:48