Codewars: N smallest elements in original order

Question 1

I am consistently timing out on the mentioned kata:

Task You will be given an array of random integers and a number n. You have to extract n smallest integers out of it preserving the original order.

Examples
performant_smallest([1, 2, 3, 4, 5], 3) == [1, 2, 3]
performant_smallest([5, 4, 3, 2, 1], 3) == [3, 2, 1]
performant_smallest([1, 2, 3, 4, 1], 3) == [1, 2, 1]
performant_smallest([1, 2, 3, -4, 0], 3) == [1, -4, 0]
performant_smallest([2, 1, 3, 2, 3], 3) == [2, 1, 2]
Notes

The number n will always be smaller then the array length

There will be duplicates in the array, and they have to be returned in the order of their each separate appearence
Your solution has to be at least as fast as reference, passing the tests in 12000+ ms

If you didn't pass a small amount of tests, try submitting again. Otherwise, your solution is not efficient enough.

Test suite
Tests: 1500
Array size: 4500
Numbers range: [-100..100]
Number of elements to return: 50-100% of the array
Note: you can't use Python 2

I wrote several different functions for this:

from heapq import *
from collections import Counter as count
from copy import copy
def heap_pop(lst, n):
 heap, res, i = copy(lst), [0]*n, 0
 heapify(heap)
 c = count(heappop(heap) for _ in range(n))
 for itm in lst:
 if c.get(itm, 0) > 0:
 res[i] = itm
 i += 1
 c[itm] -= 1
 if i == n: break
 return res
def sorting(lst, n):
 res, i = [0]*n, 0
 c = count(sorted(lst)[:n])
 for itm in lst:
 if c.get(itm, 0) > 0:
 res[i] = itm
 i += 1
 c[itm] -= 1
 if i == n: break
 return res
def n_smallest(lst, n):
 heap, res, i = copy(lst), [0]*n, 0
 heapify(heap)
 c = count(nsmallest(n, heap))
 for itm in lst:
 if c.get(itm, 0) > 0:
 res[i] = itm
 i += 1
 c[itm] -= 1
 if i == n: break
 return res
def counter_sort(lst, n):
 c = count(lst)
 d, x, res, sm = sorted(c), {}, [0]*n, 0
 for k in d:
 z = min(c[k], n-sm)
 x[k] = z
 sm += z
 if sm == n: break
 sm = 0
 for itm in lst:
 if x.get(itm, 0) > 0:
 res[sm] = itm
 sm += 1
 x[itm] -= 1
 if sm == n: break
 return res
def cs2(lst, n):
c = count(lst)
d, x, res, sm = sorted(c), {}, [0]*n, 0
for k in d:
 z = min(c[k], n-sm)
 x[k] = z
 sm += z
 if sm == n: break
sm = 0
for itm in (itm for itm in lst if itm in x):
 if x.get(itm, 0) > 0:
 res[sm] = itm
 sm += 1
 x[itm] -= 1
 if sm == n: break
return res
def csy(lst, n):
 c, sm = [0]*201, 0
 for itm in lst: c[itm+100] += 1 #Map each number to num+100.
 x, res, sm = {}, [0]*n, 0
 for k in range(201):
 z = min(c[k], n-sm)
 x[k-100] = z
 sm += z
 if sm == n: break
 sm = 0
 for itm in lst:
 if x.get(itm, 0) > 0:
 res[sm] = itm
 sm += 1
 x[itm] -= 1
 if sm == n: break
 return res
def csz(lst, n):
 c = [0]*201 #Our counter.
 for itm in lst: c[itm+100] += 1 #Map each number to num+100.
 res, sm = [0]*n, 0
 for k in range(201): #Iterate through the numbers in our counter.
 sm += c[k]
 if sm >= n:
 c[k] += n - sm #The sum of `c[:k+1]` should be equal to `n`, and this would give us the count of the `n` smallest elements.
 break
 sm = 0
 for itm in lst: #Iterate through the list to present the elements in their appearance order in the list.
 v = itm+100 #The mapping between the list item and its index in our counter.
 if v <= k and c[v] > 0: #The item is one of the `n` smallest items.
 res[sm] = itm #Place it in its position in the result list.
 sm += 1
 c[v] -= 1
 if sm == n: break
 return res

I benchmarked all of them (lst is 4500 random numbers in [-100, 100], and n in [2250, 4500] with 10000 iterations) to simulate the actual data set. I'm very surprised that none of my counting sort implementations worked considering that it is linear time and there were only two \2ドル ,円 O(n)\$ iterations in it.

enter image description here

csz() was the clear best performer, but it only gets to 1068 of the test cases before failing (the same applies to all my counting sort implementations; despite showing massive difference under the benchmark (17.6 - 29.2s) they all reach and terminate at 1068 test cases).

enter image description here

Question 2

Added new functions I tried. I don't think I can do any better.

Question 3

The Kata was broken—perhaps Codewars server load currently is a lot more than it was 9 months ago when the question was made—but for whatever reason, none of the top 6 solutions could pass within the time constraints imposed on the question. The kata was reworked after someone filed an issue, and now my solution passes. I did write an even more micro optimised version of counting sort which is what I eventually submitted. The only change I made was that since n was 50-100% of the list length it would be faster to find the n smallest elements by counting off the (length - n) largest elements than counting up to n, and since the array had a fixed length for the random tests, we could do that. Here's my final version:

My Code

count = 0
def performant_smallest(lst, n):
 global count
 count += 1
 c = [0]*101 #Our counter.
 for itm in lst: c[itm+50] += 1 #Map each number to num+50
 res, sm, stop = [0]*n, 0, (5 - n if count <= 5 else 10000 - n)
 for k in range(100, -1, -1): #Iterate through the numbers in our counter starting backwards to get even better performance, since `n` is large.
 sm += c[k]
 if sm >= stop:
 c[k] = sm - stop #The sum of `c[:k+1]` should be equal to `n`, and this would give us the count of the `n` smallest elements.
 break
 sm = 0
 for itm in lst: #Iterate through the list to present the elements in their appearance order in the list.
 v = itm+50 #The mapping between the list item and its index in our counter.
 if v <= k and c[v] > 0: #The item is one of the `n` smallest items.
 res[sm] = itm #Place it in its position in the result list.
 sm += 1
 c[v] -= 1
 if sm == n: break
 return res

Tobi Alafin Tobi Alafin 1,7921 gold badge15 silver badges31 bronze badges · Answer 1 · 2019-02-18 02:55:19Z

The Kata was broken—perhaps Codewars server load currently is a lot more than it was 9 months ago when the question was made—but for whatever reason, none of the top 6 solutions could pass within the time constraints imposed on the question. The kata was reworked after someone filed an issue, and now my solution passes. I did write an even more micro optimised version of counting sort which is what I eventually submitted. The only change I made was that since n was 50-100% of the list length it would be faster to find the n smallest elements by counting off the (length - n) largest elements than counting up to n, and since the array had a fixed length for the random tests, we could do that. Here's my final version:

My Code

count = 0
def performant_smallest(lst, n):
 global count
 count += 1
 c = [0]*101 #Our counter.
 for itm in lst: c[itm+50] += 1 #Map each number to num+50
 res, sm, stop = [0]*n, 0, (5 - n if count <= 5 else 10000 - n)
 for k in range(100, -1, -1): #Iterate through the numbers in our counter starting backwards to get even better performance, since `n` is large.
 sm += c[k]
 if sm >= stop:
 c[k] = sm - stop #The sum of `c[:k+1]` should be equal to `n`, and this would give us the count of the `n` smallest elements.
 break
 sm = 0
 for itm in lst: #Iterate through the list to present the elements in their appearance order in the list.
 v = itm+50 #The mapping between the list item and its index in our counter.
 if v <= k and c[v] > 0: #The item is one of the `n` smallest items.
 res[sm] = itm #Place it in its position in the result list.
 sm += 1
 c[v] -= 1
 if sm == n: break
 return res

Stack Exchange Network

Codewars: N smallest elements in original order

Examples

Notes

Test suite

1 Answer 1

My Code

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

Codewars: N smallest elements in original order

Examples

Notes

Test suite

1 Answer 1

My Code

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions