The task is, you have an array of n numbers called nums
, and an array of m numbers called maxes
.
For each element in maxes
, find how many members of nums
are smaller than or equal to that element.
Example
nums = [4, 2, 5, 10, 8]
maxes = [3, 1, 7, 8]
output = [1, 0, 3, 4]
The first element, 3, in maxes
, is higher than only 2
in nums
, hence the first element in output is 1.
The initial solution I could think of was trivial, and it scales as O(n2):
def sol_naive(nums,maxes):
output = []
for entry in maxes:
higher_numbers = len([x for x in nums if x <= entry])
output.append(higher_numbers)
return output
I'm wondering if there's a better approach to solve this problem.
3 Answers 3
One simple observation is that this problem gets much easier if you sort nums O(n log(n)
. Then you can use bisect_left
which uses binary search to find where an element would go in a sorted list and is O(log(n))
for each element in m. This yields a total time of O((n+m) log(n))
from bisect import bisect_left
def sol_sorted(nums,maxes):
output = []
nums.sort()
for m in maxes:
ind = bisect_left(nums, m)
output.append(len(nums) - ind)
return output
-
1\$\begingroup\$ You might be able to do this slightly faster involving sorting
maxes
, but then you have to futz around a bunch with keeping output ordering, and id didn't seem worth it to me. \$\endgroup\$Oscar Smith– Oscar Smith2017年10月19日 16:14:04 +00:00Commented Oct 19, 2017 at 16:14 -
\$\begingroup\$ Very nice! thank you for making me a little smarter today \$\endgroup\$Wboy– Wboy2017年10月19日 16:47:29 +00:00Commented Oct 19, 2017 at 16:47
-
\$\begingroup\$ If you sort
maxes
as well, you can get your time toO(n log(n) + m log(m))
, which benefits us the larger the length difference is betweenn
&m
. so if you want to dolen(n) == 10000
,len(m) == 10
, it makes more sense to sort each once and then essentially use the combine step of merge-sort to find the indexes all in one go (atO(n + m)
), as opposed to doing separate bisections for each max. \$\endgroup\$TemporalWolf– TemporalWolf2017年10月19日 20:08:20 +00:00Commented Oct 19, 2017 at 20:08 -
\$\begingroup\$ Yeah, that's what I was saying. The downside is that you have to keep track of the original order of
maxes
so output ends up in the right order. It's not algorithmically slower, but it requires more python, and adds a fair amount of overhead. I would be interested to see if it ever ends up faster in practice. \$\endgroup\$Oscar Smith– Oscar Smith2017年10月19日 20:52:41 +00:00Commented Oct 19, 2017 at 20:52 -
\$\begingroup\$ The base idea looks solid. Using
bisect_left()
instead ofbisect_right()
ruins it, as does subtracting the index found from the length ofnums
. \$\endgroup\$greybeard– greybeard2021年11月24日 06:38:59 +00:00Commented Nov 24, 2021 at 6:38
You could turn your for-loop
into a list comprehension
:
nums = [4, 2, 5, 10, 8]
maxes = [3, 1, 7, 8]
output = [len([x for x in nums if x <= entry]) for entry in maxes]
There are two promising alternatives to ordering nums
and finding the index of the smallest element therein larger than each element of maxes
:
order maxes
(ascendingly) and
tally elements num in nums
(ordered[m-1] <
)
num <= ordered[m]
and accumulate (finally reordering output) -
O((n+m) log(m))
order both and take turns finding, for an element in one sequence, the first exceeding it in the other, continuing with the successor thereof
Looks promising with len(nums)
and len(maxes)
in the same ballpark.
finger exercise:
from bisect import bisect_right
def oscar(nums, maxes):
""" Return an iterable containing for each element m in maxes
the number of elements in nums no larger than m. """
ordered = sorted(nums)
return (bisect_right(ordered, m) for m in maxes)
# for a list, I'd have to decide whether to make it immutable - and how
_empty = ()
# tries to reduce the ranges searched
def smith(nums, maxes):
""" Return an iterable containing for each element m in maxes
the number of elements in nums no larger than m. """
if not maxes or not nums:
return _empty
output = list()
ordered = sorted(nums)
previous = ordered[0]
index = bisect_right(nums, previous)
for m in maxes:
if previous < m:
index = bisect_right(nums, m, index)
elif m < previous:
index = bisect_right(nums, m, 0, index)
output.append(index)
previous = m
return output