
In Daily Coding Problem, Miller & Wu pose the following problem: given an array of integers, return a new array where each element is the number of smaller elements to the right of that element in the original input array.

Their suggested solution is (in Python):

import bisect

def smaller_counts(lst):
    result = []
    seen = []
    for num in reversed(lst):
        i = bisect.bisect_left(seen, num)
        result.append(i)
        bisect.insort(seen, num)
    return list(reversed(result))

While the algorithm is correct, the authors then claim that it runs in O(n log n), which seems wrong to me: I'd expect each call to insort to be O(n), since we're inserting into an array and need to shift every element after the insertion point one position to the right. That's also confirmed by the Python documentation.

As a result, I think the total cost is O(n^2).

Am I missing something, or did the authors make a mistake here (probably assuming that insort would have the same O(log n) cost as bisect_left)?
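For what it's worth, a rough timing check (numbers are machine-dependent; only the growth rate matters) seems to confirm the super-linear behaviour of repeated insort calls:

import bisect
import random
import timeit

# Time n insort calls into an initially empty list, for growing n.
# If each call were O(log n) the total would grow roughly like n log n;
# the element shifting makes it grow much closer to n^2.
for n in (1_000, 10_000, 100_000):
    data = [random.random() for _ in range(n)]

    def fill(data=data):
        seen = []
        for x in data:
            bisect.insort(seen, x)

    print(n, timeit.timeit(fill, number=1))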

I also thought of posting this on Code Golf, but wasn't sure it belonged there.

asked Nov 23, 2021 at 17:13

2 Answers


Your reasoning seems correct. Making n iterations over an O(n) operation means O(n^2) total complexity in general. Whether the insertion actually has O(n) time complexity depends on the data distribution. For example, if we were to insert elements in ascending order (so that new elements are only appended, without shifting existing values) then each insertion could be as cheap as O(1). However, this problem does not allow us to make any assumptions about the data distribution.
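To illustrate that distribution dependence, here is a small sketch (timings are machine-dependent; only the gap between the two cases matters):

import bisect
import timeit

def fill(values):
    seen = []
    for v in values:
        bisect.insort(seen, v)

n = 50_000
# Ascending input: every insert lands at the end, nothing shifts -> ~O(1) each.
print("ascending :", timeit.timeit(lambda: fill(range(n)), number=1))
# Descending input: every insert lands at index 0, shifting all -> O(n) each.
print("descending:", timeit.timeit(lambda: fill(range(n, 0, -1)), number=1))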

Note that an O(n log n) solution is possible if seen is an ordered set data structure with O(log n) insertion, such as a balanced binary tree. Python has no suitable data structure in the standard library. The heapq module is similar but cannot provide the necessary operations here. A possible manual implementation might look like this:

from dataclasses import dataclass
from typing import Optional, Tuple

def smaller_counts(lst: list[int]) -> list[int]:
    result = []
    seen = None  # root of the tree of values already seen (to the right)
    for num in reversed(lst):
        location, seen = treeset_insert_and_get_location(seen, num)
        result.append(location)
    return list(reversed(result))

@dataclass
class Node:
    value: int
    size: int  # number of values stored in this subtree, counting duplicates
    left: 'Optional[Node]' = None
    right: 'Optional[Node]' = None

    @property
    def left_size(self) -> int:
        if self.left is not None:
            return self.left.size
        return 0

    @property
    def right_size(self) -> int:
        if self.right is not None:
            return self.right.size
        return 0

def treeset_insert_and_get_location(
    tree: Optional[Node], value: int,
) -> Tuple[int, Node]:
    if tree is None:
        return 0, Node(value=value, size=1)
    if value < tree.value:
        location, tree.left = treeset_insert_and_get_location(
            tree.left, value)
        tree.size += 1
        return location, tree
    if value > tree.value:
        location, tree.right = treeset_insert_and_get_location(
            tree.right, value)
        tree.size += 1
        # everything outside the right subtree (the left subtree plus all
        # copies of this node's value) is smaller than the new value
        return tree.size - tree.right_size + location, tree
    assert tree.value == value
    tree.size += 1
    # duplicate: the strictly smaller values are exactly the left subtree
    return tree.left_size, tree

full code incl. tests
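A quick sanity check on a small, hand-worked example:

print(smaller_counts([3, 4, 9, 6, 1]))  # -> [1, 1, 2, 1, 0]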

In practice, the presented solution in the question is still going to be reasonably efficient at smaller input sizes. The O(n) insertion is the kind of operation that modern computers are fairly good at. I'd expect that the presented solution, or even a naive O(n^2) solution, would outperform any tree-based data structure until about a few hundred or a few thousand elements.
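For reference, the naive O(n^2) baseline is just a double scan (a minimal sketch; the function name is my own):

def smaller_counts_naive(lst):
    # For each element, scan everything to its right and count smaller values.
    return [sum(1 for y in lst[i + 1:] if y < x) for i, x in enumerate(lst)]

assert smaller_counts_naive([3, 4, 9, 6, 1]) == [1, 1, 2, 1, 0]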

answered Nov 23, 2021 at 17:57
  • The problem is that we want the position at which we insert num, which won't be returned by any data structure with O(log n) insertion cost that I know of. So I don't think there's even any algorithm that works in O(n log n) for the whole problem. Commented Nov 23, 2021 at 19:37
  • @molyss I've updated the answer to include a crude implementation of a tree-set which can query/insert with O(log n) typical complexity. However, this tree is not self-balancing, so it will degrade to O(n) when dealing with sorted input. This style of tree can efficiently determine the ordered location because each node tracks its size. Commented Nov 23, 2021 at 20:58

An obvious solution is sorting a copy of the original array while keeping track of the location of each item in the original array; then all the information needed to calculate the result is readily available. This can be done in O(n log n).
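One concrete way to realize this in O(n log n) (a sketch of one possible bookkeeping, not necessarily the exact scheme intended here; the function name is mine) is a merge sort over (value, original index) pairs, where each left-half element, at the moment it is merged, counts the strictly smaller right-half elements that were merged before it:

def smaller_counts_mergesort(lst):
    # counts[i] = number of elements to the right of lst[i] that are smaller.
    counts = [0] * len(lst)

    def sort(pairs):
        # pairs are (value, original index); returns them sorted by value.
        if len(pairs) <= 1:
            return pairs
        mid = len(pairs) // 2
        left, right = sort(pairs[:mid]), sort(pairs[mid:])
        merged, i, j = [], 0, 0
        while i < len(left) and j < len(right):
            if right[j][0] < left[i][0]:
                merged.append(right[j])
                j += 1
            else:
                # The j right-half values merged so far are all strictly
                # smaller than left[i] and came from its right originally.
                counts[left[i][1]] += j
                merged.append(left[i])
                i += 1
        for pair in left[i:]:  # right half exhausted: all of it was smaller
            counts[pair[1]] += j
            merged.append(pair)
        merged.extend(right[j:])
        return merged

    sort([(value, index) for index, value in enumerate(lst)])
    return counts

assert smaller_counts_mergesort([3, 4, 9, 6, 1]) == [1, 1, 2, 1, 0]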

Experimentally, you can run their algorithm on arrays of 1,000, 10,000, 100,000 and a million random items. Execution time should grow by a factor of less than 15 each time if it is O(n log n), and by a factor of 100 if it is quadratic.

answered Nov 26, 2021 at 13:46
  • I think that'd be an O(n^2) solution. The sorting would be O(n log n), but then you'd have to iterate over the sorted array and count how many items to the right have an original position greater than the current one. Sorting the array would do nothing here, unless I'm again missing something. Commented Nov 27, 2021 at 5:12
