What is the name for this search algorithm?

Question 1

A colleague came up with the following search algorithm:

We have an array of sorted, distinct integers [$i_0$ < $i_2$ < ... < $i_n$]
We are looking for the index of $i_k$ (for simplicity suppose that $i_k$ is in the array)
We suppose that the elements are pretty uniformly distributed between $i_1$ and $i_n$

The algorithm:

let "position" be initially be 0
if $i_k$ == $i_{position},ドル stop
otherwise:
- calculate distance = $i_k$ - $i_{position}$
- if we have just switched direction (eg. the previous distance was positive and this one is negative or vice-versa), make sure that abs(distance) < abs(previous distance) by adding/subtracting 1 from distance such that it gets closer to 0. Note: if distance is already is 1 or -1, it isn't adjusted.
- let position be position + distance (see note below for possible optimization)
- if position is positive, let position be min(position, n) (ie. don't overshoot the end)
- if position is negative, let position be max(position, 0) (ie. don't undershoot the beginning)

I would be interested if somebody studied this algorithm and if so, what's its name? The best I could come up with is that it's a variant of "interpolation search", however I expect the above algorithm to be faster on x86 hardware because it only uses addition/subtraction which is faster than the division/multiplication used by "standard" interpolation search.

Note:

Instead of updating position by adding the distance (ie. position = position + distance) we can use the "average distance between consecutive elements" as a scaling factor: $$position \leftarrow position + distance / {i_n-i_0 \over n+1}$$

This should give a more precise estimate about the element position.

A disadvantage of the above is that it involves a division which is slow(er) on x86/x64 platforms. Perhaps we can get away with finding the closes power of two to the scaling factor (average distance between consecutive elements) and use shift to perform the division (ie. if 2ドル^t$ is the power of two closes to the scaling factor we can do $position \leftarrow position + distance >> k$.

Update:

as people have pointed out the initial element needs to be $i_0$
added steps to avoid oscillating indefinitely between undershooting / overshooting the target
discovered the "galloping search" from Timsort which seems related but not the same

Question 2

This algorithm doesn't seem to work correctly. On the array $\{1,3,5\},ドル searching for 3ドル$ it will jump between 1ドル$ and 5ドル$ forever.

Question 3

Unless your array indices start at 0, the first time you do step 2 you'll refer to a non-existent array location.

Question 4

@RickDecker - correct, I updated the text to reflect the fact that the array starts at 0.

Question 5

@TomvanderZanden - correct. I updated the algorithm to avoid this "oscillating indefinitely" situation.

Question 6

@Cd-Man That resolves my concern with it oscillating indefinitely. However, it is still a pretty inefficient search algorithm since on an array $\{2,4,\ldots,2k-2,2k,2k+2,\ldots,4k-2,4k\},ドル searching for 2ドルk$ it needs linear time since it triggers your "stop it jumping" thing every step.

Question 7

You may be interested to a related algorithm, known as Interpolation Search, which achieves on average the complexity of $O(\log \log n)$ for searching $n$ keys. This is the usual search you do when you scan a phone book for example.

This paper discusses the algorithm, which was originally introduced earlier (1957) by Peterson.

Related data structures: dynamic interpolation search and interpolation search for non independent data.

Question 8

@Cd-MaN is aware of interpolation search, he even mentions it in his question. He hopes his search algorithm improves upon interpolation search by removing the (relatively expensive on many processors) division operation.

Massimo Cafaro Massimo Cafaro 4,37019 silver badges27 bronze badges · Answer 1 · 2015-02-25 10:11:54Z

You may be interested to a related algorithm, known as Interpolation Search, which achieves on average the complexity of $O(\log \log n)$ for searching $n$ keys. This is the usual search you do when you scan a phone book for example.

This paper discusses the algorithm, which was originally introduced earlier (1957) by Peterson.

Related data structures: dynamic interpolation search and interpolation search for non independent data.

@Cd-MaN is aware of interpolation search, he even mentions it in his question. He hopes his search algorithm improves upon interpolation search by removing the (relatively expensive on many processors) division operation.

Stack Exchange Network

What is the name for this search algorithm?

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions