The following pseudocode returns index of the given element x
within sorted array A
where p
is starting index and r
is last index of array A
.
I am confused about though process behind picking initial value of high
. I understand that if while
loop has low < high
then we pick high = r + 1
but if while
loop was low <= high
then we would pick high = r
so that element at highest index of array is always included in the search (thus not excluded from the search).
Please elaborate on the crux of thought process behind picking high
index value for initializing.
For example, one thing that I could think of was following:
Is there a relation to picking mid
value using floor? Floor biases mid
towards the left of the sorted array but high = r + 1
biases the mid
towards the right so it kind of balances out.
If I pick high = r
, in addition to changing while
condition to low <= high
, should I change mid
to ceiling((low+high)/2)
?
FIND-SPLIT-POINT(A, p, r, x)
low = p // low end of search range
high = r + 1 // high end of search range
while low < high // more than one element?
mid = floor((low + high)/2) // midpoint of range
if x <= A[mid] // is answer q <= mid?
high = mid // narrow search to A[low : mid]
else low = mid + 1 // narrow search to A[mid + 1 : high]
return low
-
1$\begingroup$ I would suggest you first do some dry runs on paper with, say, an array of size 10. Let us know if you still have doubts. $\endgroup$codeR– codeR2024年05月15日 07:39:15 +00:00Commented May 15, 2024 at 7:39
-
$\begingroup$ This seems to be an iterative version of binary search; there should be plenty of materials on the internet. :) $\endgroup$codeR– codeR2024年05月15日 07:42:02 +00:00Commented May 15, 2024 at 7:42
-
$\begingroup$ The algorithm itself is pretty simple but choice of high index is not elaborated on. Both versions can be proven correct with loop invariants. I have edited the question to reflect that I don't particularly want a loop invariant but elaboration of the process of picking high index value. $\endgroup$jam– jam2024年05月15日 16:54:15 +00:00Commented May 15, 2024 at 16:54
2 Answers 2
The thought process:
- I have an interval containing x.
- I can half the size of the interval, repeatedly.
- This is tricky to get right so I write down a loop invariant.
- I’ll need a loop invariant that is true before the loop starts.
- I’ll need a loop invariant that stays true after each iteration.
- Every iteration must make progress.
You said in a comment that you don't want a loop invariant, but I would strongly advise you to approach this in terms of loop invariants. The appropriate choices for low
and high
are those that lend themselves to an easy-to-state loop invariant.
The loop invariant for this algorithm appears to be that the split point between elements < x and those ≥ x is in the range low
to high
inclusive. If x
occurs exactly once in A
then the split point is also its index, but the algorithm is more general than that. There are length(A) + 1 = r + 2
possible split points, so I don't think assigning a lower value to high
would make sense.
If you replaced floor
with ceil
in the algorithm as written, it wouldn't work at all (consider what happens when high - low = 1
). The reason floor
makes sense here is that low
and high
point between array elements (to split points) while mid
points to an array element. If (low + high)/2
is not an integer, then interpreted in the same way as low
and high
, it points to the middle of an array element, and floor
gets you that element's index with no left/right bias (assuming 0-based indexing; in a 1-based language you should use ceil
). The bias in this algorithm is actually when (low + high)/2
is an integer: the array element to the right of center is tested in that case.
If you know that x
occurs in the array, you could pick a loop invariant like "the index of the leftmost occurrence of x
is in the range low
to high
inclusive" and set high = r
initially. If x
may not occur in the array, it's much harder to come up with a good invariant. You could start with "the range low
to high
inclusive includes the smallest element that is at least as large as x
", but that doesn't handle the case that all elements are smaller than x
. You would probably have to imagine that there is an infinite element past the end of the array, and allow the current range to include it, which takes you back to the original algorithm.