EDIT: code snippet added, check the output in JS console
I wrote the following binary search algorithm in JS, for the first time in my experience. The variable sampleObj
is a random object array in the form of: [{index: 5, sample: 3}, {index: 8, sample: 1}, ...]
, the final console.log(sampleObj)
should log target value and index in an object if found, 'not found'
if not found
My idea is:
- sort the sample object array from small to large value, along with the index of each value,
- find the middle value of the array, if it equals to
target
, return the object; if it's larger, assign the first half of the object array tosampleObj
, otherwise assign the second half, - repeat until found or not found.
The code consists of 3 main parts:
- generate sample object array,
- define binary search method,
- conduct a binary search.
Here is the code, it passed all my test cases, but what can I improve?
// sample size
const N = 20;
const sample = Array.from({length: N}, () => Math.floor(Math.random() * N));
// define target value
const target = 0;
const indices = [...Array(N).keys()];
const sampleWithIndices = [sample, indices];
let sampleObj;
console.log(sampleWithIndices);
console.log(`target is ${target}`);
// 1. generate sample object array
const generateObj = (origin) => {
let output = [];
for(let i = 0; i < origin.length; i++) {
let index = [...Array(N).keys()];
output[i] = {
'sample': origin[i],
'index': index[i]
}
}
return output;
};
sampleObj = generateObj(sample);
// sort sample object by sample array
const sortSampleObj = (input) => {
return input.sort((a, b) => {
return a.sample - b.sample
})
};
sampleObj = sortSampleObj(sampleObj);
// 2. define binary search method
const binarySearch = (inputObj, inputMidIndex, inputMidSampleValue, inputTarget) => {
if(inputMidSampleValue === inputTarget) {
inputObj = [inputObj[inputMidIndex]];
}
else if(inputMidSampleValue > inputTarget) {
inputObj = inputObj.slice(0, inputMidIndex);
}
else {
inputObj = inputObj.slice(inputMidIndex);
}
return inputObj;
};
// 3. conduct binary search
let midIndex, midSampleValue, maxIterations;
maxIterations = Math.round(sampleObj.length / 2);
for(let i = 0; i < maxIterations; i++) {
midIndex = Math.round(sampleObj.length / 2) - 1;
midSampleValue = sampleObj[midIndex].sample;
sampleObj = binarySearch(sampleObj, midIndex, midSampleValue, target);
if (sampleObj.length === 1) {
sampleObj = sampleObj[0].sample === target ? sampleObj : 'not found';
break;
}
}
console.log(sampleObj)
-
\$\begingroup\$ Hi @user122973 - the obvious problem is there's a lot of "assumptions" in the code. e.g. something named sampleObj is used as an array... not sure many people would think that .. \$\endgroup\$Mr R– Mr R2021年06月08日 02:23:14 +00:00Commented Jun 8, 2021 at 2:23
-
\$\begingroup\$ Could you please edit your post and provide a working sample of your code in a code snippet (ctrl + m)? Thanks. \$\endgroup\$kemicofa– kemicofa2021年06月08日 09:23:01 +00:00Commented Jun 8, 2021 at 9:23
-
1\$\begingroup\$ @kemicofaghost sorry I should have thought that, snippet added \$\endgroup\$one-hand-octopus– one-hand-octopus2021年06月08日 09:27:30 +00:00Commented Jun 8, 2021 at 9:27
1 Answer 1
Review
The general impression at first look is that the code is rather messy with poor formatting, poorly organised and a lot of noisy, redundant and/or long winded code.
When testing the code I found bugs. The general rule at Code Review is that code should work to the best of your knowledge. However I do not think you knew that it could fail.
Thorough testing would have found the bug. Always test your code
Bugs
There are bugs. The fact that you use the value maxIterations
to terminate the loop shows you were somewhat aware of a problem.
Main bug
For input array of 20 items, items at sorted index of 3, 5, 8, 10, 13, 15, 19 are not found because the function binarySearch
does not correctly split the array into smaller parts.
Minor problems
Further and minor problems are the inconsistent result, both undefined
and "not found"
have the same meaning (no result).
The found item is added to an array while "not found"
string and undefined
are not in an array.
This make working with the result difficult.
Personally undefined
is preferred over a string as a failed result, and the result (as there can be only one) should be a reference rather than an array containing a reference. (see rewrite)
Complexity
Your complexity \$O(n)\$ is way above what a binary search should be \$O(log(n))\$
Binary search is fast because at most it need only test \$O(log(n))\$ items to find a match or know there is no match. Eg for 32 items you test only 5 items
However you copy parts of the array each time you step the search (half the items of the previous half).
Making a copy means you need to step over each item copied (hidden iteration by JS engine when you call Array.slice). Thus the worst case is \$O(n)\$ which is the same as a linear search. Example An array of 32 will require the copying of 16 + 8 + 4 + 2 + 1 items (31) in the worst case
The rewrite uses a bounds to calculate the next index to test. It does not copy any part of the array.
General Points
Code noise
Code noise is anything that is
- Code not required or is excessively complex (source code complexity)
- Code that can be written in a standard shorter form
- Repeated code
Shorter form
Property names
There is no need to wrap object property names in quotes. Eg You have
output[i] = {'sample': origin[i], 'index': index[i]}
can beoutput[i] = {sample: origin[i], index: index[i]}
Implied return
Arrow function can have an implied return. For example the sort function can be
input.sort((a, b) => a.sample - b.sample);
(same asinput.sort((a, b) => { return a.sample - b.sample); }
) the return is implied if you do not include{}
around the functions code block.
Not required
Excessive complexity
In
generateObj
you create an array of indexes[...Array(N).keys()]
then index that array to get an indexindex: index[i]
. I am not sure why you do this as you have the index already asi
There is no need to loop till
maxIterations
, you can use a while loop to exit when the array size is 1. (I understand you used it to avoid the bugs)It is best to calculate values as close to the point you need them.
The variables
midIndex
andmidSampleValue
are created outside the function that uses thembinarySearch
. These values are not used outside that function.As that function has all the data required to calculate theses values it is far simpler to calculate them in that function rather than outside and needing to pass them as arguments.
Redundant assign
Array.sort sorts the array in place. There is no need to reassign it.
sampleObj = sortSampleObj(sampleObj);
is the same assortSampleObj(sampleObj);
Style
Some style points.
Use
const
for constants.Take care to correctly indent code.
Always create functions rather than use inline code. In the rewrite the search is a function. This makes the code portable, gives semantic meaning to the code, allows for a cleaner exit.
Rewrite
The rewrite fixes bugs and has the correct complexity \$O(log(n))\$ for a binary search.
Notes
The rewrite function
binarySearch
does the searchThe search function takes an additional argument
propName
that defines the name of the property that the search is looking for. It defaults to"sample"
The function returns the item found or undefined
There is a usage example at the bottom.
There are many ways to implement a binary search. The rewrite creates a bounds top
and bottom
The mid point idx
is halfway between top and bottom and is the item to test. If the tested point is greater then top
is set to the mid point idx
else bottom
is set to mid point idx
.
I use binary operator right shift >> (Right shift) to half values as it does a divide and floor in one step. Eg val >> 1
is the same as Math.floor(val / 2)
function binarySearch(data, targetValue, propName = "sample") {
var top = data.length - 1, bottom = 0, idx = top >> 1;
while (top >= bottom){
const item = data[idx], val = item[propName];
if (val === targetValue) { return item }
val < targetValue ? bottom = idx + 1 : top = idx - 1;
idx = (bottom + top) >> 1;
}
}
const data = [{sample: 1, idx 0}, {sample: 10, idx 1}, {sample: 20, idx 2}];
console.log(binarySearch(data, 10)?.idx ?? "Not found")
-
\$\begingroup\$ As a discussion, when you mentioned in main bug: For input array of 20 items, items at sorted index of 3, 5, 8, 10, 13, 15, 19 are not found, I tried searching 0 in
[9, 13, 17, 11, 8, 0, 1, 1, 13, 15, 10, 10, 19, 11, 9, 1, 14, 15, 5, 4]
and the algorithm returned{index: 5, sample 0}
, so I suppose item at the sorted index of 5 could be found? Also for[4, 18, 19, 9, 15, 19, 17, 5, 7, 1, 11, 16, 2, 12, 10, 0, 2, 16, 7, 18]
, 0 could be found at index 15. \$\endgroup\$one-hand-octopus– one-hand-octopus2021年06月08日 15:47:27 +00:00Commented Jun 8, 2021 at 15:47 -
\$\begingroup\$ @user122973 In both examples you provide 0 would be at sorted index 0 (the index of the item when sorted does not match the
item.index
value) \$\endgroup\$Blindman67– Blindman672021年06月08日 16:23:54 +00:00Commented Jun 8, 2021 at 16:23 -
\$\begingroup\$ Can you explain the last line of syntax
binarySearch(data, 10)?.idx ?? "Not found"
, especially?.idx ?? 'not found'
. What is the.idx
after the ternary operator, and why does??
return 'not found'? \$\endgroup\$one-hand-octopus– one-hand-octopus2021年06月09日 19:58:03 +00:00Commented Jun 9, 2021 at 19:58