Background
I am implementing insertion sort and I would like it to be as efficient as the algorithm allows.
Code
After much research, this is what I made:
const isFunction = require("lodash.isfunction"); // copy pasta from lodash :D

const insertion = (array) => {
    if (!Array.isArray(array))
        throw new Error("array must be an Array");

    if (array.length === 0)
        return [];

    // shallow copy
    const clonedArray = array.slice();

    // Optimized iterative version: https://en.wikipedia.org/wiki/Insertion_sort
    let i = 1, j, temp;
    while (i < clonedArray.length) {
        temp = clonedArray[i];
        j = i - 1;
        while (j >= 0 && clonedArray[j] > temp) {
            clonedArray[j + 1] = clonedArray[j];
            j = j - 1;
        }
        clonedArray[j + 1] = temp;
        i = i + 1;
    }
    return clonedArray;
};

module.exports = insertion;
Questions
I tested the code and I know it works, but I wonder whether the algorithm available on Wikipedia is the best insertion sort algorithm.
Bearing in mind that I want this function's API to be pure (I don't want to change the original array; I want to return a new one), I have the following questions:
- Is there a better way to code insertion sort?
- What improvements can be made to this code?
2 Answers
Other than rewriting it in a less pseudo-code-like way:
const insertion = (array) => {
    if (!Array.isArray(array))
        throw new Error("array must be an Array");

    if (array.length === 0)
        return [];

    // shallow copy
    const clonedArray = array.slice();

    // Optimized iterative version: https://en.wikipedia.org/wiki/Insertion_sort
    let j, temp;
    for (let i = 1; i < clonedArray.length; i++) {
        temp = clonedArray[i];
        for (j = i - 1; j >= 0 && clonedArray[j] > temp; j--) {
            clonedArray[j + 1] = clonedArray[j];
        }
        clonedArray[j + 1] = temp;
    }
    return clonedArray;
};

module.exports = insertion;
There isn't much room for improvement from an algorithm perspective, since you're already using the optimized algorithm, and you aren't doing anything unnecessary from a JavaScript perspective.
Rewriting it to use for() instead of while() won't have any real effect; it just looks less like pseudo-code and more like typical JS. It is really just a code-style thing, and open to opinion.
(Also, I dropped isFunction(), since you don't seem to use it, but that might just be copy-pasta from the rest of your code.)
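As a quick sanity check of the for() variant, here is a standalone sketch (sortFor is just an illustrative name, not part of the original code). One detail to watch when converting: j must be declared outside the inner for-header, otherwise it is block-scoped to the loop and the final write after the loop can't see it.

```javascript
// Standalone sketch of the for-loop variant; "sortFor" is an illustrative name.
const sortFor = (array) => {
    const a = array.slice();
    for (let i = 1; i < a.length; i++) {
        const temp = a[i];
        let j; // declared here, not in the for-header, so it survives the loop
        for (j = i - 1; j >= 0 && a[j] > temp; j--) {
            a[j + 1] = a[j];
        }
        a[j + 1] = temp;
    }
    return a;
};

console.log(sortFor([5, 2, 4, 6, 1, 3])); // [ 1, 2, 3, 4, 5, 6 ]
```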
- kuskmen (Nov 30, 2017 at 12:43): If array.length is 0, why return a new empty array? Maybe return the original array? I believe it is known to be empty.
- samanime (Nov 30, 2017 at 12:45): Since they explicitly state they want it to be a pure function, they should return a new one for consistency. If you pass in an array and sometimes get a new one and sometimes the existing one, you can't rely on it being pure. Returning a new array is the correct move with that requirement.
- kuskmen (Nov 30, 2017 at 12:47): As far as I know, being pure means it doesn't change state? Returning the same object without modifying it is pure enough, no?
- samanime (Nov 30, 2017 at 12:50): Not if you expect it to always return a new one. It's as if map() returned the original under certain circumstances: you then change the mapping function and are surprised when your original changes too. While the strict technical definition may allow it, from a practical perspective, if a function generally returns a new object, it should always return one.
- samanime (Nov 30, 2017 at 14:51): Since an array is always passed by reference, if the function returns the original array, you now have two references to the same array instead of two separate arrays. If you change one, you change both. It's technically a pure function, but from an "actually using the function" perspective, it's not truly pure.
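The aliasing the comments describe can be demonstrated directly. The helper names below are hypothetical, introduced only to contrast the two empty-array policies under discussion:

```javascript
// Hypothetical helpers contrasting the two empty-array policies.
const returnOriginal = (array) => array;     // "return the original" policy
const returnCopy = (array) => array.slice(); // "always return a new array" policy

const input = [];
const aliased = returnOriginal(input);
aliased.push(99);
console.log(input.length); // 1 — mutating the result also mutated the caller's array

const input2 = [];
const copied = returnCopy(input2);
copied.push(99);
console.log(input2.length); // 0 — the caller's array is untouched
```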
Maybe splicing can help?
If Array.splice were O(1), the following would be an improvement on average. If splice is O(n), there is no benefit, and this method does not improve the worst case either.
You can get a little extra out of the sort by using the previously moved value as a known low value to start the search from: if the next value from the array is lower than or equal to the previous value, you can start from that position rather than from i - 1. But because you no longer step over each value, you need a way to move the intervening values; for this you can use splice.
The following is an example implementation.
function sort(array) {
    const copyArr = [...array]; // copy, as required by the OP
    let i = 1;
    let lastVal = copyArr[0]; // the last moved value
    let lastPos = 0;          // the position of the last moved value
    while (i < copyArr.length) {
        const temp = copyArr[i];
        let j;
        if (lastVal >= temp) { // is the new value the same or lower?
            j = lastPos;       // start the search from that position
        } else {
            j = i - 1;         // if not, start as normal
        }
        while (j >= 0 && copyArr[j] > temp) { // find the position for temp
            j -= 1;
        }
        if (j < i - 1) { // did we find a lower position?
            // yes, so move temp from i to the new location via splicing
            copyArr.splice(j + 1, 0, copyArr.splice(i, 1)[0]);
            lastVal = temp;    // remember the value and its location
            lastPos = j + 1;
        }
        i += 1;
    }
    return copyArr;
}
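The two-splice move used in the implementation can be seen in isolation. Note that in typical engines each splice is itself O(n), since trailing elements are shifted:

```javascript
// Move the element at index 4 down to index 1 using two splices.
const arr = [1, 3, 5, 7, 2];
const [moved] = arr.splice(4, 1); // remove the 2 → arr is [1, 3, 5, 7]
arr.splice(1, 0, moved);          // insert the 2 at index 1
console.log(arr); // [ 1, 2, 3, 5, 7 ]
```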
For an even distribution of random values, the savings show up as fewer iterations of the inner loop, averaged over many arrays. For a 10-item array there are 50%+ fewer inner iterations; for 100 items it's 37% fewer; for 1000 it's about 33%.
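Those percentages can be reproduced approximately (the exact numbers depend on the random input) by instrumenting the inner loop of the plain version with a counter. A sketch of the baseline side of that comparison:

```javascript
// Count inner-loop iterations of the plain insertion sort, as a baseline
// for comparing against the splice-based variant.
function countInnerIterations(array) {
    const a = array.slice();
    let count = 0;
    for (let i = 1; i < a.length; i++) {
        const temp = a[i];
        let j = i - 1;
        while (j >= 0 && a[j] > temp) {
            a[j + 1] = a[j];
            j -= 1;
            count += 1;
        }
        a[j + 1] = temp;
    }
    return count;
}

console.log(countInnerIterations([1, 2, 3, 4, 5])); // 0  (already sorted)
console.log(countInnerIterations([5, 4, 3, 2, 1])); // 10 (worst case: n*(n-1)/2)
```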
- Flame_Phoenix (Nov 30, 2017 at 17:33): Interesting. So this algorithm will only prove better than the one I currently have if splice is O(1). This is worth investigating!
- (truncated comment) ... array.slice. BTW, if you copy an array, use the spread operator: cloneArray = [...array];. For JavaScript, and because you are making a copy, you can use cloneArray = new Float64Array(array.length);, which will improve indexing time.
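A caveat on that last suggestion: Float64Array only holds numbers, so it is a fit only when the input is known to be numeric. A hedged sketch under that assumption:

```javascript
// Sketch assuming a numeric-only input array; Float64Array cannot hold
// strings or objects, so this is not a general-purpose replacement.
const source = [3.5, 1.25, 2];
const typed = Float64Array.from(source); // copy into a typed array
typed.sort(); // TypedArray#sort compares numerically by default (no string coercion)
console.log(Array.from(typed)); // [ 1.25, 2, 3.5 ]
```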