Background
I am implementing insertion sort and I would like it to be as efficient as the algorithm allows.
Code
After much research, this is what I made:
const isFunction = require("lodash.isfunction"); // copy pasta from lodash :D

const insertion = (array) => {
    if (!Array.isArray(array))
        throw new Error("array must be an Array");

    if (array.length === 0)
        return [];

    // shallow copy
    const clonedArray = array.slice();

    // Optimized iterative version: https://en.wikipedia.org/wiki/Insertion_sort
    let i = 1, j, temp;
    while (i < clonedArray.length) {
        temp = clonedArray[i];
        j = i - 1;
        while (j >= 0 && clonedArray[j] > temp) {
            clonedArray[j + 1] = clonedArray[j];
            j = j - 1;
        }
        clonedArray[j + 1] = temp;
        i = i + 1;
    }
    return clonedArray;
};

module.exports = insertion;
Questions
I tested the code and I know it works, but I wonder whether the algorithm available on Wikipedia is the best insertion sort algorithm.
Bearing in mind that I want this function's API to be pure (I don't want to change the original array; I want to return a new one), I have the following questions:
- Is there a better way to code insertion sort?
- What improvements can be made to this code?
2 Answers
Other than rewriting it in a less pseudo-code-like way:
const insertion = (array) => {
    if (!Array.isArray(array))
        throw new Error("array must be an Array");

    if (array.length === 0)
        return [];

    // shallow copy
    const clonedArray = array.slice();

    // Optimized iterative version: https://en.wikipedia.org/wiki/Insertion_sort
    let j, temp;
    for (let i = 1; i < clonedArray.length; i++) {
        temp = clonedArray[i];
        for (j = i - 1; j >= 0 && clonedArray[j] > temp; j--) {
            clonedArray[j + 1] = clonedArray[j];
        }
        clonedArray[j + 1] = temp;
    }
    return clonedArray;
};

module.exports = insertion;
There isn't much room for improvement from an algorithm perspective, since you're already using the optimized algorithm, and you aren't doing anything unnecessary from a JavaScript perspective.
Rewriting it to use for() instead of while() won't have any real effect; it just looks less like pseudo-code and more like typical JS. It is really just a code-style thing, and open to opinion.
(Also, I dropped isFunction(), since you don't seem to use it, but that might just be copy-pasta from the rest of your code.)
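As a quick sanity check of the for() variant, here is a standalone sketch (sortFor is just an illustrative name, not part of the original code). One detail to watch when converting: j must be declared outside the inner for-header, otherwise it is block-scoped to the loop and the final write after the loop can't see it.

```javascript
// Standalone sketch of the for-loop variant; "sortFor" is an illustrative name.
const sortFor = (array) => {
    const a = array.slice();
    for (let i = 1; i < a.length; i++) {
        const temp = a[i];
        let j; // declared here, not in the for-header, so it survives the loop
        for (j = i - 1; j >= 0 && a[j] > temp; j--) {
            a[j + 1] = a[j];
        }
        a[j + 1] = temp;
    }
    return a;
};

console.log(sortFor([5, 2, 4, 6, 1, 3])); // [ 1, 2, 3, 4, 5, 6 ]
```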
- kuskmen (Nov 30, 2017 at 12:43): If array.length is 0, why return a new empty array? Maybe return the original array? I believe it is known to be empty.
- samanime (Nov 30, 2017 at 12:45): Since they explicitly state they want it to be a pure function, they should return a new one for consistency. If you pass in an array and sometimes get a new one and sometimes the existing one, you can't rely on it being pure. Returning a new array is the correct move with that requirement.
- kuskmen (Nov 30, 2017 at 12:47): As far as I know, being pure means it doesn't change state? Returning the same object without modifying it is pure enough, no?
- samanime (Nov 30, 2017 at 12:50): Not if you expect it to always return a new one. It's as if map() returned the original under certain circumstances: you then change the mapping function and are surprised when your original changes too. While the strict technical definition may allow it, from a practical perspective, if a function generally returns a new object, it should always return one.
- samanime (Nov 30, 2017 at 14:51): Since an array is always passed by reference, if the function returns the original array, you now have two references to the same array instead of two separate arrays. If you change one, you change both. It's technically a pure function, but from an "actually using the function" perspective, it's not truly pure.
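The aliasing the comments describe can be demonstrated directly. The helper names below are hypothetical, introduced only to contrast the two empty-array policies under discussion:

```javascript
// Hypothetical helpers contrasting the two empty-array policies.
const returnOriginal = (array) => array;     // "return the original" policy
const returnCopy = (array) => array.slice(); // "always return a new array" policy

const input = [];
const aliased = returnOriginal(input);
aliased.push(99);
console.log(input.length); // 1 — mutating the result also mutated the caller's array

const input2 = [];
const copied = returnCopy(input2);
copied.push(99);
console.log(input2.length); // 0 — the caller's array is untouched
```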
Maybe splicing can help?
If Array.splice were O(1), the following would be an improvement on average. If splice is O(n), there is no benefit, and this method does not improve the worst case either.
You can get a little extra out of the sort by using the previously moved value as a known low value to start the search from: if the next value from the array is lower than or equal to the previous value, you can start from that position rather than from i - 1. But because you no longer step over each value, you need a way to move the intervening values; for this you can use splice.
The following is an example implementation.
function sort(array) {
    const copyArr = [...array]; // copy, as required by the OP
    let i = 1;
    let lastVal = copyArr[0]; // the last moved value
    let lastPos = 0;          // the position of the last moved value
    while (i < copyArr.length) {
        const temp = copyArr[i];
        let j;
        if (lastVal >= temp) { // is the new value the same or lower?
            j = lastPos;       // start the search from that position
        } else {
            j = i - 1;         // if not, start as normal
        }
        while (j >= 0 && copyArr[j] > temp) { // find the position for temp
            j -= 1;
        }
        if (j < i - 1) { // did we find a lower position?
            // yes, so move temp from i to the new location via splicing
            copyArr.splice(j + 1, 0, copyArr.splice(i, 1)[0]);
            lastVal = temp;    // remember the value and its location
            lastPos = j + 1;
        }
        i += 1;
    }
    return copyArr;
}
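The two-splice move used in the implementation can be seen in isolation. Note that in typical engines each splice is itself O(n), since trailing elements are shifted:

```javascript
// Move the element at index 4 down to index 1 using two splices.
const arr = [1, 3, 5, 7, 2];
const [moved] = arr.splice(4, 1); // remove the 2 → arr is [1, 3, 5, 7]
arr.splice(1, 0, moved);          // insert the 2 at index 1
console.log(arr); // [ 1, 2, 3, 5, 7 ]
```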
For an even distribution of random values, the savings show up as fewer iterations of the inner loop, averaged over many arrays. For a 10-item array there are 50%+ fewer inner iterations; for 100 items it's 37% fewer; for 1000 it's about 33%.
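Those percentages can be reproduced approximately (the exact numbers depend on the random input) by instrumenting the inner loop of the plain version with a counter. A sketch of the baseline side of that comparison:

```javascript
// Count inner-loop iterations of the plain insertion sort, as a baseline
// for comparing against the splice-based variant.
function countInnerIterations(array) {
    const a = array.slice();
    let count = 0;
    for (let i = 1; i < a.length; i++) {
        const temp = a[i];
        let j = i - 1;
        while (j >= 0 && a[j] > temp) {
            a[j + 1] = a[j];
            j -= 1;
            count += 1;
        }
        a[j + 1] = temp;
    }
    return count;
}

console.log(countInnerIterations([1, 2, 3, 4, 5])); // 0  (already sorted)
console.log(countInnerIterations([5, 4, 3, 2, 1])); // 10 (worst case: n*(n-1)/2)
```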
- Flame_Phoenix (Nov 30, 2017 at 17:33): Interesting. So this algorithm will only prove better than the one I currently have if splice is O(1). This is worth investigating!
- (truncated comment) ... array.slice. BTW, if you copy an array, use the spread operator: cloneArray = [...array];. For JavaScript, and because you are making a copy, you can use cloneArray = new Float64Array(array.length);, which will improve indexing time.
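A caveat on that last suggestion: Float64Array only holds numbers, so it is a fit only when the input is known to be numeric. A hedged sketch under that assumption:

```javascript
// Sketch assuming a numeric-only input array; Float64Array cannot hold
// strings or objects, so this is not a general-purpose replacement.
const source = [3.5, 1.25, 2];
const typed = Float64Array.from(source); // copy into a typed array
typed.sort(); // TypedArray#sort compares numerically by default (no string coercion)
console.log(Array.from(typed)); // [ 1.25, 2, 3.5 ]
```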