Insertion Sort Algorithm in Python

Question 1

Implementing basic sorting algorithms to learn them, and coding, better. Criticisms/ critiques welcome. Also possible optimizations.

import unittest
import random
def insertion_sort(seq):
 """Accepts a mutable sequence, utilizes insertion sort algorithm to sort in
 place. Returns an ordered list"""
 #marks point between ordered sublist & unordered list
 partition = 1
 while partition < len(seq):
 temp = partition
 #while temp not in correct pos in sublist, decrement pos
 while temp != 0 and seq[temp] < seq[temp-1]:
 seq[temp], seq[temp-1] = seq[temp-1], seq[temp]
 temp -= 1
 partition += 1
 return seq
class test_insertionsort(unittest.TestCase):
 def test_insertionsort(self):
 """Test insertion_sort()"""
 seq = [random.randrange(0, 1000) for _ in range(1000)]
 self.assertEqual(insertion_sort(seq), sorted(seq))
if __name__ == '__main__':
 unittest.main()

Question 2

I don't know much python so this maybe don't apply, but usually instead of inserting things to the array, the correct approach is to build a new one from scratch.

Question 3

@ajax: Insertion sort is usually described as an "in-place sort" so there's nothing wrong with the OP's approach.

Question 4

But I do think that using seq.pop and seq.insert is not really in the spirit of the exercise. These methods hide significant details of what the algorithm is doing. To really understand what is going on, you should try to implement it using only list assignment.

Question 5

@GarethRees I've updated the code to implement your critique. Also, thank you for all the feedback you've been providing, I've been learning quite a bit from it.

Question 6

One thing that can speed up insertion sort is to pull the partition value to the side and slide elements up into the vacancy until the correct location is found, then drop the partition value into that location. This eliminates one of the three assignments in the while loop:

for p in range(1, len(seq)):
 if seq[p] >= seq[p-1]:
 continue
 p_value = seq[p]
 while p != 0 and p_value < seq[p-1]:
 seq[p] = seq[p-1]
 p -= 1
 seq[p] = p_value
return seq

The unit test shows a substantial speed improvement from doing this.

Question 7

Answers from Gareth Rees to your other questions about selection sort and bubble sort still apply here but it is clear that you tried to take them into account already.

You can loop with for partition in range(1, len(seq)):. Also, it removes the need for an additional variable as it doesn't really matter anymore if you lose the current value of partition : it will have the right value at the beginning of the next iteration anyway.

Here's what you could do :

for p in range(1, len(seq)):
 while p != 0 and seq[p] < seq[p-1]:
 seq[p], seq[p-1] = seq[p-1], seq[p]
 p -= 1
return seq

Question 8

Insertion sorts get a bad rap, mostly because people use them the wrong way. Sure the bench mark for all sorting algorithms is to sort an array in place, but in the real world your data doesn't always come to you in a complete package, sometimes it arrives asynchronously.

Instead of force-fitting an insertion sort to existing data, let's redefine the problem to find the best context for an insertion sort:

import unittest
import random
class SortedList(list):
 def insert_item(self, new_item):
 insertion_point = len(self)
 for i, item in enumerate(self):
 if new_item < item:
 insertion_point = i
 break
 self.insert(insertion_point, new_item)
class test_insertionsort(unittest.TestCase):
 def test_insertionsort(self):
 """Test insertion_sort()"""
 sl = SortedList()
 seq = []
 for i in range(10000):
 item = random.randrange(0, 10000)
 sl.insert_item(item)
 seq.append(item)
 self.assertEqual(sl, sorted(seq))
if __name__ == '__main__':
 unittest.main()

Here, instead of generating all the data before the sort, it is fed into the sort as it appears.

(And note that a ~2x improvement is possible to optimize the insertion by starting in the middle of the list instead of always from the beginning).

Question 9

Hi, welcome to Code Review! Here, an answer should be about the code in the question. It's ok to present an alternative implementation, but then you should explain what issues it fixes, or in what way it's more optimal than the code in the question.

pjs pjs 2011 silver badge5 bronze badges · Answer 1 · 2014-02-04 20:42:30Z

One thing that can speed up insertion sort is to pull the partition value to the side and slide elements up into the vacancy until the correct location is found, then drop the partition value into that location. This eliminates one of the three assignments in the while loop:

for p in range(1, len(seq)):
 if seq[p] >= seq[p-1]:
 continue
 p_value = seq[p]
 while p != 0 and p_value < seq[p-1]:
 seq[p] = seq[p-1]
 p -= 1
 seq[p] = p_value
return seq

The unit test shows a substantial speed improvement from doing this.

SylvainD SylvainD 29.8k1 gold badge49 silver badges93 bronze badges · Answer 2 · 2014-02-04 17:43:15Z

Answers from Gareth Rees to your other questions about selection sort and bubble sort still apply here but it is clear that you tried to take them into account already.

You can loop with for partition in range(1, len(seq)):. Also, it removes the need for an additional variable as it doesn't really matter anymore if you lose the current value of partition : it will have the right value at the beginning of the next iteration anyway.

Here's what you could do :

for p in range(1, len(seq)):
 while p != 0 and seq[p] < seq[p-1]:
 seq[p], seq[p-1] = seq[p-1], seq[p]
 p -= 1
return seq

duanev duanev 1112 bronze badges · Answer 3 · 2015-10-29 07:30:48Z

Insertion sorts get a bad rap, mostly because people use them the wrong way. Sure the bench mark for all sorting algorithms is to sort an array in place, but in the real world your data doesn't always come to you in a complete package, sometimes it arrives asynchronously.

Instead of force-fitting an insertion sort to existing data, let's redefine the problem to find the best context for an insertion sort:

import unittest
import random
class SortedList(list):
 def insert_item(self, new_item):
 insertion_point = len(self)
 for i, item in enumerate(self):
 if new_item < item:
 insertion_point = i
 break
 self.insert(insertion_point, new_item)
class test_insertionsort(unittest.TestCase):
 def test_insertionsort(self):
 """Test insertion_sort()"""
 sl = SortedList()
 seq = []
 for i in range(10000):
 item = random.randrange(0, 10000)
 sl.insert_item(item)
 seq.append(item)
 self.assertEqual(sl, sorted(seq))
if __name__ == '__main__':
 unittest.main()

Here, instead of generating all the data before the sort, it is fed into the sort as it appears.

(And note that a ~2x improvement is possible to optimize the insertion by starting in the middle of the list instead of always from the beginning).

Hi, welcome to Code Review! Here, an answer should be about the code in the question. It's ok to present an alternative implementation, but then you should explain what issues it fixes, or in what way it's more optimal than the code in the question.

Stack Exchange Network

Insertion Sort Algorithm in Python

3 Answers 3

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Linked

Hot Network Questions

Insertion Sort Algorithm in Python

3 Answers 3

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Linked

Related

Hot Network Questions