1
\$\begingroup\$

I'm a new programmer and I'm periodically going into the Intro To Algorithms CLRS textbook and trying to translate pseudocode into Ruby for skill practice. This is my implementation/translation of insertion sort.

I'm looking for feedback on readability as well as efficiency.

Pseudocode is found on page 17 of Intro To Algorithms Second Edition just in case you happen to have it lying around, more interested in the feedback just on the code itself.

class InsertionSort
 attr_accessor :arr
 def initialize(arr)
 @arr = arr
 end
 
 def sort
 key = 1
 to_the_left = key - 1
 
 while key < self.arr.length
 if arr[key] < arr[to_the_left] 
 
 mover_key = key
 left_ele_moving_right = to_the_left
 
 until arr[mover_key] > arr[left_ele_moving_right] || left_ele_moving_right < 0
 
 arr[left_ele_moving_right], arr[mover_key] = 
 arr[mover_key], arr[left_ele_moving_right]
 
 mover_key -= 1 
 left_ele_moving_right -= 1
 end
 end
 
 key += 1
 to_the_left += 1
 end
 
 self.arr
 end
end
```
asked Jan 5, 2021 at 21:18
\$\endgroup\$

2 Answers 2

3
\$\begingroup\$

Consistency

Sometimes you are using 4 spaces for indentation, sometimes 2. Sometimes you are using vertical whitespace (an empty line) before a method definition, sometimes you don't. Sometimes you have whitespace at the end of a line, sometimes you don't. In one case, you indent a couple of lines just for no discernible reason.

You should choose one style and stick with it. If you are editing some existing code, you should adapt your style to be the same as the existing code. If you are part of a team, you should adapt your style to match the rest of the team.

Most communities have developed standardized community style guides. In Ruby, there are multiple such style guides. They all agree on the basics (e.g. indentation is 2 spaces), but they might disagree on more specific points (single quotes or double quotes).

In general, if you use two different ways to write the exact same thing, the reader will think that you want to convey a message with that. So, you should only use two different ways of writing the same thing IFF you actually want to convey some extra information.

For example, some people always use parentheses for defining and calling purely functional side-effect free methods, and never use parentheses for defining and calling impure methods. That is a good reason to use two different styles (parentheses and no parentheses) for doing the same thing (defining methods).

Indentation

The community standard for indentation is 2 spaces. For example, your first two lines should look like this:

class InsertionSort
 attr_accessor :arr

Instead of this:

class InsertionSort
 attr_accessor :arr

Furthermore, you should only indent in places where there is a logically nested structure, e.g. a method definition inside a class definition, the body of the method definition, the body of a block, etc. An exception is a continuation of the previous line, this should also be indented.

Here, you are indenting for no apparent reason:

arr[left_ele_moving_right], arr[mover_key] =
arr[mover_key], arr[left_ele_moving_right]
 mover_key -= 1 
 left_ele_moving_right -= 1

On the other hand, in the same place, you should indent this:

arr[left_ele_moving_right], arr[mover_key] =
 arr[mover_key], arr[left_ele_moving_right]

To make it clear that the second line is a continuation of the first.

No whitespace at the end of the line

There should be no whitespace at the end of the line. For example, this line:

if arr[key] < arr[to_the_left] 

Has a space at the end. That shouldn't be there. And there are two more lines like this.

Also, there are multiple "empty" lines that actually contain only spaces. Again, those shouldn't be there.

No empty line at the end of the file

Your file should end with a single newline. There should not be an extra empty line at the end of the file.

Vertical whitespace

There should be a blank line after every "logical" break. In particular, there should be a blank line after the attr_accessor:

attr_accessor :arr
def initialize(arr)

On the other hand, there should be no empty line directly after the if and the until.

Otherwise, your use of vertical whitespace looks good! Logical steps in the method are clearly separated by empty lines, the code looks "light", not all bunched together with "room to breathe". I like it, well done!

self is the implicit receiver

In Ruby, if you don't explicitly specify the receiver of a message send, the receiver is self. In idiomatic code, you should never explicitly specify the receiver self, unless it is absolutely necessary.

Right now, I can only think of two cases, where it would be necessary:

  • To use an attribute writer method: self.foo = bar.
  • To use an operator: self + foo

So, this:

self.arr

should just be:

arr

Same here:

while key < self.arr.length

should be

while key < arr.length

Prefer predicate methods over relational operators

You should prefer "speaking" predicate methods such as Numeric#negative? over relational operators such as < 0.

So, this:

left_ele_moving_right < 0

should be

left_ele_moving_right.negative?

Code Formatting

If possible, you should set your editor or IDE to automatically format your code when you type, when you paste, and when you save, and set up your version control system to automatically format your commit when you push, as well as set up your CI system to reject code that is not correctly formatted. If not possible, you should seriously consider using a different editor or IDE, version control system, or CI system.

Here's the result of your code, when I simply paste it into my editor, without doing anything else:

class InsertionSort
 attr_accessor :arr
 def initialize(arr)
 @arr = arr
 end
 def sort
 key = 1
 to_the_left = key - 1
 while key < arr.length
 if arr[key] < arr[to_the_left]
 mover_key = key
 left_ele_moving_right = to_the_left
 until arr[mover_key] > arr[left_ele_moving_right] || left_ele_moving_right < 0
 arr[left_ele_moving_right], arr[mover_key] =
 arr[mover_key], arr[left_ele_moving_right]
 mover_key -= 1
 left_ele_moving_right -= 1
 end
 end
 key += 1
 to_the_left += 1
 end
 arr
 end
end

As you can see, simply copying your code into my editor, the editor corrected every single thing I wrote above except one.

Linting

You should run some sort of linter or static analyzer on your code. Rubocop is a popular one, but there are others.

Rubocop was able to detect all of the style violations I pointed out above (plus some more), and also was able to autocorrect all of the ones I listed.

Let me repeat that: I have just spent two pages pointing out how to correct tons of stuff that you can actually correct within milliseconds at the push of a button. I have set up my editor such that it automatically runs Rubocop with auto-fix as soon as I hit "save".

In particular, running Rubocop on your code, it detects 26 offenses, of which it can automatically correct 23.

Here's what the result of the auto-fix looks like:

class InsertionSort
 attr_accessor :arr
 def initialize(arr)
 @arr = arr
 end
 def sort
 key = 1
 to_the_left = key - 1
 while key < arr.length
 if arr[key] < arr[to_the_left]
 mover_key = key
 left_ele_moving_right = to_the_left
 until arr[mover_key] > arr[left_ele_moving_right] || left_ele_moving_right.negative?
 arr[left_ele_moving_right], arr[mover_key] =
 arr[mover_key], arr[left_ele_moving_right]
 mover_key -= 1
 left_ele_moving_right -= 1
 end
 end
 key += 1
 to_the_left += 1
 end
 arr
 end
end

And here are the offenses that Rubocop could not automatically correct:

Inspecting 1 file
C
Offenses:
insertion_sort.rb:1:1: C: Style/Documentation: Missing top-level class documentation comment.
class InsertionSort
^^^^^
insertion_sort.rb:8:3: C: Metrics/AbcSize: Assignment Branch Condition size for sort is too high. [<10, 21, 7> 24.29/17]
 def sort ...
 ^^^^^^^^
insertion_sort.rb:8:3: C: Metrics/MethodLength: Method has too many lines. [17/10]
 def sort ...
 ^^^^^^^^
1 file inspected, 3 offenses detected

Similar to Code Formatting, it is a good idea to set up your tools such that the linter is automatically run when you paste code, edit code, save code, commit code, or build your project, and that passing the linter is a criterium for your CI pipeline.

attr_reader vs. attr_accessor

You are never writing to arr, so it should probably be a Module#attr_reader instead of a Module#attr_accessor:

attr_reader :arr

Access Restrictions

arr is not intended to be used by other objects. In fact, it shouldn't be used by other objects! It is private internal state of the sorter object. Therefore, it should not be part of the public API, it should be private:

private attr_reader :arr

Iterators

In Ruby, you almost never use loops. You would normally use at least an low-level iterator such as Kernel#loop, #each, or Integer#times. Really, you want to use higher-level iterators such as Enumerable#map, Enumerable#select, Enumerable#group_by, Enumerable#flat_map, Enumerable#inject, etc.

In this case, we are going to use Integer#times for the outer loop.

def sort
 arr.length.times do |key|
 to_the_left = key - 1
 if arr[key] < arr[to_the_left]
 mover_key = key
 left_ele_moving_right = to_the_left
 until arr[mover_key] > arr[left_ele_moving_right] || left_ele_moving_right.negative?
 arr[left_ele_moving_right], arr[mover_key] =
 arr[mover_key], arr[left_ele_moving_right]
 mover_key -= 1
 left_ele_moving_right -= 1
 end
 end
 end
 arr
end

Guard clauses

If you have a case where an entire method or block is wrapped in a conditional, you can replace that with a "guard clause" and reduce the level of nesting.

E.g. this:

def something
 if foo
 bar
 baz
 quux
 else
 42
 end
end

can become this:

def something
 return 42 unless foo
 bar
 baz
 quux
end

After the above change, there is now an opportunity to do that in your code:

def sort
 arr.length.times do |key|
 to_the_left = key - 1
 next unless arr[key] < arr[to_the_left]
 mover_key = key
 left_ele_moving_right = to_the_left
 until arr[mover_key] > arr[left_ele_moving_right] || left_ele_moving_right.negative?
 arr[left_ele_moving_right], arr[mover_key] =
 arr[mover_key], arr[left_ele_moving_right]
 mover_key -= 1
 left_ele_moving_right -= 1
 end
 end
 arr
end

Objects

I don't see a particular reason why insertion sort should be an object. It doesn't have any state (having the array to be sorted, as you do, as state is somewhat weird), and it only has a simple single behavior. It would make more sense for a sorting algorithm to be an object if it carries some complex state that persists between different runs of the algorithm.

In a language which supports procedures or functions, this should probably be a function.

In Ruby, I would make it maybe a singleton method of a module, or an instance method of Array. However, I would not name it sort because that method already exists. I would also not name it insertion_sort because that is an implementation detail. I think stable_sort would be a good name, because the fact that insertion sort is stable (if implemented correctly) is an important distinguishing feature that sets it apart from the default sort in Ruby, which is not stable.

So, either something like this:

module Sort
 module_function def stable_sort(arr)
 # ...
 end
end

or something like this:

class Array
 def stable_sort
 length.times do |key|
 # ...
 end
 end
end

Actually, I am not a big fan of monkey-patching core classes. At least put the method in a separate mixin, so that it is easier to trace where the code is coming from:

module StableSortExtension
 def stable_sort
 length.times do |key|
 # ...
 end
 end
end
class Array
 include StableSortExtension
end

But even better would be to make it a Refinement:

module StableSort
 module StableSortExtension
 def stable_sort
 length.times do |key|
 to_the_left = key - 1
 next unless self[key] < self[to_the_left]
 mover_key = key
 left_ele_moving_right = to_the_left
 while self[mover_key] <= self[left_ele_moving_right] && left_ele_moving_right >= 0
 self[left_ele_moving_right], self[mover_key] =
 self[mover_key], self[left_ele_moving_right]
 mover_key -= 1
 left_ele_moving_right -= 1
 end
 end
 end
 end
 refine Array do
 include StableSortExtension
 end
end

Now, the Array#stable_sort method will only exist for code which explicitly activates the Refinement with

using StableSort

Observe:

ary = [5, 4, 6, 3, 7, 2, 8, 1, 9, 0, 10]
ary.stable_sort
# undefined method `stable_sort' for [5, 4, 6, 3, 7, 2, 8, 1, 9, 0, 10]:Array (NoMethodError)
using StableSort
ary.stable_sort
p ary
# [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

However, I would not do it this way. This stable_sort method mutates its receiver, which is generally not the case for other sort methods in Ruby, and thus very surprising to people.

So, the method should at least dup the receiver and perform its mutations on that duplicate, then return it.

Which brings me to my next point:

Mutation

Your method mutates its argument. That is an absolute no-go. You should never-ever break someone else's toy. An argument that is handed to you, is sacred.

You can fix this by simply adding something like arr = arr.dup at the beginning of the method (or arr = dup for the case where you want to make it an instance method). But I believe we can improve the code more generally.

More generally, your code looks more like FORTRAN, Pascal, or C, with lots of mutation, looping, manual fiddling with indices.

Insertion sort can actually be expressed very elegantly without mutation and without looping, which will not only simplify the code, but also get rid of the mutation problem.

Something like this:

module Enumerable
 protected def ins(element)
 return [element] if count.zero?
 head = first
 return [element] + self if element <= head
 [head] + drop(1).ins(element)
 end
 def stable_sort
 return [] if count.zero?
 drop(1).stable_sort.ins(first)
 end
end

This version does not mutate anything, and it works with all Enumerables (including Enumerators), not just Arrays. (It does not work with infinite Enumerators, though.)

Frozen string literals

Immutable data structures and purely functional code are always preferred, unless mutability and side-effects are required for clarity or performance. In Ruby, strings are always mutable, but there is a magic comment you can add to your files (also available as a command-line option for the Ruby engine), which will automatically make all literal strings immutable:

# frozen_string_literal: true

It is generally preferred to add this comment to all your files.

Note: I put this at the very end, because there actually are no String literals in your code, so it doesn't matter. Rubocop and some other Linters will complain about this regardless. It's your choice whether you want to get into the habit of always adding it, regardless of whether you use String literals in the code or not. Personally, I always add it, just in case I add a String literal later on.

answered Jan 8, 2021 at 17:48
\$\endgroup\$
2
  • \$\begingroup\$ What is the purpose of StableSortExtension? \$\endgroup\$ Commented Jan 10, 2021 at 6:12
  • \$\begingroup\$ As I wrote in my answer: "put the method in a separate mixin, so that it is easier to trace where the code is coming from". Now, if you do something like Array.ancestors, StableSortExtension will show up in the inheritance tree, and if you are wondering where this stable_sort method that you never heard about and never noticed before comes from, you can probably guess it came from there. \$\endgroup\$ Commented Jan 10, 2021 at 8:09
2
\$\begingroup\$

Insertion sort is undoubtedly the easiest-to-understand and easiest-to-code sorting method, requiring but one line of Ruby code. It seems overkill in the extreme to define a class with instance variables and multiple instance methods. I have written a single method insertion_sort.

If desired it could be written as a module method in a module that could be required as needed. Assuming you wished to compare sorting methods (considering that the more efficient built-in methods Array#sort, Enumerable#sort and Enumerable#sort_by would be the methods of choice for general sorting needs) you might create a module such as the following.

module SortingMethods
 def self.insertion_sort(arr)
 ...
 end
 def self.bubble_sort(arr)
 ...
 end
 
 ...
end

The method would then be invoked

sorted = SortingMethods.insertion_sort(arr)

This is analogous to the use of the built-in Math module, which contains module methods only, methods that would serve as functions (which are not invoked on receivers) in other languages.

The method insertion_sort can be written is as follows.

def insertion_sort(arr)
 arr.each_with_object([]) do |o, sorted|
 sorted.insert(sorted.bsearch_index { |e| e >= o } || sorted.size, o)
 end
end
insertion_sort [8, 29, 10, 20, 26, 5, 20, 2]
 #=> [2, 5, 8, 10, 20, 20, 26, 29]
insertion_sort ["dog", "cat", "pig", "owl", "cow", "bat"]
 #=> ["bat", "cat", "cow", "dog", "owl", "pig"]

This relies on two methods.

Array#insert inserts a given object into self before the element at a given index. If that index equals the size of (the array that is) self, the element is inserted after the last element of self.

Array#bsearch_index returns the index of the element of sorted before which a given object is to be inserted in order to keep sorted sorted. nil is returned if the given object is greater than the last element of sorted (hence the need for || sorted.size). See also Array#bsearch for details.

Note that the argument is not mutated (modified).

The easiest way to demonstrate the calculations is to run the code after having salted it with puts statements. I will write the code in a simpler way that effectively performs the same operations.

arr = [8, 29, 10, 20, 26, 5, 20, 2]
sorted = []
arr.each do |o|
 puts "\no = #{o}, sorted = #{sorted}"
 idx = sorted.bsearch_index { |e| e >= o }
 puts "idx from bsearch initially = #{ idx.nil? ? 'nil' : idx }"
 idx = idx || sorted.size
 puts "idx after 'idx || sorted' = #{idx}"
 sorted.insert(idx, o)
 puts "sorted after sorted.insert(#{idx}, #{o}) = #{sorted}"
end
o = 8, sorted = []
idx from bsearch initially = nil
idx after 'idx || sorted' = 0
sorted after sorted.insert(0, 8) = [8]
o = 29, sorted = [8]
idx from bsearch initially = nil
idx after 'idx || sorted' = 1
sorted after sorted.insert(1, 29) = [8, 29]
o = 10, sorted = [8, 29]
idx from bsearch initially = 1
idx after 'idx || sorted' = 1
sorted after sorted.insert(1, 10) = [8, 10, 29]
o = 20, sorted = [8, 10, 29]
idx from bsearch initially = 2
idx after 'idx || sorted' = 2
sorted after sorted.insert(2, 20) = [8, 10, 20, 29]
o = 26, sorted = [8, 10, 20, 29]
idx from bsearch initially = 3
idx after 'idx || sorted' = 3
sorted after sorted.insert(3, 26) = [8, 10, 20, 26, 29]
o = 5, sorted = [8, 10, 20, 26, 29]
idx from bsearch initially = 0
idx after 'idx || sorted' = 0
sorted after sorted.insert(0, 5) = [5, 8, 10, 20, 26, 29]
o = 20, sorted = [5, 8, 10, 20, 26, 29]
idx from bsearch initially = 3
idx after 'idx || sorted' = 3
sorted after sorted.insert(3, 20) = [5, 8, 10, 20, 20, 26, 29]
o = 2, sorted = [5, 8, 10, 20, 20, 26, 29]
idx from bsearch initially = 0
idx after 'idx || sorted' = 0
sorted after sorted.insert(0, 2) = [2, 5, 8, 10, 20, 20, 26, 29]
answered Feb 2, 2021 at 1:01
\$\endgroup\$
1
  • \$\begingroup\$ It was in reference to a different answer to this review. I've deleted it to avoid any further confusion. \$\endgroup\$ Commented Nov 12, 2021 at 21:16

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.