Custom set data structure in Ruby

Question 1

I wrote a set data structure with some useful functions. I'm not sure if it is a good implementation and if there are other essential functions that a set should support.

class CustomSet
 def initialize(list=[])
 @cs =[]
 list.each do |n|
 put(n)
 end
 end
 def size
 @cs.size
 end 
 def each(&blk)
 @cs.each do |n|
 blk.call(n)
 end
 end 
 def index(n)
 @cs.index { |x| x.eql?(n) }
 end 
 def ==(other_object)
 return false unless other_object.class == self.class
 return false unless other_object.size ==self.size
 other_object.each do |n|
 return false unless index(n) 
 end 
 true
 end
 def eql?(other)
 self == other
 end
 def hash
 @cs.reduce(:+){ |n| n.hash }
 end
 def delete(n)
 @cs.delete(n) if index(n)
 self
 end 
 def difference(other)
 CustomSet.new(@cs.select { |n| other.index(n).nil? })
 end 
 def disjoint?(other)
 union(other).size == self.size + other.size 
 end
 def intersection(other)
 CustomSet.new(@cs.select { |n| other.index(n) })
 end 
 def subset?(other)
 self.size >= other.size && intersection(other) == other
 end 
 def union(other)
 u = CustomSet.new(@cs)
 other.each { |n| u.put(n) }
 u 
 end 
 def member?(n)
 index(n)
 end 
 def put(n)
 @cs << n unless index(n)
 self
 end 
 def empty
 @cs = []
 self
 end 
 def to_list
 @cs
 end 
end

Set treats integers and decimals as distinct elements:

irb(main):019:0> Set.new([1,2,3]) == Set.new([1,2.0,3])
=> false
irb(main):020:0> Set.new([1,2,3]) == Set.new([1,2,3])
=> true
irb(main):021:0> 2 == 2.0
=> true
irb(main):022:0> [2] == [2.0]
=> true
irb(main):023:0> Set.new([1,2,3]) == Set.new([1,3,2])
=> true

Set accepts a range, so the initialize has to copy elements manually unless there is another way.

irb(main):029:0> Set.new(1..3) == Set.new([1,3,2])
=> true

Question 2

Does this do anything that the standard Ruby Set class doesn't do?

Question 3

Oh, I thought it wasn't available, but besides that I am looking for feedback on general implementation of Set. Also wondering if I did the ==, eql? and hash implementation correctly.

Question 4

As mentioned in the comments, Ruby already has a very nice Set class in its standard library. But, as also mentioned, there's still code to review here.

So, review-wise, a few things that caught my eye:

@cs. Sorry, but I'm a stickler for descriptive names. I'm guessing this is supposed to be short for "CustomSet", but that's the name of the class, and the variable is just an array. Something like @members might make more sense.
I'd also add a private accessor method for the variable. Call it members, and replace (almost) all instances of @cs with members. By accessing it though a method, you decouple your code better.
You constructor is unnecessarily complex. Just duplicate the input array:
```
 def initialize(list = [])
 @cs = list.dup
 end
```
#to_list. "List" isn't a thing in Ruby. Array is a thing though, and the conventional name for this sort of method is to_a. However, you don't want to return your instance variable directly! If you do, it can then be modified externally. It also allows external code to keep references, and all manner of other stuff. So return a dup instead.
#empty is self-modifying. It should either be empty! to indicate its effect, or it should be called clear (to match the so-named method on Array). The method should probably also just call clear on the internal @cs array instead of replacing it.
#member? should perhaps be aliased (or renamed) as #include? (again to match Array)
#each can be shortened to just
```
def each(&block)
 @cs.each(&block)
end
```
#put could be written more idiomatically using include?:
```
def put(item)
 @cs << item unless @cs.include?(item)
end
```
Or you can push the new item immediately, and just call uniq! afterward. Or you can use a hash (i.e. instance of Hash - not to be confused with the method discussed later), and use its built-in duplicate handling, e.g.:
```
def put(item)
 @cs[item.hash] = item # assumes @cs is a hash
end
```
Since Ruby 1.9, hashes are ordered too, so in effect it'd be little different from using an array internally.

#difference, #intersection, and #union can be handled via the internal arrays, e.g.:

def difference(other)
 self.class.new(@cs - other.to_a)
end
def intersection(other)
 self.class.new(@cs & other.to_a)
end
def union(other)
 self.class.new(@cs | other.to_a)
end

And as you can see from the above, those methods should probably be aliased (or renamed) as -, &, and | respectively.
@cs.delete(n) if index(n) - there's no need for the if guard. It just requires an extra lookup to do, and slows things down. The call to delete will work fine on its own.
#== and #eql?. Here you may again use the fact that both self and other have array representations that already support all this:
```
def ==(other)
 self.class == other.class && @cs == other.to_a
end
```
I'd include a dup method for cloning a set. E.g.:
```
def dup
 self.class.new(@cs)
end
```
I'd include the Enumerable module to get a bunch of methods "for free". All that's required on your class is that it has an #each method - which is does.

Now, about your #hash. Your method is simply flawed, because it doesn't use reduce correctly. Your method is currently equivalent to (:+).hash (or, if the set's empty: :+). You're simply using reduce wrong.

reduce takes an initial value and a block. The block, in turn, receives two arguments: The value carried over from the previous iteration (or the initial value, first time 'round), and the current element in the array. For instance:

sum = [1, 2, 3].reduce(0) do |sum_so_far, item|
 puts "Sum so far: #{sum_so_far}"
 puts "Current value: #{item}"
 sum_so_far += item # this is returned and becomes sum_so_far in the next iteration
end
puts "Final sum: #{sum}"

which'll print

Sum so far: 0
Current value: 1
Sum so far: 1
Current value: 2
Sum so far: 3
Current value: 3
Final sum: 6

As you can hopefully tell, your use of reduce is very different.

What you seem to be trying to do is either

@cs.reduce(0) { |sum, member| sum += member.hash }

or

@cs.map(&:hash).reduce(0, &:+)

which are functionally identical ways of doing the same thing: Sum the hash values of each object in the set.

However, this approach seems flawed to me, since you'd risk conflicts. A set with just one member might have the hash 7 (the member's hash), and a set with two members might also have a hash of 7 (because the members' hashes are 3 and 4). That's no good.

Honestly, I'd consider simply not implementing #hash unless you have very good reason to do so. You have ==/eql? to check for equality, so why bother with hash? Easier to let the default implementation handle that.

Question 5

assert_equal CustomSet.new([1, 3]), CustomSet.new([3, 1]) will not work with the suggested equal method, by using to_a, members have to be tested individually.

Question 6

Also since Ruby does type conversion, the comparison has to support that. stackoverflow.com/questions/25986896/…

Question 7

@gitarchie Ah, right, good point

Flambino Flambino 33.3k2 gold badges46 silver badges90 bronze badges · Accepted Answer · 2014-09-24 23:54:30Z

As mentioned in the comments, Ruby already has a very nice Set class in its standard library. But, as also mentioned, there's still code to review here.

So, review-wise, a few things that caught my eye:

@cs. Sorry, but I'm a stickler for descriptive names. I'm guessing this is supposed to be short for "CustomSet", but that's the name of the class, and the variable is just an array. Something like @members might make more sense.
I'd also add a private accessor method for the variable. Call it members, and replace (almost) all instances of @cs with members. By accessing it though a method, you decouple your code better.
You constructor is unnecessarily complex. Just duplicate the input array:
```
 def initialize(list = [])
 @cs = list.dup
 end
```
#to_list. "List" isn't a thing in Ruby. Array is a thing though, and the conventional name for this sort of method is to_a. However, you don't want to return your instance variable directly! If you do, it can then be modified externally. It also allows external code to keep references, and all manner of other stuff. So return a dup instead.
#empty is self-modifying. It should either be empty! to indicate its effect, or it should be called clear (to match the so-named method on Array). The method should probably also just call clear on the internal @cs array instead of replacing it.
#member? should perhaps be aliased (or renamed) as #include? (again to match Array)
#each can be shortened to just
```
def each(&block)
 @cs.each(&block)
end
```
#put could be written more idiomatically using include?:
```
def put(item)
 @cs << item unless @cs.include?(item)
end
```
Or you can push the new item immediately, and just call uniq! afterward. Or you can use a hash (i.e. instance of Hash - not to be confused with the method discussed later), and use its built-in duplicate handling, e.g.:
```
def put(item)
 @cs[item.hash] = item # assumes @cs is a hash
end
```
Since Ruby 1.9, hashes are ordered too, so in effect it'd be little different from using an array internally.

#difference, #intersection, and #union can be handled via the internal arrays, e.g.:

def difference(other)
 self.class.new(@cs - other.to_a)
end
def intersection(other)
 self.class.new(@cs & other.to_a)
end
def union(other)
 self.class.new(@cs | other.to_a)
end

And as you can see from the above, those methods should probably be aliased (or renamed) as -, &, and | respectively.
@cs.delete(n) if index(n) - there's no need for the if guard. It just requires an extra lookup to do, and slows things down. The call to delete will work fine on its own.
#== and #eql?. Here you may again use the fact that both self and other have array representations that already support all this:
```
def ==(other)
 self.class == other.class && @cs == other.to_a
end
```
I'd include a dup method for cloning a set. E.g.:
```
def dup
 self.class.new(@cs)
end
```
I'd include the Enumerable module to get a bunch of methods "for free". All that's required on your class is that it has an #each method - which is does.

Now, about your #hash. Your method is simply flawed, because it doesn't use reduce correctly. Your method is currently equivalent to (:+).hash (or, if the set's empty: :+). You're simply using reduce wrong.

reduce takes an initial value and a block. The block, in turn, receives two arguments: The value carried over from the previous iteration (or the initial value, first time 'round), and the current element in the array. For instance:

sum = [1, 2, 3].reduce(0) do |sum_so_far, item|
 puts "Sum so far: #{sum_so_far}"
 puts "Current value: #{item}"
 sum_so_far += item # this is returned and becomes sum_so_far in the next iteration
end
puts "Final sum: #{sum}"

which'll print

Sum so far: 0
Current value: 1
Sum so far: 1
Current value: 2
Sum so far: 3
Current value: 3
Final sum: 6

As you can hopefully tell, your use of reduce is very different.

What you seem to be trying to do is either

@cs.reduce(0) { |sum, member| sum += member.hash }

or

@cs.map(&:hash).reduce(0, &:+)

which are functionally identical ways of doing the same thing: Sum the hash values of each object in the set.

However, this approach seems flawed to me, since you'd risk conflicts. A set with just one member might have the hash 7 (the member's hash), and a set with two members might also have a hash of 7 (because the members' hashes are 3 and 4). That's no good.

Honestly, I'd consider simply not implementing #hash unless you have very good reason to do so. You have ==/eql? to check for equality, so why bother with hash? Easier to let the default implementation handle that.

assert_equal CustomSet.new([1, 3]), CustomSet.new([3, 1]) will not work with the suggested equal method, by using to_a, members have to be tested individually.
Also since Ruby does type conversion, the comparison has to support that. stackoverflow.com/questions/25986896/…

Stack Exchange Network

Custom set data structure in Ruby

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

Custom set data structure in Ruby

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions