While working on the following leetcode problem: https://leetcode.com/problems/top-k-frequent-elements/ I used a Priority Queu implementation in Ruby. I have learned how to implement a Priority Queu in Java in Robert Sedgewick's Algorithms book. When inserting Hashes into the Priority Queue for example { 1 => 3}
this made me refactored the Priotity's Queue in such a way so I was able to compare two hashes values.
The implementation is much straight forward when your dealing with numbers and characters but it made me wonder what would be the best implementation with using objects or hashes.
def top_k_frequent(nums, k)
ans = []
h = Hash.new(0)
nums.each do |num|
h[num] += 1
end
heap = Heap.new
h.each do |k,v|
heap.insert({k => v})
end
k.times do
a = heap.del_max
ans.push(a.keys[0])
end
ans
end
class Heap
def initialize
@n = 0
@pq = []
end
def insert(v)
@pq[@n += 1] = v
swim(@n)
end
def swim(k)
while k > 1 && less((k / 2).floor, k)
swap((k / 2).floor, k)
k = k/2
end
end
def swap(i, j)
temp = @pq[i]
@pq[i] = @pq[j]
@pq[j] = temp
end
def less(i, j)
@pq[i].values[0] < @pq[j].values[0]
end
def del_max
max = @pq[1]
swap(1, @n)
@n -= 1
@pq[@n + 1] = nil
sink(1)
max
end
def sink(k)
while 2 * k <= @n
j = 2 * k
if !@pq[j + 1].nil?
j += 1 if j > 1 && @pq[j].values[0] < @pq[j + 1].values[0]
end
break if !less(k, j)
swap(k, j)
k = j
end
end
end
There are a few things that come to mind that I wish to probably learn. Keep in mind that I would like to prepare for algorithm interviews and so I would like to avoid as much synthetic sugar as possible.
@pq[j].values[0]
check for example how to get to a hash values I need to usevalues[0]
is there a better way for this?- There are 3 loops in the
top_k_frequent
method. How would yo go about optimizing this implementation? a.keys[0]
notice how I get the key of the element in Ruby is there a cleaner way of doing this?
This brought to my attention that in case of dealing with more complex objects the Priority Queue's API might be needed to refactored to be able to compare between objects, what would be the best way of dealing with that? Would creating another class which you pass to the Priority Queue would be a better solution?
Here is the implementation I have for Priority Queue's prior to editing to deal with hashes.
class Heap
attr_accessor :pq, :n
def initialize
@pq = [] # heap order complete binary tree
@n = 0 # in pq[1..N] with pq[0] unused
end
def empty?
@n == 0
end
def size
@n
end
def insert(v)
@pq[@n += 1] = v
swim(@n) # position in heap
end
# trickleup
def swim(k)
while k > 1 && less((k / 2).floor, k) # parent: k/2 parent less than child?
swap((k/2).floor, k)
k = k / 2
end
end
def del_max
max = @pq[1] # Retrieve max key from top.
swap(1, @n) # Exchange with last item.
@n -= 1
@pq[@n + 1] = nil # Avoid loitering. max added to end on swap make nil
sink(1) # Restore HEAP property
max
end
# Trickledown
def sink(k)
while 2 * k <= @n
j = 2 * k # child
if !@pq[j + 1].nil? # check if there is a right child
j += 1 if j > 1 && @pq[j] < @pq[j + 1] # if left child less than right child
end
break if @pq[k] > @pq[j] # If parent greater than child break
swap(k, j)
k = j
end
end
def less(i, j)
@pq[i] < @pq[j]
end
def swap(i, j)
temp = @pq[i]
@pq[i] = @pq[j]
@pq[j] = temp
end
end
1 Answer 1
The objects in the priority queue (as implemented above) have their priority determined by <
, >
, and ==
. So however equality is defined for the object determines its priority. For numbers and strings, this is fairly obvious but the default implementation of equality, less than and greater than for hashes may surprise you and not give the result you are expecting.
Perhaps you could consider passing in a block to your initialize method to allow you to provide your own comparison function in those cases where the default one doesn't match your needs. Like Enumerable::sort
does.
-
\$\begingroup\$ Hashes simply do not have a default implementation of
Comparable
at all, precisely because there is no sensible natural ordering. \$\endgroup\$Jörg W Mittag– Jörg W Mittag2020年09月06日 12:34:59 +00:00Commented Sep 6, 2020 at 12:34 -
\$\begingroup\$ Hash A is considered to be 'less than' hash B if A is a subset of B, and this appears to be true even if
a == {}
. See ruby-doc.org/core-2.7.0/Hash.html#method-i-3C \$\endgroup\$nullTerminator– nullTerminator2020年09月07日 13:13:33 +00:00Commented Sep 7, 2020 at 13:13 -
1\$\begingroup\$ A priority queue requires a total ordering, though, which this isn't. (Hence why hashes don't implement
Comparable
.) Sorry, I should have written "a natural ordering that is also total". Strings have at least two different natural orderings that are also total (lexicographic and length), numbers have an obvious natural total ordering. (Well, integers, reals, rationals, at least; complex numbers is a different matter.) For hashes, I don't see that. The OP is using the ordering of the value of the first key-value-pair which to me seems to be totally random. \$\endgroup\$Jörg W Mittag– Jörg W Mittag2020年09月07日 13:22:04 +00:00Commented Sep 7, 2020 at 13:22 -
\$\begingroup\$
priority determined by
<
,>
insink(k)
, butless(i, j)
inswim(k)
. \$\endgroup\$greybeard– greybeard2022年05月30日 04:10:12 +00:00Commented May 30, 2022 at 4:10
"A" < "B"
, or1 < 2
with comparing objects.obj_1
<obj_2
, in this case the API must be implemented in such a way that is able to compareobj_1
withobj_2
just in the case shown above. Notice how I need to use@pq[j].values[0]
in order to get to the value. ruby-doc.org/core-2.7.1/Hash.html \$\endgroup\${ [] => {}, 'two' => 2.0, 3 => :three, Time.now => Object.new }
? Normally, a priority queue stores items with associated priority. In your implementation, you store items without an associated priority, instead the item itself is the priority. However, that obviously limits you to only storing items that have a natural total ordering, e.g. numbers or strings. But objects in general, including hashes, are neither totally ordered nor do they have a natural ordering. \$\endgroup\$