0

I am trying to understand a Huffman code written in python by 'Rosetta code'. Following is small part from the code.

def encode(symb2freq):
 heap = [[wt, [sym, ""]] for sym, wt in symb2freq.items()] #What does this do?

I'm assuming variable heap is a list. But what is wt and sym?

noɥʇʎԀʎzɐɹƆ
10.8k3 gold badges52 silver badges67 bronze badges
asked Dec 16, 2015 at 22:38
3
  • sym looks like the symbols (the keys of the dict coming as parameter) and wt their weights (values associated) Commented Dec 16, 2015 at 22:45
  • 1
    Link to the code, by the way. Commented Dec 16, 2015 at 23:13
  • as a heads up, it's poor etiquette to pose a question and not respond to any of the answers. Either comment or click a check mark to close this issue out. Commented Dec 17, 2015 at 1:33

4 Answers 4

4

sym2freq is a dictionary with keys some symbols and their values the symbol's frequency. For example, if you have the string 'aaabacba', the dictionary will look like this

sym2freq = {'a': 5, 'b': 2, 'c': 1}

That's because we have 5 times the letter a, 2 times the letter b and 1 time the letter c.

Dictionaries have the method items(), which will return a tuple of each key with its respective value. In our case, it would return

>>> sym2freq.items()
(('a', 5), ('b', 2), ('c', 1))

The for sym, wt in symb2freq.items() part of the comprehension list is just unpacking. Every time we fetch one of our tuples from above, we assign the first object to the variable sym and the second to the variable wt. The names sym and wt were chosen purely to be reflective of the values the represent, i.e. symbol and weight (frequency).

>>> sym, wt = ('a', 5)
>>> print sym
'a'
>>> print wt
5

And since the list comprehension will create lists of the structure [wt, [sym, ""]], you'll end up with the list

>>> encode(sym2freq)
[[5, ['a', '']], [2, ['b', '']], [1, ['c', '']]]

The reason why we go from a dictionary of symbol frequencies to a structure like the list heap is so we can sort out symbols in terms of their frequencies, which, as you are learning, is part of constructing the Huffman tree.

answered Dec 16, 2015 at 23:09
Sign up to request clarification or add additional context in comments.

2 Comments

Thank you @Reti43 very much, brilliantly explained!
Yes. That's correct. (author).
3

This is a list comprehension. This is saying

  • Get the items of symb2freq and start to loop over them.
  • Get the first item in symb2freq and unpack them into variables sym and wt.
  • Now add [wt, [sym, ""]] to the list
  • Do this for each item
  • Now put the list in the variable heap

For example, [bar(x) for x in foo] makes a list that applies bar(x) to each value in the list.

answered Dec 16, 2015 at 22:45

Comments

-1

This is an interesting problem. Imagine a dictionary with five letters where the key is the letter and the value is the frequency:

a = {'a':1, 'b':2, 'c':3, 'd':4, 'e':5}

then mapping it so that it's a list of lists with frequency, and a list starting with the letter into the heap variable like this:

[[1, ['a', '']], [2, ['b', '']], [3, ['c', '']], [4, ['d', '']], [5, ['e', '']]]

then when you heapify it, you get:

[[1, ['a', '']],
 [3, ['c', '']],
 [2, ['b', '']],
 [5, ['e', '']],
 [4, ['d', '']]]

Just work things out step by step with a valid dictionary to encode like the one given on the wikipedia page and you'll be able to see what each step does. Or use a trivial example to start like the one I gave and slowly grow the number of elements until you're working with a real document.

What the rosetta code is doing is iterating through the heap and adding digits to the blank string you see after the letter based on frequency, until each letter is mapped to a binary interpretation. So in the end you'd having something like this:

 [['e', '101'], 
 ['d', '010'], 
 ['c', '1001'], 
 ['b', '1100'], 
 ['a', '1101']]

where the most frequent letters require the fewest bits.

answered Dec 16, 2015 at 22:45

Comments

-2

sym is symbol, and wt is weight.

https://en.wikipedia.org/wiki/Huffman_coding

answered Dec 16, 2015 at 22:44

Comments

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.