4
\$\begingroup\$

This is an implementation of Huffman Coding that works on ASCII values. I simplified main to show an example of user input to the program. I only removed handling of non-printable ASCII characters, as that is not something I need reviewed. Initially, I tried to use std::priority_queue<std::unique_ptr<node>>, but you cannot easily remove these pointers due to std::priority_queue::top returning a const reference, so I had to use std::shared_ptr. More or less, I just want to know if there are any glaring issue with how I used the standard library or guideline violations.

#include <iostream>
#include <map>
#include <memory>
#include <queue>
#include <string>
#include <unordered_map>
#include <utility>
#include <optional>
#include <sstream>
/// node
/// node in huffman tree
struct node {
 node(char c, int freq, std::shared_ptr<node> left, std::shared_ptr<node> right)
 : c {c}
 , freq {freq}
 , left {std::move(left)}
 , right {std::move(right)}
 {}
 node(int freq, std::shared_ptr<node> left, std::shared_ptr<node> right)
 : freq {freq}
 , left {std::move(left)}
 , right {std::move(right)}
 {}
 std::optional<char> c;
 int freq;
 std::shared_ptr<node> left;
 std::shared_ptr<node> right;
};
/// build_huffman_codings
/// traverses a huffman tree and finds encodings for all characters
/// \param root root of huffman tree or subtree
/// \param accumulator prefix of huffman code gathered before root
/// \param codings mapping of character to its final encoding
void build_huffman_codings(const std::shared_ptr<node>& root,
 std::map<char, std::string>& codings,
 std::string accumulator = "") {
 // leaf node adds to codings
 if (!root->left && !root->right) {
 codings[root->c.value()] = accumulator;
 return;
 }
 // left branch
 if (root->left) {
 build_huffman_codings(root->left, codings, accumulator + "0");
 }
 // right branch
 if (root->right) {
 build_huffman_codings(root->right, codings, accumulator + "1");
 }
}
/// huffman
/// compute huffman codings for given frequencies of characters
/// \param freq mapping of character and frequencies
/// \return mapping of character to a binary representation
std::map<char, std::string> huffman(const std::map<char, int>& freq) {
 // pre-allocate nodes
 std::vector<std::shared_ptr<node>> nodes;
 for (const auto& pair : freq) {
 nodes.emplace_back(std::make_shared<node>(pair.first, pair.second, nullptr, nullptr));
 }
 // compare freq of node pointers
 auto compare_greater = [] (const auto& p1, const auto& p2) {
 return p1->freq > p2->freq;
 };
 // priority queue holds nodes in increasing order by frequency
 std::priority_queue<std::shared_ptr<node>,
 std::vector<std::shared_ptr<node>>,
 decltype(compare_greater)>
 queue {compare_greater, std::move(nodes)};
 const std::size_t size = queue.size();
 // repeat size - 1 times
 for (std::size_t i = 1; i < size; ++i) {
 // remove first two nodes
 std::shared_ptr<node> x = queue.top();
 queue.pop();
 std::shared_ptr<node> y = queue.top();
 queue.pop();
 // add new node
 queue.emplace(std::make_shared<node>(x->freq + y->freq, x, y));
 }
 std::map<char, std::string> codings;
 build_huffman_codings(queue.top(), codings);
 return codings;
}
int main() {
 // store character with its frequency
 std::map<char, int> frequencies;
 // example user input - real implementation handles non-printable ascii
 std::stringstream freq_txt {
 "a 5\n"
 "b 25\n"
 "c 7\n"
 "d 15\n"
 "e 4\n"
 "f 12\n"
 };
 char c;
 int f;
 while (freq_txt) {
 freq_txt >> c;
 freq_txt >> f;
 frequencies[c] = f;
 }
 // huffman code table
 const auto codings = huffman(frequencies);
 for (const auto& pair : codings) {
 std::cout << pair.first << ' ' << pair.second << '\n';
 }
}
asked Apr 23, 2020 at 1:14
\$\endgroup\$

1 Answer 1

1
\$\begingroup\$

At first glance, I see a a lot of good usage of the standard library. So, the remarks I have are rather small:

  • in huffman(), you create a vector nodes, as you know exactly the amount of elements to store into it, I would reserve it. This reduces memory usage slightly and improves performance. (nodes.reserve(freq.size()))
  • In the same function, you don't seem to have handling for size == 0, most likely because of the simplification?
  • a small optimization in the for-loop, you can use std::move() on x and y.
  • In build_huffman_condings, you have std::string accumulator ="". It's better to write = std::string{} as this doesn't require strlen to be called.

On the more high level, I am wondering about the used datastructures, especially std::map. I know std::unordered_map ain't an ideal hash-map, though, it does perform better than std::map if you don't need ordering. I don't see in your code that need, hence I would recommend replacing it. (You can also use abseil ... for a better implementation, or as the values are small, boosts flat_map)

In short, your usage of the standard library looks fine.

answered Apr 26, 2020 at 8:23
\$\endgroup\$
4
  • \$\begingroup\$ Thanks for responding. I used map over unordered_map because my calling code needed to output the table with ascii values in order. \$\endgroup\$ Commented Apr 26, 2020 at 12:59
  • \$\begingroup\$ Even than, if you know the characters, depending on the length of the text, it might be beneficial to use hash_map and at the end loop over all characters and locate. Though that all depends on the use cases. \$\endgroup\$ Commented Apr 26, 2020 at 15:52
  • \$\begingroup\$ Since queue.top() returns a const shared_ptr& - and even if I made x and y const, should I still use move on them? I'm getting warnings to not move const variables. I'm assuming it will handle the reference counts correctly and the warning may not apply in this situation. \$\endgroup\$ Commented Apr 27, 2020 at 2:08
  • \$\begingroup\$ You indeed make a copy of the shared_ptr when creating variable x, which is needed. However, when passing it along to the new node, you could move it. \$\endgroup\$ Commented Apr 27, 2020 at 10:54

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.