I have a huffman encoded byte array of size 400 MB. Where the huffman tokens are all possible 4 bit values (0-15). I have to decode it within 1 minute. I need an efficient way to decode. In a 16gb ram windows system with a processor speed 2.8 Ghz. can i make it in 1 minute?
asked Jun 3, 2016 at 10:46
Sayantan Ghosh
1,0763 gold badges13 silver badges31 bronze badges
-
What you are saying doesn't make sense. You should have 256 different huffman tokens, not 16. Do you think "the tokens have all possible lengths" (1 to 15 bits)? Tokens can be longer than that. Decoding 7 MB per second or 400 cycles per byte would point to quite badly performing code.gnasher729– gnasher7292016年06月03日 10:52:14 +00:00Commented Jun 3, 2016 at 10:52
-
I meant the leaf values will be 0-15 (4 bit values).Sayantan Ghosh– Sayantan Ghosh2016年06月03日 12:19:25 +00:00Commented Jun 3, 2016 at 12:19
-
Fx8150 cpu single thread decoding (on 4-bit max encoded) makes 45MB/s when written in C++ language.huseyin tugrul buyukisik– huseyin tugrul buyukisik2022年03月01日 19:46:05 +00:00Commented Mar 1, 2022 at 19:46
1 Answer 1
It took about eight seconds on my four-year old 2 GHz i7 processor, using zlib's inflate decompressor given only Huffman encoded input that was compressed 4:1 down to 400 MB.
So yes, you should be able to do much better than a minute.
answered Jun 3, 2016 at 14:39
Mark Adler
115k15 gold badges139 silver badges184 bronze badges
Sign up to request clarification or add additional context in comments.
2 Comments
Mark Adler
Since you apparently don't know how to spell "you", then Huffman decoding may be too difficult for you. What the heck is a "ds"? In any case, you can look at the inflate code in zlib for how to do fast table-based Huffman decoding.
Sayantan Ghosh
Thanks. By ds i meant data structures.