From this stackoverflow post
The main trick behind this algorithm is that if you, observing a stream of random integers, see an integer which binary representation starts with some known prefix, there is a higher chance that the cardinality of the stream is 2^(size of the prefix).
Hyperloglog uses hash to achieve randomness, but how does one prove that hashing a value give random output? Or even more strictly does hash guarantees pseudo random like uniform output?
If hash does not guarantee uniform output, is there a way we can upperbound and quantize the non-uniformity of a hash function?
-
Part 4: Overview of the Proof, I think this can answer your question.Yonlif– Yonlif2019年02月20日 20:28:02 +00:00Commented Feb 20, 2019 at 20:28