2
$\begingroup$

Inspired by this SO post, I'd like to generate random bitstrings of a length $n$, having at most $k$ bits set for some $k$. Bitstrings should be selected uniformly at random amongst all possible bitstrings. (The original question also asks them to be unique, but that's easily handled with e.g. a set to catch duplicates).

Currently, I have a few approaches, none of which is ideal:

  • For large $k$ ($\approx n/2$), just perform rejection sampling across all bitstrings of length $k$.
  • For very small $k$, we can explicitly calculate the distribution of 1-bit counts via the binomial coefficients, then randomly select a target count from this distribution. (Generating bitstrings with exactly k bits set is much easier)
  • For general use, choose a sequence of length $n$ out of a pool of $n$ 0s and $k$ 1s. While efficient, this approach is biased in favour of 1-bit counts near $k/2$.

Is there an efficient and correct algorithm for selecting random bitstrings with up to $k$ bits set?

asked Sep 25, 2024 at 14:34
$\endgroup$
2
  • $\begingroup$ Partial answer: cs.stackexchange.com/questions/67664/… $\endgroup$ Commented Sep 26, 2024 at 6:55
  • $\begingroup$ @Pseudonym: as noted in my question, "generating bitstrings with exactly k bits set is much easier". $\endgroup$ Commented Sep 27, 2024 at 1:20

1 Answer 1

0
$\begingroup$

Yes. Let $f(n,k)$ denote the number of such bitstrings (i.e., number of $n$-bit strings containing $\le k$ ones). Then it is easy to sample from this set. In particular, use the following recursive algorithm:

Sample the first bit of such a string by choosing 1 with probability $p$ and 0 with probability 1ドル-p$, where $p=f(n-1,k-1)/f(n,k)$. If the first bit is 1, pick the remaining $n-1$ bits by recursively sampling from $n-1$-bit strings containing $\le k-1$ ones. If the first bit is 0, pick the remaining $n-1$ bits by recursively sampling from $n-1$-bit strings containing $\le k$ ones.

So all that remains is to describe an algorithm to compute $f(n,k)$. And this is easy to compute using dynamic programming, by taking advantage of the recurrence

$$f(n,k) = f(n-1,k-1) + f(n-1,k),$$

with base cases $f(n,0)=1$.

answered Sep 26, 2024 at 7:00
$\endgroup$
1
  • 3
    $\begingroup$ I think OP knows this algorithm (their second bullet point), but it really isn't efficient - it needs bignum arithmetic and potentially many random bits per output bit. It's better for exactly $k$ set bits, because then $p=k/n$. $\endgroup$ Commented Sep 26, 2024 at 20:38

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.