How can I estimate the probability that the sum $S_n$ of $n$ uniform random 48-bit signed integers overflows a 64-bit signed integer?
Edit: the overflow can occur at any step, not only in the final result.
Since the sum is approximately normally distributed, I guess that the cumulative sums more or less oscillate around zero, and that the probability that they ever exceed 64 bits is vanishingly small. I would like to formalise this intuition.
- By overflow, do you mean that the final result is different from what it would have been if evaluated with unlimited integers, or that any of the individual additions overflows? (If the latter happens, the final result can still be correct.) – user555045, Feb 2, 2024
- Re: "the probability that they ever exceed 64 bits is vanishingly small": note that if we're assuming two's complement, there is a slight bias toward negative numbers (because $-2^{47}$ is possible but 2ドル^{47}$ is not, so the average possible value is $-0.5$), and that bias will eventually add up. So if by "ever" you literally mean ever, i.e. the probability as $n$ goes to infinity, then the probability is 1, no? (Even without that bias I actually wonder whether the probability might be 1, but if so, that's not as obvious.) – ruakh, Feb 2, 2024
- @ruakh Since this is a sum of independent variables, the variance will continually grow, so for very large $n$ the probability of being in the 64-bit range goes to zero. – qwr, Feb 3, 2024
- stats.stackexchange.com/questions/317852/… – qwr, Feb 3, 2024
- Since you edited the question, I don't know the distribution of the max of the partial sums. You're probably better off asking on the statistics Stack Exchange, because this is not really a CS question. – qwr, Feb 3, 2024
3 Answers
Suppose the 48-bit signed integers are drawn uniformly at random from $[-2^{47}, 2^{47}-1]$. Since each draw has magnitude at most 2ドル^{47}$, an overflow requires at least $n = 2^{63}/2^{47} = 2^{16}$ integers drawn, which is large enough for the Central Limit Theorem to apply, so the sum is approximately Normal (but discretized).
For independent variables $X_i$ with mean $\mu$ and variance $\sigma^2$, the sum of $n$ of them has mean $n \mu$ and variance $n \sigma^2$. For our uniformly random integers, the mean $\mu = -1/2$ is slightly biased negative as ruakh notes, and the variance is about $(2^{48})^2/12 = 2^{96}/12 \approx 6.6 \times 10^{27}$, so the resulting distribution is approximately $N(-n/2,\ 6.6 \times 10^{27} \cdot n)$.
Since the standard deviation is only about 8ドル.1 \times 10^{13} \sqrt{n}$, for $n = 10^8$ the threshold 2ドル^{63}$ would be more than 10 SDs away, but for $n = 10^9$ it's only 3.6 SDs away, so pick your $n$ based on z-score probabilities. (At distances this large, the biased mean has little effect on the result.)
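As a sanity check, here is a minimal Python sketch of that calculation (standard library only; the function name `overflow_tail` and the choice to report just the upper tail $P(S_n > 2^{63})$ are my own):

```python
import math

def overflow_tail(n: int) -> float:
 """Normal-approximation estimate of P(S_n > 2^63)."""
 mean = -0.5 * n # each draw has mean -1/2
 sd = math.sqrt(2**96 / 12 * n) # each draw has variance ~ 2^96/12
 z = (2**63 - mean) / sd # z-score of the overflow threshold
 return 0.5 * math.erfc(z / math.sqrt(2)) # standard Normal upper tail

for n in (10**8, 10**9):
 print(n, overflow_tail(n)) # roughly 3e-30 and 1.7e-4, respectively
```

The symmetric event $S_n < -2^{63}$ has essentially the same probability, since the $-n/2$ drift is negligible next to the standard deviation, so the two-sided estimate is about twice this.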
Calculate the average and variance of a single signed 48-bit variable. Multiply by n to get the average and variance of the sum of n such variables. The standard deviation is the square root of the variance.
Calculate how many standard deviations k the overflow threshold is away from the average. The probability of being more than k standard deviations away from the mean is then roughly exp(-k^2 / 2).
Now that is the probability of overflow after summing all n numbers. It ignores the possibility that the sum might overflow at an intermediate step and then shrink back into range, so the probability of overflow at some point within the n steps is higher; for a walk with negligible drift, the reflection principle suggests roughly twice as high (see the sketch below).
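This factor can be probed empirically. Below is a rough Monte Carlo sketch (all names and the scaled-down 9-bit accumulator are my own choices, since the real 64-bit overflow is far too rare to sample directly): it compares the probability that any partial sum leaves the representable range against the probability that the final sum does.

```python
import random

def walk(n: int, draw_bits: int = 6, acc_bits: int = 9) -> tuple[bool, bool]:
 """Sum n uniform signed draw_bits-bit draws; report whether any partial
 sum left the signed acc_bits-bit range, and whether the final sum did."""
 lo, hi = -2**(acc_bits - 1), 2**(acc_bits - 1) - 1
 s, ever = 0, False
 for _ in range(n):
 s += random.randint(-2**(draw_bits - 1), 2**(draw_bits - 1) - 1)
 ever = ever or not lo <= s <= hi
 return ever, not lo <= s <= hi

random.seed(1)
trials = 20_000
results = [walk(n=64) for _ in range(trials)]
p_any = sum(a for a, _ in results) / trials
p_end = sum(e for _, e in results) / trials
print(f"any step: {p_any:.3f}, final: {p_end:.3f}, ratio: {p_any / p_end:.2f}")
```

With these toy parameters the threshold sits about 1.7 SDs out and the observed ratio comes out near 2, consistent with the reflection-principle heuristic, so the endpoint estimate should be off by a small constant factor, not by orders of magnitude.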
- You're using the Central Limit Theorem, correct? I didn't know whether the CLT applied to sums the way it applies to normalized means. See my comment on the question post. – qwr, Feb 3, 2024
- As long as you can calculate the mean, the variance, and the standard deviation, for the CLT you only need to know how many standard deviations you are allowed to be away from the mean. If you had n values from -1000 to +10000, each with a mean of 4,500, then the mean of n values would be 4,500n. And for very large n, if 4,500n gets close to 2^63, the number of standard deviations allowed without overflow shrinks very quickly. – gnasher729, Feb 3, 2024
- I mean: how big does n have to be for the sum to be approximately normal? But I think 2^16 is way more than enough. – qwr, Feb 3, 2024
Hoeffding's inequality gives non-asymptotic probability bounds. If each $X_i$ lies in $[a,b]$ and $C \geq b-a$, then $$P\left(\left|\sum_i X_i - \sum_i E[X_i]\right|>t\right)\leq 2\exp\left(\frac{-2t^2}{nC^2} \right).$$
This gives bounds qualitatively similar to the CLT estimate that @qwr used, but slightly more rigorous, since the CLT statement is asymptotic while Hoeffding's inequality holds for all $n$ and $t$.
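A minimal sketch evaluating the bound for the question's parameters (the function name is mine; I take $C = b-a = 2^{48}$ and measure $t$ from the mean $-n/2$ to the nearer overflow boundary):

```python
import math

def hoeffding_overflow_bound(n: int) -> float:
 """Hoeffding upper bound on P(|S_n - E[S_n]| > t) with C = 2^48."""
 c = 2.0**48 # width of the support [-2^47, 2^47 - 1]
 t = 2.0**63 - n / 2 # distance from the mean -n/2 to -2^63,
 # the nearer of the two overflow boundaries
 return 2.0 * math.exp(-2.0 * t * t / (n * c * c))

for n in (10**8, 10**9):
 print(n, hoeffding_overflow_bound(n)) # roughly 9e-10 and 0.23
```

As expected for a bound built from the full range rather than the actual variance, it is much looser than the Normal estimate (the exponent works out to about $-z^2/6$ instead of $-z^2/2$), but it holds for every $n$.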