
20 September 2011

Using generating functions to prove that the only sequences which are their own sequence of running averages are the constant sequences

Here's a problem that occurred to me yesterday: consider a sequence of real numbers a_0, a_1, a_2, ... . Let b_k = (a_0 + a_1 + ... + a_k)/(k+1) be the average of the first k+1 of the a_i. When is a sequence equal to its own sequence of averages? That is, if we see a bunch of numbers, when is it true that every number we see is the average of all the numbers we've seen so far?

Of course the answer is the constant sequences. But can we prove this using generating functions?

It turns out we can. If we have

f(z) = a_0 + a_1 z + a_2 z^2 + ...

then

f(z)/(1-z) = a_0 + (a_0 + a_1)z + (a_0 + a_1 + a_2) z^2 + ... = b_0 + 2b_1 z + 3b_2 z^2 + ...

and let's call the right-hand side here g(z). What can we do to g(z) to transform it into

h(z) = b_0 + b_1 z + b_2 z^2 + ... ?

Well, a linear operator that takes the function (of z) z^k to the function (of z) z^k/(k+1) could be applied to g in order to get h. Clearly this is related to integration. In fact the operator

I φ(z) = (1/z) ∫_0^z φ(s) ds

does the trick. So we have

h(z) = I g(z) = (1/z) ∫_0^z f(s)/(1-s) ds.

But remember that we wanted the generating function h of the averages to be just the generating function f of the original sequence. So this gives

f(z) = (1/z) ∫_0^z f(s)/(1-s) ds

and this is in fact a differential equation. Multiply through by z and differentiate both sides to get

z f'(z) + f(z) = f(z)/(1-z).

A bit of rearranging gives

f'(z)/f(z) = 1/(1-z)

and we recognize that the left-hand side is the derivative of log f(z). Integrating both sides gives

log f(z) = log 1/(1-z) + C

where C is a constant of integration, and finally we get f(z) = e^C/(1-z). But this is just the generating function of a constant sequence.
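If you don't trust my calculus, here's a quick sanity check with sympy (a minimal sketch, assuming sympy is available):

import sympy as sp

# Solve z f'(z) + f(z) = f(z)/(1-z), the differential equation derived above.
z = sp.symbols('z')
f = sp.Function('f')
ode = sp.Eq(z * f(z).diff(z) + f(z), f(z) / (1 - z))
print(sp.dsolve(ode, f(z)))  # f(z) = C1/(1 - z), possibly with the sign absorbed into C1

and indeed the only solutions are constant multiples of 1/(1-z).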

18 March 2011

What is the origin of Kirillov's lucky number problem?

I've run across a few references to a Russian superstition as follows: a bus ticket has a six-digit number, and a ticket is said to be "lucky" if the sum of the first three digits equals the sum of the last three digits. See, for example, Lectures on Generating Functions by Sergei Lando -- an excellent little book I accidentally discovered in the library a couple days ago. Lando gives the problem of figuring out how many lucky tickets there are, and says that in the early 1970s A. A. Kirillov would often open his seminar this way. The cataloging information in the book says that Lando was born in 1955; Olshanski writes, in his preface to Kirillov's seminar on representation theory, that many students attended Kirillov's seminar. I suspect that Lando actually heard Kirillov pose this problem at some point.

What I'm interested in is whether this is an actual superstition, perhaps held by non-mathematicians. It seems like the sort of thing that a mathematician would make up to create a good problem. This web page mentions the superstition in a non-mathematical context, with the twist that you're supposed to eat lucky tickets. This one does as well, and links to these people who sell lucky ticket cookies. It seems quite likely that:

1. the superstition exists among some segment of the Russian population, or at least did at some point, and
2. the entrance of the problem into the mathematical culture is due to Kirillov -- if I were still at Penn I'd ask him.

But did the problem travel from the rest of the world to mathematics? Or, because this sum condition feels so mathematical, did it travel from mathematics to the rest of the world?

(Also, can you solve the problem? How many lucky tickets are there?)
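If you'd rather let the machine spoil it, a brute-force count in Python is short (a sketch; the only fact it uses is that a lucky ticket is a pair of three-digit blocks with equal digit sums):

from itertools import product

# Count six-digit ticket numbers 000000-999999 where the sum of the
# first three digits equals the sum of the last three.
counts = [0] * 28  # the sum of three digits ranges over 0..27
for a, b, c in product(range(10), repeat=3):
    counts[a + b + c] += 1

print(sum(n * n for n in counts))  # 55252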

25 January 2009

Existence of bijections?

I have two counting sequences that I know to be the same. That is, I have objects of type A and of type B, which each have a size, and I know that the number of A-objects of size n and the number of B-objects of size n are the same. But I am struggling to find a bijection between the things they count.

Are there results of the form "sure, the two counting sequences are the same, but there's no `natural' bijection between the corresponding sets?" (Obviously bijections will always exist between sets of the same size, so this would of course depend on the definition of the word "natural" -- perhaps a meaning analogous to the category-theoretic one is what I'm looking for, perhaps not.) I've never seen any, but most of the people I know are more interested in finding the bijections than in proving there aren't any.

30 May 2008

"Square roots" of probability distributions

Think about probability distributions supported on the nonnegative integers. Not all of them have a "square root" -- that is, given a random variable X supported on the nonnegative integers, there do not necessarily exist independent, identically distributed variables Y_1, Y_2 such that X has the same distribution as Y_1 + Y_2.

You might be wondering why it's natural to refer to this as a "square root". Well, let p_k be the probability of the event X = k. Then the probability generating function for X is f(z) = p_0 + p_1 z + p_2 z^2 + ... . Similarly, let q_j be the probability of the event Y = j, and define the probability generating function g(z) = q_0 + q_1 z + q_2 z^2 + ... . Let Y, Y_1, Y_2 be equidistributed. Then X is equidistributed with Y_1 + Y_2 if and only if f(z) = g(z)^2, since addition of independent random variables corresponds to convolution of distributions and multiplication of their generating functions.

Conversely, the random variable X has a square root in this sense if and only if its generating function f(z) has a square root which has a Taylor series at z = 0 with all coefficients nonnegative.

The simplest case is when X has finite support. Then X has a square root if and only if its generating function f(z) is the square of a polynomial. For example, the random variable which takes values 0, 1, 2 with probabilities 1/4, 1/2, 1/4 has a square root; its generating function is

{1 \over 4} + {z \over 2} + {z^2 \over 4} = \left( {1+z \over 2} \right)^2.
But the random variable taking values 0, 1, 2 with probability 1/3 each does not have one, since (1 + z + z^2)/3 is not the square of a polynomial.

But what about distributions with infinite support? Some of these are easy -- the square root of the Poisson distribution with mean λ is Poisson with mean λ/2. This can easily be seen since the probability generating function of a Poisson(λ) random variable is e^{λ(z-1)}, and the square root of that is e^{(λ/2)(z-1)}. (In fact, the Poisson distribution is infinitely divisible; in the nomenclature used in this post one might say it has roots of all orders.)

Now consider a random variable X, with geometric distribution with parameter 1/2 supported on {0, 1, 2, 3, ...}; this has P(X = k) = 2^{-(k+1)}. This is a special case of the negative binomial distribution, which is infinitely divisible. We have f(z) = 1/2 + z/2^2 + z^2/2^3 + ... = 1/(2-z). So the square root distribution has generating function

g(z) = (2-z)^{-1/2} = 2^{-1/2} \left( 1 - {z \over 2} \right)^{-1/2}

and in general the coefficient of z^n is, by the binomial theorem,

q_n = {1 \over \sqrt{2}} {2n \choose n} {1 \over 8^n}.

We have ${2n \choose n} \sim 4^n/\sqrt{\pi n}$, so the coefficient of z^n in our probability generating function, which we'll call q_n, is asymptotic to

q_n \sim {1 \over 2^n \sqrt{2 \pi n}}.
In particular, the "square root" distribution decays just a bit faster than the distribution that it's the square root of, the main difference being the additional factor of n^{-1/2}. This is reasonable, if you think about the process of convolution. We have

p_n = q_0 q_n + q_1 q_{n-1} + q_2 q_{n-2} + ... + q_{n-1} q_1 + q_n q_0

and each of the n+1 terms is roughly 1/(n 2^n). This is just another negative binomial distribution. (I actually didn't realize that until I started writing this post; the post was originally titled "what's the name of this distribution?" and then I did some research.)
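Here's a quick numerical check of all this (a sketch in Python): the coefficients q_n computed above really do convolve with themselves to give the geometric distribution.

import math

# q_n = C(2n,n) / (sqrt(2) * 8^n); the convolution of q with itself
# should give p_n = 2^-(n+1), the geometric(1/2) distribution.
N = 20
q = [math.comb(2*n, n) / (math.sqrt(2) * 8**n) for n in range(N)]
for n in range(N):
    conv = sum(q[k] * q[n - k] for k in range(n + 1))
    assert abs(conv - 2**-(n + 1)) < 1e-12
print("q * q matches the geometric(1/2) distribution")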

26 February 2008

Combinatorial quotients

Consider a combinatorial class (in the sense of Flajolet and Sedgewick) or combinatorial species (in the sense of Bergeron, Labelle, and Leroux). There are certain generating functions associated with combinatorial classes/species (I'm conflating the two for the moment because the differences between the two theories aren't important here), and the operations of sum and product have nice "combinatorial" interpretations -- addition of generating functions corresponds to disjoint union of classes/species, and multiplication of generating functions corresponds to what I'll call here the "combinatorial product": for two species F and G, an FG-structure on the set [n] is a pair which consists of an F-structure on some subset S of [n] and a G-structure on [n]-S. Then the generating function of the FG-structures is the product of the generating function of the F-structures and that of the G-structures.

Other natural operations on functions have combinatorial interpretations: composition of functions, taking exponentials or logarithms, differentiation, integration, etc. Even subtraction has an interpretation, although one has to be careful: this is the "virtual species". (A virtual species is a pair of ordinary species (F, G), written as F-G, where (F, G) = (H, K) if F + K = G + H. Replacing species with integers and addition with multiplication, this looks like the construction of the rational numbers from the integers.)

But is there a natural quotient operation? If there is, I can't find a mention of it in Flajolet and Sedgewick, and Marko Petkovsek (who I was just talking about this with) tells me it's not in Bergeron et al. either; this is one of the sources he's using for our current topics course in combinatorics. Let F be a species with exactly one structure on the set of size 0, and let F_n be its restriction to size n. Let F_+ be its restriction to all positive sizes, so that F = 1 + F_+. Then
F^{-1} = (1 + F_+)^{-1}

where 1 is the multiplicative identity species (hence the name); it has one structure on a set of size 0. We can of course write this as
1 - F_+ + F_+^2 - F_+^3 + F_+^4 - \cdots

and we collect the "positive" and "negative" terms to get

F^{-1} = (1 + F_+^2 + F_+^4 + \cdots) - (F_+ + F_+^3 + F_+^5 + \cdots).
So the inverse of the species F exists, and it's a virtual species in which the positive part is even-length tuples of nonempty F-structures, and the negative part is odd-length tuples of nonempty F-structures. This doesn't quite seem like a "combinatorially natural" operation -- but on the other hand I was able to describe it in a single sentence, so it seems plausible that it could be a natural thing to occur somewhere.

Notice that this also proves that if

F(z) = 1 + f_1 z + f_2 z^2 + \cdots, \qquad G(z) = g_0 + g_1 z + g_2 z^2 + \cdots

with the f_n integers, and F(z)G(z) = 1, then all the g_n are integers, since the generating function G(z) actually counts something! (Well, a virtual something -- but that's close enough.) It won't surprise you to learn that for two species H and K, the generating function corresponding to the virtual species H-K is H(z) - K(z). G here corresponds to a virtual species, so the g_n are (possibly negative) integers. Of course, there are other ways to prove this, but I think this is a cute "combinatorial proof".
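To see the integrality claim in action, here's a small sketch in Python, taking the (ordinary) counting sequence of permutations, f_n = n!, as the example species F:

# Compute 1/F(z) as 1 - F_+ + F_+^2 - F_+^3 + ..., truncated at order N.
N = 8

def mul(a, b):
    c = [0] * N
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j < N:
                c[i + j] += ai * bj
    return c

F = [1, 1, 2, 6, 24, 120, 720, 5040]  # f_n = n!
Fplus = [0] + F[1:]                    # restriction to positive sizes

inv, power = [0] * N, [1] + [0] * (N - 1)
for k in range(N):                     # accumulate (-1)^k F_+^k
    inv = [x + (-1)**k * y for x, y in zip(inv, power)]
    power = mul(power, Fplus)

print(inv)          # all integers, some of them negative
print(mul(F, inv))  # [1, 0, 0, ...], i.e. F times its inverse is 1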

10 February 2008

Splitting permutations of [n] into two classes

The number of permutations of [n] with an even number of cycles and the number with an odd number of cycles are equal, for all n ≥ 2. (I assume this is a standard result, just because anyone who's stared at the Stirling numbers for long enough would think of it, but I don't recall hearing it before.)

Here's a proof which relies on bivariate generating functions. Note that a permutation can be viewed as a set of directed cycles. In the notation of combinatorial classes, we write this
\cal{P} = \hbox{Set}(\mu \hbox{ Cyc}(\cal{Z}))

Now, there are (n-1)! cycles on an n-element set (order the n elements in n! ways, but then divide by n for the rotation), giving the exponential generating function

\sum_{n \ge 1} (n-1)! {z^n \over n!}

which simplifies to

\sum_{n \ge 1} {z^n \over n}

which is just log 1/(1-z).
The mark μ then becomes the variable u, and taking sets corresponds to exponentiation, so we have
P(z,u) = \exp( u \log (1-z)^{-1}) = (1-z)^{-u}

as an exponential generating function for permutations, where z marks size and u marks number of cycles. That is,

(1-z)^{-u} = \sum_{n \ge 0} \sum_k \left[ {n \atop k} \right] u^k {z^n \over n!}

where $\left[ {n \atop k} \right]$ is the number of permutations of [n] with k cycles.

Finally, let u = -1. Then we get

P(z,-1) = (1-z)^{1} = 1 - z

and taking the coefficient of z^n on both sides for any n ≥ 2 and multiplying through by n!, we get

0 = \sum_k (-1)^k \left[ {n \atop k} \right].

But on the right-hand side, we get a contribution of +1 for each permutation of [n] with an even number of cycles, and -1 for each permutation with an odd number of cycles. Thus those two sets are equal in cardinality.

I feel like this is a standard result, but I hadn't seen this proof before coming up with it myself. There's a similar proof that doesn't require the bivariate generating function machinery; the generating function of the numbers $\left[ {n \atop k} \right]$ for fixed n is

\sum_k \left[ {n \atop k} \right] z^k = z(z+1)(z+2) \cdots (z+n-1)

and let z = -1 here; for n ≥ 2 the factor (z+1) makes the product vanish. And I may have even seen a bijective proof of this fact, which gives a bijection between permutations of [n] with an odd number of cycles and an even number of cycles.

By the way, the analogous result that, say, one-third of permutations of [n] have a number of cycles divisible by 3, one-third have 3k+1 cycles for some k, and one-third have 3k+2 cycles for some k isn't true. (It should be approximately true, but that doesn't seem particularly interesting.)
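Both claims are easy to check by brute force for small n -- a sketch in Python:

from itertools import permutations

def cycle_count(p):
    seen, cycles = set(), 0
    for i in range(len(p)):
        if i not in seen:
            cycles += 1
            j = i
            while j not in seen:
                seen.add(j)
                j = p[j]
    return cycles

for n in range(2, 8):
    counts = [cycle_count(p) for p in permutations(range(n))]
    even = sum(1 for c in counts if c % 2 == 0)
    mod3 = [sum(1 for c in counts if c % 3 == r) for r in range(3)]
    print(n, even == len(counts) - even, mod3)

The even and odd classes agree for every n ≥ 2, while the three classes mod 3 are only approximately equal.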

Edited, 3:24 pm: See the comments for a bijective proof offered by Jeremy Henty; it really is quite simple but eluded my grasp this morning.

02 February 2008

How many fixed points do involutions have?

It's a fairly well-known fact that if we pick a permutation of the set [n] at random, then the expected number of cycles of length k in that permutation is 1/k, for 1 ≤ k ≤ n. (Somewhat more memorably, the average number of elements in a permutation of [n] which are members of a k-cycle is 1.)

So what would you expect to happen if you just considered permutations that have only cycles of length 1 and 2 -- that is, involutions -- and sampled uniformly at random from them? If you're like me, your first thought is that the average involution will have twice as many 1-cycles as 2-cycles, and the same number of elements in 1-cycles as in 2-cycles -- that is, the average involution on [n] will have n/2 1-cycles (i. e. fixed points) and n/4 2-cycles, for a total of n/2 fixed points and n/2 elements in 2-cycles.

But then you look at "n/2 fixed points" and think that that seems awfully large...

It turns out the average number of fixed points of an involution chosen uniformly at random from all involutions is about n^{1/2}. This follows from standard generating function arguments. The exponential generating function of the involutions marked for their number of fixed points is exp(uz+z^2/2); that is, n! times the coefficient of z^n u^k in that function is the number of involutions on [n] with k fixed points. Standard methods from, say, Flajolet and Sedgewick (which I will probably buy when it comes out in print later this year, because I seem to cite it constantly) give that the expected number of fixed points is

{[z^n] z \exp(z + z^2/2) \over [z^n] \exp(z + z^2/2)}

and this can actually be rewritten as n a_{n-1}/a_n, where a_n is the number of involutions on [n], that is, $a_n = n! [z^n] \exp \left( z + {z^2 \over 2} \right)$. (There's a nice interpretation for this -- a_{n-1}/a_n is the probability that any given element of an involution is actually a fixed point -- although it's hard to say exactly why this should be true.)

Then, if you're still like me, you think "alas, I have forgotten how to figure out the asymptotics of coefficients of entire functions". But the asymptotic number of involutions is the last example in Herbert Wilf's generatingfunctionology. After some work, the asymptotic formula he gives for a_n gives that the expected number of fixed points in an involution is n^{1/2} - 1/2 + o(1).
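The involution numbers satisfy a_n = a_{n-1} + (n-1) a_{n-2} (condition on whether the element n is a fixed point or paired with one of the other n-1 elements), so the exact mean n a_{n-1}/a_n is easy to compute and compare -- a sketch in Python:

import math

a = [1, 1]  # a_0, a_1
for n in range(2, 2001):
    a.append(a[n-1] + (n-1) * a[n-2])

for n in (10, 100, 1000, 2000):
    print(n, n * a[n-1] / a[n], math.sqrt(n) - 0.5)  # the two columns creep together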

Once you know that fixed points are rare, then it's not hard to guess that their distribution should be approximately Poisson, and thus variance should be of the same order of magnitude as the mean -- and the variance result turns out to be true. (I don't know about the Poisson result.) The variance is, I believe, n^{1/2} - 1 + o(1), although this is only from numerical evidence. (The generating-function way to calculate the variance relies on the definition of the variance as the mean of the square minus the square of the mean; this means I need better asymptotics in order to verify this. The better asymptotics are certainly achievable, but they're not at my fingertips.)

The result is a bit surprising, though -- why does cutting out cycles of length 3 and greater so drastically change the relative numbers of 1-cycles and 2-cycles? But involutions make up a vanishingly small proportion of all permutations, and weird things can happen in these asymptotically negligible sets without the bulk of the population caring at all.

26 October 2007

Trickery with divergent sums

When |q| < 1, 1 - q + q^2 - q^3 + ... = 1/(1+q).

And (1 - q + q^2 - q^3 + ...)^2 = 1 - 2q + 3q^2 - 4q^3 + ..., viewing both sides as formal power series.

Thus we have
1 - 2q + 3q^2 - 4q^3 + ... = 1/(1+q)^2
as formal power series. Let q = 1; then
1 - 2 + 3 - 4 + ... = 1/4.
Isn't that weird?

I learned this one today; it's an example of "Abel summation". Basically, in order to sum a sequence, we take its generating function and evaluate it at 1. Nothing too controversial there; at least to me it seems like the natural thing to do. But it looks a little more mysterious if you see it as I first did, which is without the q's. It seems "obvious" that 1 - 1 + 1 - 1 + 1 - 1 + ... = 1/2, if it's going to equal anything; we define such an infinite sum as the limit of its partial sums, which are alternately zero and one, so we should take something halfway between them. But then multiply
(1 - 1 + 1 - 1 + 1 - 1 + ...) (1 - 1 + 1 - 1 + 1 - 1 + 1 ...)
and you get 1 - 2 + 3 - 4 + 5 - 6 ..., in the following sense: let
(a_0 + a_1 + a_2 + ...) and (b_0 + b_1 + b_2 + ...)
be formal sums. Then we define
(a_0 + a_1 + a_2 + ...)(b_0 + b_1 + b_2 + ...) = c_0 + c_1 + c_2 + ...
where
c_0 = a_0 b_0, c_1 = a_1 b_0 + a_0 b_1, c_2 = a_2 b_0 + a_1 b_1 + a_0 b_2
and so on. (The definition is of course inspired by the definition for formal power series.) Then here we have a_i = b_i = (-1)^i, and so each term in the sum defining c_n is (-1)^n; there are n+1 such terms. So c_n = (-1)^n (n+1), which gives that sum. Substituting the value 1 - 1 + 1 - 1 + ... = 1/2 into the above equation gives

1 - 2 + 3 - 4 + 5 - 6 + ... = 1/4.

Suppressing the q's makes this seem a whole lot more mysterious.
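You can watch the Abel summation happen numerically -- a sketch in Python:

# 1 - 2q + 3q^2 - 4q^3 + ... equals 1/(1+q)^2 for |q| < 1,
# and approaches 1/4 as q approaches 1 from below.
for q in (0.9, 0.99, 0.999):
    total = sum((-1)**n * (n + 1) * q**n for n in range(200000))
    print(q, total, 1 / (1 + q)**2)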

There's a Wikipedia article that shows how you can write out that sum four times and rearrange the terms to get 1; that doesn't seem all that satisfying to me, though, because with similar methods one can show that 1 - 1 + 1 - 1 + 1 - ... equals either zero or one.

John Baez has reproduced Euler's "proof" that 1 + 2 + 3 + 4 + ... = ζ(-1) = -1/12, which to me seems even crazier. But here's another reason why ζ(-1) = -1/12 "makes sense", if you buy 1 - 2 + 3 - 4 + ... = 1/4.

Let S = 1 + 2 + 3 + 4 + ...; add this "term by term" to 1/4 = 1 -ひく 2 +たす 3 -ひく 4 + ... . Then you get

S + 1/4 = 2 +たす 0 +たす 6 +たす 0 +たす 10 +たす 0 + ...

Then divide through by 2 to get

S/2 + 1/8 = 1 +たす 0 +たす 3 +たす 0 +たす 5 +たす 0 +たす 7 +たす 0 +たす 9 + ...

but it's clear that

2S = 2 + 4 + 6 + 8 + ...

and we might as well just stick some zeroes in there to get

2S = 0 + 2 + 0 + 4 + 0 + 6 + 0 + 8 + ...

(If we were still doing this with the q-analogue, this would amount to replacing q with q1/2.) Adding these two sums together, term-by-term, gives

5S/2 + 1/8 = 1 +たす 2 +たす 3 +たす 4 +たす 5 + ...

but we know the sum on the right-hand side is just S. So we have 5S/2 + 1/8 = S; solving for S gives S = -1/12. (I make no guarantee that this is the "most efficient" way to get ζ(-1).)
A basically identical calculation, starting with

1 - 2^3 q + 3^3 q^2 - 4^3 q^3 + \cdots = {1 - 4q + q^2 \over (1+q)^4}

and thus (letting q = 1) -1/8 = 1 -ひく 8 +たす 27 -ひく 64 + ..., yields ζ(-3) = 1/120.

Similarly, we can "compute" ζ(-2); begin with the generating function

1 - 4q + 9q^2 - 16q^3 + \cdots = {1 - q \over (1+q)^3}

and letting q = 1, we get 0 = 1 -ひく 4 +たす 9 -ひく 16 + ...; let S = 1 +たす 4 +たす 9 +たす 16 + ... and add the two sums. Then S = 2 +たす 0 +たす 18 +たす 0 + ...; divide by two to get S/2 = 1 +たす 0 +たす 9 +たす 0 + ...

But remember that S = 1 + 4 + 9 + 16 + ...; thus 4S = 0 + 4 + 0 + 16 + 0 + 36 + ... if we let ourselves stick in zeros arbitrarily. Adding these two expressions, we get

9S/2 = 1 +たす 4 +たす 9 +たす 16 +たす 25 + ...

or 9S/2 = S; thus S = 0, which is in line with the standard result that ζ(-2n) = 0 for all positive integers n. This is a consequence of the functional equation for the zeta function,

ζ(s) = 2^s π^{s-1} \sin(πs/2) Γ(1-s) ζ(1-s)

since when s = -2n for some positive integer n, the factor sin(πs/2) on the right-hand side is zero.

25 October 2007

Something I didn't know about generating functions (and probably should have)

I learned today, while reading Flajolet and Sedgewick's Introduction to the Analysis of Algorithms, something that I probably should have known but didn't. I should have known this because I've learned what ordinary and exponential generating functions are about four times now (since combinatorics classes often don't assume that much background, they seem to get defined over and over again).

Namely, given a sequence a_0, a_1, a_2, ..., we can form its exponential generating function $A(z) = \sum_{n=0}^\infty a_n {z^n \over n!}$. Then the ordinary generating function $\tilde{A}(z) = \sum_{n=0}^\infty a_n z^n$ is given by the integral

\tilde{A}(z) = \int_0^\infty A(zt) e^{-t} \, dt

when this integral exists. (This is something like the Laplace transform.)

The proof is straightforward, and I'll give it here. (I'll ignore issues of convergence.) Consider that integral; by recalling what A(z) is, it can be rewritten as

\int_0^\infty \sum_{k=0}^\infty a_k {(zt)^k \over k!} e^{-t} \, dt

and we can interchange the integral and the sum to get

\sum_{k=0}^\infty \int_0^\infty a_k {(zt)^k \over k!} e^{-t} \, dt.

But we can pull out the part of the integrand which doesn't depend on t to get

\sum_{k=0}^\infty {a_k z^k \over k!} \int_0^\infty t^k e^{-t} \, dt

and that integral is well-known as the integral representation for the gamma function. The integral is k!, which cancels with the k! in the denominator, and we recover $\tilde{A}(z) = \sum_{k=0}^\infty a_k z^k$.

It's easy to come up with examples to verify this; for example, the exponential generating function of the binomial coefficients ${n \choose 2}$ is z^2 e^z/2; plugging this into our formula gives z^2 (1-z)^{-3}, which is the ordinary generating function of the same sequence.
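That example can be checked mechanically; here's a sketch with sympy, evaluated at z = 1/3 so the integral converges:

import sympy as sp

t = sp.symbols('t', positive=True)
z = sp.Rational(1, 3)                  # any 0 < z < 1 would do
A_zt = (z*t)**2 * sp.exp(z*t) / 2      # A(zt), where A(z) = z^2 e^z / 2
print(sp.integrate(A_zt * sp.exp(-t), (t, 0, sp.oo)))  # 3/8
print(z**2 / (1 - z)**3)               # also 3/8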

I'm not sure how useful it is to know this, though... I can't think of too many times where I had an exponential generating function for some sequence and thought "oh, if only I had the ordinary generating function."

16 October 2007

Polynomials with interesting sets of zeros

Polynomials with interesting sets of zeros, from Richard Stanley.

There are certain polynomials which are naturally defined in terms of combinatorial objects, which have interesting sets of zeros. For example, consider the polynomials

F_1(x) = x
F_2(x) = x + x^2
F_3(x) = x + x^2 + x^3
F_4(x) = x + 2x^2 + x^3 + x^4
F_5(x) = x + 2x^2 + 2x^3 + x^4 + x^5

where the x^k coefficient of F_n(x) is the number of partitions of n into k parts, that is, the number of ways to write n as a sum of k positive integers where order doesn't matter. For example, the partitions of 5 are

5, 4+1, 3+2, 3+1+1, 2+2+1, 2+1+1+1, 1+1+1+1+1

which have 1, 2, 2, 3, 3, 4, and 5 parts respectively; that's where F5 comes from. It turns out that if you take Fn for large n and plot its zeros in the complex plane, you get a nice-looking figure; see this PDF file. This is true of other combinatorially defined families of polynomials as well. The zeroes of this particular set of polynomials are given in some recent work of Robert Boyer and William Goh. (Boyer gave a talk today at Temple which I went to, hence why this is on my mind lately; I'd actually heard about this phenomenon when I took a class from Stanley as an undergrad.)

At least in the case of the partitions, it seems noteworthy that if we sum y^n F_n(x) over all n, getting the generating function of all partitions by size and number of parts, we have something nice, namely

\sum_{n \ge 0} F_n(x) y^n = \prod_{k=1}^\infty {1 \over 1 - x y^k}

(taking F_0(x) = 1 for the empty partition)

and a lot of the other sets of polynomials described by Stanley can probably be described similarly. (Boyer and Goh allude to this at the end of their expository paper.) I don't know enough analytic number theory to say something intelligent about this, though; the proof that the pictures keep looking "like that" (in the case of partitions) apparently uses some very heavy machinery.
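If you want to reproduce the pictures, the coefficients of F_n come from the standard recurrence p(n,k) = p(n-1,k-1) + p(n-k,k), and numpy will find the zeros numerically -- a sketch:

import numpy as np

N = 50
p = [[0] * (N + 1) for _ in range(N + 1)]
p[0][0] = 1
for n in range(1, N + 1):
    for k in range(1, n + 1):
        p[n][k] = p[n - 1][k - 1] + p[n - k][k]

roots = np.roots(p[N][::-1])  # np.roots wants the leading coefficient first
print(roots)                  # plot these in the complex plane to see the figure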

13 October 2007

What comes next: 1, 2, 5, ...

What comes next in this sequence: 1, 2, 5, ...?

I know of at least six reasonable answers: 10, 11, 12, 13, 14, or 15.

Here are explanations for each of them. Throughout, I'll denote the nth term of each sequence by a_n, where indexing starts at 1. (So a_3 = 5, and we're looking for a_4. This indexing convention is awkward for the first couple cases but agrees with the conventional indexing for Catalan and Bell numbers.)

1, 2, 5, 10, ... -- a_n = (n-1)^2 + 1.

1, 2, 5, 11, ... -- $a_n = 1 + {n+1 \choose 3}$.

1, 2, 5, 12, ... -- a_n = 2a_{n-1} + a_{n-2}, where a_1 = 1, a_2 = 2. This sequence occurs in the convergents of the continued fraction for √2. The first few convergents are

1/1, 3/2, 7/5, 17/12, 41/29

and in general the nth convergent, b_n/a_n, has a_n = a_{n-1} + b_{n-1}, b_n = a_n + a_{n-1}, with a_1 = b_1 = 1. An explicit formula for these is given by

a_n = {(1+\sqrt{2})^n - (1-\sqrt{2})^n \over 2\sqrt{2}}.

1, 2, 5, 13, ... -- alternate members of the Fibonacci sequence, which begins 1, 1, 2, 3, 5, 8, 13. If you don't like referring to another sequence, this one can be defined by the recurrence a_n = 3a_{n-1} - a_{n-2}, where a_1 = 1, a_2 = 2. An exact formula for the nth Fibonacci number is f_n = (φ^n - (-φ)^{-n})/√5, where φ = (1+√5)/2; then a_n = f_{2n-1}.

1, 2, 5, 14, ... -- the Catalan numbers. These satisfy $a_n = {1 \over n+1} {2n \choose n}$. The Catalan numbers count basically everything. Okay, not everything. But in Richard Stanley's book Enumerative Combinatorics, he gives 66 sets counted by Catalan numbers; he has an ever-expanding list of them on his web site, which is currently up to 162, numbered (a), (b), (c), ..., (z), (aa), ..., (zz), (aaa), ..., (zzz), (a^4), ..., (z^4), ..., (f^7). This list may hold the prize for "longest list of things indicated by letters". The canonical example of "things counted by Catalan numbers" is binary trees, counted by number of leaves, and most of these other things can eventually be reduced to that case. The Catalan numbers satisfy the nice recurrence

a_{n+1} = \sum_{k=0}^{n} a_k a_{n-k}, \qquad a_0 = 1

which explains why they're so prevalent; they correspond to combinatorial objects that can be generated recursively, where each object of size n+1 consists of a pair of similar objects whose sizes together add up to n.

1, 2, 5, 15, ... -- the Bell numbers. These are the number of set partitions of the set {1, 2, ..., n}; that is, the number of ways to arrange n distinguishable balls in some indistinguishable boxes. For four balls, we have the partitions

1234, 123/4, 124/3, 134/2, 234/1, 12/34, 13/24, 14/23, 12/3/4, 13/2/4, 14/2/3, 23/1/4, 24/1/3, 34/1/2, 1/2/3/4

where slashes separate the different boxes. There's no explicit formula for these, but they satisfy the recursion



which can be justified as follows. If we want to count set partitions of {1, 2, ..., n+1}, we can count them by the number of elements in the block containing n+1. The number of set partitions of {1, 2, ..., n+1} where n+1 is in a block of size j+1 is ${n \choose j} a_{n-j}$; we can pick the j elements which are in the same block as n+1 and then make a set partition out of the remaining n-j elements. Thus

a_{n+1} = \sum_{j=0}^{n} {n \choose j} a_{n-j}

and letting k = n-j and recalling ${n \choose n-k} = {n \choose k}$ gives the desired result.

These sequences diverge from each other quickly; the next term of each sequence is 17, 21, 29, 34, 42, and 52 respectively. And the growth rates of the six sequences in question are quite different; roughly, the sequences given here grow like n^2, n^3/6, (1+√2)^n, ((3+√5)/2)^n, 4^n/(πn^3)^{1/2}, and (really fast). Their generating functions also look quite different. For the first five sequences, we have ordinary generating functions which converge in some neighborhood of the origin, which are

{z(1 - z + 2z^2) \over (1-z)^3}, \quad {z \over 1-z} + {z^2 \over (1-z)^4}, \quad {z \over 1-2z-z^2}, \quad {z(1-z) \over 1-3z+z^2}, \quad {1 - 2z - \sqrt{1-4z} \over 2z}

but the last one doesn't even have a convergent ordinary generating function; its exponential generating function is exp(exp(z)-1). There's a subtle difference among the ordinary generating functions as well -- the first two have singularities at z=1, so the sequences they represent grow only as fast as powers of n, while the last three have singularities closer to the origin than z=1, so the sequences they represent grow roughly exponentially.
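For the record, here are all six sequences computed side by side -- a sketch in Python:

from math import comb

def seq1(n): return (n - 1)**2 + 1
def seq2(n): return 1 + comb(n + 1, 3)
def seq3(n):                    # a_n = 2 a_{n-1} + a_{n-2}
    a, b = 1, 2
    for _ in range(n - 1): a, b = b, 2*b + a
    return a
def seq4(n):                    # a_n = 3 a_{n-1} - a_{n-2}
    a, b = 1, 2
    for _ in range(n - 1): a, b = b, 3*b - a
    return a
def seq5(n): return comb(2*n, n) // (n + 1)  # Catalan
def seq6(n):                    # Bell, via the binomial recurrence
    bell = [1]
    for m in range(n):
        bell.append(sum(comb(m, k) * bell[k] for k in range(m + 1)))
    return bell[n]

for n in range(1, 9):
    print(n, seq1(n), seq2(n), seq3(n), seq4(n), seq5(n), seq6(n))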

This is all an example of what I've heard called "the law of small numbers" -- the fact that there just aren't enough small numbers to avoid numerical coincidences like these.

26 September 2007

Fun with images! (Mostly about math tattoos.)

Stokes' theorem graffiti. In a bathroom. Only in Camberville. (I would have said "only in Cambridge", but this particular graffito is actually in Somerville. Diesel Cafe, to be precise; there was a summer where I spent a lot of time there doing math, so I'm not surprised.)

and a tattoo of the Y combinator, which is part of this collection of images of science tattoos. Other mathematically inspired tattoos in that collection are:


I can think of others. An ex-girlfriend had an infinity symbol tattooed on her (although I believe that was more of a literary thing than a scientific one). A friend of mine has

¬(p ∧ ¬p)

tattooed on his leg. A friend of a friend in college had the Taylor series for sine on her arm. It's written out, as

x - {x^3 \over 3!} + {x^5 \over 5!} - {x^7 \over 7!} + \cdots

except with exactly enough terms to go once around her arm; it's not in the more compact form

\sum_{n=0}^\infty {(-1)^n x^{2n+1} \over (2n+1)!}.

I don't have any tattoos; I can't think of anything I'd want permanently inked on me. But if I were to get a tattoo, the equation

\prod_{i=0}^\infty \left( 1 + z^{2^i} \right) = {1 \over 1-z}

would be up there. (I had it written on my wall in college.) I've heard it called the "computer scientist's identity"; it's an analytic version of the fact that every positive integer has a unique binary expansion. The left-hand side expands to

(1+z^1)(1+z^2)(1+z^4)(1+z^8)...

and you can imagine expanding that out; then you get something like

1 + z^1 + z^2 + z^{2+1} + z^4 + z^{4+1} + z^{4+2} + z^{4+2+1} + ...

where each sum of distinct powers of two appears as an exponent exactly once. The right-hand side is the geometric series

1/(1-z) = 1 + z + z^2 + z^3 + ...

and in order for these two expressions to be equal, the coefficient of zn in each of them has to be equal. All the coefficients on the right-hand side are 1; thus the coefficients on the left-hand side must be too. That means each nonnegative integer appears exactly once as an exponent, i. e. as a sum of distinct powers of two. The fact that one can disguise this combinatorial fact as a fact about functions from C to C -- and then apply the methods of complex analysis, although they're not incredibly useful in this particular case -- is a fascinating example of the interplay between the discrete and the continuous.
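The identity is also easy to watch happen: multiply out a partial product and all the coefficients come out to 1 -- a sketch in Python:

# Expand (1+z)(1+z^2)(1+z^4)...(1+z^32) as a list of coefficients.
poly = [1]
for i in range(6):
    step = 2**i
    new = poly + [0] * step
    for j, c in enumerate(poly):  # multiply by (1 + z^step)
        new[j + step] += c
    poly = new

print(poly[:40])  # all ones: each n < 64 is a sum of distinct powers of 2 in exactly one way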

10 September 2007

How many well-formed formulas are there?

The following is exercise 1.1.2 of "A Mathematical Introduction to Logic", by Herbert Enderton:
Show that there are no [well-formed formulas] of length 2, 3, or 6, but that any other positive length is possible.


Enderton defines a wff as follows: every sentence symbol A_n is a wff. If α and β are wffs, then so are (¬ α), (α ∧ β), (α ∨ β), (α → β), and (α ↔ β); furthermore, that's all of the wffs. (The parentheses count as symbols. The sentence symbol A_n counts as a single symbol.)

The problem is pretty straightforward. A, (A ∧ B), and (A ∧ (B ∧ C)) are wffs of length 1, 5, and 9 respectively. If α is a wff of length n, then (¬ α) is a wff of length n+3; thus we can generate wffs with lengths in the following sequences: {1, 4, 7, ...}, {5, 8, 11, ...}, {9, 12, 15, ...}. This is all the integers except 2, 3, and 6.

Now, each of the ways of forming new wffs from old takes one or two wffs, adds them together, and adds three new symbols. So we can't get a wff of length 2 or 3 because it would have to come from a wff of length -1 or 0. Furthermore, there are no wffs of length 6, because they'd either have to be of the form (¬ α) where α has length 3, or (α * β) where * is ∧, ∨, →, or ↔ and α and β have total length 3, which would mean that one of α and β has length 2.

The natural follow-up question, to me at least, is "how many wffs are there"? This is silly to ask because if we have infinitely many sentence symbols, there are of course infinitely many wffs of any length for which there is at least one wff. So let's restrict ourselves to one sentence symbol, A. Let f_n be the number of wffs of length n. Let f(z) = Σ_n f_n z^n be the generating function of this sequence. Then we have

f(z) = z + z^3 f(z) + 4z^3 f(z)^2

where the first term on the right-hand side corresponds to the formula which consists of just the sentence symbol, the second term to all those formulas of type (¬ α), and the third term to all those formulas of type (α * β). (The coefficient 4 comes up because there are four things that "*" could be.)

Solving this for f, we get

f(z) = {1-z^3 - \sqrt{1-2z^3+z^6-16z^4} \over 8z^3}

and so we get

f(z) = z + z^4 + 4z^5 + z^7 + 12z^8 + 32z^9 + z^10 + 24z^11 + 160z^12 + 321z^13 + 40z^14 + 480z^15 + ...

which is a bit surprising; the coefficients don't grow "smoothly". This is because, for example, wffs of length 9 are of the form (A * (A * A)) or ((A * A) * A) where * is as above (and yes, these two forms are different), so there are 32 of these; the only wff of length 10 is (¬ (¬ (¬ A))), since any other such formula would come from two formulas with total length 7, and one of those would have to be of length 2, 3, or 6 which is impossible. But I'm more interested in the asymptotics.

It's a general principle that the coefficients of a generating function grow roughly like f_n ~ (1/ρ)^n, where ρ is the smallest absolute value of a singularity of f(z) = Σ_n f_n z^n; this is true up to a "subexponential factor" which has to do with the nature of that singularity (polar, algebraic, etc.) The singularities of f(z) in this case arise from taking the square root of numbers near 0; thus they are located at the roots of the polynomial 1 - 2z^3 + z^6 - 16z^4. The smallest root is about ρ = 0.4728339090, so the number of wffs of length n grows like (1/ρ)^n, or about 2.11^n, times some subexponential factor. If you take larger coefficients they do have a ratio of about 2.11.
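The coefficients themselves are easy to compute, since the functional equation is equivalent to the recurrence f_n = f_{n-3} + 4 Σ_{i+j=n-3} f_i f_j (together with the initial condition f_1 = 1) -- a sketch in Python:

N = 60
f = [0] * (N + 1)
f[1] = 1
for n in range(4, N + 1):
    conv = sum(f[i] * f[n - 3 - i] for i in range(n - 2))
    f[n] = f[n - 3] + 4 * conv

print(f[1:16])          # 1, 0, 0, 1, 4, 0, 1, 12, 32, 1, 24, 160, 321, 40, 480
print(f[N] / f[N - 1])  # hovers around 1/rho, about 2.11, with some wobble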

The subexponential factor is probably something like n^{-3/2} -- that is, the number of wffs of length n is probably about 2.11^n n^{-3/2} -- because that particular subexponential factor often arises when one analyzes the asymptotics of tree-like structures. Don't ask me to explain that.

The interpretation this brings to mind is that if you write a wff by just writing symbols at random, at any given step there are in some sense "on average" 2.11 symbols that you could write down which would keep the wff grammatical. There are eight possible symbols, so this suggests that the best special-purpose compression algorithm for this formal language would shorten formulas to log(2.11)/log(8), or about 36%, of their original length.

30 August 2007

profiles of interconnection networks

Flajolet and Sedgewick's Analytic Combinatorics (link goes to 12 MB PDF of the book, which is still in progress) is a most interesting book. I took a course based on the first three chapters of the book last fall; the purpose of the class was basically to teach us how to translate combinatorial structures into generating functions, in a fairly routine way. Unfortunately, we didn't get to the part of the text where we could then tell things about the asymptotics of those combinatorial structures, which is a tremendously useful thing to do.

So I've been working through parts of the book lately. In example V.8 Flajolet and Sedgewick talk about a problem considered by Lagarias, Odlyzko and Zagier in their paper On The Capacity of Disjointly Shared Networks (although not quite in the same language). Flajolet and Sedgewick ask: "There are 2n points on a line, with n point-to-point connections between pairs of points. What is the probable behavior of the width of such an interconnection network?" The width is defined in the following way: imagine the connections as circular arcs between the points on a horizontal line; let a vertical line sweep from left to right; then the width is the maximum number of arcs encountered by such a line.

For example, if we have n = 6 (and thus 12 points), we might make the connections 9-12, 5-10, 1-2, 6-8, 3-11, 4-7; as the line sweeps between 6 and 7 there are four open connections (and never are there more), so the width is 4. The "profile" of this network is the sequence given by the number of open connections just after the line sweeps past 0, past 1, past 2, and so on (call these a0, a1, a2, ...); this is the sequence

0, 1, 0, 1, 2, 3, 4, 3, 2, 3, 2, 1, 0

and in general this sequence increases to somewhere around n/2 and then decreases. Flajolet and Sedgewick say that Louchard proved this, but I haven't looked at that paper. What surprised me, when I saw this problem, is that the profile actually looks something like a parabola, as you can see from the simulations at the top of p. 311 in Flajolet and Sedgewick. Louchard proved that the profile "conforms asymptotically to a deterministic parabola 2nx(1-x)... to which are superimposed random fluctuations of O(√n)". So, for example, in such a network with one thousand nodes (n = 500), the profile goes as high as 250 (when x = 1/2) before going back down to zero.

Why should this be?

Imagine that you're putting together a random interconnection network. The profile sequence clearly has the property that each member ak is one more or one less than the last one. It'll be one more if k is the first member of one of the pairs, and one less if it's the second member. What's the probability that k is the first (i. e. smaller) member of its pair? If k is small, it'll be large; if k is near 2n then it'll be small. In fact, if we know k is a member of some pair, the other member of the pair is equally likely to be any integer from 1 to 2n except k; thus the probability that k is the smaller member of its pair is approximately 1 - k/2n, and the probability that it's the larger member is approximately k/2n.

Thus a_k = a_{k-1} + 1 with probability 1 - k/2n, and a_k = a_{k-1} - 1 the rest of the time. The expected value of a_k - a_{k-1} is approximately (1-k/2n) - (k/2n), or 1-k/n. So, for example, one-fourth of the way through the process, when k = n/2, we expect a_k to increase by about one-half each time we increase k by one.

Integrating with respect to k, we get

a_t ≈ ∫_0^t (1 - k/n) dk = t - t^2/2n

which is just a rescaling of Louchard's parabola. Flajolet and Sedgewick's book is full of examples of things like this, where some apparently random structure turns out to have a lot of order when you "zoom out".
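A quick simulation (a sketch in Python) shows the average profile hugging that parabola:

import random

n, trials = 200, 100
avg = [0.0] * (2 * n + 1)
for _ in range(trials):
    pts = list(range(1, 2 * n + 1))
    random.shuffle(pts)                 # random pairing of the 2n points
    pairs = [(min(a, b), max(a, b)) for a, b in zip(pts[::2], pts[1::2])]
    open_at = [0] * (2 * n + 1)
    for a, b in pairs:
        for t in range(a, b):           # the connection is open just after passing t
            open_at[t] += 1
    for t in range(2 * n + 1):
        avg[t] += open_at[t] / trials

for t in (50, 100, 200, 300, 350):
    print(t, round(avg[t], 1), t - t * t / (2 * n))  # simulation vs. parabola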

22 August 2007

Texas 30, Baltimore 3.

While at the Phillies game tonight (the Phillies lost, 15-3, to the Dodgers), I saw on one of the out-of-town scoreboards a score of Texas 26, Baltimore 3.

I figured it had to be an error, as did the other people in the stands who noticed it. Twenty-six runs? (At that moment, the other out-of-town scoreboard reported the score in that game as 16 to 3, because it can't show scores above 19. Those don't happen too often in baseball.)

In the end, Texas won that game -- the first game of a doubleheader, no less -- 30 to 3. MLB.com's account of the game is here. It's the most runs scored since the Chicago Colts (now Cubs) beat the Louisville Colonels (who haven't existed since 1899) by a score of 36 to 7, on June 29, 1897. Nobody in "modern" baseball (by convention, this is since 1901) has scored more than 29. Thirty runs is also an American League record. There have been eighteen 25-run games in modern baseball.

You'd think that a team scoring thirty runs would be likely to score in every inning, or at least in more innings than not. But Texas only scored in four innings -- 5 runs in the fourth, 9 in the sixth, 10 in the eighth and 6 in the ninth.

This reminded me of a record I once came across: teams which have scored in every inning of a nine-inning game. This is rarer than you'd think -- only seven times in major league history, all in the National League. (Note that it's almost always the visiting team that sets this particular record; a home team which has scored in the first eight innings will almost certainly not come up in the ninth.) Those seven teams won by scores of 19-8, 26-12, 20-10, 36-7 (the same 36-7 game as above), 22-8, 15-2, and 13-6. The losing teams in these efforts put up an average of 7.6 runs a game -- quite a bit higher than the average of four or five runs -- which lends a bit of credence to the idea that the two teams' scores in a game aren't independent.

And here's a mathematical question: if a team scores 30 runs, what is the probability they manage to do it while only scoring in four or fewer innings? We'll assume the team is a visiting team, and thus has all nine innings to work in. Furthermore, the number of runs the team scores in each inning is independent. The most recent data I could find for run distribution is from John Jarvis' collection, for the 2005 American League, so I'll use that. The distribution of the number of runs scored in a single inning, for the 2005 AL, is as follows:
Number of runs scored   0      1     2     3     4     5      6      7      8       9       10
Number of times         14458  3071  1442  676   325   150    78     22     4       4       6
Probability             .714   .152  .0713 .0334 .0161 .00741 .00385 .00109 .000198 .000198 .000297


Incidentally, under the assumption of independence, the probability of a team scoring in all nine innings in a game in the 2005 AL is (1-.714)^9, or about one in 79,000; there are 2,430 games in a season currently (162 per team, times thirty teams, divided by two), so we expect one such game every thirty years or so.

We can thus take the probability generating function corresponding to this distribution:

f(z) = (14458 + 3071z + 1442z^2 + 676z^3 + 325z^4 + 150z^5 + 78z^6 + 22z^7 + 4z^8 + 4z^9 + 6z^10)/20236.

Let p = 14458/20236 be the probability of not scoring in a given inning. Then (f(z) - p)^2 is the generating function for the runs scored in two independent innings, given that scoring takes place in both of those innings; a similar interpretation holds for higher powers. From this, we can easily find the probability that a team scores a total of thirty runs in the first k innings of a game, and none in the remaining 9-k innings; it's

[z^30] (f(z)-p)^k p^{9-k}.

The probability that a team scores a total of thirty runs in some k innings is this times the binomial coefficient C(9,k), which is the number of ways to pick the innings in which the scoring happens. Call this P(k). Then we get

k                3       4       5       6        7        8       9
10^10 P(k)       2.91    74.85   614.58  1670.55  1888.56  908.64  150.82
normalized P(k)  .00055  .01409  .11572  .31455   .35560   .17108  .02840


The normalized probabilities are just P(k) divided by the sum of the P(k) values; thus the value in column k is exactly the probability I sought originally. Rather surprisingly, to me, a team which scores thirty runs is most likely to score in seven of the nine innings, which is also the median of the distribution; innings with no score are so common that at least one happens in all but three percent of thirty-run games. To answer the original question, the probability that a team scoring thirty runs does so while scoring in four or fewer innings is about 1.5%. (Assuming, of course, that innings are independent and run distributions remain like the 2005 American League indefinitely.) And you can't even begin to suspect I'm wrong until, oh, a couple hundred thirty-run games have happened without this occurring again.

(Complete data of this form: a team scoring one run is of course most likely to score in one inning. A team scoring two or three runs is most likely to have scored in two innings; 4-7 runs, three innings; 8-12 runs, four innings; 13-19 runs, five innings; 20-28 runs, six innings; 29-40 runs, seven innings; 41-57 runs, eight innings; 58 or more runs, nine innings. I suspect at least the first few of these numbers could be checked against actual data.)
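The table is straightforward to reproduce with a little polynomial arithmetic -- a sketch in Python:

from math import comb

counts = [14458, 3071, 1442, 676, 325, 150, 78, 22, 4, 4, 6]
total = sum(counts)
p = counts[0] / total                   # probability of a scoreless inning

def poly_mul(a, b, cap=31):             # multiply polynomials, truncating past z^30
    c = [0.0] * min(len(a) + len(b) - 1, cap)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j < cap:
                c[i + j] += ai * bj
    return c

scoring = [0.0] + [x / total for x in counts[1:]]   # f(z) - p

P = {}
for k in range(3, 10):                  # at least 3 innings are needed for 30 runs
    power = [1.0]
    for _ in range(k):
        power = poly_mul(power, scoring)
    P[k] = comb(9, k) * power[30] * p**(9 - k)

Z = sum(P.values())
for k in sorted(P):
    print(k, round(P[k] * 1e10, 2), round(P[k] / Z, 5))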

07 August 2007

more exotic dice

Mark Dominus at The Universe of Discourse shows that there exists a pair of six-sided dice not labeled 1, 2, 3, 4, 5, 6 (or some trivial transformation thereof) that gives the same probability of throwing any sum from 2 to 12 as a pair of standard six-sided dice. In particular, if one die has faces {1,2,2,3,3,4} and the other has faces {1,3,4,5,6,8} the distribution comes out right. This results from the factorization

(x^6 + x^5 + x^4 + x^3 + x^2 + x)^2 = (x^4 + 2x^3 + 2x^2 + x)(x^8 + x^6 + x^5 + x^4 + x^3 + x)

and if it isn't clear to you how the fact about dice follows from this, you should read Mark's entry. But the interesting thing is not the result so much as how he got it; I've seen an ad hoc derivation of this alternate pair of dice, but Mark gets it by factoring the left side and rearranging it to get the right side.

In particular, this indicates that you couldn't make, say, a pair of dice that have the same sums as two d7s (i. e. seven-sided dice). That would require us to find two polynomials P(x), Q(x) such that

(x^7 + x^6 + x^5 + x^4 + x^3 + x^2 + x)^2 = P(x) Q(x)

but the thing we're squaring on the left-hand side only factors as x(x^6 + x^5 + x^4 + x^3 + x^2 + x + 1). So we could have

P(x) = (x^6 + x^5 + x^4 + x^3 + x^2 + x + 1), Q(x) = x^2(x^6 + x^5 + x^4 + x^3 + x^2 + x + 1)

which corresponds to dice with sides {0, 1, 2, 3, 4, 5, 6} and {2, 3, 4, 5, 6, 7, 8}; this clearly isn't the intent of the problem. There's no other way to rearrange the factors. In the case of the six-sided dice, we had

x^6 + x^5 + x^4 + x^3 + x^2 + x = x(x^2 + x + 1)(x^2 - x + 1)(x+1)

which allowed for a lot more rearrangement.

So, to take a similar problem, do there exist a pair of eight-sided dice with the analogous property? This time we're looking for a refactoring of

(x^8 + x^7 + x^6 + x^5 + x^4 + x^3 + x^2 + x)^2 = P(x) Q(x).

There are certain other conditions this refactoring has to satisfy, as Mark points out; we must have P(1) = Q(1) = 8 (which corresponds to the dice having eight sides) and none of the coefficients of P or Q can be negative; furthermore neither one can have a nonzero constant term (this corresponds to having faces labeled zero). The thing being squared on the left-hand side factors nicely, as x(x+1)(x^2+1)(x^4+1). So we have

x^2 (x+1)^2 (x^2+1)^2 (x^4+1)^2 = P(x) Q(x)

and one of the factors x must be assigned to P(x), while the other must be assigned to Q(x). Now, at x=1 this becomes

1^2 (1+1)^2 (1+1)^2 (1+1)^2 = P(1) Q(1)

and we remember we needed P(1) = 8, Q(1) = 8; each of the six remaining factors on the left-hand side contributes a factor of 2, so we must have three of these making up P(x), and three making up Q(x). The possible factorizations, then, are

P(x) = x(x+1)^2(x^2+1), Q(x) = x(x^2+1)(x^4+1)^2
P(x) = x(x+1)^2(x^4+1), Q(x) = x(x^2+1)^2(x^4+1)
P(x) = x(x+1)(x^2+1)^2, Q(x) = x(x+1)(x^4+1)^2

in addition to the original one; these correspond to pairs of dice labeled {1, 2, 2, 3, 3, 4, 4, 5} and {1, 3, 5, 5, 7, 7, 9, 11}; {1, 2, 2, 3, 5, 6, 6, 7} and {1, 3, 3, 5, 5, 7, 7, 9}; {1, 2, 3, 3, 4, 4, 5, 6} and {1, 2, 5, 5, 6, 6, 9, 10}. The proliferation of these pairs here seems to stem from the simple factorization, which in turn is a consequence of the fact that 8 is a power of 2. Alternatively, we can imagine replacing each d8 with three coins, labeled {0, 1}, {0, 2}, and {0, 4}; rolling a d8 corresponds to flipping these three coins and taking 1 plus the sum of their labels. The "alternative" dice correspond to the different ways in which we can split up this collection of six coins into two triplets; there are a lot of ways to rearrange these coins. Essentially, the d8 "factors" into these component coins. In the d6 case I don't see the dice being able to be factored in this way into smaller dice (a coin is just a two-sided die); this is because factoring the corresponding polynomials gives polynomials with negative coefficients, which have no "physical" analogue.
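And a brute-force check that the three new pairs really do match two standard d8s -- a sketch in Python:

from itertools import product
from collections import Counter

standard = Counter(a + b for a, b in product(range(1, 9), repeat=2))
pairs = [
    ([1, 2, 2, 3, 3, 4, 4, 5], [1, 3, 5, 5, 7, 7, 9, 11]),
    ([1, 2, 2, 3, 5, 6, 6, 7], [1, 3, 3, 5, 5, 7, 7, 9]),
    ([1, 2, 3, 3, 4, 4, 5, 6], [1, 2, 5, 5, 6, 6, 9, 10]),
]
for d1, d2 in pairs:
    assert Counter(a + b for a, b in product(d1, d2)) == standard
print("all three pairs match the standard d8 sum distribution")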

03 August 2007

a bunch of links, and a joke

Ranking is a trap, from Statistical Modeling, Causal Inference, and Social Science, originally from junk charts.

There's a plot that illustrates how common various first names are, and have been in the past, but instead of plotting the frequency with which the names occur against time, they plot the rank of the name in a list of all names. At least there are enough different names that this shouldn't obscure trends too much; if I remember correctly, frequencies of first names follow a power law, with the nth most common name having a frequency proportional to n^{-α} for some constant α > 1. The problem is worse when considering ranks in smaller populations, because then the deviation from some "ideal" distribution is likely to be a lot worse.

(Incidentally, do last names behave differently than first names? I would suspect they do, because people don't choose their children's last names -- or, if they do, they choose them from a pool of two -- while they choose their children's first names from a larger pool. So "undesirable" last names should stick around longer.)

Sum Divergent Series, III, from The Everything Seminar; this is the sequel to the post I mentioned when I was talking about generating functions a few days ago, and shows how to do some more strange sums (for example, 1 + 2 + 3 + 4 + ... = -1/12). It's followed by an interesting post about Mellin transforms. The idea on which the Mellin transform is based is stated as follows:
Generating functions are useful when you can break down Objects into constituent elements whose Size adds to give you the original Size and zeta functions are useful when you can break down Objects into constituent elements whose Size multiplies to give you the original Size.

Since the integers have both additive and multiplicative structure, I can see how having a tool for going back and forth between those structures -- the Mellin transform, as it turns out -- would be quite useful.

Bivariate baseball score plots is the blog of bivariate baseball score plots, which features plots of the final scores in baseball games; unfortunately it's difficult to get any great insights just from staring at the plots. (This may in itself be a great insight -- namely that the good teams and the bad teams aren't that different.) In other baseball-related news, Strange Maps features The United Countries of Baseball, a map in which various parts of the country are painted with the colors of various baseball teams. There are other such maps -- common census's map is based on actually asking people; Geographer Dan at Baseball Think Factory has a map of which teams are blacked out on MLB.TV in various locations (although these territories overlap) and somewhere (I can't find it) there's another map of his which just shows which is the closest team to each point.

Finally, a joke I made by accident yesterday:
my friend: I have baby bok choi that I really need to use soon. What should I do with them?
me: Wait for it to grow up. Then do whatever you do with regular bok choi.
my friend's girlfriend: That is such a mathematician answer.

Of course, I have no idea what one does with regular bok choi. I am not a fan of leaf vegetables.

30 July 2007

seven trees in one! and how laziness can be bad.

Sum divergent series, II, from The Everything Seminar. We learn

(1 + 1 + 2 + 5 + 14 + 42 + ...) = (1-√(-3))/2

in some sense (this follows from the usual generating function for the Catalan numbers, by letting z approach 1). This is a bit surprising, because we add a bunch of positive numbers and get something which isn't even real, but weird things happen with infinity. It turns out that this is a sixth root of unity, and therefore in some sense

(1 + 1 + 2 + 5 + 14 + 42 + ...)^7 = (1 + 1 + 2 + 5 + 14 + 42 + ...)

which suggests that there ought to be a simple bijection between 7-tuples of trees (counted on the left) and trees (counted on the right). What that means is that we should be able take seven trees and in some way make one big tree out of them, and that furthermore this scheme is reversible -- that is, given the one big tree, we can determine which seven small trees it came from. And, in fact, there is! You can read about it here. The actual theorem is that "There is a very explicit bijection between the set of all seven-tuples of trees and the set of all trees", where "very explicit" has a certain technical sense. The bijection is fairly simple (see page 3) and takes about a quarter-page to specify; proving that it is, in fact, a bijection takes a page; the rest of the paper is taken up with describing why anyone would think such a crazy thing would exist in the first place.

A divergent series that comes up a lot in combinatorics is

P(z) = 1 + z + 2z^2 + 6z^3 + 24z^4 + 120z^5 + ...

where the coefficients are the factorials; I don't know of any way to assign a meaningful number to this sum, although one might exist. It is the generating function of the permutations; that is, the coefficient of z^n is the number of permutations of [1, 2, ..., n], or the number of ways in which we can write the numbers from 1 to n in some order. It turns out that although this series is divergent, we can calculate with this in a useful way. For example, if we want to know the number of so-called indecomposable permutations, we can write

I(z) = 1 - 1/P(z)

where the division is in the sense of a formal power series, and then I(z) turns out to be the generating function of the indecomposable permutations. (The details can be found on p. 82 of Analytic Combinatorics by Flajolet and Sedgewick; the book is still in preparation, so the page number might change.) It would be possible to do this without using divergent generating functions -- but a lot harder. This is what generating functions do for a lot of basic combinatorics -- they transform it into routine computations. The advantage of this is clear -- by making problems that used to be hard into routine exercises, that frees up people to do more work that builds on those exercises. It's often been said that mathematicians are naturally lazy, which is a statement that might seem quite strange to the layperson. But what this means is that we try to come up with ways to automate a computation so that we don't actually have to do it ourselves. The unfortunate byproduct of this is that it often seems that doing the actual computation is useful for getting a feeling for the problem. If I were actually doing some work that involved indecomposable permutations, I would almost certainly write them out for, say, n=3 and n=4; that would give me more of an intuitive sense for the problem, even if it would be a very inefficient way to work if I just wanted to know how many of them there are.
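The computation itself is a few lines with truncated power series -- a sketch in Python, which recovers the counts of indecomposable permutations:

from math import factorial

N = 10
P = [factorial(n) for n in range(N)]   # 1, 1, 2, 6, 24, ...

# Invert P as a formal power series: Q = 1/P, using P[0] = 1.
Q = [1] + [0] * (N - 1)
for n in range(1, N):
    Q[n] = -sum(P[k] * Q[n - k] for k in range(1, n + 1))

I = [-q for q in Q]
I[0] = 0                               # I(z) = 1 - 1/P(z)
print(I[1:])  # 1, 1, 3, 13, 71, 461, 3447, 29093, 273343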

17 July 2007

how often is a team at .500?

The Phillies' current record is 46 wins and 46 losses.

When I heard this, I thought "hmm, the Phillies have been at .500 quite often this season". Baseball-reference.com tells us that they have been 0-0 (yes, that counts!), 20-20, 21-21, 22-22, 23-23, 24-24, 26-26, 28-28, 29-29, 44-44, and 46-46 this season; that's eleven times. Is that a lot? (I remember first noticing that between the 40th and 48th games of this season; after they were 20-20 they lost, won, lost, won, lost, won, lost, and won, in that order.)

Given that the team is 46-46, how many times should we expect them to have had the same number of wins and losses? It's a lot easier to work this out, of course, if we replace "46" with some smaller number.

For example, say the team had won two games and lost two games. Then there are C(4,2) = 6 ways we can arrange their two wins and two losses: WWLL, WLWL, WLLW, LWWL, LWLW, LLWW. In the first and last of these, the team was 0-0 and 2-2 at various times; in all the others they were also 1-1 after two games. This seems kind of obtuse, but let's flip things around. In six of these possibilities (which are all equally likely, because we've assumed the team wins exactly half its games), they're 0-0 after 0 games. In four of them, they're 1-1 after 2 games. In six of them, they're 2-2 after 4 games. The expected number of times that the team is at .500? It's (6+4+6)/6, or 16/6.

Sixteen is a power of two.

If we try this again for a 3-3 team, there are C(6,3) = 20 ways we can arrange three wins and three losses; there are 20, 12, 12, and 20 ways to arrange them so that the team is at some point 0-0, 1-1, 2-2, and 3-3 respectively. So the total number of times we expect them to be at .500? It's (20+12+12+20)/20, or 64/20.

Sixty-four is again a power of two. Hmm, this can't be a coincidence.

Let's try to find that sum in the numerator in general. If the team has n wins and n losses (so eventually I'll set n=46 to solve the original problem), then how many ways are there to arrange the wins and losses so that the team wins m of the first 2m games? Clearly this is C(2m,m) C(2(n-m), n-m); we first pick which of the first 2m games are wins, then which of the remaining 2(n-m) games are wins. So what we want to find is the sum

C(0,0) C(2n,n) + C(2,1) C(2n-2, n-1) + ... + C(2n-2, n-1) C(2,1) + C(2n, n) C(0,0)

and I don't see how to do this directly. However, consider the (infinite) power series

1 + 2z + 6z^2 + 20z^3 + 70z^4 + ...

where the coefficients are C(0,0) = 1, C(2,1) = 2, C(4,2) = 6, C(6,3) = 20, C(8,4) = 70, and so on. (This is called the generating function of this sequence; generating functions are a ridiculously powerful tool which I will only scratch the surface of here.) This turns out to be the Taylor series of the function (1-4z)^{-1/2} at z=0. Now, consider what happens if we multiply this power series by itself, so we have

(1 + 2z + 6z^2 + 20z^3 + 70z^4 + ...)(1 + 2z + 6z^2 + 20z^3 + 70z^4 + ...)
= (1)(1) + [(2)(1) + (1)(2)]z + [(6)(1)+(2)(2)+(1)(6)] z^2 + [(20)(1)+(6)(2)+(2)(6)+(1)(20)] z^3 + ...

and the coefficient of z^n is exactly the sum we want to find! But the power series multiplied by itself is just (1-4z)^{-1}, so the coefficient of z^n is 4^n.

Finally, we conclude that if we work out the expected number of times at .500 for a team with n wins and n losses, it's 4^n/C(2n,n). But it's well-known that C(2n,n) is approximately 4^n/(πn)^{1/2}. So a team with n wins and n losses is expected to have been at .500 very nearly (πn)^{1/2} times.

When n=46, this approximation gives 12.021. (The exact number 4^46/C(92,46) is, to three decimal places, 12.054.) The Phillies have been at .500 eleven times so far; this is actually less than the expectation, which surprised me. A team which is .500 at the end of the season is expected to have been at .500 sixteen times during the season. For the Phillies, though, since they're already at 46-46, that adjusts the estimate upward, to around twenty-two.
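A simulation agrees with this -- a sketch in Python:

import math, random

n = 46
print(4**n / math.comb(2 * n, n))      # the exact expectation, about 12.054

trials, total = 20000, 0
for _ in range(trials):
    season = [1] * n + [0] * n         # 46 wins and 46 losses, shuffled
    random.shuffle(season)
    wins, count = 0, 1                 # the 0-0 start counts
    for g, w in enumerate(season, start=1):
        wins += w
        if 2 * wins == g:
            count += 1
    total += count
print(total / trials)                  # close to 12.05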

In general, though, one might not want to use the expectation of a random variable like this. It's possible that most teams which are .500 at the end of the season really hit that mark in the middle of the season, but a very few teams are .500 some ridiculously large number of times. However, the most times a team can be at .500 over a 162-game regular season is 82 (0-0, 1-1, 2-2, ..., 81-81), so the expectation probably is a decent guess. Also, the expectation is often a lot more accessible than more detailed information. It is in this case, because I haven't figured out how to get the whole distribution yet, so I don't know the probability, say, that a 46-46 team has been at .500 eleven or more times. That seems harder to figure out, and the best way to find that number would probably be via a simulation; getting an exact, analytic answer doesn't seem easy.