Showing posts with label probabilistic algorithms.
15 January 2008
Boltzmann sampling
I'm reading Random sampling of plane partitions, by Olivier Bodini, Eric Fusy, and Carine Pivoteau (arXiv:0712.0111v1). (Note: if you follow posts where I mention new things from the arXiv, you'll find that I'm currently reading papers I come across there at about a six-week lag.)
The authors give a way to sample from plane partitions of size n uniformly at random. The method is not as efficient as one might like -- it begins by generating a plane partition of size approximately n, where "approximately" means something like "within o(n) of", from a distribution which is uniform when restricted to partitions of exactly n, by a method known as "Boltzmann sampling", and then throws out those partitions which are not of size n. Still, a plane partition of size n can be chosen uniformly at random in time O(n^(4/3)). (Note that uniformity is important here; if we didn't care about the distribution, we could just write down a bunch of numbers that sum up to n and be done! More seriously, uniform sampling of this sort of combinatorial object with a bunch of highly interdependent parts tends to be tricky.)
But the idea I really like here is that of Boltzmann sampling, the inspiration for which I assume comes from physics. Namely, given a combinatorial class, we give each object a weight x^n, where n is its size and x is a parameter between 0 and 1; then we pick objects with probabilities proportional to their weights. It turns out to be routine to give a Boltzmann sampler -- that is, an algorithm which picks a member of the class according to this distribution -- for any combinatorial class we can specify. (This is according to the papers of Duchon et al. and Flajolet et al. I've listed below, which I haven't actually read yet.) It reminds me of the partition function of statistical mechanics (the fact that this has "partition" in the name is a coincidence, as one could do this for any combinatorial class). Say a certain sort of physical system can occupy states labeled 1, 2, ..., N. Let E_j be the energy of state j. Then give each state the weight exp(-βE_j), where β = 1/(kT) is the "inverse temperature"; the probabilities that the system occupies various states are proportional to these weights. Replacing energy by size and exp(-β) by x gives the combinatorial Boltzmann sampling. So the parameter x is some sort of temperature.
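As a toy illustration (my own sketch, not from the paper): here is a Boltzmann sampler for the much simpler class of binary strings, together with the rejection step that turns it into an exact-size uniform sampler. Each string of length n has weight x^n, so the output size is geometrically distributed; the tuning of x to make the expected size equal n is my own choice of parameter. The actual paper does this for plane partitions, which are far harder.

```python
import random

def boltzmann_binary_strings(x):
    """Boltzmann sampler for the class of all binary strings.

    A string of length n has weight x^n, and there are 2^n strings of
    length n, so the output size N is geometric: P(N = n) = (1-2x)(2x)^n.
    Requires 0 < x < 1/2 so the total weight is finite.
    """
    assert 0 < x < 0.5
    s = []
    # With probability 2x, keep going and append a uniform bit;
    # with probability 1-2x, stop.  Each length-n string then has
    # probability (1-2x) * x^n, i.e. proportional to its weight.
    while random.random() < 2 * x:
        s.append(random.choice("01"))
    return "".join(s)

def exact_size_sample(n):
    """Uniform sample of a binary string of length exactly n:
    tune x so E[N] = 2x/(1-2x) = n, then reject other sizes."""
    x = n / (2 * (n + 1))
    while True:
        s = boltzmann_binary_strings(x)
        if len(s) == n:
            return s
```

Conditioned on its length, the Boltzmann output is uniform over strings of that length, so the rejection loop really does sample uniformly -- the same shape of argument as the approximate-size-then-reject scheme described above.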
References
Olivier Bodini, Eric Fusy, and Carine Pivoteau. Random sampling of plane partitions. arXiv:0712.0111v1.
P. Duchon, P. Flajolet, G. Louchard, and G. Schaeffer. Boltzmann samplers for the random generation of combinatorial structures. Combinatorics, Probability and Computing, 13(4-5):577-625, 2004. Special issue on Analysis of Algorithms.
P. Flajolet, E. Fusy, and C. Pivoteau. Boltzmann sampling of unlabelled structures. In Proceedings of the 4th Workshop on Analytic Algorithms and Combinatorics, ANALCO'07 (New Orleans), pages 201-211. SIAM, 2007.
02 July 2007
jury duty: coincidences, and semi-juries
Today I had jury duty.
During the lunch break I went to Reading Terminal Market, where I spent 9ドル.05 for lunch (it was a large lunch, because I didn't want to be sitting around hungry); my pay for one day of jury duty was 9ドル. (I was not chosen for a trial.) Coincidence? Probably, because I wasn't thinking "I'm spending my nine bucks on lunch" when I was walking around choosing where I would buy it.
Then I wandered down to a bookstore and found myself flipping through Steven Landsburg's book More Sex Is Safer Sex: The Unconventional Wisdom of Economics. (I didn't buy it; it seemed interesting, but not 26ドル worth of interesting.) In particular, this book suggests that the jury system is broken (the link goes to the Freakonomics blog, where he was interviewed a few weeks ago). The basic idea is that jurors have no incentive to do a good job. This is clearly true, although I'm not sure how to incentivize the jury system. When I got home I ran across an entry in the Freakonomics blog which mentioned that book. Coincidence? Maybe. Maybe not. (And I think I saw the original Landsburg interview a few weeks ago and forgot about it, which may have primed me to be more likely to look at that specific book.)
Yet another coincidence: I went to high school with the judge's son. (I don't think this is how I got out of serving.) I also went to high school with the son of one of my panel-mates. It occurred to me as I was heading in this morning that if a few hundred people were called today, the chances I'd know one of them were not bad; there are probably about a million adults in Philadelphia, of whom I know a few hundred. I don't think anyone I know was there today (if so, I didn't see them) but as I said there were parents of people I knew. In some ways Philly is the largest small town in the country.
While waiting to be selected, it occurred to me that the voir dire procedure is set up so that no individual juror who is selected is biased. We were a panel of 50 for a sexual assault case, from which fourteen jurors were essentially selected; although I wasn't counting, I would guess that there were no more than twenty people who satisfied the following three conditions:
1. doesn't possess a strong technical background (we had to write our occupations on the forms that were distributed; it seemed that all the people who were asked about their occupation by the judge were either people in technical fields or people who worked as lawyers, police officers, etc.);
2. does not claim that jury duty would pose an extreme hardship;
3. had not been sexually assaulted and did not have someone close to them who had been. It's often said that one in four people are sexually assaulted during their lifetime.
Still, would it have been such a horrible thing to have a sexual assault victim on the jury? A randomly selected panel of fourteen would probably have had at least one. I don't see why each individual juror has to be unbiased in order for the group as a whole to be unbiased.
Finally, some math. In Landsburg's book he suggests the following: break each jury up into two half-juries of six. If they come to the same conclusion, that's the verdict; he wasn't clear on what to do if they came to opposing conclusions. (Presumably it would be treated like current hung juries are.) In this study by Bruce Spencer it's suggested that juries are "right" about 88% of the time. This got me thinking -- what does that imply about how likely an individual juror is to be "right" about the verdict? If we assume that jurors make their decisions independently, that majority rules (which is a bit naive, because juries in criminal cases have to be unanimous), and throw out 6-6 results, it turns out each individual juror has to come to the right decision with probability 63.6% to recover this 88% probability. This is related to the post I made a couple weeks ago about the World Series; if one team is slightly better than another, it has a decent chance of winning a single game but not so good a chance of winning a whole series. The teams here are, of course, "guilty" and "not guilty".
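The 63.6% figure can be recovered numerically. This is my own sketch, using exactly the assumptions above (independent jurors, majority rules, 6-6 ties thrown out): it bisects for the per-juror accuracy that makes a twelve-person majority verdict right 88% of the time.

```python
from math import comb

def jury_accuracy(p, n=12):
    """P(a majority of n independent jurors is right | no tie),
    where each juror is right with probability p."""
    right = sum(comb(n, k) * p**k * (1 - p)**(n - k)
                for k in range(n // 2 + 1, n + 1))
    tie = comb(n, n // 2) * (p * (1 - p))**(n // 2)
    return right / (1 - tie)

# Bisect for the p giving 88% jury accuracy; jury_accuracy is
# increasing in p on (1/2, 1).
lo, hi = 0.5, 1.0
while hi - lo > 1e-9:
    mid = (lo + hi) / 2
    if jury_accuracy(mid) < 0.88:
        lo = mid
    else:
        hi = mid
print(round(lo, 4))   # ≈ 0.6356 -- the 63.6% figure above, to rounding
```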
So what does this say for the half-jury suggestion? Let's say that each juror, independently, has a 63.6% chance of being right, and there's a jury of six. Say the defendant is guilty. Then the probability that all six jurors will say this is (.636)^6 = .066; the probability that five think he's guilty and one thinks he's innocent is 6(.636)^5(.364) = .227; the probability of the 4-2 split is 15(.636)^4(.364)^2 = .325. So the probability of one of these three results is 0.618; the probability of having a 3-3 split is 20(.636)^3(.364)^3 = 0.248. So the probability of a half-jury finding the defendant guilty is (.618)/(1-.248) = .821. Not surprisingly, this is less than the chance of a full jury finding the defendant guilty.
But the chance that both half-juries find the defendant guilty is (.821)^2 = .674; the chance that they both find him innocent (even though he did it!) is (1-.821)^2 = .032. So the probability of finding the defendant guilty, given that there's a verdict at all, is .674/(.674+.032) = .955. In the end, this plan achieves much greater accuracy at the expense of increasing the number of hung juries. It seems worth considering, though. (Incidentally, you can't beat the hung jury problem by changing the sub-jury size; either at least one sub-jury is of even size or there's an even number of sub-juries, since 12 is even.)
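The half-jury arithmetic can be checked in a few lines (my own sketch, under the same independence assumptions; the small differences from the .821 and .955 figures above are just rounding in the intermediate steps):

```python
from math import comb

p = 0.636          # per-juror accuracy, as estimated above
q = 1 - p

# P(a six-person half-jury reaches the right majority verdict),
# conditioned on no 3-3 tie.
right = sum(comb(6, k) * p**k * q**(6 - k) for k in range(4, 7))
tie = comb(6, 3) * p**3 * q**3
half_jury = right / (1 - tie)

# Two half-juries must agree; otherwise the jury hangs.
both_right = half_jury**2
both_wrong = (1 - half_jury)**2
verdict_right = both_right / (both_right + both_wrong)
hung = 1 - both_right - both_wrong

print(round(half_jury, 3), round(verdict_right, 3), round(hung, 3))
# ≈ 0.823 0.956 0.292
```

Note the price paid for the extra accuracy shows up directly in the last number: roughly 29% of trials would end with the two half-juries disagreeing.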
I suspect, though, that this sort of thing would be rejected as being unnecessarily complicated. But the current voir dire process is byzantine enough that that hardly seems like a legitimate complaint.
edit (Tuesday, 9:16 AM): Landsburg has commented on this entry. In particular he points out that my assumption that juries would have the same accuracy in the arrangement with two half-juries as in the current system is incorrect; jurors would have more incentive to be accurate in his proposed system. This is true because in his proposed system the jurors are rewarded when both juries agree. But what I intended to show was that even without such a reward, his system still leads to a greater proportion of correct verdicts.
edit (Tuesday, 2:16 PM): Richard Dawkins suggested in 1997 that "Two juries of six members, or three juries of four members, would probably be an improvement over the present system". He also points out that jurors don't act independently, which is true; in my original analysis I was suggesting that even though jurors don't act independently, we'll assume that they act independently up until the moment they begin deliberation. This assumption is of course not true, but it was only a crude analysis.
20 June 2007
The Phillies defeat the Qankees in nine games?
Scott Boras is pushing for a nine-game World Series, with the first two games to be played at a neutral site. This gives me a reason to post the following, which I wrote a couple weeks ago, about how good the World Series is at identifying the "best" team. A best-of-nine World Series wouldn't be much of an improvement over a best-of-seven World Series, in this regard. Let's say one of the two teams in the World Series has a 55% chance of beating the other in any given game; they'd have about a 61% chance of winning a seven-game series, and a 62% chance of winning a nine-game series. It would bring in more money. It would also lengthen the season by, say, another three days (two playing days and a travel day), and as many people pointed out in April the season is already too long for an outdoor sport. Game 7 of this year's World Series, if it happens, will happen on Thursday, November 1. Do we really want November baseball to be a routine event?
(Also, Boras likens these first two, neutral-site games to the Super Bowl. Which makes me wonder -- has the Super Bowl ever been played at a site that wasn't actually neutral, because it just happened that the host team did well that year?)
Anyway, on to the math.
In the World Series of baseball, two teams P and Q play against each other until one has won four games; this team is declared the champion. If team P has probability p of winning each individual game, and the games are independent, what is the probability that team P wins the series?
(From here on out, I will call the two teams the Phillies and the Qankees. "Phillies" because, well, I actually want them to win; Qankees because Qankees is fun to say. It's pronounced quan-keys, like how a little kid would say "cranky".)
This problem is often posed in introductory probability texts (in fact, a couple days ago I showed a student I'm tutoring how to do it), and the solution those texts have in mind runs something like this. If the Phillies win, they do it in four, five, six, or seven games. The probability of them winning in four games is clearly p^4. The probability of them winning in five games is 4p^4 q, where q = 1-p. This is because there are four ways for the Phillies to win in five games -- we just pick which of the first four games they lose. Similarly, the probability of the Phillies winning in six is 10p^4 q^2 -- the number "10" is the number of ways we can pick two games out of the first five for them to lose. And they win in seven with probability 20p^4 q^3.
Thus, the total probability of the Phillies winning is p^4(1 + 4q + 10q^2 + 20q^3). If we remember that q = 1-p, we get that this is p^4(35 - 84p + 70p^2 - 20p^3). Plugging in numbers, we see, for example, that if the Phillies have a 55% chance of winning each individual game, they have about a 60.8% chance of winning the entire series, and that the Qankees would then be throwing temper tantrums because they didn't get their twenty-seventh championship. (How you would know they have a 55% chance of winning each game is a different story.)
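This computation is easy to sketch in code (the function name and the generalization to a first-to-n series are mine):

```python
from math import comb

def series_win_prob(p, wins_needed=4):
    """P(the Phillies win a first-to-`wins_needed` series), where they
    win each game independently with probability p.  The j-th term
    counts series in which they lose exactly j games: choose which
    j of the first wins_needed-1+j games are losses."""
    q = 1 - p
    return p**wins_needed * sum(
        comb(wins_needed - 1 + j, j) * q**j for j in range(wins_needed)
    )

print(round(series_win_prob(0.55), 3))   # 0.608, the 60.8% above
```

The same function with `wins_needed=5` gives the best-of-nine figure quoted at the top of the post.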
But if you're designing a system of playoffs, this polynomial isn't that interesting. It seems to me that you can basically sum up the whole problem in a single number. You can view the playoffs as a probabilistic algorithm for determining which team is better. (They're a pretty weak probabilistic algorithm, at least when the teams are fairly evenly matched.) The natural question to ask seems to be: if the Phillies have a probability 1/2 + ε of winning a single game, what's the probability of them winning the series? (This turns out to be a sneaky way to compute the derivative of the above polynomial at p = 1/2.) So we let p = 1/2 + ε in the above computation, and -- here's the trick -- we act as if ε^2 = 0. Then the Phillies' probability of winning is
(1/2 + ε)^4 (1 + 4(1/2 - ε) + 10(1/2 - ε)^2 + 20(1/2 - ε)^3)
which simplifies to 1/2 + (35/16)ε. I'll call this coefficient 35/16 the amplification of this playoff system.
Doing the same computation for different series lengths gives:

Number of wins needed:  2     3     4      5        6        8          12     16
Amplification:          3/2   15/8  35/16  315/128  693/256  6435/2048
(as decimal):           1.50  1.88  2.19   2.46     2.71     3.14       3.87   4.48
The amplification is very nearly proportional to the square root of the number of wins needed. I'd bet it's 2/√π times that square root -- the constant seems to work out, and π seems to occur a lot in these types of problems, because in the end we have to compute factorials and Stirling's approximation is a very good approximation to factorials that includes π. (A confession: the numerical work actually comes from computing the probabilities in the first way given above, because I have a fast computer at my disposal and didn't want to figure out how to program the second solution.)
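The amplification values can in fact be computed exactly: differentiating the series-win polynomial at p = 1/2 gives the closed form n·C(2n-1, n)/4^(n-1) for a first-to-n series (my own derivation, which reproduces 3/2, 15/8, and 35/16 above -- worth checking independently). A sketch comparing it with the conjectured 2√(n/π):

```python
from fractions import Fraction
from math import comb, pi, sqrt

def amplification(n):
    """Exact slope at p = 1/2 of P(win a first-to-n series):
    n * C(2n-1, n) / 4^(n-1), as an exact fraction."""
    return Fraction(n * comb(2 * n - 1, n), 4 ** (n - 1))

for n in [2, 3, 4, 5, 6, 8, 12, 16]:
    a = amplification(n)
    # exact value, decimal value, and the asymptotic guess 2*sqrt(n/pi)
    print(n, a, round(float(a), 3), round(2 * sqrt(n / pi), 3))
```

The guessed constant 2/√π ≈ 1.128 does appear to be the right one: by n = 16 the exact value 4.478 sits just below the asymptote 4.513, approaching it from below.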
What does this tell us, then? It tells us that to get the amplification to be twice as good, we have to play four times as many games! This is something that occurs pretty often -- political opinion polling, for example, follows the same principle. To get the "margin of error" down from the standard 3% to 1.5% requires polling four thousand people instead of one thousand.
A principle that occurs fairly often in randomized algorithm design is that if you have an algorithm that gives you a correct answer with probability greater than 1/2, then you can run the algorithm repeatedly and be more confident in its results. This is actually the principle behind playing a series of games instead of a single game, but what if we played a series of series? Or a tennis match, with its multi-tiered structure (point, game, set, match)? I'll look at this in a future post.