[Python-Dev] iterzip()
Tim Peters
tim.one@comcast.net
29 April 2002 22:36:12 -0400
[Neil Schemenauer]
> Adding a fourth generation drops the time from 5.13 to 2.11 on my
> machine. Adding a fifth doesn't seem to make a difference. I used 10
> as the threshold for both new generations.
Alas, these thresholds are a little hard to work with. For example:
    ...
    else if (collections0 > threshold1) {
        ...
        collections1++;
        /* merge gen0 into gen1 and collect gen1 */
        ...
        collections0 = 0;
    }
    else {
        generation = 0;
        collections0++;
        ... /* collect gen0 */ ...
    }
Let's say threshold1 is 10 (because it is <wink>), and we just finished a
gen1 collection. Then collections0 is 0. We then have to do 11 gen0
collections before "collections0 > threshold1" succeeds, and that point is
actually the 12th time gen0 has filled up since the last time we did a gen1
collection.
Similarly for collections1 vs threshold2.
This makes it hard to multiply them out in an obvious way <wink>.
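The counting above can be checked with a small simulation. This is only a sketch in Python of the two branches quoted from the C fragment; the function name fills_until_gen1 is made up for illustration, and the elided gen2 branch is deliberately left out.

```python
def fills_until_gen1(threshold1):
    """Simulate the gen0/gen1 branches right after a gen1 collection.

    Returns (fills, gen0_collections): the number of times gen0 filled up,
    including the fill that triggers the gen1 collection, and the number of
    gen0 collections done along the way.
    """
    collections0 = 0  # reset by the gen1 collection that just finished
    gen0_collections = 0
    fills = 0
    while True:
        fills += 1
        if collections0 > threshold1:
            # gen1 branch: merge gen0 into gen1, collect gen1, reset counter
            collections0 = 0
            return fills, gen0_collections
        # gen0 branch: count it and collect gen0
        collections0 += 1
        gen0_collections += 1


fills, gen0_runs = fills_until_gen1(10)
print(gen0_runs, fills)  # 11 12
```

With threshold1 at 10, the gen1 collection lands on the 12th fill, i.e. every threshold1 + 2 fills rather than every threshold1 fills.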
Anyway, with 4 generations it takes in the ballpark of 700 * 10 * 10 * 10 =
700,000 excess allocations before a gen3 collection is triggered, so I
expect you saw exactly one gen3 collection during the lifetime of the test
run (there are about 1,000,000 excess allocations during its run). I'd also
expect that adding a fifth generation wouldn't matter at all in this test,
since you'd still see exactly one gen3 collection, and a gen4 collection
would never happen.
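The ballpark arithmetic can be spelled out as follows. This is a rough sketch that multiplies the thresholds straight through, as the estimate above does, and ignores the off-by-a-couple effect of the "collections0 > threshold1" test; the helper name is made up for illustration.

```python
def allocations_per_oldest_collection(thresholds):
    """Ballpark excess allocations needed to trigger the oldest generation."""
    product = 1
    for threshold in thresholds:
        product *= threshold
    return product


EXCESS_ALLOCATIONS = 1000000  # approximate figure for the test run

four_gen = allocations_per_oldest_collection([700, 10, 10, 10])
five_gen = allocations_per_oldest_collection([700, 10, 10, 10, 10])

print(four_gen, EXCESS_ALLOCATIONS // four_gen)  # 700000 1
print(five_gen, EXCESS_ALLOCATIONS // five_gen)  # 7000000 0
```

So by this estimate the test run has room for one gen3 collection, and a fifth generation would need about 7,000,000 excess allocations before it ever ran.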
Now another ballpark: On the only machine that matters in real life (mine),
I'm limited to 2**31 bytes of user address space, and an object
participating in gc can rarely be smaller than 40 bytes. That means I can't
have more than 2**31/40 ~= 54 million gcable objects alive at once, and that
also bounds the aggregate excess of allocations over deallocations. That
surprised me. It means the "one million tuple" test is already taxing a
non-trivial percentage of this box's theoretical capacity. Indeed, I tried
boosting it to 10 million, and after glorious endless minutes of listening
to the disk grind itself to dust (with gc disabled, even), Win98 rebooted
itself.
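For scale, here is the same address-space bound worked through in Python. The 2**31 bytes and 40 bytes per object are the figures given above; the constant and function names are only illustrative.

```python
ADDRESS_SPACE_BYTES = 2 ** 31  # user address space on the box described above
MIN_GC_OBJECT_BYTES = 40       # rough lower bound on a gc-tracked object

# Upper bound on gc-tracked objects alive at once.
max_objects = ADDRESS_SPACE_BYTES // MIN_GC_OBJECT_BYTES


def percent_of_capacity(n_objects):
    """Share of the theoretical object capacity used by n_objects, in percent."""
    return 100.0 * n_objects / max_objects


print(max_objects)                                # 53687091
print(round(percent_of_capacity(1000000), 1))     # 1.9
print(round(percent_of_capacity(10000000), 1))    # 18.6
```

One million tuples is already close to 2% of that ceiling, and ten million is closer to 19%, before counting anything else the process needs memory for.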
So another factor-of-10 generation or two would probably move the gross
surprises here out of the realm of practical concern. Except, of course,
for the programs where it wouldn't <wink>.