[Python-Dev] Re: radix tree arena map for obmalloc

17 Jun 2019 18:10:10 -0700

Heh. I wasn't intending to be nasty, but this program makes our arena
recycling look _much_ worse than memcrunch.py does. It cycles through
phases. In each phase, it first creates a large randomish number of
objects, then deletes half of all objects in existence. Except that
every 10th phase, it deletes 90% instead. It's written to go through
100 phases, but I killed it after 10 because it was obviously going to
keep on growing without bound.

Note 1: to do anything deterministic with obmalloc stats these days
appears to require setting the environment variable PYTHONHASHSEED to 0
before running (else stats vary even by the time you get to an
interactive prompt).

Note 2: there are 3 heavily used size classes here, for ints,
2-tuples, and class instances, of byte sizes 32, 64, and 96 on 64-bit
boxes, under my PR and under released 3.7.3.
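
A minimal sketch of what Note 1 amounts to in practice, assuming CPython 3.7 or later. It pins PYTHONHASHSEED to 0 for a child interpreter and collects whatever sys._debugmallocstats() writes to stderr. The helper name run_with_fixed_hash_seed is made up for illustration and is not part of any library.

```python
import os
import subprocess
import sys


def run_with_fixed_hash_seed(code, seed="0"):
    """Run `code` in a fresh interpreter with PYTHONHASHSEED pinned.

    Hypothetical helper for illustration only.
    """
    env = dict(os.environ, PYTHONHASHSEED=seed)
    return subprocess.run(
        [sys.executable, "-c", code],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )


# String hashes are randomized per process by default; with the seed
# pinned, two separate interpreters agree on them.
first = run_with_fixed_hash_seed("print(hash('obmalloc'))").stdout
second = run_with_fixed_hash_seed("print(hash('obmalloc'))").stdout

# sys._debugmallocstats() writes to stderr, so the report ends up there.
stats = run_with_fixed_hash_seed("import sys; sys._debugmallocstats()").stderr
```

From a shell, setting PYTHONHASHSEED=0 in the environment before launching python has the same effect on the process being measured.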

First with my branch, after phase 10 finishes building objects:

phase 10 adding 9953410
phase 10 has 16743920 objects

# arenas allocated total = 3,114
# arenas reclaimed = 0
# arenas highwater mark = 3,114
# arenas allocated current = 3,114
3114 arenas * 1048576 bytes/arena = 3,265,265,664
# bytes in allocated blocks = 3,216,332,784

No arenas have ever been reclaimed, but space utilization is excellent
(about 98.5% of arena space is being used by objects).

Then after phase 10 deletes 90% of everything still alive:

phase 10 deleting factor 90% 15069528
phase 10 done deleting

# arenas allocated total = 3,114
# arenas reclaimed = 0
# arenas highwater mark = 3,114
# arenas allocated current = 3,114
3114 arenas * 1048576 bytes/arena = 3,265,265,664
# bytes in allocated blocks = 323,111,488

Still no arenas have been released, and space utilization is horrid.
A bit less than 10% of allocated space is being used for objects.

Now under 3.7.3. First when phase 10 is done building:

phase 10 adding 9953410
phase 10 has 16743920 objects

# arenas allocated total = 14,485
# arenas reclaimed = 2,020
# arenas highwater mark = 12,465
# arenas allocated current = 12,465
12465 arenas * 262144 bytes/arena = 3,267,624,960
# bytes in allocated blocks = 3,216,219,656

Space utilization is again excellent. A significant number of arenas
were reclaimed - but usefully? Let's see how things turn out after
phase 10 finishes deleting 90% of the objects:

phase 10 deleting factor 90% 15069528
phase 10 done deleting

# arenas allocated total = 14,485
# arenas reclaimed = 2,020
# arenas highwater mark = 12,465
# arenas allocated current = 12,465
12465 arenas * 262144 bytes/arena = 3,267,624,960
# bytes in allocated blocks = 322,998,360

Didn't manage to reclaim anything! Space utilization is again
horrid, and it's actually consuming slightly more arena bytes than when
running under the PR.
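
The utilization percentages quoted above are just "bytes in allocated blocks" divided by total arena bytes. Here is a small sketch of that arithmetic, with the numbers copied from the four stats dumps; the function name and the dict layout are mine, purely for illustration.

```python
# Figures copied from the sys._debugmallocstats() output quoted above:
# (number of arenas, bytes per arena, bytes in allocated blocks)
SNAPSHOTS = {
    "PR, after building": (3114, 1048576, 3_216_332_784),
    "PR, after deleting 90%": (3114, 1048576, 323_111_488),
    "3.7.3, after building": (12465, 262144, 3_216_219_656),
    "3.7.3, after deleting 90%": (12465, 262144, 322_998_360),
}


def utilization(arenas, arena_size, bytes_in_blocks):
    """Fraction of arena bytes occupied by allocated blocks."""
    return bytes_in_blocks / (arenas * arena_size)


for label, (arenas, arena_size, used) in SNAPSHOTS.items():
    total = arenas * arena_size
    print(f"{label:28} {total:>13,} arena bytes, "
          f"{utilization(arenas, arena_size, used):.1%} in blocks")
```

Both builds come out a little over 98% after the build step and just under 10% after the big delete, despite the 4x difference in arena size.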

Which is just more of what I've been seeing over & over: 3.7.3 and
the PR both do a fine job of recycling arenas, or a horrid job,
depending on the program.

For excellent recycling, change this program to use a dict instead of
a set. So

    data = {}

at the start, fill it with

    data[serial] = Stuff()

and change

    data.pop()

to use .popitem().

The difference is that set elements still appear in pseudo-random
order, but dicts are in insertion-time order. So data.popitem() loses
the most recently added dict entry, and the program is then just
modeling stack allocation/deallocation.
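
To make the stack-like behavior concrete, here is a scaled-down sketch of the dict variant (10 objects rather than millions). The function name build_and_drain is made up for illustration. It relies on dict.popitem() returning entries in LIFO order, which is guaranteed from Python 3.7 on.

```python
class Stuff:
    # same padding trick as in the full program
    __slots__ = tuple("abcdefg")


def build_and_drain(n):
    """Insert n objects keyed by serial number, then pop half of them.

    Hypothetical helper; returns (keys removed in order, keys remaining).
    """
    data = {}
    for serial in range(n):
        data[serial] = Stuff()
    removed = []
    for _ in range(n // 2):
        key, _value = data.popitem()  # newest entry goes first
        removed.append(key)
    return removed, list(data)


removed, remaining = build_and_drain(10)
print("removed:", removed)      # newest to oldest
print("remaining:", remaining)  # the oldest half survives
```

The objects freed are always the youngest ones, so the survivors are the oldest allocations rather than a random scattering across arenas as with set.pop().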

def doit():
    import random
    from random import randrange
    import sys

    class Stuff:
        # add cruft so it takes 96 bytes under 3.7 and 3.8
        __slots__ = tuple("abcdefg")

        def __hash__(self):
            return hash(id(self))

    LO = 5_000_000
    HI = LO * 2
    data = set()
    serial = 0
    random.seed(42)

    for phase in range(1, 101):
        toadd = randrange(LO, HI)
        print("phase", phase, "adding", toadd)
        for _ in range(toadd):
            data.add((serial, Stuff()))
            serial += 1
        print("phase", phase, "has", len(data), "objects")
        sys._debugmallocstats()
        factor = 0.5 if phase % 10 else 0.9
        todelete = int(len(data) * factor)
        print(f"phase {phase} deleting factor {factor:.0%} {todelete}")
        for _ in range(todelete):
            data.pop()
        print("phase", phase, "done deleting")
        sys._debugmallocstats()

doit()

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/ZTLJGXEM7NCASL5NVGMRMDN3O4GGUEIX/
