Author A. Skrobov
Recipients A. Skrobov, christian.heimes, eryksun, paul.moore, rhettinger, serhiy.storchaka, steve.dower, tim.golden, zach.ware
Date 2016-03-08.08:54:49
Message-id <1457427292.14.0.664371477253.issue26415@psf.upfronthosting.co.za>
Content
OK, I've now looked into it with a fresh build of 3.6 trunk on Linux x64.
Peak memory usage is about 3KB per node:
$ /usr/bin/time -v ./python -c 'import ast; ast.parse("0,"*1000000, mode="eval")'
	Command being timed: "./python -c import ast; ast.parse("0,"*1000000, mode="eval")"
	...
	Maximum resident set size (kbytes): 3015552
	...
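The same peak figure can be read from inside the process instead of through /usr/bin/time. The following is a minimal sketch, assuming a Linux host (where `ru_maxrss` is reported in kilobytes); the helper names `peak_rss_kb` and `measure` are hypothetical and not part of the original report, and the numbers will differ between Python versions.

```python
import ast
import resource


def peak_rss_kb():
    # High-water mark of the resident set size for this process.
    # Assumption: Linux, where ru_maxrss is in kilobytes.
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss


def measure(n_items):
    # Build the same kind of input as in the report: "0,0,0,...".
    source = "0," * n_items
    before = peak_rss_kb()
    tree = ast.parse(source, mode="eval")
    after = peak_rss_kb()
    # tree.body is a Tuple node; its elts list has one entry per "0,".
    return len(tree.body.elts), after - before


if __name__ == "__main__":
    count, growth_kb = measure(100_000)
    print(f"{count} elements, peak RSS grew by about {growth_kb} kB")
```

Because `ru_maxrss` only ever increases, the difference is the growth of the peak during parsing, not the memory still held afterwards.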
Out of the 2945 MB total peak memory usage, only about 330 MB is attributable to heap use:
$ valgrind ./python -c 'import ast; ast.parse("0,"*1000000, mode="eval")'
==21232== ...
==21232== HEAP SUMMARY:
==21232== in use at exit: 3,480,447 bytes in 266 blocks
==21232== total heap usage: 1,010,171 allocs, 1,009,905 frees, 348,600,304 bytes allocated
==21232== ...
So, apparently, it's not the nodes themselves taking up a disproportionate amount of memory -- it's the heap getting so badly fragmented that 89% of its memory allocation is wasted.
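The figures quoted above can be rechecked with a few lines of arithmetic. This sketch only re-derives the numbers from the two tool outputs; like the message, it takes valgrind's "bytes allocated" total as the heap figure, and the variable names are chosen here for illustration.

```python
# Values copied from the /usr/bin/time and valgrind output above.
PEAK_RSS_KB = 3_015_552      # "Maximum resident set size (kbytes)"
HEAP_BYTES = 348_600_304     # valgrind "bytes allocated"
NODES = 1_000_000            # elements in "0,"*1000000

peak_mb = PEAK_RSS_KB / 1024          # about 2945 MB
heap_mb = HEAP_BYTES / 1024 ** 2      # about 332 MB
per_node_kb = PEAK_RSS_KB / NODES     # about 3 kB per element
wasted = 1 - heap_mb / peak_mb        # share of peak RSS not accounted for

print(f"peak {peak_mb:.0f} MB, heap {heap_mb:.0f} MB, "
      f"{per_node_kb:.1f} kB/node, {wasted:.0%} unaccounted")
```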
gprof confirms that there are lots of mallocs/reallocs going on, up to 21 per node:
$ gprof python
Flat profile:
Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total
 time   seconds   seconds    calls   s/call   s/call  name
 17.82       0.31     0.31  2000020     0.00     0.00  PyParser_AddToken
 13.79       0.55     0.24        2     0.12     0.16  freechildren
 12.64       0.77     0.22 21039125     0.00     0.00  _PyMem_RawMalloc
  6.32       0.88     0.11 17000101     0.00     0.00  PyNode_AddChild
  5.75       0.98     0.10 28379846     0.00     0.00  visit_decref
  5.75       1.08     0.10  1000004     0.00     0.00  ast_for_expr
  4.60       1.16     0.08     2867     0.00     0.00  collect
  4.02       1.23     0.07 20023405     0.00     0.00  _PyObject_Free
  2.30       1.27     0.04  3031305     0.00     0.00  _PyType_Lookup
  2.30       1.31     0.04  3002234     0.00     0.00  _PyObject_GenericSetAttrWithDict
  2.30       1.35     0.04        1     0.04     0.05  ast2obj_expr
  1.72       1.38     0.03 28366858     0.00     0.00  visit_reachable
  1.72       1.41     0.03 12000510     0.00     0.00  subtype_traverse
  1.72       1.44     0.03     3644     0.00     0.00  list_traverse
  1.44       1.47     0.03  3002161     0.00     0.00  _PyObjectDict_SetItem
  1.15       1.49     0.02 20022785     0.00     0.00  PyObject_Free
  1.15       1.51     0.02 15000763     0.00     0.00  _PyObject_Realloc
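The "up to 21 per node" estimate follows from dividing the call counts in the profile by the number of parsed elements. A short sketch of that division, using only counts copied from the table above (the dictionary layout is just for illustration):

```python
NODES = 1_000_000  # elements in "0,"*1000000

# Call counts taken from the "calls" column of the gprof flat profile.
calls = {
    "_PyMem_RawMalloc": 21_039_125,
    "_PyObject_Free": 20_023_405,
    "PyNode_AddChild": 17_000_101,
    "_PyObject_Realloc": 15_000_763,
}

per_node = {name: count / NODES for name, count in calls.items()}

for name, value in per_node.items():
    print(f"{name:20s} {value:5.1f} calls per element")
```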
So, I suppose what needs to be done is to try reducing the number of reallocs involved in handling an AST node; the representation of the nodes themselves doesn't need to change.
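To illustrate why the number of reallocs depends on how a child array grows, here is a generic simulation. It is not CPython's parser code, and the message does not say which reallocs should be cut; `count_reallocs`, `grow_by_one` and `grow_geometric` are hypothetical names for a sketch that compares growing a buffer one slot at a time with doubling its capacity.

```python
def count_reallocs(n_children, next_capacity):
    # Count how often the buffer must be resized while appending
    # n_children items one at a time.
    capacity = 0
    reallocs = 0
    for needed in range(1, n_children + 1):
        if needed > capacity:
            capacity = next_capacity(needed, capacity)
            reallocs += 1
    return reallocs


def grow_by_one(needed, capacity):
    # Resize to exactly the required size: one realloc per append.
    return needed


def grow_geometric(needed, capacity):
    # Double the capacity, starting from a small minimum of 4 slots.
    return max(4, capacity * 2)


if __name__ == "__main__":
    for policy in (grow_by_one, grow_geometric):
        print(policy.__name__, count_reallocs(1000, policy))
```

Geometric growth trades some over-allocation for far fewer resize calls, which is the kind of trade-off any reduction in reallocs would have to weigh against the fragmentation described above.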