This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python Developer Guide.

Author Giuseppe.Attardi
Recipients Giuseppe.Attardi
Date 2012-05-09.09:39:43
Message-id <1336556387.72.0.794459300498.issue14762@psf.upfronthosting.co.za>
In-reply-to
Content
I can confirm a serious memory leak in ElementTree when using the iterparse() function.
When parsing a large XML file, memory use grows out of all proportion to the input, reaching dozens of GB.
For further information, see the discussion at:
 http://www.gossamer-threads.com/lists/python/bugs/912164?do=post_view_threaded#912164
but note that the comments there attributing the problem to the OS are quite off the mark.
To reproduce the problem, try this on a Wikipedia dump:
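 from xml.etree import ElementTree

 # file: path to (or open file object for) the Wikipedia XML dump
 iterparse = ElementTree.iterparse(file)
 id = None
 for event, elem in iterparse:
     if elem.tag.endswith("title"):
         title = elem.text
     elif elem.tag.endswith("id") and not id:
         id = elem.text
     elif elem.tag.endswith("text"):
         # elem.text is None for an empty <text/> element
         print(id, title, (elem.text or "")[:20])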
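
For comparison, here is a minimal sketch of the commonly suggested way to keep memory bounded with iterparse(): call clear() on each element once it has been handled. iterparse() otherwise keeps building the complete tree in memory. Whether this accounts for the growth reported here is not established. The helper name iter_pages, the per-page reset of the id, and the assumption that each record sits inside a <page> element are illustrative and not part of the original report.

```python
import io
from xml.etree import ElementTree


def iter_pages(source):
    """Yield (id, title, text_prefix) per <page>, clearing each page when done.

    Hypothetical helper: assumes a MediaWiki-style dump where every record
    is wrapped in a <page> element.
    """
    page_id = title = None
    text = ""
    for event, elem in ElementTree.iterparse(source, events=("end",)):
        tag = elem.tag.rsplit("}", 1)[-1]  # drop any "{namespace}" prefix
        if tag == "title":
            title = elem.text
        elif tag == "id" and page_id is None:
            page_id = elem.text  # first <id> inside the page
        elif tag == "text":
            text = (elem.text or "")[:20]
        elif tag == "page":
            yield page_id, title, text
            page_id = title = None
            text = ""
            elem.clear()  # drop the finished page's children and text


# Usage (hypothetical file name):
#     for page_id, title, text in iter_pages("dump.xml"):
#         print(page_id, title, text)
```

Note that the root element still keeps one emptied <page> entry per record, so memory use is much reduced rather than strictly constant.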
History
Date User Action Args
2012-05-09 09:39:47 Giuseppe.Attardi set recipients: + Giuseppe.Attardi
2012-05-09 09:39:47 Giuseppe.Attardi set messageid: <1336556387.72.0.794459300498.issue14762@psf.upfronthosting.co.za>
2012-05-09 09:39:44 Giuseppe.Attardi link issue14762 messages
2012-05-09 09:39:43 Giuseppe.Attardi create
