Created on 2012-05-09 09:39 by Giuseppe.Attardi, last changed 2022-04-11 14:57 by admin. This issue is now closed.
| Messages (4) | |||
|---|---|---|---|
| msg160266 | Author: Giuseppe Attardi (Giuseppe.Attardi) | Date: 2012-05-09 09:39 | |
I confirm the presence of a serious memory leak in ElementTree, using the iterparse() function. Memory grows disproportionately to dozens of GB when parsing a large XML file. For further information, see discussion in: http://www.gossamer-threads.com/lists/python/bugs/912164?do=post_view_threaded#912164 but notice that the comments attributing the problem to the OS are quite off the mark. To replicate the problem, try this on a Wikipedia dump:

    iterparse = ElementTree.iterparse(file)
    id = None
    for event, elem in iterparse:
        if elem.tag.endswith("title"):
            title = elem.text
        elif elem.tag.endswith("id") and not id:
            id = elem.text
        elif elem.tag.endswith("text"):
            print id, title, elem.text[:20]
 |
|||
| msg160275 | Author: Eli Bendersky (eli.bendersky) * (Python committer) | Date: 2012-05-09 11:39 | |
Can you specify how you import ET, i.e. from the pure Python implementation or from the C accelerator? Also, do you realize that the elements iterparse returns should be discarded with 'clear'? [see tutorial here: http://eli.thegreenplace.net/2012/03/15/processing-xml-in-python-with-elementtree/] |
|||
| msg160286 | Author: Jesús Cea Avión (jcea) * (Python committer) | Date: 2012-05-09 12:47 | |
Can this be reproduced in 3.2/3.3? |
|||
| msg160288 | Author: Giuseppe Attardi (Giuseppe.Attardi) | Date: 2012-05-09 13:35 | |
You are right, I should discard the elements. Thank you. |
|||
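For readers landing on this issue: the resolution above (discarding each element with `clear` once it has been processed) might look roughly like the sketch below. It is not code from the thread. It assumes Python 3, assumes the dump wraps each article in a `<page>` element (as Wikipedia dumps do), and uses a hypothetical helper name `iter_pages` and a made-up `SAMPLE` document for illustration.

```python
import io
import xml.etree.ElementTree as ET


def iter_pages(source):
    """Yield (id, title, first 20 chars of text) for each <page> element.

    Hypothetical helper: assumes every article is wrapped in a <page> element.
    """
    page_id = title = None
    text = ""
    # Only "end" events: an element is complete (text and children parsed).
    for event, elem in ET.iterparse(source, events=("end",)):
        # Drop a "{namespace}" prefix, if any, so tags compare by local name.
        tag = elem.tag.rsplit("}", 1)[-1]
        if tag == "title":
            title = elem.text
        elif tag == "id" and page_id is None:
            # The first <id> inside a page is taken as the page id.
            page_id = elem.text
        elif tag == "text":
            text = (elem.text or "")[:20]
        elif tag == "page":
            yield page_id, title, text
            page_id = title = None
            text = ""
            # Release the page's children and text. Without this call the
            # whole tree stays in memory as parsing proceeds.
            elem.clear()


# Made-up sample standing in for a real dump.
SAMPLE = b"""<mediawiki xmlns="http://example.org/export">
  <page><title>A</title><id>1</id><revision><id>10</id><text>hello</text></revision></page>
  <page><title>B</title><id>2</id><revision><id>20</id><text>world</text></revision></page>
</mediawiki>"""

for page in iter_pages(io.BytesIO(SAMPLE)):
    print(page)
```

Note that `elem.clear()` empties each page but the root element still keeps a reference to the (now empty) page elements, so very large files may still show slow growth; the tutorial linked in msg160275 discusses the basic pattern.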
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022-04-11 14:57:30 | admin | set | github: 58967 |
| 2012-05-09 13:36:30 | Giuseppe.Attardi | set | status: open -> closed; resolution: not a bug |
| 2012-05-09 13:35:29 | Giuseppe.Attardi | set | messages: + msg160288 |
| 2012-05-09 12:47:18 | jcea | set | nosy: + jcea; messages: + msg160286 |
| 2012-05-09 11:39:01 | eli.bendersky | set | messages: + msg160275 |
| 2012-05-09 11:18:42 | pitrou | set | nosy: + eli.bendersky, flox |
| 2012-05-09 09:39:44 | Giuseppe.Attardi | create | |