This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python Developer Guide.

classification
Title: _elementtree.c calls Python callbacks while a Python exception is set
Type: Stage:
Components: Versions: Python 3.4
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: christian.heimes, fdrake, python-dev, vstinner
Priority: normal Keywords:

Created on 2013-07-18 21:01 by vstinner, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Messages (5)
msg193325 - Author: STINNER Victor (vstinner) * (Python committer) Date: 2013-07-18 21:01
The ElementTree module lets you write an XML parser using Python callbacks. The module relies on the expat library, which is implemented in C. Expat calls these Python callbacks, but ElementTree does not check whether a Python exception was raised.
Example 1:
-------------------
import unittest
from xml.etree import ElementTree as ET

class Target(object):
    def start(self, tag, attrib):
        print("start")
        raise ValueError("raise start")

    def end(self, tag):
        print("end")
        raise ValueError("raise end")

    def close(self):
        print("close")
        raise ValueError("raise close")

parser = ET.XMLParser(target=Target())
parser.feed("<root><test /></root>")
-------------------
Output with Python 3.3:
-------------------
start
startendendTraceback (most recent call last):
  File "x.py", line 18, in <module>
    parser.feed("<root><test /></root>")
  File "x.py", line 10, in end
    print("end")
  File "x.py", line 10, in end
    print("end")
  File "x.py", line 6, in start
    print("start")
  File "x.py", line 7, in start
    raise ValueError("raise start")
ValueError: raise start
-------------------
start() was called twice, as was end(), even though the first start() call raised an exception.
The traceback is strange: it looks like end() was called by start(), which is wrong.
Example 2:
-------------------
import unittest
from xml.etree import ElementTree as ET

class Target(object):
    def start(self, tag, attrib):
        raise ValueError("raise start")

    def end(self, tag):
        raise ValueError("raise end")

    def close(self):
        raise ValueError("raise close")

parser = ET.XMLParser(target=Target())
parser.feed("<root><test /></root>")
-------------------
Output with Python 3.3:
-------------------
Traceback (most recent call last):
  File "x.py", line 15, in <module>
    parser.feed("<root><test /></root>")
  File "x.py", line 9, in end
    raise ValueError("raise end")
ValueError: raise end
-------------------
end() was called even though start() had already failed. The exception set by start() was replaced by the exception raised in end().
In my opinion, it's not a good thing to call PyEval_EvalFrameEx() and similar functions when a Python exception is set, because they behave badly (e.g. print("end") in Example 1 raises an exception, which is wrong, and the traceback is also corrupted) and may replace the old exception with a new one (e.g. "raise end" replaces "raise start").
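A minimal sketch for checking the behaviour described in Examples 1 and 2 without relying on interleaved print output: the target below appends each callback to a list before raising. The names RecordingTarget and run_once, and the extra parser.close() call, are assumptions added for illustration and are not part of the original report.

```python
from xml.etree import ElementTree as ET


class RecordingTarget:
    """Hypothetical target: logs each callback, then raises like the examples."""

    def __init__(self):
        self.calls = []

    def start(self, tag, attrib):
        self.calls.append(("start", tag))
        raise ValueError("raise start")

    def end(self, tag):
        self.calls.append(("end", tag))
        raise ValueError("raise end")

    def close(self):
        self.calls.append(("close", None))
        raise ValueError("raise close")


def run_once(xml_text):
    """Parse xml_text; return (recorded calls, exception message or None)."""
    target = RecordingTarget()
    parser = ET.XMLParser(target=target)
    try:
        parser.feed(xml_text)
        # Only reached if feed() did not raise (assumption: some expat
        # builds may defer part of the parsing until the input is closed).
        parser.close()
    except ValueError as exc:
        return target.calls, str(exc)
    return target.calls, None


calls, message = run_once("<root><test /></root>")
print(calls)
print(message)
```

On an interpreter that includes the fix for this issue, the expectation is that only the first start() call is recorded and that its ValueError is the one that propagates. On an unfixed interpreter the list would presumably show the extra start() and end() calls reported above.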
msg193326 - Author: STINNER Victor (vstinner) * (Python committer) Date: 2013-07-18 21:04
For issue #18408, I added assertions to PyEval_EvalFrameEx() and similar functions so that, in debug mode, they exit with an assertion error if they are called with an exception set:
New changeset 48a869a39e2d by Victor Stinner in branch 'default':
Issue #18408: PyEval_EvalFrameEx() and PyEval_CallObjectWithKeywords() now fail
http://hg.python.org/cpython/rev/48a869a39e2d
New changeset 5bd9db528aed by Victor Stinner in branch 'default':
Issue #18408: PyObject_Str(), PyObject_Repr() and type_call() now fail with an
http://hg.python.org/cpython/rev/5bd9db528aed
The lxml test suite failed with a C assertion error because of these changes. I fixed the issue with the following change:
New changeset 6ec0e9347dd4 by Victor Stinner in branch 'default':
Issue #18408: Fix _elementtree.c, don't call Python function from an expat
http://hg.python.org/cpython/rev/6ec0e9347dd4
Instead of having to check if an exception is set in each Python callback, it would be better to "stop" the XML parser. Modules/pyexpat.c calls "flag_error(self); XML_SetCharacterDataHandler(self->itself, noop_character_data_handler);" on error:
/* This handler is used when an error has been detected, in the hope
   that actual parsing can be terminated early. This will only help
   if an external entity reference is encountered. */
static int
error_external_entity_ref_handler(XML_Parser parser,
                                  const XML_Char *context,
                                  const XML_Char *base,
                                  const XML_Char *systemId,
                                  const XML_Char *publicId)
{
    return 0;
}

/* Dummy character data handler used when an error (exception) has
   been detected, and the actual parsing can be terminated early.
   This is needed since character data handler can't be safely removed
   from within the character data handler, but can be replaced. It is
   used only from the character data handler trampoline, and must be
   used right after `flag_error()` is called. */
static void
noop_character_data_handler(void *userData, const XML_Char *data, int len)
{
    /* Do nothing. */
}

static void
flag_error(xmlparseobject *self)
{
    clear_handlers(self, 0);
    XML_SetExternalEntityRefHandler(self->itself,
                                    error_external_entity_ref_handler);
}
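The "stop the parser" idea can be illustrated with a pure-Python analogy. This is only a sketch of the pattern, not pyexpat's actual code: the ToyDispatcher class, its method names and the event tuples are all hypothetical. On the first failing callback it remembers the exception and drops every handler, so later events become no-ops instead of running Python code with an error pending.

```python
class ToyDispatcher:
    """Hypothetical stand-in for a C parser that drives Python callbacks."""

    def __init__(self, handlers):
        self._handlers = dict(handlers)
        self._error = None  # first exception raised by any callback

    def _flag_error(self, exc):
        # Loosely analogous to flag_error(): remember the failure and
        # remove all handlers so no further callback runs.
        self._error = exc
        self._handlers.clear()

    def emit(self, event, *args):
        handler = self._handlers.get(event)
        if handler is None:
            return  # handlers were cleared, or none was registered
        try:
            handler(*args)
        except Exception as exc:
            self._flag_error(exc)

    def run(self, events):
        for event, args in events:
            self.emit(event, *args)
        if self._error is not None:
            raise self._error  # report the first failure, not the last


calls = []


def start(tag):
    calls.append(("start", tag))
    raise ValueError("raise start")


def end(tag):
    calls.append(("end", tag))
    raise ValueError("raise end")


# Event stream corresponding to "<root><test /></root>".
events = [
    ("start", ("root",)),
    ("start", ("test",)),
    ("end", ("test",)),
    ("end", ("root",)),
]

dispatcher = ToyDispatcher({"start": start, "end": end})
try:
    dispatcher.run(events)
    first_error = None
except ValueError as exc:
    first_error = str(exc)

print(calls)        # [('start', 'root')]
print(first_error)  # raise start
```

Compared with checking for a pending exception at the top of every callback, clearing the handlers once keeps the check in a single place, which is the trade-off discussed above.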
msg193328 - Author: Roundup Robot (python-dev) (Python triager) Date: 2013-07-18 21:43
New changeset 5a6cdc0d7de1 by Victor Stinner in branch 'default':
Issue #18501, #18408: Fix expat handlers in pyexpat, don't call Python
http://hg.python.org/cpython/rev/5a6cdc0d7de1 
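A small sketch for observing the pyexpat side directly through xml.parsers.expat, assuming an interpreter that includes this changeset. The handler names and the expat_calls list are hypothetical helpers used only for this illustration.

```python
import xml.parsers.expat

expat_calls = []


def on_start(name, attrs):
    expat_calls.append("start " + name)
    raise ValueError("raise start")


def on_end(name):
    expat_calls.append("end " + name)
    raise ValueError("raise end")


parser = xml.parsers.expat.ParserCreate()
parser.StartElementHandler = on_start
parser.EndElementHandler = on_end

try:
    # The second argument marks this as the final chunk of input.
    parser.Parse("<root><test /></root>", True)
    expat_error = None
except ValueError as exc:
    expat_error = str(exc)

print(expat_calls)
print(expat_error)
```

The expectation is that only the first start handler runs and that its ValueError is the exception seen by the caller, since no further Python handler should be invoked once an exception is set.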
msg193339 - Author: STINNER Victor (vstinner) * (Python committer) Date: 2013-07-18 23:44
See also issue #18488, a similar issue in sqlite.
msg195111 - Author: STINNER Victor (vstinner) * (Python committer) Date: 2013-08-13 23:59
> Instead of having to check if an exception is set in each Python
> callback, it would be better to "stop" the XML parser.
I created the issue #18733 to track this optimization.
The initial issue is fixed, so I'm closing it.
History
Date                 User              Action  Args
2022-04-11 14:57:48  admin             set     github: 62701
2013-08-13 23:59:23  vstinner          set     status: open -> closed
                                               resolution: fixed
                                               messages: + msg195111
2013-07-18 23:44:58  vstinner          set     messages: + msg193339
2013-07-18 21:43:43  python-dev        set     nosy: + python-dev
                                               messages: + msg193328
2013-07-18 21:08:43  fdrake            set     nosy: + fdrake
2013-07-18 21:05:09  christian.heimes  set     nosy: + christian.heimes
2013-07-18 21:04:47  vstinner          set     messages: + msg193326
2013-07-18 21:01:01  vstinner          create
