This issue tracker has been migrated to GitHub ,
and is currently read-only.
For more information,
see the GitHub FAQs in the Python's Developer Guide.
Created on 2014年11月21日 23:54 by JocelynJ, last changed 2022年04月11日 14:58 by admin. This issue is now closed.
| Files | ||||
|---|---|---|---|---|
| File name | Uploaded | Description | Edit | |
| test-22915.diff | JocelynJ, 2014年11月22日 22:52 | test | review | |
| toto.py | JocelynJ, 2014年11月22日 22:57 | testcase | ||
| sax_non_str_file_name.patch | serhiy.storchaka, 2014年11月23日 10:47 | review | ||
| Messages (12) | |||
|---|---|---|---|
| msg231504 - (view) | Author: Jocelyn (JocelynJ) * | Date: 2014年11月21日 23:54 | |
With the attached code, and an xml file, I got the following error with python 3.4.2: $ cat toto.xml <?xml version='1.0' encoding='UTF-8'?> <test></test> $ python3.4 toto.py Traceback (most recent call last): File "toto.py", line 10, in <module> parse(proc.stdout, ContentHandler()) File "/usr/lib/python3.4/xml/sax/__init__.py", line 33, in parse parser.parse(source) File "/usr/lib/python3.4/xml/sax/expatreader.py", line 107, in parse xmlreader.IncrementalParser.parse(self, source) File "/usr/lib/python3.4/xml/sax/xmlreader.py", line 119, in parse self.prepareParser(source) File "/usr/lib/python3.4/xml/sax/expatreader.py", line 111, in prepareParser self._parser.SetBase(source.getSystemId()) TypeError: must be str, not int The same test works with python2, and I would expect the code to work with python 3 too. Thanks, Jocelyn |
|||
| msg231505 - (view) | Author: R. David Murray (r.david.murray) * (Python committer) | Date: 2014年11月22日 00:10 | |
My guess is that this is similar to issue 21044. |
|||
| msg231508 - (view) | Author: R. David Murray (r.david.murray) * (Python committer) | Date: 2014年11月22日 02:29 | |
I've looked at the sax code, and this does indeed have the same root cause: in python2 a dummy string was used for the 'name' attribute of io objects that are based on file descriptors, whereas in python3 the name is the integer value of the file descriptor. In test_sax we can see getSystemId being tested that it returns a filename (see test_expat_locator_withinfo). The fix should be analogous to the issue 21044 fix: check that 'name' is a string and if not don't use it. I'm marking this as easy; hopefully someone will want to tackle figuring out exactly where in the code it needs to be fixed and adding tests for it. |
|||
| msg231537 - (view) | Author: Jocelyn (JocelynJ) * | Date: 2014年11月22日 22:52 | |
Here is a patch to add a test to test_sax.py. I'm not sure on the fix. Is the following one a correct one ? def prepareParser(self, source): if source.getSystemId() is not None: - self._parser.SetBase(source.getSystemId()) + self._parser.SetBase(str(source.getSystemId())) Or should I remove the call to SetBase if getSystemId() is not a string ? |
|||
| msg231538 - (view) | Author: Jocelyn (JocelynJ) * | Date: 2014年11月22日 22:57 | |
Forgot to attach the testcase when opening the bug. |
|||
| msg231545 - (view) | Author: R. David Murray (r.david.murray) * (Python committer) | Date: 2014年11月23日 05:46 | |
Has anyone investigated what exactly sax uses SystemId/SetBase *for*? I think think we need that info in order to decide what to do, and I'm not familiar with sax. |
|||
| msg231553 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) | Date: 2014年11月23日 10:47 | |
This bug should be fixed in other place. Here is a patch. |
|||
| msg231568 - (view) | Author: R. David Murray (r.david.murray) * (Python committer) | Date: 2014年11月23日 16:27 | |
Serhiy's patch looks correct to me. Given that if the source doesn't have a name attribute it is simply not set in the existing code, this change should be "safe" (backward compatible). Elsewhere the possibility was raised of converting the int to a string (<fdopen: N>), but that issue is a more global one and would apply at the io module level...and if implemented this fix would automatically take advantage of it. So I think this should be committed. |
|||
| msg231572 - (view) | Author: Jocelyn (JocelynJ) * | Date: 2014年11月23日 17:35 | |
The only explicit documentation I found on SystemId is from the java specification (it is my understanding that python sax implementation is adapted from Java one): http://www.saxproject.org/apidoc/org/xml/sax/InputSource.html#setSystemId%28java.lang.String%29 The documentation says that "The system identifier is optional if there is a byte stream or a character stream". So, I agree that Serhiy's patch looks correct. Note that I'm not sure that my testcase with a subprocess is covered by Serhiy's tests, as these tests call parser() with a file object. |
|||
| msg231777 - (view) | Author: Roundup Robot (python-dev) (Python triager) | Date: 2014年11月27日 20:17 | |
New changeset 27ae1a476ef7 by Serhiy Storchaka in branch '3.4': Issue #22915: SAX parser now supports files opened with file descriptor or https://hg.python.org/cpython/rev/27ae1a476ef7 New changeset ce9881eecfb4 by Serhiy Storchaka in branch 'default': Issue #22915: SAX parser now supports files opened with file descriptor or https://hg.python.org/cpython/rev/ce9881eecfb4 |
|||
| msg231780 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) | Date: 2014年11月27日 20:53 | |
Jocelyn, in both cases the argument of parse() is a file object with integer name. Tests use one of simplest way to create such object. |
|||
| msg231782 - (view) | Author: Jocelyn (JocelynJ) * | Date: 2014年11月27日 21:10 | |
Ok, I understand your tests now. Thanks for the fixes ! |
|||
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022年04月11日 14:58:10 | admin | set | github: 67104 |
| 2014年11月27日 21:10:16 | JocelynJ | set | messages: + msg231782 |
| 2014年11月27日 20:53:22 | serhiy.storchaka | set | status: open -> closed resolution: fixed messages: + msg231780 stage: commit review -> resolved |
| 2014年11月27日 20:17:05 | python-dev | set | nosy:
+ python-dev messages: + msg231777 |
| 2014年11月23日 17:35:31 | JocelynJ | set | messages: + msg231572 |
| 2014年11月23日 16:27:17 | r.david.murray | set | messages:
+ msg231568 stage: patch review -> commit review |
| 2014年11月23日 10:47:15 | serhiy.storchaka | set | files:
+ sax_non_str_file_name.patch nosy: + christian.heimes, serhiy.storchaka messages: + msg231553 assignee: serhiy.storchaka stage: needs patch -> patch review |
| 2014年11月23日 05:46:36 | r.david.murray | set | messages: + msg231545 |
| 2014年11月22日 22:57:39 | JocelynJ | set | files:
+ toto.py messages: + msg231538 |
| 2014年11月22日 22:52:19 | JocelynJ | set | files:
+ test-22915.diff keywords: + patch messages: + msg231537 |
| 2014年11月22日 09:33:53 | Avneesh.Chadha | set | nosy:
+ Avneesh.Chadha |
| 2014年11月22日 08:02:54 | serhiy.storchaka | set | stage: needs patch |
| 2014年11月22日 02:29:52 | r.david.murray | set | keywords:
+ easy messages: + msg231508 |
| 2014年11月22日 00:11:13 | r.david.murray | set | versions: + Python 3.5 |
| 2014年11月22日 00:10:02 | r.david.murray | set | nosy:
+ r.david.murray messages: + msg231505 |
| 2014年11月21日 23:54:11 | JocelynJ | create | |