This issue tracker has been migrated to GitHub and is currently read-only.
For more information, see the GitHub FAQs in the Python Developer Guide.

classification
Title: Expat sax parser silently ignores the InputSource protocol
Type: behavior
Stage: resolved
Components: Library (Lib), Unicode, XML
Versions: Python 3.4, Python 3.5, Python 2.7

process
Status: closed
Resolution: fixed
Dependencies: 17089
Superseder:
Assigned To: serhiy.storchaka
Nosy List: christian.heimes, ezio.melotti, fdrake, georg.brandl, loewis, python-dev, serhiy.storchaka, tshepang, ygale
Priority: critical
Keywords: needs review, patch

Created on 2008-02-24 14:03 by ygale, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name                       Uploaded                            Description
sax_character_stream.patch      serhiy.storchaka, 2013-02-13 17:47  Patch for 3.x
sax_character_stream-2.7.patch  serhiy.storchaka, 2013-02-13 17:48  Patch for 2.7
sax_character_stream_3.patch    serhiy.storchaka, 2015-03-26 07:26
Messages (10)
msg62901 - Author: Yitz Gale (ygale) Date: 2008-02-24 14:03
The expat sax parser in xml.sax.expatreader
does not fully support the InputSource protocol
in xml.sax.xmlreader. It only accepts
byte streams. It ignores the encoding
indicated in the InputSource object and
only uses the encoding read from
the XML or defaults to UTF-8.
Rather than silently doing the wrong thing,
it should raise an error when fed a character stream,
or when given an encoding, via the InputSource
interface.
And most importantly, these limitations should be mentioned
in the documentation.
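
For illustration, here is a minimal sketch of the two InputSource styles the report refers to: a byte stream with an explicitly set encoding, and a character stream. It assumes a Python version that already includes the fix from this issue (3.5 or later); on the versions the report was filed against, the character-stream and explicit-encoding cases were not handled this way. The names TextCollector, parse_source, from_byte_stream and from_character_stream are hypothetical helpers, not part of xml.sax.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler
from xml.sax.xmlreader import InputSource


class TextCollector(ContentHandler):
    """Hypothetical handler that gathers all character data it receives."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def characters(self, content):
        # The parser may deliver text in several pieces, so accumulate them.
        self.chunks.append(content)


def parse_source(source):
    """Parse an InputSource and return the concatenated text content."""
    handler = TextCollector()
    parser = xml.sax.make_parser()
    parser.setContentHandler(handler)
    parser.parse(source)
    return "".join(handler.chunks)


def from_byte_stream(data, encoding=None):
    """Feed bytes through InputSource, optionally declaring the encoding."""
    source = InputSource()
    source.setByteStream(io.BytesIO(data))
    if encoding is not None:
        source.setEncoding(encoding)
    return parse_source(source)


def from_character_stream(text):
    """Feed already-decoded text through InputSource as a character stream."""
    source = InputSource()
    source.setCharacterStream(io.StringIO(text))
    return parse_source(source)
```

With the fix applied, both paths should yield the same decoded text for the same document, whether the encoding is supplied through setEncoding() or the data arrives already decoded.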
msg62903 - Author: Yitz Gale (ygale) Date: 2008-02-24 14:09
See also: #1483 and #2174.
msg116975 - Author: Mark Lawrence (BreamoreBoy) * Date: 2010-09-20 21:23
As nobody appears to be interested I'll close this in a couple of weeks unless someone objects.
msg116984 - Author: Yitz Gale (ygale) Date: 2010-09-20 21:46
Perhaps more people would be interested if
you raise the priority. This bug can cause
serious data corruption, or even crashes.
It should also be tagged as "easy".
An alternative would be to remove the expat
sax parser from the libraries, since we don't
support it. But that seems a little extreme.
msg117170 - Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2010-09-23 06:45
I'll have a look.
msg181383 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-02-04 19:46
Here is a patch that makes xml.sax.xmlreader and related utilities support character streams. A lot of new tests were added (including Yitz Gale's tests from issue1483). Some old tests were fixed (they used a text stream as a byte stream, which doesn't work in the general case).
msg182055 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-02-13 17:50
This patch is rather complicated, and I doubt it is necessary to apply it to the older version. Can anyone review it?
msg231555 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2014-11-23 12:11
Ping.
msg239311 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2015-03-26 07:26
Updated to the tip, added a whatsnew entry, and fixed the documentation.
Which parts of this patch, besides the tests, are worth applying to maintained releases?
msg239936 - Author: Roundup Robot (python-dev) (Python triager) Date: 2015-04-02 18:01
New changeset 84d49ad9109b by Serhiy Storchaka in branch '2.7':
Issue #2175: Added tests for xml.sax.saxutils.prepare_input_source().
https://hg.python.org/cpython/rev/84d49ad9109b
New changeset fa47897e7889 by Serhiy Storchaka in branch '3.4':
Issue #2175: Added tests for xml.sax.saxutils.prepare_input_source().
https://hg.python.org/cpython/rev/fa47897e7889
New changeset e0292b3ba245 by Serhiy Storchaka in branch 'default':
Issue #2175: Added tests for xml.sax.saxutils.prepare_input_source().
https://hg.python.org/cpython/rev/e0292b3ba245
New changeset 407883c52bf3 by Serhiy Storchaka in branch 'default':
Issue #2175: SAX parsers now support a character stream of InputSource object.
https://hg.python.org/cpython/rev/407883c52bf3 
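
As a rough illustration of the function those tests target, the sketch below passes file-like objects to xml.sax.saxutils.prepare_input_source() and reports which kind of stream ends up on the resulting InputSource. It assumes a Python version with the final changeset applied (3.5 or later), where a text-mode object is expected to be attached as a character stream rather than treated as bytes. The helper name describe_source is hypothetical.

```python
import io
from xml.sax.saxutils import prepare_input_source


def describe_source(obj):
    """Report which stream prepare_input_source() attached for obj."""
    source = prepare_input_source(obj)
    if source.getCharacterStream() is not None:
        return "character stream"
    if source.getByteStream() is not None:
        return "byte stream"
    return "unresolved"


if __name__ == "__main__":
    print(describe_source(io.StringIO("<doc/>")))
    print(describe_source(io.BytesIO(b"<doc/>")))
```

Keeping text and byte inputs distinct is what lets the parser skip decoding for data that is already text.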
History
Date User Action Args
2022-04-11 14:56:31  admin  set  github: 46428
2015-04-02 20:31:57  serhiy.storchaka  set  status: open -> closed
    resolution: fixed
    stage: patch review -> resolved
2015-04-02 18:01:02  python-dev  set  nosy: + python-dev
    messages: + msg239936
2015-03-26 07:26:06  serhiy.storchaka  set  files: + sax_character_stream_3.patch
    messages: + msg239311
2014-11-23 12:11:48  serhiy.storchaka  set  keywords: + needs review
    messages: + msg231555
    versions: + Python 3.5, - Python 3.3
2014-02-03 15:42:18  BreamoreBoy  set  nosy: - BreamoreBoy
2013-12-18 22:08:55  serhiy.storchaka  set  nosy: + christian.heimes
    versions: - Python 3.2
2013-02-13 21:52:51  fdrake  set  nosy: + fdrake
2013-02-13 17:52:22  serhiy.storchaka  link  issue10590 dependencies
2013-02-13 17:50:59  serhiy.storchaka  set  messages: + msg182055
2013-02-13 17:48:32  serhiy.storchaka  set  files: + sax_character_stream-2.7.patch
2013-02-13 17:47:52  serhiy.storchaka  set  files: + sax_character_stream.patch
2013-02-13 17:47:14  serhiy.storchaka  set  files: - sax_character_stream.patch
2013-02-04 19:46:21  serhiy.storchaka  set  files: + sax_character_stream.patch
    components: - Documentation, Extension Modules
    versions: + Python 3.4
    keywords: + patch
    nosy: + ezio.melotti
    messages: + msg181383
    stage: patch review
2013-01-31 10:02:25  serhiy.storchaka  set  dependencies: + Expat parser parses strings only when XML encoding is UTF-8
2013-01-16 18:26:43  serhiy.storchaka  set  assignee: docs@python -> serhiy.storchaka
    nosy: + serhiy.storchaka
2012-01-11 12:31:16  tshepang  set  nosy: + tshepang
2011-06-12 18:34:16  terry.reedy  set  versions: + Python 3.3, - Python 3.1
2010-10-29 10:07:21  admin  set  assignee: georg.brandl -> docs@python
2010-09-23 06:45:14  georg.brandl  set  priority: normal -> critical
    nosy: + georg.brandl
    messages: + msg117170
    assignee: loewis -> georg.brandl
2010-09-20 21:46:26  ygale  set  status: pending -> open
    messages: + msg116984
2010-09-20 21:23:47  BreamoreBoy  set  status: open -> pending
    nosy: + BreamoreBoy
    messages: + msg116975
2010-06-09 22:00:39  terry.reedy  set  versions: + Python 3.1, Python 2.7, Python 3.2, - Python 2.6, Python 2.5, Python 3.0
2008-03-20 02:42:37  jafo  set  priority: normal
    assignee: loewis
    nosy: + loewis
2008-02-24 14:09:11  ygale  set  messages: + msg62903
2008-02-24 14:03:02  ygale  create
