Created on 2006-04-14 20:07 by ngrig, last changed 2022-04-11 14:56 by admin. This issue is now closed.
| Files | | |
|---|---|---|
| File name | Uploaded | Description |
| enctest.py | ngrig, 2006-04-14 20:07 | Test for UTF-16 treatment in xml.sax.saxutils.XMLGenerator |
| Messages (3) | |||
|---|---|---|---|
| msg28244 | Author: Nikolai Grigoriev (ngrig) | Date: 2006-04-14 20:07 | |
When output encoding in xml.sax.saxutils.XMLGenerator is set to UTF-16, the result is a terrible mess. Namely:

- it does not encode the XML declaration at the very top of the file (leaving it in single-byte Latin);
- it leaves closing '>' of each start tag unencoded (that is, always outputs a single byte);
- it inserts a spurious byte order mark for each tag, each attribute, each text node, and each processing instruction.

A test illustrating the issue is attached. The issue is applicable to both stable (2.4.3) and current (2.5) versions of Python.

---------------------------------------------

Looking in xml/sax/saxutils.py, I see the problem in XMLGenerator._write():

- one-byte strings aren't recoded at all (sic!);
- two-byte strings are converted using unicode.encode(); this results in a BOM for each call of _write() on Unicode strings.

The issue is easy to fix by using StreamWriter instead of a plain stream as the output sink. I am going to submit a patch shortly.

Regards, Nikolai Grigoriev |
|||
| msg28245 | Author: Nikolai Grigoriev (ngrig) | Date: 2006-04-16 07:42 | |
FYI: I posted a patch (#1470548) that fixes the issue. Regards, Nikolai Grigoriev |
|||
| msg83907 | Author: Daniel Diniz (ajaksu2) * (Python triager) | Date: 2009-03-21 02:02 | |
Patch on issue 1470548. |
|||
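The per-call BOM behaviour described in msg28244, and the StreamWriter remedy it proposes, can be illustrated with a minimal sketch. It assumes Python 3 (the report itself concerns Python 2.4.3 and 2.5) and does not call XMLGenerator or reproduce the patch in #1470548. The helper names and the sample fragments are hypothetical and exist only for this illustration. The separate complaint about one-byte strings not being recoded is specific to Python 2's str/unicode split and is not covered here.

```python
import codecs
import io

# Hypothetical fragments standing in for the pieces a generator
# would hand to its write routine one at a time.
FRAGMENTS = ["<doc", ">", "text", "</doc>"]


def encode_each_call(fragments, encoding="utf-16"):
    """Encode every fragment separately, as the report says _write() does."""
    sink = io.BytesIO()
    for fragment in fragments:
        # The plain "utf-16" codec prepends a BOM on every encode() call.
        sink.write(fragment.encode(encoding))
    return sink.getvalue()


def encode_with_stream_writer(fragments, encoding="utf-16"):
    """Wrap the sink in a codecs StreamWriter, as the report proposes."""
    sink = io.BytesIO()
    writer = codecs.getwriter(encoding)(sink)
    for fragment in fragments:
        # The stateful writer emits the BOM on the first write only.
        writer.write(fragment)
    return sink.getvalue()


def count_boms(data):
    """Count UTF-16 byte order marks (either endianness) at 2-byte offsets."""
    boms = (codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)
    return sum(1 for i in range(0, len(data), 2) if data[i:i + 2] in boms)


if __name__ == "__main__":
    print("per-call encode:", count_boms(encode_each_call(FRAGMENTS)), "BOMs")
    print("StreamWriter:   ", count_boms(encode_with_stream_writer(FRAGMENTS)), "BOM")
```

With these four fragments, the per-call variant produces four byte order marks and the StreamWriter variant produces one. The design point is that the writer keeps encoder state across calls, which separate `encode()` calls cannot do.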
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022-04-11 14:56:16 | admin | set | github: 43213 |
| 2009-04-05 13:45:12 | georg.brandl | set | status: open -> closed; resolution: duplicate; dependencies: - xml.sax.saxutils.XMLGenerator cannot output UTF-16; superseder: xml.sax.saxutils.XMLGenerator cannot output UTF-16 |
| 2009-03-21 02:02:11 | ajaksu2 | set | dependencies: + xml.sax.saxutils.XMLGenerator cannot output UTF-16; type: behavior; versions: + Python 2.6, - Python 2.5; nosy: + ajaksu2; messages: + msg83907; stage: test needed |
| 2006-04-14 20:07:46 | ngrig | create | |