This issue tracker has been migrated to GitHub and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

classification
Title: XML codec
Type: enhancement
Stage:
Components: Unicode
Versions: Python 2.6
process
Status: closed
Resolution: rejected
Dependencies:
Superseder:
Assigned To: lemburg
Nosy List: doerwalter, jafo, lemburg
Priority: normal
Keywords: patch

Created on 2007-11-07 17:52 by doerwalter, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name   Uploaded                        Description
diff.txt    doerwalter, 2007-11-07 17:52
diff2.txt   doerwalter, 2007-11-08 21:25
Messages (9)
msg57211 - Author: Walter Dörwald (doerwalter) (Python committer) Date: 2007-11-07 17:52
The patch adds an XML codec. It implements encoding detection as
specified in http://www.w3.org/TR/2004/REC-xml-20040204/#sec-guessing
and supports externally specified encodings for both encoding and decoding.
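A minimal sketch of the byte-pattern stage of that detection, assuming only the byte order marks and the leading "<?" patterns listed in the referenced section of the XML recommendation. The function name guess_xml_encoding_family and its default argument are hypothetical and are not taken from the patch.

```python
import codecs

# Byte order marks, checked in order. The four-byte UTF-32 marks must come
# before the two-byte UTF-16 marks, because BOM_UTF32_LE starts with BOM_UTF16_LE.
_BOMS = [
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
]

# Without a BOM, the first characters "<?" (or "<" for UTF-32) reveal the
# code unit width and byte order.
_NO_BOM = [
    (b"\x00\x00\x00<", "utf-32-be"),
    (b"<\x00\x00\x00", "utf-32-le"),
    (b"\x00<\x00?", "utf-16-be"),
    (b"<\x00?\x00", "utf-16-le"),
]


def guess_xml_encoding_family(data, default="utf-8"):
    """Guess the encoding family of an XML byte string from its first bytes.

    Hypothetical helper: it does not read the encoding declaration, so an
    ASCII-compatible document simply yields the default.
    """
    for prefix, name in _BOMS + _NO_BOM:
        if data.startswith(prefix):
            return name
    return default
```

An externally specified encoding (for example from an HTTP header) would take precedence over this guess, which is why the sketch is kept separate from any decoding step.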
msg57213 - Author: Guido van Rossum (gvanrossum) (Python committer) Date: 2007-11-07 17:53
I think it's good to add this; I don't have time to review though.
msg57221 - Author: Marc-Andre Lemburg (lemburg) (Python committer) Date: 2007-11-07 19:43
Nice codec!
The only nit I have is the name: "xml" isn't intuitive enough. I had to
read the code to figure out what the codec actually does. 
"xml" used as an encoding name usually refers to having Unicode text converted to
ASCII with XML entity escapes for all non-ASCII characters.
How about "xml-auto-detect" or something along those lines?
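For comparison, the conversion described here (Unicode text to ASCII with character references for everything outside ASCII) is already available through the standard "xmlcharrefreplace" error handler, with no codec named "xml" involved. A short illustration:

```python
import html

text = "Dörwald"

# Non-ASCII characters become numeric character references.
escaped = text.encode("ascii", errors="xmlcharrefreplace")
print(escaped)

# html.unescape() resolves the references again.
restored = html.unescape(escaped.decode("ascii"))
print(restored == text)
```

Note that this handler only replaces unencodable characters; it does not escape "<", ">" or "&".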
msg57222 - Author: Walter Dörwald (doerwalter) (Python committer) Date: 2007-11-07 21:42
"xml-auto-detect" sounds OK to me; it even makes sense for the encoder,
because it normally detects the encoding to use for writing from the XML
declaration.
We could put "xml-auto-detect" into the alias mapping and keep xml as
the module name.
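As a rough illustration of the alias mapping referred to here: in current CPython the table is the dictionary encodings.aliases.aliases, which maps normalized alias names to codec module names. The commented-out entry is hypothetical and only shows what the suggestion would have amounted to.

```python
import codecs
from encodings.aliases import aliases

# An existing entry: the alias "utf8" points at the module encodings.utf_8.
print(aliases["utf8"])

# codecs.lookup() normalizes the requested name and consults the alias table.
print(codecs.lookup("UTF8").name)

# Hypothetical entry for the proposal (never part of CPython):
# aliases["xml_auto_detect"] = "xml"
```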
But I noticed I have to rewrap a lot of lines before I check it in.
msg57224 - Author: Marc-Andre Lemburg (lemburg) (Python committer) Date: 2007-11-07 21:54
Leaving the module name as "xml" would remove that name from the
namespace of possible encodings.
"xml" as encoding name is problematic, as many people regard writing
data in XML as "encoding the data in XML".
I'd simply not use it at all, not even for a codec that converts between
Unicode and ASCII+XML entities.
msg57280 - Author: Walter Dörwald (doerwalter) (Python committer) Date: 2007-11-08 21:25
OK, I've changed the name of the codec to xml_auto_detect and added
support for EBCDIC.
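How the patch recognizes EBCDIC is not shown in this thread. One plausible check, following the XML recommendation's table of leading byte patterns, is to compare the first four bytes against "<?xm" encoded in an EBCDIC code page. The sketch below assumes cp037; the helper name looks_like_ebcdic_xml is hypothetical.

```python
# "<?xm" in EBCDIC (code page 037) is the byte sequence 4C 6F A7 94.
EBCDIC_XML_DECL_START = "<?xm".encode("cp037")


def looks_like_ebcdic_xml(data):
    """Return True if the byte string starts like an EBCDIC XML declaration."""
    return data.startswith(EBCDIC_XML_DECL_START)
```

A positive result only narrows the input down to the EBCDIC family; the exact code page still has to be read from the encoding declaration.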
msg57281 - Author: Marc-Andre Lemburg (lemburg) (Python committer) Date: 2007-11-08 21:37
Thanks, Walter!
msg63696 - Author: Sean Reifschneider (jafo) (Python committer) Date: 2008-03-17 17:52
Marc-Andre: Is this good to be committed, or does it need to be reviewed
further?
msg63703 - Author: Walter Dörwald (doerwalter) (Python committer) Date: 2008-03-17 18:14
There was resistance in python-dev against this patch (see the thread at
http://mail.python.org/pipermail/python-dev/2007-November/075138.html),
so this issue should probably be closed as rejected.
However, there was consensus that a detect_xml_encoding() function might
be useful.
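The thread names detect_xml_encoding() but does not specify it. The following is only a sketch of what such a function could look like for ASCII-compatible input: the signature, the default value, and the regular expression are assumptions, and UTF-16, UTF-32 and EBCDIC input are not handled.

```python
import codecs
import re

# Matches an XML declaration at the start of the data and captures the value
# of its encoding pseudo-attribute (group 2). [^>]*? cannot run past "?>".
_ENCODING_RE = re.compile(
    rb'''<\?xml\s[^>]*?\bencoding\s*=\s*(["'])([A-Za-z][A-Za-z0-9._-]*)\1'''
)


def detect_xml_encoding(data, default="utf-8"):
    """Return the declared encoding of an XML byte string, lowercased.

    Hypothetical signature. Falls back to `default` when there is no
    declaration or the declaration names no encoding.
    """
    # A UTF-8 byte order mark may precede the declaration.
    if data.startswith(codecs.BOM_UTF8):
        data = data[len(codecs.BOM_UTF8):]
    match = _ENCODING_RE.match(data)
    if match is None:
        return default
    return match.group(2).decode("ascii").lower()
```

A caller could pass the result straight to bytes.decode(), for example data.decode(detect_xml_encoding(data)), which keeps detection separate from the codec machinery that drew the objections.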
History
Date                 User        Action  Args
2022-04-11 14:56:28  admin       set     github: 45740
2008-03-18 15:14:31  jafo        set     status: open -> closed; resolution: rejected
2008-03-17 18:14:25  doerwalter  set     messages: + msg63703
2008-03-17 17:52:30  jafo        set     priority: normal; assignee: lemburg; messages: + msg63696; nosy: + jafo
2007-11-08 21:37:27  lemburg     set     messages: + msg57281
2007-11-08 21:25:53  doerwalter  set     files: + diff2.txt; messages: + msg57280
2007-11-07 21:59:56  gvanrossum  set     nosy: - gvanrossum
2007-11-07 21:54:18  lemburg     set     messages: + msg57224
2007-11-07 21:42:20  doerwalter  set     messages: + msg57222
2007-11-07 19:43:06  lemburg     set     nosy: + lemburg; messages: + msg57221
2007-11-07 17:53:57  gvanrossum  set     nosy: + gvanrossum; messages: + msg57213
2007-11-07 17:52:18  doerwalter  create