This issue tracker has been migrated to GitHub and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

classification
Title: xml.sax.saxutils.escape does not escape \x00
Type: behavior Stage:
Components: XML Versions: Python 3.1, Python 3.2, Python 2.7, Python 2.6
process
Status: closed Resolution: wont fix
Dependencies: Superseder:
Assigned To: Nosy List: animus, loewis
Priority: normal Keywords:

Created on 2011-12-22 14:49 by animus, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Messages (6)
msg150096 - (view) Author: Alexey Gorshkov (animus) Date: 2011-12-22 14:49
The call xml.sax.saxutils.escape('\x00qweqwe<') returns '\x00qweqwe&lt;'.
\x00 is not escaped to &#0;.
Is this correct behavior?
This affects tools like xmpppy, which sends \x00 unencoded, leading to an XMPP error.
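A minimal sketch reproducing the report, assuming a Python 3 interpreter. The variable name `result` is only illustrative.

```python
from xml.sax.saxutils import escape

# escape() replaces only '&', '<' and '>' by default;
# the NUL character passes through unchanged.
result = escape('\x00qweqwe<')
print(repr(result))  # '\x00qweqwe&lt;'
```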
msg150097 - (view) Author: Alexey Gorshkov (animus) Date: 2011-12-22 14:55
Sorry, xmpppy uses its own escape method, but anyway... :)
msg150136 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2011-12-23 08:36
This is correct behavior. \x00 is not supported in XML: not in raw form, and not in escaped form. To transmit binary data in XML, use base64.
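One way the base64 suggestion might look in practice is sketched below. The element name `data` and the variable names are hypothetical, and xml.etree.ElementTree is used only for illustration; it is not mentioned in this issue.

```python
import base64
import xml.etree.ElementTree as ET

# Binary payload containing a NUL byte, which XML cannot carry directly.
payload = b'\x00qweqwe<'

# Encode to base64 so only XML-safe ASCII characters end up in the document.
elem = ET.Element('data')  # hypothetical element name
elem.text = base64.b64encode(payload).decode('ascii')
xml_text = ET.tostring(elem, encoding='unicode')

# The receiver parses the document and decodes the text back to bytes.
decoded = base64.b64decode(ET.fromstring(xml_text).text)
print(decoded == payload)
```

The receiving side has to know that the element content is base64-encoded; XML itself does not mark it as such.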
msg163291 - (view) Author: Alexey Gorshkov (animus) Date: 2012-06-20 19:03
>This is correct behavior. \x00 is not supported in XML:
> not in raw form, and not in escaped form
The last sentence of the fourth paragraph of section 1.3 of the XML 1.1 specification says the following:
======
Due to potential problems with APIs,
#x0 is still forbidden both directly and as a character reference.
======
And the second sentence of paragraph 2 in the subsection 'Validity constraint: Notation Declared' of section 4.2.2 says the following:
======
The characters to be escaped are the control characters #x0 to #x1F and #x7F (most of which cannot appear in XML), space #x20, the delimiters '<' #x3C, '>' #x3E and '"' #x22, the unwise characters '{' #x7B, '}' #x7D, '|' #x7C, '\' #x5C, '^' #x5E and '`' #x60, as well as all characters above #x7F.
======
(XML 1.1) http://www.w3.org/TR/2006/REC-xml11-20060816/
(XML 1.0) http://www.w3.org/TR/2008/REC-xml-20081126/
msg163292 - (view) Author: Alexey Gorshkov (animus) Date: 2012-06-20 19:32
What I am trying to say is: if those characters are forbidden, then maybe they need to be escaped rather than ignored?
msg163294 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2012-06-20 19:56
The characters are forbidden both in raw form *and* in escaped form. So even if they get escaped, they *still* will lead to errors. So there is no point in escaping them.
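A small sketch of this point, assuming the expat-based parser behind xml.etree.ElementTree. The helper name `parses` is hypothetical and is not part of any library.

```python
import xml.etree.ElementTree as ET

def parses(document):
    """Return True if the parser accepts the document, False on a parse error."""
    try:
        ET.fromstring(document)
    except ET.ParseError:
        return False
    return True

raw_nul = parses(b'<a>\x00</a>')      # NUL in raw form
escaped_nul = parses(b'<a>&#0;</a>')  # NUL as a character reference
escaped_lt = parses(b'<a>&lt;</a>')   # an ordinary escaped character

print(raw_nul, escaped_nul, escaped_lt)  # False False True
```

Both NUL variants are rejected as not well-formed, so escaping \x00 to &#0; would only move the error from one form to the other.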
History
Date                 User       Action  Args
2022-04-11 14:57:24  admin      set     github: 57857
2018-12-29 16:44:51  ned.deily  link    issue35613 superseder
2012-06-20 19:56:12  loewis     set     status: open -> closed; resolution: wont fix; messages: + msg163294
2012-06-20 19:32:12  animus     set     messages: + msg163292
2012-06-20 19:03:32  animus     set     status: closed -> open; messages: + msg163291
2011-12-23 08:36:30  loewis     set     status: open -> closed; nosy: + loewis; messages: + msg150136
2011-12-22 14:55:52  animus     set     messages: + msg150097
2011-12-22 14:51:20  animus     set     components: + XML
2011-12-22 14:49:17  animus     create