Created on 2010-01-17 19:59 by Steven.Hartland, last changed 2022-04-11 14:56 by admin.
| Files | | |
|---|---|---|
| File name | Uploaded | Description |
| xmlrpc_byte_string.patch | vstinner, 2010-01-21 02:15 | |
| xmlrpc_dump_invalid_string-2.7_2.patch | serhiy.storchaka, 2013-05-24 16:36 | Patch for 2.7 |
| Messages (13) | |||
|---|---|---|---|
| msg97972 | Author: Steven Hartland (Steven.Hartland) | Date: 2010-01-17 19:59 |
When SimpleXMLRPCServer returns data that includes strings containing \x00, the string is sent as-is, which produces invalid XML. The expected result is that such data should be treated as binary and base64-encoded. The bug appears to be in the core xmlrpc library, which relies on type(value) to determine the data type; this returns str for a string even if it contains the null character. |
|||
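The report can be reproduced on Python 3 with a minimal sketch like the one below. It assumes that xmlrpc.client.dumps() still performs no check for characters that are illegal in XML, which may differ between Python versions; the helper name round_trips is hypothetical.

```python
import xmlrpc.client
from xml.parsers.expat import ExpatError


def round_trips(value):
    """Return True if value survives dumps() followed by loads()."""
    # dumps() only escapes &, < and >; it does not reject control characters.
    payload = xmlrpc.client.dumps((value,))
    try:
        params, _method = xmlrpc.client.loads(payload)
    except ExpatError:
        # The marshalled document was not well-formed XML.
        return False
    return params[0] == value


print(round_trips("plain text"))
print(round_trips("abc\x00def"))
```

The second call is the case from the report: marshalling succeeds, and the problem only shows up when a parser tries to read the result.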
| msg98095 | Author: STINNER Victor (vstinner) * (Python committer) | Date: 2010-01-21 02:09 |
Marshaller.dump_string() encodes a byte string as <string>...</string> using the escape() function. A byte string can also be encoded in base64 as <base64>...</base64>. This is described in the XML-RPC specification (http://www.xmlrpc.com/spec), but I don't know whether all XML-RPC implementations understand this type. Should we change the default type to base64, or only fall back to base64 if the byte string cannot be encoded in XML? Testing whether a byte string can be encoded in XML can be slow, and setting the default type to base64 may cause compatibility issues :-/ |
|||
| msg98096 | Author: STINNER Victor (vstinner) * (Python committer) | Date: 2010-01-21 02:15 |
Here is an example patch that uses the following test: all(32 <= ord(byte) <= 127 for byte in value). I don't know how much slower the patch is, but at least it doesn't raise an "ExpatError: not well-formed (invalid token): ...". |
|||
| msg98097 | Author: Steven Hartland (Steven.Hartland) | Date: 2010-01-21 02:26 |
One thing that springs to mind: how valid is that test when applied to UTF-8 data? |
|||
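To make the UTF-8 question concrete, here is a Python 3 adaptation of the proposed test. Iterating over bytes yields integers in Python 3, so ord() is dropped. The function name is illustrative and is not taken from the patch.

```python
def is_printable_ascii(data: bytes) -> bool:
    """Python 3 form of: all(32 <= ord(byte) <= 127 for byte in value)."""
    return all(32 <= byte <= 127 for byte in data)


print(is_printable_ascii(b"hello"))                  # plain ASCII passes
print(is_printable_ascii("\u20ac".encode("utf-8")))  # UTF-8 euro sign is rejected
print(is_printable_ascii(b"a\tb"))                   # tab is rejected although XML allows it
```

Under this test, any non-ASCII UTF-8 text, as well as tab, newline and carriage return, would be classified as binary and sent as base64.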
| msg189782 | Author: Mark Lawrence (BreamoreBoy) * | Date: 2013-05-21 20:32 |
Even if the original patch is valid, it will need reworking: xmlrpclib isn't in Python 3, and the code is now in xmlrpc/client. It also looks as if dump_string has been renamed to dump_unicode. |
|||
| msg189801 | Author: Antoine Pitrou (pitrou) * (Python committer) | Date: 2013-05-22 09:29 |
I don't really understand the issue. If you want to pass binary data (rather than unicode text), you should use a Binary object as explained in the docs: http://docs.python.org/2/library/xmlrpclib.html#binary-objects |
|||
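A short sketch of the Binary approach, written against the Python 3 module xmlrpc.client rather than the Python 2 xmlrpclib linked above. The payload value is made up for illustration.

```python
import xmlrpc.client

raw = b"abc\x00def"  # arbitrary bytes, including a NUL

# Wrapping the bytes in Binary makes the marshaller emit <base64>...</base64>.
payload = xmlrpc.client.dumps((xmlrpc.client.Binary(raw),))
params, _method = xmlrpc.client.loads(payload)

print("<base64>" in payload)
print(params[0].data == raw)
```

By default, loads() returns a Binary instance whose data attribute holds the original bytes.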
| msg189803 | Author: Martin v. Löwis (loewis) * (Python committer) | Date: 2013-05-22 11:28 |
The original report really includes two parts:
a) when a string containing \0 is marshalled, ill-formed XML is produced
b) the expected behavior is that base64 is used
IMO: While a) is correct, b) is not. Antoine is correct that xmlrpclib.Binary should be used if you want to transmit binary data. Consequently, an Error should be reported if an attempt is made to produce ill-formed XML. OTOH, ill-formed XML can also be produced when sending a byte string that does not match the encoding declaration. Because of that, I propose to close this by documenting the limitations, rather than changing the code. |
|||
| msg189808 | Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) | Date: 2013-05-22 12:11 |
The limitation is already documented: """However, it’s the caller’s responsibility to ensure that the string is free of characters that aren’t allowed in XML, such as the control characters with ASCII values between 0 and 31 (except, of course, tab, newline and carriage return); failing to do this will result in an XML-RPC request that isn’t well-formed XML. If you have to pass arbitrary bytes via XML-RPC, use the bytes class or the Binary wrapper class described below.""" Here is a patch which forbids creating ill-formed XML. |
|||
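The following is a sketch of the kind of check such a patch could perform; it is not the attached patch. The character class is derived from the Char production of XML 1.0, and the names ensure_xml_safe and _ILLEGAL_XML_CHARS are hypothetical.

```python
import re

# Characters outside the XML 1.0 Char production: C0 controls other than
# tab, newline and carriage return, lone surrogates, U+FFFE and U+FFFF.
_ILLEGAL_XML_CHARS = re.compile(
    r"[\x00-\x08\x0b\x0c\x0e-\x1f\ud800-\udfff\ufffe\uffff]"
)


def ensure_xml_safe(text: str) -> str:
    """Return text unchanged, or raise ValueError if XML cannot carry it."""
    match = _ILLEGAL_XML_CHARS.search(text)
    if match is not None:
        raise ValueError(
            "character %r at position %d is not allowed in XML"
            % (match.group(), match.start())
        )
    return text
```

A marshaller could call a check like this before writing a <string> element, so that the caller gets an error instead of silently producing an ill-formed document.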
| msg189822 | Author: Martin v. Löwis (loewis) * (Python committer) | Date: 2013-05-22 15:02 |
Serhiy: The patch fixes the OP's concern, but not the extended concern about producing ill-formed XML (at least not for 2.7). If the string contains non-UTF-8 data, yet the XML declaration says UTF-8, it's still ill-formed, and not caught by your patch. I wonder whether xmlrpclib.Error would be a better exception than ValueError (although ValueError is also plausible); either way, the case should be documented. |
|||
| msg189831 | Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) | Date: 2013-05-22 18:07 |
Indeed, 2.7 needs more work. Here is a patch for 2.7. UnicodeError (which subclasses ValueError) can be raised implicitly here, which is why I think ValueError is a good exception. I would be very grateful for your help with the documentation. |
|||
| msg189851 | Author: Martin v. Löwis (loewis) * (Python committer) | Date: 2013-05-23 07:02 |
I'm still skeptical that a new exception should be introduced in 2.7.x or 3.3 (might this break existing setups?). I suggest asking the release manager for a decision. But if this is done, then I propose adding the following text to ServerProxy: versionchanged (2.7.6): Sending strings with characters that are ill-formed in XML (e.g. \x00) now raises ValueError. |
|||
| msg189919 | Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) | Date: 2013-05-24 16:36 |
While updating the tests, I found some related errors.
XML-RPC doesn't work in the general case for non-UTF-8 encodings:
>>> import xmlrpclib
>>> xmlrpclib.dumps(('\u20ac',), encoding='iso-8859-1')
'<params>\n<param>\n<value><string>\\u20ac</string></value>\n</param>\n</params>\n'
>>> xmlrpclib.dumps((u'\u20ac',), encoding='iso-8859-1')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.7/xmlrpclib.py", line 1085, in dumps
    data = m.dumps(params)
  File "/usr/lib/python2.7/xmlrpclib.py", line 632, in dumps
    dump(v, write)
  File "/usr/lib/python2.7/xmlrpclib.py", line 654, in __dump
    f(self, value, write)
  File "/usr/lib/python2.7/xmlrpclib.py", line 700, in dump_unicode
    value = value.encode(self.encoding)
UnicodeEncodeError: 'latin-1' codec can't encode character u'\u20ac' in position 0: ordinal not in range(256)
We should use the 'xmlcharrefreplace' error handler.
Non-ASCII strings are passed as Unicode strings (this should be documented).
>>> xmlrpclib.loads(xmlrpclib.dumps(('\xe2\x82\xac',)))
((u'\u20ac',), None)
'\r' and '\r\n' are deserialized as '\n'.
>>> xmlrpclib.loads(xmlrpclib.dumps(('\r',)))
(('\n',), None)
>>> xmlrpclib.loads(xmlrpclib.dumps(('\r\n',)))
(('\n',), None)
|
|||
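For reference, this is what the 'xmlcharrefreplace' error handler does at the codec level. The snippet uses only str.encode() and is shown in Python 3 syntax; it does not reproduce the xmlrpclib change itself.

```python
text = "\u20ac"  # EURO SIGN, which ISO-8859-1 cannot represent

try:
    text.encode("iso-8859-1")  # default handler is 'strict'
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False

# Unencodable characters become numeric character references instead.
encoded = text.encode("iso-8859-1", errors="xmlcharrefreplace")

print(strict_ok)  # False
print(encoded)    # b'&#8364;'
```

An XML parser on the receiving side expands the character reference back to the original character, so the declared encoding no longer limits which characters can be sent.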
| msg407580 | Author: Irit Katriel (iritkatriel) * (Python committer) | Date: 2021-12-03 12:22 |
2.7 is no longer relevant, and it looks like these examples are working now:
>>> xmlrpc.client.dumps(('\u20ac',), encoding='iso-8859-1')
'<params>\n<param>\n<value><string>€</string></value>\n</param>\n</params>\n'
>>> xmlrpc.client.dumps((u'\u20ac',), encoding='iso-8859-1')
'<params>\n<param>\n<value><string>€</string></value>\n</param>\n</params>\n'
There is possibly still a documentation enhancement to make regarding non-ASCII strings. This is what I get now with Serhiy's examples:
>>> xmlrpc.client.loads(xmlrpc.client.dumps(('\xe2\x82\xac',)))
(('â\x82¬',), None)
>>> xmlrpc.client.loads(xmlrpc.client.dumps(('\r',)))
(('\n',), None)
>>> xmlrpc.client.loads(xmlrpc.client.dumps(('\r\n',)))
(('\n',), None)
|
|||
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022-04-11 14:56:56 | admin | set | github: 51976 |
| 2021-12-03 12:22:16 | iritkatriel | set | nosy: + iritkatriel; messages: + msg407580 |
| 2018-09-05 11:39:06 | fredrikhl | set | nosy: + fredrikhl |
| 2017-07-15 09:44:29 | Alex Corcoles | set | versions: + Python 3.5, Python 3.6, Python 3.7 |
| 2017-07-15 09:44:14 | Alex Corcoles | set | nosy: + Alex Corcoles |
| 2017-07-12 16:34:07 | serhiy.storchaka | link | issue30909 superseder |
| 2016-01-20 10:20:00 | serhiy.storchaka | link | issue10066 superseder |
| 2014-02-03 18:27:19 | BreamoreBoy | set | nosy: - BreamoreBoy |
| 2013-05-25 15:48:15 | serhiy.storchaka | set | nosy: + effbot |
| 2013-05-24 16:44:00 | serhiy.storchaka | set | files: - xmlrpc_dump_invalid_string-2.7.patch |
| 2013-05-24 16:43:17 | serhiy.storchaka | set | files: - xmlrpc_dump_invalid_string.patch |
| 2013-05-24 16:36:45 | serhiy.storchaka | set | files: + xmlrpc_dump_invalid_string-2.7_2.patch; messages: + msg189919 |
| 2013-05-23 07:02:48 | loewis | set | messages: + msg189851 |
| 2013-05-22 18:07:29 | serhiy.storchaka | set | files: + xmlrpc_dump_invalid_string-2.7.patch; messages: + msg189831 |
| 2013-05-22 15:02:59 | loewis | set | messages: + msg189822 |
| 2013-05-22 12:11:04 | serhiy.storchaka | set | files: + xmlrpc_dump_invalid_string.patch; versions: + Python 2.7, Python 3.3, Python 3.4, - Python 2.6; nosy: + serhiy.storchaka; messages: + msg189808; stage: test needed -> patch review |
| 2013-05-22 11:28:09 | loewis | set | messages: + msg189803 |
| 2013-05-22 09:29:48 | pitrou | set | nosy: + pitrou; messages: + msg189801 |
| 2013-05-21 20:32:52 | BreamoreBoy | set | nosy: + BreamoreBoy; messages: + msg189782 |
| 2010-01-21 02:26:22 | Steven.Hartland | set | messages: + msg98097 |
| 2010-01-21 02:15:02 | vstinner | set | files: + xmlrpc_byte_string.patch; keywords: + patch; messages: + msg98096 |
| 2010-01-21 02:09:11 | vstinner | set | nosy: + vstinner; messages: + msg98095 |
| 2010-01-17 20:17:37 | brian.curtin | set | priority: normal; nosy: + loewis; stage: test needed |
| 2010-01-17 19:59:27 | Steven.Hartland | create | |