Message 255440 - Python tracker


Author	martin.panter
Recipients	barry, demian.brecht, ezio.melotti, gregory.p.smith, martin.panter, r.david.murray, scharron, serhiy.storchaka
Date	2015-11-26.23:19:02
Message-id	<1448579943.5.0.317912844783.issue22233@psf.upfronthosting.co.za>

Content

For the record, this is a simplified version of the original scenario, showing the low-level HTTP protocol:
>>> request = (
... b"GET /%C4%85 HTTP/1.1\r\n"
... b"Host: graph.facebook.com\r\n"
... b"\r\n"
... )
>>> s = create_connection(("graph.facebook.com", HTTPS_PORT))
>>> with ssl.wrap_socket(s) as s:
... s.sendall(request)
... response = s.recv(3000)
... 
50
>>> pprint(response.splitlines(keepends=True))
[b'HTTP/1.1 404 Not Found\r\n',
 b'WWW-Authenticate: OAuth "Facebook Platform" "not_found" "(#803) Some of the '
 b'aliases you requested do not exist: \xc4\x85"\r\n',
 b'Access-Control-Allow-Origin: *\r\n',
 b'Content-Type: text/javascript; charset=UTF-8\r\n',
 b'X-FB-Trace-ID: H9yxnVcQFuA\r\n',
 b'X-FB-Rev: 2063232\r\n',
 b'Pragma: no-cache\r\n',
 b'Cache-Control: no-store\r\n',
 b'Facebook-API-Version: v2.0\r\n',
 b'Expires: Sat, 01 Jan 2000 00:00:00 GMT\r\n',
 b'X-FB-Debug: 07ouxMl1Z439Ke/YzHSjXx3o9PcpGeZBFS7yrGwTzaaudrZWe5Ef8Z96oSo2dINp'
 b'3GR4q78+1oHDX2pUF2ky1A==\r\n',
 b'Date: Thu, 26 Nov 2015 23:03:47 GMT\r\n',
 b'Connection: keep-alive\r\n',
 b'Content-Length: 147\r\n',
 b'\r\n',
 b'{"error":{"message":"(#803) Some of the aliases you requested do not exist: '
 b'\\u0105","type":"OAuthException","code":803,"fbtrace_id":"H9yxnVcQFuA"}}']
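A minimal sketch of why a header like the WWW-Authenticate line above can cause trouble. It assumes the raw header bytes are decoded as ISO-8859-1 before being handed to a str-based parser, which is my reading of how http.client prepares headers. The header value here is abridged for illustration and is not the exact Facebook response.

```python
# One header line as it arrives on the wire; \xc4\x85 is UTF-8 for U+0105.
raw = b'WWW-Authenticate: OAuth "not_found" "\xc4\x85"\r\n'

# bytes.splitlines() only breaks on \n, \r and \r\n, so the header stays whole.
as_bytes = raw.splitlines(keepends=True)

# Assumption: the bytes are decoded as ISO-8859-1 before header parsing.
# Byte 0x85 then becomes U+0085 (NEXT LINE), which str.splitlines() treats
# as a line boundary, cutting the header in two.
as_text = raw.decode("iso-8859-1").splitlines(keepends=True)

print(len(as_bytes), len(as_text))
```

The bytes version yields a single line, while the decoded version yields two, with the split falling inside the quoted header value.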
In my mind, the simplest way forward would be to change the "email" module to only parse lines using the "universal newlines" algorithm. The /Lib/email/feedparser.py module should use StringIO(s, newline="").readlines() rather than s.splitlines(keepends=True). That would mean all email parsing behaviour would change; for instance, given the following message:
>>> m = email.message_from_string(
... "WWW-Authenticate: abc\x85<body or header?>\r\n"
... "\r\n"
... )
instead of the current behaviour:
>>> m.items()
[('WWW-Authenticate', 'abc\x85')]
>>> m.get_payload()
'<body or header?>\r\n\r\n'
it would change to:
>>> m.items()
[('WWW-Authenticate', 'abc\x85<body or header?>')]
>>> m.get_payload()
''
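The difference between the two line-splitting approaches can be seen without involving the email module at all. This is only a sketch of the splitting step, not of the feedparser change itself; the variable names are made up for the illustration.

```python
from io import StringIO

text = "WWW-Authenticate: abc\x85<body or header?>\r\n\r\n"

# str.splitlines() treats U+0085 (NEXT LINE) as a line boundary,
# along with several other Unicode separators.
by_splitlines = text.splitlines(keepends=True)

# StringIO with newline="" recognises only \n, \r and \r\n as line
# endings and returns them untranslated.
by_universal = StringIO(text, newline="").readlines()

print(by_splitlines)
print(by_universal)
```

With splitlines() the header is cut after \x85 and the remainder looks like a separate line; with universal newlines the header stays on one line, followed by the blank line that ends the header block.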
