
This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

classification
Title: Urllib/Urlopen IncompleteRead with HTTP header with new line characters
Type: behavior Stage:
Components: Library (Lib) Versions: Python 2.7
process
Status: closed Resolution: duplicate
Dependencies: Superseder: httplib fails to handle semivalid HTTP headers
View: 24363
Assigned To: Nosy List: martin.panter, rugk
Priority: normal Keywords:

Created on 2016-06-11 14:04 by rugk, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Messages (3)
msg268212 - (view) Author: (rugk) Date: 2016-06-11 14:03
Test file: https://gist.github.com/rugk/3ea35d04d66c2295e02d0b6cb6d822a2
Python version: 2.7.5+
Issue description: When urllib receives an HTTP header that contains line breaks (newline characters), reading the response fails with the following error:
```
Traceback (most recent call last):
  File "./downloadtest.py", line 17, in <module>
    respdata = resp.read()
  File "/usr/lib/python2.7/socket.py", line 351, in read
    data = self._sock.recv(rbufsize)
  File "/usr/lib/python2.7/httplib.py", line 543, in read
    return self._read_chunked(amt)
  File "/usr/lib/python2.7/httplib.py", line 597, in _read_chunked
    raise IncompleteRead(''.join(value))
httplib.IncompleteRead: IncompleteRead(0 bytes read)
```
Compare the results with curl...
# Broken version
## curl
```
$ curl -i https://rugk.dedyn.io/pythontest/bug
HTTP/1.1 200 OK
Server: nginx
Date: Sat, 11 Jun 2016 13:34:36 GMT
Content-Type: text/plain
Transfer-Encoding: chunked
Connection: keep-alive
Strict-Transport-Security: max-age=15768000; includeSubDomains; preload
Public-Key-Pins: 
pin-sha256="306cc4Cc2py0x48ZiX2G5vt5OxF9afmouqccrFqb8Jc=";
pin-sha256="dWkVtg0EuckExnceVFvu3tuEApEygbxr2FPTlpHAUrQ=";
pin-sha256="DjjVxb2/6kxfX8qyP2TE/j8B0tOB60MhTTvJdNsFPaU=";
max-age=5184000; includeSubDomains;
report-uri="https://rugkdyndns.report-uri.io/r/default/hpkp/enforce"
Bug: 
```
## python
```
$ ./downloadtest.py https://rugk.dedyn.io/pythontest/bug
Accessing https://rugk.dedyn.io/pythontest/bug...
Traceback (most recent call last):
  File "./downloadtest.py", line 17, in <module>
    respdata = resp.read()
  File "/usr/lib/python2.7/socket.py", line 351, in read
    data = self._sock.recv(rbufsize)
  File "/usr/lib/python2.7/httplib.py", line 543, in read
    return self._read_chunked(amt)
  File "/usr/lib/python2.7/httplib.py", line 597, in _read_chunked
    raise IncompleteRead(''.join(value))
httplib.IncompleteRead: IncompleteRead(0 bytes read)
```
# Working version
## curl
```
$ curl -i https://rugk.dedyn.io/pythontest/works
HTTP/1.1 200 OK
Server: nginx
Date: Sat, 11 Jun 2016 13:46:09 GMT
Content-Type: text/plain
Transfer-Encoding: chunked
Connection: keep-alive
Strict-Transport-Security: max-age=15768000; includeSubDomains; preload
Public-Key-Pins: pin-sha256="306cc4Cc2py0x48ZiX2G5vt5OxF9afmouqccrFqb8Jc="; pin-sha256="dWkVtg0EuckExnceVFvu3tuEApEygbxr2FPTlpHAUrQ="; pin-sha256="DjjVxb2/6kxfX8qyP2TE/j8B0tOB60MhTTvJdNsFPaU="; max-age=5184000; includeSubDomains; report-uri="https://rugkdyndns.report-uri.io/r/default/hpkp/enforce"
Bug: 
```
## python
```
$ ./downloadtest.py https://rugk.dedyn.io/pythontest/works
Accessing https://rugk.dedyn.io/pythontest/works...
RAW:
Bug: 
Decoded:
Bug:
```
You can also test it with HTTP URLs and get the same result.
In common browsers, both requests work...
I cannot guarantee that the test server will stay available...
msg268215 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2016-06-11 15:04
HTTP header fields are not supposed to have line breaks unless followed by a space or tab. So the server is actually providing a faulty response.
However, Python could do better at handling this case. There is already a bug open for this: Issue 24363.
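To make the rule concrete, here is a minimal Python 3 sketch that flags header lines which are neither a `name: value` field nor a continuation line starting with a space or tab. It is not httplib's actual code; the function name `find_invalid_header_lines` and the abbreviated `SAMPLE_HEAD` are assumptions made for illustration, with the sample lines taken from this server's response.

```python
import re

# Field names may only use HTTP "token" characters (RFC 7230, section 3.2.6).
_TOKEN = re.compile(rb"[!#$%&'*+\-.^_`|~0-9A-Za-z]+")


def find_invalid_header_lines(response_head: bytes) -> list:
    """Return header lines that are neither a field nor a valid continuation.

    Hypothetical helper for illustration; `response_head` is the status line
    plus the header section, up to and including the blank line.
    """
    invalid = []
    # Split on LF and drop a trailing CR so bare-LF line breaks are seen too.
    lines = [line.rstrip(b"\r") for line in response_head.split(b"\n")]
    for line in lines[1:]:  # lines[0] is the status line
        if not line:
            break  # a blank line ends the header section
        if line[:1] in (b" ", b"\t"):
            continue  # folded continuation of the previous field
        name, sep, _value = line.partition(b":")
        if not sep or not _TOKEN.fullmatch(name):
            invalid.append(line)
    return invalid


# Abbreviated excerpt of the response reported in this issue.
SAMPLE_HEAD = (
    b'HTTP/1.1 200 OK\r\n'
    b'Transfer-Encoding: chunked\r\n'
    b'Public-Key-Pins: \n'
    b'max-age=5184000; includeSubDomains;\n'
    b'report-uri="https://rugkdyndns.report-uri.io/r/default/hpkp/enforce"\r\n'
    b'\r\n'
)

for bad_line in find_invalid_header_lines(SAMPLE_HEAD):
    print(bad_line)
```

Note that the `report-uri` line contains a colon (inside the URL), so a check for "has a colon" alone would wrongly accept it; validating the field name against the token characters avoids that.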
For the record, the full server response I get is:
'HTTP/1.1 200 OK\r\n'
'Server: nginx\r\n'
'Date: Sat, 11 Jun 2016 14:47:19 GMT\r\n'
'Content-Type: text/plain\r\n'
'Transfer-Encoding: chunked\r\n'
'Connection: close\r\n'
'Vary: Accept-Encoding\r\n'
'Strict-Transport-Security: max-age=15768000; includeSubDomains; preload\r\n'
'Public-Key-Pins: \n'
'pin-sha256="306cc4Cc2py0x48ZiX2G5vt5OxF9afmouqccrFqb8Jc=";\n'
'pin-sha256="dWkVtg0EuckExnceVFvu3tuEApEygbxr2FPTlpHAUrQ=";\n'
'pin-sha256="DjjVxb2/6kxfX8qyP2TE/j8B0tOB60MhTTvJdNsFPaU=";\n'
'max-age=5184000; includeSubDomains;\n'
'report-uri="https://rugkdyndns.report-uri.io/r/default/hpkp/enforce"\r\n'
'\r\n'
'28\r\n'
'Bug: https://bugs.python.org/issue27296\n'
'\r\n'
'0\r\n'
'\r\n'
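The body itself is well-formed chunked data: `28` is the hexadecimal chunk size (40 bytes), followed by the terminating zero-length chunk. The Python 3 sketch below decodes it with a simplified, hypothetical `decode_chunked` helper (no trailer support, not httplib's implementation). It also shows what happens if a reader starts at the wrong offset: assuming the header parser stops at the first malformed line, the chunked reader would see a `pin-sha256=...` line where it expects a hex size, which is one plausible explanation for `IncompleteRead(0 bytes read)`.

```python
def decode_chunked(body: bytes) -> bytes:
    """Decode a chunked transfer-encoded body (simplified sketch)."""
    out = bytearray()
    pos = 0
    while True:
        eol = body.index(b"\r\n", pos)
        # Ignore optional chunk extensions after ';'.
        size_field = body[pos:eol].split(b";", 1)[0]
        # Raises ValueError if the field is not a hexadecimal number.
        size = int(size_field.decode("ascii"), 16)
        pos = eol + 2
        if size == 0:
            break  # last chunk
        out += body[pos:pos + size]
        pos += size + 2  # skip the chunk data and its trailing CRLF
    return bytes(out)


# Body exactly as it appears in the response above.
GOOD_BODY = b'28\r\nBug: https://bugs.python.org/issue27296\n\r\n0\r\n\r\n'
print(decode_chunked(GOOD_BODY))

# What a reader sees if it starts at the first malformed header line.
MISALIGNED = (
    b'pin-sha256="306cc4Cc2py0x48ZiX2G5vt5OxF9afmouqccrFqb8Jc=";\n'
    b'max-age=5184000; includeSubDomains;\n'
    b'report-uri="https://rugkdyndns.report-uri.io/r/default/hpkp/enforce"\r\n'
    b'\r\n' + GOOD_BODY
)
try:
    decode_chunked(MISALIGNED)
except ValueError as exc:
    print("not a chunk size:", exc)
```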
msg268237 - (view) Author: (rugk) Date: 2016-06-11 17:55
Yeah, it might not be standard or best practise to send such headers, but at least all major browsers and curl do not complain about this. Major browsers even treat this HPKP header as intended.
But instead of showing complex error messages, Python could just ignore the malformed header...
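As a rough illustration of what "ignore the malformed header" could mean, the following Python 3 sketch keeps every well-formed field and sets aside lines it cannot interpret instead of aborting. The name `parse_headers_leniently` and its return shape are assumptions for this example only; this is not how httplib or the fix for Issue 24363 is implemented.

```python
import re

# Field names may only use HTTP "token" characters (RFC 7230, section 3.2.6).
_TOKEN = re.compile(rb"[!#$%&'*+\-.^_`|~0-9A-Za-z]+")


def parse_headers_leniently(header_block: bytes):
    """Parse header lines (without the status line), skipping malformed ones.

    Returns (headers, skipped): a list of (name, value) pairs and the list
    of lines that were ignored. Hypothetical helper for illustration.
    """
    headers = []
    skipped = []
    for raw_line in header_block.split(b"\n"):
        line = raw_line.rstrip(b"\r")
        if not line:
            break  # blank line: end of the header section
        if line[:1] in (b" ", b"\t") and headers:
            # Valid folding: append to the previous field's value.
            name, value = headers[-1]
            headers[-1] = (name, (value + b" " + line.strip()).strip())
            continue
        name, sep, value = line.partition(b":")
        if sep and _TOKEN.fullmatch(name):
            headers.append((name, value.strip()))
        else:
            skipped.append(line)  # malformed: ignore rather than fail
    return headers, skipped


# Abbreviated excerpt of the header section reported in this issue.
BLOCK = (
    b'Transfer-Encoding: chunked\r\n'
    b'Public-Key-Pins: \n'
    b'max-age=5184000; includeSubDomains;\n'
    b'report-uri="https://rugkdyndns.report-uri.io/r/default/hpkp/enforce"\r\n'
    b'\r\n'
)
headers, skipped = parse_headers_leniently(BLOCK)
print(headers)
print(len(skipped), "line(s) ignored")
```

With this approach `Transfer-Encoding: chunked` is still recognised and the body offset stays correct, at the cost of losing the value of the broken `Public-Key-Pins` field.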
History
| Date | User | Action | Args |
|---|---|---|---|
| 2022-04-11 14:58:32 | admin | set | github: 71483 |
| 2016-06-11 17:55:03 | rugk | set | messages: + msg268237 |
| 2016-06-11 15:04:09 | martin.panter | set | status: open -> closed; nosy: + martin.panter; messages: + msg268215; superseder: httplib fails to handle semivalid HTTP headers; resolution: duplicate |
| 2016-06-11 14:04:01 | rugk | create | |
