Description
Closely related to #97.
Prompted by encode/httpx#1363 (comment)
So, h11 currently has stricter-than-urllib3 rules on header name validation...
>>> import httpx
>>> httpx.get("https://club.huawei.com/")
Traceback (most recent call last):
  ...
httpx.RemoteProtocolError: malformed data
This happens because the response looks like this...
HTTP/1.1 200 OK
Connection: keep-alive
Content-Encoding: gzip
Content-Security-Policy: base-uri
Content-Type: text/html; charset=utf-8
Date: 2020年10月15日 13:19:33 GMT
Server: CloudWAF
Set-Cookie: HWWAFSESID=a74181602debc465809; path=/
Set-Cookie: HWWAFSESTIME=1602767969615; path=/
Set-Cookie: a3ps_2132_saltkey=yCXrVqdR06Nk5u2PrmLgs9eqlGIpQd9FogV2GL6bxGP3HH2XweRXIeCVny%2BrVDpoOYNLphTU9uVN1HP1%2Fav1bvV2Yrafq%2BXdJR%2BVAVPHizU92ISGAest0dKt7%2FIbdulNYXV0aGtleQ%3D%3D; path=/; secure; httponly
Set-Cookie: a3ps_2132_errorinfo=deleted; expires=Thu, 01-Jan-1970 00:00:01 GMT; Max-Age=0; path=/; secure; httponly
Set-Cookie: a3ps_2132_errorcode=deleted; expires=Thu, 01-Jan-1970 00:00:01 GMT; Max-Age=0; path=/; secure; httponly
Set-Cookie: a3ps_2132_auth=deleted; expires=Thu, 01-Jan-1970 00:00:01 GMT; Max-Age=0; path=/; secure; httponly
Set-Cookie: a3ps_2132_lastvisit=1602764373; expires=Sat, 14-Nov-2020 13:19:33 GMT; Max-Age=2592000; path=/; secure; httponly
Set-Cookie: a3ps_2132_lastact=1602767973%09portal.php%09; expires=Fri, 16-Oct-2020 13:19:33 GMT; Max-Age=86400; path=/; secure; httponly
Set-Cookie: a3ps_2132_currentHwLoginUrl=http%3A%2F%2Fcn.club.vmall.com%2F; expires=Thu, 15-Oct-2020 15:19:33 GMT; Max-Age=7200; path=/; secure; httponly
Transfer-Encoding: chunked
X-XSS-Protection: 1; mode=block
banlist-ip: 0
banlist-uri: 0
get-ban-to-cache-result/portal.php: userdata not support
get-ban-to-cache-result62.31.28.214: userdata not support
result-ip: 0
result-uri: 0
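To make the failure concrete, here is a rough sketch of a strict header-name check run against a few of the names from the response above. The regex follows the `token` grammar from RFC 7230; `invalid_header_names` is a hypothetical helper written for this illustration and is not h11's or httpx's actual code. My reading is that the `/` in one of the names is what trips a strict parser, but treat that as an assumption rather than a confirmed diagnosis.

```python
import re

# RFC 7230 "token": one or more tchar characters.
# tchar = "!" / "#" / "$" / "%" / "&" / "'" / "*" / "+" / "-" / "." /
#         "^" / "_" / "`" / "|" / "~" / DIGIT / ALPHA
TOKEN_RE = re.compile(r"[!#$%&'*+\-.^_`|~0-9A-Za-z]+")


def invalid_header_names(names):
    """Return the names that are not valid RFC 7230 tokens (hypothetical helper)."""
    return [name for name in names if not TOKEN_RE.fullmatch(name)]


# A subset of the header names from the response above.
names = [
    "Content-Type",
    "X-XSS-Protection",
    "banlist-ip",
    "get-ban-to-cache-result/portal.php",
    "get-ban-to-cache-result62.31.28.214",
    "result-uri",
]

print(invalid_header_names(names))
# ['get-ban-to-cache-result/portal.php']
```

Only the name containing `/` fails the token check; the one with the embedded IP address is unusual but still made up entirely of valid token characters.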
That's not all that unexpected, since it's simply down to h11 being a wonderfully, thoroughly engineered package that does a great job of following the relevant specs. However, we might(?) want to follow a path of as-lax-as-possible-if-still-parsable for anything that comes in off the wire, while keeping the constraint that everything we send out is always spec-compliant. (?)
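One way the "lax on input, strict on output" split could look is sketched below. Everything here is an assumption for discussion purposes: `parse_incoming_header`, `check_outgoing_header_name`, and the particular lax rule (anything except a colon, whitespace, or control bytes) are hypothetical and not part of h11's API.

```python
import re

# Strict rule for names we send: RFC 7230 token characters only.
_STRICT_NAME = re.compile(rb"[!#$%&'*+\-.^_`|~0-9A-Za-z]+")

# Hypothetical lax rule for names we receive: accept anything that is
# still unambiguous to parse, i.e. no colon, whitespace, or control bytes.
_LAX_NAME = re.compile(rb"[^\x00-\x20\x7f:]+")


def parse_incoming_header(line: bytes):
    """Split one received header line into (name, value), leniently."""
    name, sep, value = line.partition(b":")
    if not sep or not _LAX_NAME.fullmatch(name):
        raise ValueError(f"unparsable header line: {line!r}")
    # Strip optional whitespace around the value.
    return name, value.strip(b" \t")


def check_outgoing_header_name(name: bytes) -> bytes:
    """Reject any header name we are about to send that is not a valid token."""
    if not _STRICT_NAME.fullmatch(name):
        raise ValueError(f"illegal header name: {name!r}")
    return name


line = b"get-ban-to-cache-result/portal.php: userdata not support"
print(parse_incoming_header(line))
# (b'get-ban-to-cache-result/portal.php', b'userdata not support')
```

With this split, the problematic header from the response would be accepted when read from the wire, while `check_outgoing_header_name(b"get-ban-to-cache-result/portal.php")` would still raise `ValueError`. Where exactly to draw the lax boundary is the open question.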
In practice, if httpx is failing to parse responses like this, then at least some subset of users are going to see behaviour that from their perspective is a regression vs. other HTTP tooling.
What are our thoughts here?