
This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

classification
Title: SSL match_hostname fails for internationalized domain names
Type: enhancement Stage: resolved
Components: asyncio, SSL Versions: Python 3.8, Python 3.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: christian.heimes Nosy List: Socob, abracadaber, alex, christian.heimes, dstufft, janssen, kedare, ned.deily, njs, tialaramex, wumpus, yselivanov
Priority: deferred blocker Keywords: patch

Created on 2016-10-11 08:02 by abracadaber, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 3010 closed njs, 2017-08-06 21:44
PR 5128 merged christian.heimes, 2018-01-07 15:41
PR 5395 merged christian.heimes, 2018-01-28 20:20
PR 5843 merged miss-islington, 2018-02-24 01:35
Messages (21)
msg278466 - (view) Author: Anton Sychugov (abracadaber) Date: 2016-10-11 08:02
In accordance with http://tools.ietf.org/html/rfc6125#section-6.4.2:
"If the DNS domain name portion of a reference identifier is an internationalized domain name, then an implementation MUST convert any U-labels [IDNA-DEFS] in the domain name to A-labels before checking the domain name."
The question is: where in the Python stdlib should the domain name be converted from U-label to A-label? Should it be in ssl._dnsname_match, e.g.:
...
hostname = hostname.encode('idna').decode('utf-8')
...
Or should it be at ssl._dnsname_match caller level?
I found that the error appears after using ssl.SSLContext.wrap_bio, which in turn uses the internal newPySSLSocket, which in turn always decodes server_hostname through:
PySSLSocket *self;
...
PyObject *hostname = PyUnicode_Decode(server_hostname, strlen(server_hostname), "idna", "strict");
...
self->server_hostname = hostname;
In this way, SSLSocket always contains a U-label in its server_hostname field, and ssl._dnsname_match fails with "ssl.CertificateError: hostname ... doesn't match either of ..."
I don't understand where the bug is, or whether it is a bug at all.
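The U-label to A-label conversion asked about above can be tried directly with the stdlib "idna" codec. This is a minimal sketch only: it assumes the IDNA 2003 behaviour of that codec and uses the locälhost name that appears later in this thread.

```python
# U-label (Unicode form), as a caller would typically pass it.
u_label = "locälhost"

# The stdlib "idna" codec (IDNA 2003) converts non-ASCII labels to A-labels.
a_label = u_label.encode("idna").decode("ascii")
print(a_label)

# ASCII-only names pass through the codec unchanged.
assert "localhost".encode("idna") == b"localhost"

# Decoding the A-label gives back the U-label for this particular name.
assert a_label.encode("ascii").decode("idna") == u_label
```

Decoding with "ascii" rather than "utf-8" makes it explicit that an A-label is plain ASCII.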
msg278483 - (view) Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2016-10-11 13:20
Thanks for bringing this to my attention. I can confirm that the code is broken. Furthermore, there are no IDN tests for server_hostname.
* server_hostname must be an IDN U-label (locälhost)
* SSL handshake correctly converts and sends TLS SNI as IDN A-label (xn--loclhost-2za)
* getpeercert() returns DNS SAN as IDN A-label. It's less than ideal but required.
* the serverhostname_callback is called with IDN U-label
* match_hostname() is called with IDN U-label
The bug is clearly in match_hostname(). The function fails to convert the hostname U-label to A-label before it compares the certificate.
I have a rough draft of a patch here https://github.com/tiran/cpython/tree/issue28414_idna_verify
By the way IDNA support in Python is broken in general, #17305. We still don't support the latest IDNA standard from 2008 (!). IDNA 2003 is not compatible with German, Greek, Farsi and Sinhalese domains, http://unicode.org/faq/idn.html.
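The mismatch described in the list above can be reproduced without a network connection. The sketch below is an illustration, not the ssl module's code: the certificate dict is hypothetical and only mimics the shape returned by getpeercert(), and the comparison is a plain equality check with no wildcard handling.

```python
# Hypothetical certificate dict in the shape returned by getpeercert().
cert = {"subjectAltName": (("DNS", "xn--loclhost-2za"),)}

# U-label, as match_hostname() received it before the fix.
server_hostname = "locälhost"

san_names = [value for kind, value in cert["subjectAltName"] if kind == "DNS"]

# Comparing the U-label against the A-label SAN entries never matches.
naive_match = server_hostname in san_names

# Converting the hostname to an A-label first makes the comparison work.
a_label = server_hostname.encode("idna").decode("ascii")
converted_match = a_label in san_names

print(naive_match, converted_match)
```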
msg278488 - (view) Author: Anton Sychugov (abracadaber) Date: 2016-10-11 13:42
Yes, I misspelled, match_hostname() fails with ssl.CertificateError.
msg278519 - (view) Author: Anton Sychugov (abracadaber) Date: 2016-10-12 08:07
Christian, thanks a lot for your comment and for the patch you provided. It is much clearer now.
I'll be watching for #17305.
msg279165 - (view) Author: Yury Selivanov (yselivanov) * (Python committer) Date: 2016-10-21 21:55
Christian, what's the status on this one?
msg279920 - (view) Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2016-11-02 11:02
It's a big, complicated mess. I can't implement IDN support correctly because Python lacks UTS#46 and IDNA2008 support. I just found out that IDNA 2008 is not enough because it does not provide a case mapping. Lack of case mapping broke my fix for curl CVE-2016-8625.
At the moment IDN support is broken in a sane way: it just doesn't work and fails.
A partial fix will introduce security issues. http://unicode.org/reports/tr46/#Processing lists "www.sparkasse-gießen.de" as a critical example. It's the domain of a German savings and loan bank.
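To see why this domain is a critical example, one can encode it with the stdlib "idna" codec. This is a sketch under the assumption that the codec applies IDNA 2003 nameprep, which maps "ß" to "ss". IDNA 2008 treats "ß" as a valid character in its own right, so the two standards can resolve the same text to different hosts.

```python
name = "www.sparkasse-gießen.de"

# IDNA 2003 nameprep folds "ß" to "ss", so the result is a plain ASCII
# name with no "xn--" label at all.
encoded = name.encode("idna")
print(encoded)

# The original spelling cannot be recovered from the encoded form.
assert encoded.decode("idna") != name
```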
msg292025 - (view) Author: Mathieu Poussin (kedare) Date: 2017-04-21 10:58
Hello Christian.
Is there any update on this issue?
Is there any workaround to avoid this problem?
Thank you.
msg295384 - (view) Author: Nathaniel Smith (njs) * (Python committer) Date: 2017-06-08 07:33
If the SSL module followed the pattern of encoding all str to bytes at the edges while leaving bytes alone, and used exclusively bytes internally (and in this case by "bytes" I mean "bytes objects containing A-labels"), then it would at least fix this bug and also make it possible for library authors to implement their own IDNA handling. Right now if you pass in a pre-encoded byte-string, exactly what ssl.py needs to compare to the certificate, then ssl.py will convert it *back* to text :-(.
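The "encode str at the edges, leave bytes alone" pattern could look roughly like the sketch below. The helper name to_ace_bytes is hypothetical and is not part of the ssl module. It assumes the stdlib "idna" codec for str input, and a library author could swap in their own IDNA handling by passing bytes.

```python
def to_ace_bytes(hostname):
    """Return hostname as ASCII bytes holding A-labels (hypothetical helper)."""
    if isinstance(hostname, bytes):
        # Already encoded by the caller: only check that it is pure ASCII.
        hostname.decode("ascii")
        return hostname
    # str input: encode exactly once, at the boundary.
    return hostname.encode("idna")


print(to_ace_bytes("könig.example"))
print(to_ace_bytes(b"xn--knig-5qa.example"))
```

Everything past this boundary would then compare bytes to bytes, and nothing would be converted back to text.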
msg295391 - (view) Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2017-06-08 10:03
I have an idea for a different approach that can be applied to both ssl and socket module. Stay tuned to this station for a PEP broadcast!
msg295775 - (view) Author: Nick Lamb (tialaramex) Date: 2017-06-12 13:12
I endorse njs' recommended fix here. Don't try to get clever: this is a security component, so it should be the dumbest it can possibly be while being correct, because if it's smarter it will probably be wrong.
msg298583 - (view) Author: Nick Lamb (tialaramex) Date: 2017-07-18 12:23
Did I miss Christian's "PEP Broadcast"?
msg299812 - (view) Author: Alex Gaynor (alex) * (Python committer) Date: 2017-08-06 19:46
This came up on m.d.s.p. today: https://groups.google.com/d/msg/mozilla.dev.security.policy/K3sk5ZMv2DE/fx6c3WWFBgAJ
I haven't dug in deeply, but it sounds like we handle IDNs in CNs and SANs differently?
I think we should look for a way to solve that specific problem, without biting off the whole thing -- one solution would be to simply drop support for CNs in match_hostname, as both Chrome and Firefox have already done :-)
msg299813 - (view) Author: Nathaniel Smith (njs) * (Python committer) Date: 2017-08-06 20:15
> I haven't dug in deeply, but it sounds like we handle IDNs in CNs and SANs differently?
No -- Python's ssl module uses exactly the same hostname checking logic in both cases, and it's equally broken regardless. But, since CAs do all kinds of weird stuff with CNs, there's some chance that our brokenness and their brokenness will align and make things work by chance. Specifically, this will happen if the CA puts the U-encoded name in the CN field. Nick Lamb's concern is that CAs may be using this as justification for continuing to issue certs that are broken in this way. I don't know if that's true, but it's possible.
> one solution would be to simply drop support for CNs in match_hostname
That would indeed fix Nick Lamb's concern, but I'm dubious about this word "simply" :-). Obviously we should do this eventually, but it's going to break a bunch of people, you'll have to have a big fight about Python 2 and Redhat will probably refuse to take the patch and etc etc. OTOH fixing match_hostname to use A-labels would provide immediate benefits to Python's users (right now Python just... can't do SSL connections to IDNs) with minimal breakage, so you can call it a bug fix, and then worry about deprecating the CN field on its own schedule.
msg299846 - (view) Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2017-08-07 12:23
For the record, I'm now considering match_hostname() on U-Labels crazy level 'A sure sign of someone who wears his underpants on his head.'. Once upon a time I had some hope to make it work and keep server_hostname to be an IDN U-Label. I no longer think it feasible and safe at the same time.
Pros:
* ACE is native encoding in SNI TLSEXT.
* ACE is native encoding in X509v3 SAN extension.
* ACE is native encoding in DNS.
* ACE is required to avoid partial wildcards on punycode ("x*" must not match "xn--...").
* OpenSSL's hostname verification operates on ACE.
* ACE is not ambiguous, ACE -> U-label -> ACE depends on IDNA standard and settings.
Cons:
* Making SSLSocket.server_hostname IDN A-label instead of U-label is backwards incompatible.
Self-quote from https://github.com/pyca/cryptography/issues/3357#issuecomment-318902879
---
I have been struggling with similar issues in Python's ssl module. The current implementation cannot verify and match IDN host names. It's also a bit of a mess, SNI callback and server_hostname are IDN U-labels, cert attributes are IDN A-labels. I have played with several approaches to fix the issue. So far only one approach is both simple enough to be memorable and not a potential source of security issues. It's also fully backwards compatible with ASCII-only host names.
User supplied input (hostname for TCP connection, server hostname) can be specified as either IDN U-label (str), IDN A-label (aka ACE, str) or ACE bytes. Internally the socket module and SSL module use ACE bytes only. Text (str) is converted to ACE bytes using IDNA. Since ACE str are just ASCII, IDNA encoding of ACE str is equivalent to encoding with ASCII encoding.
All output (SAN dNSName, SAN URI, CN, SNI callback, server_hostname attribute) are decoded as ACE strings. Since IDN is not a bijective mapping and also depends on the IDNA standard (2003, 2008, UTS46), this avoids some potential security issues. X.509 hostname verification and matching is defined on ACE, not IDN U-labels. I would rather keep them as bytes, but it wouldn't be backwards compatible. This also aligns the SSL module with the socket module. socket.getnameinfo() decodes with ASCII, not with IDNA.
The new approach will make the SSL module compatible with the external idna package and IDNA 2008. Users just have to pass in ACE bytes as server_hostname.
---
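The partial-wildcard point in the Pros list above ("x*" must not match "xn--...") can be illustrated with a simplified matcher that works on A-labels only. This is a sketch, not the ssl module's or OpenSSL's implementation: the function name dnsname_match and its exact rules are assumptions made for illustration.

```python
def dnsname_match(pattern, hostname):
    """Match an A-label hostname against a dNSName pattern (simplified)."""
    pattern_labels = pattern.lower().split(".")
    host_labels = hostname.lower().split(".")
    # A wildcard covers one label only; all other labels must be equal.
    if len(pattern_labels) != len(host_labels):
        return False
    if pattern_labels[1:] != host_labels[1:]:
        return False
    leftmost, host_first = pattern_labels[0], host_labels[0]
    if leftmost == "*":
        # A full wildcard matches any single non-empty label.
        return bool(host_first)
    if ("*" not in leftmost
            or leftmost.startswith("xn--")
            or host_first.startswith("xn--")):
        # No wildcard, or an IDN A-label is involved: compare literally.
        return leftmost == host_first
    # Partial wildcard such as "x*": at most one "*" is accepted.
    prefix, _, suffix = leftmost.partition("*")
    if "*" in suffix:
        return False
    return (len(host_first) >= len(prefix) + len(suffix)
            and host_first.startswith(prefix)
            and host_first.endswith(suffix))


print(dnsname_match("x*.example.org", "xyz.example.org"))
print(dnsname_match("x*.example.org", "xn--knig-5qa.example.org"))
```

Because both sides are A-labels, the "xn--" prefix check is a plain string test. It would not be possible on U-labels, where the punycode prefix is not visible.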
msg306450 - (view) Author: Nick Lamb (tialaramex) Date: 2017-11-17 17:57
As much for myself when I next run into this on my checklist as for any other readers: Despite the appearance of nothing happening PR 3010 (linked) actually has a little bit of momentum and seems likely to eventually land in Python.
msg310992 - (view) Author: Ned Deily (ned.deily) * (Python committer) Date: 2018-01-28 21:04
At Christian's request and considering the importance of the ssl module, I'm going to allow an extension for landing of this feature until 3.7.0b2, currently scheduled for 2018-02-26. If anyone else can help Christian get this in before b2, that would be great. I'm removing older versions for now. We can discuss potential backports after the feature lands.
msg311128 - (view) Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2018-01-29 13:25
New changeset 66e5742becce38e69a8f09e5f7051445fc57e92e by Christian Heimes in branch 'master':
bpo-28414: ssl module idna test (#5395)
https://github.com/python/cpython/commit/66e5742becce38e69a8f09e5f7051445fc57e92e
msg311130 - (view) Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2018-01-29 13:26
In PR https://github.com/python/cpython/pull/5395 I added a test to verify that most IDNA domains are now working. IDNA 2008 deviations and the fundamental issue of IDNA server callback and IDNA encoded server_hostname attribute are still open. I'll address them in another PR.
msg312476 - (view) Author: Nathaniel Smith (njs) * (Python committer) Date: 2018-02-21 07:15
Christian: we're less than a week out from b2. Do you need any help here?
msg312685 - (view) Author: Nathaniel Smith (njs) * (Python committer) Date: 2018-02-24 01:35
New changeset 11a1493bc4198f1def5e572049485779cf54dc57 by Nathaniel J. Smith (Christian Heimes) in branch 'master':
[bpo-28414] Make all hostnames in SSL module IDN A-labels (GH-5128)
https://github.com/python/cpython/commit/11a1493bc4198f1def5e572049485779cf54dc57
msg312696 - (view) Author: Nathaniel Smith (njs) * (Python committer) Date: 2018-02-24 03:18
New changeset 1c37e277190565f0e30fc9281caae4c899ac3b50 by Nathaniel J. Smith (Miss Islington (bot)) in branch '3.7':
[bpo-28414] Make all hostnames in SSL module IDN A-labels (GH-5128) (GH-5843)
https://github.com/python/cpython/commit/1c37e277190565f0e30fc9281caae4c899ac3b50
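The effect of the "all hostnames are IDN A-labels" change can be checked offline with wrap_bio(), the entry point from the original report. This is a sketch that assumes Python 3.7 or later and uses a made-up hostname under the reserved .example domain. No handshake is performed, so nothing is verified against a real certificate.

```python
import ssl

# Client-side context; no connection is opened in this sketch.
context = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
incoming, outgoing = ssl.MemoryBIO(), ssl.MemoryBIO()

# Pass a U-label, as existing callers typically do.
sslobj = context.wrap_bio(incoming, outgoing, server_hostname="könig.example")

# On Python 3.7+ the attribute is expected to hold the A-label form.
print(sslobj.server_hostname)
```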
History
Date User Action Args
2022-04-11 14:58:38 admin set github: 72600
2018-11-13 16:24:39 christian.heimes link issue35234 superseder
2018-02-24 06:08:31 njs set status: open -> closed; resolution: fixed; stage: patch review -> resolved
2018-02-24 03:18:30 njs set messages: + msg312696
2018-02-24 01:35:27 miss-islington set pull_requests: + pull_request5618
2018-02-24 01:35:17 njs set messages: + msg312685
2018-02-21 07:15:52 njs set messages: + msg312476
2018-01-29 13:26:49 christian.heimes set messages: + msg311130
2018-01-29 13:25:15 christian.heimes set messages: + msg311128
2018-01-28 21:04:41 ned.deily set priority: normal -> deferred blocker; versions: + Python 3.8, - Python 2.7, Python 3.5, Python 3.6; nosy: + ned.deily; messages: + msg310992
2018-01-28 20:20:43 christian.heimes set pull_requests: + pull_request5230
2018-01-28 16:10:21 wumpus set nosy: + wumpus
2018-01-07 15:41:45 christian.heimes set keywords: + patch; stage: patch review; pull_requests: + pull_request4989
2017-11-17 17:57:27 tialaramex set messages: + msg306450
2017-11-02 08:50:30 asvetlov link issue31872 superseder
2017-08-07 12:23:11 christian.heimes set messages: + msg299846
2017-08-06 21:44:15 njs set pull_requests: + pull_request3043
2017-08-06 20:15:23 njs set messages: + msg299813
2017-08-06 19:47:36 alex set nosy: + janssen, dstufft
2017-08-06 19:46:39 alex set nosy: + alex; messages: + msg299812
2017-07-18 12:23:53 tialaramex set messages: + msg298583
2017-06-12 13:12:15 tialaramex set nosy: + tialaramex; messages: + msg295775
2017-06-08 10:03:11 christian.heimes set messages: + msg295391
2017-06-08 07:33:27 njs set nosy: + njs; messages: + msg295384
2017-04-21 10:58:59 kedare set nosy: + kedare; messages: + msg292025
2017-01-09 18:33:23 Socob set nosy: + Socob
2016-11-02 11:02:28 christian.heimes set messages: + msg279920
2016-10-21 21:55:32 yselivanov set messages: + msg279165
2016-10-12 08:07:12 abracadaber set messages: + msg278519
2016-10-11 16:49:32 gvanrossum set nosy: - gvanrossum
2016-10-11 13:42:10 abracadaber set messages: + msg278488
2016-10-11 13:20:32 christian.heimes set versions: + Python 2.7, Python 3.6, Python 3.7, - Python 3.4
2016-10-11 13:20:13 christian.heimes set messages: + msg278483
2016-10-11 09:05:21 abracadaber set assignee: christian.heimes; components: + SSL; nosy: + christian.heimes
2016-10-11 08:23:36 abracadaber set type: enhancement
2016-10-11 08:02:36 abracadaber create
