Message259112
| Author |
vstinner |
| Recipients |
abarry, eryksun, ezio.melotti, paul.moore, serhiy.storchaka, steve.dower, tim.golden, vstinner, zach.ware |
| Date |
2016-01-28.09:41:02 |
| Message-id |
<1453974063.09.0.258642734771.issue26227@psf.upfronthosting.co.za> |
| In-reply-to |
| Content |
> Added comments on Rietveld.
Crap. It's easy to miss a compilation error on extensions :-/
I used "make && ./python -m test -v test_socket" to validate gethostbyaddr_encoding-2.patch and it succeeded.
Maybe we should change setup.py to *fail* if an extension fails to compile?
The new patch should have fewer typos :-) I also checked for reference leaks using ./python -m test -R 3:3 test_socket => no leak.
> Why not use PyUnicode_DecodeFSDefault on all platforms? It is used in gethostname() on Unix.
I don't know which encoding is the best choice on UNIX. I prefer to move step by step and first fix an obvious bug on Windows that is blocking Émanuel (see his issue #26226). (Émanuel uses Émanuel-PC as his hostname, a non-ASCII hostname ;-))
I guess that UTF-8 works in most cases on UNIX, whereas using the locale encoding could introduce regressions if the hostname is non-ASCII. For example, decoding a non-ASCII hostname would fail with LANG=C, which forces an ASCII locale encoding.
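To illustrate the LANG=C concern, here is a minimal Python sketch. It assumes the C library returns the hostname as UTF-8 bytes, which is only an assumption for the example; the sample name is the one from the report. The "ascii" codec stands in for the ASCII locale encoding described above.

```python
# Assumption: the resolver hands back the hostname as UTF-8 bytes.
raw_hostname = "Émanuel-PC".encode("utf-8")

# Decoding explicitly as UTF-8 does not depend on the locale.
utf8_result = raw_hostname.decode("utf-8")

# Strict decoding with an ASCII encoding fails on the non-ASCII bytes.
try:
    raw_hostname.decode("ascii")
    ascii_failed = False
except UnicodeDecodeError:
    ascii_failed = True
```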
Issue #9377 proposes more advanced code to choose the encoding used to decode hostnames. Sorry, I haven't followed that issue recently, so I don't know if it proposes to use surrogateescape and/or IDNA.
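For reference, a rough Python sketch of what those two options do in general, independent of whatever issue #9377 actually proposes. The byte strings are made-up examples, not output from a real resolver.

```python
# Hypothetical hostname bytes encoded as Latin-1, which are not valid UTF-8.
raw = "Émanuel-PC".encode("latin-1")

# surrogateescape maps undecodable bytes to lone surrogates instead of
# raising, so the original bytes can be recovered later.
text = raw.decode("utf-8", errors="surrogateescape")
round_tripped = text.encode("utf-8", errors="surrogateescape")

# IDNA represents a non-ASCII label as pure ASCII with an "xn--" prefix.
idna_bytes = "émanuel-pc".encode("idna")
idna_text = idna_bytes.decode("idna")
```

The surrogateescape result is lossless but is not the readable name, while IDNA only helps when the name really is an internationalized DNS label.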
I prefer to discuss the encoding used on UNIX in a new issue (or, better, continue the existing discussion on issue #9377?). |
|