Message272681
| Author |
steve.dower |
| Recipients |
Drekin, benjamin.peterson, brett.cannon, eric.araujo, georg.brandl, gvanrossum, ncoghlan, paul.moore, pitrou, steve.dower, tshepang |
| Date |
2016-08-14.16:30:34 |
| Message-id |
<1471192234.73.0.194726043861.issue17620@psf.upfronthosting.co.za> |
| In-reply-to |
| Content |
I'm working on this as part of my fix for issue1602. I'm not yet sure how this will come out. Compatibility with GNU readline seems to be the biggest issue: if we want to keep it, we can't allow embedded '\0' in the encoded text (i.e. UTF-16 cannot be used, which implies that sys.stdin.encoding cannot always be used directly).
Adding __readlinehook__ as an alternative may be feasible, but it would be a decent amount of work given how we call into the current readline implementation. Unfortunately, it looks like detecting when a readline hook has been added is going to involve significant changes to the tokenizer, which I really don't want to make.
The easiest approach wrt issue1602 seems to be to special-case the console by reencoding from utf-16-le to utf-8 and forcing the encoding in the tokenizer to utf-8 (instead of sys.stdin.encoding) in this case. I'll start here so that at least we can parse Unicode from the interactive prompt. |
|
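For illustration only, here is a minimal Python sketch of the encoding point made in the message above: UTF-16-LE text contains embedded '\0' bytes even for plain ASCII characters, while the same text reencoded as UTF-8 does not. The helper names `has_embedded_nul` and `reencode_console_input` are hypothetical and are not part of CPython; the actual change would live in the C-level console and tokenizer code.

```python
def has_embedded_nul(data: bytes) -> bool:
    """Return True if the byte string contains a NUL byte.

    A NUL-terminated C string interface (such as the one GNU readline
    uses) cannot carry such data intact.
    """
    return b"\0" in data


def reencode_console_input(raw: bytes) -> bytes:
    """Reencode console input from UTF-16-LE to UTF-8.

    Hypothetical helper: assumes `raw` is well-formed UTF-16-LE as
    read from a Windows console.
    """
    return raw.decode("utf-16-le").encode("utf-8")


if __name__ == "__main__":
    line = "π = 3.14"
    utf16 = line.encode("utf-16-le")
    utf8 = reencode_console_input(utf16)

    # Every ASCII character becomes a two-byte unit with a zero high byte.
    print(has_embedded_nul(utf16))
    # UTF-8 only produces a zero byte for U+0000 itself.
    print(has_embedded_nul(utf8))
```

Because UTF-8 never emits a zero byte except for U+0000, the reencoded text can pass through a NUL-terminated string interface unchanged, which is why forcing utf-8 in this one case sidesteps the readline compatibility problem.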
History
|
|---|
| Date |
User |
Action |
Args |
| 2016-08-14 16:30:34 | steve.dower | set | recipients:
+ steve.dower, gvanrossum, brett.cannon, georg.brandl, paul.moore, ncoghlan, pitrou, benjamin.peterson, eric.araujo, tshepang, Drekin |
| 2016-08-14 16:30:34 | steve.dower | set | messageid: <1471192234.73.0.194726043861.issue17620@psf.upfronthosting.co.za> |
| 2016-08-14 16:30:34 | steve.dower | link | issue17620 messages |
| 2016-08-14 16:30:34 | steve.dower | create |
|