
This issue tracker has been migrated to GitHub and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

Author: Joshua.Landau
Recipients: Joshua.Landau
Date: 2016-04-25 01:58:43
Message-id: <1461549524.26.0.280315753344.issue26843@psf.upfronthosting.co.za>
Content
This is effectively a continuation of https://bugs.python.org/issue9712.
The line in Lib/tokenize.py
 Name = r'\w+'
must be changed to a regular expression that also accepts Other_ID_Start characters at the start and Other_ID_Continue characters elsewhere; as it stands, tokenize does not accept the valid identifier '℘·'.
See the reference here:
 https://docs.python.org/3.5/reference/lexical_analysis.html#identifiers
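To illustrate the mismatch, here is a minimal sketch. The demonstration of the failure uses the '℘·' example above; the widened `Name` pattern at the end is only a hypothetical assumption of what a fix could look like (with the Other_ID_Start / Other_ID_Continue code points taken from the language reference linked above), not the patch itself.

```python
import re

# '℘' (U+2118) is in Other_ID_Start and '·' (U+00B7) is in
# Other_ID_Continue, so '℘·' is a valid Python 3 identifier...
assert '℘·'.isidentifier()

# ...but tokenize's current Name pattern rejects it, because
# neither character is a \w "word" character:
assert re.fullmatch(r'\w+', '℘·') is None

# Hypothetical widened pattern (the code-point lists below are an
# assumption based on the language reference; the real fix may differ):
OTHER_ID_START = '\u2118\u212e\u309b\u309c'
OTHER_ID_CONTINUE = '\u00b7\u0387\u1369-\u1371\u19da'
Name = r'[\w{s}][\w{s}{c}]*'.format(s=OTHER_ID_START, c=OTHER_ID_CONTINUE)
assert re.fullmatch(Name, '℘·') is not None
```

Note that, like the original `\w+`, this sketch still matches leading digits; tokenize relies on number tokens being tried before names.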
I'm unsure whether Unicode normalization (i.e. the NFKC-stable `xid` variants of these properties) needs to be dealt with too.
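For context on why normalization might matter, a minimal illustration ('ſpam' is just a made-up name): the parser converts identifiers to NFKC normal form, so two distinct source spellings can denote the same name even though a tokenizer-level regex sees them as different strings.

```python
import unicodedata

# 'ſ' (U+017F, LATIN SMALL LETTER LONG S) is a letter,
# so this spelling is itself a valid identifier...
assert 'ſpam'.isidentifier()

# ...and NFKC normalization, which the parser applies to
# identifiers, folds it to plain 'spam':
assert unicodedata.normalize('NFKC', 'ſpam') == 'spam'
```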
Credit to toriningen from http://stackoverflow.com/a/29586366/1763356.
