Message138893

Author: Devin Jeanpierre
Recipients: Devin Jeanpierre, benjamin.peterson, petri.lehtinen, r.david.murray, tim.peters
Date: 2011-06-24 08:40:44
SpamBayes Score: 2.4577007e-12
Marked as misclassified: No
Message-id: <1308904845.51.0.466556934052.issue11909@psf.upfronthosting.co.za>
In-reply-to:

Content:
You're right, and good catch. If a doctest starts with a "# coding: XXX" line, this will break.
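To make the failure mode concrete, here's a minimal sketch (the source string is hypothetical): tokenize's byte-stream entry point runs PEP 263 encoding detection, so a coding line at the top of a doctest's source changes how the rest of it gets decoded.

```python
import io
import tokenize

# Hypothetical doctest source that declares its own encoding.
src = b"# coding: latin-1\nname = 'caf\xe9'\n"

# tokenize.detect_encoding() is what tokenize.tokenize() uses under
# the hood; it honors the coding cookie on line 1 (normalizing the
# name), so the rest of the stream is decoded as Latin-1, not UTF-8.
encoding, _lines = tokenize.detect_encoding(io.BytesIO(src).readline)
print(encoding)
```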
One option is to replace the call to tokenize.tokenize with a call to tokenize._tokenize and pass 'utf-8' as a parameter. Downside: that's a private, undocumented API. The alternative is to manually prepend a coding line that specifies UTF-8, so that any coding line in the doctest itself would be ignored.
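A sketch of the second option (the doctest source here is hypothetical): PEP 263 detection stops at the first coding cookie it finds, so a UTF-8 cookie prepended as line 1 shadows any cookie inside the doctest.

```python
import io
import tokenize

# Hypothetical doctest source with its own (conflicting) coding line.
doctest_src = "# coding: latin-1\nx = 1\n"

# Prepend a UTF-8 coding line; detection finds it on line 1 and
# never looks at the inner latin-1 cookie.
prefixed = "# coding: utf-8\n" + doctest_src
readline = io.BytesIO(prefixed.encode("utf-8")).readline
encoding, _lines = tokenize.detect_encoding(readline)
print(encoding)  # the inner latin-1 cookie is ignored
```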
My preferred option would be to add the ability to read unicode to the tokenize API, and then use that. I can file a separate ticket if that sounds good, since it's probably useful to others too.
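For what it's worth, current CPython already has machinery in this direction, though it went undocumented for a long time: tokenize.generate_tokens() accepts a readline that returns str, so tokenization runs on already-decoded text and no encoding detection (hence no coding-line handling) takes place. A sketch:

```python
import io
import tokenize

# Tokenize decoded text directly; no bytes, no PEP 263 detection.
src = "x = 1\ny = 2\n"
names = [tok.string
         for tok in tokenize.generate_tokens(io.StringIO(src).readline)
         if tok.type == tokenize.NAME]
print(names)
```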
One other thing to be worried about: I'm not sure how doctest would treat tests with leading "coding: XXX" lines. I'd hope it ignores them; if it doesn't, then this is more complicated and the above approaches wouldn't work.
I'll see if I have the time to play around with this (and add more test cases to the patch, correspondingly) this weekend.