Message120600

| Field | Value |
|---|---|
| Author | vstinner |
| Recipients | vstinner |
| Date | 2010-11-06 10:49:34 |
| Message-id | <1289040579.7.0.473143443383.issue10335@psf.upfronthosting.co.za> |
| In-reply-to | |

Content
In Python 3, the following pattern has become common:

    with open(fullname, 'rb') as fp:
        coding, line = tokenize.detect_encoding(fp.readline)
    with open(fullname, 'r', encoding=coding) as fp:
        ...

The file is opened twice, which is unnecessary: the raw binary buffer can be reused to create the text file object. I also don't like the detect_encoding() API: passing a readline function is not intuitive.
I propose adding a tokenize.open_python() function with a very simple API: just one argument, the filename. The function calls detect_encoding() and opens the file only once.
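A minimal sketch of how open_python() could behave (the attached patch is not reproduced here; this only illustrates reusing the binary buffer through io.TextIOWrapper instead of reopening the file):

    import io
    import tokenize

    def open_python(filename):
        """Open a Python source file in read mode, using the encoding
        detected by tokenize.detect_encoding()."""
        buffer = io.open(filename, 'rb')
        try:
            encoding, lines = tokenize.detect_encoding(buffer.readline)
            # Rewind: detect_encoding() may have consumed up to two lines.
            buffer.seek(0)
            return io.TextIOWrapper(buffer, encoding, line_buffering=True)
        except:
            buffer.close()
            raise

With such a helper, the double-open pattern above collapses to a single call:

    with open_python(fullname) as fp:
        ...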
The attached patch adds the function with a unit test and a documentation update. It also patches the functions that currently use detect_encoding().

open_python() only supports read mode; I suppose that is enough.
History

| Date | User | Action | Args |
|---|---|---|---|
| 2010-11-06 10:49:39 | vstinner | set | recipients: + vstinner |
| 2010-11-06 10:49:39 | vstinner | set | messageid: <1289040579.7.0.473143443383.issue10335@psf.upfronthosting.co.za> |
| 2010-11-06 10:49:37 | vstinner | link | issue10335 messages |
| 2010-11-06 10:49:37 | vstinner | create | |