Message81182
| Author |
aclover |
| Recipients |
aclover, christian.heimes, jackjansen, jorend, lemburg |
| Date |
2009-02-05.01:42:20 |
| Message-id |
<1233798142.36.0.399908566423.issue691291@psf.upfronthosting.co.za> |
| Content |
> The problem is that codecs.open() forces binary mode on the underlying
> file object, and this defeats the U mode.
Actually, the problem is that it doesn't defeat it!
The function is documented to force binary mode, but all it actually does
is "mode = mode + 'b'", which can leave you with a mode of 'rUb'. This mode
should be invalid, but in practice the 'U' wins out, so newline translation
runs on the raw bytes before they are decoded. That causes the expected
problems for UTF-16 and some East Asian codecs.
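To make the UTF-16 breakage concrete, here is a minimal sketch that only simulates universal-newline translation on a byte string. The helper name simulate_universal_newlines is hypothetical, and the simulation is an assumption about what 'U' mode does to the bytes (CRLF and lone CR both become LF), not the actual file object code.

```python
def simulate_universal_newlines(raw: bytes) -> bytes:
    """Hypothetical stand-in for 'U' mode: CRLF and lone CR become LF.

    Assumption: this mirrors the byte-level effect of universal newlines;
    it is not the real file object implementation.
    """
    return raw.replace(b"\r\n", b"\n").replace(b"\r", b"\n")


# In UTF-16-LE, "\r\n" is 0D 00 0A 00, so the CR byte is never directly
# followed by the LF byte and gets translated on its own.
raw = "a\r\nb".encode("utf-16-le")
decoded = simulate_universal_newlines(raw).decode("utf-16-le")
print(repr(decoded))

# A 0x0D byte that belongs to an unrelated character is rewritten too:
# U+010D is encoded as 0D 01, which becomes 0A 01, i.e. U+010A.
mangled = simulate_universal_newlines("\u010d".encode("utf-16-le"))
print(repr(mangled.decode("utf-16-le")))
```

The first case turns one line break into two; the second silently changes a character that has nothing to do with newlines.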
Until text/universal mode is supported at the level of the overlying
decoded stream, I suggest that 'U' be .replace()d out of the mode in
addition to 'b' being added, as the documentation implies. |
|
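The suggestion above could be sketched roughly as follows. Both function names are hypothetical and this is not the actual codecs.open() source: append_b_only mirrors the behaviour as described in this message, and force_binary_mode is one possible shape of the proposed fix. The fallback to 'r' when nothing but 'U' was given is an assumption that a bare 'U' is meant as a read mode.

```python
def append_b_only(mode: str) -> str:
    """Behaviour as described in the message: just tack 'b' on the end."""
    return mode + "b"


def force_binary_mode(mode: str) -> str:
    """Hypothetical fix: strip 'U', then make sure 'b' is present."""
    mode = mode.replace("U", "")
    # Assumption: a mode that was only 'U' is meant as a read mode.
    if not mode or mode[0] not in "rwa":
        mode = "r" + mode
    if "b" not in mode:
        mode = mode + "b"
    return mode


for requested in ("r", "rU", "U", "rb", "w+"):
    print(requested, "->", append_b_only(requested), "vs", force_binary_mode(requested))
```

With this sketch, 'rU' becomes 'rb' rather than 'rUb', so the underlying file is opened in plain binary mode and the codec sees the bytes untouched.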