Created on 2008-04-19 21:04 by azverkan, last changed 2022-04-11 14:56 by admin. This issue is now closed.
| Files | | |
|---|---|---|
| File name | Uploaded | Description |
| 2to3bug.py | azverkan, 2008-04-19 21:04 | testcase |
| 2to3_encoding.patch | vstinner, 2009-05-04 20:55 | |
| Messages (8) | |||
|---|---|---|---|
| msg65637 | Author: Brandon Ehle (azverkan) | Date: 2008-04-19 21:04 | |
While running the 2to3 script on the scons codebase, I ran into a UnicodeDecodeError. Attached is just the portion of the script that causes the error. 2to3 throws an error on the string regardless of whether the unicode string literal prefix (u) is present on the front.

RefactoringTool: Skipping implicit fixer: buffer
RefactoringTool: Skipping implicit fixer: idioms
RefactoringTool: Skipping implicit fixer: ws_comma
Traceback (most recent call last):
  File "/usr/local/bin/2to3", line 5, in <module>
    sys.exit(refactor.main())
  File "/usr/local/lib/python3.0/lib2to3/refactor.py", line 81, in main
    rt.refactor_args(args)
  File "/usr/local/lib/python3.0/lib2to3/refactor.py", line 188, in refactor_args
    self.refactor_file(arg)
  File "/usr/local/lib/python3.0/lib2to3/refactor.py", line 217, in refactor_file
    input = f.read() + "\n" # Silence certain parse errors
  File "/usr/local/lib/python3.0/io.py", line 1611, in read
    decoder.decode(self.buffer.read(), final=True))
  File "/usr/local/lib/python3.0/io.py", line 1199, in decode
    output = self.decoder.decode(input, final=final)
  File "/usr/local/lib/python3.0/codecs.py", line 300, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf8' codec can't decode bytes in position 59-60: invalid data |
|||
| msg65638 | Author: Collin Winter (collinwinter) * (Python committer) | Date: 2008-04-19 21:48 | |
2to3 running under Python 2.5.1 handles this file just fine. 2to3 running under 3.0a4+ (r62404) fails as detailed below. However, that file doesn't run correctly under Python itself:

collinwinter@Silves:~/src/python/py3k$ ./python /home/collinwinter/Desktop/2to3bug.py
  File "/home/collinwinter/Desktop/2to3bug.py", line 3
collinwinter@Silves:~/src/python/py3k

This suggests this problem isn't 2to3-specific. Refiling this issue against py3k's Unicode support. |
|||
| msg65641 | Author: Brandon Ehle (azverkan) | Date: 2008-04-20 01:38 | |
Someone on the #python IRC channel suggested that the default for unicode string literals in Python 3.0 is reversed from Python 2.5. If you remove the unicode string literal prefix (u'') from the front of the string, it runs fine under Python 3.0 and fails under 2.5 and 2.6 instead. |
|||
| msg65642 | Author: Brandon Ehle (azverkan) | Date: 2008-04-20 01:40 | |
Also, I can confirm that running 2to3 with Python 2.6 correctly converts the script but running 2to3 with Python 3.0 results in a UnicodeDecodeError exception. |
|||
| msg86641 | Author: Daniel Diniz (ajaksu2) * (Python triager) | Date: 2009-04-27 01:42 | |
Confirmed in py3k on rev71995. |
|||
| msg86643 | Author: Benjamin Peterson (benjamin.peterson) * (Python committer) | Date: 2009-04-27 02:39 | |
The problem is that 2to3 just reads the file with whatever locale.getpreferredencoding() returns. It should use tokenize.detect_encoding() to discover the correct encoding to open it with. |
|||
| msg87175 | Author: STINNER Victor (vstinner) * (Python committer) | Date: 2009-05-04 20:55 | |
Patch using tokenize.detect_encoding() to read the encoding of Python scripts instead of using the default io.open() encoding (utf-8). We might write a unit test. See also related issue: #5093 |
|||
| msg87481 | Author: Benjamin Peterson (benjamin.peterson) * (Python committer) | Date: 2009-05-09 00:33 | |
Fixed in r72491. |
|||
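
The approach described in msg86643 and msg87175 can be illustrated with a minimal sketch. This is not the attached 2to3_encoding.patch or the change committed in r72491; the helper name `read_python_source` is hypothetical, and the sketch assumes a Python 3 interpreter where `tokenize.detect_encoding()` is available. It reads the encoding declaration (or BOM) from the first lines of the file in binary mode, then reopens the file as text with that encoding instead of relying on the default.

```python
import io
import tokenize


def read_python_source(path):
    """Return (text, encoding) for a Python source file.

    Hypothetical helper for illustration only: the encoding is taken
    from the file's coding declaration or BOM rather than from the
    locale or the io.open() default.
    """
    with open(path, "rb") as f:
        # detect_encoding() expects a readline callable that returns bytes
        # and inspects at most the first two lines of the file.
        encoding, _first_lines = tokenize.detect_encoding(f.readline)
    with io.open(path, "r", encoding=encoding) as f:
        return f.read(), encoding
```

With a file that starts with `# -*- coding: iso-8859-1 -*-`, this decodes non-ASCII bytes as Latin-1 instead of failing with a UnicodeDecodeError under a UTF-8 default. Later Python versions also provide `tokenize.open()`, which combines the two steps.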
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022-04-11 14:56:33 | admin | set | github: 46912 |
| 2009-05-09 00:33:45 | benjamin.peterson | set | status: open -> closed resolution: fixed messages: + msg87481 |
| 2009-05-04 20:55:19 | vstinner | set | files: + 2to3_encoding.patch nosy: + vstinner messages: + msg87175 keywords: + patch |
| 2009-04-27 02:39:29 | benjamin.peterson | set | messages: + msg86643 |
| 2009-04-27 01:42:30 | ajaksu2 | set | type: behavior components: + 2to3 (2.x to 3.x conversion tool) versions: + Python 2.6, Python 3.1, - Python 3.0 nosy: + ajaksu2, benjamin.peterson messages: + msg86641 stage: test needed |
| 2008-04-20 01:40:01 | azverkan | set | messages: + msg65642 |
| 2008-04-20 01:38:09 | azverkan | set | messages: + msg65641 |
| 2008-04-19 22:16:59 | collinwinter | set | title: 2to3 throws a utf8 decode error on a iso-8859-1 string -> Py3k fails to parse a file with an iso-8859-1 string |
| 2008-04-19 21:48:49 | collinwinter | set | priority: high assignee: collinwinter -> messages: + msg65638 components: + Unicode, - 2to3 (2.x to 3.x conversion tool) |
| 2008-04-19 21:04:59 | azverkan | create | |