This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

classification
Title: detect_encoding should fail with SyntaxError on invalid encoding
Type: behavior
Stage: resolved
Components: Library (Lib), Unicode
Versions: Python 3.2, Python 3.3
process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To:
Nosy List: ezio.melotti, flox, python-dev, vstinner
Priority: normal
Keywords: patch

Created on 2012-06-03 10:29 by flox, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Files
File name                         Uploaded                  Description
issue14990_detect_encoding.diff   flox, 2012-06-03 10:31
Pull Requests
URL       Status   Linked
PR 6572   closed   lukasz.langa, 2018-04-23 01:09
Messages (7)
msg162205 - Author: Florent Xicluna (flox) * (Python committer) Date: 2012-06-03 10:29
I've hit this issue while playing with tokenize for the pep8.py module.
tokenize.detect_encoding() should raise SyntaxError when the encoding is improperly declared.
However, in some cases a LookupError is raised instead:
$ ./python -m tokenize Lib/test/bad_coding2.py
unexpected error: unknown encoding: utf8-sig
Traceback (most recent call last):
  File "./Lib/runpy.py", line 162, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "./Lib/runpy.py", line 75, in _run_code
    exec(code, run_globals)
  File "./Lib/tokenize.py", line 686, in <module>
    main()
  File "./Lib/tokenize.py", line 656, in main
    tokens = list(tokenize(f.readline))
  File "./Lib/tokenize.py", line 489, in _tokenize
    line = line.decode(encoding)
LookupError: unknown encoding: utf8-sig
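A minimal way to probe this behavior without a checkout of the test suite might look like the sketch below. The sample byte strings and the `probe_encoding` helper are hypothetical stand-ins, not taken from the report. Note that the helper only calls `detect_encoding()`; in the traceback above the LookupError surfaced later, when `_tokenize` tried to decode a line with the returned name, so the result for the `utf8` sample depends on the interpreter version.

```python
import codecs
import io
import tokenize


def probe_encoding(source: bytes) -> str:
    """Return the detected encoding name, or the name of the exception raised."""
    try:
        encoding, _first_lines = tokenize.detect_encoding(io.BytesIO(source).readline)
    except (SyntaxError, LookupError) as exc:
        return type(exc).__name__
    return encoding


# Hypothetical in-memory stand-ins for source files (not from the report).
SAMPLES = {
    "BOM + 'utf8' cookie": codecs.BOM_UTF8 + b"#coding: utf8\nx = 1\n",
    "BOM + 'latin-1' cookie": codecs.BOM_UTF8 + b"# -*- coding: latin-1 -*-\nx = 1\n",
    "BOM + 'utf-8' cookie": codecs.BOM_UTF8 + b"# -*- coding: utf-8 -*-\nx = 1\n",
    "no BOM, 'utf-8' cookie": b"# -*- coding: utf-8 -*-\nx = 1\n",
}

for label, data in SAMPLES.items():
    # The outcome for the 'utf8' sample is deliberately not asserted here.
    print(f"{label}: {probe_encoding(data)}")
```

On an interpreter that includes the fix, the BOM plus `latin-1` sample should be reported as a SyntaxError rather than producing a bogus `-sig` encoding name.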
msg162206 - Author: Florent Xicluna (flox) * (Python committer) Date: 2012-06-03 10:31
This patch seems to fix the issue.
msg162303 - Author: STINNER Victor (vstinner) * (Python committer) Date: 2012-06-04 23:06
The patch is correct according to PEP 263:
 If a source file uses both the UTF-8 BOM mark signature and a
 magic encoding comment, the only allowed encoding for the comment
 is 'utf-8'. Any other encoding will cause an error.
The fix should also be applied to 3.2.
(Note: Python 3.1 doesn't accept bugfixes anymore.)
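The PEP 263 rule quoted above can be sketched as a small standalone check. This is an illustration only, not the attached patch: the `check_bom_and_cookie` name, the simplified cookie pattern, the first-line-only scan, and the case-insensitive comparison against the literal spelling 'utf-8' are all assumptions made for the example.

```python
import codecs
import re

# Simplified coding-cookie pattern in the spirit of PEP 263 (an assumption
# for this sketch; the real tokenizer also inspects the second line).
COOKIE_RE = re.compile(rb"^[ \t\f]*#.*?coding[:=][ \t]*([-\w.]+)")


def check_bom_and_cookie(first_line: bytes) -> str:
    """Return the declared encoding, or raise SyntaxError on a BOM/cookie clash."""
    has_bom = first_line.startswith(codecs.BOM_UTF8)
    if has_bom:
        first_line = first_line[len(codecs.BOM_UTF8):]

    match = COOKIE_RE.match(first_line)
    if match is None:
        # No cookie on this line: fall back to the default source encoding.
        return "utf-8"

    declared = match.group(1).decode("ascii")
    # With a BOM present, only 'utf-8' is allowed in the cookie.
    if has_bom and declared.lower() != "utf-8":
        raise SyntaxError(f"encoding problem: {declared} with BOM")
    return declared
```

A cookie such as `latin-1` is accepted on its own but rejected as soon as the line starts with the UTF-8 BOM, which is the error case the PEP describes.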
msg162428 - Author: Florent Xicluna (flox) * (Python committer) Date: 2012-06-06 23:11
It should raise a SyntaxError if the coding is 'utf8'.
I don't agree with the last patch proposed.
If import reports a SyntaxError, 'tokenize' should do the same.
$ ./python Lib/test/bad_coding2.py
  File "Lib/test/bad_coding2.py", line 1
SyntaxError: encoding problem: utf-8
This behavior complies strictly with PEP 263.
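The consistency argument (the compiler and tokenize should fail the same way) can be checked side by side with a sketch like the following. The sample source and the two helper names are hypothetical; the sample uses a BOM with a `latin-1` cookie rather than the contents of bad_coding2.py, since that combination is an unambiguous violation of the PEP 263 rule.

```python
import codecs
import io
import tokenize

# Hypothetical sample: UTF-8 BOM combined with a non-UTF-8 coding cookie.
SOURCE = codecs.BOM_UTF8 + b"# -*- coding: latin-1 -*-\nx = 1\n"


def compile_outcome(source: bytes) -> str:
    """Name of the exception raised by compile(), or 'ok'."""
    try:
        compile(source, "<sample>", "exec")
    except (SyntaxError, LookupError) as exc:
        return type(exc).__name__
    return "ok"


def tokenize_outcome(source: bytes) -> str:
    """Name of the exception raised while tokenizing, or 'ok'."""
    try:
        list(tokenize.tokenize(io.BytesIO(source).readline))
    except (SyntaxError, LookupError) as exc:
        return type(exc).__name__
    return "ok"


print("compile: ", compile_outcome(SOURCE))
print("tokenize:", tokenize_outcome(SOURCE))
```

If the two outcomes differ on a given interpreter, tokenize is not mirroring the compiler for that input.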
msg162429 - Author: STINNER Victor (vstinner) * (Python committer) Date: 2012-06-06 23:13
Oops, I didn't want to attach my patch to the issue. Mine is wrong, whereas yours is the right fix :-)
msg164811 - Author: Roundup Robot (python-dev) (Python triager) Date: 2012-07-07 10:27
New changeset 5020afc0b7c9 by Florent Xicluna in branch '3.2':
Issue #14990: tokenize: correctly fail with SyntaxError on invalid encoding declaration.
http://hg.python.org/cpython/rev/5020afc0b7c9 
msg164812 - Author: Florent Xicluna (flox) * (Python committer) Date: 2012-07-07 10:29
Thanks. Fixed in trunk too, changeset b4322ad1fec4 
History
Date                 User          Action  Args
2022-04-11 14:57:31  admin         set     github: 59195
2018-04-23 01:09:05  lukasz.langa  set     pull_requests: + pull_request6272
2012-07-07 10:29:50  flox          set     status: open -> closed
                                           resolution: fixed
                                           messages: + msg164812
                                           stage: patch review -> resolved
2012-07-07 10:27:14  python-dev    set     nosy: + python-dev
                                           messages: + msg164811
2012-06-06 23:13:55  vstinner      set     messages: + msg162429
2012-06-06 23:13:32  vstinner      set     files: - detect_encoding.patch
2012-06-06 23:11:30  flox          set     messages: + msg162428
2012-06-04 23:06:01  vstinner      set     files: + detect_encoding.patch
                                           versions: - Python 3.1
                                           nosy: + ezio.melotti, vstinner
                                           messages: + msg162303
                                           components: + Unicode
2012-06-03 10:31:05  flox          set     files: + issue14990_detect_encoding.diff
                                           keywords: + patch
                                           messages: + msg162206
                                           stage: needs patch -> patch review
2012-06-03 10:29:02  flox          create