Created on 2010-09-04 11:32 by flox, last changed 2022-04-11 14:57 by admin.
| Files | | |
|---|---|---|
| File name | Uploaded | Description |
| detect_encoding_default.diff | flox, 2010-09-04 11:32 | Patch, apply to 3.x |
| Messages (3) | |||
|---|---|---|---|
| msg115567 | Author: Florent Xicluna (flox) * (Python committer) | Date: 2010-09-04 11:32 | |
The function tokenize.detect_encoding() detects the encoding from either the coding cookie or the BOM. If no encoding is found, it returns 'utf-8'.
When the result is 'utf-8', there is no (easy) way to know whether the encoding was really detected in the file or whether the function fell back to the default value.
Cases (with utf-8):
- UTF-8 BOM found, returns ('utf-8-sig', [])
- cookie on 1st line, returns ('utf-8', [line1])
- cookie on 2nd line, returns ('utf-8', [line1, line2])
- no cookie found, returns ('utf-8', [line1, line2])
The proposal is to allow calling the function with a different default value (None or ''), so that the caller can tell whether the encoding was really detected.
For example, this function could be used by the Tools/scripts/findnocoding.py script.
Patch attached.
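
To illustrate the idea with the current API, here is a minimal sketch. It is not the attached patch: `detect_encoding_or_default` and `COOKIE_RE` are hypothetical names, and the wrapper assumes that re-checking the returned lines against the PEP 263 cookie pattern is enough to tell a real 'utf-8' cookie from the fallback.

```python
import io
import re
import tokenize

# PEP 263 coding-cookie pattern, applied to the raw bytes lines
# (assumption: this mirrors the check done inside detect_encoding()).
COOKIE_RE = re.compile(rb'^[ \t\f]*#.*?coding[:=][ \t]*([-\w.]+)')


def detect_encoding_or_default(readline, default=None):
    """Hypothetical wrapper around tokenize.detect_encoding().

    Returns `default` instead of 'utf-8' when neither a BOM nor a
    coding cookie was found in the lines that were read.
    """
    encoding, lines = tokenize.detect_encoding(readline)
    # 'utf-8-sig' means a BOM was seen, and any other name came from a
    # cookie, so only a plain 'utf-8' result is ambiguous.
    if encoding == 'utf-8' and not any(COOKIE_RE.match(line) for line in lines):
        return default, lines
    return encoding, lines


with_cookie = io.BytesIO(b'# -*- coding: utf-8 -*-\nx = 1\n')
without_cookie = io.BytesIO(b'x = 1\ny = 2\n')
print(detect_encoding_or_default(with_cookie.readline))
print(detect_encoding_or_default(without_cookie.readline))
```

A script such as findnocoding.py could then treat a None result as "no declared encoding" without re-parsing the file itself.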
|
|||
| msg122106 | Author: STINNER Victor (vstinner) * (Python committer) | Date: 2010-11-22 10:52 | |
> no cookie found, returns ('utf-8', [line1, line2])
I never understood the usage of the second item. IMO it should be None if no cookie found.
|
|||
| msg173002 | Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) | Date: 2012-10-15 20:55 | |
> I never understood the usage of the second item. IMO it should be None if no cookie found.
UTF-8 is the default source encoding for Python 3.
|
|||
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022-04-11 14:57:06 | admin | set | github: 53980 |
| 2014-11-02 12:11:24 | berker.peksag | set | nosy: + berker.peksag; versions: + Python 3.5, - Python 3.4 |
| 2012-10-15 20:55:02 | serhiy.storchaka | set | nosy: + serhiy.storchaka; messages: + msg173002 |
| 2012-07-21 13:19:57 | flox | set | versions: + Python 3.4, - Python 3.3 |
| 2010-12-31 01:43:04 | eric.araujo | set | nosy: + eric.araujo; versions: + Python 3.3, - Python 3.2 |
| 2010-12-30 22:14:16 | georg.brandl | unlink | issue7962 dependencies |
| 2010-11-22 10:52:50 | vstinner | set | messages: + msg122106 |
| 2010-11-22 05:14:16 | eric.araujo | set | nosy: + vstinner |
| 2010-09-04 18:54:27 | pitrou | set | nosy: + benjamin.peterson |
| 2010-09-04 13:23:06 | flox | link | issue7962 dependencies |
| 2010-09-04 11:32:08 | flox | create | |