This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

classification
Title: ConfigParser does not parse utf-8 files with BOM bytes
Type: behavior
Stage: resolved
Components: Library (Lib)
Versions: Python 3.3
process
Status: closed
Resolution: duplicate
Dependencies:
Superseder: Python3: guess text file charset using the BOM (View: 7651)
Assigned To: lukasz.langa
Nosy List: Sean.Wang, eric.araujo, lukasz.langa
Priority: normal
Keywords:

Created on 2012-03-15 02:24 by Sean.Wang, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Messages (3)
msg155843 - (view) Author: Sean Wang (Sean.Wang) Date: 2012-03-15 02:24
ConfigParser fails to parse a UTF-8 file that starts with BOM bytes ('\xef\xbb\xbf');
it raises ConfigParser.MissingSectionHeaderError.
I think files with other BOMs would have the same problem, because the "SECTCRE" regular expression does not account for a leading BOM.
The current workaround is:
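import codecs
import os
import ConfigParser  # Python 2 module name
cp = ConfigParser.ConfigParser()
cfgfile = os.path.join(curpath, 'config.cfg')  # curpath: directory containing the config file
cp.readfp(codecs.open(cfgfile, 'r', 'utf-8-sig'))  # utf-8-sig drops a leading BOM while decoding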
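For readers on Python 3, the failure and the fix described above can be reproduced with a short sketch. It assumes the Python 3 module name `configparser` and uses made-up sample data; the helper `parse` is hypothetical and exists only for this illustration.

```python
import codecs
import configparser

# Bytes of a small config file saved with a UTF-8 BOM (sample data for illustration).
raw = codecs.BOM_UTF8 + b"[main]\nname = value\n"


def parse(text):
    """Parse config text with a fresh parser (hypothetical helper)."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return parser


# Decoding as plain UTF-8 keeps the BOM as U+FEFF in front of "[main]",
# so the first line is not recognised as a section header.
try:
    parse(raw.decode("utf-8"))
    failed = False
except configparser.MissingSectionHeaderError:
    failed = True

# Decoding as utf-8-sig drops the BOM before the parser sees the text.
parser = parse(raw.decode("utf-8-sig"))
print(failed, parser["main"]["name"])
```

The only difference between the two calls is the codec used to decode the bytes, which is why the fix belongs at the point where the file is opened rather than inside the parser.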
msg156042 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2012-03-16 14:46
Could you paste the exact code that fails? In 3.2+ there is a read_something method that takes an encoding argument, so that should work for example.
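The message does not name the method. One that fits the description is `ConfigParser.read()`, which accepts an `encoding` keyword in Python 3.2 and later; treating it as the method meant here is an assumption. A minimal sketch, using a throwaway temporary file with illustrative contents:

```python
import codecs
import configparser
import os
import tempfile

# Write a throwaway config file that starts with a UTF-8 BOM (illustrative data).
fd, path = tempfile.mkstemp(suffix=".cfg")
with os.fdopen(fd, "wb") as f:
    f.write(codecs.BOM_UTF8 + b"[main]\nname = value\n")

parser = configparser.ConfigParser()
# read() takes an encoding argument; utf-8-sig skips the BOM while decoding.
read_ok = parser.read(path, encoding="utf-8-sig")
os.remove(path)

name = parser.get("main", "name")
print(read_ok, name)
```

`read()` returns the list of files it managed to parse, so an empty list signals that the file was missing or unreadable.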
msg156079 - (view) Author: Łukasz Langa (lukasz.langa) * (Python committer) Date: 2012-03-16 20:23
What you considered a workaround is actually what you should be using when faced with BOM bytes. This is a broader issue in Python, not specific to ConfigParser or any other library. Also, this has already been reported here:
http://bugs.python.org/issue7519
For the UTF-8 BOM context please see:
http://bugs.python.org/issue7651
To solve the actual problem we should really do something about that last issue.
If you have any further questions, please ask. If not, I will close this issue.
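As a rough illustration of what guessing the charset from the BOM could look like in user code, here is a minimal sketch. It is not the design proposed in issue 7651; the helper name `guess_encoding` and the fallback to UTF-8 are assumptions made for this example.

```python
import codecs

# Longer BOMs come first: the UTF-32 LE BOM begins with the UTF-16 LE BOM bytes.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def guess_encoding(data, default="utf-8"):
    """Return a codec name based on a leading BOM (hypothetical helper).

    The returned codecs ("utf-8-sig", "utf-16", "utf-32") consume the BOM
    themselves, so the decoded text does not start with U+FEFF.
    """
    for bom, encoding in _BOMS:
        if data.startswith(bom):
            return encoding
    return default  # assumption: files without a BOM are UTF-8


sample = codecs.BOM_UTF8 + b"[main]\nname = value\n"
text = sample.decode(guess_encoding(sample))
```

The decoded `text` can then be handed to the parser, for example through `read_string()` on Python 3.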
History
Date                 User          Action  Args
2022-04-11 14:57:28  admin         set     github: 58519
2012-03-20 12:32:44  lukasz.langa  set     status: open -> closed
                                           resolution: duplicate
                                           stage: needs patch -> resolved
                                           superseder: Python3: guess text file charset using the BOM
                                           versions: - Python 2.7, Python 3.2
2012-03-16 20:23:32  lukasz.langa  set     assignee: lukasz.langa
                                           messages: + msg156079
2012-03-16 14:46:21  eric.araujo   set     nosy: + eric.araujo
                                           messages: + msg156042
2012-03-15 20:19:01  pitrou        set     nosy: + lukasz.langa
                                           stage: needs patch
                                           versions: + Python 3.2, Python 3.3
2012-03-15 02:24:51  Sean.Wang     create
