Created on 2015-01-06 19:05 by jdufresne, last changed 2022-04-11 14:58 by admin. This issue is now closed.
| Messages (2) | |||
|---|---|---|---|
| msg233549 - (view) | Author: Jon Dufresne (jdufresne) * | Date: 2015-01-06 19:05 | |
The following test script demonstrates that Python's csv library does not handle a BOM. I would expect the returned row to be equal to `expected`, and the script to print 'True' to stdout.
In the wild, other CSV writers commonly add a BOM. MS Excel is especially picky about the BOM when reading a UTF-8 encoded file, so many writers add one for interoperability with Excel.
If a Python program accepts a CSV file as input (often the case in web apps), these files will not be handled correctly without preprocessing. In my opinion, this should "just work" when reading the file.
---
import codecs
import csv
f = open('foo.csv', 'wb')
f.write(codecs.BOM_UTF8 + b'a,b,c')
f.close()
expected = ['a', 'b', 'c']
f = open('foo.csv')
r = csv.reader(f)
row = next(r)
print(row)
print(row == expected)
---
Output
---
$ ./python ~/test.py
['\ufeffa', 'b', 'c']
False
---
|
|||
| msg233550 - (view) | Author: R. David Murray (r.david.murray) * (Python committer) | Date: 2015-01-06 19:52 | |
This is not a problem with the csv module in particular. See issue 7651. |
|||
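A possible workaround, which is not part of the original report: since the BOM is a property of the text encoding rather than of CSV, one option is to open the file with Python's `utf-8-sig` codec, which consumes a leading UTF-8 BOM if present and otherwise behaves like `utf-8`. The sketch below assumes the input is UTF-8; the helper name `read_first_row` and the temporary file location are illustrative only.

```python
import codecs
import csv
import os
import tempfile


def read_first_row(path, encoding="utf-8-sig"):
    """Return the first CSV row of `path`, decoded with `encoding`.

    Hypothetical helper for illustration; assumes the file is UTF-8.
    """
    # newline="" is what the csv docs recommend for files given to csv.reader.
    with open(path, encoding=encoding, newline="") as f:
        return next(csv.reader(f))


# Recreate the file from the report: a UTF-8 BOM followed by one row.
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "foo.csv")
with open(path, "wb") as f:
    f.write(codecs.BOM_UTF8 + b"a,b,c")

row_sig = read_first_row(path)                      # BOM consumed by the codec
row_plain = read_first_row(path, encoding="utf-8")  # BOM kept as U+FEFF

print(row_sig)
print(row_plain)
```

With plain `utf-8` the BOM survives as `\ufeff` at the start of the first field, which matches the output shown in the report. Files in other encodings (for example UTF-16) would need a different codec.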
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022-04-11 14:58:11 | admin | set | github: 67367 |
| 2015-01-06 19:52:05 | r.david.murray | set | status: open -> closed; superseder: Python3: guess text file charset using the BOM; nosy: + r.david.murray; messages: + msg233550; resolution: duplicate; stage: resolved |
| 2015-01-06 19:05:05 | jdufresne | create | |