Created on 2012-02-10 17:14 by maubp, last changed 2022-04-11 14:57 by admin. This issue is now closed.
| Messages (4) | |||
|---|---|---|---|
| msg153067 | Author: Peter (maubp) | Date: 2012-02-10 17:14 | |
Consider the following example, where I have a gzipped text file:
$ python3
Python 3.2 (r32:88445, Feb 28 2011, 17:04:33)
[GCC 4.2.1 (Apple Inc. build 5664)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import gzip
>>> with gzip.open("ex1.sam.gz") as handle:
...     line = handle.readline()
...
>>> line
b'EAS56_57:6:190:289:82\t69\tchr1\t100\t0\t*\t=\t100\t0\tCTCAAGGTTGTTGCAAGGGGGTCTATGTGAACAAA\t<<<7<<<;<<<<<<<<8;;<7;4<;<;;;;;94<;\tMF:i:192\n'
Notice that the file was opened in binary mode and a byte string is returned. "rb" is the default for gzip.open, which is surprising given that "t" is the default for open on Python 3.
Now try explicitly using non-binary reading with "r"; again you get bytes rather than the (unicode) string I would expect:
>>> with gzip.open("ex1.sam.gz", "r") as handle:
...     line = handle.readline()
...
>>> line
b'EAS56_57:6:190:289:82\t69\tchr1\t100\t0\t*\t=\t100\t0\tCTCAAGGTTGTTGCAAGGGGGTCTATGTGAACAAA\t<<<7<<<;<<<<<<<<8;;<7;4<;<;;;;;94<;\tMF:i:192\n'
Now try to use "t" or "rt" to be even more explicit that text mode is desired:
>>> with gzip.open("ex1.sam.gz", "t") as handle:
...     line = handle.readline()
...
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/pjcock/lib/python3.2/gzip.py", line 46, in open
    return GzipFile(filename, mode, compresslevel)
  File "/Users/pjcock/lib/python3.2/gzip.py", line 157, in __init__
    fileobj = self.myfileobj = builtins.open(filename, mode or 'rb')
ValueError: can't have text and binary mode at once
>>> with gzip.open("ex1.sam.gz", "rt") as handle:
...     line = handle.readline()
...
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/pjcock/lib/python3.2/gzip.py", line 46, in open
    return GzipFile(filename, mode, compresslevel)
  File "/Users/pjcock/lib/python3.2/gzip.py", line 157, in __init__
    fileobj = self.myfileobj = builtins.open(filename, mode or 'rb')
ValueError: can't have text and binary mode at once
See also Issue #5148, which may be related.
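The binary default described in this report can be checked without the original data file. The following is a minimal sketch, assuming a throwaway gzip file in a temporary directory; the file name `example.sam.gz` and the sample record are hypothetical stand-ins for `ex1.sam.gz`.

```python
import gzip
import os
import tempfile

# Hypothetical sample record standing in for the contents of ex1.sam.gz.
sample = b"read1\t69\tchr1\t100\n"

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "example.sam.gz")

    # Write a small gzipped file to read back.
    with gzip.open(path, "wb") as handle:
        handle.write(sample)

    # No mode given: gzip.open reads in binary mode.
    with gzip.open(path) as handle:
        default_line = handle.readline()

    # "r" is also treated as binary by gzip.open, unlike builtins.open.
    with gzip.open(path, "r") as handle:
        r_line = handle.readline()

print(type(default_line).__name__, type(r_line).__name__)
```

Both reads return `bytes`, matching the behaviour shown in the session above.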
|
|||
| msg153127 | Author: Nadeem Vawda (nadeem.vawda) * (Python committer) | Date: 2012-02-11 13:55 | |
The problem here is that gzip.GzipFile does not support text mode, only
binary mode. Unfortunately, its __init__ method doesn't handle unexpected
mode strings sensibly, so you get a confusing error message.
If you need to open a compressed file in text mode in Python 3.2, use
io.TextIOWrapper:
import gzip
import io
with io.TextIOWrapper(gzip.open("ex1.sam.gz", "r")) as f:
    line = f.readline()
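For a version of this workaround that runs without the original file, here is a minimal sketch. It assumes a temporary gzip file with hypothetical contents and passes an explicit `encoding`, which the snippet above leaves at the platform default.

```python
import gzip
import io
import os
import tempfile

# Hypothetical text content; the file name below is also made up.
text = "first line\nsecond line\n"

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "example.txt.gz")

    # Create the compressed file in binary mode.
    with gzip.open(path, "wb") as raw:
        raw.write(text.encode("utf-8"))

    # Wrap the binary GzipFile so that reads return str instead of bytes.
    # Closing the wrapper also closes the underlying GzipFile.
    with io.TextIOWrapper(gzip.open(path, "r"), encoding="utf-8") as f:
        line = f.readline()

print(repr(line))
```

Stating the encoding explicitly avoids depending on the locale's default, which may differ between machines.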
In 3.3, it would be nice for gzip.open to do this transparently when mode
is "rt"/"wt"/"at". However, binary mode will still need to be the default
(for modes "r", "w" and "a"), to ensure backward compatibility.
In the meantime, I'll add a note to the documentation about this
limitation, and fix GzipFile.__init__ to produce a more sensible error
message.
|
|||
| msg153139 | Author: Roundup Robot (python-dev) (Python triager) | Date: 2012-02-11 22:06 | |
New changeset 4b32309631da by Nadeem Vawda in branch '3.2':
Issue #13989: Document that GzipFile does not support text mode.
http://hg.python.org/cpython/rev/4b32309631da
New changeset 8dbe8faea0e7 by Nadeem Vawda in branch 'default':
Merge: #13989: Document that GzipFile does not support text mode.
http://hg.python.org/cpython/rev/8dbe8faea0e7
|
|||
| msg160080 | Author: Roundup Robot (python-dev) (Python triager) | Date: 2012-05-06 13:11 | |
New changeset 55202ca694d7 by Nadeem Vawda in branch 'default':
Closes #13989: Add support for text modes to gzip.open().
http://hg.python.org/cpython/rev/55202ca694d7
|
|||
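The closing changeset adds text modes to gzip.open(). As a minimal sketch of the resulting behaviour, assuming Python 3.3 or later and a hypothetical temporary file, the "rt" and "wt" modes can be used directly while "rb" still returns bytes:

```python
import gzip
import os
import tempfile

with tempfile.TemporaryDirectory() as tmpdir:
    # Hypothetical file name, used only for this illustration.
    path = os.path.join(tmpdir, "example.txt.gz")

    # "wt" writes str; the encoding is given explicitly.
    with gzip.open(path, "wt", encoding="utf-8") as handle:
        handle.write("first line\n")

    # "rt" reads str, with no manual io.TextIOWrapper needed.
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        text_line = handle.readline()

    # Binary mode is unchanged and still yields bytes.
    with gzip.open(path, "rb") as handle:
        binary_line = handle.readline()

print(type(text_line).__name__, type(binary_line).__name__)
```

Plain "r", "w" and "a" stay binary for backward compatibility, as discussed in msg153127, so text mode has to be requested with an explicit "t".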
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022-04-11 14:57:26 | admin | set | github: 58197 |
| 2012-05-06 13:11:08 | python-dev | set | status: open -> closed resolution: fixed messages: + msg160080 stage: resolved |
| 2012-02-11 22:06:32 | python-dev | set | nosy: + python-dev messages: + msg153139 |
| 2012-02-11 13:55:56 | nadeem.vawda | set | messages: + msg153127 |
| 2012-02-10 17:49:20 | pitrou | set | nosy: + nadeem.vawda |
| 2012-02-10 17:14:53 | maubp | create | |