
This issue tracker has been migrated to GitHub and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

classification
Title: open() rejects bytes as filename
Type: behavior
Stage:
Components: Library (Lib)
Versions: Python 3.0
process
Status: closed
Resolution: duplicate
Dependencies:
Superseder: os.listdir can return byte strings (View: 3187)
Assigned To:
Nosy List: amaury.forgeotdarc, dlitz
Priority: normal
Keywords:

Created on 2008-08-26 17:06 by dlitz, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Messages (2)
msg71986 - (view) Author: Dwayne Litzenberger (dlitz) Date: 2008-08-26 17:05
On Linux/ext3, filenames are stored natively as sequences of octets. On
Win32/NTFS, they are stored natively as sequences of Unicode code points.
In Python 2.x, the way to unambiguously open a particular file was to
pass the filename as a str object on Linux/ext3 and as a unicode object
on Win32/NTFS. os.listdir(".") would return every filename as a str
object, and os.listdir(u".") would return every filename as a unicode
object---based on the current locale settings---*except* for filenames
that couldn't be decoded that way.
Consider this bash script (executed on Linux under a UTF-8 locale):
 export LC_CTYPE=en_CA.UTF-8 # requires the en_CA.UTF-8 locale to be built
 mkdir /tmp/foo
 cd /tmp/foo
 touch $'UTF-8 compatible filename\xc2\xa2'
 touch $'UTF-8 incompatible filename\xc0'
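For reference, here is a minimal sketch of why the two names behave differently under a UTF-8 locale: the first ends in the valid two-byte sequence `\xc2\xa2`, while the second ends in a lone `\xc0`, which is never valid UTF-8. The helper name `try_decode` is hypothetical and only illustrates the decode step; it is not what os.listdir() does internally.

```python
# The two filenames created by the bash script, as raw bytes.
names = [
    b"UTF-8 compatible filename\xc2\xa2",
    b"UTF-8 incompatible filename\xc0",
]


def try_decode(raw, encoding="utf-8"):
    """Return the decoded text, or None if the bytes are not valid in `encoding`."""
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError:
        return None


decoded = [try_decode(name) for name in names]
for raw, text in zip(names, decoded):
    print(repr(raw), "->", repr(text))
```

Only the first name has a text form; the second can only be referred to by its bytes.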
Under Python 2.5.2, you get this:
 >>> import os
 >>> os.listdir(u".")
 ['UTF-8 incompatible filename\xc0', u'UTF-8 compatible filename\xa2']
 >>> os.listdir(".")
 ['UTF-8 incompatible filename\xc0', 'UTF-8 compatible filename\xc2\xa2']
 >>> [open(f, "r") for f in os.listdir(u".")]
 [<open file 'UTF-8 incompatible filename�', mode 'r' at 0xb7cee578>,
<open file 'UTF-8 compatible filename¢', mode 'r' at 0xb7cee6e0>]
Under Python 3.0b3, you get this:
 >>> import os
 >>> os.listdir(".")
 [b'UTF-8 incompatible filename\xc0', 'UTF-8 compatible filename¢']
 >>> os.listdir(b".")
 [b'UTF-8 incompatible filename\xc0', b'UTF-8 compatible filename\xc2\xa2']
 >>> [open(f, "r") for f in os.listdir(".")]
 Traceback (most recent call last):
 File "<stdin>", line 1, in <module>
 File "<stdin>", line 1, in <listcomp>
 File "/home/dwon/python3.0b3/lib/python3.0/io.py", line 284, in __new__
 return open(*args, **kwargs)
 File "/home/dwon/python3.0b3/lib/python3.0/io.py", line 184, in open
 raise TypeError("invalid file: %r" % file)
 TypeError: invalid file: b'UTF-8 incompatible filename\xc0'
This behaviour of open() makes it impossible to write code that opens
arbitrarily-named files on Linux/ext3.
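As an illustration of the behaviour being asked for, the sketch below stays in bytes end to end: it lists a directory with a bytes path and opens each entry by its raw name. It assumes an interpreter in which open() and os.path.join() accept bytes paths (true of current Python 3 releases, but not of 3.0b3 as the traceback above shows) and a filesystem that allows arbitrary byte sequences in names, such as Linux/ext3. The helper name `open_all` is hypothetical.

```python
import os
import tempfile


def open_all(directory):
    """Open every entry of `directory` (a bytes path) by its raw bytes name.

    Returns the sorted list of names that were opened successfully.
    """
    opened = []
    for name in os.listdir(directory):        # bytes in, bytes out
        path = os.path.join(directory, name)  # stays bytes, no decoding
        with open(path, "rb"):
            opened.append(name)
    return sorted(opened)


# Recreate the two files from the bash script in a scratch directory.
names = [
    b"UTF-8 compatible filename\xc2\xa2",
    b"UTF-8 incompatible filename\xc0",
]
with tempfile.TemporaryDirectory() as tmp:
    tmp_bytes = os.fsencode(tmp)
    for name in names:
        with open(os.path.join(tmp_bytes, name), "wb"):
            pass
    result = open_all(tmp_bytes)

print(result)
```

Because no name is ever decoded, the file whose name is not valid UTF-8 is handled the same way as the one that is.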
msg71988 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2008-08-26 17:14
This is actively being discussed (and developed) in issue3187.
History
Date                 User                Action  Args
2022-04-11 14:56:38  admin               set     github: 47938
2008-08-26 17:14:06  amaury.forgeotdarc  set     status: open -> closed
                                                 resolution: duplicate
                                                 superseder: os.listdir can return byte strings
                                                 messages: + msg71988
                                                 nosy: + amaury.forgeotdarc
2008-08-26 17:06:48  dlitz               set     type: behavior
                                                 components: + Library (Lib), - Windows
2008-08-26 17:06:01  dlitz               create
