This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

classification
Title: tempfile.mkdtemp fails with non-ascii paths on Python 2
Type: behavior Stage: resolved
Components: Unicode Versions: Python 2.7
process
Status: closed Resolution: out of date
Dependencies: Superseder:
Assigned To: Nosy List: akira, ezio.melotti, gregory.p.smith, risto3, scoder, serhiy.storchaka, vstinner
Priority: normal Keywords:

Created on 2015-01-25 11:02 by akira, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Messages (8)
msg234662 - (view) Author: Akira Li (akira) * Date: 2015-01-25 11:02
Python 2.7.9 (default, Jan 25 2015, 13:41:30)
[GCC 4.9.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import os, sys, tempfile
>>> d = u'\u20ac'.encode(sys.getfilesystemencoding()) # non-ascii
>>> if not os.path.isdir(d): os.makedirs(d)
...
>>> os.environ['TEMP'] = d
>>> tempfile.mkdtemp(prefix=u'')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File ".../python2.7/tempfile.py", line 331, in mkdtemp
    file = _os.path.join(dir, prefix + name + suffix)
  File ".../python2.7/posixpath.py", line 80, in join
    path += '/' + b
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 13: ordinal not in range(128)
Related: https://bugs.python.org/issue1681974 
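The traceback above arises because Python 2 silently decodes the byte-string directory with the ASCII codec when it is joined with a unicode prefix. A minimal sketch that spells out that hidden step; the helper name py2_style_join is hypothetical, and the code is written so that it also runs on Python 3, where the coercion has to be explicit:

```python
def py2_style_join(dir_bytes, name_text):
    """Emulate Python 2's implicit bytes -> unicode coercion in a path join.

    Hypothetical helper for illustration only: Python 2 performs this
    decode silently, with the default 'ascii' codec, whenever a byte
    string is concatenated with a unicode object.
    """
    return dir_bytes.decode('ascii') + u'/' + name_text


# A non-ASCII directory name as bytes (UTF-8 is assumed here for the sketch).
euro_dir = u'\u20ac'.encode('utf-8')

try:
    py2_style_join(euro_dir, u'tmpabc')
except UnicodeDecodeError as exc:
    # exc.start is the offset of the first byte the ASCII codec rejected.
    failed_byte = euro_dir[exc.start:exc.start + 1]
    print('cannot decode byte %r at position %d' % (failed_byte, exc.start))
```

With a pure-ASCII directory the same join succeeds, which is why the problem only shows up for non-ASCII temp paths.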
msg234664 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2015-01-25 12:05
Why do you use a unicode prefix? Does it work with a bytes prefix?
You should use Python 3 if you want the best Unicode support.
msg257333 - (view) Author: Richard PALO (risto3) Date: 2016-01-02 07:42
I notice similar problems, as found when running the test suite for lxml 3.5.0 on python2.7
======================================================================
ERROR: test_etree_parse_io_error (lxml.tests.test_io.ETreeIOTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
 File "/opt/local/lib/python2.7/unittest/case.py", line 329, in run
 testMethod()
 File "/tmp/pkgsrc/textproc/py-lxml/work/lxml-3.5.0/src/lxml/tests/test_io.py", line 276, in test_etree_parse_io_error
 dn = tempfile.mkdtemp(prefix=dirnameRU)
 File "/opt/local/lib/python2.7/tempfile.py", line 339, in mkdtemp
 _os.mkdir(file, 0700)
UnicodeEncodeError: 'ascii' codec can't encode characters in position 40-53: ordinal not in range(128)
======================================================================
ERROR: test_etree_parse_io_error (lxml.tests.test_io.ElementTreeIOTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
 File "/opt/local/lib/python2.7/unittest/case.py", line 329, in run
 testMethod()
 File "/tmp/pkgsrc/textproc/py-lxml/work/lxml-3.5.0/src/lxml/tests/test_io.py", line 276, in test_etree_parse_io_error
 dn = tempfile.mkdtemp(prefix=dirnameRU)
 File "/opt/local/lib/python2.7/tempfile.py", line 339, in mkdtemp
 _os.mkdir(file, 0700)
UnicodeEncodeError: 'ascii' codec can't encode characters in position 40-53: ordinal not in range(128)
The code snippet is in test_io.py, line 276:
 266	 def test_etree_parse_io_error(self):
 267		# this is a directory name that contains characters beyond latin-1
 268		dirnameEN = _str('Directory')
 269		dirnameRU = _str('Каталог')
 270		filename = _str('nosuchfile.xml')
 271		dn = tempfile.mkdtemp(prefix=dirnameEN)
 272		try:
 273		 self.assertRaises(IOError, self.etree.parse, os.path.join(dn, filename))
 274		finally:
 275		 os.rmdir(dn)
 276		dn = tempfile.mkdtemp(prefix=dirnameRU)
 277		try:
 278		 self.assertRaises(IOError, self.etree.parse, os.path.join(dn, filename))
 279		finally:
 280		 os.rmdir(dn)
even if I change dirnameRU to a simple French 'Répertoire' I still get errors...
It is not an option to upgrade to 3.0, sorry.
BTW, I tried passing dirnameRU.encode('utf-8') but that just generates
a different error:
ERROR: test_etree_parse_io_error (lxml.tests.test_io.ETreeIOTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
 File "/opt/local/lib/python2.7/unittest/case.py", line 329, in run
 testMethod()
 File "/tmp/pkgsrc/textproc/py-lxml/work/lxml-3.5.0/src/lxml/tests/test_io.py", line 278, in test_etree_parse_io_error
 self.assertRaises(IOError, self.etree.parse, os.path.join(dn, filename))
 File "/opt/local/lib/python2.7/posixpath.py", line 73, in join
 path += '/' + b
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 40: ordinal not in range(128)
msg257334 - (view) Author: Richard PALO (risto3) Date: 2016-01-02 07:58
If I also add .encode('utf-8') to filename on line 278, that seems to get over the pathname problem.
I guess it comes down to this: if sys.getfilesystemencoding() is utf-8, which it is in my case (on SunOS), I believe these conversions should be automatic.
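For comparison, Python 3 exposes this conversion explicitly through os.fsencode and os.fsdecode, which use the filesystem encoding reported by sys.getfilesystemencoding(). A small sketch under that assumption; it does not apply to Python 2.7, and the helper name join_as_bytes is made up for illustration:

```python
import os
import sys


def join_as_bytes(*parts):
    """Join path parts after converting every part to bytes.

    Hypothetical helper: os.fsencode leaves bytes untouched and encodes
    text with the filesystem encoding, so all parts end up the same type.
    """
    return os.path.join(*[os.fsencode(part) for part in parts])


print(sys.getfilesystemencoding())

# One part is already UTF-8 bytes ('Répertoire'), the other is text.
path = join_as_bytes(b'/tmp/R\xc3\xa9pertoire', 'nosuchfile.xml')
print(type(path).__name__)
```

Converting every component to one type before joining avoids the mixed bytes/text join that fails in the traceback above.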
msg257338 - (view) Author: Richard PALO (risto3) Date: 2016-01-02 08:59
curiously enough, I was able to test with python3.5.
The same errors result, and the same workaround seems to get over it.
msg257340 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-01-02 09:37
A similar problem in Python 3 was addressed in issue24230, but that was a new feature.
As for the lxml tests, I suggest using bytes names compatible with all Windows OEM encodings (consisting of ASCII + b'\xa9\xb0\xb2\xb3\xb4\xb8\xb9\xba\xbb\xbc\xbd\xbe\xbf\xc0\xc1\xc2\xc3\xc4\xc5\xc8\xc9\xe6\xf0\xf1\xf3\xf4\xf5\xf6\xf7') and with UTF-8.
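As a rough illustration of the Python 3 behaviour mentioned above: since Python 3.5, tempfile.mkdtemp accepts bytes for prefix and dir and then returns a bytes path. The sketch below assumes Python 3.5 or later and a filesystem that accepts UTF-8 encoded names; the directory and file names are only examples:

```python
import os
import shutil
import tempfile

# Scratch parent directory so the example cleans up after itself.
base = tempfile.mkdtemp()
try:
    # Bytes in, bytes out: prefix and dir are both bytes, so no implicit
    # text/bytes mixing can occur. b'R\xc3\xa9pertoire' is UTF-8 for
    # 'Répertoire'.
    created = tempfile.mkdtemp(prefix=b'R\xc3\xa9pertoire',
                               dir=os.fsencode(base))
    created_is_dir = os.path.isdir(created)
    # Later joins have to stay in bytes as well.
    target = os.path.join(created, b'nosuchfile.xml')
finally:
    shutil.rmtree(base)
```

Passing text for one argument and bytes for the other is still an error on Python 3, so the types need to be kept consistent by the caller.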
msg257342 - (view) Author: Richard PALO (risto3) Date: 2016-01-02 10:28
This turns out to be related to the locale environment set to 'C'.
A UTF-8 locale seems to get over the issue.
A fellow pkgsrc colleague filed an issue with lxml already relating to that fact for the test suite (https://bugs.launchpad.net/lxml/+bug/1522052)
cheers
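The locale dependence can be checked before creating the directory. Under the 'C' locale the filesystem encoding is typically ASCII, which cannot represent the Cyrillic prefix. A minimal sketch; the helper name can_encode is hypothetical, and a test suite might use such a check to skip the non-ASCII case rather than fail:

```python
import sys


def can_encode(name, encoding=None):
    """Return True if the text `name` can be encoded with `encoding`.

    Hypothetical helper: defaults to the interpreter's filesystem
    encoding, which depends on the locale the process was started in.
    """
    if encoding is None:
        encoding = sys.getfilesystemencoding()
    try:
        name.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


# The Cyrillic directory prefix, written with escapes to keep the
# source file ASCII-only.
dirname_ru = u'\u041a\u0430\u0442\u0430\u043b\u043e\u0433'

print(can_encode(dirname_ru, 'ascii'))   # encoding typical of the 'C' locale
print(can_encode(dirname_ru, 'utf-8'))   # encoding of a UTF-8 locale
print(can_encode(dirname_ru))            # whatever this process is using
```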
msg370480 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2020-05-31 15:11
Python 2.7 is no longer supported.
History
Date                 User              Action  Args
2022-04-11 14:58:12  admin             set     github: 67504
2020-05-31 15:11:08  serhiy.storchaka  set     status: open -> closed; resolution: out of date; messages: + msg370480; stage: resolved
2016-01-02 10:28:34  risto3            set     messages: + msg257342
2016-01-02 09:37:48  serhiy.storchaka  set     nosy: + scoder, gregory.p.smith, serhiy.storchaka; messages: + msg257340
2016-01-02 08:59:45  risto3            set     messages: + msg257338
2016-01-02 07:58:22  risto3            set     messages: + msg257334
2016-01-02 07:42:24  risto3            set     nosy: + risto3; messages: + msg257333
2015-01-25 12:05:10  vstinner          set     messages: + msg234664
2015-01-25 11:02:16  akira             create