Using Python 3.2 on Windows 7, I get the following in IDLE:
>>> compile('pass', r'c:\temp\工具\module1.py', 'exec')
UnicodeEncodeError: 'mbcs' codec can't encode characters in position 0--1: invalid character
Can anybody explain why the compile call tries to convert the Unicode filename using mbcs? I know that sys.getfilesystemencoding() returns 'mbcs' on Windows, but I thought this encoding is not used when Unicode file names are provided.
For example:
f = open(r'c:\temp\工具\module1.py')
works.
For a more complete test, save the following in a UTF-8 encoded file and run it with the standard python.exe, version 3.2:
# -*- coding: utf8 -*-
fname = r'c:\temp\工具\module1.py'
# I do have a file named fname, but you can comment out the following two lines
f = open(fname)
print('ok')
cmp = compile('pass', fname, 'exec')
print(cmp)
Output:
ok
Traceback (most recent call last):
  File "module8.py", line 6, in <module>
    cmp = compile('pass', fname, 'exec')
UnicodeEncodeError: 'mbcs' codec can't encode characters in position 0--1: invalid character
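The encoding step that fails can be reproduced without compile at all. The following is only a sketch: the 'mbcs' codec exists only on Windows, so cp1252 is used here as an assumed stand-in for the ANSI code page that 'mbcs' maps to on a Western European system, and encode_error is a hypothetical helper name, not part of Python.

```python
# Assumption: cp1252 stands in for the ANSI code page behind 'mbcs';
# the real code page depends on the Windows system locale.
ASSUMED_ANSI_CODE_PAGE = 'cp1252'

fname = r'c:\temp\工具\module1.py'


def encode_error(path, encoding):
    """Return the UnicodeEncodeError raised when encoding path, or None."""
    try:
        path.encode(encoding)
    except UnicodeEncodeError as exc:
        return exc
    return None


# The Chinese characters have no representation in the assumed code page.
err = encode_error(fname, ASSUMED_ANSI_CODE_PAGE)
print(err)

# A plain ASCII path encodes without any problem.
print(encode_error(r'c:\temp\tools\module1.py', ASSUMED_ANSI_CODE_PAGE))
```

If the system code page were one that contains these characters (for example a Chinese locale), the same path would presumably encode without error, which may explain why the problem does not reproduce on every machine.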
- Tried locally in XP and got a proper code object back. Is this being run from the CLI or is this run via a file? (monkut, Jan 10, 2012 at 5:56)
- I'm going to guess that it's not the call signature that's the problem, but the content of the file that is causing the Unicode error. Check to make sure that "module1.py" is correctly encoded, with the encoding signature assigned. (monkut, Jan 10, 2012 at 6:24)
- @monkut: In Python 3.x, you don't have to worry about encoding - if there are UTF-8 characters in the file, then they'll be rendered as UTF-8 characters. (Makoto, Jan 10, 2012 at 6:26)
- Hmmmm... still seems like an encoding issue with "module1.py". Perhaps the sig is set to "mbcs", overriding the default? (monkut, Jan 10, 2012 at 6:38)
- The compile function converts the filename argument to bytes using the filesystem encoding: hg.python.org/cpython/file/4f8c24830a5c/Python/… . I suspect it shouldn't be doing this. (Thomas K, Jan 10, 2012 at 13:25)
3 Answers
From Python issue 10114, it seems the reasoning is that all filenames used by Python should be valid for the platform where they are used. The filename is encoded using the filesystem encoding so that it can be used in the C internals of Python.
I agree that it probably shouldn't raise an error on Windows, because any Unicode filename is valid there. You may wish to file a bug report with Python for this. But be aware that the necessary changes might not be trivial, because any C code using the filename needs a fallback for the case where the name can't be encoded.
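Until that changes, one possible workaround is to check the filename yourself before calling compile. This is a minimal sketch, not an official fix: safe_compile and its fs_encoding parameter are hypothetical names introduced here, and the fallback simply escapes the unencodable characters so that compile receives a pure ASCII label.

```python
import sys


def safe_compile(source, filename, mode='exec', fs_encoding=None):
    """Compile source, escaping the filename if it cannot be encoded.

    fs_encoding defaults to sys.getfilesystemencoding(); it is a
    parameter only so that the fallback can be exercised on any system.
    """
    if fs_encoding is None:
        fs_encoding = sys.getfilesystemencoding()
    try:
        # The same kind of strict encoding step that fails in the question.
        filename.encode(fs_encoding)
        label = filename
    except UnicodeEncodeError:
        # Replace unencodable characters with \uXXXX escapes.
        label = filename.encode('ascii', 'backslashreplace').decode('ascii')
    return compile(source, label, mode)


fname = r'c:\temp\工具\module1.py'
code = safe_compile('x = 1', fname)

# Force the fallback path by pretending the filesystem encoding is ASCII.
escaped = safe_compile('x = 1', fname, fs_encoding='ascii')
print(escaped.co_filename)
```

The trade-off is that the escaped label no longer matches the real file on disk, so tracebacks from the compiled code will not be able to show the source lines.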
Here is a solution that worked for me, from Issue 427 (UnicodeEncodeError: 'ascii' codec can't encode characters in position 1-6: ordinal not in range(128)):
If you look at the topic "Encoded Python Source Files" in the PyScripter help file (last paragraph), it explains how to configure Python to support other encodings by modifying the site.py file. This file is in the Lib subdirectory of the Python installation directory. Find the function setencoding and make sure that support for locale-aware default string encodings is turned on, by changing the first "if 0:" to "if 1:" as marked below.
def setencoding():
    """Set the string encoding used by the Unicode implementation. The
    default is 'ascii', but if you're willing to experiment, you can
    change this."""
    encoding = "ascii" # Default value set by _PyUnicode_Init()
    if 0:  # <<<--- set this to 1
        # Enable to support locale aware default string encodings.
        import locale
        loc = locale.getdefaultlocale()
        if loc[1]:
            encoding = loc[1]
    if 0:
        # Enable to switch off string to Unicode coercion and implicit
        # Unicode to string conversion.
        encoding = "undefined"
    if encoding != "ascii":
        # On Non-Unicode builds this will raise an AttributeError...
        sys.setdefaultencoding(encoding) # Needs Python Unicode build !
I think you could try changing the "\" in the file path to "/". That is, replace
compile('pass', r'c:\temp\工具\module1.py', 'exec')
with
compile('pass', r'c:/temp/工具/module1.py', 'exec')
I ran into a similar problem and this method solved it for me. I hope it works for you too.