I'm using Python 2.7.3 on Windows 7 (32-bit). In cmd, I ran this command:
chcp 1254
to switch the console code page to 1254. Then I ran this script:
#!/usr/bin/env python
# -*- coding:cp1254 -*-
print "öçışğüÖÇİŞĞÜ"
When I ran the code above, I got this output:
÷2■しかく3ÍæÌo▄
But when I put the u prefix on the string (print u"öçışğüÖÇİŞĞÜ"), the output was correct.
Then I edited the code as follows:
#!/usr/bin/env python
# -*- coding:cp1254 -*-
import os
a = r"C:\\"
b = "ö"
print os.path.join(a, b)
I got this output:
÷
So I tried this command:
print unicode(os.path.join(a, b))
and got this error:
print unicode(os.path.join(a, b))
UnicodeDecodeError: 'ascii' codec can't decode byte 0xf6 in position 13: ordinal
not in range(128)
Then I tried a different approach:
print os.path.join(a, b).decode("utf-8").encode(sys.stdout.encoding)
With that code, I got this error:
print os.path.join(a, b).decode("utf-8").encode(sys.stdout.encoding)
File "C:\Python27\lib\encodings\utf_8.py", line 16, in decode
return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf8' codec can't decode byte 0xf6 in position 13: invalid start byte
I can't get rid of this error. How can I solve it? Thanks.
- I can't reproduce the error: the initial code works fine here. – Alex Huszagh, Jun 11, 2015 at 15:20
- What is the output of this command: chcp? – python_pardus, Jun 11, 2015 at 15:33
- "C:\\ö". I tried it on two Python 2.7 installs, one of them a 32-bit Windows 7 install. – Alex Huszagh, Jun 11, 2015 at 17:02
- Are you running this from the standard "Command Prompt"? – Alex Huszagh, Jun 11, 2015 at 17:11
- For me, it produces the incorrectly encoded C:\\├╢ with the standard command prompt; from Cygwin, it works perfectly fine, which suggests the issue is how the encoding is represented in stdout. – Alex Huszagh, Jun 11, 2015 at 17:20
1 Answer
When I run your original code with chcp 857 (the Turkish OEM code page) instead, I can reproduce your issue, so I do not think chcp 1254 was actually in effect when you ran it:
÷2■しかく3ÍæÌo▄
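The mismatch can be simulated without a Windows console. The following is only a sketch, assuming the script was saved as cp1254 and the console was on code page 857: it encodes the text to cp1254 bytes (what Python 2 writes for a non-Unicode literal) and then decodes those bytes the way each console would. The names `raw`, `garbled`, and `correct` are illustrative.

```python
# -*- coding: utf-8 -*-
# Sketch: cp1254 bytes displayed by a console that is set to cp857.
text = u"öçışğüÖÇİŞĞÜ"

# Bytes that Python 2 writes for a non-Unicode literal saved as cp1254.
raw = text.encode("cp1254")

# What a console on code page 857 would show for those bytes.
# "replace" covers byte values that cp857 leaves undefined.
garbled = raw.decode("cp857", "replace")

# What a console on code page 1254 would show for the same bytes.
correct = raw.decode("cp1254")

print(repr(garbled))      # mojibake, starting with the division sign
print(correct == text)
```

The bytes are the same in both cases; only the code page used to display them differs.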
If you declare your source encoding as:
# -*- coding:cp1254 -*-
then you must also save your source file in that encoding. If you don't use Unicode strings, the console must use the same encoding as well. Then it works correctly.
Example (source declared cp1254, but saved incorrectly as cp1252, and console chcp 1254):
öçisgüÖÇISGÜ
Example (source declared and saved correctly as cp1254, console chcp 1254):
öçışğüÖÇİŞĞÜ
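The two errors in the question follow from the same rule: a byte string only decodes correctly with the encoding it was written in. In Python 2, unicode(x) on a byte string decodes with the default ASCII codec, and the byte 0xf6 is neither ASCII nor a valid UTF-8 start byte. A small sketch, assuming the literal was saved as cp1254 (the helper `try_decode` is hypothetical, written only for this illustration):

```python
# -*- coding: utf-8 -*-
# Sketch: the byte 0xf6 is "ö" in cp1254, but invalid as ASCII or as UTF-8.
raw = u"ö".encode("cp1254")  # the single byte 0xf6


def try_decode(data, encoding):
    """Return the decoded text, or the name of the error that decoding raised."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError as exc:
        return type(exc).__name__


ascii_result = try_decode(raw, "ascii")    # fails, like unicode(...) did
utf8_result = try_decode(raw, "utf-8")     # fails, like .decode("utf-8") did
cp1254_result = try_decode(raw, "cp1254")  # succeeds: matches the source encoding
```

So for the byte-string version of the script, decoding with "cp1254" (the declared source encoding) is the call that matches the data.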
It is important to note that with Unicode strings, you don't have to match the source encoding with the encoding of your console.
Example (declared and saved as UTF-8, with Unicode string):
#!python2
# -*- coding:utf8 -*-
print u"öçışğüÖÇİŞĞÜ"
Output (use any code page that supports Turkish...1254, 857, 1026...):
öçışğüÖÇİŞĞÜ
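This works because printing a Unicode string in Python 2 encodes it with sys.stdout.encoding, whatever the console code page happens to be. The sketch below checks that the text survives encoding to each code page listed above, and applies the same idea to the path from the question by keeping both parts Unicode. It uses ntpath (the Windows implementation of os.path) so that it behaves the same on any platform; `round_trips` and `path` are illustrative names.

```python
# -*- coding: utf-8 -*-
import ntpath  # Windows flavour of os.path, importable on any platform

text = u"öçışğüÖÇİŞĞÜ"

# A Unicode string can be encoded for each of these code pages and read back intact.
round_trips = {}
for codepage in ("cp1254", "cp857", "cp1026"):
    round_trips[codepage] = text.encode(codepage).decode(codepage) == text

# The path from the question, built from Unicode strings instead of byte strings.
path = ntpath.join(u"C:\\", u"ö")

print(round_trips)
print(repr(path))
```

With Unicode strings throughout, no manual decode or encode call is needed before printing, as long as the console code page can represent the characters.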