I'm having a problem reading and printing Turkish in Python: the Turkish letters in a word are not recognized. The problem doesn't arise when I store strings in other languages such as Russian, Japanese, and Chinese.
>>> s = u'abartmadığını'
>>> s
u'abartmad???n?'
>>> print s
abartmad???n?
How can I adjust the encoding to solve this problem? I am using Python 2.7.10 on Windows 10. Changing the code page of the command line to 28595 doesn't seem to work; I just get the following error in the Python console:
LookupError: unknown encoding: cp28595
Maybe you need to accept using non-Turkish letters, because Turkish letters might not be usable in Unicode. – Franz Noel, Dec 5, 2015 at 4:09
@FranzNoel Nope, the same thing works well on Mac OS; there must be some issue with the environment. – GJ., Dec 5, 2015 at 4:18
Works well on Linux. Must be something with Windows 10. Are you using the CMD terminal? – Muposat, Dec 5, 2015 at 4:19
Are you typing that directly at the console? That's likely not going to work without a Turkish version of Windows, or configuring the Windows system locale to Turkey. – Mark Tolonen, Dec 5, 2015 at 8:17
2 Answers
The Windows console is notorious for not supporting Unicode well. Use an IDE that supports UTF-8 output. Here's an example from PythonWin, part of the pywin32 third-party module:
PythonWin 2.7.9 (default, Dec 10 2014, 12:24:55) [MSC v.1500 32 bit (Intel)] on win32.
Portions Copyright 1994-2008 Mark Hammond - see 'Help/About PythonWin' for further copyright information.
>>> s = u'abartmadığını. 我是美国人。 ру́сский язы́к'
>>> s
u'abartmad\u0131\u011f\u0131n\u0131. \u6211\u662f\u7f8e\u56fd\u4eba\u3002 \u0440\u0443\u0301\u0441\u0441\u043a\u0438\u0439 \u044f\u0437\u044b\u0301\u043a'
>>> print s
abartmadığını. 我是美国人。 ру́сский язы́к
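To see where the question marks come from, here is a minimal sketch of the likely mechanism: the console's code page has no slot for `ı` or `ğ`, so they are replaced with `?`. The codec list and the helper name `survives` are my own choices for illustration; `cp1251` stands in for a non-Turkish console code page, which is an assumption about the asker's setup.

```python
# -*- coding: utf-8 -*-
from __future__ import print_function

# The word from the question, written with escapes so the source file's
# own encoding does not matter: u'abartmadığını'
WORD = u'abartmad\u0131\u011f\u0131n\u0131'


def survives(text, codec):
    """Return True if text round-trips through codec without loss.

    `survives` is a hypothetical helper, not part of any library.
    """
    try:
        return text.encode(codec).decode(codec) == text
    except UnicodeEncodeError:
        return False


# Turkish code pages and UTF-8 can hold the word; Cyrillic and ASCII cannot.
for codec in ('cp1254', 'cp857', 'iso8859_9', 'utf-8', 'cp1251', 'ascii'):
    print(codec, survives(WORD, codec))

# Forcing the word through a code page without Turkish letters reproduces
# the output shown in the question.
lossy = WORD.encode('cp1251', 'replace').decode('cp1251')
print(lossy)  # abartmad???n?
```

If your console's code page is one of the codecs that reports `False`, the characters are already lost before Python prints them, which is why switching to an environment with full Unicode support helps.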
Encode it to UTF-8 before printing:
>>> s = u'abartmadığını'
>>> print s.encode('utf-8')
abartmadığını
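One caveat, as a hedged sketch rather than a guaranteed fix: `encode('utf-8')` only produces bytes, and they display correctly only if the terminal interprets its input as UTF-8. The snippet below shows the bytes and what happens when they are read back with a different code page; `cp1254` is used purely as an example of a mismatched console encoding.

```python
# -*- coding: utf-8 -*-
from __future__ import print_function
import sys

s = u'abartmad\u0131\u011f\u0131n\u0131'  # u'abartmadığını'

# Each of the four Turkish letters becomes two bytes in UTF-8.
utf8_bytes = s.encode('utf-8')
print(len(s), len(utf8_bytes))  # 13 17

# Read back as UTF-8, the bytes give the original string.
print(utf8_bytes.decode('utf-8') == s)  # True

# Read back with another code page (cp1254 here, as an example), the same
# bytes turn into mojibake instead of the original word.
misread = utf8_bytes.decode('cp1254', 'replace')
print(misread == s)  # False

# What Python thinks the console uses; may be None when output is piped.
print(sys.stdout.encoding)
```

So this approach works when the console (or the file you redirect to) is actually treated as UTF-8, and shows garbage otherwise.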