What's the deal with Python 3.4, Unicode, different languages and Windows? [duplicate]

Happy examples:

#!/usr/bin/env python
# -*- coding: utf-8 -*-
czech = u'Leoš Janáček'.encode("utf-8")
print(czech)
pl = u'Zdzisław Beksiński'.encode("utf-8")
print(pl)
jp = u'リング 山村 貞子'.encode("utf-8")
print(jp)
chinese = u'五行'.encode("utf-8")
print(chinese)
MIR = u'Машина для Инженерных Расчётов'.encode("utf-8")
print(MIR)
pt = u'Minha Língua Portuguesa: çáà'.encode("utf-8")
print(pt)

Unhappy output:

b'Leo\xc5\xa1 Jan\xc3\xa1\xc4\x8dek'
b'Zdzis\xc5\x82aw Beksi\xc5\x84ski'
b'\xe3\x83\xaa\xe3\x83\xb3\xe3\x82\xb0 \xe5\xb1\xb1\xe6\x9d\x91 \xe8\xb2\x9e\xe5\xad\x90'
b'\xe4\xba\x94\xe8\xa1\x8c'
b'\xd0\x9c\xd0\xb0\xd1\x88\xd0\xb8\xd0\xbd\xd0\xb0 \xd0\xb4\xd0\xbb\xd1\x8f \xd0\x98\xd0\xbd\xd0\xb6\xd0\xb5\xd0\xbd\xd0\xb5\xd1\x80\xd0\xbd\xd1\x8b\xd1\x85 \xd0\xa0\xd0\xb0\xd1\x81\xd1\x87\xd1\x91\xd1\x82\xd0\xbe\xd0\xb2'
b'Minha L\xc3\xadngua Portuguesa: \xc3\xa7\xc3\xa1\xc3\xa0'
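For context, the `b'...'` lines are how Python 3 displays a `bytes` object: `.encode()` returns `bytes`, and `print` shows its repr rather than decoding it. A minimal sketch of that behaviour, using the first string from the examples above (the variable names `encoded` and `shown` are only illustrative):

```python
# .encode() turns a str into bytes; printing bytes shows the escaped repr.
czech = u'Leoš Janáček'
encoded = czech.encode("utf-8")

# repr() is what print() displays for a bytes object.
shown = repr(encoded)
print(shown)

# Decoding with the same codec recovers the original text.
roundtrip = encoded.decode("utf-8")
```

So the escaped output is not corruption; the text is still there, only in encoded form.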

And if I print them like this:

jp = u'リング 山村 貞子'
print(jp)

I get:

Traceback (most recent call last):
  File "x.py", line 5, in <module>
    print(jp)
  File "C:\Python34\lib\encodings\cp850.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_map)[0]
UnicodeEncodeError: 'charmap' codec can't encode characters in position 0-2: character maps to <undefined>
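The same failure can be reproduced without a Windows console by encoding the string to cp850 directly, which is roughly what `print` does when the console reports that code page. This is only a sketch; the name `failure` is illustrative:

```python
# cp850 has no mapping for the katakana characters, so encoding fails
# on the first run of unencodable characters (positions 0 to 2).
jp = u'リング 山村 貞子'

failure = None
try:
    jp.encode("cp850")
except UnicodeEncodeError as exc:
    # Record codec name and the half-open range of offending characters.
    failure = (exc.encoding, exc.start, exc.end)

print(failure)
```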

I've also tried the following from this question (and other alternatives that involve sys.stdout.encoding):

#!/usr/bin/env python
# -*- coding: utf-8 -*-
from __future__ import print_function
import sys
def safeprint(s):
    try:
        print(s)
    except UnicodeEncodeError:
        if sys.version_info >= (3,):
            print(s.encode('utf8').decode(sys.stdout.encoding))
        else:
            print(s.encode('utf8'))
jp = u'リング 山村 貞子'
safeprint(jp)

And things get even more cryptic:

πâ¬πâ│πé░ σ▒▒μ\æ Φ▓₧σ¡É
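The garbled line is consistent with the UTF-8 bytes being reinterpreted through an OEM code page, which is what `s.encode('utf8').decode(sys.stdout.encoding)` does. As a sketch, the first word matches a cp437 decode; using cp437 here is an assumption based on the output above, since the earlier traceback names cp850:

```python
# Encode as UTF-8, then (mis)decode the bytes as cp437: each katakana
# character becomes three unrelated cp437 characters.
word = u'リング'
mojibake = word.encode("utf-8").decode("cp437")

# Reversing the two steps recovers the original text.
recovered = mojibake.encode("cp437").decode("utf-8")
```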

And the docs were not very helpful.

So, what's the deal with Python 3.4, Unicode, different languages and Windows? Almost all the examples I could find deal with Python 2.x.

Is there a general and cross-platform way of printing ANY Unicode character from any language in a decent and non-nasty way in Python 3.4?
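To make the goal concrete, one possible fallback (not a full answer) is to replace characters the output encoding lacks with backslash escapes, which at least avoids the traceback. This is a sketch under stated assumptions: `to_printable` is a hypothetical helper name, and falling back to UTF-8 when `sys.stdout.encoding` is unset is an assumption.

```python
import sys

def to_printable(text, encoding=None):
    """Return text with characters the target encoding lacks shown as escapes."""
    # Assumption: use UTF-8 if stdout does not report an encoding.
    encoding = encoding or getattr(sys.stdout, "encoding", None) or "utf-8"
    # backslashreplace swaps unencodable characters for \uXXXX escapes
    # instead of raising UnicodeEncodeError.
    return text.encode(encoding, errors="backslashreplace").decode(encoding)

print(to_printable(u'Leoš Janáček'))
```

This does not display the characters themselves on a console that cannot render them; it only degrades without raising.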

EDIT:

I've tried typing at the terminal:

chcp 65001

to change the code page, as proposed here and in the comments, and it did not work (including the attempt with sys.stdout.encoding).

  • How would you do the interactive versions? I guess Python is python -i -m run, but I cannot figure out ipython, even though it's stated on win-unicode-console's page that it's integrated. Commented Aug 7, 2015 at 21:48
  • @zsero: the docs show several approaches, e.g., py -i -m run c:\path\to\ipython. You could also use the qtconsole interface or a web-browser-based notebook. If it doesn't work for you, ask a separate question about what you want to do with ipython and what exactly fails. Commented Aug 7, 2015 at 22:14
  • @eryksun: no. Notice that py -mrun is used. Commented Aug 24, 2015 at 6:15
  • @sebastian I guess I solved my issue with your help. Your answer is a bit confusing: as a Python 3.6 user I did not understand whether I should ignore or take into account what you write below it. If that is the case, a kind of "for the previous version:" would make it clearer. Thanks for your patience! Commented Jan 13, 2017 at 20:36
  • Lucida Console doesn't support Chinese or Japanese either. Commented Jan 13, 2017 at 23:38
