I just want to get UTF-8 working. I tried this:

#!/usr/bin/env python
# -*- coding: utf-8 -*- 
t = "одобрение за"
print t

But when I run this program from the command line, output looks like: одобрение за

I've searched up and down the net, tried the whole sys.setdefaultencoding thing, tried calling encode() and decode(), tried placing the little "u" in front, tried unicode(), etc.

I'm about ready to explode from frustration. Is there a definitive answer for what the heck you're supposed to do?

asked May 12, 2013 at 19:08
  • joelonsoftware.com/articles/Unicode.html Commented May 12, 2013 at 19:13
  • What's the encoding of your terminal? Did you check sys.stdout.encoding? The fact that your string is a unicode string does not mean that the terminal is able to display it correctly. Commented May 12, 2013 at 19:30
  • Are you running this from Windows using cmd.exe? Commented May 12, 2013 at 19:50
  • You forgot to add u to "одобрение за". Try this: t = u"одобрение за" Commented May 12, 2013 at 19:56
  • @stalk Is that necessary? Commented May 12, 2013 at 20:00

2 Answers

Your code works for me (tm)

In [1]: t = u"одобрение за"
In [2]: print t
одобрение за

Make sure your terminal supports UTF-8. One way is to check the LANG environment variable:

$ echo $LANG
en_US.UTF-8

Also, try the locale command.
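The same check can be made from inside Python, which is what a comment on the question suggests with sys.stdout.encoding. The following is only a minimal sketch; the function name describe_output_encoding is made up for illustration, and the values it reports depend entirely on your system.

```python
# -*- coding: utf-8 -*-
from __future__ import print_function

import locale
import sys


def describe_output_encoding():
    """Collect the encodings Python reports for this process.

    Hypothetical helper for illustration; not part of any library.
    """
    return {
        # Can be None, e.g. when stdout is redirected under Python 2.
        "stdout": getattr(sys.stdout, "encoding", None),
        # The encoding the locale settings (LANG and friends) ask for.
        "preferred": locale.getpreferredencoding(),
        # The encoding used for file names, shown for comparison.
        "filesystem": sys.getfilesystemencoding(),
    }


if __name__ == "__main__":
    for name, value in sorted(describe_output_encoding().items()):
        print("%s: %s" % (name, value))
```

If "stdout" is not a UTF-8 encoding, the terminal or the locale is the likely place to look, rather than the string itself.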

$LANG/locale only tells you what your system will use when writing to stdout/stderr. The best way to test whether the terminal supports UTF-8 is probably to print something to it and see if it looks correct. Something like this:

echo -e '\xe2\x82\xac' 

You should get a € sign.

If not, try a different shell...
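To see why the wrong characters appear, it can help to reproduce the effect in Python. This is a sketch under one assumption: that the terminal interprets the UTF-8 bytes as Windows-1252 (cp1252). That assumption is consistent with the garbled output shown in the question, but your console may use a different code page. The helper name misdecode is hypothetical.

```python
# -*- coding: utf-8 -*-
from __future__ import print_function, unicode_literals


def misdecode(text, wrong_encoding="cp1252"):
    """Encode text as UTF-8, then decode the bytes with the wrong codec.

    This mimics a terminal that receives UTF-8 bytes but assumes
    another code page (cp1252 here, which is an assumption).
    """
    return text.encode("utf-8").decode(wrong_encoding)


# The same three bytes that the echo test sends to the terminal.
euro_bytes = b"\xe2\x82\xac"

# Decoded with the right codec, they form a single euro sign.
euro = euro_bytes.decode("utf-8")

# Decoded with the wrong codec, every byte becomes its own character.
garbled = misdecode("одобрение за")
```

Here garbled comes out as the same sequence of accented Latin characters and symbols that the question reports, which suggests the bytes were fine and only their interpretation was wrong.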

answered May 12, 2013 at 19:33

Since you are using Windows cmd.exe, you have to follow three steps:

  1. Make sure your console is using the Lucida Console font family (other fonts cannot display UTF-8 properly).

  2. Type chcp 65001 (that's change codepage) and hit enter.

  3. Run your command.

For subsequent runs (once you close the cmd.exe window), you'll have to change the codepage again. The font should be permanent.
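After chcp 65001, Python has to know a codec for that code page, and the LookupError quoted in the comments below suggests that some interpreters do not. Whether cp65001 is recognised depends on the Python version and platform, so the following sketch only checks for it and falls back to UTF-8. The helper names codec_available and pick_encoding are hypothetical.

```python
from __future__ import print_function

import codecs


def codec_available(name):
    """Return True if this interpreter can look up the named codec."""
    try:
        codecs.lookup(name)
    except LookupError:
        return False
    return True


def pick_encoding(preferred, fallback="utf-8"):
    """Use the preferred codec if Python knows it, else the fallback."""
    return preferred if codec_available(preferred) else fallback


if __name__ == "__main__":
    # The result differs between Python versions and platforms.
    print("cp65001 known:", codec_available("cp65001"))
    print("encoding to use:", pick_encoding("cp65001"))
```

If the check prints False, the console code page and the interpreter disagree, and encoding the output explicitly with the fallback is one way to sidestep the error.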

answered May 12, 2013 at 19:53

4 Comments

"LookupError: unknown encoding: cp65001"
Uh, where did you type that? Do you have UTF-8 enabled in languages and settings/regions, or whatever it is called in the control panel? You'll also need to enable "advanced support", which will install other files to provide support for all programs.
I am using W7 so I don't see that option
Hmm, in Windows 7 UTF is enabled by default, but you may need a font. Try opening up PowerShell (should be installed by default), type this: $OutputEncoding = New-Object -typename System.Text.UTF8Encoding, and then run your script again.
