I'm trying to print: Pokémon GO Việt Nam
print u"Pokémon GO Việt Nam"
and I'm getting:
print u"PokÚmon GO Vi?t Nam"
SyntaxError: (unicode error) 'utf8' codec can't decode byte 0xe9 in position 0: unexpected end of data
I've tried:
.encode("utf-8")
.decode("utf-8")
.decode('latin-1').encode("utf-8")
unicode(str.decode("iso-8859-4"))
None of these worked. My Python version is 2.7.9, and the file is saved from Notepad++ with UTF-8 encoding. How can I print it? I run into this kind of issue all the time, so what's the proper way to debug it and get the right encoding?
- What version of python are you using? I printed this using python 3.5 and it worked fine. – Chris, Aug 31, 2016 at 21:30
- Are you typing it or are you getting it from a different source? Copying and pasting from SO yields correct results on both 2.7 and 3.5, on my OS. – Ben Beirut, Aug 31, 2016 at 21:32
- Using Python 3+ works with print as a function. – Andrew Li, Aug 31, 2016 at 21:32
- my python version is 2.7.9 – Brenda Martinez, Aug 31, 2016 at 21:40
3 Answers
#!/usr/bin/python
# -*- coding: utf-8 -*-
print "Pokémon GO Việt Nam"
You can find more info here.
For PyCharm, go to the menu PyCharm --> Preferences, then search for "encoding" to reach the file encoding settings.
u"pokémon" but not "Pokémon GO Việt Nam"
Specify the encoding
#!/usr/bin/python
# -*- coding: utf-8 -*-
at the top of the program.
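The declaration only helps if the file really is saved in the declared encoding. The 0xe9 byte in the question's error is how é is stored in Latin-1/cp1252 rather than UTF-8, which hints that the bytes Python received were not UTF-8. Here is a minimal sketch for checking that. `guess_source_encoding` is a hypothetical helper, and the cp1252 fallback is only an assumption (common on Western European Windows), not real detection:

```python
# -*- coding: utf-8 -*-
from __future__ import print_function


def guess_source_encoding(raw_bytes):
    """Return "utf-8" if raw_bytes is valid UTF-8, else "cp1252" (a guess)."""
    try:
        raw_bytes.decode("utf-8")
    except UnicodeDecodeError:
        # Not valid UTF-8; cp1252 is an assumed fallback, not a detection.
        return "cp1252"
    return "utf-8"


# The same source line saved two ways (bytes written by hand for the demo).
saved_as_utf8 = b'print u"Pok\xc3\xa9mon"'
saved_as_ansi = b'print u"Pok\xe9mon"'

print(guess_source_encoding(saved_as_utf8))
print(guess_source_encoding(saved_as_ansi))

# Byte 0xe9 is "é" in cp1252 but "Ú" in cp850, which would explain
# the "PokÚmon" shown in the question.
print(repr(b"\xe9".decode("cp1252")), repr(b"\xe9".decode("cp850")))
```

For a real script, read the bytes with `open(path, "rb").read()` and pass them to the helper. If it does not report "utf-8", re-save the file as UTF-8 in the editor or change the declaration to match.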
As an alternative you can encode the unicode string:
print u"Pokémon GO Việt Nam".encode('utf-8')
The advantage is that the bytes in the resulting string do not depend on the encoding of the source file, provided that encoding is declared correctly: u"ệ".encode('utf-8') is always the same 3 bytes, "\xe1\xbb\x87".
It is also consistent with what you'd do if you have a unicode string in a variable.
# get text from somewhere...
text = u"Pokémon GO Việt Nam"
# Assumes your terminal expects UTF-8; the Windows console usually uses a different code page.
print text.encode('utf-8')
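If the terminal is not UTF-8, one option is to encode with whatever encoding the terminal reports and replace the characters it cannot represent. This is a rough sketch: `encode_for_terminal` is a hypothetical helper, and the UTF-8 fallback for piped output is an assumption. The text uses `\u` escapes so the example does not depend on how the source file is saved:

```python
# -*- coding: utf-8 -*-
from __future__ import print_function
import sys


def encode_for_terminal(text, encoding=None):
    """Encode text for the terminal, replacing characters it cannot show."""
    # sys.stdout.encoding can be None when output is piped; fall back to UTF-8.
    encoding = encoding or getattr(sys.stdout, "encoding", None) or "utf-8"
    return text.encode(encoding, "replace")


# \u escapes keep the source pure ASCII, whatever the editor's encoding is.
text = u"Pok\u00e9mon GO Vi\u1ec7t Nam"

# repr() shows the exact bytes, which helps when debugging encodings.
print(repr(encode_for_terminal(text, "utf-8")))
print(repr(encode_for_terminal(text, "ascii")))
print(repr(encode_for_terminal(text)))  # uses the terminal's own encoding
```

On Python 2.7 the result can be printed directly with `print encode_for_terminal(text)`. Characters the console code page lacks come out as `?` instead of raising an error.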