I'm trying to run u'\xe1'.decode("utf-8") in Python 2.7 and I get this error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/encodings/utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe1' in position 0: ordinal not in range(128)

Why does the error mention the ascii codec when I'm passing utf-8 as the first argument? Also, is there any way I can get the character á from u'\xe1' and save it in a string?

asked Nov 20, 2014 at 22:56
  • what exactly are you trying to do? Commented Nov 20, 2014 at 23:07
  • The python script I'm running takes text, processes it, and prints a JSON string containing a categorized version of the original text. What I've noticed is that characters like this sometimes end up as their unicode values in the printed JSON string. Commented Nov 20, 2014 at 23:11
  • when you print your string you will see á Commented Nov 20, 2014 at 23:15
  • So I was able to solve the problem. Thank you guys for the help. But I'm still confused about why the error says it's an ascii encoding problem when I'm using utf-8. Commented Nov 21, 2014 at 22:15

1 Answer

decode takes a byte string (str in Python 2) and converts it to unicode (e.g. "\xc3\xa1".decode("utf8") ==> u"\xe1")

encode takes unicode and converts it to a byte string (e.g. u"\xe1".encode("utf8") ==> "\xc3\xa1")

Neither has much to do with how a string is rendered; they mostly concern its internal representation.
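
As for why the error names ascii: in Python 2, decode is meant for byte strings, so calling it on a unicode object makes the interpreter first encode that object to bytes with the default codec (ascii), and that hidden step is what fails. Below is a minimal sketch of the two steps, written for Python 3, where they have to be spelled out by hand. The helper names py2_style_decode and try_decode are hypothetical and only illustrate the behaviour.

```python
def py2_style_decode(text, encoding="utf-8"):
    """Mimic Python 2's unicode.decode(): implicit ASCII encode, then decode."""
    # Step 1 (implicit in Python 2): text -> bytes using the default 'ascii' codec.
    raw = text.encode("ascii")
    # Step 2 (what the caller asked for): bytes -> text using the requested codec.
    return raw.decode(encoding)


def try_decode(text):
    """Return the decoded text, or the error message if the hidden step fails."""
    try:
        return py2_style_decode(text)
    except UnicodeEncodeError as exc:
        return "UnicodeEncodeError: %s" % exc


print(try_decode("abc"))   # pure ASCII survives both steps
print(try_decode("\xe1"))  # fails in step 1, before utf-8 is ever used
```

The second call reports an 'ascii' codec error even though utf-8 was requested, which matches the traceback in the question.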

try

print u"\xe1"

(your terminal needs to support Unicode; IDLE will work, a DOS terminal not so much)

>>> print u"\xe1"
á
>>> print repr(u"\xe1".encode("utf8"))
'\xc3\xa1'
>>> print repr("\xc3\xa1".decode("utf8"))
u'\xe1'
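
Regarding the JSON output mentioned in the comments: the json module escapes non-ASCII characters by default, which may be why á shows up as an escape sequence in the printed string. A small sketch in Python 3 syntax, assuming the script builds its output with json.dumps; the record dictionary is a made-up payload.

```python
import json

record = {"text": "\xe1"}  # hypothetical payload containing the character á

# Default: non-ASCII characters are written as \uXXXX escapes.
escaped = json.dumps(record)

# ensure_ascii=False keeps the character itself in the output string.
readable = json.dumps(record, ensure_ascii=False)

print(escaped)                   # {"text": "\u00e1"}
print(readable.encode("utf-8"))  # b'{"text": "\xc3\xa1"}'
```

Both forms parse back to the same data with json.loads, so the escaped version is still valid JSON; ensure_ascii=False only changes how the text looks.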
answered Nov 20, 2014 at 23:05

3 Comments

hey can we do this too??. >>> chr(ord("\xe1")) 'á'
The rule given in the answer is mostly true for 2.x, perhaps always for 3.x. The example output is for 2.x, slightly different in 3.x.
in python 2 it is >>> unichr(ord("\xe1")) 'á' @Hackaholic
