
Why does this work:

a = 'a'.encode('utf-8')
print unicode(a)
>>> u'a'

But this gives me an error:

b = 'b'.encode('utf-8_sig')
print unicode(b)

Saying:
>>> UnicodeDecodeError: 'ascii' codec can't decode byte 0xef in position 0: ordinal not in range(128)

asked Mar 18, 2014 at 12:13

2 Answers

Because you haven't told unicode what encoding to use:

>>> a = 'a'.encode('utf-8')
>>> print unicode(a)
a
>>> b = 'b'.encode('utf-8_sig')
>>> print unicode(b)
Traceback (most recent call last):
 File "<pyshell#3>", line 1, in <module>
 print unicode(b)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xef in position 0: ordinal not in range(128)
>>> print unicode(b, 'utf-8_sig')
b
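As a side note on codec choice, here is a minimal sketch (written so it runs on both Python 2 and 3, using u'' literals and .decode() in place of unicode()) of how the codec name changes what happens to the BOM. The variable names are illustrative only.

```python
# Bytes as produced by the 'utf-8-sig' codec: a BOM followed by the letter b.
data = u'b'.encode('utf-8-sig')

# 'utf-8-sig' strips a leading BOM while decoding.
without_bom = data.decode('utf-8-sig')

# Plain 'utf-8' also decodes the bytes, but keeps the BOM as U+FEFF.
with_bom = data.decode('utf-8')

print(repr(without_bom))
print(repr(with_bom))
```

So plain 'utf-8' would avoid the exception too, but the result would carry an invisible U+FEFF character in front of the text.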
answered Mar 18, 2014 at 12:19

2 Comments

Does unicode use 'utf-8' by default? If not, why don't I get an error in the first case?
'a'.encode('utf-8') is just 'a', so you don't need to tell unicode how to deal with it, whereas 'b'.encode('utf-8_sig') is '\xef\xbb\xbfb'

The message 'ascii' codec can't decode byte 0xef says two things:

  1. unicode(b) uses the ascii character encoding (the value of sys.getdefaultencoding())
  2. the \xef byte is outside the ascii range. It is the first byte of the BOM that the 'utf-8-sig' encoding prepends (commonly seen in files produced on Windows)
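Both points can be checked directly. The sketch below assumes that unicode(b) without an encoding argument behaves like b.decode('ascii') on a stock Python 2, and spells out that decode so the snippet also runs on Python 3.

```python
import codecs

# The UTF-8 byte order mark is the three bytes EF BB BF.
bom = codecs.BOM_UTF8

# 'utf-8-sig' prepends the BOM to the encoded text.
encoded = u'b'.encode('utf-8-sig')

# Decoding with the ascii codec fails on the very first byte (0xef).
try:
    encoded.decode('ascii')
    ascii_error = None
except UnicodeDecodeError as exc:
    ascii_error = exc

print(repr(encoded))
print(ascii_error)
```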

The first example works because the bytestring 'a' is pure ascii. In Python 2, 'a'.encode('utf-8') is equivalent to 'a'.decode(sys.getdefaultencoding()).encode('utf-8'), and in this case the result is equal to 'a' itself.
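The implicit decode step can be written out explicitly. In this sketch, encode_bytestring is a hypothetical helper (not a standard function), and its implicit parameter stands in for sys.getdefaultencoding(), assumed here to be 'ascii' as on a default Python 2 install.

```python
def encode_bytestring(data, target, implicit='ascii'):
    """Spell out what Python 2 does for bytestring.encode(target):
    first an implicit decode, then the requested encode."""
    return data.decode(implicit).encode(target)

# Pure-ascii bytes survive the round trip unchanged.
same = encode_bytestring(b'a', 'utf-8')

# Non-ascii bytes fail in the hidden decode step, before any encoding happens.
try:
    encode_bytestring(b'\xc3\xa9', 'utf-8')
    failed = False
except UnicodeDecodeError:
    failed = True

print(repr(same))
print(failed)
```

This is why calling .encode() on a bytestring can raise a UnicodeDecodeError, which looks surprising at first.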

In general, bytestring.decode(character_encoding) returns a Unicode string, and unicode_string.encode(character_encoding) returns a bytestring. A bytestring is a sequence of bytes. A Unicode string is a sequence of Unicode codepoints.
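A short sketch of that round trip, using an example character chosen for illustration (U+00E9, e with acute accent). It shows that one codepoint can map to different byte sequences depending on the encoding.

```python
# One Unicode codepoint: U+00E9.
text = u'\xe9'

# The same codepoint becomes different bytes under different encodings.
utf8_bytes = text.encode('utf-8')      # two bytes
latin1_bytes = text.encode('latin-1')  # one byte

# Decoding with the matching encoding restores the original Unicode string.
round_trip = utf8_bytes.decode('utf-8')

print(len(text), len(utf8_bytes), len(latin1_bytes))
print(round_trip == text)
```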

Do not call .encode() on bytestrings. 'a' is a bytestring literal in Python 2. u'a' is a Unicode literal.

answered Mar 18, 2014 at 14:06
