0

What is the correct way to convert '\xbb' into a unicode string? I have tried the following and only get UnicodeDecodeError:

unicode('\xbb', 'utf-8')
'\xbb'.decode('utf-8')
asked Mar 21, 2011 at 21:42
1
  • It is part of a file that someone pasted from Word (so it's a str). If you type print u'\xbb' you get the double arrow (») character. Commented Mar 21, 2011 at 21:50

3 Answers

8

Since it comes from Word it's probably CP1252.

>>> print '\xbb'.decode('cp1252')
»
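
To illustrate why the UTF-8 attempts in the question fail while CP1252 succeeds, here is a minimal sketch. It assumes the data really is CP1252, which is only a guess based on the Word origin. The bytes literal b'\xbb' is used so the snippet runs on both Python 2.7 and Python 3; the variable names are made up for the example.

```python
raw = b'\xbb'  # the single byte from the question

# 0xBB is a UTF-8 continuation byte, so on its own it is not valid UTF-8.
utf8_error = None
try:
    raw.decode('utf-8')
except UnicodeDecodeError as exc:
    utf8_error = exc

# CP1252 maps every byte it defines to one character; 0xBB becomes U+00BB.
text = raw.decode('cp1252')
```

Here text holds the one-character unicode string for the right-pointing double angle quotation mark, and utf8_error holds the same kind of exception the question reports.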
answered Mar 21, 2011 at 21:57

1

It looks like it is Latin-1 encoded. You should use:

unicode('\xbb', 'Latin-1')
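
For this particular byte, Latin-1 and CP1252 give the same result, so either codec works. They differ in the 0x80 to 0x9F range, which is where Word's "smart" punctuation usually ends up. The sketch below shows the difference using a made-up sample string; it uses bytes literals so it runs on both Python 2.7 and Python 3.

```python
# 0xBB decodes to the same character under both codecs.
same = b'\xbb'.decode('latin-1') == b'\xbb'.decode('cp1252')

# Hypothetical sample: curly quotes as CP1252 stores them (0x93 and 0x94).
smart = b'\x93quoted\x94'

as_cp1252 = smart.decode('cp1252')   # real curly quotes, U+201C and U+201D
as_latin1 = smart.decode('latin-1')  # C1 control characters U+0093 and U+0094
```

If the rest of the pasted file contains curly quotes, dashes or ellipses, decoding as Latin-1 would succeed without raising an error but produce control characters, so CP1252 is probably the safer choice for text that came from Word.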

answered Mar 21, 2011 at 21:56


0

I'm not sure what you are trying to do, but in Python 3 all strings are Unicode by default. In Python 2.x you have to write u'my unicode string \xbb' (or the double- or triple-quoted equivalent) to get a unicode string. When you want to print unicode strings, you have to encode them in a character set that the output device (e.g. the terminal) supports, for instance u'my unicode string \xbb'.encode('iso-8859-1').
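
As a rough illustration of the encoding step, the sketch below encodes the same unicode string for two different targets. Which encoding is right depends on what the terminal or file expects, so both choices here are assumptions. The u'' prefix is accepted by Python 2.7 and by Python 3.3 or later.

```python
text = u'my unicode string \xbb'

# ISO-8859-1 (Latin-1) stores U+00BB as the single byte 0xBB.
latin1_bytes = text.encode('iso-8859-1')

# UTF-8 stores the same character as the two bytes 0xC2 0xBB.
utf8_bytes = text.encode('utf-8')
```

The two results differ only in the final character: one byte in ISO-8859-1, two bytes in UTF-8.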

Bill the Lizard
answered Mar 21, 2011 at 22:00
