Python not recognizing unicode

I'm trying to write a script that converts Japanese katakana to romaji (for example, "シ" to "shi"). Here's what I have:

x = u''
x = raw_input('Enter katakana: ')
x = x.replace(u'\u30B7', u'shi')

Enter katakana: シ
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe3 in position 0: ordinal not in range(128)

As long as the character is written in my script as the escape u'\u30B7' rather than the literal シ, Python should be able to handle it, right?

edited Nov 26, 2012 at 4:22 by dda

asked Nov 25, 2012 at 23:12 by tkbx

1 Answer

In Python 2, raw_input returns a byte string, and its encoding depends on the terminal in use. Decode the input to Unicode explicitly before working with it:

import sys
x = raw_input('Enter katakana: ').decode(sys.stdin.encoding)
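
One caveat, offered as a sketch rather than part of the original answer: in Python 2, sys.stdin.encoding can be None when input is piped or redirected instead of typed at a terminal, and .decode(None) then fails. The helper below, decode_terminal_bytes, is a hypothetical name, and its UTF-8 fallback is an assumption that may not match your environment.

```python
def decode_terminal_bytes(raw, encoding=None, fallback="utf-8"):
    """Return Unicode text for a value read from the terminal.

    Hypothetical helper: `fallback` is an assumed default that is used
    only when no encoding is reported (encoding is None or empty).
    """
    if not isinstance(raw, bytes):
        # Already Unicode text (for example, input() on Python 3).
        return raw
    return raw.decode(encoding or fallback)

# Intended Python 2 usage (left as a comment so the block runs anywhere):
#   import sys
#   x = decode_terminal_bytes(raw_input('Enter katakana: '), sys.stdin.encoding)
```

With this, the replace call from the question receives Unicode on both sides and no implicit conversion takes place.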

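The error comes from replace: because its arguments are Unicode, Python 2 implicitly converts the byte string x to Unicode using the default ascii codec, and that conversion fails on the first non-ASCII byte (0xe3). How the character is written in your script does not matter here; the problem is the type of x.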
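
The behaviour described above can be reproduced without a terminal. This is a minimal sketch that assumes the terminal sends UTF-8; the variable names are illustrative, and the explicit decode('ascii') call stands in for the implicit conversion Python 2 performs inside replace.

```python
# Bytes a UTF-8 terminal would hand to raw_input for U+30B7 (katakana "shi").
raw = u'\u30B7'.encode('utf-8')  # three bytes: 0xe3 0x82 0xb7

# Decoding those bytes as ASCII fails on the first byte, 0xe3.
try:
    raw.decode('ascii')
    ascii_failed = False
except UnicodeDecodeError:
    ascii_failed = True

# Decoding with the encoding actually used yields Unicode text,
# and replace then works as the question intended.
text = raw.decode('utf-8')
romaji = text.replace(u'\u30B7', u'shi')
```

Here ascii_failed ends up True and romaji holds the string shi, which matches the byte value and position reported in the traceback.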

answered Nov 25, 2012 at 23:16 by Mark Tolonen
