Python - Unicode

A simple script is not behaving as I expected.

notAllowed = {"â":"a", "à":"a", "é":"e", "è":"e", "ê":"e",
              "î":"i", "ô":"o", "ç":"c", "û":"u"}
word = "dôzerté"
print word
for char in word:
    if char in notAllowed.keys():
        print "hooray"
        word = word.replace(char, notAllowed[char])
print word
print "finished"

The output is the word unchanged, even though the loop should have replaced "ô" and "é" with "o" and "e", giving dozerte.

Any ideas?
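
A likely explanation, assuming Python 2.7 and a source file saved as UTF-8: the unprefixed literals are byte strings, so the loop walks over single bytes and never sees a whole "ô" or "é" to look up. A minimal sketch of the same logic with unicode literals is below. The `strip_accents` wrapper and the `result` name are only there for illustration and are not part of the original script; the `print_function` import is an assumption that lets the snippet run on Python 3 as well.

```python
# -*- coding: utf-8 -*-
from __future__ import print_function

# u"..." literals make every accented letter a single character,
# so the dictionary keys and the loop variable can actually match.
notAllowed = {u"â": u"a", u"à": u"a", u"é": u"e", u"è": u"e", u"ê": u"e",
              u"î": u"i", u"ô": u"o", u"ç": u"c", u"û": u"u"}


def strip_accents(word):
    """Hypothetical helper: replace each character listed in notAllowed."""
    for char in word:
        if char in notAllowed:  # no need for .keys() in a membership test
            word = word.replace(char, notAllowed[char])
    return word


result = strip_accents(u"dôzerté")
print(result)  # dozerte
```

This only covers the characters listed in the dictionary; anything else passes through unchanged.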

  • It might have (not very familiar with Py3), but I tried that in 2.7 and after adding unicode marks it worked for me :) Commented Mar 8, 2012 at 14:26
  • Thanks kgr. Your fix worked great! :) edit: sorry, it's Python 2.7 Commented Mar 8, 2012 at 14:39
  • @Joey: Python 3 still has byte strings and character strings same as Python 2. There is nothing wrong with byte strings per se; you still need them in many scenarios where you are dealing with binary data and non-Unicode interfaces. All Python 3 changed is that (a) it made the unprefixed-string-literal syntax refer to char strings instead of byte strings, and (b) used char strings for several interfaces that were previously bytes but work equally well as bytes or chars. Commented Mar 9, 2012 at 21:56
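
The byte string versus character string distinction from the comments can be seen directly. This is a small sketch, assuming UTF-8 as the encoding; the variable names are made up for the illustration, and the word is written with escapes so the snippet does not depend on how the file itself is saved.

```python
# -*- coding: utf-8 -*-
from __future__ import print_function

text = u"d\u00f4zert\u00e9"      # character string: "dôzerté"
data = text.encode("utf-8")      # byte string holding the same word

print(len(text))  # 7 characters
print(len(data))  # 9 bytes: "ô" and "é" take two bytes each in UTF-8

# Walking the byte string one element at a time never yields a whole "ô",
# which is why a per-element dictionary lookup finds nothing.
o_circ = u"\u00f4".encode("utf-8")
single_matches = [i for i in range(len(data)) if data[i:i + 1] == o_circ]
print(single_matches)  # []

# Decoding turns the bytes back into characters.
print(data.decode("utf-8") == text)  # True
```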
