I am trying to decode the strings in a list of strings. For example, I have 'caf\\xc3\\xab' and I want it to become 'cafë' (the bytes \xc3\xab are the UTF-8 encoding of 'ë').
I tried a few things but ran into problems.
When I do:
for i in range(len(words)):
    words[i] = words[i].decode("utf8")
The strings still need to be converted to bytes first, but how do I do that?
Also, when I write it as a bytes literal like this, I have to remove the double backslashes for it to work:
b'caf\\xc3\\xab'.decode("utf8")
1 Answer
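Suppose you have a string as follows:
bef = 'caf\\xc3\\xab'
To convert it to 'cafë', you can do the following:
aft = bef.encode().decode('unicode-escape').encode('latin1').decode('utf-8')
Here encode() turns the string into bytes, decode('unicode-escape') interprets each literal \xNN sequence as one character, encode('latin1') maps those characters back to single bytes, and decode('utf-8') reads the resulting bytes as UTF-8.
Then print(aft) should show 'cafë'.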
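Since the question is about a list of strings, the same chain can be wrapped in a small helper and applied to every element. This is only a sketch: the helper name `unescape_utf8` and the sample list are made up for illustration, and it assumes every escaped sequence in the input is valid UTF-8.

```python
def unescape_utf8(text):
    # Interpret literal "\xNN" sequences as single characters (code points 0-255).
    raw = text.encode("utf-8").decode("unicode_escape")
    # Map those characters back to single bytes, then read the bytes as UTF-8.
    return raw.encode("latin-1").decode("utf-8")


# Hypothetical sample data: the backslashes are literal characters in the strings.
words = ["caf\\xc3\\xab", "na\\xc3\\xafve", "plain"]

# Build a new list rather than decoding in place.
decoded = [unescape_utf8(w) for w in words]
print(decoded)
```

Strings without any escape sequences pass through unchanged, so the helper can be applied to a mixed list.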
Nikos Hidalgo
Comments
words.decode() is not an in-place operation; you need to capture the return value: word = word.decode("utf8"). (Further note: this will only change the value of the loop variable word, but not the elements in words.)
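The point in this comment can be seen in a short sketch. It assumes the list already holds bytes objects (the sample values are made up), because in Python 3 only bytes have a decode() method.

```python
# Hypothetical sample data: real bytes objects, not escaped strings.
words = [b"caf\xc3\xab", b"na\xc3\xafve"]

for word in words:
    # Rebinds the loop variable only; the list elements are untouched.
    word = word.decode("utf8")

unchanged = list(words)  # still bytes objects

# To actually replace the elements, build a new list (or assign by index).
words = [word.decode("utf8") for word in words]
print(unchanged, words)
```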