I'm trying to figure out how to use the Unicode support in Python. I would like to convert this string to Unicode: "ABCDE" --> "\x00A\x00B\x00C\x00D\x00E"
Can any built-in functionality do that, or should I use join()?
Thanks!
3 Answers
That's UTF-16BE, not Unicode.
>>> 'ABCDE'.decode('ascii').encode('utf-16be')
'\x00A\x00B\x00C\x00D\x00E'
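For comparison, here is a minimal sketch of the same conversion assuming Python 3, where str is already Unicode text and the .decode('ascii') step does not apply. It also spells out the join() approach mentioned in the question; the variable names are only illustrative.

```python
text = u"ABCDE"  # assumption: Python 3, where a str literal is Unicode text

# Encode to big-endian UTF-16 without a byte-order mark.
encoded = text.encode("utf-16-be")

# The manual join() idea from the question: prefix each character with a
# zero byte. This only matches UTF-16BE for characters below U+0100.
manual = b"".join(b"\x00" + ch.encode("latin-1") for ch in text)

print(encoded)
print(encoded == manual)
```

The built-in codec is the safer choice, since the manual version breaks as soon as the text contains a character outside Latin-1.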
2 Comments
Why the .decode('ascii') bit? It's implied by the .encode('utf-16be'): '\xff'.encode('utf-16be') will fail with a UnicodeDecodeError from the ascii codec, the same way '\xff'.decode('ascii') will.
The key to understanding unicode in Python is that unicode means UNICODE. A unicode object is an idealized representation of the characters, not actual bytes.
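One way to see the difference between characters and bytes is to encode the same text with two codecs. This is only a sketch; the sample string is an arbitrary choice for illustration.

```python
text = u"A\u00e9"  # two characters: 'A' and 'e' with an acute accent

# The same characters produce different byte sequences per encoding.
utf8_bytes = text.encode("utf-8")
utf16_bytes = text.encode("utf-16-be")

print(len(text), len(utf8_bytes), len(utf16_bytes))

# Decoding each byte sequence with its own codec recovers the same text.
print(utf8_bytes.decode("utf-8") == utf16_bytes.decode("utf-16-be"))
```

The text has one length in characters, while each encoding has its own length in bytes.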
The str object should first be converted to a unicode object with the decode method. Then convert the unicode object back to a str object using the encode method with the character encoding you want.
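The decode-then-encode steps can be wrapped in a small helper. This is a sketch, not a standard library function: the name transcode and its parameters are hypothetical, and it is written so the byte string is explicit (b"..."), which also works on Python 3.

```python
def transcode(data, source_encoding, target_encoding):
    """Hypothetical helper: re-encode a byte string from one encoding to another."""
    # Step 1: bytes -> Unicode text, using the encoding the bytes are in.
    text = data.decode(source_encoding)
    # Step 2: Unicode text -> bytes, using the encoding you want.
    return text.encode(target_encoding)


result = transcode(b"ABCDE", "ascii", "utf-16-be")
print(result)
```

Naming the source encoding explicitly means bad input fails at the decode step with a UnicodeDecodeError, rather than relying on an implicit ascii conversion.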