I am trying to write some data to a file. In some instances, depending on the data, I get a UnicodeEncodeError (UnicodeEncodeError: 'charmap' codec can't encode character '\U0001f622' in position 141: character maps to <undefined>). I did some research and found out that I can encode the data I am writing with the encode function.
This is the code prior to modifying it (not supporting Unicode):
scriptDir = os.path.dirname(__file__)
path = os.path.join(scriptDir, filename)
with open(path, 'w') as fp:
    for sentence in iobTriplets:
        fp.write("\n".join("{} {} {}".format(triplet[0],triplet[1],triplet[2]) for triplet in sentence))
        fp.write("\n")
        fp.write("\n")
So I thought maybe I could just add the encoding when writing, like this:
fp.write("\n".join("{} {} {}".format(triplet[0],triplet[1],triplet[2]).encode('utf8') for triplet in sentence))
But that doesn't work as I am getting the following error: TypeError: sequence item 0: expected str instance, bytes found
I also tried opening the file in byte mode by adding a b after the w. However, that didn't yield any results.
Does anybody know how to fix this? Btw: I am using Python 3.
1 Answer
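You have already opened the file in text mode, which encodes every str you write automatically. Because no encoding was passed to open(), the platform default was used, and that codec cannot represent '\U0001f622'; hence the 'charmap' error. There is no need to encode anything manually unless you open the file in binary mode.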
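To see where the error comes from, here is a minimal sketch. It assumes cp1252 as a stand-in for the asker's default code page (the question does not say which one is in use), and the variable names are only illustrative.

```python
import locale

# Encoding that open() uses in text mode when no encoding argument is given.
# The value is platform-dependent (often cp1252 on Windows, UTF-8 elsewhere).
default_encoding = locale.getpreferredencoding(False)
print(default_encoding)

crying_face = "\U0001f622"

# Assumption: cp1252 stands in for the asker's default code page.
# It has no mapping for the emoji, so encoding fails just like fp.write() did.
try:
    crying_face.encode("cp1252")
    cp1252_failed = False
except UnicodeEncodeError as exc:
    cp1252_failed = True
    print(exc)

# UTF-8 can represent every Unicode character, including this one.
utf8_bytes = crying_face.encode("utf-8")
print(utf8_bytes)
```

If the first print already shows a UTF-8 encoding on your machine, the original code would likely not have failed there, which is why the error only shows up on some systems.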
You can specify any supported encoding in open():
with open(path, 'w', encoding='utf-16be') as fp:
Unless the file is opened as binary, you need to remove the str.encode() in the fp.write():
fp.write("\n".join("{} {} {}".format(triplet[0],triplet[1],triplet[2]) for triplet in sentence))
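Put together, the text-mode version might look like the sketch below. The sample triplets, the write_triplets helper, and the temporary file name are made up for illustration; utf-8 is used here, but any encoding that can represent your data should work the same way.

```python
import os
import tempfile

# Hypothetical sample data: each sentence is a list of 3-item triplets.
iob_triplets = [
    [("I", "PRP", "O"), ("am", "VBP", "O"), ("sad", "JJ", "O"), ("\U0001f622", "SYM", "O")],
    [("Berlin", "NNP", "B-LOC")],
]


def write_triplets(path, sentences):
    """Write one triplet per line, with a blank line after each sentence."""
    # Text mode plus an explicit encoding: write() takes str and encodes it.
    with open(path, "w", encoding="utf-8") as fp:
        for sentence in sentences:
            fp.write("\n".join("{} {} {}".format(*triplet) for triplet in sentence))
            fp.write("\n\n")


with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "triplets_demo.txt")
    write_triplets(path, iob_triplets)
    # Read back with the same encoding to check the round trip.
    with open(path, "r", encoding="utf-8") as fp:
        written = fp.read()

print(written)
```

Note that no .encode() call appears anywhere: the strings stay str all the way to fp.write(), and open() does the encoding.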
There is no need for encode('utf-8'); the problem is the default encoding used by open, maybe something like ASCII. So, try using open(path, 'w', encoding='utf-8') instead of encode().
I am curious as to why opening the file in binary mode isn't working. What does "I also tried opening the file in byte mode with adding a b behind the w. However that didn't yield any results." mean exactly?
Use b'\n'.join(...) if you are going to be joining bytes. That is likely the source of your error, but then you will have to use binary mode when opening the file.
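For completeness, the binary-mode route from the comments could be sketched as follows. The sample sentence and file name are hypothetical; the point is only that once the items are bytes, the separator and everything else passed to write() must be bytes too.

```python
import os
import tempfile

# Hypothetical sample sentence of 3-item triplets.
sentence = [("sad", "JJ", "O"), ("\U0001f622", "SYM", "O")]

path = os.path.join(tempfile.mkdtemp(), "triplets_bytes_demo.txt")

# Binary mode: write() accepts only bytes, so encode each line yourself
# and join with a bytes separator (b"\n"), not a str one ("\n").
with open(path, "wb") as fp:
    encoded_lines = ("{} {} {}".format(*t).encode("utf-8") for t in sentence)
    fp.write(b"\n".join(encoded_lines))
    fp.write(b"\n\n")

with open(path, "rb") as fp:
    raw = fp.read()

print(raw.decode("utf-8"))
```

Mixing the two, as in "\n".join(... .encode('utf8') ...), is what produces "expected str instance, bytes found". Text mode with encoding='utf-8' is usually the simpler choice.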