I am trying to convert a huge CSV file from UTF-16 to UTF-8 using Python. This is my code:

with open(r'D:\_apps\aaa\output\srcfile', 'rb') as source_file:
    with open(r'D:\_apps\aaa\output\destfile', 'w+b') as dest_file:
        contents = source_file.read()
        dest_file.write(contents.decode('utf-16').encode('utf-8'))

But this code reads the whole file into memory at once and fails with a MemoryError. Please help me with an alternative method.

asked Mar 17, 2022 at 6:51
  • Split the file? Perhaps it would help to specify the encoding when opening the files? Then, if possible, you could perhaps stream directly from one file to the other. Commented Mar 17, 2022 at 6:57

1 Answer

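One option is to convert the file in fixed-size chunks of bytes. An incremental decoder is needed here, because a chunk boundary can fall in the middle of a UTF-16 code unit and only the start of the file carries the BOM:

import codecs

decoder = codecs.getincrementaldecoder('utf-16')()
with open(r'D:\_apps\aaa\output\srcfile', 'rb') as source_file, \
        open(r'D:\_apps\aaa\output\destfile', 'wb') as dest_file:
    # read 1 MiB at a time; iter() stops when read() returns b''
    for chunk in iter(lambda: source_file.read(1024 * 1024), b''):
        dest_file.write(decoder.decode(chunk).encode('utf-8'))
    dest_file.write(decoder.decode(b'', final=True).encode('utf-8'))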
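
A caveat when handling UTF-16 as raw bytes: any split, whether at a line break or at a fixed chunk size, can land in the middle of a two-byte code unit, and only the start of the file carries the BOM. The following is a minimal sketch of that effect; the sample data and the helper names `split_like_binary_file` and `decode_pieces` are made up for illustration. It shows that decoding a piece on its own fails, while `codecs.getincrementaldecoder` carries leftover bytes from one piece to the next:

```python
import codecs
import io


def split_like_binary_file(data):
    """Split bytes the way iterating a binary-mode file does: after each 0x0A byte."""
    return list(io.BytesIO(data))


def decode_pieces(pieces, encoding="utf-16"):
    """Decode byte pieces with one shared incremental decoder."""
    decoder = codecs.getincrementaldecoder(encoding)()
    out = [decoder.decode(piece) for piece in pieces]
    out.append(decoder.decode(b"", final=True))  # flush; raises if bytes are left over
    return "".join(out)


# Hypothetical sample: little-endian UTF-16 with a BOM, two short CSV rows.
sample = codecs.BOM_UTF16_LE + "a,b\nc,d\n".encode("utf-16-le")
pieces = split_like_binary_file(sample)

# The first piece ends after the 0x0A byte, so it has an odd number of bytes.
try:
    pieces[0].decode("utf-16")
    naive_ok = True
except UnicodeDecodeError:
    naive_ok = False

text = decode_pieces(pieces)
```

In this sketch the newline `0A 00` is cut after `0A`, so the trailing `00` ends up at the start of the next piece. That is why independent decoding breaks and a stateful decoder does not.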

Or you could open both files in text mode with the desired encodings and let Python do the decoding and encoding as it streams line by line:

with open(r'D:\_apps\aaa\output\srcfile', 'r', encoding='utf-16') as source_file, \
        open(r'D:\_apps\aaa\output\destfile', 'w+', encoding='utf-8') as dest_file:
    for line in source_file:
        dest_file.write(line)
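
If the file might contain very long lines, or the original line endings should be preserved byte for byte, the text-mode approach can also be written with `shutil.copyfileobj`, which copies in fixed-size reads. This is a sketch rather than a definitive implementation: the function name `convert_utf16_to_utf8` and the `chunk_chars` default are assumptions, and `newline=''` is used on both sides to turn off newline translation.

```python
import shutil


def convert_utf16_to_utf8(src_path, dest_path, chunk_chars=1024 * 1024):
    """Stream a UTF-16 text file into a UTF-8 file without loading it all at once."""
    # newline='' keeps '\r\n' and '\n' exactly as they appear in the source.
    with open(src_path, "r", encoding="utf-16", newline="") as src, \
            open(dest_path, "w", encoding="utf-8", newline="") as dst:
        # In text mode, copyfileobj reads up to chunk_chars characters per call.
        shutil.copyfileobj(src, dst, chunk_chars)
```

Usage would look like `convert_utf16_to_utf8(r'D:\_apps\aaa\output\srcfile', r'D:\_apps\aaa\output\destfile')`. Memory use is bounded by the chunk size rather than by the length of the longest line.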
answered Mar 17, 2022 at 6:57

3 Comments

glad to hear! happy pythoning!
In the second solution, you have to use modes without "b", i.e. 'r' and 'w+' as the second arguments to open().
@lenz Yes, you are right. I did the same.
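
The point about modes raised in the comments can be checked directly: `open()` rejects an `encoding` argument when the mode contains "b". A small sketch, where the helper name `try_binary_with_encoding` is made up for illustration:

```python
def try_binary_with_encoding(path):
    """Return the error message open() gives for binary mode plus an encoding."""
    try:
        with open(path, "rb", encoding="utf-16"):
            return None  # not expected: binary mode does not accept an encoding
    except ValueError as exc:
        return str(exc)
```

The check happens before the file is opened, so the function returns the `ValueError` message instead of a file object.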
