Writing Unicode text to a text file? [duplicate]

I'm pulling data out of a Google Doc, processing it, and writing it to a file (which I will eventually paste into a WordPress page).

The data contains some non-ASCII symbols. How can I safely convert these to symbols that can be used in HTML source?

Currently I'm converting everything to Unicode on the way in, joining it all together in a Python string, then doing:

import codecs
f = codecs.open('out.txt', mode="w", encoding="iso-8859-1")
f.write(all_html.encode("iso-8859-1", "replace"))

There is an encoding error on the last line:

UnicodeDecodeError: 'ascii' codec can't decode byte 0xa0 in position 12286: ordinal not in range(128)
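
A note on why this happens: a file opened with `codecs.open(..., encoding=...)` does the encoding itself, so its `write` expects text. Given bytes that were already encoded, Python 2 first tries to turn them back into text with the default ASCII codec, which fails on the first byte above 127. The following is a minimal sketch of the usual way around it, not a definitive fix: pass the text itself and let the file object encode it. `io.open` exists in Python 2.6+ and Python 3; the `write_text` helper, the sample string, and the temporary path are hypothetical stand-ins, and UTF-8 is assumed because ISO-8859-1 cannot represent characters such as a curly apostrophe.

```python
import io
import os
import tempfile


def write_text(path, text, encoding="utf-8"):
    """Write already-decoded text; the file object performs the encoding."""
    with io.open(path, mode="w", encoding=encoding) as f:
        f.write(text)  # pass the text itself, not text.encode(...)


# Hypothetical sample standing in for all_html
all_html = u"Qur\u2019an<br/>caf\u00e9"

# Temporary location used only for this illustration
out_path = os.path.join(tempfile.mkdtemp(), "out.txt")
write_text(out_path, all_html)

# Read the raw bytes back to see what actually landed on disk
with io.open(out_path, mode="rb") as f:
    raw_bytes = f.read()
```

Here `raw_bytes` holds the UTF-8 encoding of `all_html`, produced by a single encoding step instead of two.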

Partial solution:

This Python code runs without an error:

row = [unicode(x.strip()) if x is not None else u'' for x in row]
all_html = row[0] + "<br/>" + row[1]
f = open('out.txt', 'w')
f.write(all_html.encode("utf-8"))

But then if I open the actual text file, I see lots of symbols like:

Qur’an 
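
Stray symbols of this kind usually mean the file is correct but the viewer is decoding it with the wrong encoding. As a small illustration, assuming the viewer treats the file as Windows-1252 (an assumption, since the viewer is not named), the three UTF-8 bytes of a curly apostrophe are shown as three separate characters. The sample string is hypothetical.

```python
# Hypothetical sample text containing a right single quotation mark (U+2019)
text = u"Qur\u2019an"

# What the partial solution writes to disk
utf8_bytes = text.encode("utf-8")

# What a viewer sees if it assumes Windows-1252 instead of UTF-8:
# the bytes E2 80 99 become three separate characters
misread = utf8_bytes.decode("cp1252")

# What a viewer sees if it decodes the same bytes as UTF-8
correct = utf8_bytes.decode("utf-8")
```

If this is the cause, the bytes on disk do not need to change; opening the file as UTF-8 in the editor or browser should display the original character.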

Maybe I need to write to something other than a text file?
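
A text file is fine for this. If the goal is HTML source that survives pasting regardless of encoding, one option is to replace every non-ASCII character with a numeric character reference using the standard `xmlcharrefreplace` error handler. This is a minimal sketch; `to_html_ascii` and the sample string are hypothetical names, and the markup already in the string is deliberately left untouched.

```python
def to_html_ascii(text):
    """Return ASCII-only text with non-ASCII characters as &#NNNN; references."""
    # xmlcharrefreplace swaps each unencodable character for its numeric
    # character reference; existing tags like <br/> are plain ASCII and
    # pass through unchanged.
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


# Hypothetical sample standing in for all_html
sample = u"Qur\u2019an<br/>caf\u00e9"
html_safe = to_html_ascii(sample)
# html_safe is now "Qur&#8217;an<br/>caf&#233;"
```

The result contains only ASCII characters, so it can be written with any ASCII-compatible encoding and a browser will still render the original symbols.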

  • But this does not work on Python 2, right? (I should say, this Python 3 code looks so concise and reasonable.) Commented Oct 15, 2017 at 19:35
  • It should not work on Python 2. We stay on Python 3; 3 is so much better. Commented Oct 16, 2017 at 4:31
  • This is THE answer. This is how you properly write UTF-8 to a file, thanks! Commented Dec 9, 2020 at 11:45
  • @KerwinSneijders The question is about Python 2.7, not Python 3. Commented Nov 2, 2021 at 20:03
  • Python 2.x is no longer supported. More and more people will never use Python 2 and will find this question on SO when searching for a Python 3 solution. I don't think there should be two separate questions for Python 2 and 3, so this should be the new accepted answer. Commented Nov 2, 2021 at 21:06
