How to convert between bytes and strings in Python 3?

This is a Python 101 type question, but it had me baffled for a while when I tried to use a package that seemed to convert my string input into bytes.

As you will see below, I found the answer myself, but I felt it was worth recording here because of the time it took me to work out what was going on. The behaviour seems to be generic to Python 3, so I have not named the original package I was using. It does not seem to be an error; the package simply had a .tostring() method that was clearly not producing what I understood as a string.

My test program goes like this:

import mangler # spoof package
stringThing = """
<Doc>
 <Greeting>Hello World</Greeting>
 <Greeting>你好</Greeting>
</Doc>
"""
# print out the input
print('This is the string input:')
print(stringThing)
# now make the string into bytes
bytesThing = mangler.tostring(stringThing) # pseudo-code again
# now print it out
print('\nThis is the bytes output:')
print(bytesThing)

The output from this code gives this:

This is the string input:
<Doc>
 <Greeting>Hello World</Greeting>
 <Greeting>你好</Greeting>
</Doc>
This is the bytes output:
b'\n<Doc>\n <Greeting>Hello World</Greeting>\n <Greeting>\xe4\xbd\xa0\xe5\xa5\xbd</Greeting>\n</Doc>\n'

So there is a need to convert between bytes and strings, to avoid non-ASCII characters ending up as gobbledegook.
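The bytes output above is consistent with the string having been encoded as UTF-8. As a minimal sketch, assuming the unnamed package's tostring() behaves like str.encode('utf-8') (an assumption, since the question does not identify the package), the round trip can be reproduced with built-ins only. The names string_thing, bytes_thing and round_trip are illustrative.

```python
# Stand-in for the unnamed package: assume tostring() amounts to UTF-8 encoding.
string_thing = "<Greeting>你好</Greeting>"

# str -> bytes: each non-ASCII character becomes several bytes,
# which print() shows as \x.. escapes
bytes_thing = string_thing.encode("utf-8")
print(bytes_thing)

# bytes -> str: decoding with the same encoding recovers the original text
round_trip = bytes_thing.decode("utf-8")
print(round_trip)
```

The escapes in the bytes repr are not corruption; they are simply how Python displays bytes outside the printable ASCII range.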

If you look at the actual method implementations you'll see that utf-8 is the default encoding, therefore you can omit it given that you know that the encoding is indeed utf-8, i.e. stringThing.encode() and bytesThing.decode() will do just fine.

– ccpizza, commented Jul 17, 2016 at 15:29

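A small check of the point about defaults, as a sketch rather than a recommendation: in Python 3, str.encode() and bytes.decode() called with no arguments use UTF-8. The variable names here are illustrative.

```python
text = "你好"
data = text.encode("utf-8")

# With no arguments, both methods default to encoding="utf-8"
same_when_encoding = text.encode() == text.encode("utf-8")
same_when_decoding = data.decode() == data.decode("utf-8")

default_matches = same_when_encoding and same_when_decoding
print(default_matches)
```
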
@ccpizza Making the encoding explicit in the above examples makes it much clearer what is going on, and IMHO is good practice. Not all Unicode is UTF-8. It also avoids the silent failure referred to in the last paragraph.

– Bobble, commented Jul 18, 2016 at 18:06

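To illustrate why the choice of encoding matters, here is a hedged sketch of one kind of silent failure: decoding UTF-8 bytes as Latin-1 raises no exception but produces the wrong text. This is an illustration only and may not be the exact case the comment above has in mind.

```python
data = "你好".encode("utf-8")   # six bytes: \xe4\xbd\xa0\xe5\xa5\xbd

# Latin-1 maps every byte to a character, so this never raises,
# but the result is not the original text.
wrong = data.decode("latin-1")
right = data.decode("utf-8")

print(repr(wrong))
print(right)

# The same text also encodes to different bytes under another Unicode encoding.
utf16 = "你好".encode("utf-16-le")
print(len(data), len(utf16))
```
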
Totally agree; explicit is better than implicit, but IMO it is good to know what the implicit default is. Whether to use it or not is another question. Just because you can doesn't mean you should :)

– ccpizza, commented Jul 18, 2016 at 21:17

In Python 3 it's safer to use decode('utf-8', 'backslashreplace') to avoid an exception if the encoding is unknown. One shouldn't always assume UTF-8!

– Nagev, commented Feb 12, 2018 at 17:19

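A minimal sketch of the difference between strict decoding and the 'backslashreplace' error handler mentioned above. The sample bytes are made up for illustration, and using this handler when decoding requires Python 3.5 or later.

```python
data = b"caf\xe9"  # 'café' encoded as Latin-1, which is not valid UTF-8

# Strict decoding (the default) raises on the invalid byte
try:
    data.decode("utf-8")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# 'backslashreplace' keeps going and shows the offending byte as an escape
lenient = data.decode("utf-8", "backslashreplace")
print(strict_failed, lenient)
```

Note that this avoids the exception but does not recover the intended character; the escape only marks where decoding went wrong.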
bytesThing.decode(encoding=locale.getpreferredencoding()) is more accurate than ignorantly assuming UTF-8.

– Mikhail T., commented Mar 11, 2025 at 21:28
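
A sketch of the locale-based approach from the comment above. The value returned by locale.getpreferredencoding() depends on the operating system and user settings, so no particular output is assumed here; it describes the local machine, which is not necessarily the encoding the bytes were produced with.

```python
import locale

# Encoding the current locale prefers; varies by OS and user settings.
# Passing False avoids changing the process-wide locale as a side effect.
preferred = locale.getpreferredencoding(False)

data = b"hello"  # ASCII-only bytes decode the same way under most common encodings
text = data.decode(encoding=preferred)
print(preferred, text)
```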
