Confused about Python bytes type

Asked 8 years, 1 month ago

Viewed 148 times

I'm reading a binary file (Python 3) and am trying to convert chunks with the help of the struct module.

f = open(fn, "rb")
try:
    a = f.read(2)
...

When I use:

unpack("h",b'\x6b\x0a')

it gives me the expected

(2667,)

But I can't use that syntax on

b'6b0a'

even though:

print(type(b'6b0a'))
print(type(b'\x6b\x0a'))

gives the same type:

<class 'bytes'>
<class 'bytes'>

What am I mixing up? I think this used to work for me in Python 2.x.

asked Nov 10, 2017 at 2:21

roadrunner66

3 Answers

b'\x6b\x0a' is two bytes: 0x6b 0x0a. b'6b0a' is four bytes: 0x36 0x62 0x30 0x61.

>>> import binascii
>>> binascii.unhexlify(b'6b0a')
b'k\n'
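
As a minimal sketch of the full round trip (the variable names here are illustrative, and the explicit `<` byte-order prefix is an addition not used in the question), the hex text can be converted to raw bytes and then unpacked:

```python
import binascii
import struct

hex_text = b'6b0a'                    # four ASCII characters: '6', 'b', '0', 'a'
raw = binascii.unhexlify(hex_text)    # two bytes: 0x6b, 0x0a

# bytes.fromhex expects a str, so decode the ASCII bytes first
raw_alt = bytes.fromhex(hex_text.decode('ascii'))

# '<h' forces a little-endian signed 16-bit integer, independent of the host;
# a bare 'h' uses the machine's native byte order instead
(value,) = struct.unpack('<h', raw)

print(len(hex_text), len(raw), value)  # 4 2 2667
```

The question's `(2667,)` result from a bare `"h"` implies a little-endian machine; spelling out `<h` keeps the result the same on any platform.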

answered Nov 10, 2017 at 2:26

Ignacio Vazquez-Abrams

2 Comments

roadrunner66 Over a year ago

Ty. How do I convert b'6b0a' into b'\x6b\x0a'?

2017-11-10T04:47:22.97Z

roadrunner66 Over a year ago

Ty. Didn't see that because of the representation as ASCII, which was my problem to begin with. So import binascii followed by print(unpack("h", binascii.unhexlify(b'6b0a'))) --> (2667,) works.

2017-11-10T15:18:21.92Z

Inside bytes literals you may use 2 methods for specifying individual bytes:

a direct ASCII character, e.g. b'xyz', or
an escape sequence (starting with \), e.g. b'\n\123\x56\\'

Of course, you may combine both methods in one bytes literal, e.g. b'xy\n\x56abc'

So your b'\x6b\x0a' and b'6b0a' are both valid bytes literals, but unfortunately different ones:

the first one consists of 2 bytes (represented as escape sequences \x6b and \x0a)
the second one consists of 4 bytes (represented as ASCII characters 6, b, 0, and a).
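
The difference is easy to check interactively. The following is only a small sketch (the names `escaped` and `plain` are made up for illustration) comparing the two literals byte by byte:

```python
escaped = b'\x6b\x0a'   # two escape sequences -> two bytes
plain = b'6b0a'         # four ASCII characters -> four bytes

print(len(escaped), list(escaped))  # 2 [107, 10]
print(len(plain), list(plain))      # 4 [54, 98, 48, 97]

# repr shows printable ASCII directly, which is why b'\x6b\x0a' displays as b'k\n'
print(escaped == b'k\n')              # True

# Mixing both methods in one literal: \x56 is the same byte as the character 'V'
print(b'xy\n\x56abc' == b'xy\nVabc')  # True
```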

edited Nov 10, 2017 at 3:54

answered Nov 10, 2017 at 3:42

MarianD

\x6b inside a b'' literal is the standard string-literal escape sequence (allowed in bytes literals, too) for the hexadecimal representation of a single byte, while 6b inside that bytes literal is two ASCII characters representing two bytes.

From the Python documentation:

Only ASCII characters are permitted in bytes literals (regardless of the declared source code encoding). Any binary values over 127 must be entered into bytes literals using the appropriate escape sequence.

While bytes literals and representations are based on ASCII text, bytes objects actually behave like immutable sequences of integers, with each value in the sequence restricted such that 0 <= x < 256.
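
To illustrate the quoted behaviour, here is a short sketch (variable names are hypothetical) showing that indexing yields integers, that bytes objects reject item assignment, and that values above 127 need an escape or an integer constructor:

```python
data = b'\x6b\x0a'

first = data[0]      # indexing yields an int, not a length-1 bytes object
chunk = data[0:1]    # slicing yields a bytes object

try:
    data[0] = 0x00   # bytes objects are immutable, so this raises TypeError
    mutated = True
except TypeError:
    mutated = False

# Values over 127 cannot be typed as plain characters in a bytes literal;
# use \x escapes or build the object from integers in range(256)
high = bytes([0xff, 0x80])

print(first, chunk, mutated, high == b'\xff\x80')  # 107 b'k' False True
```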

edited Nov 10, 2017 at 3:19

answered Nov 10, 2017 at 2:55

MarianD
