Python struct.unpack not working

Question 1

I'm trying to run this:

def ReadWord(fid,fmt,Addr):
    fid.seek(Addr)
    s = fid.readline(2)
    s = unpack(fmt + 'h', s)
    if(type(s) == tuple):
        return s[0]
    else:
        return s

with:

len(s) = 2
len(fmt) = 1
calcsize(fmt) = 0
calcsize(fmt + 'h') = 2

However, Python returns:

struct.error: unpack requires a string argument of length 4

According to the Python struct.unpack documentation:

The string must contain exactly the amount of data required by the format (len(string) must equal calcsize(fmt)).

So if the length of my string is 2 and calcsize(fmt + 'h') is also 2, why does Python say "unpack requires a string argument of length 4"?

EDIT :

Thanks for all your answers. Here is the full code:

http://qtwork.tudelft.nl/gitdata/users/guen/qtlabanalysis/analysis_modules/general/lecroy.py

As you can see in the read_timetrace function, fmt is set to '<' or '>' in an if...else statement. Printing it confirms that.

You should also know that I'm working on Windows x64 (for work).

EDIT2

Here's the full traceback, sorry for the mistake.

Traceback (most recent call last):
 File "C:\Users\maxime.vast\Desktop\Test Campaign Template\Test Suite\Include\readLecroyTRCFile.py", line 139, in <module>
 read_timetrace("C:\Users\maxime.vast\Desktop\Test Campaign Template\Test Suite\Traces\KL.ES.001.001.trc")
 File "C:\Users\maxime.vast\Desktop\Test Campaign Template\Test Suite\Include\readLecroyTRCFile.py", line 60, in read_timetrace
 WAVE_ARRAY_1 = ReadLong(fid, fmt, aWAVE_ARRAY_1)
 File "C:\Users\maxime.vast\Desktop\Test Campaign Template\Test Suite\Include\readLecroyTRCFile.py", line 100, in ReadLong
 s = unpack(fmt + 'l', s)
struct.error: unpack requires a string argument of length 4
[Finished in 0.2s]

EDIT3:

I replaced readline with read and added:

print "len(s) ", len(s)
print "len(fmt) ", len(fmt)
print "calcsize(fmt) ", calcsize(fmt)
print "calcsize(fmt + 'h') ", calcsize(fmt + 'h')
print "fmt ", fmt

to the ReadLong function.

Here's the new output and traceback:

len(s) 4
len(fmt) 1
calcsize(fmt) 0
calcsize(fmt + 'h') 2
fmt <
len(s) 4
len(fmt) 1
calcsize(fmt) 0
calcsize(fmt + 'h') 2
fmt <
len(s) 4
len(fmt) 1
calcsize(fmt) 0
calcsize(fmt + 'h') 2
fmt <
len(s) 1
len(fmt) 1
calcsize(fmt) 0
calcsize(fmt + 'h') 2
fmt <
Traceback (most recent call last):
 File "C:\Users\maxime.vast\Desktop\Test Campaign Template\Test Suite\Include\readLecroyTRCFile.py", line 143, in <module>
 read_timetrace("C:\Users\maxime.vast\Desktop\Test Campaign Template\Test Suite\Traces\KL.ES.001.001.trc")
 File "C:\Users\maxime.vast\Desktop\Test Campaign Template\Test Suite\Include\readLecroyTRCFile.py", line 60, in read_timetrace
 WAVE_ARRAY_1 = ReadLong(fid, fmt, aWAVE_ARRAY_1)
 File "C:\Users\maxime.vast\Desktop\Test Campaign Template\Test Suite\Include\readLecroyTRCFile.py", line 104, in ReadLong
 s = unpack(fmt + 'l', s)
struct.error: unpack requires a string argument of length 4
[Finished in 0.2s]

Question 2

Now that you added the complete file, please also add the full traceback you get.

Question 3

@MathersMax: You really shouldn't be using .readline() to read a binary file.

Question 4

I just noticed another problem: you're opening your waveform file in text mode. You should open it in binary mode because in text mode on Windows '\x0d\x0a' sequences get translated to '\x0a' on reading (and vice versa on writing).
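The effect of newline translation on a byte count can be sketched as follows. This is a minimal illustration and not the asker's code: the 4-byte payload and the temporary file are made up for the demo. It assumes Python 3, where text mode collapses '\r\n' to '\n' on every platform by default; in Python 2 the translation described above happens only on Windows.

```python
import os
import tempfile

# Hypothetical 4-byte payload with a CR LF pair in the middle.
payload = b'\x01\x0d\x0a\x02'

# Write the bytes to a temporary file (the name is generated for this demo).
fd, path = tempfile.mkstemp()
with os.fdopen(fd, 'wb') as f:
    f.write(payload)

# Binary mode returns the bytes untouched.
with open(path, 'rb') as f:
    raw = f.read()

# Text mode with default newline handling collapses '\r\n' to '\n',
# so one byte silently disappears from the data.
with open(path, 'r', encoding='latin-1') as f:
    text = f.read()

os.remove(path)

print(len(raw), len(text))  # 4 3
```

A 4-byte field that happens to contain such a pair would come back one byte short in text mode, which is enough to make unpack fail.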

Question 5

@MatherMax I can see you're a new member here - it's great that you've edited in response to comments, accepted the answer and got the help you needed! It's usually not a good idea to edit the solution into the question - particularly as you've got an accepted answer. Maybe you could move that edit. All the best!

Question 6

I'll add a little more code to my answer that shows how you can reduce the duplication of all those Readxxx() functions.

Question 7

FWIW, you should be using read(2), not readline(2). And if the fmt string really is '>' you should not be getting that error. Here's a short demo that performs as expected.

from struct import unpack

fname = 'qbytes'

#Create a file of all byte values
with open(fname, 'wb') as f:
    f.write(bytearray(range(256)))

def ReadWord(fid, fmt, addr):
    fid.seek(addr)
    s = fid.read(2)
    s = unpack(fmt + 'h', s)
    return s[0]

fid = open(fname, 'rb')
for i in range(16):
    addr = i
    n = 256*i + i+1
    #Interpret file data as big-endian
    print i, ReadWord(fid, '>', addr), n
fid.close()

output

0 1 1
1 258 258
2 515 515
3 772 772
4 1029 1029
5 1286 1286
6 1543 1543
7 1800 1800
8 2057 2057
9 2314 2314
10 2571 2571
11 2828 2828
12 3085 3085
13 3342 3342
14 3599 3599
15 3856 3856

BTW, struct.unpack() always returns a tuple, even if the return value is a single item.
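A quick way to see this, sketched here with Python 3 bytes literals (the byte values are arbitrary examples). It also means the type(s) == tuple check in the question's ReadWord is always true.

```python
from struct import unpack

# One value: the result is still a 1-tuple.
single = unpack('>h', b'\x01\x02')

# Two values: a 2-tuple.
pair = unpack('>hh', b'\x01\x02\x03\x04')

print(single)  # (258,)
print(pair)    # (258, 772)

# Tuple unpacking with a trailing comma is a common way to get the bare value.
value, = unpack('>h', b'\x01\x02')
print(value)   # 258
```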

Using readline(2) on a binary file can give unexpected results. In my test file in the above code there's a (Linux-style) newline \x0a in the file. So if you change s = fid.read(2) to s = fid.readline(2) everything works fine at first, but at address 10 it crashes because it only reads a single byte, due to that newline char:

from struct import unpack

fname = 'qbytes'

#Create a file of all byte values
with open(fname, 'wb') as f:
    f.write(bytearray(range(256)))

def ReadWord(fid, fmt, addr):
    fid.seek(addr)
    s = fid.readline(2)
    print repr(s),
    s = unpack(fmt + 'h', s)
    return s[0]

with open(fname, 'rb') as fid:
    for i in range(16):
        addr = i
        n = 256*i + i+1
        #Interpret file data as big-endian
        print i, ReadWord(fid, '>', addr), n

output

0 '\x00\x01' 1 1
1 '\x01\x02' 258 258
2 '\x02\x03' 515 515
3 '\x03\x04' 772 772
4 '\x04\x05' 1029 1029
5 '\x05\x06' 1286 1286
6 '\x06\x07' 1543 1543
7 '\x07\x08' 1800 1800
8 '\x08\t' 2057 2057
9 '\t\n' 2314 2314
10 '\n'
Traceback (most recent call last):
 File "./qtest.py", line 30, in <module>
 print i, ReadWord(fid, '>', addr), n
 File "./qtest.py", line 22, in ReadWord
 s = unpack(fmt + 'h', s)
struct.error: unpack requires a string argument of length 2
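The short read can be reproduced without touching the disk. A minimal sketch, assuming Python 3 and an in-memory io.BytesIO buffer standing in for the qbytes file:

```python
import io

# Same byte values as the test file above, held in memory.
data = bytes(bytearray(range(256)))
buf = io.BytesIO(data)

buf.seek(10)
with_readline = buf.readline(2)  # stops right after the newline byte at offset 10

buf.seek(10)
with_read = buf.read(2)          # returns 2 bytes unless end of file is reached

print(len(with_readline), len(with_read))  # 1 2
```

readline treats the size argument as an upper limit and returns as soon as it has consumed a newline, whereas read only comes up short at end of file.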

postscript

You have several functions in your code that almost do the same thing. That breaks the DRY principle: Don't Repeat Yourself. Here's one way to fix that, using partial function application. See the functools docs for more info.

from functools import partial
from struct import unpack

def ReadNumber(fid, datalen=1, fmt='>', conv='b', addr=0):
    #Read datalen bytes at addr and fail loudly on a short read
    fid.seek(addr)
    s = fid.read(datalen)
    if len(s) != datalen:
        raise IOError('Read %d bytes but expected %d at %d' % (len(s), datalen, addr))
    return unpack(fmt+conv, s)[0]

ReadByte = partial(ReadNumber, datalen=1, conv='b')
ReadWord = partial(ReadNumber, datalen=2, conv='h')
ReadLong = partial(ReadNumber, datalen=4, conv='l')
ReadFloat = partial(ReadNumber, datalen=4, conv='f')
ReadDouble = partial(ReadNumber, datalen=8, conv='d')

You need to use keywords to call these new functions, because datalen and conv are already bound by partial. E.g.,

ReadLong(fid, fmt='>', addr=addr)

True, that's slightly more long-winded, but it makes the code a little more readable.
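One possible variation, offered only as a sketch: let struct.calcsize work out the byte count so the length and the format code cannot disagree. The names read_number, read_word and read_long are hypothetical and not part of the original code, and the in-memory buffer stands in for a real file. It assumes fmt is one of the byte-order prefixes such as '<' or '>', so that standard sizes apply.

```python
import io
import struct
from functools import partial

def read_number(fid, conv, fmt='>', addr=0):
    """Hypothetical variant of ReadNumber: the size is derived from the format."""
    size = struct.calcsize(fmt + conv)
    fid.seek(addr)
    s = fid.read(size)
    if len(s) != size:
        raise IOError('Read %d bytes but expected %d at %d' % (len(s), size, addr))
    return struct.unpack(fmt + conv, s)[0]

read_word = partial(read_number, conv='h')
read_long = partial(read_number, conv='l')

# In-memory stand-in for the file of all byte values used earlier.
fid = io.BytesIO(bytes(bytearray(range(256))))

print(read_word(fid, fmt='>', addr=1))  # bytes 01 02, big-endian: 258
print(read_long(fid, fmt='<', addr=0))  # bytes 00 01 02 03, little-endian: 50462976
```

Asking for a long at address 254 in this buffer raises IOError, because only two bytes remain.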

Question 8

+1 for read instead of readline. Now that the OP has clarified the issue, the exception was actually raised by ReadLong, which expects 4 bytes. My guess is that readline hit a newline or the end of the file and returned fewer bytes. In general the OP should have used read and checked that all the requested bytes were actually read.

Question 9

The length of the format is rather unimportant on its own. What’s important is what kind of formats you specify there. There are for example format specifications which specify one byte or even eight bytes. So it really depends on the format how many characters there should be in s.

For example:

>>> struct.unpack('b', 'A')
(65,)
>>> struct.unpack('L', 'A')
Traceback (most recent call last):
 File "<pyshell#3>", line 1, in <module>
 struct.unpack('L', 'A')
error: unpack requires a string argument of length 4
>>> struct.unpack('L', 'AAAA')
(1094795585,)

If fmt is really > as you say, then it should work fine:

>>> struct.unpack('>h', 'AA')
(16705,)

So I assume that when the error appears, fmt is not just >, but something else that would consume an additional 2 bytes. Try printing fmt before the unpack.
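To make the size rule concrete, here is a small sketch that tabulates struct.calcsize for a few format codes. It uses the '<' prefix so the standard sizes apply; without a prefix, native sizes are used and codes such as 'l' and 'L' can differ between platforms.

```python
from struct import calcsize

# Standard sizes (with a byte-order prefix) for some common format codes.
sizes = {code: calcsize('<' + code) for code in 'bhlqfd'}

for code in 'bhlqfd':
    print(code, sizes[code])

# The byte-order prefix on its own contributes no bytes.
print(calcsize('<'))  # 0
```

This matches the diagnostics in the question: calcsize(fmt) is 0 for '<', calcsize(fmt + 'h') is 2, and the failing call used fmt + 'l', which needs 4 bytes.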

Question 10

Since len(fmt) = 1, fmt is not empty. If fmt = 'h', then fmt + 'h' will be 'hh'. In that case unpack() will expect 4 bytes of data, as each 'h' requires a short integer (2 bytes).
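The scenario can be sketched as below, with fmt set to 'h' purely as a hypothetical. The diagnostics in the question report calcsize(fmt) as 0, which points to a byte-order prefix rather than a format code, so this case does not seem to be what happened there.

```python
import struct

fmt = 'h'             # hypothetical: a format code instead of '<' or '>'
combined = fmt + 'h'  # becomes 'hh'

print(struct.calcsize(combined))  # 4

# Two bytes are not enough for two shorts, so unpack raises struct.error.
try:
    struct.unpack(combined, b'\x01\x02')
    failed = False
except struct.error:
    failed = True

print(failed)  # True
```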

Accepted Answer (the post under Question 7 above): PM 2Ring · 2015-09-11 09:30:10Z

