I have a binary file and want to read it with numpy. Each record consists of a 10-character string followed by 100 floats (4 bytes each). I know the following snippet using the struct module works, but for large files it is too slow.
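import struct

def read_records(file):
    with open(file, 'rb') as f:
        while True:
            tag = f.read(10)
            if tag == b'':  # empty read means end of file
                break
            b = []
            for i in range(100):
                # unpack returns a 1-tuple, so take element 0
                b.append(struct.unpack('f', f.read(4))[0])
            yield tag, b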
I'm a little confused by numpy.fromfile, but it seems it could meet my requirement.
Can you post a sample of your input? The first few lines at least? – void, Mar 9, 2018 at 6:34
1 Answer
fromfile takes an open file object. Without a test file, I'll just write some code without testing it:
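import numpy as np

with open('test', 'rb') as f:
    arr1 = np.fromfile(f, dtype='S1', count=10)   # the 10-byte string part
    # count=100 is optional only if nothing follows the floats in the file
    arr2 = np.fromfile(f, dtype='f4', count=100)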
In words - open the file, read the string part, then read the float part.
If it's a repeated pattern, it should work to put that code in a loop (passing count=100 so that each float read stops at the end of its record). I'd then collect the arr1 and arr2 pieces in lists, and concatenate at the end.
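
The loop idea could look roughly like the sketch below. It is a sketch under assumptions, not tested against the asker's file: the function names and the `n_floats`/`tag_len` parameters are made up for illustration, the floats are assumed to be in native byte order (as with struct's 'f'), and the tag is read as one 'S10' item rather than ten 'S1' items. The second function is a different technique from the loop: it describes the whole record as a structured dtype so that a single np.fromfile call reads every record.

```python
import numpy as np


def read_records(path, n_floats=100, tag_len=10):
    """Loop over np.fromfile, reading one (tag, floats) record per pass."""
    tags, rows = [], []
    with open(path, "rb") as f:
        while True:
            # One fixed-width byte string per record.
            tag = np.fromfile(f, dtype=f"S{tag_len}", count=1)
            if tag.size == 0:  # nothing left to read
                break
            vals = np.fromfile(f, dtype="f4", count=n_floats)
            if vals.size < n_floats:
                raise ValueError("file ends in the middle of a record")
            tags.append(tag[0])
            rows.append(vals)
    # Combine the collected pieces once, at the end.
    data = np.array(rows, dtype="f4").reshape(-1, n_floats)
    return np.array(tags), data


def read_records_structured(path, n_floats=100, tag_len=10):
    """Alternative: describe one record as a structured dtype, read in one call."""
    rec = np.dtype([("tag", f"S{tag_len}"), ("data", "f4", (n_floats,))])
    return np.fromfile(path, dtype=rec)
```

With the structured version, `rec["tag"]` and `rec["data"]` give the tags and an (n_records, 100) float array. For large files it avoids the Python-level loop entirely, which is probably where most of the time goes. Note that 'S' fields drop trailing null bytes when accessed.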