I am trying to read a Fortran file whose headers are integers and whose actual data are 32-bit floats. numpy's fromfile('mydatafile', dtype=np.float32) reads the whole file as float32, but I need the headers to be int32 for my output file. scipy's FortranFile reads the headers:
f = FortranFile('mydatafile', 'r')
headers = f.read_ints(dtype=np.int32)
but when I do:
data = f.read_reals(dtype=np.float32)
it returns an empty array. I know it shouldn't be empty, because numpy's fromfile reads all of the data. Oddly enough, the scipy method worked for other files in my dataset, but not this one. Perhaps I'm not understanding the difference between the numpy and scipy read methods. Is there a way to isolate the headers (dtype=np.int32) and data (dtype=np.float32) when reading in the file with either method?
2 Answers
np.fromfile takes a "count" argument, which specifies how many items to read. If you know the number of integers in the header in advance, a simple way to do what you want without any type conversions would just be to read the header as integers, followed by the rest of the file as floats:
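import numpy as np

with open('filepath', 'rb') as f:  # binary mode, not text mode
    # number_of_integers: how many int32 values the header holds
    header = np.fromfile(f, dtype=np.int32, count=number_of_integers)
    data = np.fromfile(f, dtype=np.float32)  # everything after the header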
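To see the count argument at work, here is a minimal round-trip sketch. It assumes a raw binary file (no Fortran record markers) and a made-up layout of three int32 header values followed by four float32 samples; the file name and the values are purely illustrative.

```python
import os
import tempfile

import numpy as np

# Hypothetical layout: 3 int32 header values, then float32 samples.
header_in = np.array([10, 20, 30], dtype=np.int32)
data_in = np.array([1.5, 2.5, 3.5, 4.5], dtype=np.float32)

# Write a small raw binary file to read back.
path = os.path.join(tempfile.mkdtemp(), "raw.bin")
with open(path, "wb") as f:
    header_in.tofile(f)
    data_in.tofile(f)

with open(path, "rb") as f:
    # count stops the read after the header; the file position moves past it.
    header = np.fromfile(f, dtype=np.int32, count=header_in.size)
    # Without count, the remainder of the file is read.
    data = np.fromfile(f, dtype=np.float32)

print(header, header.dtype)
print(data, data.dtype)
```

Because both reads share one open file object, the second call picks up exactly where the first stopped, so no type conversion is needed afterwards.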
@DavidTrevelyan has a quite okay way. Another way is to use the fortranfile package in combination with struct. Neither way is ideal, but then neither is scipy's FortranFile.
At least this way you can read mixed-type data. Here's an example:
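from struct import unpack
from fortranfile import FortranFile

# to_open is the path to the Fortran unformatted file
with FortranFile(to_open) as fh:
    dat = fh.readRecord()  # raw bytes of one record
    # '=4i20d': native byte order, 4 int32 values followed by 20 float64 values
    val_list = unpack('=4i20d', dat)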
You can install it using pip install fortranfile. struct is part of the standard library; the (un)pack format characters are described in its documentation.
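The unpack step can be tried without any Fortran file. In this sketch the record bytes are built with pack as a stand-in for what fh.readRecord() would return; the 4-integer, 20-double layout simply mirrors the format string above, and the values are invented.

```python
from struct import calcsize, pack, unpack

# '=' selects native byte order with standard sizes and no padding:
# 4 x 4-byte ints followed by 20 x 8-byte doubles.
fmt = "=4i20d"

# Stand-in for the bytes of one record (illustrative values only).
ints_in = (1, 2, 3, 4)
reals_in = tuple(0.5 * k for k in range(20))
record = pack(fmt, *ints_in, *reals_in)

# unpack requires len(record) == calcsize(fmt), otherwise it raises struct.error.
values = unpack(fmt, record)
ints, reals = values[:4], values[4:]

print(calcsize(fmt), len(record))
print(ints)
```

Checking calcsize(fmt) against the record length is a cheap way to confirm that the format string matches the record before trusting the values.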
'float32' and then convert the part that corresponds to the header with data[:n].view(np.int32), where n is the number of elements in the header.

FortranFile is for Fortran "unformatted" binary files. Despite being called "unformatted", these files store the data as records that consist of a header giving the length of the record in bytes, followed by the item data itself, followed by a second copy of the header. numpy.fromfile, on the other hand, is for raw binary data, without any footers or headers. Fortran can also output files in this format (it depends on the arguments of the OPEN statement, e.g. access='stream'). So you need to know which file format you have and use the correct one of the two methods; using the wrong method leads to wrong data.
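The record layout and the view trick can be illustrated together. The sketch below builds one record by hand, assuming 4-byte record markers in native byte order (a common compiler default, but not guaranteed), and uses np.frombuffer as an in-memory stand-in for np.fromfile.

```python
import numpy as np

# One hand-built record: marker, payload, marker.
# Assumption: 4-byte markers holding the payload length in bytes.
payload = np.array([1.0, 2.0, 3.0], dtype=np.float32)
marker = np.array([payload.nbytes], dtype=np.int32)
raw_bytes = marker.tobytes() + payload.tobytes() + marker.tobytes()

# A raw read interprets everything as float32, markers included.
as_float = np.frombuffer(raw_bytes, dtype=np.float32)

# view reinterprets the same bytes as int32 without copying or converting.
as_int = as_float.view(np.int32)
lead, trail = int(as_int[0]), int(as_int[-1])

# The actual data sits between the two markers.
data = as_float[1:-1]

print(as_float.size, lead, trail, data)
```

A raw read of a record-based file therefore returns two extra elements per record, which is one way to tell the two formats apart: if the first and last values, viewed as int32, both equal the byte length of what lies between them, the file most likely contains record markers.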