stdlib/src/stdlib_io_npy_save.fypp
Lines 70 to 79 in 9c4abca
Is there a reason to perform the byte encoding this way? I assume this function guarantees that little-endian byte order is observed? Or is it meant to handle the range of unsigned integers correctly?
Wouldn't it be sufficient to implement this with transfer:
str = transfer(int(val,int16),str)
There is a big difference in the number of instructions needed. In the old days, equivalence might have been used, but in this case it interferes with the result attribute.
Here is a demonstration on Compiler Explorer: https://godbolt.org/z/Ecr7Teree
If endianness is an issue, an if-else block (if (little_endian) then ...) could be used. The endianness can be determined as a compile-time constant:
stdlib/src/stdlib_hash_32bit.fypp
Lines 54 to 56 in 9f1aa24
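A minimal sketch of how the transfer one-liner and the endianness guard could be combined. The module name `le_transfer_sketch` and the function name `encode_i2_le` are hypothetical, and the `little_endian` constant is written in the spirit of the referenced lines rather than copied from them. The sketch assumes 0 <= val <= 32767, since larger values do not fit in int16.

```fortran
module le_transfer_sketch
    use, intrinsic :: iso_fortran_env, only: int8, int16
    implicit none
    private
    public :: little_endian, encode_i2_le

    ! Compile-time endianness check (assumed form): on a little-endian
    ! machine the bytes [1, 0] reinterpret as the 16-bit integer 1.
    logical, parameter :: little_endian = &
        (1 == transfer([1_int8, 0_int8], 0_int16))

contains

    !> Encode val as two bytes in little-endian order.
    !> Assumes 0 <= val <= 32767 so the int16 conversion cannot overflow.
    pure function encode_i2_le(val) result(str)
        integer, intent(in) :: val
        character(len=2) :: str

        ! Reinterpret the native 16-bit representation as two characters.
        str = transfer(int(val, int16), str)

        ! On big-endian hardware the native order is [high, low]; swap it.
        if (.not. little_endian) str = str(2:2) // str(1:1)
    end function encode_i2_le

end module le_transfer_sketch
```

Because `little_endian` is a named constant, a compiler can drop the unused branch, so the guard should cost nothing on little-endian targets.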
-
The npy format requires little-endian encoding for the header. Writing the 2/4 byte version sequence of the npy header is unlikely to become performance critical, so I didn't attempt any optimization and went for the obvious solution. But feel free to submit an improved version.
-
I was merely curious because I was adapting your encoding routines to output a bitmap file header. I agree it's not performance critical here. It's good to know there are different possibilities. In principle compilers could be taught to recognize the approach you adopted as a byte conversion and optimize it accordingly.
The original C++ code I was following to output a bitmap used the following shift approach:

    unsigned char header[2];
    header[0] = (unsigned char)(value);
    header[1] = (unsigned char)(value >> 8);

If you put this into Compiler Explorer (https://godbolt.org/z/jdnnf65cq), it needs one instruction fewer than the Fortran transfer version.
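For comparison, the same shift approach can be sketched in Fortran with the bit intrinsics. This is only an illustration: the module name `le_shift_sketch` and the function name `encode_u16_le` are hypothetical. Unlike the transfer version, it does not depend on the host byte order and it covers the full 0 to 65535 range.

```fortran
module le_shift_sketch
    implicit none
    private
    public :: encode_u16_le

contains

    !> Encode the low 16 bits of val as two bytes, least significant first.
    pure function encode_u16_le(val) result(str)
        integer, intent(in) :: val
        character(len=2) :: str

        ! Low byte: mask off everything above bit 7.
        ! Note: achar for codes above 127 is processor dependent under the
        ! standard; common compilers map them to the corresponding byte.
        str(1:1) = achar(iand(val, 255))

        ! High byte: logical shift right by 8 bits, then mask.
        str(2:2) = achar(iand(ishft(val, -8), 255))
    end function encode_u16_le

end module le_shift_sketch
```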
For the bitmap header, it's possible to ditch the character conversion entirely, and simply use the structure-constructor of a derived-type, e.g.
    type :: bmp_header_
        sequence
        character(len=2) :: header = 'BM'
        integer(int32) :: filesize
        integer(int16) :: reserved1 = 0
        integer(int16) :: reserved2 = 0
        integer(int32) :: offset
    end type

which can be written directly in binary (when little-endian):

    write(unit, iostat=stat) bmp_header_(filesize=filesize,offset=54)
I think it's a good use of both the sequence attribute and the intrinsic fixed-size integer kind specifiers. The only caveat is the signed/unsigned integer conversion. It's probably best practice to overload the structure constructor and explicitly transfer and truncate integers to circumvent wraparound issues.
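One way such an overloaded constructor might look is sketched below. Everything beyond the type definition is an assumption for illustration: the module name `bmp_header_sketch`, the procedure names `new_bmp_header` and `to_u32_bits`, and the choice of int64 arguments to carry unsigned 32-bit values. The helper keeps the low 32 bits and maps them onto the signed int32 value with the same bit pattern.

```fortran
module bmp_header_sketch
    use, intrinsic :: iso_fortran_env, only: int16, int32, int64
    implicit none
    private
    public :: bmp_header_

    type :: bmp_header_
        sequence
        character(len=2) :: header = 'BM'
        integer(int32) :: filesize
        integer(int16) :: reserved1 = 0
        integer(int16) :: reserved2 = 0
        integer(int32) :: offset
    end type

    ! Generic with the same name as the type: calls with int64 arguments
    ! resolve to new_bmp_header instead of the default constructor.
    interface bmp_header_
        module procedure new_bmp_header
    end interface

contains

    !> Hypothetical constructor taking unsigned values stored in int64.
    pure function new_bmp_header(filesize, offset) result(h)
        integer(int64), intent(in) :: filesize, offset
        type(bmp_header_) :: h
        h%filesize = to_u32_bits(filesize)
        h%offset = to_u32_bits(offset)
    end function new_bmp_header

    !> Truncate to 32 bits and return the int32 with that bit pattern.
    elemental function to_u32_bits(v) result(bits)
        integer(int64), intent(in) :: v
        integer(int32) :: bits
        integer(int64) :: low
        low = iand(v, int(z'FFFFFFFF', int64))
        ! Values above huge(int32) wrap to the negative half explicitly.
        if (low > huge(0_int32)) low = low - 2_int64**32
        bits = int(low, int32)
    end function to_u32_bits

end module bmp_header_sketch
```

With this in place, `bmp_header_(filesize=4294967295_int64, offset=54_int64)` stores -1 in `filesize`, which has the same bit pattern as the unsigned value, rather than relying on an out-of-range conversion.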