Integer to byte sequence conversion #668

ivan-pi started this conversation in General

!> Write integer as byte string in little endian encoding, 2-byte truncated version
pure function to_bytes_i2(val) result(str)
   !> Integer value to convert to bytes
   integer, intent(in) :: val
   !> String of bytes
   character(len=2) :: str
   str = achar(mod(val, 2**8)) // &
      & achar(mod(val, 2**16) / 2**8)
end function to_bytes_i2

Is there a reason to perform the byte encoding this way? I assume the point is that this function guarantees little-endian byte order regardless of the platform, or is it that it correctly handles the range of unsigned integers?

Wouldn't it be sufficient to implement this with transfer:

str = transfer(int(val,int16),str)

There is a big difference in the number of instructions needed. In the old days equivalence might have been used, but in this case it interferes with the result attribute.

Here is a demonstration on Compiler Explorer: https://godbolt.org/z/Ecr7Teree

If the endianness is an issue, an if-else block (if (little_endian) then ...) could be used. The endianness can be determined as a compile-time constant:

! Dealing with different endians
logical, parameter, public :: &
   little_endian = ( 1 == transfer([1_int8, 0_int8], 0_int16) )
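To make the if-else idea concrete, here is a minimal sketch that combines the transfer one-liner with the compile-time endianness constant. The module name to_bytes_sketch and the function name to_bytes_i2_transfer are made up for illustration and are not part of stdlib. The sketch assumes 0 <= val <= huge(0_int16), because converting a larger value to int16 is out of range.

```fortran
module to_bytes_sketch
   ! Hypothetical module name, used only for this illustration
   use, intrinsic :: iso_fortran_env, only: int8, int16
   implicit none
   private
   public :: little_endian, to_bytes_i2_transfer

   ! Compile-time endianness check, as in the snippet above
   logical, parameter :: little_endian = &
      (1 == transfer([1_int8, 0_int8], 0_int16))

contains

   !> Little-endian 2-byte encoding via transfer.
   !> Assumption: 0 <= val <= huge(0_int16); larger values are not handled here.
   pure function to_bytes_i2_transfer(val) result(str)
      integer, intent(in) :: val
      character(len=2) :: str
      integer(int16) :: low16

      low16 = int(val, int16)
      ! Copy the two bytes of low16 in the machine's native order
      str = transfer(low16, str)
      ! On a big-endian machine the native order is reversed, so swap the bytes
      if (.not. little_endian) str = str(2:2) // str(1:1)
   end function to_bytes_i2_transfer

end module to_bytes_sketch
```

With this sketch, to_bytes_i2_transfer(258) should give the byte with value 2 followed by the byte with value 1, since 258 is 1*256 + 2.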

The npy format requires little-endian encoding for the header. Writing the 2/4 byte version sequence of the npy header is unlikely to become performance critical, so I didn't attempt any optimization and went for the obvious solution. But feel free to submit an improved version.

ivan-pi Jul 26, 2022
Maintainer Author

I was merely curious because I was adapting your encoding routines to output a bitmap file header. I agree it's not performance critical here. It's good to know there are different possibilities. In principle compilers could be taught to recognize the approach you adopted as a byte conversion and optimize it accordingly.

The original C++ code I was following to output a bitmap used the following shift approach:

unsigned char header[2];
header[0] = (unsigned char)(value );
header[1] = (unsigned char)(value >> 8);

If you put this into Compiler Explorer (https://godbolt.org/z/jdnnf65cq), it needs one instruction fewer than the Fortran transfer version.
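For comparison, the same shift-and-mask idea can be written in Fortran with the bit intrinsics iand and shiftr (the latter is Fortran 2008). This is only a sketch: the module name to_bytes_shift_sketch and the function name to_bytes_i2_shift are hypothetical, and achar is used with values up to 255 just as in the original to_bytes_i2.

```fortran
module to_bytes_shift_sketch
   ! Hypothetical module name, used only for this illustration
   implicit none
   private
   public :: to_bytes_i2_shift

contains

   !> Little-endian 2-byte encoding using shifts and masks.
   !> Only the low 16 bits of val are kept; higher bits are discarded.
   pure function to_bytes_i2_shift(val) result(str)
      integer, intent(in) :: val
      character(len=2) :: str

      ! Low byte first, then the next byte; each is masked to 0..255
      str = achar(iand(val, 255)) // &
         & achar(iand(shiftr(val, 8), 255))
   end function to_bytes_i2_shift

end module to_bytes_shift_sketch
```

Because the result is built byte by byte from the integer value, it does not depend on the endianness of the machine, which mirrors the C++ snippet.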

For the bitmap header, it's possible to ditch the character conversion entirely, and simply use the structure-constructor of a derived-type, e.g.

type :: bmp_header_
   sequence
   character(len=2) :: header = 'BM'
   integer(int32) :: filesize
   integer(int16) :: reserved1 = 0
   integer(int16) :: reserved2 = 0
   integer(int32) :: offset
end type

which can be written directly in binary (when little-endian):

write(unit, iostat=stat) bmp_header_(filesize=filesize,offset=54)

I think it's a good use of both the sequence attribute and the intrinsic fixed-size integer kind specifiers. The only caveat is the signed/unsigned integer conversion, since the bitmap fields are unsigned while Fortran integers are signed. It's probably best practice to overload the structure constructor and explicitly transfer and truncate the integers to circumvent wraparound issues.
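One way the overloaded constructor might look is sketched below. Everything beyond the bmp_header_ type itself is an assumption made for illustration: the module name bmp_header_sketch, the helper wrap_u32, and the choice to pass the unsigned values as int64. The sketch uses plain arithmetic rather than transfer for the wraparound, so it does not depend on byte order.

```fortran
module bmp_header_sketch
   ! Hypothetical module name, used only for this illustration
   use, intrinsic :: iso_fortran_env, only: int16, int32, int64
   implicit none
   private
   public :: bmp_header_

   type :: bmp_header_
      sequence
      character(len=2) :: header = 'BM'
      integer(int32) :: filesize
      integer(int16) :: reserved1 = 0
      integer(int16) :: reserved2 = 0
      integer(int32) :: offset
   end type

   ! Generic with the same name as the type: overloads the structure constructor
   interface bmp_header_
      module procedure new_bmp_header
   end interface

contains

   !> Build a header from unsigned 32-bit values held in int64 arguments
   pure function new_bmp_header(filesize, offset) result(hdr)
      integer(int64), intent(in) :: filesize, offset
      type(bmp_header_) :: hdr

      ! header, reserved1 and reserved2 keep their default initialization
      hdr%filesize = wrap_u32(filesize)
      hdr%offset = wrap_u32(offset)
   end function new_bmp_header

   !> Hypothetical helper: keep the low 32 bits of u and map them
   !> onto the signed int32 value with the same bit pattern
   pure function wrap_u32(u) result(i)
      integer(int64), intent(in) :: u
      integer(int32) :: i
      integer(int64) :: m

      m = iand(u, 4294967295_int64)
      if (m >= 2_int64**31) m = m - 2_int64**32
      i = int(m, int32)
   end function wrap_u32

end module bmp_header_sketch
```

A call such as bmp_header_(filesize=4294967295_int64, offset=54_int64) would then resolve to new_bmp_header and store filesize as -1, which has the same 32-bit pattern as the unsigned value.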
