-
Notifications
You must be signed in to change notification settings - Fork 194
Proposal for a reference string to number conversion facility in stdlib #743
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
I ran an exhaustive test over the full range of 32-bit IEEE-754 numbers and observed many discrepancies in the real values read by str2float
vs. the values read via internal input.
program test_real4s use stdlib_str2num, only: str2float implicit none character(112) :: buff integer(4) :: ival real(4) :: rval, rcheck equivalence(ival, rval) integer(8) :: j, passes, fails, tests passes = 0 fails = 0 tests = 0 do j = 0, int(z'ffffffff',8) ival = j if (iand(ival, z'7f800000') == int(z'7f800000') .and. & iand(ival, z'007fffff') /= 0) then cycle ! skip NaNs end if tests = tests + 1 write(buff, 1) rval 1 format(E112.105) read(buff, 1) rcheck if (rval /= rcheck) then print 2, ival, rval, rcheck, rval, rcheck 2 format('I/O library failure: ival ',z8,' rval ',z8,' rcheck ',z8, & ';',2(1x,E112.105)) error stop end if rcheck = str2float(buff) if (rval == rcheck) then passes = passes + 1 else print 3, rval, rcheck, rval, rcheck 3 format('str2float failure: rval ',z8,' rcheck ',z8, & /' want ',E112.105,/' got ',E112.105) end if end do print *, passes, ' total str2float passes out of ', tests, ' tests' print *, fails, ' total str2float failures out of ', tests, ' tests' end
Thanks for pointing it out and the test @klausler, I saw the problem with the small values!!
So, with the current strategy, this value for instance:
0.140129846432481707092372958328991613128026194187651577175706828388979108268586060148663818836212158203125E-44
Would be read first with an integer containing:
int_wp = 14012984
the variable resp = 9, contains the decimal places that should be applied first to get 0.14012984 and the exponent integer cotains the correct value
sige*max(0,i_exp) = -44
But, since the total exponent is computed from
expbase(nwnb-1+resp-sige*max(0,i_exp))
which can not hold 1e-53 that gives an underflow, the strategy doesn't work properly here...
Thinking how to make sure to manage those limit cases without degrading the performance
For the edge cases, something like this could work:
exp_aux = nwnb-1+resp-sige*max(0,i_exp) if( exp_aux>0 .and. exp_aux<=nwnb+nfnb ) then v = sign*int_wp*expbase(exp_aux) else if(exp_aux>nwnb+nfnb) then v = sign*int_wp*fractional_base(exp_aux-(nwnb+nfnb))*expbase(nwnb+nfnb) end if
Tried on
"0.462428493227189633404830762485672323322486440819250204679832533683631057286333998490590602159500122070312E-43"
With ifort and ifx the error is below acceptable tolerance, with gfortran I get:
formatted read : 0.462428493E-43
str2real : 0.448415509E-43
klausler
commented
Nov 2, 2023
For the edge cases, something like this could work:
exp_aux = nwnb-1+resp-sige*max(0,i_exp) if( exp_aux>0 .and. exp_aux<=nwnb+nfnb ) then v = sign*int_wp*expbase(exp_aux) else if(exp_aux>nwnb+nfnb) then v = sign*int_wp*fractional_base(exp_aux-(nwnb+nfnb))*expbase(nwnb+nfnb) end ifTried on "0.462428493227189633404830762485672323322486440819250204679832533683631057286333998490590602159500122070312E-43"
With ifort and ifx the error is below acceptable tolerance, with gfortran I get: formatted read : 0.462428493E-43 str2real : 0.448415509E-43
This test is reading the results of binary->decimal conversions that have enough decimal digits to be exact. The values read back must match the original binary values exactly; there is no "acceptable tolerance".
Well, the problem being that even this is not exact
use iso_fortran_env real :: x x = 10._real32**(-38) print *, x
gfortran : 9.99999935E-39
ifx : 9.9999994E-39
and it get worst with
x = 10._real32**(-44)
gfortran : 9.80908925E-45
ifx : 9.8090893E-45
Anyhow, fixed the problem. It was coming from the table of exponents which did not accurately defined the 10 basis. Changed for the following:
integer(kind=ikind), parameter :: nwnb = 39 !> number of whole number factors integer(kind=ikind), parameter :: nfnb = 37 !> number of fractional number factors integer :: e real(wp), parameter :: whole_number_base(nwnb) = [(10._wp**(nwnb-e),e=1,nwnb)] real(wp), parameter :: fractional_base(nfnb) = [(10._wp**(-e),e=1,nfnb)] real(wp), parameter :: expbase(nwnb+nfnb) = [whole_number_base, fractional_base]
Now I get:
gfortran
input : "0.462428493227189633404830762485672323322486440819250204679832533683631057286333998490590602159500122070312E-43"
formatted read : 0.462428493E-43
str2real : 0.462428493E-43
ifort
input : "0.462428493227189633404830762485672323322486440819250204679832533683631057286333998490590602159500122070312E-43"
formatted read : .4624285E-43
str2real : .4624285E-43
klausler
commented
Nov 2, 2023
If you're using negative powers of ten in your algorithm, it's never going to be exact.
I know, but I haven't found a better approach to be fast and accurate. In any case you can see in the last results that both the formatted read and the proposed algorithm give the same output to the last decimal place.
Previously I was defining the exponents table with "1eN", this thread https://stackoverflow.com/questions/47337262/defining-exponent-in-fortran hinted me to using "10._wp**N" which did indeed improve the precision.
I'll update when finishing more tests.
Changed the interface according to the discussions in the forum asking for something like "r = to_num( s , mold = r )" (the mold argument is there just to disambiguate the function call).
@klausler I'm rerunning your tests loosening the error check by
if (abs(rval - rcheck)<=10*epsilon(0.0) * abs(rcheck)) then
It has been running for quite a while on 16threads and so far no error messages, don't know when it will finish, I'll update fully then... a relative error to 10*epsilon(0._wp) is already quite good I would say. If there is a different method that would enable both speed and better accuracy I would gladly try to adapt it.
klausler
commented
Nov 2, 2023
How much faster is this inexact algorithm than an internal READ
statement with a good Fortran compiler?
Between 30x to 70x depending on the hardware and compiler (using -O3 -march=native)
I've done all the tests with gfortran 11 / 12 / 13 and ifort19 / ifort21 / ifx23
klausler
commented
Nov 2, 2023
Using a variant of my test from above, sweeping over all the ca. 4B valid 32-bit IEEE-754 numbers, I measured that it takes 4795 total CPU seconds to just do all the internal writes, 2107 additional CPU seconds to also do all the internal reads, or 526 additional CPU seconds to also run str2num
instead of using internal reads. That's a speed-up of 4x, which is not 70x but would still be great if the results were the same. (This is with f18 ("llvm flang"), which has pretty fast binary<->decimal conversions.)
Just saw the tests were finish, the only errors prompted were
str2float failure: rval 7F800000 rcheck 0 want Infinity got 0.000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000E+00 str2float failure: rval FF800000 rcheck 0 want -Infinity got 0.000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000E+00 4278190080 total str2float passes out of 4278190082 tests 0 total str2float failures out of 4278190082 tests
I did get those speed ups with str2double (now just to_num) ... I'll double check speedups for real32
One idea for coming across the precision issue would be to store the exponents in double precision. As the multiplication is done just once at the end, it should no add much penalty. I'll try that out.
So, with the exponents table in single precision:
input : "0.123456789123456789123456789123456789" formatted read : 0.123456791 str2real : 0.123456776 difference abs : -0.149011612E-7 difference rel : -0.120699415E-4%
input : "0.589575869907087640261923735717316179460653520220192613836518827249684807090268634510721312835812568664551E-38" formatted read : 0.589575870E-38 str2real : 0.589575730E-38 difference abs : -0.140129846E-44 difference rel : -0.237679069E-4%
exponents in double precision
input : "0.123456789123456789123456789123456789" formatted read : 0.123456791 str2real : 0.123456784 difference abs : -0.745058060E-8 difference rel : -0.603497074E-5%
input : "0.589575869907087640261923735717316179460653520220192613836518827249684807090268634510721312835812568664551E-38" formatted read : 0.589575870E-38 str2real : 0.589575870E-38 difference abs : 0.00000000 difference rel : 0.00000000%
Mixing the precisions gets the error down to epsilon(0.0) for real32
I'm kind of puzzled by the NaN assignment. I did define a parameter real(wp), parameter :: rNaN = real(z'7fc00000',wp)
which is assigned if "N" if found
else if( iachar(s(p:p)) == NaN ) then v = rNaN; return
But a simple print returns 0.00000000 instead (same for Inf) ... any clues ?
@klausler reran your tests with the latest version and got the error down to zero. No prompts other than the Infinity and -Infinity with the condition on if (rval == rcheck) then
... have yet to understand why the parameters are not being correctly assigned. Would you mind cross-validating? With the new interface just change
use stdlib_str2num, only: to_num
...
rcheck = to_num(buff,rcheck)
Edit: found and fixed the issue with Inf and NaN
Mixing the precisions gets the error down to epsilon(0.0) for real32
I'm kind of puzzled by the NaN assignment. I did define a parameter
real(wp), parameter :: rNaN = real(z'7fc00000',wp)
which is assigned if "N" if foundelse if( iachar(s(p:p)) == NaN ) then v = rNaN; returnBut a simple print returns 0.00000000 instead (same for Inf) ... any clues ?
If possible I suggest to use the ones provided by ieee_arithmetic
as in other stdlib
module (e.g. in the stats mean module )
@jvdp1 by any chance do you know why there are those two fails in the build? when I look at the logs, they do not seem to be related to the current PR.
@jvdp1 by any chance do you know why there are those two fails in the build? when I look at the logs, they do not seem to be related to the current PR.
No idea. But it seems unrelated to this PR.
@klausler after verifying that your tests runs successfully without any errors with strict equivalence, I adapted it to check for the speedups. I saw values oscillating between 6x and 13x stabilizing around 8x for the real32 reader (to_num) compared with the formatted read.
Compiler: gfortran 13.2
Options: -O3 -march=native
OS: WSL Ubuntu 20.04
CPU: Intel Xeon Gold @ 2.9GHz
@jvdp1 regarding this:
If possible I suggest to use the ones provided by ieee_arithmetic as in other stdlib module (e.g. in the stats mean module )
I have just one remark:
Defining a parameter variable should in principle offer better optimization compared to having to call the function ieee_value(1._wp, ieee_positive_inf)
which is not an intrinsic function. Thus one cannot do something like:
real(wp), parameter :: Inf = ieee_value(1._wp, ieee_positive_inf)
(at least not with gfortran or ifort). One can only evaluate within the implementation. Then, I'm not sure how this would behave with all the compilers out there, and if it is an actual concern?
I have just one remark: Defining a parameter variable should in principle offer better optimization compared to having to call the function
ieee_value(1._wp, ieee_positive_inf)
which is not an intrinsic function. Thus one cannot do something like:real(wp), parameter :: Inf = ieee_value(1._wp, ieee_positive_inf)
(at least not with gfortran or ifort). One can only evaluate within the implementation. Then, I'm not sure how this would behave with all the compilers out there, and if it is an actual concern?
There have been many discussions about NaN and Inf, see e.g., #128 . As I remember the main issue was portability.
Co-authored-by: Jeremie Vandenplas <jeremie.vandenplas@gmail.com>
Co-authored-by: Jeremie Vandenplas <jeremie.vandenplas@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here are some final? comments, mainly related to FORD
and the generation of the specs.
Co-authored-by: Jeremie Vandenplas <jeremie.vandenplas@gmail.com>
Co-authored-by: Jeremie Vandenplas <jeremie.vandenplas@gmail.com>
Co-authored-by: Jeremie Vandenplas <jeremie.vandenplas@gmail.com>
Is it possible to re-launch the build checks without a commit? I've seen a few times here that the macos-latest build fails after staying idle for hours. It doesn't seem to be related to the changes in the PR.
Is it possible to re-launch the build checks without a commit? I've seen a few times here that the macos-latest build fails after staying idle for hours. It doesn't seem to be related to the changes in the PR.
Done! Indeed, it seems related to the CI, instead of stdlib
itself
Hi, is there something else that should be done before the PR can be merged?
IMO this PR sounds good and ready to be merged. But it would be good to have at least one additional reviewer (maybe @klausler @milancurcic @gnikit @awvwgk ?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work @jalvesz. This would be an amazing addition to stdlib.
I had a read over the added files and they look good to me.
I would be keen to merge this and make a minor stdlib release, unless others have any objections.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this PR, looks good. Only minor nit-picks in comments.
Not a big deal, but it would be nice if the style was consistent within the module, e.g. the whitespaces around operators, some are there, some aren't.
Another one is the use of semicolons to put multiple statements on one line. I think we should just not use those at all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be nice to have an interface to a shallow wrapper that takes string_type
as input. Better integration with the rest of stdlib.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Haven't used the string_type from stdlib so didn't think about that. Maybe after merging the PR we could take a look at that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Definitely, since it's an additional feature, can be a future PR.
I would be keen to merge this and make a minor stdlib release, unless others have any objections.
@gnikit I have no objections to that.
@jvdp1, I think you can go ahead and merge.
thank you @jalvesz for this PR, and all participants for useful discussions and reviews. As suggested I will merge it this PR.
Thank you @jvdp1 @klausler @milancurcic @gnikit @ivan-pi for you excellent guidance and reviews! hope this PR will be useful for the Fortran community!
The following PR contains an adaptation of this project https://github.com/jalvesz/Fortran-String-to-Num which was developed after the discussions about how to do fast string to double conversion here https://fortran-lang.discourse.group/t/faster-string-to-double/2208/85
I have provided unit tests for string to real32 and real64, as well as a short example.
Might need some help/guidance to finalize a proper automatic documentation, and totally open to change the naming convention if needed.