Commit 7b99d43

authored

Merge pull request #915 from jalvesz/intrinsics

intrinsics module with alternative implementations

2 parents 6c2565d + 12612bc commit 7b99d43

File tree

13 files changed: +1022 -1 lines changed

doc/specs
- stdlib_intrinsics.md
example
- CMakeLists.txt
- intrinsics
src
test
- CMakeLists.txt
- intrinsics
  - CMakeLists.txt
  - test_intrinsics.fypp

`‎doc/specs/stdlib_intrinsics.md‎`

Lines changed: 158 additions & 0 deletions

Original file line number	Diff line number	Diff line change
`@@ -0,0 +1,158 @@`
	`1`	`+---`
	`2`	`+title: intrinsics`
	`3`	`+---`
	`4`	`+`
	`5`	+# The `stdlib_intrinsics` module
	`6`	`+`
	`7`	`+[TOC]`
	`8`	`+`
	`9`	`+## Introduction`
	`10`	`+`
	`11`	+The `stdlib_intrinsics` module provides replacements for some well-known Fortran intrinsic functions for which a faster and/or more accurate implementation exists and has proven of interest to the Fortran community.
	`12`	`+`
	`13`	`+<!-- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -->`
	`14`	+### `stdlib_sum` function
	`15`	`+`
	`16`	`+#### Description`
	`17`	`+`
	`18`	+The `stdlib_sum` function can replace the intrinsic `sum` for `real`, `complex` or `integer` arrays. It follows a chunked implementation which maximizes vectorization potential and reduces the round-off error. This procedure is recommended when summing large arrays (e.g., more than 2**10 elements); for repeated summation of smaller arrays, consider the intrinsic `sum`.
	`19`	`+`
	`20`	`+#### Syntax`
	`21`	`+`
	`22`	+`res = ` [[stdlib_intrinsics(module):stdlib_sum(interface)]] ` (x [,mask] )`
	`23`	`+`
	`24`	+`res = ` [[stdlib_intrinsics(module):stdlib_sum(interface)]] ` (x, dim [,mask] )`
	`25`	`+`
	`26`	`+#### Status`
	`27`	`+`
	`28`	`+Experimental`
	`29`	`+`
	`30`	`+#### Class`
	`31`	`+`
	`32`	`+Pure function.`
	`33`	`+`
	`34`	`+#### Argument(s)`
	`35`	`+`
	`36`	+`x`: N-D array of either `real`, `complex` or `integer` type. This argument is `intent(in)`.
	`37`	`+`
	`38`	+`dim` (optional): scalar of type `integer` with a value in the range from 1 to n, where n equals the rank of `x`.
	`39`	`+`
	`40`	+`mask` (optional): N-D array of `logical` values, with the same shape as `x`. This argument is `intent(in)`.
	`41`	`+`
	`42`	`+#### Output value or Result value`
	`43`	`+`
	`44`	+If `dim` is absent, the output is a scalar of the same `type` and `kind` as `x`. Otherwise, the output is an array of rank n-1, where n equals the rank of `x`, whose shape is that of `x` with dimension `dim` dropped.
	`45`	`+`
	`46`	`+<!-- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -->`
	`47`	+### `stdlib_sum_kahan` function
	`48`	`+`
	`49`	`+#### Description`
	`50`	`+`
	`51`	+The `stdlib_sum_kahan` function can replace the intrinsic `sum` for `real` or `complex` arrays. It follows a chunked implementation which maximizes vectorization potential, complemented by an `elemental` kernel based on the [Kahan summation](https://doi.org/10.1145%2F363707.363723) strategy to reduce the round-off error:
	`52`	`+`
	`53`	+```fortran
	`54`	`+elemental subroutine kahan_kernel_<kind>(a,s,c)`
	`55`	`+ type(<kind>), intent(in) :: a`
	`56`	`+ type(<kind>), intent(inout) :: s`
	`57`	`+ type(<kind>), intent(inout) :: c`
	`58`	`+ type(<kind>) :: t, y`
	`59`	`+ y = a - c`
	`60`	`+ t = s + y`
	`61`	`+ c = (t - s) - y`
	`62`	`+ s = t`
	`63`	`+end subroutine`
	`64`	+```
	`65`	`+`
	`66`	`+#### Syntax`
	`67`	`+`
	`68`	+`res = ` [[stdlib_intrinsics(module):stdlib_sum_kahan(interface)]] ` (x [,mask] )`
	`69`	`+`
	`70`	+`res = ` [[stdlib_intrinsics(module):stdlib_sum_kahan(interface)]] ` (x, dim [,mask] )`
	`71`	`+`
	`72`	`+#### Status`
	`73`	`+`
	`74`	`+Experimental`
	`75`	`+`
	`76`	`+#### Class`
	`77`	`+`
	`78`	`+Pure function.`
	`79`	`+`
	`80`	`+#### Argument(s)`
	`81`	`+`
	`82`	+`x`: N-D array of either `real` or `complex` type. This argument is `intent(in)`.
	`83`	`+`
	`84`	+`dim` (optional): scalar of type `integer` with a value in the range from 1 to n, where n equals the rank of `x`.
	`85`	`+`
	`86`	+`mask` (optional): N-D array of `logical` values, with the same shape as `x`. This argument is `intent(in)`.
	`87`	`+`
	`88`	`+#### Output value or Result value`
	`89`	`+`
	`90`	+If `dim` is absent, the output is a scalar of the same type and kind as `x`. Otherwise, the output is an array of rank n-1, where n equals the rank of `x`, whose shape is that of `x` with dimension `dim` dropped.
	`91`	`+`
	`92`	`+#### Example`
	`93`	`+`
	`94`	+```fortran
	`95`	`+{!example/intrinsics/example_sum.f90!}`
	`96`	+```
	`97`	`+`
	`98`	`+<!-- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -->`
	`99`	+### `stdlib_dot_product` function
	`100`	`+`
	`101`	`+#### Description`
	`102`	`+`
	`103`	+The `stdlib_dot_product` function can replace the intrinsic `dot_product` for 1D `real`, `complex` or `integer` arrays. It follows a chunked implementation which maximizes vectorization potential and reduces the round-off error. This procedure is recommended for large arrays; for repeated products of smaller arrays, consider the intrinsic `dot_product`.
	`104`	`+`
	`105`	`+#### Syntax`
	`106`	`+`
	`107`	+`res = ` [[stdlib_intrinsics(module):stdlib_dot_product(interface)]] ` (x, y)`
	`108`	`+`
	`109`	`+#### Status`
	`110`	`+`
	`111`	`+Experimental`
	`112`	`+`
	`113`	`+#### Class`
	`114`	`+`
	`115`	`+Pure function.`
	`116`	`+`
	`117`	`+#### Argument(s)`
	`118`	`+`
	`119`	+`x`: 1D array of either `real`, `complex` or `integer` type. This argument is `intent(in)`.
	`120`	`+`
	`121`	+`y`: 1D array of the same type and kind as `x`. This argument is `intent(in)`.
	`122`	`+`
	`123`	`+#### Output value or Result value`
	`124`	`+`
	`125`	+The output is a scalar of the same `type` and `kind` as `x` and `y`.
	`126`	`+`
	`127`	`+<!-- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -->`
	`128`	+### `stdlib_dot_product_kahan` function
	`129`	`+`
	`130`	`+#### Description`
	`131`	`+`
	`132`	+The `stdlib_dot_product_kahan` function can replace the intrinsic `dot_product` for 1D `real` or `complex` arrays. It follows a chunked implementation which maximizes vectorization potential, complemented by the same `elemental` kernel based on [Kahan summation](https://doi.org/10.1145%2F363707.363723) that is used for `stdlib_sum_kahan` to reduce the round-off error.
	`133`	`+`
	`134`	`+#### Syntax`
	`135`	`+`
	`136`	+`res = ` [[stdlib_intrinsics(module):stdlib_dot_product_kahan(interface)]] ` (x, y)`
	`137`	`+`
	`138`	`+#### Status`
	`139`	`+`
	`140`	`+Experimental`
	`141`	`+`
	`142`	`+#### Class`
	`143`	`+`
	`144`	`+Pure function.`
	`145`	`+`
	`146`	`+#### Argument(s)`
	`147`	`+`
	`148`	+`x`: 1D array of either `real` or `complex` type. This argument is `intent(in)`.
	`149`	`+`
	`150`	+`y`: 1D array of the same type and kind as `x`. This argument is `intent(in)`.
	`151`	`+`
	`152`	`+#### Output value or Result value`
	`153`	`+`
	`154`	+The output is a scalar of the same type and kind as `x` and `y`.
	`155`	`+`
	`156`	+```fortran
	`157`	`+{!example/intrinsics/example_dot_product.f90!}`
	`158`	+```
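
The spec above describes a chunked summation combined with the `elemental` Kahan kernel, but the diff shown here does not include the source that wires the two together. The program below is a minimal sketch of one way this could work, not the code from this commit: the chunk size of 64, the program name `kahan_chunk_sketch`, and the variable names `ps`, `pc` and `r` are all assumptions made for illustration. Only the kernel body is taken from the spec, with `<kind>` specialised to single precision.

```fortran
program kahan_chunk_sketch
    ! Illustrative sketch only; names and chunk size are hypothetical.
    use, intrinsic :: iso_fortran_env, only: sp => real32
    implicit none

    integer, parameter :: n = 100000
    integer, parameter :: chunk = 64      ! assumed chunk size
    real(sp) :: x(n)
    real(sp) :: ps(chunk), pc(chunk)      ! per-lane partial sums and compensations
    real(sp) :: s, c
    integer :: i, r

    call random_number(x)

    ps = 0.0_sp
    pc = 0.0_sp

    ! Handle the leading remainder so the main loop sees whole chunks only.
    r = mod(n, chunk)
    call kahan_kernel(x(1:r), ps(1:r), pc(1:r))

    ! Each lane accumulates independently, which leaves the loop body
    ! free of cross-lane dependencies and therefore easy to vectorize.
    do i = r + 1, n, chunk
        call kahan_kernel(x(i:i+chunk-1), ps, pc)
    end do

    ! Reduce the per-lane partial sums with the same kernel.
    ! The per-lane compensations in pc are simply discarded here.
    s = 0.0_sp
    c = 0.0_sp
    do i = 1, chunk
        call kahan_kernel(ps(i), s, c)
    end do

    print *, 'intrinsic sum          :', sum(x)
    print *, 'chunked compensated sum:', s

contains

    elemental subroutine kahan_kernel(a, s, c)
        real(sp), intent(in)    :: a
        real(sp), intent(inout) :: s
        real(sp), intent(inout) :: c
        real(sp) :: t, y
        y = a - c          ! apply the correction carried over from the last step
        t = s + y          ! low-order bits of y may be lost in this addition
        c = (t - s) - y    ! capture the rounding error of that addition
        s = t
    end subroutine kahan_kernel

end program kahan_chunk_sketch
```

Because the kernel is `elemental`, the same subroutine serves both the array-valued call in the main loop and the scalar call in the final reduction. Note that compiler options which relax floating-point semantics (for example value-unsafe "fast math" modes) may algebraically simplify `(t - s) - y` to zero and remove the compensation.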

`‎example/CMakeLists.txt‎`

Lines changed: 1 addition & 0 deletions

Original file line number	Diff line number	Diff line change
`@@ -13,6 +13,7 @@ add_subdirectory(constants)`
`13`	`13`	`add_subdirectory(error)`
`14`	`14`	`add_subdirectory(hashmaps)`
`15`	`15`	`add_subdirectory(hash_procedures)`
	`16`	`+add_subdirectory(intrinsics)`
`16`	`17`	`add_subdirectory(io)`
`17`	`18`	`add_subdirectory(linalg)`
`18`	`19`	`add_subdirectory(logger)`

`‎example/intrinsics/CMakeLists.txt‎`

Lines changed: 2 additions & 0 deletions

Original file line number	Diff line number	Diff line change
`@@ -0,0 +1,2 @@`
	`1`	`+ADD_EXAMPLE(sum)`
	`2`	`+ADD_EXAMPLE(dot_product)`

`‎example/intrinsics/example_dot_product.f90‎`

Lines changed: 18 additions & 0 deletions

Original file line number	Diff line number	Diff line change
`@@ -0,0 +1,18 @@`
	`1`	`+program example_dot_product`
	`2`	`+ use stdlib_kinds, only: sp`
	`3`	`+ use stdlib_intrinsics, only: stdlib_dot_product, stdlib_dot_product_kahan`
	`4`	`+ implicit none`
	`5`	`+`
	`6`	`+ real(sp), allocatable :: x(:), y(:)`
	`7`	`+ real(sp) :: total_prod(3)`
	`8`	`+`
	`9`	`+ allocate( x(1000), y(1000) )`
	`10`	`+ call random_number(x)`
	`11`	`+ call random_number(y)`
	`12`	`+`
	`13`	`+ total_prod(1) = dot_product(x,y) !> compiler intrinsic`
	`14`	`+ total_prod(2) = stdlib_dot_product(x,y) !> chunked summation over inner product`
	`15`	`+ total_prod(3) = stdlib_dot_product_kahan(x,y) !> chunked kahan summation over inner product`
	`16`	`+ print *, total_prod(1:3)`
	`17`	`+`
	`18`	`+end program example_dot_product`

`‎example/intrinsics/example_sum.f90‎`

Lines changed: 17 additions & 0 deletions

Original file line number	Diff line number	Diff line change
`@@ -0,0 +1,17 @@`
	`1`	`+program example_sum`
	`2`	`+ use stdlib_kinds, only: sp`
	`3`	`+ use stdlib_intrinsics, only: stdlib_sum, stdlib_sum_kahan`
	`4`	`+ implicit none`
	`5`	`+`
	`6`	`+ real(sp), allocatable :: x(:)`
	`7`	`+ real(sp) :: total_sum(3)`
	`8`	`+`
	`9`	`+ allocate( x(1000) )`
	`10`	`+ call random_number(x)`
	`11`	`+`
	`12`	`+ total_sum(1) = sum(x) !> compiler intrinsic`
	`13`	`+ total_sum(2) = stdlib_sum(x) !> chunked summation`
	`14`	`+ total_sum(3) = stdlib_sum_kahan(x) !> chunked kahan summation`
	`15`	`+ print *, total_sum(1:3)`
	`16`	`+`
	`17`	`+end program example_sum`

`‎src/CMakeLists.txt‎`

Lines changed: 3 additions & 0 deletions

Original file line number	Diff line number	Diff line change
`@@ -17,6 +17,9 @@ set(fppFiles`
`17`	`17`	`stdlib_hash_64bit_fnv.fypp`
`18`	`18`	`stdlib_hash_64bit_pengy.fypp`
`19`	`19`	`stdlib_hash_64bit_spookyv2.fypp`
	`20`	`+ stdlib_intrinsics_dot_product.fypp`
	`21`	`+ stdlib_intrinsics_sum.fypp`
	`22`	`+ stdlib_intrinsics.fypp`
`20`	`23`	`stdlib_io.fypp`
`21`	`24`	`stdlib_io_npy.fypp`
`22`	`25`	`stdlib_io_npy_load.fypp`

`‎src/stdlib_constants.fypp‎`

Lines changed: 17 additions & 1 deletion

Original file line number	Diff line number	Diff line change
`@@ -1,9 +1,13 @@`
`1`	`1`	`#:include "common.fypp"`
`2`	`2`	`#:set KINDS = REAL_KINDS`
	`3`	`+#:set I_KINDS_TYPES = list(zip(INT_KINDS, INT_TYPES, INT_KINDS))`
	`4`	`+#:set R_KINDS_TYPES = list(zip(REAL_KINDS, REAL_TYPES, REAL_SUFFIX))`
	`5`	`+#:set C_KINDS_TYPES = list(zip(CMPLX_KINDS, CMPLX_TYPES, CMPLX_SUFFIX))`
	`6`	`+`
`3`	`7`	`module stdlib_constants`
`4`	`8`	`!! Constants`
`5`	`9`	`!! ([Specification](../page/specs/stdlib_constants.html))`
`6`		`- use stdlib_kinds, only: #{for k in KINDS[:-1]}#${k}$, #{endfor}#${KINDS[-1]}$`
	`10`	`+ use stdlib_kinds`
`7`	`11`	`use stdlib_codata, only: SPEED_OF_LIGHT_IN_VACUUM, &`
`8`	`12`	`VACUUM_ELECTRIC_PERMITTIVITY, &`
`9`	`13`	`VACUUM_MAG_PERMEABILITY, &`
`@@ -60,5 +64,17 @@ module stdlib_constants`
`60`	`64`	`real(dp), parameter, public :: u = ATOMIC_MASS_CONSTANT%value !! Atomic mass constant`
`61`	`65`
`62`	`66`	`! Additional constants if needed`
	`67`	`+ #:for k, t, s in I_KINDS_TYPES`
	`68`	`+ ${t}$, parameter, public :: zero_${s}$ = 0_${k}$`
	`69`	`+ ${t}$, parameter, public :: one_${s}$ = 1_${k}$`
	`70`	`+ #:endfor`
	`71`	`+ #:for k, t, s in R_KINDS_TYPES`
	`72`	`+ ${t}$, parameter, public :: zero_${s}$ = 0._${k}$`
	`73`	`+ ${t}$, parameter, public :: one_${s}$ = 1._${k}$`
	`74`	`+ #:endfor`
	`75`	`+ #:for k, t, s in C_KINDS_TYPES`
	`76`	`+ ${t}$, parameter, public :: zero_${s}$ = (0._${k}$,0._${k}$)`
	`77`	`+ ${t}$, parameter, public :: one_${s}$ = (1._${k}$,0._${k}$)`
	`78`	`+ #:endfor`
`63`	`79`
`64`	`80`	`end module stdlib_constants`
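
For readers unfamiliar with fypp, the three `#:for` loops added above generate one `zero_*` and one `one_*` parameter per kind. The module below sketches what the expansion might look like for a single integer, real and complex kind. It is an assumption-laden illustration, not preprocessor output: the suffixes (`int32`, `sp`, `csp`) depend on how `INT_KINDS`, `REAL_SUFFIX` and `CMPLX_SUFFIX` are defined in `common.fypp`, which is not part of this diff, and the kinds are taken from `iso_fortran_env` here rather than from `stdlib_kinds` so that the sketch compiles on its own. The module and program names are hypothetical.

```fortran
module constants_expansion_sketch
    ! Hypothetical hand-written expansion for one kind of each type.
    use, intrinsic :: iso_fortran_env, only: int32, sp => real32
    implicit none
    private

    ! From the I_KINDS_TYPES loop (suffix assumed equal to the kind name)
    integer(int32), parameter, public :: zero_int32 = 0_int32
    integer(int32), parameter, public :: one_int32  = 1_int32

    ! From the R_KINDS_TYPES loop
    real(sp), parameter, public :: zero_sp = 0._sp
    real(sp), parameter, public :: one_sp  = 1._sp

    ! From the C_KINDS_TYPES loop (the "csp" suffix is an assumption)
    complex(sp), parameter, public :: zero_csp = (0._sp, 0._sp)
    complex(sp), parameter, public :: one_csp  = (1._sp, 0._sp)
end module constants_expansion_sketch

program use_constants_sketch
    use constants_expansion_sketch, only: zero_sp, one_sp, one_csp
    implicit none
    ! Typed constants avoid implicit kind conversions in generic code.
    print *, zero_sp + one_sp
    print *, one_csp
end program use_constants_sketch
```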

0 commit comments