Commit 7b99d43

Merge pull request #915 from jalvesz/intrinsics
intrinsics module with alternative implementations
2 parents 6c2565d + 12612bc commit 7b99d43

13 files changed, +1022 -1 lines changed

‎doc/specs/stdlib_intrinsics.md‎

Lines changed: 158 additions & 0 deletions
@@ -0,0 +1,158 @@
---
title: intrinsics
---

# The `stdlib_intrinsics` module

[TOC]

## Introduction

The `stdlib_intrinsics` module provides alternative implementations of some well-known Fortran intrinsic functions. An alternative is offered where a faster and/or more accurate implementation exists and has proven of interest to the Fortran community.

<!-- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -->
### `stdlib_sum` function

#### Description

The `stdlib_sum` function can replace the intrinsic `sum` for `real`, `complex` or `integer` arrays. It follows a chunked implementation that maximizes vectorization potential and reduces round-off error. This procedure is recommended when summing large arrays (e.g., more than 2**10 elements); for repeated summation of smaller arrays, consider the intrinsic `sum`.
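
The chunking idea can be illustrated with a minimal sketch. This is not the stdlib implementation: the module name `chunked_sum_sketch`, the function `chunked_sum` and the chunk width of 64 are assumptions chosen for illustration. The array is accumulated into a small set of independent partial sums, which gives the compiler independent lanes to vectorize and keeps each accumulation chain shorter than one long running sum.

```fortran
module chunked_sum_sketch
    use, intrinsic :: iso_fortran_env, only: sp => real32
    implicit none
    private
    public :: chunked_sum

    integer, parameter :: chunk = 64   ! assumed chunk width, illustrative only

contains

    pure function chunked_sum(x) result(s)
        real(sp), intent(in) :: x(:)
        real(sp) :: s
        real(sp) :: partial(chunk)
        integer :: i, n, rem

        n = size(x)
        rem = mod(n, chunk)

        ! Seed the lanes with the leftover elements that do not fill a chunk.
        partial = 0.0_sp
        partial(1:rem) = x(1:rem)

        ! Add the whole chunks lane by lane; every lane is an
        ! independent accumulator, so the loop body has no carried
        ! dependency between lanes.
        do i = rem + 1, n, chunk
            partial = partial + x(i:i+chunk-1)
        end do

        ! Reduce the lanes to a single value.
        s = sum(partial)
    end function chunked_sum

end module chunked_sum_sketch
```

With this sketch, `chunked_sum(x)` can be called wherever `sum(x)` would be used on a rank-1 `real(sp)` array. The fixed per-call cost of setting up and reducing the lanes is one plausible reason why small arrays are better served by the intrinsic.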

#### Syntax

`res = ` [[stdlib_intrinsics(module):stdlib_sum(interface)]] ` (x [,mask] )`

`res = ` [[stdlib_intrinsics(module):stdlib_sum(interface)]] ` (x, dim [,mask] )`

#### Status

Experimental

#### Class

Pure function.

#### Argument(s)

`x`: N-D array of either `real`, `complex` or `integer` type. This argument is `intent(in)`.

`dim` (optional): scalar of type `integer` with a value in the range from 1 to n, where n equals the rank of `x`.

`mask` (optional): N-D array of `logical` values, with the same shape as `x`. This argument is `intent(in)`.

#### Output value or Result value

If `dim` is absent, the output is a scalar of the same type and kind as `x`. Otherwise, the output is an array of rank n-1, where n equals the rank of `x`, whose shape is that of `x` with dimension `dim` dropped.

<!-- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -->
### `stdlib_sum_kahan` function

#### Description

The `stdlib_sum_kahan` function can replace the intrinsic `sum` for `real` or `complex` arrays. It follows a chunked implementation that maximizes vectorization potential, complemented by an `elemental` kernel based on the [Kahan summation](https://doi.org/10.1145%2F363707.363723) strategy to reduce round-off error:

```fortran
elemental subroutine kahan_kernel_<kind>(a,s,c)
    type(<kind>), intent(in) :: a
    type(<kind>), intent(inout) :: s
    type(<kind>), intent(inout) :: c
    type(<kind>) :: t, y
    y = a - c
    t = s + y
    c = (t - s) - y
    s = t
end subroutine
```
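
As a concrete illustration, the kernel template above can be instantiated for `real(sp)` and driven by a plain sequential loop. This is a sketch under assumptions, not the stdlib code: the module name `kahan_sum_sketch` and the driver `kahan_sum_sp` are hypothetical, and the chunking that the library adds on top is left out so that the compensation step stays visible.

```fortran
module kahan_sum_sketch
    use, intrinsic :: iso_fortran_env, only: sp => real32
    implicit none
    private
    public :: kahan_kernel_sp, kahan_sum_sp

contains

    ! The kernel from the specification, written out for real(sp).
    elemental subroutine kahan_kernel_sp(a, s, c)
        real(sp), intent(in)    :: a   ! next term to add
        real(sp), intent(inout) :: s   ! running sum
        real(sp), intent(inout) :: c   ! running compensation
        real(sp) :: t, y
        y = a - c          ! apply the correction carried from earlier steps
        t = s + y          ! low-order bits of y may be lost here
        c = (t - s) - y    ! recover what was lost
        s = t
    end subroutine kahan_kernel_sp

    ! Hypothetical sequential driver (no chunking), for illustration only.
    pure function kahan_sum_sp(x) result(s)
        real(sp), intent(in) :: x(:)
        real(sp) :: s
        real(sp) :: c
        integer :: i
        s = 0.0_sp
        c = 0.0_sp
        do i = 1, size(x)
            call kahan_kernel_sp(x(i), s, c)
        end do
    end function kahan_sum_sp

end module kahan_sum_sketch
```

The compensation relies on `(t - s) - y` being evaluated exactly as written. Compiler options that permit floating-point reassociation can simplify that expression to zero and silently remove the benefit.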

#### Syntax

`res = ` [[stdlib_intrinsics(module):stdlib_sum_kahan(interface)]] ` (x [,mask] )`

`res = ` [[stdlib_intrinsics(module):stdlib_sum_kahan(interface)]] ` (x, dim [,mask] )`

#### Status

Experimental

#### Class

Pure function.

#### Argument(s)

`x`: N-D array of either `real` or `complex` type. This argument is `intent(in)`.

`dim` (optional): scalar of type `integer` with a value in the range from 1 to n, where n equals the rank of `x`.

`mask` (optional): N-D array of `logical` values, with the same shape as `x`. This argument is `intent(in)`.

#### Output value or Result value

If `dim` is absent, the output is a scalar of the same type and kind as `x`. Otherwise, the output is an array of rank n-1, where n equals the rank of `x`, whose shape is that of `x` with dimension `dim` dropped.

#### Example

```fortran
{!example/intrinsics/example_sum.f90!}
```

<!-- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -->
### `stdlib_dot_product` function

#### Description

The `stdlib_dot_product` function can replace the intrinsic `dot_product` for 1D `real`, `complex` or `integer` arrays. It follows a chunked implementation that maximizes vectorization potential and reduces round-off error. This procedure is recommended when processing large arrays; for repeated products of smaller arrays, consider the intrinsic `dot_product`.
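
A minimal sketch of the same chunking idea applied to an inner product of `real(dp)` vectors follows. The names `chunked_dot_sketch` and `chunked_dot`, the chunk width of 64, and the choice to use the shorter of the two lengths are all assumptions made for illustration; the library's actual code may differ.

```fortran
module chunked_dot_sketch
    use, intrinsic :: iso_fortran_env, only: dp => real64
    implicit none
    private
    public :: chunked_dot

    integer, parameter :: chunk = 64   ! assumed chunk width, illustrative only

contains

    pure function chunked_dot(x, y) result(p)
        real(dp), intent(in) :: x(:), y(:)
        real(dp) :: p
        real(dp) :: partial(chunk)
        integer :: i, n, rem

        ! Assumption for this sketch: use the common length of x and y.
        n = min(size(x), size(y))
        rem = mod(n, chunk)

        ! Seed the lanes with the products of the leftover elements.
        partial = 0.0_dp
        partial(1:rem) = x(1:rem) * y(1:rem)

        ! Accumulate element-wise products one whole chunk at a time.
        do i = rem + 1, n, chunk
            partial = partial + x(i:i+chunk-1) * y(i:i+chunk-1)
        end do

        ! Reduce the lanes to the final scalar.
        p = sum(partial)
    end function chunked_dot

end module chunked_dot_sketch
```

The multiply and the add stay inside one array expression per chunk, which leaves the compiler free to emit fused or vectorized instructions for each lane.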

#### Syntax

`res = ` [[stdlib_intrinsics(module):stdlib_dot_product(interface)]] ` (x, y)`

#### Status

Experimental

#### Class

Pure function.

#### Argument(s)

`x`: 1D array of either `real`, `complex` or `integer` type. This argument is `intent(in)`.

`y`: 1D array of the same type and kind as `x`. This argument is `intent(in)`.

#### Output value or Result value

The output is a scalar of the same type and kind as `x` and `y`.

<!-- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -->
### `stdlib_dot_product_kahan` function

#### Description

The `stdlib_dot_product_kahan` function can replace the intrinsic `dot_product` for 1D `real` or `complex` arrays. It follows a chunked implementation that maximizes vectorization potential, complemented by the same `elemental` kernel based on [Kahan summation](https://doi.org/10.1145%2F363707.363723) that is used for `stdlib_sum_kahan` to reduce round-off error.

#### Syntax

`res = ` [[stdlib_intrinsics(module):stdlib_dot_product_kahan(interface)]] ` (x, y)`

#### Status

Experimental

#### Class

Pure function.

#### Argument(s)

`x`: 1D array of either `real` or `complex` type. This argument is `intent(in)`.

`y`: 1D array of the same type and kind as `x`. This argument is `intent(in)`.

#### Output value or Result value

The output is a scalar of the same type and kind as `x` and `y`.

```fortran
{!example/intrinsics/example_dot_product.f90!}
```

‎example/CMakeLists.txt‎

Lines changed: 1 addition & 0 deletions
@@ -13,6 +13,7 @@ add_subdirectory(constants)
 add_subdirectory(error)
 add_subdirectory(hashmaps)
 add_subdirectory(hash_procedures)
+add_subdirectory(intrinsics)
 add_subdirectory(io)
 add_subdirectory(linalg)
 add_subdirectory(logger)

‎example/intrinsics/CMakeLists.txt‎

Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
ADD_EXAMPLE(sum)
ADD_EXAMPLE(dot_product)

example/intrinsics/example_dot_product.f90

Lines changed: 18 additions & 0 deletions
@@ -0,0 +1,18 @@
program example_dot_product
    use stdlib_kinds, only: sp
    use stdlib_intrinsics, only: stdlib_dot_product, stdlib_dot_product_kahan
    implicit none

    real(sp), allocatable :: x(:), y(:)
    real(sp) :: total_prod(3)

    allocate( x(1000), y(1000) )
    call random_number(x)
    call random_number(y)

    total_prod(1) = dot_product(x,y) !> compiler intrinsic
    total_prod(2) = stdlib_dot_product(x,y) !> chunked summation over inner product
    total_prod(3) = stdlib_dot_product_kahan(x,y) !> chunked kahan summation over inner product
    print *, total_prod(1:3)

end program example_dot_product

‎example/intrinsics/example_sum.f90‎

Lines changed: 17 additions & 0 deletions
@@ -0,0 +1,17 @@
program example_sum
    use stdlib_kinds, only: sp
    use stdlib_intrinsics, only: stdlib_sum, stdlib_sum_kahan
    implicit none

    real(sp), allocatable :: x(:)
    real(sp) :: total_sum(3)

    allocate( x(1000) )
    call random_number(x)

    total_sum(1) = sum(x) !> compiler intrinsic
    total_sum(2) = stdlib_sum(x) !> chunked summation
    total_sum(3) = stdlib_sum_kahan(x) !> chunked kahan summation
    print *, total_sum(1:3)

end program example_sum

‎src/CMakeLists.txt‎

Lines changed: 3 additions & 0 deletions
@@ -17,6 +17,9 @@ set(fppFiles
     stdlib_hash_64bit_fnv.fypp
     stdlib_hash_64bit_pengy.fypp
     stdlib_hash_64bit_spookyv2.fypp
+    stdlib_intrinsics_dot_product.fypp
+    stdlib_intrinsics_sum.fypp
+    stdlib_intrinsics.fypp
     stdlib_io.fypp
     stdlib_io_npy.fypp
     stdlib_io_npy_load.fypp

‎src/stdlib_constants.fypp‎

Lines changed: 17 additions & 1 deletion
@@ -1,9 +1,13 @@
 #:include "common.fypp"
 #:set KINDS = REAL_KINDS
+#:set I_KINDS_TYPES = list(zip(INT_KINDS, INT_TYPES, INT_KINDS))
+#:set R_KINDS_TYPES = list(zip(REAL_KINDS, REAL_TYPES, REAL_SUFFIX))
+#:set C_KINDS_TYPES = list(zip(CMPLX_KINDS, CMPLX_TYPES, CMPLX_SUFFIX))
+
 module stdlib_constants
     !! Constants
     !! ([Specification](../page/specs/stdlib_constants.html))
-    use stdlib_kinds, only: #{for k in KINDS[:-1]}#${k}$, #{endfor}#${KINDS[-1]}$
+    use stdlib_kinds
     use stdlib_codata, only: SPEED_OF_LIGHT_IN_VACUUM, &
                              VACUUM_ELECTRIC_PERMITTIVITY, &
                              VACUUM_MAG_PERMEABILITY, &
@@ -60,5 +64,17 @@ module stdlib_constants
     real(dp), parameter, public :: u = ATOMIC_MASS_CONSTANT%value !! Atomic mass constant
 
     ! Additional constants if needed
+    #:for k, t, s in I_KINDS_TYPES
+    ${t}$, parameter, public :: zero_${s}$ = 0_${k}$
+    ${t}$, parameter, public :: one_${s}$ = 1_${k}$
+    #:endfor
+    #:for k, t, s in R_KINDS_TYPES
+    ${t}$, parameter, public :: zero_${s}$ = 0._${k}$
+    ${t}$, parameter, public :: one_${s}$ = 1._${k}$
+    #:endfor
+    #:for k, t, s in C_KINDS_TYPES
+    ${t}$, parameter, public :: zero_${s}$ = (0._${k}$,0._${k}$)
+    ${t}$, parameter, public :: one_${s}$ = (1._${k}$,0._${k}$)
+    #:endfor
 
 end module stdlib_constants
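
To make the added fypp loops concrete, the sketch below shows what they might expand to for a few kinds. It assumes the kind lists contain `int32`, `sp` and `dp`, that the real suffixes equal the kind names, and that the complex suffixes are `csp` and `cdp`; the actual expansion depends on the lists defined in `common.fypp`. The module name `constants_expansion_sketch` is hypothetical, and the kinds are taken from `iso_fortran_env` instead of `stdlib_kinds` so that the block compiles on its own.

```fortran
module constants_expansion_sketch
    use, intrinsic :: iso_fortran_env, only: int32, sp => real32, dp => real64
    implicit none
    private

    ! I_KINDS_TYPES loop: the suffix is the integer kind name itself.
    integer(int32), parameter, public :: zero_int32 = 0_int32
    integer(int32), parameter, public :: one_int32  = 1_int32

    ! R_KINDS_TYPES loop (suffixes assumed to be "sp" and "dp").
    real(sp), parameter, public :: zero_sp = 0._sp
    real(sp), parameter, public :: one_sp  = 1._sp
    real(dp), parameter, public :: zero_dp = 0._dp
    real(dp), parameter, public :: one_dp  = 1._dp

    ! C_KINDS_TYPES loop (suffixes assumed to be "csp" and "cdp").
    complex(sp), parameter, public :: zero_csp = (0._sp, 0._sp)
    complex(sp), parameter, public :: one_csp  = (1._sp, 0._sp)
    complex(dp), parameter, public :: zero_cdp = (0._dp, 0._dp)
    complex(dp), parameter, public :: one_cdp  = (1._dp, 0._dp)

end module constants_expansion_sketch
```

Named constants of this form give generated code a kind-correct zero and one to initialize accumulators with, without repeating literal suffixes in every template.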
