I'm a bit new to working with Numba, but I've got the gist of it. I wonder if there are any more advanced tricks to make four nested for loops even faster than what I have now. In particular, I need to calculate the following integral:
$$ G_B(\mathbf X, T) = \Lambda \int_\Omega G(\mathbf X, \mathbf X', T) W(\mathbf X', T) \ d\mathbf X' \\ G(\mathbf X, \mathbf X', T) = \frac{1}{2\pi S_0^2} \exp\left[-\frac{\left| \mathbf X -\mathbf X'\right|^2}{2[S_0(1+EB(\mathbf X, T))]^2}\right] $$
Where \$B\$ is a 2D array, and \$S_0\$ and \$E\$ are certain parameters. My code is the following:
```python
import numpy as np
from numba import njit, double

def calc_gb_gauss_2d(b, s0, e, dx):
    n, m = b.shape
    norm = 1.0/(2*np.pi*s0**2)
    gb = np.zeros((n, m))
    for i in range(n):
        for j in range(m):
            # Gaussian width at this point, broadened by the local field b[i, j]
            sigma = 2.0*(s0*(1.0 + e*b[i, j]))**2
            for ii in range(n):
                for jj in range(m):
                    gb[i, j] += np.exp(-(((i - ii)*dx)**2 + ((j - jj)*dx)**2)/sigma)
            gb[i, j] *= norm
    return gb

calc_gb_gauss_2d_nb = njit(double[:, :](double[:, :], double, double, double))(calc_gb_gauss_2d)
```
For an input array of size 256×256 the calculation speed is:
```python
In [4]: a = np.random.random((256, 256))

In [5]: %timeit calc_gb_gauss_2d_nb(a, 0.1, 1.0, 0.5)
The slowest run took 8.46 times longer than the fastest. This could mean that an intermediate result is being cached.
1 loop, best of 3: 1min 1s per loop
```
Comparison between pure Python and Numba calculation speed gives me this picture:

[Comparative performance plot]
Is there any way to optimize my code for better performance?
1 Answer
When using `njit` as a decorator you have access to multiple options that you might want to try, such as:

- `fastmath`
- `parallel`
- `nogil`

I will only focus on `fastmath` and `parallel`. Someone more competent will have to deal with the `nogil` argument.
Fastmath

This option lets you relax some floating-point operations (trading a little precision) for shorter computation time. It can simply be set to `True`/`False`, but you can also be more selective about which unsafe optimizations you want to enable.
Parallel

This doesn't give you the freedom of conventional parallel frameworks such as OpenMP/MPI, but it supports the `prange` directive. You have to explicitly choose the loop you want to parallelize. You might also need to modify the code slightly - for example, introducing a reduction variable that is "private" to each thread/process.
In addition to that, you should not only measure times, but also test your integration scheme on cases where you know the exact solution, or at least a good approximation of it.
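For instance, with \$B \equiv 0\$ the kernel reduces to a fixed normalized Gaussian, so the Riemann sum multiplied by the quadrature weight `dx**2` should approach 1 at the grid centre when the domain is much wider than \$S_0\$. A plain-NumPy sketch of such a check (function name and parameters are mine):

```python
import numpy as np

def gb_centre_b0(n, dx, s0):
    # Discretized blur at the grid centre for B == 0:
    # norm * sum over all grid points of the plain Gaussian kernel.
    x = (np.arange(n) - n//2)*dx
    xx, yy = np.meshgrid(x, x, indexing="ij")
    norm = 1.0/(2*np.pi*s0**2)
    return norm*np.exp(-(xx**2 + yy**2)/(2.0*s0**2)).sum()

# Multiplied by dx**2 this approximates the integral of a normalized
# 2-D Gaussian over the plane, which is exactly 1.
val = gb_centre_b0(101, 0.05, 0.5)*0.05**2
```

Here the grid spans \$\pm 2.5 = 5 S_0\$, so the truncation error is negligible and `val` should agree with 1 to several digits.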