Numerical integration with Numba: making four nested for loops even faster
I'm a bit new to working with Numba, but I've got the gist of it. I wonder if there are any more advanced tricks to make four nested for
loops even faster than what I have now. In particular, I need to calculate the following integral:
$$ G_B(\mathbf X, T) = \Lambda \int_\Omega G(\mathbf X, \mathbf X', T) W(\mathbf X', T) \ d\mathbf X' \\ G(\mathbf X, \mathbf X', T) = \frac{1}{2\pi S_0^2} \exp\left[-\frac{\left| \mathbf X -\mathbf X'\right|^2}{2[S_0(1+EB(\mathbf X, T))]^2}\right] $$
where \$B\$ is a 2D array, and \$S_0\$ and \$E\$ are certain parameters. My code is the following:
    import numpy as np
    from numba import njit, double

    def calc_gb_gauss_2d(b,s0,e,dx):
        n,m=b.shape
        norm = 1.0/(2*np.pi*s0**2)
        gb = np.zeros((n,m))
        for i in range(n):
            for j in range(m):
                sigma = 2.0*(s0*(1.0+e*b[i,j]))**2
                for ii in range(n):
                    for jj in range(m):
                        gb[i,j]+=np.exp(-(((i-ii)*dx)**2+((j-jj)*dx)**2)/sigma)
                gb[i,j]*=norm
        return gb

    calc_gb_gauss_2d_nb = njit(double[:, :](double[:, :],double,double,double))(calc_gb_gauss_2d)
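To cross-check the compiled function on small inputs, the same discretized sum can be sketched in plain NumPy. This is only a sketch: the name `calc_gb_gauss_2d_ref` is hypothetical, and it assumes the summand factorizes over the two axes, which holds here because `sigma` depends only on `(i, j)`.

```python
import numpy as np

def calc_gb_gauss_2d_ref(b, s0, e, dx):
    """Hypothetical NumPy reference for the four-loop sum (small inputs only)."""
    n, m = b.shape
    norm = 1.0 / (2 * np.pi * s0**2)
    # Squared grid offsets along each axis, e.g. di2[i, ii] = ((i - ii) * dx)**2
    di2 = ((np.arange(n)[:, None] - np.arange(n)[None, :]) * dx) ** 2
    dj2 = ((np.arange(m)[:, None] - np.arange(m)[None, :]) * dx) ** 2
    # Same per-point denominator as in the loop version, shape (n, m)
    sigma = 2.0 * (s0 * (1.0 + e * b)) ** 2
    # exp(-(p + q) / s) == exp(-p / s) * exp(-q / s), so the double sum over
    # (ii, jj) is the product of one sum over ii and one sum over jj.
    sum_ii = np.exp(-di2[:, None, :] / sigma[:, :, None]).sum(axis=2)
    sum_jj = np.exp(-dj2[None, :, :] / sigma[:, :, None]).sum(axis=2)
    return norm * sum_ii * sum_jj
```

The intermediate arrays have shapes `(n, m, n)` and `(n, m, m)`, so this is meant for validation on small grids; results should be compared with `np.allclose` rather than exact equality, since the factorized form rounds differently.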
For an input array of size 256x256 the calculation speed is:
    In [4]: a=random.random((256,256))
    In [5]: %timeit calc_gb_gauss_2d_nb(a,0.1,1.0,0.5)
    The slowest run took 8.46 times longer than the fastest. This could mean that an intermediate result is being cached.
    1 loop, best of 3: 1min 1s per loop
A comparison between pure Python and Numba calculation speed gives me this picture:
[Figure: Comparative performance plot]
Is there any way to optimize my code for better performance?