I need to calculate a Bayesian likelihood in its negative logarithmic form:
$$-\ln L = N\ln M -\sum\limits_{i=1}^{N} \ln\left\{P_i \sum\limits_{j=1}^M \frac{\exp\left[ -\frac{1}{2}\left(\sum_{k=1}^D \frac{(f_{ik}-g_{jk})^2}{\sigma_{ik}^2 + \rho_{jk}^2}\right) \right]}{\prod_{k=1}^D \sqrt{\sigma_{ik}^2 + \rho_{jk}^2}} \right\}$$
where $P_i$, $f_{ik}$, $\sigma_{ik}^2$, $g_{jk}$, and $\rho_{jk}^2$ are all known floats. This is my code:
import numpy as np

# Define random data with correct shapes.
N, M = np.random.randint(10, 1000), np.random.randint(10, 1000)
D = np.random.randint(2, 5)
f_N, g_M = np.random.rand(N, D, 2), np.random.rand(M, D, 2)
P_i = np.random.uniform(0., 1., N)

# Sum over N: store each 'i' element of the sum over M here.
sum_Mi = []
for f_i in f_N:
    # Sum over M.
    sum_M_j = 0.
    for g_j in g_M:
        # Obtain the ijk element inside the sum over M.
        D_sum, sigma_ijk = 0., 1.
        for k, f_ik in enumerate(f_i):
            # Where: f_ik[0] = f_{ik}        ; g_j[k][0] = g_{jk}
            # and:   f_ik[1] = \sigma_{ik}^2 ; g_j[k][1] = \rho_{jk}^2
            D_sum += np.square(f_ik[0] - g_j[k][0]) / (f_ik[1] + g_j[k][1])
            sigma_ijk = sigma_ijk * (f_ik[1] + g_j[k][1])
        sum_M_j += np.exp(-0.5 * D_sum) / np.sqrt(sigma_ijk)
    sum_Mi.append(sum_M_j)

sum_Mi_Pi = P_i * np.array(sum_Mi)
# Remove 0. elements before applying the logarithm below.
sum_N = sum_Mi_Pi[sum_Mi_Pi != 0.]

# Final negative log-likelihood.
lik = len(f_N) * np.log(len(g_M)) - sum(np.log(sum_N))
print(lik)
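A minimal way to time a candidate implementation with timeit (a sketch; the name negloglik_loops is hypothetical, and its body is just the loop code above packaged as a function):

import timeit

def negloglik_loops(f_N, g_M, P_i):
    # Same computation as above, wrapped so it can be run repeatedly.
    sum_Mi = []
    for f_i in f_N:
        sum_M_j = 0.
        for g_j in g_M:
            D_sum, sigma_ijk = 0., 1.
            for k, f_ik in enumerate(f_i):
                D_sum += np.square(f_ik[0] - g_j[k][0]) / (f_ik[1] + g_j[k][1])
                sigma_ijk *= f_ik[1] + g_j[k][1]
            sum_M_j += np.exp(-0.5 * D_sum) / np.sqrt(sigma_ijk)
        sum_Mi.append(sum_M_j)
    sum_Mi_Pi = P_i * np.array(sum_Mi)
    sum_N = sum_Mi_Pi[sum_Mi_Pi != 0.]
    return len(f_N) * np.log(len(g_M)) - np.log(sum_N).sum()

# Absolute times are machine-dependent; only relative comparisons matter.
n_runs = 3
t = timeit.timeit(lambda: negloglik_loops(f_N, g_M, P_i), number=n_runs)
print(t / n_runs, 's per call')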
I'm after an optimization at the code level, i.e. not using multi-threading. I've tried NumPy broadcasting on the innermost loop:
# Obtain ijk element inside the M summatory
D_sum = sum(np.square((f_i[:, 0] - g_j[:, 0])) / (f_i[:, 1] + g_j[:, 1]))
sigma_ijk = np.prod(f_i[:, 1] + g_j[:, 1])
but that actually takes longer to complete than the explicit for loop!
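In context, that replacement sits inside the otherwise unchanged i and j loops (a sketch of the attempt):

sum_Mi = []
for f_i in f_N:
    sum_M_j = 0.
    for g_j in g_M:
        # Broadcast over k; note that the builtin sum() over a small numpy
        # array adds Python-level overhead on every (i, j) pair.
        D_sum = sum(np.square(f_i[:, 0] - g_j[:, 0]) / (f_i[:, 1] + g_j[:, 1]))
        sigma_ijk = np.prod(f_i[:, 1] + g_j[:, 1])
        sum_M_j += np.exp(-0.5 * D_sum) / np.sqrt(sigma_ijk)
    sum_Mi.append(sum_M_j)

Because D is tiny here (2 to 4), the per-pair cost of building temporary arrays outweighs the loop it removes, which is the likely cause of the slowdown.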
I also tried the logsumexp function from scipy, combined with the broadcast arrays:
from scipy.special import logsumexp

sum_N = 0.
# Sum over N: accumulate each 'i' contribution here.
for f_i in f_N:
    # Sum over M: collect exponents (A) and scale factors (B) first.
    A, B = [], []
    for g_j in g_M:
        # Obtain the ijk element inside the sum over M.
        A.append(sum(
            np.square(f_i[:, 0] - g_j[:, 0]) / (f_i[:, 1] + g_j[:, 1])))
        B.append(np.prod(f_i[:, 1] + g_j[:, 1]))
    sum_N += logsumexp(-.5 * np.array(A), b=1. / np.sqrt(B))

# Final negative log-likelihood. Note: this assumes no P_i is exactly 0.
# (true for the uniform draws above); otherwise the matching logsumexp
# terms in sum_N would have to be dropped as well.
lik = len(f_N) * np.log(len(g_M)) - sum(np.log(P_i[P_i != 0.])) - sum_N
print(lik)
This resulted in a marginal improvement over just using broadcasting, but still slower than the original code.
1 Answer
This is much faster. I added a seed so that results can be compared across runs.
import numpy as np
# Define random data with correct shapes.
np.random.seed(0)
N, M = np.random.randint(10, 1000), np.random.randint(10, 1000)
D = np.random.randint(2, 5)
f_N, g_M = np.random.rand(N, D, 2), np.random.rand(M, D, 2)
P_i = np.random.uniform(0., 1., N)
# Replace the loops with broadcast array operations, followed by sums.
fgsum = f_N[:, None, :, 1] + g_M[None, :, :, 1]  # sigma^2 + rho^2, shape (N, M, D)
fgdif = f_N[:, None, :, 0] - g_M[None, :, :, 0]  # f - g, shape (N, M, D)
print(fgsum.shape)  # e.g. (694, 569, 3)
Dsum = (fgdif**2 / fgsum).sum(axis=-1)      # inner sum over k, shape (N, M)
sigma_ijk = np.prod(fgsum, axis=-1)         # product over k, shape (N, M)
sum_M_j = np.exp(-0.5 * Dsum) / np.sqrt(sigma_ijk)
sum_Mi = np.sum(sum_M_j, axis=-1)           # sum over j, shape (N,)
# as before
sum_Mi_Pi = P_i * sum_Mi
# Remove 0. elements before applying the logarithm below.
sum_N = sum_Mi_Pi[sum_Mi_Pi != 0.]
# Final negative logarithmic likelihood
lik = len(f_N) * np.log(len(g_M)) - sum(np.log(sum_N))
print(lik)
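The same broadcast arrays also combine with the logsumexp idea from the question, which avoids the exp underflowing to 0 when the exponents get very negative; a sketch (the names log_inner, mask, and lik_stable are mine):

from scipy.special import logsumexp

# Stable log of the inner sum over j, for each i:
#   log sum_j exp(-0.5 * Dsum_ij) / sqrt(sigma_ijk_ij)
log_inner = logsumexp(-0.5 * Dsum, b=1. / np.sqrt(sigma_ijk), axis=-1)

# P_i enters through its logarithm; drop zero weights first, as before.
mask = P_i != 0.
lik_stable = len(f_N) * np.log(len(g_M)) - np.sum(np.log(P_i[mask]) + log_inner[mask])
print(lik_stable)  # matches lik up to floating-point error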
- Excellent! This code is between 80 and 100 times faster on my machine, thank you very much hpaulj! (Gabriel, Mar 17, 2017)
- You can get another few percent faster by replacing sum(np.log(sum_N)) with np.log(sum_N).sum() or np.sum(np.log(sum_N)). (Graipher, Mar 18, 2017)
- Okay... nice that this is faster, but... WHY??? (Vogel612, Mar 18, 2017)
- Because it replaces Python loops with compiled numpy code. Sorry, I normally answer on SO where this kind of numpy 'vectorization' is an everyday question. (hpaulj, Mar 18, 2017)
- More about increasing performance with numpy: codereview.stackexchange.com/questions/17702/… (hpaulj, Mar 18, 2017)
- D is so small compared to N and M, e.g. 3 vs. 500 and 500; it's just a few loops of a costly expression.
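To illustrate that last comment: since D is tiny, you can also broadcast over i and j while keeping an explicit Python loop over k, so each iteration is one large (N, M) vectorized operation and the (N, M, D) intermediates are never materialized. A sketch, using the same arrays as in the answer (the names Dsum_k, sigma_k, and sum_Mi_k are hypothetical):

# Accumulate over the small D axis with a Python loop; the loop overhead
# is negligible because each pass does one big (N, M) array operation.
Dsum_k = np.zeros((len(f_N), len(g_M)))
sigma_k = np.ones((len(f_N), len(g_M)))
for k in range(f_N.shape[1]):
    s = f_N[:, None, k, 1] + g_M[None, :, k, 1]  # (N, M) variance sums
    d = f_N[:, None, k, 0] - g_M[None, :, k, 0]  # (N, M) differences
    Dsum_k += d**2 / s
    sigma_k *= s

sum_Mi_k = np.sum(np.exp(-0.5 * Dsum_k) / np.sqrt(sigma_k), axis=-1)
# Should match sum_Mi from the answer up to floating-point error.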