I don't find an implementation in the library I work with of covariance function:
I give a try at implementing it in Python:
import numpy as np
def covariance_dirichlet(X,alpha):
cov_list = []
sigma = np.sum(alpha)
for i in range(0,len(X)):
for j in range(i+1,len(X)):
if i == j:
iter_val_num = np.multiply(X[i],sigma)-np.power(X[i],2)#**2
else:
iter_val_num = -np.multiply(X[i],X[j])#a*b
iter_res = iter_val_num/(sigma**2)*(sigma+1)
cov_list.append(iter_res)
return np.reshape(np.array(cov_list),(-1,len(X)))
print(covariance_dirichlet([[0.2, 0.2, 0.6,0.8],[0.2, 0.2, 0.6,0.8]],[0.4, 5, 15,11]))
Does this look right? Can it be simplified?
-
\$\begingroup\$ This doesn't run. You've passed the wrong number of arguments to your function. I think you've probably mis-placed a bracket for your first argument. \$\endgroup\$Reinderien– Reinderien2023年12月19日 17:03:39 +00:00Commented Dec 19, 2023 at 17:03
-
\$\begingroup\$ Small bug in the method call, now it is ok \$\endgroup\$SSSOF– SSSOF2023年12月19日 17:36:20 +00:00Commented Dec 19, 2023 at 17:36
-
\$\begingroup\$ Notice that, whereas you pass chi into your function, the equation doesn't actually refer to chi at all. It is only based on alpha. \$\endgroup\$Reinderien– Reinderien2023年12月19日 17:48:51 +00:00Commented Dec 19, 2023 at 17:48
-
\$\begingroup\$ which chi you talk about, in the params there are X and alpha \$\endgroup\$SSSOF– SSSOF2023年12月19日 18:03:20 +00:00Commented Dec 19, 2023 at 18:03
-
\$\begingroup\$ The description has a stylised X which looked like they meant chi. \$\endgroup\$Reinderien– Reinderien2023年12月19日 19:14:13 +00:00Commented Dec 19, 2023 at 19:14
1 Answer 1
Does this look right?
No, for a few reasons. Primarily, X doesn't actually appear in the definition for the equation, so it doesn't make any sense to pass it into your function. Also, *(sigma+1)
is on the wrong side of the quotient.
Can it be simplified?
Yes. Perform fundamental Numpy vectorisation.
import numpy as np
def covariance_dirichlet_slow(alpha: np.ndarray) -> np.ndarray:
sigma = alpha.sum()
n = alpha.size
cov = np.empty((n, n))
for j in range(n):
for k in range(n):
if j == k:
iter_val_num = alpha[j]*sigma - alpha[j]**2
else:
iter_val_num = -alpha[j]*alpha[k]
iter_res = iter_val_num / sigma**2 / (1 + sigma)
cov[j, k] = iter_res
return cov
def covariance_dirichlet_fast(alpha: np.ndarray) -> np.ndarray:
sigma = alpha.sum()
alpha_coef = alpha/(sigma**2 * (1 + sigma))
diag = np.diag(alpha_coef*sigma)
square = np.outer(alpha_coef, alpha)
return diag - square
def test() -> None:
alpha = np.array((0.4, 5, 15, 11))
expected = np.array([
[ 3.88165899e-04, -6.26074030e-05, -1.87822209e-04, -1.37736287e-04],
[-6.26074030e-05, 4.13208860e-03, -2.34777761e-03, -1.72170358e-03],
[-1.87822209e-04, -2.34777761e-03, 7.70071057e-03, -5.16511075e-03],
[-1.37736287e-04, -1.72170358e-03, -5.16511075e-03, 7.02455062e-03],
])
actual_slow = covariance_dirichlet_slow(alpha)
actual_fast = covariance_dirichlet_fast(alpha)
assert np.allclose(expected, actual_slow, atol=0, rtol=1e-8)
assert np.allclose(expected, actual_fast, atol=0, rtol=1e-8)
if __name__ == '__main__':
test()
-
\$\begingroup\$ could you explain the use of np.diag and np.outer in your fast version? also where are the two random variables on which you are applying the function? because covariance is about measuring the similarity between two vectors \$\endgroup\$SSSOF– SSSOF2023年12月19日 21:17:34 +00:00Commented Dec 19, 2023 at 21:17
-
\$\begingroup\$ also where should the input of the dirichlet function follow simplex ? I don't see alpha as a simplex distribution \$\endgroup\$SSSOF– SSSOF2023年12月19日 21:43:45 +00:00Commented Dec 19, 2023 at 21:43
-
\$\begingroup\$
diag
andouter
are equivalent vectorised operations based on the way that the indexing works in the definition for the equation. \$\endgroup\$Reinderien– Reinderien2023年12月19日 21:59:08 +00:00Commented Dec 19, 2023 at 21:59 -
\$\begingroup\$ It seems like your source is statlect.com/probability-distributions/Dirichlet-distribution (or at least excerpted from it). Your answers are there. It has full definitions for the alpha parameter to the distribution. If you have difficulty understanding this, consider posting on math.stackexchange.com . \$\endgroup\$Reinderien– Reinderien2023年12月19日 22:00:39 +00:00Commented Dec 19, 2023 at 22:00