This question came from a real use case. I had a data frame in which each column contained data from a different data source, and I wanted to design a hypothesis test to check whether the data had the same mean. So I had to compute the Kolmogorov-Smirnov test for each pair of columns.
Now the problem can be generalized to any combinatorial task.
It follows that I had to enumerate every pair counted by the binomial coefficient
$$ \binom{n}{k} $$
where n is the number of columns and k = 2.
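As a quick sanity check on the size of the task, the number of unordered pairs can be computed with math.comb from the standard library (Python 3.8+). The column count of 5 below is only an illustrative assumption.

```python
import math

n_columns = 5  # assumed number of columns, for illustration only
k = 2          # each test compares two columns

n_pairs = math.comb(n_columns, k)  # n! / (k! * (n - k)!)
print(n_pairs)  # 10

# For k = 2 this is the same as n * (n - 1) / 2.
assert n_pairs == n_columns * (n_columns - 1) // 2
```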
My question is: is there a more efficient way to apply a function to permuted samples taken from a list? And how should this permutation be done? E.g., given a func and a list [a,b,c,d]:
func(a,b)
func(a,c)
func(a,d)
func(b,c)
func(b,d)
func(c,d)
I created an algorithm to solve this issue, but I am wondering if there is a better way to do that in Python.
In my algorithm, instead of computing the statistical test, I simply multiply each element n of an example array by every other element i of the same array, with n != i.
to_do=[1,2,3,4,5]
#done will store information on the elements already combined
done=[]
#j will store information on the results of the combination
j=[]
#iterating over the to_do array
for n in to_do:
    #checking if we already computed the n element
    if n not in done:
        print(n)
        #taking another i element from the array
        #where n!=i
        for i in to_do:
            print(i)
            #if the condition is satisfied
            if i!=n:
                #combine the two elements
                m=n*i
                #append the result on the "j" array
                j.append(m)
    #updating the array with the "done" elements
    done.append(n)
print(len(done))
print(len(j))
2 Answers
If you just want the length of the list, your code can be reduced to one line:
result = len(to_do) * (len(to_do) - 1)
Also, your comments are, at the very least, excessive. You should really only use comments to explain a design choice or a complicated equation/algorithm. Comments like #if the condition is satisfied just clutter your code.
As explained in a comment, actual computations need to be made. This can be done using itertools.permutations.
import itertools
from typing import List

def func(nums: List[int]) -> List[int]:
    return list(set(
        x * y for x, y in itertools.permutations(nums, 2)
    ))
While this doesn't avoid the extra computations, it's a short and sweet solution. And below is how you would use it:
to_do = [1, 2, 3, 4, 5]
func(to_do)
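To make the "extra computations" concrete, here is a small sketch that counts how many products are evaluated and how many distinct values survive the set. It reuses the to_do list from above; the names ordered_pairs and products are only illustrative.

```python
import itertools

to_do = [1, 2, 3, 4, 5]

# Every ordered pair (x, y) taken from two different positions in the list.
ordered_pairs = list(itertools.permutations(to_do, 2))
products = [x * y for x, y in ordered_pairs]

print(len(products))       # 20, i.e. len(to_do) * (len(to_do) - 1)
print(len(set(products)))  # 10, since x * y and y * x collapse in the set
```

Note that the set would also merge unrelated pairs that happen to give the same product (for example 1 * 6 and 2 * 3 in a list containing 1, 2, 3 and 6), so it is not a general way to discard mirrored pairs.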
- Thank you, the point about the comments is very useful. The output, as I commented above, is the product of each element of the list with every other one; this should be done for all the elements, except for multiplying the element i with itself. (Andrea Ciufo, Dec 28, 2020 at 9:45)
- @AndreaCiufo Please see my edited answer. (Ben A, Dec 28, 2020 at 12:51)
- What does -> List[int] mean? It's the first time I have seen it in a function signature. (Andrea Ciufo, Jan 2, 2021 at 18:12)
- @AndreaCiufo It means that the function returns a list of integers. Take a look at the Python docs regarding type hints. (Ben A, Jan 2, 2021 at 20:01)
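For readers who have not met annotations before, here is a minimal sketch of what such a signature says. The function name double_all is hypothetical and used only for illustration; the hints document intent and are not enforced at runtime.

```python
from typing import List


def double_all(nums: List[int]) -> List[int]:
    """Return a new list with every element doubled."""
    # "nums: List[int]" documents the expected argument type;
    # "-> List[int]" documents the type of the returned value.
    return [2 * n for n in nums]


print(double_all([1, 2, 3]))       # [2, 4, 6]
print(double_all.__annotations__)  # the hints are stored as plain metadata
```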
My question is: is there a more efficient way to apply a function to permuted samples taken from a list? And how should this permutation be done? E.g., given a func and a list [a,b,c,d]:
func(a,b)
func(a,c)
func(a,d)
func(b,c)
func(b,d)
func(c,d)
It seems that you are looking for combinations rather than permutations. In that case, use itertools.combinations:
from itertools import combinations

for x, y in combinations(['a', 'b', 'c', 'd'], 2):
    print(x, y)
It prints:
a b
a c
a d
b c
b d
c d
Using itertools.combinations in your example:
from itertools import combinations
todo = [1, 2, 3, 4, 5]
j = [x * y for x, y in combinations(todo, 2)]
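Going back to the original use case, the same pattern applies to any two-argument function. The sketch below is only an illustration: the columns dictionary, the apply_pairwise helper and the mean_difference function are all hypothetical stand-ins (a real two-sample test such as scipy.stats.ks_2samp could be passed in instead, assuming SciPy is available).

```python
from itertools import combinations
from statistics import mean


def mean_difference(a, b):
    """Hypothetical stand-in for a two-sample statistical test."""
    return mean(a) - mean(b)


def apply_pairwise(func, columns):
    """Apply func once to every unordered pair of named columns."""
    return {
        (name_a, name_b): func(columns[name_a], columns[name_b])
        for name_a, name_b in combinations(columns, 2)
    }


# Made-up data: each key plays the role of one data-frame column.
columns = {
    "a": [1, 2, 3],
    "b": [2, 3, 4],
    "c": [4, 5, 6],
}

results = apply_pairwise(mean_difference, columns)
for pair, value in results.items():
    print(pair, value)
```

Keying the results by the pair of column names keeps each value traceable to the columns it came from, which a flat list of results would not.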
- […] itertools.permutations(to_do, 2)?
- […] [ABCD] and I need only AB and not BA (or vice versa). I tried to fix this issue with the done array in my code.