This takes a list of numbers, generates every combination of size 4, and for each combination puts the leftover numbers into a second list. I want the difference between the average of each combination (the first column) and the average of its leftovers (the second column).
import itertools
#defines the array of numbers and the two columns
number = [53, 64, 68, 71, 77, 82, 85]
col_one = []
col_two = []
#creates an array that holds the first four
results = itertools.combinations(number,4)
for x in results:
    col_one.append(list(x))
#attempts to go through and remove those numbers in the first array
#and then add that array to col_two
for i in range(len(col_one)):
    holder = list(number)
    for j in range(4):
        holder.remove(col_one[i][j])
    col_two.append(holder)
col_one_average = []
col_two_average = []
for k in col_one:
    col_one_average.append(sum(k)/len(k))
for l in col_two:
    col_two_average.append(sum(l)/len(l))
dif = []
for i in range(len(col_one_average)):
    dif.append(col_one_average[i] - col_two_average[i])
print(dif)
So for example, if I have
a = [1,2,3]
and I want to split it into an array of size 2 and 1, I get
col_one[0] = [1,2]
and
col_two[0] = [3]
then
col_one[1] = [1,3]
and
col_two[1] = [2]
After I get all those, I find the average of col_one[0] - average of col_two[0].
I hope that makes sense. I'm trying to do this for a statistics class, so if there is a 'numpy-y' solution, I'd love to hear it.
2 Answers
import itertools
import numpy
number = [53, 64, 68, 71, 77, 82, 85]
results = itertools.combinations(number,4)
# convert the combination iterator into a numpy array
col_one = numpy.array(list(results))
# calculate average of col_one
col_one_average = numpy.mean(col_one, axis = 1).astype(int)
# I don't actually create col_two, as I never figured out a good way to do it
# But since I only need the sum, I figure that out by subtraction
col_two_average = (numpy.sum(number) - numpy.sum(col_one, axis = 1)) / 3
dif = col_one_average - col_two_average
print(dif)
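The subtraction trick above can be sketched a little more generally. This is only an illustration, not the answer's exact code: the names `n`, `combos`, `combo_sums` and `rest_sums` are made up here, the leftover size is derived from `len(numbers) - n` instead of the hard-coded 3, and the averages are kept as floats rather than truncated to integers.

```python
import itertools
import numpy as np

numbers = [53, 64, 68, 71, 77, 82, 85]
n = 4  # size of each combination (assumed parameter name)

# Each row of the 2-D array is one combination of size n
combos = np.array(list(itertools.combinations(numbers, n)))

# Sum of each combination, and sum of its leftovers by subtraction
combo_sums = combos.sum(axis=1)
rest_sums = sum(numbers) - combo_sums

# float() keeps true division even under Python 2
combo_means = combo_sums / float(n)
rest_means = rest_sums / float(len(numbers) - n)

diffs = combo_means - rest_means
print(diffs)
```

Because the leftovers are never materialised, this only works when the sum (or some other quantity obtainable by subtraction) is all that is needed from them.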
- Using np.fromiter(combinations( is far faster than np.array(list(combinations( (0.1 seconds vs 2 seconds, for instance), but it's also more complicated: numpy-discussion.10968.n7.nabble.com/… (endolith, commented Apr 14, 2013 at 17:05)
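One way the np.fromiter approach from the comment might look is sketched below. This is an assumption about what was meant, since the linked discussion is not reproduced here: np.fromiter builds a 1-D array from an iterable of scalars, so the tuples are flattened with itertools.chain.from_iterable and the result is reshaped afterwards. The timings quoted in the comment are not verified by this sketch.

```python
import itertools
import numpy as np

numbers = [53, 64, 68, 71, 77, 82, 85]
n = 4  # size of each combination (assumed parameter name)

# Flatten the stream of n-tuples into a single stream of scalars
flat = itertools.chain.from_iterable(itertools.combinations(numbers, n))

# Build a 1-D integer array directly from the iterator, then
# reshape it so that each row is one combination again
combos = np.fromiter(flat, dtype=int).reshape(-1, n)
print(combos.shape)
```

The extra flatten and reshape steps are the added complexity; in exchange, no intermediate list of tuples is built.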
Not using numpy or scipy, but there are several things that can be improved about your code:
- This is minor, but in your comments you call your lists arrays; in Python they're called lists.
- Variable names like col_one and col_two aren't very meaningful. Maybe you should call them combinations and rests or something like that.
- You should definitely refactor your code into functions.
- You often use index-based loops where it is not necessary. Where possible you should iterate by element, not by index.
- You're also often setting lists to the empty list and then appending to them in a loop. It is generally more pythonic and often faster to use list comprehensions for this.
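To make the last two points concrete, here is a small side-by-side sketch using the size-2 combinations of [1, 2, 3] from the question. The names `averages_loop` and `averages_comp` are invented for this illustration.

```python
col_one = [[1, 2], [1, 3], [2, 3]]  # size-2 combinations of [1, 2, 3]

# Index-based loop that appends to an initially empty list
averages_loop = []
for i in range(len(col_one)):
    averages_loop.append(sum(col_one[i]) / float(len(col_one[i])))

# Same result, iterating by element inside a list comprehension
averages_comp = [sum(row) / float(len(row)) for row in col_one]

print(averages_loop == averages_comp)
print(averages_comp)
```

Both versions produce the same list; the comprehension simply removes the index bookkeeping and the separate initialisation step.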
If I were to write the code, I'd write something like this:
import itertools

def average(lst):
    """Returns the average of a list or other iterable"""
    return sum(lst)/len(lst)

def list_difference(lst1, lst2):
    """Returns the difference between two iterables, i.e. a list containing all
    elements of lst1 that are not in lst2"""
    result = list(lst1)
    for x in lst2:
        result.remove(x)
    return result

def differences(numbers, n):
    """Returns a list containing the difference between a combination and the remaining
    elements of the list for all combinations of size n of the given list"""
    # Build lists containing the combinations of size n and the rests
    combinations = list(itertools.combinations(numbers, n))
    rests = [list_difference(numbers, row) for row in combinations]
    # Create a lists of averages of the combinations and the rests
    combination_averages = [average(k) for k in combinations]
    rest_averages = [average(k) for k in rests]
    # Create a list containing the differences between the averages
    # using zip to iterate both lists in parallel
    diffs = [avg1 - avg2 for avg1, avg2 in zip(combination_averages, rest_averages)]
    return diffs

print(differences([53, 64, 68, 71, 77, 82, 85], 4))
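As a quick sanity check, the same idea can be run against the small a = [1,2,3] example from the question. The function below is a condensed restatement written for this check, not the code above verbatim; the name `mean_differences` is hypothetical, and float division is used so the averages are not truncated under Python 2.

```python
import itertools

def mean_differences(numbers, n):
    """Mean of each size-n combination minus the mean of its leftovers."""
    diffs = []
    for combo in itertools.combinations(numbers, n):
        rest = list(numbers)
        for x in combo:
            rest.remove(x)  # drop one occurrence per chosen element
        diffs.append(sum(combo) / float(n) - sum(rest) / float(len(rest)))
    return diffs

result = mean_differences([1, 2, 3], 2)
print(result)  # [-1.5, 0.0, 1.5]
```

The three values correspond to the splits [1,2] / [3], [1,3] / [2] and [2,3] / [1].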