Splitting an array of numbers into all possible combinations

Question 1

This takes an array of numbers and splits it every possible way into a combination of 4 numbers and a second array holding the leftovers. For each split, I want the difference between the average of the first column and the average of the second.

import itertools
#defines the array of numbers and the two columns
number = [53, 64, 68, 71, 77, 82, 85]
col_one = []
col_two = []
#creates an array that holds the first four
results = itertools.combinations(number,4)
for x in results:
    col_one.append(list(x))
#attempts to go through and remove those numbers in the first array
#and then add that array to col_two
for i in range(len(col_one)):
    holder = list(number)
    for j in range(4):
        holder.remove(col_one[i][j])
    col_two.append(holder)
col_one_average = []
col_two_average = []
for k in col_one:
    col_one_average.append(sum(k)/len(k))
for l in col_two:
    col_two_average.append(sum(l)/len(l))
dif = []
for i in range(len(col_one_average)):
    dif.append(col_one_average[i] - col_two_average[i])
print(dif)

So for example, if I have

a = [1,2,3]

and I want to split it into an array of size 2 and 1, I get

col_one[0] = [1,2]

and

col_two[0] = [3]

then

col_one[1] = [1,3]

and

col_two[1] = [2]

Once I have all of those, I compute the average of col_one[i] minus the average of col_two[i] for each i.

I hope that makes sense. I'm trying to do this for a statistics class, so if there is a 'numpy-y' solution, I'd love to hear it.
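To make the small example concrete, here is a minimal sketch of the splitting step on its own. It assumes Python 3, and the helper name split_pairs is hypothetical (it does not appear in the code above). It picks combinations of positions rather than values, so each element is assigned to exactly one side.

```python
import itertools

def split_pairs(values, size):
    """Yield (chosen, leftover) pairs for every combination of `size` elements."""
    indices = range(len(values))
    # Combine positions rather than values so each element lands on one side only.
    for chosen_idx in itertools.combinations(indices, size):
        chosen_set = set(chosen_idx)
        chosen = [values[i] for i in chosen_idx]
        leftover = [values[i] for i in indices if i not in chosen_set]
        yield chosen, leftover

pairs = list(split_pairs([1, 2, 3], 2))
for chosen, leftover in pairs:
    print(chosen, leftover)
# [1, 2] [3]
# [1, 3] [2]
# [2, 3] [1]
```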

Answer 1

import itertools
import numpy
number = [53, 64, 68, 71, 77, 82, 85]
results = itertools.combinations(number, 4)
# convert the combination iterator into a numpy array
col_one = numpy.array(list(results))
# calculate average of col_one (kept as floats so the fractional part is not truncated)
col_one_average = numpy.mean(col_one, axis=1)
# I don't actually create col_two, as I never figured out a good way to do it
# But since I only need the sum, I figure that out by subtraction
# (3.0 is the number of leftover elements, len(number) - 4, written as a float)
col_two_average = (numpy.sum(number) - numpy.sum(col_one, axis=1)) / 3.0
dif = col_one_average - col_two_average
print(dif)
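The subtraction shortcut in the code above relies on the leftover sum being the total minus the combination's sum. As a sanity check, here is a small pure-Python sketch (Python 3 assumed; the function names are hypothetical) that compares the shortcut against building the leftovers explicitly for every combination.

```python
import itertools

number = [53, 64, 68, 71, 77, 82, 85]
k = 4
total = sum(number)
rest_size = len(number) - k  # number of leftover elements per split

def rest_mean_by_subtraction(combo):
    # Leftover sum is the total minus the chosen sum; no second list needed.
    return (total - sum(combo)) / rest_size

def rest_mean_direct(combo):
    # Build the leftovers explicitly, as the original question does.
    rest = list(number)
    for value in combo:
        rest.remove(value)
    return sum(rest) / len(rest)

mismatches = [
    combo for combo in itertools.combinations(number, k)
    if abs(rest_mean_by_subtraction(combo) - rest_mean_direct(combo)) > 1e-12
]
print(len(mismatches))  # 0
```

Both routes agree for every split, so the second column only has to be materialised if its elements are needed for something other than the sum.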

Comment

Using np.fromiter(combinations(...)) is far faster than np.array(list(combinations(...))) (0.1 seconds vs. 2 seconds, for instance), but it's also more complicated: numpy-discussion.10968.n7.nabble.com/…
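For reference, one way the np.fromiter approach might look is sketched below. This is an assumption about what the comment has in mind, not a copy of the linked discussion; the helper name combinations_array is hypothetical, and no timings were measured here. np.fromiter is used to build a flat 1-D array, so the combination tuples are flattened first and the result is reshaped afterwards.

```python
import itertools
import numpy as np

def combinations_array(values, k):
    """Return all k-combinations of `values` as a 2-D integer array."""
    # Flatten the stream of tuples into one stream of scalars for np.fromiter.
    flat = itertools.chain.from_iterable(itertools.combinations(values, k))
    # One row per combination, k columns.
    return np.fromiter(flat, dtype=np.int64).reshape(-1, k)

number = [53, 64, 68, 71, 77, 82, 85]
col_one = combinations_array(number, 4)
print(col_one.shape)  # (35, 4)
```

The extra complexity is the flatten-then-reshape step, which np.array(list(...)) handles implicitly.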

Answer 2

Not using numpy or scipy, but there are several things that can be improved about your code:

This is minor, but in your comments you call your lists arrays; in Python they're called lists.
Variable names like col_one and col_two aren't very meaningful. Maybe you should call them combinations and rests or something like that.
You should definitely refactor your code into functions.
You often use index-based loops where they aren't necessary. Where possible you should iterate by element, not by index.
You're also often setting lists to the empty list and then appending to them in a loop. It is generally more Pythonic and often faster to use list comprehensions for this.

If I were to write the code, I'd write something like this:

import itertools
def average(lst):
    """Returns the average of a list or other sequence"""
    return sum(lst)/len(lst)
def list_difference(lst1, lst2):
    """Returns the difference between two iterables, i.e. a list containing all
    elements of lst1 that are not in lst2"""
    result = list(lst1)
    for x in lst2:
        result.remove(x)
    return result
def differences(numbers, n):
    """Returns a list containing the difference between a combination and the remaining
    elements of the list for all combinations of size n of the given list"""
    # Build lists containing the combinations of size n and the rests
    combinations = list(itertools.combinations(numbers, n))
    rests = [list_difference(numbers, row) for row in combinations]
    # Create lists of averages of the combinations and the rests
    combination_averages = [average(k) for k in combinations]
    rest_averages = [average(k) for k in rests]
    # Create a list containing the differences between the averages
    # using zip to iterate both lists in parallel
    diffs = [avg1 - avg2 for avg1, avg2 in zip(combination_averages, rest_averages)]
    return diffs
print(differences([53, 64, 68, 71, 77, 82, 85], 4))
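One detail of list_difference that may be worth spelling out: because it removes one occurrence per element of the combination, it still gives the right leftovers when the input contains repeated values, which a set-based difference would not. A minimal sketch (Python 3 assumed; the sample data with a repeated value is made up for illustration):

```python
def list_difference(lst1, lst2):
    """Return lst1 with one occurrence removed for each element of lst2."""
    result = list(lst1)
    for x in lst2:
        result.remove(x)
    return result

scores = [70, 70, 85]   # hypothetical data containing a duplicate
chosen = (70, 85)       # one combination of size 2

by_removal = list_difference(scores, chosen)
by_sets = sorted(set(scores) - set(chosen))

print(by_removal)  # [70]
print(by_sets)     # []
```

The set version loses the second 70, so the removal-based helper is the safer choice when values can repeat.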

score 9 · Accepted Answer · 2011-03-03 21:28:33Z
