I have some very long code, which I know can be done more efficiently using a for loop. For context, the "matrix" is my dataset and I need to extract all the rows where the second column is equal to 1,2,3...20, and sum the last four columns of those rows (resulting in a 20x4 matrix with the summed values).
But I need to write it using a loop, I would guess a for-loop.
I've tried the following:
M=np.zeros([20,10]) #creating empty matrix to fill in
for i in range(1,21):
    M=matrix[matrix[:,1]==i]
    sub=sum(M[:,6:10])
But the result is only that of the last run of the loop, that is, the values where stackD[:,1]==20. How can I do this with a for-loop? Thanks in advance.
2 Answers
Your problem is that you repeatedly overwrite M at each loop iteration (M=...).
Here's a correct solution that uses a loop:
M = np.stack([stackD[stackD[:,1] == i+1, 6:10].sum(axis=0)
              for i in range(12)])
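The same idea can also be written as an explicit for-loop that fills a pre-allocated array, which is closer to what the question attempted. This is only a minimal sketch: the random `stackD` below is a made-up stand-in for the real dataset, and `n_groups` is a hypothetical name for the number of labels (20 in the question).

```python
import numpy as np

# Hypothetical stand-in for the real dataset: 10 columns, group label in column 1.
rng = np.random.default_rng(0)
n_groups = 20  # assumed number of distinct labels (1..20)
stackD = rng.integers(0, 5, size=(200, 10)).astype(float)
stackD[:, 1] = rng.integers(1, n_groups + 1, size=200)

# One row per label, one column per summed column (columns 6..9).
M = np.zeros((n_groups, 4))
for i in range(1, n_groups + 1):
    rows = stackD[stackD[:, 1] == i]      # rows whose second column equals i
    M[i - 1] = rows[:, 6:10].sum(axis=0)  # column-wise sum; zeros if no row matches

print(M.shape)  # (20, 4)
```

The key difference from the attempt in the question is that each iteration writes into its own row `M[i - 1]` instead of rebinding the name `M`.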
In your new code, M is a single variable and is repeatedly given a new value in the for loop. If you want to store the results of each iteration, you need to create a list. Note that when you name variables with numbers, this almost always means that you should use a list instead. For example, sub1, sub2, sub3, etc. can be replaced with a list:
sub = []
dat = []
for i in range(1,13):
    sub.append(stackD[stackD[:,1]==i])
    dat.append(sum(sub[-1]))  # sub[-1] is the subset just appended; sub[i] would be out of range
Now you can access values with indexes like sub[5] and dat[11] (lists are zero-indexed, so the 12 entries sit at indexes 0 to 11). If sub is never used outside of this loop, then you only need the single list dat:
dat = []
for i in range(1,13):
    sub = stackD[stackD[:,1]==i]
    dat.append(sum(sub))
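To see the list approach end to end, here is a small runnable sketch. The tiny `stackD` array is invented for illustration, with only two labels, and it sums just columns 6 to 9 as the question asks rather than every column.

```python
import numpy as np

# Hypothetical toy dataset: label in column 1, values to sum in columns 6..9.
stackD = np.array([
    [0, 1, 0, 0, 0, 0, 1, 2, 3, 4],
    [0, 2, 0, 0, 0, 0, 5, 6, 7, 8],
    [0, 1, 0, 0, 0, 0, 10, 20, 30, 40],
])

dat = []
for i in range(1, 3):                      # labels 1 and 2 in this toy example
    sub = stackD[stackD[:, 1] == i]        # rows with label i
    dat.append(sub[:, 6:10].sum(axis=0))   # column-wise sum of the last four columns

result = np.array(dat)  # stack the per-label sums into a 2x4 array
# result holds the rows [11, 22, 33, 44] and [5, 6, 7, 8]
```

Converting the list with `np.array(dat)` at the end gives the matrix of summed values, one row per label.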
Disclaimer:
I am not familiar with numpy and there is likely a better way to do what you want with its tools. I am only explaining how to use a basic list with your for loop. I strongly encourage you to learn more about lists and loops because these are very important tools when writing Python programs.
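As one possible example of such a NumPy tool, `np.add.at` can accumulate the sums without an explicit Python loop. This is a sketch under assumptions, not necessarily the best approach: the toy `stackD` is made up, `n_groups` is a hypothetical name, and every label in column 1 is assumed to be an integer between 1 and `n_groups`.

```python
import numpy as np

# Hypothetical toy dataset: label in column 1, values to sum in columns 6..9.
stackD = np.array([
    [0, 1, 0, 0, 0, 0, 1, 2, 3, 4],
    [0, 2, 0, 0, 0, 0, 5, 6, 7, 8],
    [0, 1, 0, 0, 0, 0, 10, 20, 30, 40],
], dtype=float)

n_groups = 2                            # assumed: labels run from 1 to n_groups
labels = stackD[:, 1].astype(int) - 1   # shift labels to zero-based row indexes

totals = np.zeros((n_groups, 4))
# Unbuffered in-place add: each data row is added to the totals row of its label,
# even when the same label appears more than once.
np.add.at(totals, labels, stackD[:, 6:10])
# totals holds the rows [11, 22, 33, 44] and [5, 6, 7, 8]
```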
... to N for arbitrary positive nonzero integer N? Or is N potentially a value larger than 12?