Given the following artificially generated data:
import numpy as np

t_steps = 30
data = np.array([
    np.arange(t_steps) * .05,
    np.arange(t_steps) * .1,
    np.arange(t_steps) * .2,
    np.arange(t_steps) * .3
])
I find the time step at which each row of data first passes a threshold. If a row never passes the given threshold, I assign a time step of -1:
react_tms = []
thresh = 3.5
for dat in data:
    whr = np.where(dat > thresh)
    if len(whr[0]) == 0:
        react_tms.append(-1)
    else:
        react_tms.append(whr[0][0])
This gives:
[-1, -1, 18, 12]
Is there some way to do this without the for-loop? Even before the for-loop is removed, should I be using something other than np.where to find the threshold crossing?
1 Answer
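In principle you can use numpy.argmax for this. On a boolean array it returns the index of the first True, because it reports the first occurrence of the maximum. The only problem is that if no value is above the threshold, the maximum is False, so it returns 0 as the index of the maximum. We therefore need to subtract 1 for those cases: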
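above_threshold = data > thresh
# Index of the first True in each row (0 if the row has no True at all)
react_tms = np.argmax(above_threshold, axis=1)
# Subtract 1 from rows that never cross, turning their 0 into -1
react_tms = react_tms - (~np.any(above_threshold, axis=1)).astype(int)
print(react_tms)
# [-1 -1 18 12]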
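The same idea can also be written so that the "no crossing" rows are set to -1 explicitly rather than by subtraction. The following is only a sketch under that assumption; the helper name first_crossing is made up for illustration and is not part of NumPy:

```python
import numpy as np

def first_crossing(data, thresh):
    """Per row: index of the first value above thresh, or -1 if there is none.

    Hypothetical helper for illustration; not a NumPy function.
    """
    above = np.asarray(data) > thresh
    # argmax on booleans gives the first True, or 0 when the row is all False
    first = above.argmax(axis=1)
    # Keep the argmax result only for rows that actually cross the threshold
    return np.where(above.any(axis=1), first, -1)

t_steps = 30
data = np.array([np.arange(t_steps) * s for s in (.05, .1, .2, .3)])
react_tms = first_crossing(data, 3.5)
print(react_tms)
# [-1 -1 18 12]
```

Selecting with np.where keeps the result an integer array and also handles a row whose very first element is already above the threshold, which legitimately yields index 0.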
Whether or not that is really more readable, I am not sure. It is, however, slightly faster than using numpy.where (and probably also faster than a list comprehension): https://stackoverflow.com/q/16243955/4042267
In the end this does not really matter: