I have a 2D NumPy array. How do I replace all values in it greater than a threshold T = 255 with a value x = 255? A slow for-loop based method would be:
# arr = arr.copy()  # Optionally, do not modify original arr.
for i in range(arr.shape[0]):
    for j in range(arr.shape[1]):
        if arr[i, j] > 255:
            arr[i, j] = x
For more information, take a look at this intro to indexing. – askewchan, Oct 29, 2013 at 19:25
8 Answers
I think both the fastest and most concise way to do this is to use NumPy's built-in boolean (mask) indexing, a form of fancy indexing. If you have an ndarray named arr, you can replace all elements > 255 with a value x as follows:
arr[arr > 255] = x
I ran this on my machine with a 500 x 500 random matrix, replacing all values > 0.5 with 5, and it took an average of 7.59 ms.
In [1]: import numpy as np
In [2]: A = np.random.rand(500, 500)
In [3]: timeit A[A > 0.5] = 5
100 loops, best of 3: 7.59 ms per loop
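To make the one-liner above less opaque, here is a minimal sketch that splits it into two steps. The sample values are made up for illustration only; the point is that arr > 255 builds a boolean mask of the same shape, and assigning through that mask modifies arr in place.

```python
import numpy as np

# Hypothetical sample data, chosen only for illustration.
arr = np.array([[100, 300],
                [255, 400]])
x = 255

mask = arr > 255   # boolean array with the same shape as arr
arr[mask] = x      # writes x wherever mask is True, modifying arr in place

print(mask)
print(arr)
```

Note that 255 itself is not replaced, since the comparison is strict.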
Comments
Is it possible to apply multiple conditions, like array[ ? ] = 255 if array[i] > 127 else 0? I want to optimize my code and am currently using a list comprehension, which was dramatically slower than this fancy indexing.
If you want a new array result containing a copy of arr wherever arr < 255, and 255 otherwise:
result = np.minimum(arr, 255)
More generally, for a lower and/or upper bound:
result = np.clip(arr, 0, 255)
If you just want to access the values over 255, or something more complicated, @mtitan8's answer is more general, but np.clip and np.minimum (or np.maximum) are nicer and much faster for your case:
In [292]: timeit np.minimum(a, 255)
100000 loops, best of 3: 19.6 μs per loop
In [293]: %%timeit
.....: c = np.copy(a)
.....: c[a>255] = 255
.....:
10000 loops, best of 3: 86.6 μs per loop
If you want to do it in-place (i.e., modify arr instead of creating result) you can use the out parameter of np.minimum:
np.minimum(arr, 255, out=arr)
or
np.clip(arr, 0, 255, arr)
(The out= name is optional since the arguments are in the same order as in the function's definition.)
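As a small sketch of the difference between the two call styles (the sample array is an assumption, not from the answer): without out, np.minimum returns a new array and leaves its input alone; with out=arr, the result is written into arr and the same array object is returned.

```python
import numpy as np

# Hypothetical sample data, chosen only for illustration.
arr = np.array([[10, 300],
                [255, 999]])
before = arr.copy()                       # snapshot for comparison

result = np.minimum(arr, 255)             # new array; arr is untouched
untouched = np.array_equal(arr, before)   # True at this point

returned = np.minimum(arr, 255, out=arr)  # writes into arr itself

print(result)
print(returned is arr)
```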
For in-place modification, the boolean indexing speeds up a lot (without having to make and then modify the copy separately), but is still not as fast as minimum:
In [328]: %%timeit
.....: a = np.random.randint(0, 300, (100,100))
.....: np.minimum(a, 255, a)
.....:
100000 loops, best of 3: 303 μs per loop
In [329]: %%timeit
.....: a = np.random.randint(0, 300, (100,100))
.....: a[a>255] = 255
.....:
100000 loops, best of 3: 356 μs per loop
For comparison, if you wanted to restrict your values with a minimum as well as a maximum, without clip you would have to do this twice, with something like
np.minimum(a, 255, a)
np.maximum(a, 0, a)
or,
a[a>255] = 255
a[a<0] = 0
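A quick way to convince yourself that these two-step versions agree with np.clip is to compare them on random data. This is only a sketch: the value range, shape, and seed below are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)                # arbitrary seed
a = rng.integers(-50, 300, size=(100, 100))   # values on both sides of [0, 255]

# Two-step version with minimum/maximum (returns a new array).
two_step = np.maximum(np.minimum(a, 255), 0)

# Two-step version with boolean indexing, applied to a copy.
masked = a.copy()
masked[masked > 255] = 255
masked[masked < 0] = 0

# Single call covering both bounds.
clipped = np.clip(a, 0, 255)

print(np.array_equal(two_step, clipped), np.array_equal(masked, clipped))
```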
Comments
a[start:stop:step] gives you the elements of the array from start to stop, but instead of every element, it takes only every step-th one (if omitted, step is 1 by default). So to set all the elements at even indices to zero, you could do a[::2] = 0.
a = np.maximum(a,0) is faster than np.maximum(a,0,out=a).
I think you can achieve this the quickest by using the where function:
For example, to find items greater than 0.2 in a NumPy array and replace them with 0:
import numpy as np
nums = np.random.rand(4,3)
print(np.where(nums > 0.2, 0, nums))
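One thing worth keeping in mind is that np.where returns a new array rather than editing its input. A minimal sketch with fixed, made-up values (instead of random ones) makes that easy to check:

```python
import numpy as np

# Hypothetical fixed values so the outcome is easy to verify by eye.
nums = np.array([[0.1, 0.5],
                 [0.9, 0.15]])

# Where the condition is True take 0, otherwise take the value from nums.
replaced = np.where(nums > 0.2, 0, nums)

print(replaced)   # entries above 0.2 become 0
print(nums)       # the input array is left unchanged
```

If you want to overwrite the original, assign the result back to the same name.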
Another way is to use np.place, which does in-place replacement and works with multidimensional arrays:
import numpy as np
# create 2x3 array with numbers 0..5
arr = np.arange(6).reshape(2, 3)
# replace 0 with -10
np.place(arr, arr == 0, -10)
Comments
np.place was also slower compared to the built-in method, although the opposite is claimed in this comment.
You can consider using numpy.putmask:
np.putmask(arr, arr>=T, 255.0)
Here is a performance comparison with NumPy's built-in indexing:
In [1]: import numpy as np
In [2]: A = np.random.rand(500, 500)
In [3]: timeit np.putmask(A, A>0.5, 5)
1000 loops, best of 3: 1.34 ms per loop
In [4]: timeit A[A > 0.5] = 5
1000 loops, best of 3: 1.82 ms per loop
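Timings like these depend heavily on the machine and the NumPy version, so it may be more useful to confirm that the approaches agree on the result. The sketch below assumes a scalar replacement value; the seed and shape are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)   # arbitrary seed
A = rng.random((500, 500))

by_index = A.copy()
by_index[by_index > 0.5] = 5               # boolean indexing

by_putmask = A.copy()
np.putmask(by_putmask, by_putmask > 0.5, 5)   # in place, returns None

by_place = A.copy()
np.place(by_place, by_place > 0.5, 5)         # in place, returns None

print(np.array_equal(by_index, by_putmask), np.array_equal(by_index, by_place))
```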
1 Comment
[...] 0.5 used instead of 5, and indexing was better than np.putmask by about two times.
You can also use & and | (and/or) for more flexibility:
values between 5 and 10: A[(A>5)&(A<10)]
values greater than 10 or smaller than 5: A[(A<5)|(A>10)]
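Here is a small sketch of how those combined masks can be used for selection and replacement. The array contents are an assumption for illustration. The parentheses matter, because & and | bind more tightly than the comparison operators. The last line shows one possible way to do a two-way replacement with np.where.

```python
import numpy as np

A = np.arange(16)   # hypothetical sample data: 0, 1, ..., 15

inside = (A > 5) & (A < 10)    # strictly between 5 and 10
outside = (A < 5) | (A > 10)   # smaller than 5 or greater than 10

B = A.copy()
B[inside] = 0                  # replace only the selected values

# Two-way replacement: 255 where A > 7, otherwise 0.
binary = np.where(A > 7, 255, 0)

print(A[inside])
print(A[outside])
print(binary)
```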
np.where() works great!
np.where(arr > 255, 255, arr)
example:
FF = np.array([[0, 0],
               [1, 0],
               [0, 1],
               [1, 1]])
np.where(FF == 1, '+', '-')
Out[]:
array([['-', '-'],
       ['+', '-'],
       ['-', '+'],
       ['+', '+']], dtype='<U1')
Let us assume you have a NumPy array that contains the values from 0 all the way up to 20, and you want to replace numbers greater than 10 with 0:
import numpy as np
my_arr = np.arange(0,21) # creates an array
my_arr[my_arr > 10] = 0 # modifies the value
Note, however, that this will modify the original array. To avoid overwriting it, use arr.copy() to create a new, detached copy of the original array and modify that instead.
import numpy as np
my_arr = np.arange(0,21)
my_arr_copy = my_arr.copy() # creates a copy of the original array
my_arr_copy[my_arr_copy > 10] = 0
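A related pitfall is that plain assignment does not copy anything. The sketch below (the names alias and detached are made up for illustration) contrasts a second name for the same array with a real copy:

```python
import numpy as np

my_arr = np.arange(0, 21)

alias = my_arr            # just another name for the same array
detached = my_arr.copy()  # independent data

detached[detached > 10] = 0
original_intact = bool(my_arr.max() == 20)   # True: the copy absorbed the change

alias[alias > 10] = 0
original_changed = bool(my_arr.max() == 10)  # True: my_arr was modified through alias

print(original_intact, original_changed)
print(np.shares_memory(alias, my_arr), np.shares_memory(detached, my_arr))
```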