Replace all elements of NumPy array that are greater than some value

Question 1

I have a 2D NumPy array. How do I replace all values in it greater than a threshold T = 255 with a value x = 255? A slow for-loop based method would be:

# arr = arr.copy()  # Optionally, do not modify original arr.
for i in range(arr.shape[0]):
    for j in range(arr.shape[1]):
        if arr[i, j] > 255:
            arr[i, j] = x

Question 2

For more information, take a look at this intro to indexing.

Question 3

I think both the fastest and most concise way to do this is to use NumPy's built-in fancy indexing with a boolean mask. If you have an ndarray named arr, you can replace all elements > 255 with a value x as follows:

arr[arr > 255] = x

I ran this on my machine with a 500 x 500 random matrix, replacing all values > 0.5 with 5, and it took an average of 7.59 ms.

In [1]: import numpy as np
In [2]: A = np.random.rand(500, 500)
In [3]: timeit A[A > 0.5] = 5
100 loops, best of 3: 7.59 ms per loop
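As a minimal sketch of the mask assignment above, here is a case where the threshold and the replacement value differ. The names T and x follow the question; the sample data is illustrative only.

```python
import numpy as np

# Illustrative values: the threshold T and the replacement x need not be equal.
T = 255
x = 0

arr = np.array([[10, 300, 255],
                [256, 0, 1000]])

mask = arr > T   # boolean array with the same shape as arr
arr[mask] = x    # assigns x wherever mask is True, modifying arr in place

print(mask)
print(arr)
```

Note that 255 itself is left untouched because the comparison is strict.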

Question 4

Note that this modifies the existing array arr, instead of creating a result array as in the OP.

Question 5

Is there a way to do this by not modifying A but creating a new array?

Question 6

What would we do, if we wanted to change values at indexes which are multiple of given n, like a[2],a[4],a[6],a[8]..... for n=2?

Question 7

NOTE: this doesn't work if the data is in a Python list; it HAS to be in a NumPy array, e.g. np.array([1,2,3]).
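A small sketch of that difference, assuming Python 3 and made-up sample data: comparing a list to an integer raises a TypeError, while the same data converted with np.asarray supports mask assignment.

```python
import numpy as np

data = [1, 2, 3, 400]   # a plain Python list (illustrative data)

list_failed = False
try:
    # In Python 3, comparing a list with an int raises TypeError.
    data[data > 255] = 255
except TypeError as exc:
    list_failed = True
    print("list indexing failed:", exc)

# Convert to an ndarray first, then the boolean mask works.
arr = np.asarray(data)
arr[arr > 255] = 255
print(arr)
```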

Question 8

Is it possible to use this indexing to update every value without a condition? I want to do this: array[ ? ] = x, setting every value to x. Secondly, is it possible to handle multiple conditions, like array[ ? ] = 255 if array[i] > 127 else 0? I want to optimize my code and am currently using a list comprehension, which was dramatically slower than this fancy indexing.
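One possible way to address both parts of that question, sketched with illustrative data: np.where covers the "255 if > 127 else 0" case, and an ellipsis (or full-slice) assignment sets every element unconditionally.

```python
import numpy as np

arr = np.array([[0, 100, 128],
                [200, 127, 255]])

# Two-way choice: 255 where arr > 127, otherwise 0. This returns a new array.
binary = np.where(arr > 127, 255, 0)

# Unconditional in-place assignment: every element becomes x.
x = 7
filled = arr.copy()
filled[...] = x   # filled[:] = x or filled.fill(x) would do the same here

print(binary)
print(filled)
```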

Question 9

If you want a new array result containing a copy of arr whenever arr < 255, and 255 otherwise:

result = np.minimum(arr, 255)

More generally, for a lower and/or upper bound:

result = np.clip(arr, 0, 255)

If you just want to access the values over 255, or something more complicated, @mtitan8's answer is more general, but np.clip and np.minimum (or np.maximum) are nicer and much faster for your case:

In [292]: timeit np.minimum(a, 255)
100000 loops, best of 3: 19.6 μs per loop
In [293]: %%timeit
 .....: c = np.copy(a)
 .....: c[a>255] = 255
 .....: 
10000 loops, best of 3: 86.6 μs per loop

If you want to do it in-place (i.e., modify arr instead of creating result) you can use the out parameter of np.minimum:

np.minimum(arr, 255, out=arr)

or

np.clip(arr, 0, 255, arr)

(The out= name is optional since the arguments are in the same order as in the function's definition.)

For in-place modification, the boolean indexing speeds up a lot (without having to make and then modify the copy separately), but is still not as fast as minimum:

In [328]: %%timeit
 .....: a = np.random.randint(0, 300, (100,100))
 .....: np.minimum(a, 255, a)
 .....: 
100000 loops, best of 3: 303 μs per loop
In [329]: %%timeit
 .....: a = np.random.randint(0, 300, (100,100))
 .....: a[a>255] = 255
 .....: 
100000 loops, best of 3: 356 μs per loop

For comparison, if you wanted to restrict your values with a minimum as well as a maximum, without clip you would have to do this twice, with something like

np.minimum(a, 255, a)
np.maximum(a, 0, a)

or,

a[a>255] = 255
a[a<0] = 0
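The three approaches above should give the same result. A quick sketch that checks this on a small, made-up array:

```python
import numpy as np

a = np.array([-5, 0, 100, 255, 300])

# One call with both bounds.
clipped = np.clip(a, 0, 255)

# Upper bound, then lower bound.
two_step = np.maximum(np.minimum(a, 255), 0)

# Boolean mask assignment on a copy, once per bound.
masked = a.copy()
masked[masked > 255] = 255
masked[masked < 0] = 0

print(clipped, two_step, masked)
```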

Question 10

Thank you very much for your complete comment. However, np.clip and np.minimum do not seem to be what I need in this case: in the OP you can see that the threshold T and the replacement value (255) are not necessarily the same number. I still gave you an upvote for thoroughness. Thanks again.

Question 11

What would we do, if we wanted to change values at indexes which are multiple of given n, like a[2],a[4],a[6],a[8]..... for n=2?

Question 12

@lavee_singh, to do that, you can use the third part of the slice, which is usually omitted: a[start:stop:step] gives you the elements of the array from start to stop, but instead of every element, it takes only every step-th one (if omitted, step is 1 by default). So to set all the elements at even indexes to zero, you could do a[::2] = 0
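A short sketch of the step slice, using an illustrative array. Starting the slice at n instead of 0 matches the a[2], a[4], a[6], ... pattern from the question and leaves index 0 alone.

```python
import numpy as np

n = 2

a = np.arange(10)
a[::n] = 0      # indexes 0, 2, 4, 6, 8

b = np.arange(10)
b[n::n] = -1    # indexes 2, 4, 6, 8 (index 0 is skipped)

print(a)
print(b)
```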

Question 13

Thanks, I needed something like this. I knew it for simple lists, but I didn't know whether or how it works for numpy.array.

Question 14

Surprisingly in my investigation, a = np.maximum(a,0) is faster than np.maximum(a,0,out=a).

Question 15

I think you can achieve this the quickest by using the where function:

For example looking for items greater than 0.2 in a numpy array and replacing those with 0:

import numpy as np
nums = np.random.rand(4,3)
print(np.where(nums > 0.2, 0, nums))

Question 16

Another way is to use np.place, which does in-place replacement and works with multidimensional arrays:

import numpy as np
# create 2x3 array with numbers 0..5
arr = np.arange(6).reshape(2, 3)
# replace 0 with -10
np.place(arr, arr == 0, -10)
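A small sketch of what np.place does, with illustrative values. It returns None and modifies the array directly, and if fewer values are supplied than there are True entries in the mask, the values are repeated.

```python
import numpy as np

arr = np.arange(6).reshape(2, 3)   # [[0, 1, 2], [3, 4, 5]]

# Three elements satisfy arr > 2, but only two values are given,
# so np.place cycles through them: -1, -2, -1.
ret = np.place(arr, arr > 2, [-1, -2])

print(ret)   # np.place has no return value
print(arr)
```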

Question 17

This is the solution I used because it was the first I came across. I wonder if there is a big difference between this and the selected answer above. What do you think?

Question 18

In my very limited tests, my above code with np.place is running 2X slower than accepted answer's method of direct indexing. It's surprising because I would have thought np.place would be more optimized but I guess they have probably put more work on direct indexing.

Question 19

In my case np.place was also slower compared to the built-in method, although the opposite is claimed in this comment.

Question 20

You can consider using numpy.putmask :

np.putmask(arr, arr>=T, 255.0)

Here is a performance comparison with NumPy's built-in indexing:

In [1]: import numpy as np
In [2]: A = np.random.rand(500, 500)
In [3]: timeit np.putmask(A, A>0.5, 5)
1000 loops, best of 3: 1.34 ms per loop
In [4]: timeit A[A > 0.5] = 5
1000 loops, best of 3: 1.82 ms per loop

Question 21

I tested the code with 0.5 used in place of 5, and indexing was about two times faster than np.putmask.
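Since the timings reported in this thread disagree, it may be worth measuring on your own machine. Below is a minimal benchmark sketch using the standard timeit module; the helper names with_indexing and with_putmask, the seed, and the repeat counts are arbitrary choices, not from the original posts. Each call works on a fresh copy, so the copy cost is included equally in both measurements.

```python
import timeit

import numpy as np

rng = np.random.default_rng(0)        # fixed seed so runs are comparable
base = rng.random((500, 500))

def with_indexing():
    a = base.copy()
    a[a > 0.5] = 5
    return a

def with_putmask():
    a = base.copy()
    np.putmask(a, a > 0.5, 5)
    return a

# Both approaches should produce identical arrays.
same_result = np.array_equal(with_indexing(), with_putmask())

timings = {}
for fn in (with_indexing, with_putmask):
    # Best of 3 runs of 20 calls each, reported per call.
    timings[fn.__name__] = min(timeit.repeat(fn, number=20, repeat=3)) / 20
    print(f"{fn.__name__}: {timings[fn.__name__] * 1e3:.3f} ms per call")
```

Results will vary with array size, dtype, the fraction of elements replaced, and the NumPy version.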

Question 22

You can also use &, | (and/or) for more flexibility:

values between 5 and 10: A[(A>5)&(A<10)]

values greater than 10 or smaller than 5: A[(A<5)|(A>10)]
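A sketch of combined masks used for assignment, with illustrative data. The parentheses around each comparison are required because & and | bind more tightly than < and > in Python; ~ inverts a mask.

```python
import numpy as np

A = np.arange(16)

between = (A > 5) & (A < 10)    # True for 6, 7, 8, 9
outside = (A < 5) | (A > 10)    # True for 0..4 and 11..15

B = A.copy()
B[between] = 0                  # replace the values strictly between 5 and 10

C = A.copy()
C[~between] = -1                # replace everything else instead

print(A[between], A[outside])
print(B)
print(C)
```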

Question 23

np.where() works great!

np.where(arr > 255, 255, arr)

example:

FF = np.array([[0, 0],
 [1, 0],
 [0, 1],
 [1, 1]])
np.where(FF == 1, '+', '-')
Out[]: 
array([['-', '-'],
 ['+', '-'],
 ['-', '+'],
 ['+', '+']], dtype='<U1')

Question 24

np.where is a great solution, it doesn't mutate the arrays involved, and it's also directly compatible with pandas series objects. Really helped me.

Question 25

Let us assume you have a NumPy array that contains the values from 0 all the way up to 20, and you want to replace numbers greater than 10 with 0:

import numpy as np
my_arr = np.arange(0,21) # creates an array
my_arr[my_arr > 10] = 0 # modifies the value

Note, however, that this will modify the original array. To avoid overwriting it, use arr.copy() to create a new, detached copy of the original array and modify that instead.

import numpy as np
my_arr = np.arange(0,21)
my_arr_copy = my_arr.copy() # creates a copy of the original array
my_arr_copy[my_arr_copy > 10] = 0

Accepted answer by mdml · 2013-10-29 18:46:06Z


