From what I understand, numpy arrays can handle operations more quickly than python lists because they're handled in a parallel rather than iterative fashion. I tried to test that out for fun, but I didn't see much of a difference.
Was there something wrong with my test? Does the difference only matter with arrays much bigger than the ones I used? I made sure to create a python list and numpy array in each function to cancel out differences creating one vs. the other might make, but the time delta really seems negligible. Here's my code:
My final outputs were numpy function: 6.534756324786595s, list function: 6.559365831783256s
import timeit
import numpy as np
a_setup = 'import timeit; import numpy as np'
std_fx = '''
def operate_on_std_array():
    std_arr = list(range(0,1000000))
    np_arr = np.asarray(std_arr)
    for index,elem in enumerate(std_arr):
        std_arr[index] = (elem**20)*63134
    return std_arr
'''

parallel_fx = '''
def operate_on_np_arr():
    std_arr = list(range(0,1000000))
    np_arr = np.asarray(std_arr)
    np_arr = (np_arr**20)*63134
    return np_arr
'''

def operate_on_std_array():
    std_arr = list(range(0,1000000))
    np_arr = np.asarray(std_arr)
    for index,elem in enumerate(std_arr):
        std_arr[index] = (elem**20)*63134
    return std_arr

def operate_on_np_arr():
    std_arr = list(range(0,1000000))
    np_arr = np.asarray(std_arr)
    np_arr = (np_arr**20)*63134
    return np_arr

print('std',timeit.timeit(setup = a_setup, stmt = std_fx, number = 80000000))
print('par',timeit.timeit(setup = a_setup, stmt = parallel_fx, number = 80000000))

#operate_on_np_arr()
#operate_on_std_array()
2 Answers
The timeit docs show that the statement you pass in is supposed to execute something, but the statements you pass in only define the functions and never call them, so you end up timing 80000000 function definitions instead of the array work. I was thinking 80000000 trials on a 1-million-element array should take much longer than a few seconds, and it would if the functions were actually being called.
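For reference, here is one way the timing could be wired up so each trial actually runs the work (a minimal sketch of my own, not the answer's code; the smaller multiplication and the trial count are arbitrary choices to keep the run short and avoid integer overflow):

import timeit

setup = '''
import numpy as np

def operate_on_np_arr():
    np_arr = np.arange(1000000, dtype=np.int64)
    return (np_arr * 3) * 63134  # modest values so int64 does not overflow
'''

# The statement calls the function, so every trial performs the array work.
print('np', timeit.timeit(stmt='operate_on_np_arr()', setup=setup, number=100))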
Other issues you have in your test:
- np_arr = (np_arr**20)*63134 may create a copy of np_arr, whereas your Python list equivalent only mutates the existing list in place.
- Numpy math is different than Python math. 100**20 in Python returns a huge number because Python has unbounded-length integers, but Numpy uses C-style fixed-width integers that overflow. (In general, you have to imagine doing the operation in C when you use Numpy, because other unintuitive things may apply, like garbage in uninitialized arrays.)
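To see the overflow point concretely, a quick check of my own (not part of the answer):

import numpy as np

print(100**20)                               # Python int: the exact 41-digit result
print(np.array([100], dtype=np.int64)**20)   # int64 wraps around silently and prints a meaningless value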
Here's a test where I modify both in place, multiplying then dividing by 31 each time so the values don't change over time or overflow:
import numpy as np
import timeit
std_arr = list(range(0,100000))
np_arr = np.array(std_arr)
np_arr_vec = np.vectorize(lambda n: (n * 31) / 31)
def operate_on_std_array():
    for index,elem in enumerate(std_arr):
        std_arr[index] = elem * 31
        std_arr[index] = elem / 31
    return std_arr

def operate_on_np_arr():
    np_arr_vec(np_arr)
    return np_arr

import time

def test_time(f):
    count = 100
    start = time.time()
    for i in range(count):
        f()
    dur = time.time() - start
    return dur

print(test_time(operate_on_std_array))
print(test_time(operate_on_np_arr))
Results:
3.0798873901367188 # standard array time
2.221336841583252 # np array time
Edit: As @user2357112 pointed out, the proper Numpy way to do it is this:
def operate_on_np_arr():
    global np_arr
    np_arr *= 31
    np_arr //= 31  # integer division, not double
    return np_arr

Makes it much faster. I see 0.1248 seconds.
- np.vectorize is basically just a wrapper around a for loop. It only exists for convenience, not efficiency. (user2357112, Apr 10, 2018 at 18:21)
- If you want to avoid the copy, use np_arr *= 31 and np_arr //= 31 instead of working element by element. (user2357112, Apr 10, 2018 at 18:23)
- I was getting weird errors trying to do that. TypeError: ufunc 'true_divide' output (typecode 'd') could not be coerced... Could be my version of Numpy. (sudo, Apr 10, 2018 at 18:24)
- Ohh, //= does integer division. I thought that was a typo. My mistake, retrying... (sudo, Apr 10, 2018 at 18:24)
- This is fantastic. Glad to see I was doing something wrong. (Leopold Boom, Apr 10, 2018 at 20:09)
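To illustrate the first comment above, here's a rough comparison of my own (not from the thread); exact numbers vary by machine, but np.vectorize usually lands near the pure-Python loop while the plain ufunc expression is far faster:

import timeit
import numpy as np

arr = np.arange(1000000, dtype=np.int64)

# np.vectorize still makes one Python-level call per element
vec = np.vectorize(lambda n: (n * 31) // 31)

print('np.vectorize:', timeit.timeit(lambda: vec(arr), number=10))
print('ufunc expr:  ', timeit.timeit(lambda: (arr * 31) // 31, number=10))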
Here are some timings using the ipython %%timeit magic to initialize the lists and/or arrays, so the reported results focus on the calculations:
In [103]: %%timeit alist = list(range(10000))
...: for i,e in enumerate(alist):
...:     alist[i] = (e*3)*20
...:
4.13 ms ± 146 μs per loop (mean ± std. dev. of 7 runs, 100 loops each)
In [104]: %%timeit arr = np.arange(10000)
...: z = (arr*3)*20
...:
20.6 μs ± 439 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
In [105]: %%timeit alist = list(range(10000))
...: z = [(e*3)*20 for e in alist]
...:
...:
1.71 ms ± 2.69 μs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
Looking at the effect of array creation times:
In [106]: %%timeit alist = list(range(10000))
...: arr = np.array(alist)
...: z = (arr*3)*20
...:
...:
1.01 ms ± 43.6 μs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
Ok, the calculation isn't the same. If I use **3 instead, all times are about 2x larger. Same relative relations.
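For anyone without IPython handy, roughly the same comparison can be scripted with the timeit module; a sketch of my own (the loop writes into a separate out list so the input values stay unchanged between trials):

import timeit

setup_list = 'alist = list(range(10000))'
setup_arr = 'import numpy as np; arr = np.arange(10000)'

loop_stmt = '''
out = [0] * len(alist)
for i, e in enumerate(alist):
    out[i] = (e*3)*20
'''

print('python loop:', timeit.timeit(loop_stmt, setup=setup_list, number=1000))
print('list comp:  ', timeit.timeit('z = [(e*3)*20 for e in alist]', setup=setup_list, number=1000))
print('numpy:      ', timeit.timeit('z = (arr*3)*20', setup=setup_arr, number=1000))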
Comments on the question:
- numpy is faster by a larger degree on my machine. Why does std_fx create np_arr and not use it? Why does parallel_fx use range? Why not np_arr = np.arange(0,100...)? Your calculated numbers are too large for the array dtype (int32 or int64).
- numpy is usually faster because it performs the iteration(s) in compiled code. Operations that require iteration at the Python level are slower. Creating arrays from lists takes time, potentially cancelling out the time savings. Sometimes, for very large cases, memory management can also chew into the time savings.