
Timeline for answer to Python numpy performance - selection on very large array by MSeifert

Current License: CC BY-SA 3.0

Post Revisions

12 events
when toggle format what by license comment
Sep 6, 2022 at 17:38 comment added kostrykin Regarding your note on taking the minimum run time instead of the mean and standard deviation. As I was reading this, I remembered hearing this back when I studied computer science, but later I forgot about it. I think this "rule of thumb" has a name, which I cannot find, do you maybe know it?
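The practice mentioned in this comment, reporting the minimum of several repeated timings rather than their mean, can be sketched with the standard library's timeit module. The function work and the repeat and loop counts below are hypothetical stand-ins, not taken from the answer.

```python
import timeit


def work():
    # Hypothetical workload standing in for the code being benchmarked.
    return sum(i * i for i in range(1000))


LOOPS = 200    # calls per measurement (assumed value)
REPEATS = 5    # number of measurements (assumed value)

# Each entry is the total time for LOOPS calls of work().
timings = timeit.repeat(work, repeat=REPEATS, number=LOOPS)

# The smallest measurement is the one least disturbed by other
# processes; larger values mostly reflect system noise.
best = min(timings)
per_call = best / LOOPS
print(f"best of {REPEATS}: {per_call * 1e6:.1f} microseconds per call")
```

The mean is always at least as large as the minimum, so the two numbers diverge most on a busy machine, which is exactly when the mean says less about the code itself.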
Jun 20, 2020 at 9:12 history edited Community Bot
Commonmark migration
Aug 10, 2017 at 13:18 history edited MSeifert CC BY-SA 3.0
added 61 characters in body
Aug 10, 2017 at 13:11 vote accept Diogo Santos
Aug 10, 2017 at 13:10 comment added Diogo Santos Tested and approved! :) I will run more tests, but as far as I can see there's some gain (in my code it ran in 75% of the original code's time). Maybe this is explained by the usage of python_intel, which already speeds up my numpy
Aug 10, 2017 at 13:10 comment added MSeifert @DiogoSantos It's hard to give general guidelines on the performance of numba. One rule of thumb is: Write all the loops yourself and try to avoid calling "complicated" numpy functions in the numba function (e.g. advanced indexing or operations that create a temporary array). Then there are also different ways to iterate over numpy arrays: sometimes it's faster to use "for element in array", or "for idx in range(len(array))", or even "for element in np.nditer(array)". One has to experiment a bit to get the fastest numba function, and sometimes the fastest way depends on the numba version.
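The three iteration styles named in this comment can be sketched side by side. This is an illustration only: the function names are hypothetical, a plain sum stands in for the real workload, and the fallback decorator is an assumption added so the sketch still runs without numba installed. Which variant is fastest has to be measured on the actual data and numba version.

```python
import numpy as np

try:
    from numba import njit
except ImportError:
    # Assumed fallback: run as plain (slow) Python if numba is missing.
    def njit(func):
        return func


@njit
def sum_by_element(array):
    # Iterate directly over the elements of a 1-D array.
    total = 0.0
    for element in array:
        total += element
    return total


@njit
def sum_by_index(array):
    # Iterate over indices and look each element up explicitly.
    total = 0.0
    for idx in range(len(array)):
        total += array[idx]
    return total


@njit
def sum_by_nditer(array):
    # np.nditer yields 0-d arrays; .item() extracts the scalar value.
    total = 0.0
    for element in np.nditer(array):
        total += element.item()
    return total


data = np.arange(10, dtype=np.float64)
results = (sum_by_element(data), sum_by_index(data), sum_by_nditer(data))
print(results)
```

All three functions compute the same value, so they can be swapped freely while timing them against each other.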
Aug 10, 2017 at 13:09 comment added Alexander McFarlane @MSeifert thanks for looking into my approach! I've learned a fair bit about the weaknesses of %%timeit magic command with your help - I'll delete my solution as it will thoroughly confuse someone looking at it - feel free to incorporate it in your answer as an approach to avoid
Aug 10, 2017 at 13:05 comment added MSeifert Yes, it will work on views (you can verify this by running selection(a[0:2], b, c) instead of selection(a, b, c) in my example).
Aug 10, 2017 at 12:59 comment added Diogo Santos I'm going to test it now, but does your solution work if 'a', 'b' and 'c' are views on other arrays? Will they be modified as I need? As an aside, I've been testing Numba on other bottlenecks of the code, specifically operations with numpy arrays (same shape, summing, dividing, getting the array sum) and found it slower than the original numpy code... Any general advice?
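Why an in-place function also works on views can be shown with a small sketch. The function add_one_in_place is a hypothetical stand-in for the answer's selection function, whose body is not reproduced here; the point is only that basic slicing returns a view sharing memory with the original array.

```python
import numpy as np


def add_one_in_place(arr):
    # Hypothetical in-place operation; writes through to arr's memory.
    for idx in range(len(arr)):
        arr[idx] += 1


base = np.zeros(5)
view = base[0:2]          # basic slicing returns a view, not a copy

add_one_in_place(view)    # modify only the view

# The first two elements of base changed, because view shares its memory.
print(base)
print(np.shares_memory(view, base))
```

Note that this holds for basic slices only; advanced (fancy or boolean) indexing returns a copy, so modifications to the result would not reach the original array.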
Aug 10, 2017 at 12:49 history edited MSeifert CC BY-SA 3.0
added 1477 characters in body
Aug 10, 2017 at 12:23 comment added MSeifert @AlexanderMcFarlane I'm not so sure that your approach is working correctly. There's basically no way I could explain a 1000 times speedup over a vectorized numpy operation. I guess 2-5 times is the limit you could expect to be faster if one avoids temporary arrays or one uses a more efficient operation.
Aug 10, 2017 at 12:11 history answered MSeifert CC BY-SA 3.0
