

In my test runs this was roughly a factor of 2 faster than your NumPy code. However, numba might be a heavy dependency if you're not using conda.
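(For reference, a minimal sketch of what an in-place compacting filter could look like with numba. It assumes the same semantics as the NumPy snippet timed below, i.e. keep the entries where a > 0 and move them to the front of a, b and c; it is not necessarily identical to the selection implementation used for the timings.)

import numba as nb

@nb.njit
def selection(a, b, c):
    # Single pass: copy every "surviving" element (a > 0) to the front
    # of all three arrays; everything beyond pos is left as stale data.
    pos = 0
    for idx in range(len(a)):
        if a[idx] > 0:
            a[pos] = a[idx]
            b[pos] = b[idx]
            c[pos] = c[idx]
            pos += 1
    return pos  # number of surviving elements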



Example:
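(A minimal usage sketch, assuming a selection like the one sketched above that compacts the arrays in place and returns the number of survivors.)

import numpy as np

a = np.arange(10) % 3   # [0 1 2 0 1 2 0 1 2 0]
b = a.copy()
c = a.copy()

pos = selection(a, b, c)               # compact in place, returns survivor count
a, b, c = a[:pos], b[:pos], c[:pos]    # drop the stale tail
print(a)                               # [1 2 1 2 1 2]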

Timing:

import numpy as np
a0 = np.random.random(100000)
a0[a0 < 0.5] = 0
b0 = np.random.random(100000)
c0 = np.random.random(100000)
%%timeit
a = a0.copy()
b = b0.copy()
c = c0.copy()
survivors = np.where(a > 0)[0]
pos = len(survivors)
a[:pos] = a[survivors]
b[:pos] = b[survivors]
c[:pos] = c[survivors]

4.07 ms ± 67.1 μs per loop (mean ± std. dev. of 7 runs, 100 loops each)

%%timeit
a = a0.copy()
b = b0.copy()
c = c0.copy()
selection(a, b, c)

2.47 ms ± 39.1 μs per loop (mean ± std. dev. of 7 runs, 100 loops each)

Note that the cost to copy the arrays is significant here:

%%timeit
a = a0.copy()
b = b0.copy()
c = c0.copy()

1.35 ms ± 42.4 μs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
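For reference, subtracting this copy overhead from the two results above leaves roughly 4.07 - 1.35 ≈ 2.7 ms for the original NumPy approach versus 2.47 - 1.35 ≈ 1.1 ms for the numba version, so the gap between the actual filtering steps is somewhat larger than the raw %%timeit numbers suggest.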


It's hard to time this accurately because all approaches work in-place, so I actually used timeit.repeat to measure the timings with number=1 (that avoids broken timings due to the in-place nature of the solutions), and I took the min of the resulting list of timings because that's advertised as the most useful quantitative measure in the documentation:

Note

It’s tempting to calculate mean and standard deviation from the result vector and report these. However, this is not very useful. In a typical case, the lowest value gives a lower bound for how fast your machine can run the given code snippet; higher values in the result vector are typically not caused by variability in Python’s speed, but by other processes interfering with your timing accuracy. So the min() of the result is probably the only number you should be interested in. After that, you should look at the entire vector and apply common sense rather than statistics.

Numba solution

import timeit

min(timeit.repeat("""selection(a, b, c)""",
 """import numpy as np
from __main__ import selection
a = np.arange(1000000) % 3
b = a.copy()
c = a.copy()
""", repeat=100, number=1))

0.007700118746939211

Original solution

import timeit
min(timeit.repeat("""survivors = np.where(a > 0)[0]
pos = len(survivors)
a[:pos] = a[survivors]
b[:pos] = b[survivors]
c[:pos] = c[survivors]""",
 """import numpy as np
a = np.arange(1000000) % 3
b = a.copy()
c = a.copy()
""", repeat=100, number=1))

0.028622144571883723

Alexander McFarlane's solution (now deleted)

import timeit
min(timeit.repeat("""survivors = comb_array[:, 0].nonzero()[0]
comb_array[:len(survivors)] = comb_array[survivors]""",
 """import numpy as np
a = np.arange(1000000) % 3
b = a.copy()
c = a.copy()
comb_array = np.vstack([a,b,c]).T""", repeat=100, number=1))

0.058305527038669425

So the Numba solution can actually speed this up by a factor of 3-4, while Alexander McFarlane's solution is actually slower (2x) than the original approach. However, the small number of repeats may bias the timings somewhat.
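(For reference, the ratios of the minima above are 0.0286 / 0.0077 ≈ 3.7 for the original approach versus the numba version, and 0.0583 / 0.0286 ≈ 2.0 for the combined-array approach versus the original, consistent with the factors quoted here.)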
