I am seeing some strange timings with the RegularGridInterpolator from scipy.
Splitting the interpolation into 100 chunks and concatenating the results seems to be faster for large arrays:
when interpolating 1,000,000 points in 5D, for instance, the chunked interpolation is almost twice as fast as interpolating everything in one go.
from scipy.interpolate import RegularGridInterpolator
import numpy as np
import time
N = [10, 11, 12, 13, 14]
grid = [np.linspace(i, i + 1, n) for i, n in enumerate(N, 0)]
values = np.arange(np.prod(N)).reshape(N)
rgi = RegularGridInterpolator(grid, values)
for N in [10000, 100000, 1000000]:
    xp = np.array([np.random.random(N) + i for i in range(5)]).T  # N random points inside the grid domain
    t = time.time()
    r1 = rgi(xp)  # all points in one call
    t1 = time.time() - t
    t = time.time()
    r2 = np.concatenate([rgi(xp_) for xp_ in np.array_split(xp, 100)])  # same points in 100 chunks
    t2 = time.time() - t
    print(f'{N}: {t1:.3f}, 100x{N/100}: {t2:.3f}')
Output
10000: 0.009, 100x100.0: 0.051
100000: 0.087, 100x1000.0: 0.067
1000000: 1.094, 100x10000.0: 0.594
Versions
Python 3.13.9
numpy 2.3.4
pip 25.3
scipy 1.16.3
1 Answer
This happens because RegularGridInterpolator’s vectorized evaluation becomes memory-bound for large inputs: the cost of allocating and filling huge intermediate arrays dominates the runtime.
When you split the input into smaller chunks, NumPy’s temporary arrays stay smaller and fit better in the CPU cache, so you get less memory pressure and better throughput.
In short:
✅ Small chunks → better cache locality, less memory overhead
❌ One big array → huge temporaries + cache misses
If you want both speed and simplicity, pick a chunk size small enough that the temporaries stay cache-friendly, e.g. a few thousand points per chunk: np.array_split(xp, max(1, len(xp) // 5000)).
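A minimal sketch of a reusable helper (the name evaluate_chunked and the default chunk size of 10,000 points are my own choices, not part of scipy; tune the chunk size for your machine):

import numpy as np

def evaluate_chunked(interp, points, chunk_size=10_000):
    # Evaluate a callable interpolator on `points` in fixed-size chunks.
    # Chunking keeps the intermediate arrays small, then the partial
    # results are concatenated back into a single array.
    n_chunks = max(1, len(points) // chunk_size)
    parts = [interp(chunk) for chunk in np.array_split(points, n_chunks)]
    return np.concatenate(parts)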
Side note: use time.perf_counter() for benchmarking; time.time() is better suited for general wall-clock timekeeping.
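For instance, the benchmark above could be timed like this (a sketch assuming rgi and xp as defined in the question, and the evaluate_chunked helper from above):

from time import perf_counter

t = perf_counter()
r1 = rgi(xp)                     # all points in one call
t1 = perf_counter() - t

t = perf_counter()
r2 = evaluate_chunked(rgi, xp)   # chunked evaluation
t2 = perf_counter() - t

print(f'full: {t1:.3f} s, chunked: {t2:.3f} s')
assert np.allclose(r1, r2)       # chunking must not change the result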