-
-
Notifications
You must be signed in to change notification settings - Fork 8k
Use old stride_windows implementation on 32-bit builds #29115
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
It's not really clear to me why sliding_window_view (in the way we use it) would lead to an OOM while the manual approach wouldn't?
Perhaps there is a NumPy calculation bug? It ends up as:
__________ test_psd_csd[png] __________
@image_comparison(
["psd_freqs.png", "csd_freqs.png", "psd_noise.png", "csd_noise.png"],
remove_text=True, tol=0.002)
def test_psd_csd():
n = 10000
Fs = 100.
fstims = [[Fs/4, Fs/5, Fs/11], [Fs/4.7, Fs/5.6, Fs/11.9]]
NFFT_freqs = int(1000 * Fs / np.min(fstims))
x = np.arange(0, n, 1/Fs)
ys_freqs = np.sin(2 * np.pi * np.multiply.outer(fstims, x)).sum(axis=1)
NFFT_noise = int(1000 * Fs / 11)
np.random.seed(0)
ys_noise = [np.random.standard_normal(n), np.random.rand(n)]
all_kwargs = [{"sides": "default"},
{"sides": "onesided", "return_line": False},
{"sides": "twosided", "return_line": True}]
for ys, NFFT in [(ys_freqs, NFFT_freqs), (ys_noise, NFFT_noise)]:
noverlap = NFFT // 2
pad_to = int(2 ** np.ceil(np.log2(NFFT)))
for ax, kwargs in zip(plt.figure().subplots(3), all_kwargs):
> ret = ax.psd(np.concatenate(ys), NFFT=NFFT, Fs=Fs,
noverlap=noverlap, pad_to=pad_to, **kwargs)
../venv-test/lib/python3.12/site-packages/matplotlib/tests/test_axes.py:5529:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
../venv-test/lib/python3.12/site-packages/matplotlib/_api/deprecation.py:453: in wrapper
return func(*args, **kwargs)
../venv-test/lib/python3.12/site-packages/matplotlib/__init__.py:1521: in inner
return func(
../venv-test/lib/python3.12/site-packages/matplotlib/axes/_axes.py:7616: in psd
pxx, freqs = mlab.psd(x=x, NFFT=NFFT, Fs=Fs, detrend=detrend,
../venv-test/lib/python3.12/site-packages/matplotlib/mlab.py:511: in psd
Pxx, freqs = csd(x=x, y=None, NFFT=NFFT, Fs=Fs, detrend=detrend,
../venv-test/lib/python3.12/site-packages/matplotlib/mlab.py:567: in csd
Pxy, freqs, _ = _spectral_helper(x=x, y=y, NFFT=NFFT, Fs=Fs,
../venv-test/lib/python3.12/site-packages/matplotlib/mlab.py:307: in _spectral_helper
result = np.lib.stride_tricks.sliding_window_view(
../venv-test/lib/python3.12/site-packages/numpy/lib/stride_tricks.py:336: in sliding_window_view
return as_strided(x, strides=out_strides, shape=out_shape,
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
x = array([ 0. , 2.49169733, 1.49741725, ..., 1.04870198,
0.57119919, -0.18319108]), shape = (1988101, 11900), strides = (8, 8), subok = False, writeable = False
def as_strided(x, shape=None, strides=None, subok=False, writeable=True):
"""
Create a view into the array with the given shape and strides.
.. warning:: This function has to be used with extreme care, see notes.
Parameters
----------
x : ndarray
Array to create a new.
shape : sequence of int, optional
The shape of the new array. Defaults to ``x.shape``.
strides : sequence of int, optional
The strides of the new array. Defaults to ``x.strides``.
subok : bool, optional
.. versionadded:: 1.10
If True, subclasses are preserved.
writeable : bool, optional
.. versionadded:: 1.12
If set to False, the returned array will always be readonly.
Otherwise it will be writable if the original array was. It
is advisable to set this to False if possible (see Notes).
Returns
-------
view : ndarray
See also
--------
broadcast_to : broadcast an array to a given shape.
reshape : reshape an array.
lib.stride_tricks.sliding_window_view :
userfriendly and safe function for the creation of sliding window views.
Notes
-----
``as_strided`` creates a view into the array given the exact strides
and shape. This means it manipulates the internal data structure of
ndarray and, if done incorrectly, the array elements can point to
invalid memory and can corrupt results or crash your program.
It is advisable to always use the original ``x.strides`` when
calculating new strides to avoid reliance on a contiguous memory
layout.
Furthermore, arrays created with this function often contain self
overlapping memory, so that two elements are identical.
Vectorized write operations on such arrays will typically be
unpredictable. They may even give different results for small, large,
or transposed arrays.
Since writing to these arrays has to be tested and done with great
care, you may want to use ``writeable=False`` to avoid accidental write
operations.
For these reasons it is advisable to avoid ``as_strided`` when
possible.
"""
# first convert input to array, possibly keeping subclass
x = np.array(x, copy=False, subok=subok)
interface = dict(x.__array_interface__)
if shape is not None:
interface['shape'] = tuple(shape)
if strides is not None:
interface['strides'] = tuple(strides)
> array = np.asarray(DummyArray(interface, base=x))
E ValueError: array is too big; `arr.size * arr.dtype.itemsize` is larger than the maximum possible size.
../venv-test/lib/python3.12/site-packages/numpy/lib/stride_tricks.py:105: ValueError
Oh, I see, this is because the intermediate array is too large even though we slice it immediately (to compute the overlapping FTs); also it seems like numpy wants array.size * array.itemsize to be representable even though that may be much bigger than the physical array size. That seems overall related to the request for step_size at numpy/numpy#18244.
I guess the easy way out is indeed to go back to as_strided.
I thought mlab was being deprecated at some point. How useful is this to add this code piece, versus adding a pytest.skipif(sys.maxsize < 2**32)
on the failing tests and suggesting users to do this themselves if they want to do large array calculations on 32-bit systems?
Fair enough; I don't know what the status of the deprecations are at this point. I will say that this is reverting to the pre-#21190 code, so it's not new, and I've been using the patch on Fedora without issue since that PR, so it's been stable AFAICT.
769eeb1
to
8435793
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To me, it seems more sense to put the comment about OOM in the 32-bit clause, but apart from that it makes sense to add this to simplify life for those running 32-bit systems.
bc3a3ba
to
af49409
Compare
af49409
to
8ef71ee
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the special-case here can go away, it should be handled by the general case below.
lib/matplotlib/mlab.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nowadays I think it is more usual to error out if noverlap or n are not ints (or rather number.Integer).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed, both error out in the 64-bit case for floats.
lib/matplotlib/mlab.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given that this is just a backcompat function I would suggest grouping together all the checks for conciseness, something like
if not (isinstance(n, Integer) and isinstance(noverlap, Integer) and 1 <= n <= x.size and n < noverlap: raise ValueError(f"n ({n}) and noverlap ({noverlap}) must be positive integers with n < noverlap and n <= x.size ({x.size})")
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops, I merged and didn't notice the typo: #30273
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor nits, but nothing critical. Feel free to merge with or without them.
This was originally for i686 on Fedora, but is now applicable to WASM, which is 32-bit. The older implementation doesn't OOM.
8ef71ee
to
f31ba35
Compare
Unfortunately, I applied the change from matplotlib#29115 (comment) directly without noticing the typo, or running full tests. So fix the swapped condition, and add a test (for `csd` only, which should be enough since everything goes though `_spectral_helper`.)
Unfortunately, I applied the change from matplotlib#29115 (comment) directly without noticing the typo, or running full tests. So fix the swapped condition, and add a test (for `csd` only, which should be enough since everything goes though `_spectral_helper`.)
Unfortunately, I applied the change from matplotlib#29115 (comment) directly without noticing the typo, or running full tests. So fix the swapped condition, and add a test (for `csd` only, which should be enough since everything goes though `_spectral_helper`.)
Unfortunately, I applied the change from matplotlib#29115 (comment) directly without noticing the typo, or running full tests. So fix the swapped condition, and add a test (for `csd` only, which should be enough since everything goes though `_spectral_helper`.)
Unfortunately, I applied the change from matplotlib#29115 (comment) directly without noticing the typo, or running full tests. So fix the swapped condition, and add a test (for `csd` only, which should be enough since everything goes though `_spectral_helper`.)
Uh oh!
There was an error while loading. Please reload this page.
PR summary
I've long had the patch on Fedora (since #21190 (comment)), but it's now applicable to WASM as well (#29093), which is 32-bit. The older implementation doesn't OOM.
cc @anntzer as original author of that PR in case you have an alternate implementation idea.
PR checklist