ENH: Only create ticks if required #27027


Draft

larsoner wants to merge 2 commits into matplotlib:main

from larsoner:ticks


larsoner


Contributor

@larsoner larsoner commented Oct 7, 2023

PR summary

Consider this a proposal-by-draft-PR since I couldn't see how tractable a solution would be until I actually fixed/improved the behavior, and this very well might not be a good solution or work fully correctly yet! #6664 being closed got me to look again at whether in MNE-Python we should plot traces ourselves with fake mini-axes or try creating hundreds of axes (below I used 720 axes):

Benchmarking code

import numpy as np
import matplotlib.pyplot as plt
import time
rng = np.random.default_rng(0)
n_ch = 720
n_samp = 1000
n_col = int(np.ceil(np.sqrt(n_ch)))
n_row = int(np.ceil(n_ch / n_col))
t = np.linspace(0, 1. / n_col * 0.8, n_samp)
data = rng.uniform(0, 1. / n_row * 0.8, (n_ch, n_samp))
positions = np.array(
    np.meshgrid(
        np.linspace(0, 1, n_row + 2)[1:-1],
        np.linspace(0, 1, n_col + 2)[1:-1],
    )
)
positions.shape = (2, -1)
positions = positions.T[:n_ch]
assert positions.shape == (n_ch, 2)
rcParams = plt.rcParams
rcParams['xtick.top'] = False
rcParams['xtick.bottom'] = False
rcParams['xtick.labeltop'] = False
rcParams['xtick.labelbottom'] = False
rcParams['ytick.left'] = False
rcParams['ytick.right'] = False
rcParams['ytick.labelleft'] = False
rcParams['ytick.labelright'] = False
rcParams["axes.grid"] = False
# Unified: all traces in a single Axes, positioned with offsets
t0 = time.time()
fig = plt.figure(layout=None)
ax = fig.add_axes([0, 0, 1, 1])
for p, d in zip(positions, data):
    ax.plot(p[0] + t, p[1] + d, color='k', lw=0.5)
fig.canvas.draw_idle()
print(time.time() - t0)
# Separate: one Axes per trace
t0 = time.time()
fig, axes = plt.subplots(n_row, n_col, layout=None)
for ax, d in zip(axes.ravel(), data):
    ax.plot(t, d, color="k", lw=0.5)
    ax.axis("off")
for ax in axes.ravel()[n_ch:]:
    fig.delaxes(ax)  # remove the unused trailing Axes
fig.canvas.draw_idle()
print(time.time() - t0)

Running this script on main prints two timings: the first is the time it takes to plot all 720 of the 1000-sample traces within a single Axes (positioning them with offsets), and the second is the time it takes using 720 axes, each with a single trace:

0.28928518295288086
2.8085598945617676

On this PR the timings are:

0.3005490303039551
0.9130969047546387
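As a quick check on the numbers above, the two runs can be compared directly. This is only arithmetic on the four printed timings; the variable names are made up for illustration.

```python
# Timings printed by the benchmark (seconds), copied from the two runs above.
single_axes_main, many_axes_main = 0.28928518295288086, 2.8085598945617676
single_axes_pr, many_axes_pr = 0.3005490303039551, 0.9130969047546387

# How much faster the 720-axes case is on this PR than on main.
speedup = many_axes_main / many_axes_pr

# How much slower 720 axes are than a single Axes, before and after.
ratio_main = many_axes_main / single_axes_main
ratio_pr = many_axes_pr / single_axes_pr

print(f"720-axes speedup: {speedup:.2f}x")
print(f"many/single ratio on main: {ratio_main:.1f}x, on this PR: {ratio_pr:.1f}x")
```

The single-Axes timing is essentially unchanged between the two runs, so the difference comes from the per-Axes overhead.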

So we cut the time down from ~2.8s to ~0.9s. The code here was developed using kernprof/line-profiler to see that the bulk of time was spent formatting ticks that would never be used. To prevent tick creation (and thus reformatting) both of the changes in this PR were necessary, namely:

1. autodetecting when NullLocator can be used in LinearScale, and
2. explicitly avoiding creating any ticks when NullLocator is in use
3. avoid accessing majorTicks[0] without checking len first in one place (could be others!)
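The decision behind the first item and the guard in the last item can be sketched in plain Python. This is an illustration under assumptions, not the code in the PR: `needs_major_ticks` and `first_tick_or_none` are hypothetical helpers, `rc` is any mapping with the same keys as `rcParams`, and the x branch is assumed to mirror the y branch shown in the review snippet.

```python
# Hypothetical helpers, not matplotlib API.

# rcParams keys that turn major ticks on for each named axis.
_TICK_SIDES = {
    "x": ("xtick.bottom", "xtick.top"),
    "y": ("ytick.left", "ytick.right"),
}


def needs_major_ticks(axis_name, rc):
    """Return True if any tick side is enabled for this axis.

    True would correspond to choosing AutoLocator, False to NullLocator.
    An axis that is not named 'x' or 'y' gets False here.
    """
    sides = _TICK_SIDES.get(axis_name, ())
    return any(rc[key] for key in sides)


def first_tick_or_none(major_ticks):
    """Guarded version of major_ticks[0] for a possibly empty tick list."""
    return major_ticks[0] if len(major_ticks) else None


rc_off = {"xtick.bottom": False, "xtick.top": False,
          "ytick.left": False, "ytick.right": False}
print(needs_major_ticks("x", rc_off))   # all sides off
print(first_tick_or_none([]))           # empty list is handled
```

Note that in this sketch any axis name other than x or y ends up on the NullLocator side, which is the kind of name-based special casing that a reviewer might want to look at closely.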

It really doesn't seem like (2) should be required in principle, but I couldn't figure out how to avoid the tick creation with the _LazyTickList and how it gets accessed/used -- I couldn't wrap my head around it, and no matter what I did, 2 ticks were always created per axis. And that means 4 per plot, i.e., 2880 ticks with text and lines and such that need to be processed (hence the time savings by avoiding it in this PR).
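The tick count quoted above follows from the benchmark parameters. A small worked version, with variable names chosen only for illustration:

```python
n_plots = 720        # number of Axes in the benchmark (n_ch)
axes_per_plot = 2    # one x axis and one y axis per Axes
ticks_per_axis = 2   # ticks created per axis on main, per the description above

ticks_on_main = n_plots * axes_per_plot * ticks_per_axis
print(ticks_on_main)  # 2880
```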

If this seems like a reasonable or workable approach, it looked like there was some very similar code elsewhere in scale.py that could use a similar treatment.

Profiling with py-spy record -f speedscope --subprocesses --nonblocking --rate 1000 python ~/Desktop/topo_bench.py, I'm not sure there are any more big gains to be made here. There is maybe another 50 ms from avoiding spines or 50 ms from avoiding text resetting, but those seem like much more challenging targets.

cc @drammock and @ruuskas who I discussed this with a bit recently

PR checklist

[N/A] "closes #0000" is in the body of the PR description to link the related issue
[N/A] new and changed code is tested
- Existing tests should pass and I'm not sure we could query the problematic, intermediate/transient (I think) internal state where .majorTicks has two entries even when NullLocator is in use.
[N/A] Plotting related features are demonstrated in an example
[N/A] New Features and API Changes are noted with a directive and release note
[N/A] Documentation complies with general and docstring guidelines

@larsoner


 ENH: Only create ticks if required

6428a1c

jklymak

jklymak reviewed

Oct 7, 2023


Member


Speeding this up would be great! OTOH this seems to do it by carving out specially named axes?

lib/matplotlib/scale.py

        axis.axis_name == 'y' and (mpl.rcParams['ytick.left'] or mpl.rcParams['ytick.right'])):
    axis.set_major_locator(AutoLocator())
else:
    axis.set_major_locator(NullLocator())


Member

@jklymak jklymak Oct 7, 2023


I think this is breaking on polar where I think these axes are named something else (r and theta?)

But I'm curious how this is helping you? Are your axes not named x and y?


Contributor Author

@larsoner larsoner Oct 7, 2023 •


But I'm curious how this is helping you? Are your axes not named x and y?

This logic is copied directly from below (lines 109-113 on main), only adapting it for major in place of minor. Below, for the minor ticks, if the axis is x (or y) and the user has xtick.minor.visible (or ytick.minor.visible) set to False, it uses NullLocator for the minor ticks. So this was the equivalent logic (I think) for major: if the axis is x (or y) and the user has the major ticks turned off (xtick.top and xtick.bottom, or ytick.left and ytick.right, all False), it uses NullLocator for the major ticks.


Contributor Author

@larsoner larsoner Oct 7, 2023


I think this is breaking on polar where I think these axes are named something else (r and theta?)

I think I see something else. Assuming Polar uses a linear scale (I assume it does), Polar could be assuming that the locator is AutoLocator, or whatever the default was before my change. It then calls set_params(...) on what is now a NullLocator, which is not allowed (safely at least). At least that's what I'm thinking from this traceback:

/home/runner/work/matplotlib/matplotlib/lib/matplotlib/projections/polar.py:421: in clear
 super().clear()
/home/runner/work/matplotlib/matplotlib/lib/matplotlib/axis.py:897: in clear
 self._set_scale('linear')
/home/runner/work/matplotlib/matplotlib/lib/matplotlib/projections/polar.py:433: in _set_scale
 self.get_major_locator().set_params(steps=[1, 1.5, 3, 4.5, 9, 10])
/home/runner/work/matplotlib/matplotlib/lib/matplotlib/ticker.py:1605: in set_params
 _api.warn_external(
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
message = "'set_params()' not defined for locator of type <class 'matplotlib.ticker.NullLocator'>"

One easy fix/workaround is to make clear, or maybe better Polar._set_scale, set the locator to AutoLocator, since it almost immediately thereafter assumes it can call set_params on it.
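A rough sketch of that workaround, outside of the Polar code itself. This is not the change made in the PR: `locator_with_steps` is a hypothetical helper, and the sketch relies only on `AutoLocator` being a `MaxNLocator` subclass whose `set_params` accepts `steps`, which `NullLocator` does not support.

```python
from matplotlib.ticker import AutoLocator, MaxNLocator, NullLocator

# Steps taken from the polar.py line in the traceback above.
POLAR_STEPS = [1, 1.5, 3, 4.5, 9, 10]


def locator_with_steps(locator, steps=POLAR_STEPS):
    """Hypothetical helper: return a locator on which `steps` has been set.

    A locator that cannot take `steps` (such as NullLocator) is replaced by
    a fresh AutoLocator before set_params is called.
    """
    if not isinstance(locator, MaxNLocator):
        locator = AutoLocator()
    locator.set_params(steps=steps)
    return locator


replaced = locator_with_steps(NullLocator())  # swapped for an AutoLocator
original = AutoLocator()
kept = locator_with_steps(original)           # already compatible, reused
print(type(replaced).__name__, kept is original)
```

One consequence of this approach is that the NullLocator is swapped back out, so an axis handled this way would presumably not benefit from the fast path.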


Member

@jklymak jklymak Oct 7, 2023


OK, I see - this is just not setting the locators/formatters if the ticks are turned off. That seems a reasonable short path, if a rare use case.

@larsoner


 FIX: Conditional setting

@timhoffm


Member

timhoffm commented Oct 8, 2023

Handling each tick individually is costly, and almost always all ticks have the same style. I have a rough idea for migrating from individual ticks to tick collections. These insert an abstraction layer, replacing the current tick lists and can be more efficient internally. Though, we need to ensure backward compatibility and still allow styling individual ticks. The strategy here is to allow access to individual ticks, and if the user does that the internal representation of the tick collection falls back to a list of individual ticks. Within our own library code, we don’t use that interface (in standard plots, there is no need to access individual ticks) and thus have the more efficient internal representation.
@larsoner if you are interested in looking into this, here is a very first step in that direction: main...timhoffm:matplotlib:tick-refactor
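The fallback strategy described in this comment can be sketched in plain Python. Everything below is hypothetical and is not the code on the linked branch: `Tick` and `TickCollection` are stand-in classes, and the style is just a dict, chosen to show the idea of a shared representation that only expands into individual ticks when one is accessed.

```python
from dataclasses import dataclass


@dataclass
class Tick:
    """Stand-in for an individual tick with its own style."""
    position: float
    style: dict


class TickCollection:
    """Hypothetical container: shared style until a single tick is requested."""

    def __init__(self, positions, **style):
        self._positions = list(positions)
        self._style = dict(style)
        self._ticks = None  # individual Tick objects, created on demand

    def __len__(self):
        return len(self._positions)

    @property
    def uses_fast_path(self):
        return self._ticks is None

    def set(self, **style):
        # Always keep the shared style current.
        self._style.update(style)
        if self._ticks is not None:
            # Slow path: someone accessed individual ticks, so update each.
            for tick in self._ticks:
                tick.style.update(style)

    def __getitem__(self, index):
        # First access to an individual tick switches to the list form.
        if self._ticks is None:
            self._ticks = [Tick(p, dict(self._style)) for p in self._positions]
        return self._ticks[index]


ticks = TickCollection([0.0, 0.5, 1.0], color="k")
ticks.set(color="r")   # one dict update, regardless of the number of ticks
first = ticks[0]       # user touches a single tick: fall back to a list
first.style["color"] = "g"
print(ticks.uses_fast_path, ticks[0].style, ticks[1].style)
```

In this sketch an empty collection costs one dict update per `set` call, and styling one tick individually leaves the others untouched.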

@jklymak


Member

jklymak commented Oct 8, 2023

@timhoffm I agree that a TickCollection would be great, and it would be great if @larsoner chose to work on it. However, would that more general plan block the fast-path proposed here?

@timhoffm


Member

timhoffm commented Oct 9, 2023

It wouldn’t necessarily block this PR, but it might be that the PR logic can be implemented more consistently behind such an abstraction layer, or that it’s not needed at all afterwards.

@larsoner


Contributor Author

larsoner commented Oct 9, 2023

it might be that the PR logic can be implemented more consistently behind such an abstraction layer, or that it’s not needed at all afterwards.

It's possible. I'm not totally sure, but maybe I could add some tests here. For example if you request "no ticks" (like in my example above) currently in main two ticks per axis do get created which seems counter-intuitive. So I could continue pursuing this PR and add tests that "if I request no ticks, none should be created". Then these should still pass after the refactor after some adjustment like "the collection should be empty / length zero".

I'm also unsure if using a collection with no properties will be as fast (not knowledgeable enough to know) -- you'd still have to iterate over 720*2 of the collections and set properties in my example above, whereas after this PR you won't iterate over anything because .majorTicks will be an empty list. That would still be better than in main where you iterate over 720*2*2 of them, but not as good as iterating over an empty list (unless the collection that's used is very smart about doing no-ops for property setting when it's "empty").

@github-actions github-actions bot added the status: needs rebase label

Jan 7, 2025

Labels

status: needs rebase

3 participants

@larsoner @timhoffm @jklymak
