FIX: get_datalim for collection #13642
Conversation
This is WIP from a still-need-to-fix-tests point of view, but I wanted to have a concrete proposal on the table for fixing part of #7413.
I think the general approach is good, haven't reviewed carefully yet.
Does this close #6915/#11198 too?
In #7413 (comment) @tacaswell argued in favor of accepting clipped markers too. Despite my comments on that thread I now also think ignoring marker extents in scatter makes complete sense (the "larger-than-figure marker" case is quite convincing).
No, it doesn't fix #6915. But I don't think it's incompatible with your PR leaving the autolim to draw time, which makes sense to me.
OK, this breaks streamplots, mostly in manageable ways, but there are a couple of failures I can't figure out yet.
lib/matplotlib/collections.py
`transOffset != transData`
So, this PR damages autolim-ing on constructs like the one in `test_transforms`:

```python
times10 = mtransforms.Affine2D().scale(10)
ax.scatter(np.linspace(0, 10), np.linspace(10, 0), transform=times10 + ax.transData)
```

I'm OK with that, but wanted to let folks know... The new test file looks fine, but the limits have changed substantially: https://github.com/matplotlib/matplotlib/blob/d3f8447cc69b21be885c8683c942043e2903ac7f/lib/matplotlib/tests/baseline_images/test_transforms/pre_transform_data.png
BTW, is there an easy way to tell if a transform has `ax.transData` in it? I.e. if someone has done a simple `offsetTrans + ax.transData`?
@jklymak Have a look at `contains_branch` or `contains_branch_separately` (not exactly sure about their use, but I think they may be what you want).
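A minimal sketch of how `contains_branch` answers that question. To keep it self-contained, a plain `Affine2D` is used as a stand-in for `ax.transData` (the stand-in is an assumption for illustration only; no figure is created):

```python
import matplotlib.transforms as mtransforms

# Stand-in for ax.transData, for illustration only.
data_trans = mtransforms.Affine2D().translate(1.0, 2.0)

# A user transform built like offsetTrans + ax.transData.
times10 = mtransforms.Affine2D().scale(10)
composite = times10 + data_trans

# contains_branch reports whether a transform appears as a
# sub-tree of the composite chain.
assert composite.contains_branch(data_trans)
assert not times10.contains_branch(data_trans)
```

So the autolim code could, in principle, check `transform.contains_branch(ax.transData)` rather than testing strict equality against `ax.transData`.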
Please see updated description above.
I can make an API change doc etc, but wanted feedback on the approach first. Ping @ImportanceOfBeingErnest as you have been known to make extensive use of the transform machinery. This will break some autolimits for some transforms.
Instead of changing the baseline images, can you keep the old ones, add an assert that the x/ylims are what they should be per the new approach, then force the x/ylims to match the old values?
I'll wait until you look into `contains_branch` and possibly don't change the baseline images?
But again, generally speaking I like the approach.
> Instead of changing the baseline images, can you keep the old ones, add an assert that the x/ylims are what they should be per the new approach, then force the x/ylims to match the old values?
Moved this discussion to gitter 😉
> I'll wait until you look into contains_branch...
Seems to work! `transform=times10 + ax.transData` now works, though the image still changes because the scatter in that test changes its limits...
Copy-pasting my reply on gitter:

- Instead of regenerating the images, it may also be worth checking whether we could do away with some of the tests, in case they're redundant with others.
- For example, the test that generates scatter.png could become a `check_figures_equal` test comparing `scatter()` with manually calling `plot()` to add one marker at a time; `scatter_post_alpha` too.
- I guess `colorbar_single_scatter` could skip the image check and instead check that the colorbar bounds are correctly set.
- Also, fewer baseline-image tests help mplcairo :) (as they are renderer-dependent).
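For concreteness, a hypothetical sketch of what such a conversion might look like (the test name and the exact draw calls needed to make the two figures render identically are assumptions that would need checking):

```python
from matplotlib.testing.decorators import check_figures_equal

@check_figures_equal(extensions=["png"])
def test_scatter_vs_plot(fig_test, fig_ref):
    # Test figure: a single scatter call.
    ax_test = fig_test.subplots()
    ax_test.scatter([0, 1, 2], [0, 2, 1], color="C0")
    # Reference figure: add one marker at a time with plot().
    ax_ref = fig_ref.subplots()
    for x, y in zip([0, 1, 2], [0, 2, 1]):
        ax_ref.plot(x, y, "o", color="C0")
```

`check_figures_equal` renders both figures and compares the results pixel-wise, so no baseline image file is needed.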
Fair enough - I'll look at rejiggering the tests after we get some consensus that this is a good way to go. Particularly as A/B-ing the tests shows the end-effects of this change for users....
...also ping @efiring
just delete this?
Oops, meant to reinstate this actually. Thanks...
I guess(?) you could get a few extra points by checking `contains_branch_separately` and then deciding to use the x or the y data limits if one of them does contain `transData`; this could be a separate PR too, as the use case is a bit obscure.
let's just forget this case for now
As discussed on gitter, I think that's best for simplicity's sake. For instance, I don't know what the datalim handling does with infs, and adding this case is a good chunk of code. As also discussed, I'm very happy to revisit if it turns out to be needed.
btw, I don't mind doing the conversion to check_figures_equal if you want.
I will need to look at this more thoroughly. Based on a quick glance, I wonder if we can't keep using `mpath.get_path_collection_extents` even in the cases you exclude here. It does not seem to give generally bad output; it just needs a good frozen transform to start with (maybe that is the hard part, but maybe not?) and it will then not be 100% accurate, but probably still better than cropping half the marker.
> I will need to look at this more thoroughly. Based on a quick glance I wonder if we can't keep using `mpath.get_path_collection_extents` even in the cases you exclude here. It does not seem to give generally bad output, it just needs a good frozen transform to start with (maybe that is the hard part, but maybe not?) and it will then not be 100% accurate, but probably still better than cropping half the marker.
We could heuristically do that when the marker size is not too big (say, less than a quarter of the axes size), but you still need to solve the inverse problem, which isn't necessarily trivial, particularly if you need to take into account the scale of the axes, or iterate to a solution as discussed before. Or accept the present state, which I think we can all agree is broken.
One further point: this doesn't typically cut off markers, because of the default data margins. Just really big markers, which are also the ones that cause auto-lim instability...
In the same spirit as #14845 this could be

```python
offset_to_data = transOffset - transData
offsets = offset_to_data.transform(offsets)
```

(you also save a transform by doing so).
I wasn't paying attention when #14845 went in. This requires remembering that `transA - transB` is the same as what I wrote here. I don't think it helps the readability/comprehensibility of all the transform magic to have arithmetic operators that do stuff, versus just using explicit methods.
The main operator is `trA + trB`, which just means "perform trA, then perform trB"; once you know it, it's quite natural that `trA - trB` means "perform trA, then perform the inverse of trB". You could write `mtransforms.composite_transform_factory(trA, trB.inverted())` if you prefer explicit method names, but the point is that it is more efficient to first multiply two small (3x3) matrices and then use the result to transform a big (N, 2) array, rather than to do two transformations in series.
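The efficiency argument can be illustrated outside matplotlib with plain homogeneous 3x3 matrices (the matrix values here are hypothetical; `A` plays the role of `transOffset` and `B` the role of `transData`):

```python
import numpy as np

# Two invertible affine transforms as homogeneous 3x3 matrices.
A = np.array([[2.0, 0.0, 1.0],
              [0.0, 2.0, -1.0],
              [0.0, 0.0, 1.0]])
B = np.array([[1.0, 0.0, 5.0],
              [0.0, 3.0, 0.0],
              [0.0, 0.0, 1.0]])

rng = np.random.default_rng(0)
pts = np.column_stack([rng.random((1000, 2)), np.ones(1000)])

# Two passes over the big array: apply A, then B's inverse.
two_pass = (pts @ A.T) @ np.linalg.inv(B).T

# One 3x3 product up front, then a single pass over the big array;
# this is what composing the transforms (a la "A - B") buys you.
one_pass = pts @ (np.linalg.inv(B) @ A).T

assert np.allclose(two_pass, one_pass)
```

Both routes give identical results; the composed form touches the (N, 2) data only once.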
Good point. Split the difference: `trA + trB.inverted()` seems straightforward enough. Like, I understand the `__sub__` convention, I just find it a bit obscure. The `__add__` convention is fine.
modulo ci
Unfortunately, `Axes.transData` does not link (I think because it's an undocumented instance variable), but there's one more that can be fixed:
Co-Authored-By: Elliott Sales de Andrade <quantum.analyst@gmail.com>
PR Summary
EDIT 12 Mar:
Closes #13639
Closes the non-unique auto-limit problem in #7413 for collections. Collections have "offsets" and "paths", and each can be in a different co-ordinate frame; this PR tackles the autolim problem by consulting `self.get_transform()` and `self.get_offset_transform()` for the collection and checking whether these are `ax.transData` or not.

For `collection.get_datalim` there are two cases that I can see in common usage:

A. `offsets=None`, as for `LineCollection`. In this case, all the vertices in the paths should be included in the autolim if `transform=ax.transData` (this is the case for `streamplot`).

B. `offsets` is not None:

- If the offset co-ords (`transOffset`) are not the axes `transData`, then don't auto-lim on this collection at all. I don't know offhand of any built-in examples like this, but if the user specifies a bunch of markers in axes co-ords, then they shouldn't expect the axes to change its x/y limits.
- If the path co-ords (`self.get_transform()`) are `transData`, then we use the existing algorithm. This is useful for paths like in `quiver`, where the offset defines the tail of an arrow and the path defines its length. In that case, the whole arrow is meant to be in data space, and the lims can change size accordingly.
- If the path co-ords are not `transData`, then we don't consult the paths at all, because these paths are in some other unit that doesn't help set the data limits. E.g. scatter markers are in points, and these can be arbitrarily large (even larger than the figure if you want), so setting the axes data limits based on this size is ill-defined. That led to many of the issues in #7413 (Autoscaling has fundamental problems). Of course the offset limits should still be included (i.e. we want the centres of the markers to be inside the new x/y limits).

This necessitated changing 14 image tests, mostly due to small axes-limit differences in `scatter` tests.

The major exception is `test_transforms.py::test_pre_transform_plotting`, which highlights a limitation of the consult-the-transforms approach used here. If the user passes a simple transform like `transform = mtransforms.Affine2D().scale(10) + ax.transData`, the autolimiting will not take place because `transform != ax.transData`. That's a cost of the approach, but I personally feel users should just transform the data before asking Matplotlib to plot it, or be willing to manually adjust the axes limits.

Before: (screenshot: Figure_2)
After: (screenshot: Figure_1)

Big Markers

Note that big markers will be clipped. This is "worse", but it's consistent...

Before: (screenshot: boo2)
After: (screenshot: boo)

Cures inconsistency

To see the inconsistency in the old way, note how the xlim changes with each re-lim. Of course the resulting markers are still clipped anyway:

Before: (screenshot: boo3)
After: (screenshot: boo4)
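The offsets/paths rubric from the summary above can be condensed into a small decision helper (a hypothetical sketch for illustration, not the PR's actual code; the function name and return labels are invented):

```python
def autolim_sources(has_offsets, offset_trans_is_data, path_trans_is_data):
    """Which parts of a collection should feed the axes data limits,
    following the rubric in the PR summary (sketch only)."""
    if not has_offsets:
        # Case A (e.g. a LineCollection from streamplot): use the path
        # vertices only if the paths are drawn in data co-ordinates.
        return "paths" if path_trans_is_data else "nothing"
    if not offset_trans_is_data:
        # Offsets in axes/figure co-ords: leave the data limits alone.
        return "nothing"
    if path_trans_is_data:
        # e.g. quiver: the whole arrow lives in data space.
        return "offsets+paths"
    # e.g. scatter: markers are sized in points, so only their
    # centres (the offsets) count.
    return "offsets"

assert autolim_sources(False, False, True) == "paths"
assert autolim_sources(True, False, True) == "nothing"
assert autolim_sources(True, True, True) == "offsets+paths"
assert autolim_sources(True, True, False) == "offsets"
```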
Failures:

- `test_axes`: failures are minor axes limits changing slightly.
- `test_collections.py`: `test_barb_limits` fails because it is constructed with an offset transform that is not the data transform, which fails the rubric above. Not sure how people feel about this; personally I'm not a fan of defining data at one location and then using the transform stack to move the object around, instead of just manipulating the original data, and if folks want to do this, I'm OK with auto-lims failing for them. If you are going to manually move things around with a low-level stack, then you can also be expected to set your limits accordingly: flexibility comes at the cost of automation.
- `test_EllipseCollection` and `test_regularpolycollection_rotate` fail somewhat badly because they are big paths that used to get some space from the existing algorithm, and no longer do.
- `scatter_post_alpha` is a very minor change.

PR Checklist