MEP32: matplotlibrc/mplstyle with Python syntax. #9528


Closed
anntzer wants to merge 3 commits into matplotlib:main from anntzer:mplrc-mep


Contributor

@anntzer anntzer commented Oct 22, 2017

See proposed MEP text in the PR. The rendered version is available by clicking on view at the top right of the "files changed" tab.

I had this written for a while but was hoping to push this a bit later (I don't really have the time to work on it now). #6157 (comment) made me consider at least publishing my current thoughts.

attn @matplotlib/developers

rotsee reacted with thumbs up emoji
Member

jklymak commented Oct 22, 2017

Cool - I think I prefer the straight python syntax. How does that look to the user in their code where they call the style sheet? Same as before?

Contributor Author

anntzer commented Oct 22, 2017

Indeed, I did not mention that point (nothing changes for the end user). Edited accordingly.

jklymak reacted with thumbs up emoji

@anntzer anntzer force-pushed the mplrc-mep branch 3 times, most recently from a1dff3f to 43cab58 Compare October 23, 2017 08:49
@jklymak jklymak added this to the v3.0 milestone Dec 21, 2017
Member

jklymak commented Dec 21, 2017

Milestoning as 3.0, a) because I think this would be great and I want to pop it to the top of everyone's queue, and b) because I would be surprised if it was accepted for 2.2.

tacaswell reacted with thumbs up emoji

Member

efiring commented Dec 21, 2017

I also like plain python for configuration. I ran into a speed bump, but got over it, when doing something like this. See the Bunch.from_pyfile() method in https://currents.soest.hawaii.edu/hgstage/pycurrents/file/tip/system/misc.py.

Member

Thanks for putting this together! I am coming around to this being the right thing to do (despite my grumbling).

A third option for the full-python implementation is to look for modules with a function

def apply_style(rcparams) -> None:
    ...

and have style.use pass the rcParams file through them in sequence. This would also let us do the update atomically by collecting the changes all into a temporary namespace and then update the global one once. It would also avoid having to delete the module to re-import it.

This would also allow for some interesting import hook implementations...

Another option would be to make rcparams a Traitlets object and leverage all of their configuration management.
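As a rough illustration of the apply_style idea above, here is a minimal sketch that uses a plain dict in place of rcParams. The names use_styles, dark_background, thick_lines and broken_style are hypothetical, and this is not how style.use is implemented. It only shows how hooks could run against a scratch copy so that the shared settings change in one step or not at all.

```python
from typing import Callable, Iterable, MutableMapping

# A style hook receives a mutable mapping of settings and edits it in place.
StyleHook = Callable[[MutableMapping], None]


def use_styles(global_rc: MutableMapping, hooks: Iterable[StyleHook]) -> None:
    """Run every hook on a scratch copy, then publish the result once."""
    # Work on a copy so that a failing hook leaves global_rc untouched.
    scratch = dict(global_rc)
    for hook in hooks:
        hook(scratch)
    # Single update at the end. Note: keys deleted by a hook are not removed
    # from global_rc; this sketch assumes hooks only set values.
    global_rc.update(scratch)


# Hypothetical style hooks, standing in for functions found in style modules.
def dark_background(rcparams: MutableMapping) -> None:
    rcparams["axes.facecolor"] = "black"


def thick_lines(rcparams: MutableMapping) -> None:
    rcparams["lines.linewidth"] = 3.0


def broken_style(rcparams: MutableMapping) -> None:
    rcparams["lines.linewidth"] = 1.0
    raise ValueError("bad style")


rc = {"axes.facecolor": "white", "lines.linewidth": 1.5}
use_styles(rc, [dark_background, thick_lines])

try:
    use_styles(rc, [broken_style])
except ValueError:
    pass  # rc still holds the values from the successful call above
```

Because the hooks are ordinary functions, nothing has to be deleted from sys.modules and re-imported to apply a style a second time.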

anntzer and jenshnielsen reacted with thumbs up emoji

Member

jklymak commented Mar 7, 2018

Ping @anntzer - I personally think this would be a great feature for 3.0.... (once you are done expunging any py2-isms that have been driving you nuts).

Member

WeatherGod commented Mar 7, 2018 via email

This is basically dead on arrival as far as I am concerned. I have spoken to a few other people in the private sector who have to put their systems through security review, and they all agreed that matplotlib loading arbitrary, unsanitized code upon import, with no ability to exclude that operation or force a plain config approach, would fail security review. This is particularly problematic because matplotlib is a dependency of so many other projects that people may not even be aware that their system is vulnerable.

The argument that "ipython does it, why can't we?" is misleading. ipython is intended to be an end-user component, with its design and implications all up front. You don't accidentally have ipython be a dependency because it was a dependency of something else a few layers deep. matplotlib is used as both an end-user component *and* as an API library. Its use is almost as prevalent as numpy. Imagine if numpy executed a config file every time it was imported!

If this were to go forward, serious work would need to go into its design, particularly for sanitizing the input and constraining its effects. I doubt that the value proposition is there versus using a proper parser, several of which exist and could be adapted to our use.

Member

jklymak commented Mar 7, 2018

Good point. I do wonder about MPL arbitrarily loading a config file, without the opportunity to turn that convenience off, or even better to have it turned off by default. I think it makes sense that downstream packages should never load any matplotlibrc file.

Contributor Author

anntzer commented Mar 8, 2018

I am not particularly interested in relitigating this question (although I will state once again that 1) did you know that np.load can also execute arbitrary code? 2) the "proper parser" proposed has been in a state of vaporware for quite a while), but given that both @tacaswell and @efiring consider that this is a good approach, I'll let them present their thoughts (if any) on the security implications.

As for the implementation strategy I think I quite like @tacaswell's suggestion of requiring an apply_style hook.

Member

WeatherGod commented Mar 8, 2018 via email

Which is why I don't use np.load() in my production code! We already covered this. The point is that I can choose to not use np.load(). I can choose to not use ipython. But if matplotlib goes down this route, it would happen at import time, and can't be avoided.

Contributor

pwuertz commented Mar 8, 2018

Just to throw in my 2 cents: From what I see none of the arguments against YAML mentioned in the linked discussion apply here (any more). The ruamel.yaml implementation is actively maintained (forget about PyYAML) and does safe parsing by default, i.e. there is no security risk of importing/constructing arbitrary objects by default. You just explicitly add the constructors for the custom types (like cycler) to be supported.

In my personal opinion YAML currently is the most human-readable, human-editable configuration language out there. And having a parser/dumper with round-trip support (like ruamel.yaml) makes it machine-readable/writable too.

Contributor Author

anntzer commented Mar 8, 2018

I guess I'm +0 on using yaml (if it is indeed possible to define custom type loaders), I don't like the language but it's not too bad and most importantly at least it has a spec...

Member

WeatherGod commented Mar 8, 2018 via email

Thanks for pointing out ruamel! I hadn't noticed it before!

Contributor

Given how it shows up in my conda updates, I think ruamel is used by conda for parsing its config.

I somewhat agree with @WeatherGod here: as a library, we need to tread differently than an application. In theory, the same level of permissions required to modify the config would be used to run the script (barring misconfiguration), so no security risk. On the other hand, we're literally talking about adding a hook to trigger arbitrary code execution, which screams out CVE to me (as irrational as that might be). (I think my feeling on this has changed from what it was in the past.)

Contributor Author

anntzer commented Mar 8, 2018

> In theory, the same level of permissions required to modify the config would be used to run the script (barring misconfiguration), so no security risk.

Thanks for pointing that out. At this point I'm genuinely curious what threat model we're talking about avoiding. As in, can you present any way in which such a patch (say, matplotlib executes ~/.config/matplotlibrc.py on import, just to pick a concrete case) actually makes things more dangerous?

Note also that you can opt out: just set the MATPLOTLIBRC environment variable to os.devnull (/dev/null) before importing matplotlib. (Well, if that doesn't work, then there should be a patch that makes it work.)
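For readers who want to try the opt-out described above, a minimal sketch follows. It assumes that the environment variable is set before the first import of matplotlib anywhere in the process, and, as the comment itself hedges, whether os.devnull is accepted may depend on the platform and the Matplotlib version. The import is guarded so the snippet also runs where Matplotlib is not installed.

```python
import os

# Must happen before the first `import matplotlib` in the process; setting it
# afterwards has no effect on the rc file that was already loaded.
os.environ["MATPLOTLIBRC"] = os.devnull

try:
    import matplotlib
except ImportError:
    matplotlib = None  # Matplotlib is not installed; nothing more to show.
else:
    # matplotlib_fname() reports which rc file Matplotlib resolved.
    print("rc file in use:", matplotlib.matplotlib_fname())
```

The same effect can be had from the shell by exporting MATPLOTLIBRC before starting Python, which avoids any dependence on import order inside the program.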

Contributor

I'm not a security expert, nor do I want to pretend to be, so I feel really uncomfortable trying to decide what's "safe".

There's also a difference between what's actually safe and what our user community would perceive as safe.

Contributor Author

anntzer commented Mar 8, 2018

I guess it's good that the discussion restarted there, because I got to think about the issue again and am going to renege on my "uninterest in relitigating it" mentioned above.

Let's, again, use a concrete proposal as basis of discussion: at import time, matplotlib tries to import a module named matplotlibrc(.py) from the normal config path (possibly modified by MATPLOTLIBRC), and tries to call the apply_style(rc) function defined there. It is possible to disable this feature (for example) by setting MATPLOTLIBRC to os.devnull. Note that neither eval nor exec appear in this proposal :-)
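To make the concrete proposal above easier to picture, here is a minimal sketch under stated assumptions. The helper load_rc_module is hypothetical and is not part of Matplotlib, a plain dict stands in for rcParams, and the config path is passed in explicitly instead of being looked up. The os.devnull check mirrors the opt-out mentioned in the proposal.

```python
import importlib.util
import os
import tempfile
from pathlib import Path


def load_rc_module(path, rc):
    """Import the file at `path` and call its apply_style(rc), if defined.

    Returns True when a hook was found and called, False otherwise.
    """
    path = os.fspath(path)
    if path == os.devnull:
        return False  # explicit opt-out: nothing is imported
    spec = importlib.util.spec_from_file_location("matplotlibrc", path)
    module = importlib.util.module_from_spec(spec)
    # Importing runs the file's top-level code, like any other Python import.
    spec.loader.exec_module(module)
    hook = getattr(module, "apply_style", None)
    if hook is None:
        return False
    hook(rc)
    return True


# Demonstration with a throwaway config file.
rc = {"lines.linewidth": 1.5}
with tempfile.TemporaryDirectory() as tmp:
    rc_file = Path(tmp) / "matplotlibrc.py"
    rc_file.write_text(
        "def apply_style(rc):\n"
        "    rc['lines.linewidth'] = 2.0\n"
    )
    applied = load_rc_module(rc_file, rc)

skipped = load_rc_module(os.devnull, rc)
```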

The fear appears to be that that module can be modified by an attacker or an oblivious user to execute arbitrary code, e.g. os.system("rm -rf /"), and we don't want to be responsible for that. But wait! An attacker or an oblivious user had a much simpler way to achieve the same thing: they could just write this into six.py and put that in the user's cwd (or ~/.local/lib/python3.6/site-packages), and we do import six (and other dependencies) without any kind of validation (just as nearly all Python packages in the world do).

From this I conclude that we (and nearly all other Python packages that have external dependencies) are already vulnerable to arbitrary code execution (in that model), and can essentially do nothing about it.

Edit: for improved impact, replace six.py by some stdlib module name, of course.
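The shadowing argument above can be reproduced harmlessly. In this sketch the module name innocent_dep is hypothetical and stands in for six.py; the file is written to a temporary directory that is placed at the front of sys.path, which is roughly the position the current working directory has when a script is run from it.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Anything at the top level of this file runs as soon as it is imported.
    (Path(tmp) / "innocent_dep.py").write_text(
        "SIDE_EFFECTS = ['code ran at import time']\n"
    )
    sys.path.insert(0, tmp)  # earlier entries win over later ones
    importlib.invalidate_caches()
    try:
        dep = importlib.import_module("innocent_dep")
    finally:
        sys.path.remove(tmp)

print(dep.SIDE_EFFECTS)
```

No validation step sits between finding the file and executing it, which is the point being made about ordinary dependencies.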

Contributor

Just my 2 cents, I think this conversation should have input from somebody with experience in security issues who can vet the proposal. Security vulnerabilities are hard to spot and, as suggested in this thread, can be a deal-breaker for some groups. I'd feel more comfortable moving forward on this if a trusted voice said that it was OK from a security standpoint.

jkseppan reacted with thumbs up emoji

@jklymak jklymak modified the milestones: v3.0, v3.1 Jul 9, 2018
@anntzer anntzer mentioned this pull request Oct 12, 2018
6 tasks
@tacaswell tacaswell modified the milestones: v3.1.0, v3.2.0 Feb 26, 2019
@anntzer anntzer mentioned this pull request Jul 31, 2019
6 tasks
@tacaswell tacaswell modified the milestones: v3.2.0, v3.3.0 Aug 19, 2019
@QuLogic QuLogic modified the milestones: v3.3.0, v3.4.0 Apr 30, 2020
@jklymak jklymak marked this pull request as draft September 12, 2020 19:54
@QuLogic QuLogic modified the milestones: v3.4.0, v3.5.0 Jan 21, 2021

> In my personal opinion YAML currently is the most human-readable, human-editable configuration language out there. And having a parser/dumper with round-trip support (like ruamel.yaml) makes it machine-readable/writable too.

Sorry for resurrecting an old issue, but I just saw it milestoned for 3.5.0 and had a look. I was wondering if toml had been considered (it's not part of the list of formats in the alternatives). It basically combines (IMO) the best of both worlds between YAML & JSON, and results in a very readable format (and is getting ever more widespread use through PEP518, cargo, etc.).

Since this allows easily defining arrays etc., and has a fully specified & verifiable grammar, that might also help with having to eval strings that currently contain more complicated expressions.

NeilGirdhar reacted with thumbs up emoji

Contributor Author

anntzer commented May 1, 2021

I don't think PEP518 is a good argument here; PEP518 explicitly chose to introduce a new format rather than using Python literals (https://www.python.org/dev/peps/pep-0518/#python-literals) because they expect build systems to be written in other languages than Python (note that this is the only point against them), whereas we certainly don't expect matplotlibrc being read by anyone other than Matplotlib.
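For context on what "Python literals" can mean in practice, the sketch below uses ast.literal_eval from the standard library, which accepts only literal syntax (numbers, strings, tuples, lists, dicts, booleans, None) and refuses anything that would need to be executed. This is an illustration of the general technique, not the format proposed in the MEP, and the dict layout is an assumption made for the example.

```python
import ast

# Hypothetical style source written as a single Python dict literal.
STYLE_SOURCE = """
{
    "lines.linewidth": 1.5,
    "figure.figsize": (6.4, 4.8),
    "axes.grid": True,
}
"""

style = ast.literal_eval(STYLE_SOURCE.strip())

# Anything beyond a literal, such as a function call, is rejected.
try:
    ast.literal_eval('__import__("os").getcwd()')
except ValueError:
    rejected = True
else:
    rejected = False
```

The trade-off is that custom objects cannot be written directly in such a file and would need some separate convention.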


I'm not saying it's a complete answer (e.g. I haven't understood the cycler requirements beyond the fact that some arrays are incompatible with the current comma-dependent parsing), my core point was mainly: TOML > YAML

anntzer and NeilGirdhar reacted with thumbs up emoji

@tacaswell tacaswell modified the milestones: v3.5.0, v3.6.0 Aug 5, 2021
@timhoffm timhoffm modified the milestones: v3.6.0, unassigned Apr 30, 2022
@story645 story645 modified the milestones: unassigned, needs sorting Oct 6, 2022

Since this Pull Request has not been updated in 60 days, it has been marked "inactive." This does not mean that it will be closed, though it may be moved to a "Draft" state. This helps maintainers prioritize their reviewing efforts. You can pick the PR back up anytime - please ping us if you need a review or guidance to move the PR forward! If you do not plan on continuing the work, please let us know so that we can either find someone to take the PR over, or close it.

@github-actions github-actions bot added the status: inactive Marked by the "Stale" Github Action label Apr 21, 2023
Contributor Author

anntzer commented Apr 21, 2023

I think the idea is well-advertised now, and whether the issue is open or closed won't change much for the discussion.

Member

As a remark, strictYAML may be an improvement over the very general and complex YAML standard. https://github.com/crdoconnor/strictyaml

Labels: status: inactive (Marked by the "Stale" Github Action), topic: rcparams