MEP32: matplotlibrc/mplstyle with Python syntax. #9528
Cool - I think I prefer the straight python syntax. How does that look to the user in their code where they call the style sheet? Same as before?
Indeed, I did not mention that point (nothing changes for the end user). Edited accordingly.
Milestoning as 3.0, a) because I think this would be great, and popping to the top of everyone's queue, and b) because I would be surprised if it was accepted for 2.2.
I also like plain python for configuration. I ran into a speed bump, but got over it, when doing something like this. See the Bunch.from_pyfile() method in https://currents.soest.hawaii.edu/hgstage/pycurrents/file/tip/system/misc.py.
Thanks for putting this together! I am coming around to this being the right thing to do (despite my grumbling).
A third option for the full-python implementation is to look for modules with a function
    def apply_style(rcparams) -> None: ...
and have style.use pass the rcParams file through them in sequence. This would also let us do the update atomically by collecting the changes all into a temporary namespace and then update the global one once. It would also avoid having to delete the module to re-import it.
This would also allow for some interesting import hook implementations...
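A minimal sketch of the hook idea above, not the proposed implementation: each style module exposes apply_style(rcparams), the hooks run in sequence against a scratch copy, and the live mapping is updated once at the end. The helper name apply_styles, the in-memory stand-in modules, and the plain dict used in place of the real rcParams are all assumptions made for illustration.

```python
from types import ModuleType
from typing import Iterable, MutableMapping


def apply_styles(style_modules: Iterable[ModuleType],
                 rcparams: MutableMapping) -> None:
    """Run each module's apply_style hook on a scratch copy, then commit once."""
    scratch = dict(rcparams)         # temporary namespace; the live mapping is untouched
    for module in style_modules:
        module.apply_style(scratch)  # a failing hook raises before anything is committed
    rcparams.update(scratch)         # single update of the live mapping


# Two stand-in "style modules" built in memory (hypothetical names).
dark = ModuleType("dark_style")
dark.apply_style = lambda rc: rc.update({"axes.facecolor": "black"})

thick = ModuleType("thick_style")
thick.apply_style = lambda rc: rc.update({"lines.linewidth": 3.0})

# A plain dict stands in for matplotlib's rcParams here.
rc = {"axes.facecolor": "white", "lines.linewidth": 1.5}
apply_styles([dark, thick], rc)
print(rc)
```

Because the hooks only ever see the scratch copy, an exception in any one of them leaves the live mapping exactly as it was. Note that this sketch only commits additions and overrides; a hook that deletes a key from the scratch copy would not remove it from the live mapping.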
Another option would be to make rcparams a Traitlets object and leverage all of their configuration management.
Ping @anntzer - I personally think this would be a great feature for 3.0.... (once you are done expunging any py2-isms that have been driving you nuts).
Good point. I do wonder about MPL arbitrarily loading a config file, without the opportunity to turn that convenience off, or even better to have it turned off by default. I think it makes sense that downstream packages should never load any matplotlibrc file.
I am not particularly interested in relitigating this question (although I will state once again that 1) did you know that np.load can also execute arbitrary code? 2) the "proper parser" proposed has been in a state of vaporware for quite a while), but given that both @tacaswell and @efiring consider that this is a good approach, I'll let them present their thoughts (if any) on the security implications.
As for the implementation strategy, I think I quite like @tacaswell's suggestion of requiring an apply_style hook.
Just to throw in my 2 cents: From what I see none of the arguments against YAML mentioned in the linked discussion apply here (any more). The ruamel.yaml implementation is actively maintained (forget about PyYAML) and does safe parsing by default, i.e. there is no security risk of importing/constructing arbitrary objects by default. You just explicitly add the constructors for the custom types (like cycler) to be supported.
In my personal opinion YAML currently is the most human-readable, human-editable configuration language out there. And having a parser/dumper with round-trip support (like ruamel.yaml) makes it machine-readable/writable too.
I guess I'm +0 on using yaml (if it is indeed possible to define custom type loaders), I don't like the language but it's not too bad and most importantly at least it has a spec...
Given how it shows up in my conda updates, I think ruamel is used by conda for parsing its config.
I agree some with @WeatherGod here--as a library, we need to tread differently than an application. In theory, the same level of permissions required to modify the config would be used to run the script (barring misconfiguration), so no security risk. On the other hand, we're literally talking about adding a hook to trigger arbitrary code execution, which screams out CVE to me (as irrational as that might be). (I think I've changed my feeling from in the past.)
In theory, the same level of permissions required to modify the config would be used to run the script (barring misconfiguration), so no security risk.
Thanks for pointing that out. At this point I'm genuinely curious what threat model we're trying to guard against. As in, can you present any way in which such a patch (say e.g. matplotlib executes ~/.config/matplotlibrc.py on import, just to pick a concrete case) actually makes things more dangerous?
Note also that you can opt out: just set the MATPLOTLIBRC environment variable to os.devnull (/dev/null) before importing matplotlib. (Well, if that doesn't work, then there should be a patch that makes it work.)
I'm not a security expert, nor do I want to pretend to be, so I feel really uncomfortable trying to decide what's "safe".
There's also a difference between what's actually safe and what our user community would perceive as safe.
I guess it's good that the discussion restarted there, because I got to think about the issue again and am going to renege on my "uninterest in relitigating it" mentioned above.
Let's, again, use a concrete proposal as basis of discussion: at import time, matplotlib tries to import a module named matplotlibrc(.py) from the normal config path (possibly modified by MATPLOTLIBRC), and tries to call the apply_style(rc) function defined there. It is possible to disable this feature (for example) by setting MATPLOTLIBRC to os.devnull. Note that neither eval nor exec appears in this proposal :-)
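The concrete proposal above could look roughly like the following. This is a sketch under stated assumptions, not what matplotlib does: load_rc_module is a hypothetical helper, the config directory is passed in explicitly rather than discovered, MATPLOTLIBRC is consulted only for the os.devnull opt-out, and a plain dict stands in for rcParams.

```python
import importlib.util
import os
import tempfile
from pathlib import Path


def load_rc_module(config_dir, rc):
    """Import matplotlibrc.py from config_dir and call its apply_style(rc) hook.

    Returns True if a hook ran, False if loading was skipped.
    """
    # Assumed opt-out convention from the proposal: MATPLOTLIBRC set to os.devnull.
    if os.environ.get("MATPLOTLIBRC") == os.devnull:
        return False
    path = Path(config_dir) / "matplotlibrc.py"
    if not path.is_file():
        return False
    spec = importlib.util.spec_from_file_location("matplotlibrc", path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)  # the regular import machinery runs the file
    module.apply_style(rc)
    return True


with tempfile.TemporaryDirectory() as config_dir:
    # A user-written config module with the proposed hook.
    Path(config_dir, "matplotlibrc.py").write_text(
        "def apply_style(rc):\n"
        "    rc['lines.linewidth'] = 2.0\n"
    )

    os.environ.pop("MATPLOTLIBRC", None)
    rc = {"lines.linewidth": 1.5}
    loaded = load_rc_module(config_dir, rc)

    # Same call with the opt-out in place: the file is never imported.
    os.environ["MATPLOTLIBRC"] = os.devnull
    rc_opt_out = {"lines.linewidth": 1.5}
    skipped = not load_rc_module(config_dir, rc_opt_out)
    del os.environ["MATPLOTLIBRC"]

print(loaded, rc, skipped, rc_opt_out)
```

The module is loaded from an explicit file path and is never registered in sys.modules, so reloading it later does not require deleting a cached module first.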
The fear appears to be that that module can be modified by an attacker or an oblivious user to execute arbitrary code, e.g. os.system("rm -rf /"), and we don't want to be responsible for that. But wait! An attacker or an oblivious user already has a much simpler way to achieve the same thing: they could just write this into six.py and put that in the user's cwd (or ~/.local/lib/python3.6/site-packages), and we do import six (and other dependencies) without any kind of validation (just as nearly all Python packages in the world do).
From this I conclude that we (and nearly all other Python packages that have external dependencies) are already vulnerable to arbitrary code execution (in that model), and can essentially do nothing about it.
Edit: for improved impact, replace six.py by some stdlib module name, of course.
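The shadowing argument can be reproduced harmlessly. The sketch below plants a module in a temporary directory placed at the front of sys.path (standing in for the user's cwd) and imports it by name; the module name harmless_dependency_demo is a hypothetical stand-in for six.py, and the "payload" only sets a string.

```python
import importlib
import sys
import tempfile
from pathlib import Path

MODULE_NAME = "harmless_dependency_demo"  # hypothetical stand-in for six.py

with tempfile.TemporaryDirectory() as cwd_like_dir:
    # Whoever can write to this directory controls what the import below executes.
    Path(cwd_like_dir, MODULE_NAME + ".py").write_text(
        "SIDE_EFFECT = 'top-level code ran at import time'\n"
    )
    sys.path.insert(0, cwd_like_dir)  # mimics a directory that precedes site-packages
    try:
        importlib.invalidate_caches()
        planted = importlib.import_module(MODULE_NAME)
    finally:
        # Clean up so the demo leaves the interpreter as it found it.
        sys.path.remove(cwd_like_dir)
        sys.modules.pop(MODULE_NAME, None)

print(planted.SIDE_EFFECT)
```

No validation step sits between the import statement and the execution of the planted file's top-level code, which is the point being made in the comment above.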
Just my 2 cents, I think this conversation should have input from somebody with experience in security issues who can vet the proposal. Security vulnerabilities are hard to spot and, as suggested in this thread, can be a deal-breaker for some groups. I'd feel more comfortable moving forward on this if a trusted voice said that it was OK from a security standpoint.
h-vetinari commented Apr 30, 2021
In my personal opinion YAML currently is the most human-readable, human-editable configuration language out there. And having a parser/dumper with round-trip support (like ruamel.yaml) makes it machine-readable/writable too.
Sorry for resurrecting an old issue, but I just saw it milestoned for 3.5.0 and had a look. I was wondering if toml had been considered (it's not part of the list of formats in the alternatives). It basically combines (IMO) the best of both worlds between YAML & JSON, and results in a very readable format (and is getting ever more widespread use through PEP518, cargo, etc.).
Since this allows easily defining arrays etc., and has a fully specified & verifiable grammar, that might also help avoid having to eval strings that currently contain more complicated expressions.
I don't think PEP518 is a good argument here; PEP518 explicitly chose to introduce a new format rather than using Python literals (https://www.python.org/dev/peps/pep-0518/#python-literals) because they expect build systems to be written in other languages than Python (note that this is the only point against them), whereas we certainly don't expect matplotlibrc being read by anyone other than Matplotlib.
h-vetinari commented May 1, 2021
I'm not saying it's a complete answer (e.g. I haven't understood the cycler requirements beyond the fact that some arrays are incompatible with the current comma-dependent parsing), my core point was mainly: TOML > YAML
Since this Pull Request has not been updated in 60 days, it has been marked "inactive." This does not mean that it will be closed, though it may be moved to a "Draft" state. This helps maintainers prioritize their reviewing efforts. You can pick the PR back up anytime - please ping us if you need a review or guidance to move the PR forward! If you do not plan on continuing the work, please let us know so that we can either find someone to take the PR over, or close it.
I think the idea is well-advertised now and whether the issue is open or closed won't change much to the discussion.
As a remark, strictYAML may be an improvement over the very general and complex YAML standard. https://github.com/crdoconnor/strictyaml
See proposed MEP text in the PR. The rendered version is available by clicking on "view" at the top right of the "files changed" tab.
I had this written for a while but was hoping to push this a bit later (I don't really have the time to work on it now). #6157 (comment) made me consider at least publishing my current thoughts.
attn @matplotlib/developers