-
-
Notifications
You must be signed in to change notification settings - Fork 2.7k
refactor validators to remove duplicate files #5214
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| File | Before (bytes) | After (bytes) | | --------- | -------------: | ------------: | | `.whl` | 16265568 | 9447645 | | `.tar.gz` | 7666222 | 6615305 |
f0fddc2
to
0184bd4
Compare
@alexcjohnson if you have time to look this over we'd be grateful for your feedback
@bmaranville The file plotly/validators/_validators.json
needs to be flagged in the pyproject.toml
so that it's included in the built package.
Looks like the correct way is to add validators/_validators.json
to the list under [tool.setuptools.package-data]
.
Thanks for the PR @bmaranville ! Other than my comments above, looks good to me. The tests are passing and validation seems to be working as usual in the examples I've tried. (@alexcjohnson I'm curious whether you can see any possible pitfalls or edge cases here -- happy to do more testing of specific cases if you have concerns.)
@bmaranville nice work, this is a clever solution! Two small thoughts about it:
- We should be able to reduce the whitespace significantly, to make the file smaller. Could even collapse it to a single line, I'm not sure there's much value keeping it human-readable, but maybe @emilykl has opinions about this. If we really want to optimize file size we could also change its structure a bit - like if every entry is a dict
{params, superclass}
this could be converted to a length-2 list. But the whitespace is the biggest piece of this. json.load
is pretty fast, ~41ms on my computer. But it's even faster if we just make this a Python file. In my quick test I just added:
true=True false=False null=None v =
to the beginning of the file to convert the JSON to Python, and then from validators._validators import v
took only ~25ms. (If we do this for real we should be able to tweak the json.dump
to output Python in the first place so we don't need to alias true
, false
, and null
)
@emilykl the only thing I'm not 100% confident of is whether this has any type checking implications. I don't think so, I think the only type checking that matters is on graph_objs
, not validators
, but we do currently have if TYPE_CHECKING
blocks in all the __init__.py
files in validators, ensuring that type checkers will load them all. If you haven't done so already it's probably worth playing around on this branch a bit to convince ourselves that we don't lose anything important with this.
@alexcjohnson 👍 on reducing whitespace, but I'd like to merge this one first, then combine it with #5218, and then tidy up the whitespace in the result - #5218 switches code formatting and checking from black
to ruff
, so if we're going to figure out how to configure a tool to stay silent about things like long lines, I'd like to do that once. I've created #5225 and will assign it to myself.
48daf32
to
a39e0e6
Compare
- fix black exclude parameter in pyproject.toml to properly exclude ONLY directories - reintroduce py38 in codegen/__init__.py - remove commented-out lines - add plotly/validators/_validators.json to pyproject.toml - regenerate code
a39e0e6
to
513337c
Compare
- We should be able to reduce the whitespace significantly, to make the file smaller. Could even collapse it to a single line, I'm not sure there's much value keeping it human-readable, but maybe @emilykl has opinions about this.
@alexcjohnson I'm in favor of this. I don't think there's any particular reason to keep it human-readable, although it's also not a HUGE difference with compression. Looks like about 194 KB compressed with whitespace vs. 163 KB compressed without whitespace, from a quick test I did locally.
json.load
is pretty fast, ~41ms on my computer. But it's even faster if we just make this a Python file. ... convert the JSON to Python, and thenfrom validators._validators import v
took only ~25ms. (If we do this for real we should be able to tweak thejson.dump
to output Python in the first place so we don't need to aliastrue
,false
, andnull
)
@alexcjohnson @gvwilson I'm in favor of this -- that's a big performance improvement. Something for a future PR? (as a bonus, then we wouldn't have to worry about including this file in the pyproject.toml
, since Python files are included automatically.)
@emilykl the only thing I'm not 100% confident of is whether this has any type checking implications. I don't think so, I think the only type checking that matters is on
graph_objs
, notvalidators
, but we do currently haveif TYPE_CHECKING
blocks in all the__init__.py
files in validators, ensuring that type checkers will load them all. If you haven't done so already it's probably worth playing around on this branch a bit to convince ourselves that we don't lose anything important with this.
I'm not a heavy user of type checking so it's possible I'll miss something here. Syntax highlighting is working as expected with these changes. I turned on stricter Pylance type checking and haven't noticed any issues yet but I'll keep an eye out. Hopefully we can put these changes out in an RC to catch any issues before full release.
This PR replaces #5173 by rebasing @bmaranville's changes on top of
main
. Please see #5173 for detailed notes about the motivation and changes. The important changes are in:codegen/init.py
codegen/datatypes.py
codegen/validators.py
doc/python/marker-style.md
plotly/_subplots.py
plotly/basedatatypes.py
plotly/figure_factory/_annotated_heatmap.py
plotly/io/_templates.py
plotly/validator_cache.py
I have read through the contributing notes and understand the structure of the package. In particular, if my PR modifies code of
plotly.graph_objects
, my modifications concern thecodegen
files and not generated files.I have added tests (if submitting a new feature or correcting a bug) or
modified existing tests.
For a new feature, I have added documentation examples in an existing or
new tutorial notebook (please see the doc checklist as well).
I have added a CHANGELOG entry if fixing/changing/adding anything substantial.
For a new feature or a change in behaviour, I have updated the relevant docstrings in the code to describe the feature or behaviour (please see the doc checklist as well).