rustdoc-json: Postcard output #142642
A friend points out https://hackers.town/@zwol/114155807716413069, with advice on how to design a magic number.
Having 'Json' in there seems perverse :)
Something @jamesmunns pointed out is that reordering fields or enum variants in rustdoc-json-types will now require a FORMAT_VERSION bump. This could probably be detected in CI using postcard-schema.
More broadly, we should think about where (if at all) postcard-schema fits into this.
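To illustrate why field order becomes part of the wire format, here is a minimal sketch using a toy positional encoder. It is not the postcard crate; the `encode_v1`/`encode_v2` helpers and the two-field layout are hypothetical, and only the general idea (varint integers, length-prefixed strings, no field names on the wire) is borrowed from postcard.

```rust
// Toy positional encoder, NOT the postcard crate: fields are written in
// declaration order with no names, so the order is part of the format.

/// LEB128-style varint: 7 data bits per byte, high bit set on all but the last.
fn put_varint(out: &mut Vec<u8>, mut v: u32) {
    while v >= 0x80 {
        out.push((v as u8 & 0x7f) | 0x80);
        v >>= 7;
    }
    out.push(v as u8);
}

/// Length-prefixed UTF-8 string.
fn put_str(out: &mut Vec<u8>, s: &str) {
    put_varint(out, s.len() as u32);
    out.extend_from_slice(s.as_bytes());
}

/// Hypothetical original layout: `id` first, then `name`.
fn encode_v1(id: u32, name: &str) -> Vec<u8> {
    let mut out = Vec::new();
    put_varint(&mut out, id);
    put_str(&mut out, name);
    out
}

/// The same two fields after a reorder: `name` first, then `id`.
fn encode_v2(id: u32, name: &str) -> Vec<u8> {
    let mut out = Vec::new();
    put_str(&mut out, name);
    put_varint(&mut out, id);
    out
}

fn main() {
    let a = encode_v1(300, "Item");
    let b = encode_v2(300, "Item");
    // Same data and same length, but different bytes: a reader built
    // against one layout cannot decode the other.
    assert_eq!(a.len(), b.len());
    assert_ne!(a, b);
    println!("{a:?}\n{b:?}");
}
```

With JSON the field names travel with the data, so a reorder is harmless; here nothing in the bytes says which field comes first.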
As a note, I'm working on iterating on postcard and postcard-schema right now. postcard-schema-ng was just released, and is in a form that might be releasable as a 1.0 soon, but I'd need to finish up the items at jamesmunns/postcard#241 to see if any additional iteration is required.
postcard-schema gets you two interesting pieces of data:
- postcard-schema::Key can be used to generate an 8-byte hash of the schema and the string of your choice, as a const. This can be snapshotted to detect wire changes in CI.
- postcard-schema::NamedType / postcard-schema-ng::DataModelType is the full reflection-style schema of the data type, which is also serializable as postcard data. This can be useful if you want the data to be archival: storing the schema inside the file itself, so you could still decode it even if the schema changes (using the postcard-dyn crate, giving you a serde_json::Value-like view of the data).
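The snapshot idea can be sketched without any dependencies. Everything below is an assumption for illustration: the hand-written schema strings, the `check_snapshot` helper, and the use of FNV-1a as the hash. This is not how postcard-schema derives its Key; it only shows the shape of "compute an 8-byte const from the schema and fail CI when it drifts".

```rust
/// FNV-1a, 64-bit. A stand-in hash for this sketch only.
const fn fnv1a64(bytes: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf2_9ce4_8422_2325; // FNV offset basis
    let mut i = 0;
    while i < bytes.len() {
        hash ^= bytes[i] as u64;
        hash = hash.wrapping_mul(0x0000_0100_0000_01b3); // FNV prime
        i += 1;
    }
    hash
}

// Hypothetical textual schema descriptions, written by hand here.
const SCHEMA_V1: &str = "Item{id:u32,name:String}";
const SCHEMA_REORDERED: &str = "Item{name:String,id:u32}";

// 8-byte keys computed at compile time.
const KEY_V1: [u8; 8] = fnv1a64(SCHEMA_V1.as_bytes()).to_le_bytes();
const KEY_REORDERED: [u8; 8] = fnv1a64(SCHEMA_REORDERED.as_bytes()).to_le_bytes();

/// Compare the key of the current schema against the committed snapshot.
fn check_snapshot(current: [u8; 8], committed: [u8; 8]) -> Result<(), String> {
    if current == committed {
        Ok(())
    } else {
        Err(format!(
            "schema key changed: {current:02x?} != {committed:02x?}; bump the format version"
        ))
    }
}

fn main() {
    // Pretend KEY_V1 is the value committed to the repository.
    let committed = KEY_V1;
    assert!(check_snapshot(KEY_V1, committed).is_ok());
    // Reordering the fields changes the key, so the CI check fails.
    assert!(check_snapshot(KEY_REORDERED, committed).is_err());
}
```

A hash like this only says that something changed, not what changed or in which direction, which is why it complements a linear version number rather than replacing it.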
postcard is also getting a 2.0 soon, but it's important to note that the wire format is NOT changing. You will be able to use library versions v1.0 and v2.0 interchangeably with respect to serialization/deserialization (it's a breaking change because I'm removing some external crates that are now out of date from my public API; it's likely your code won't need to change at all).
A possibly useful form for the file format could be:
struct PostcardFile<T> {
    key: Key,
    schema: Option<Schema>,
    data: T,
}
I've considered "standardizing" this format a bit, maybe with a trailing CRC32.
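A trailing checksum of that kind could look roughly like the sketch below. The choice of the common CRC-32 variant (reflected polynomial 0xEDB88320), the little-endian trailer, and the `seal`/`open` helper names are all assumptions; the comment above only says "maybe with a trailing CRC32".

```rust
/// Bitwise CRC-32 (reflected polynomial 0xEDB88320, as used by zlib and PNG).
fn crc32(data: &[u8]) -> u32 {
    let mut crc = 0xFFFF_FFFFu32;
    for &byte in data {
        crc ^= byte as u32;
        for _ in 0..8 {
            // All-ones mask when the low bit is set, zero otherwise.
            let mask = (crc & 1).wrapping_neg();
            crc = (crc >> 1) ^ (0xEDB8_8320 & mask);
        }
    }
    !crc
}

/// Append the checksum of `body` as four little-endian bytes (hypothetical layout).
fn seal(body: &[u8]) -> Vec<u8> {
    let mut out = body.to_vec();
    out.extend_from_slice(&crc32(body).to_le_bytes());
    out
}

/// Split off the trailer and return the body only if the checksum matches.
fn open(file: &[u8]) -> Option<&[u8]> {
    let split = file.len().checked_sub(4)?;
    let (body, tail) = file.split_at(split);
    let stored = u32::from_le_bytes([tail[0], tail[1], tail[2], tail[3]]);
    if crc32(body) == stored {
        Some(body)
    } else {
        None
    }
}

fn main() {
    let file = seal(b"serialized bytes go here");
    assert_eq!(open(&file), Some(&b"serialized bytes go here"[..]));

    // Flipping a single bit is detected.
    let mut damaged = file.clone();
    damaged[0] ^= 1;
    assert_eq!(open(&damaged), None);
}
```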
postcard is also getting a 2.0 soon, but it's important to note that the wire format is NOT changing. [...]
Awesome! It'd be great to not have cobs and embedded-io in Cargo.lock (and that for all the users that care about performance).
A possibly useful form for the file format could be:
I think we definitely want to keep the magic number, so that consumers can tell if this file is rustdoc output at all, and a linear format version, so they can tell if rustdoc is too old or too new for them when the schema has changed (vs a schema hash that only tells you that it's changed). Embedding the schema into the output itself is an interesting idea; I'll need to look more at it. But as long as both of these come after the magic number and linear format version, we should be fine to change them after the fact.
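A consumer-side check along these lines might look like the following sketch. The magic bytes, the fixed-width little-endian `u32` version field, and the `SUPPORTED_VERSION` value are placeholders chosen for illustration; the thread does not pin down any of them.

```rust
use std::cmp::Ordering;

/// Placeholder magic bytes, not the value used by the PR.
const MAGIC: [u8; 8] = *b"RDEXAMPL";
/// Placeholder for the format version this consumer understands.
const SUPPORTED_VERSION: u32 = 1;

#[derive(Debug, PartialEq)]
enum HeaderError {
    TooShort,
    NotRustdoc,
    TooOld(u32),
    TooNew(u32),
}

/// Validate the header and return the remaining bytes (schema hash, schema,
/// and the serialized crate would all live in this remainder).
fn check_header(file: &[u8]) -> Result<&[u8], HeaderError> {
    if file.len() < MAGIC.len() + 4 {
        return Err(HeaderError::TooShort);
    }
    let (magic, rest) = file.split_at(MAGIC.len());
    if magic != &MAGIC[..] {
        return Err(HeaderError::NotRustdoc);
    }
    let (ver, payload) = rest.split_at(4);
    let version = u32::from_le_bytes([ver[0], ver[1], ver[2], ver[3]]);
    // A linear version lets the consumer say which side is out of date.
    match version.cmp(&SUPPORTED_VERSION) {
        Ordering::Less => Err(HeaderError::TooOld(version)),
        Ordering::Greater => Err(HeaderError::TooNew(version)),
        Ordering::Equal => Ok(payload),
    }
}

fn main() {
    let mut file = Vec::new();
    file.extend_from_slice(&MAGIC);
    file.extend_from_slice(&SUPPORTED_VERSION.to_le_bytes());
    file.extend_from_slice(b"payload");
    assert_eq!(check_header(&file), Ok(&b"payload"[..]));

    // A JSON file fed in by mistake is rejected before any decoding is attempted.
    assert_eq!(
        check_header(b"{\"root\": 0, ...}"),
        Err(HeaderError::NotRustdoc)
    );
}
```

Because only the magic number and version are read before any decision is made, everything after them can change between versions without breaking this check.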
☔ The latest upstream changes (presumably #143173) made this pull request unmergeable. Please resolve the merge conflicts.
r? @ghost
What
rustdoc --output-format=postcard is like rustdoc-json, but using https://postcard.rs/ / https://docs.rs/postcard/1.1.1/ instead of JSON.
Why
JSON size and speed aren't great. People want more speed and smaller docs. There are proposals to make the JSON smaller (and therefore faster) by making field names shorter and omitting them when the value is the default. But
How good is it?
In a very unscientific benchmark for aws-sdk-ec2, it's ~3.6x smaller (255 MiB vs 69 MiB) and ~1.8x faster to deserialize (1.6273 s vs 914.05 ms).
What's the metaformat
Crate as usual
This way, users can look at the magic number to check it's a rustdoc-json-postcard file, then read the version number to know if they can decode it. Only then can they deserialize the Crate itself. I plan to write a library that does this, so it's easy to do well.
Why is this a draft
HtmlRenderer and JsonRenderer are configured from the same options; we should change this.
is_json() instead of the current hacks