This week I started reading "The Rust Programming Language". As I reached the chapters on enumerations and pattern matching I felt I had enough material to put together a simple representation of JSON in Rust, just to play around and get a better feeling of the language.
This is what I came up with:
use std::fmt;
pub enum Json {
Obj(Vec<(String, JsonVal)>),
Arr(Vec<JsonVal>),
}
pub enum JsonVal {
Str(String),
Num(f64),
Composite(Json),
Bool(bool),
Null,
}
#[allow(unused_must_use)]
impl fmt::Display for Json {
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
match *self {
Json::Obj(ref obj) => {
"{".fmt(f);
for (n, prop) in obj.iter().enumerate() {
if n != 0 {
",".fmt(f);
}
"\"".fmt(f);
prop.0.fmt(f);
"\":".fmt(f);
prop.1.fmt(f);
}
"}".fmt(f);
Result::Ok(())
}
Json::Arr(ref arr) => {
"[".fmt(f);
for (n, item) in arr.iter().enumerate() {
if n != 0 {
",".fmt(f);
}
item.fmt(f);
}
"]".fmt(f);
Result::Ok(())
}
}
}
}
impl fmt::Display for JsonVal {
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
write!(f,
"{}",
match *self {
JsonVal::Str(ref string) => format!("\"{}\"", string),
JsonVal::Num(number) => number.to_string(),
JsonVal::Composite(ref json) => json.to_string(),
JsonVal::Bool(boolean) => boolean.to_string(),
JsonVal::Null => String::from("null"),
})
}
}
The things that "smell" a little bit, in my humble opinion, are three:
The warning suppression before the implementation of the
Display
trait forJson
: I tried aggregating the operations inVector
s and then folding them together but the resulting code looked unnecessarily garbled, so I just ignored theResult
s. Is there a better way to do this?In the
Display
implementation ofJsonVal
I use theformat!
macro to basically make a copy of the string (a "technique" which smells by itself); however, as I'm not touching the string in any way, it would be great to just hand out the string itself to thewrite!
macro; returning*string
wouldn't make sense as I would be trying to perform a move on a borrowed reference, but just returning the reference would mean that in other branches I'd have to return a reference as well, which gets rejected because the result of the calls to theto_string
method go out of scope after the pattern matching. Is there a way out of this, perhaps using lifetimes?In the
Display
implementation ofJsonVal
I use theString::from
associated function to return the value of JSON'snull
. However, if I'm not mistaken, this would mean that I would be creating a new instance of theString
every time I have to format anull
. Astr
, being'static
(again, if I'm not mistaken), would solve the problem. Is the something like a staticString
or a way to be more efficient, perhaps using (again) lifetimes?
Also, I put all the ref
s because the compiler hinted me to do so and I understand more or less that it has to do with the fact that I'm just borrowing what's inside the struct to use it, but it's not completely clear to me. However, I still have to encounter the ref
keyword on the book, so don't bother explaining if you feel like I'm better off RTFM.
You can find an updated version of this code on my Github repo.
1 Answer 1
- There are no tests. That means that all of the changes I show compile, but may not do what I say. Rust makes is very easy to write basic tests, so I'd encourage you to write them. Especially as you are learning a new language and especially when you ask for feedback or refactor the code.
- You should never ignore
Result
s. If you don't believe it's possible for an error to occur, useunwrap
to cause the program to abort on error. Even better would be to usetry!
to return an error in case of error. Result::Ok(())
is duplicated at the end of each method; move it after thematch
.Result::Ok
is imported via the prelude, you can just sayOk
.- Multiple consecutive
fmt
calls can be combined with thewrite!
macro and a format string. - Instead of using
to_string
, embed thewrite!
call inside the match arms. Then there are no additional heap allocations. - Consider dealing with the middle commas in a different way. Pop off the first element, then unconditionally add the comma for all subsequent ones. Theres a possibility of performance difference here, so you should test.
- Consider using an
Option
instead of creating aNull
type.Option
is a pervasive type from the standard library and people innately know how to use it. - I think you will have a bug if a string contains embedded quotes, as there's no escaping happening.
use std::fmt;
pub enum Json {
Obj(Vec<(String, JsonVal)>),
Arr(Vec<JsonVal>),
}
pub enum JsonVal {
Str(String),
Num(f64),
Composite(Json),
Bool(bool),
Null,
}
impl fmt::Display for Json {
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
match *self {
Json::Obj(ref obj) => {
try!("{".fmt(f));
let mut props = obj.iter();
if let Some(prop) = props.next() {
try!(write!(f, r#""{}":{}"#, prop.0, prop.1));
}
for prop in props {
try!(write!(f, r#","{}":{}"#, prop.0, prop.1));
}
try!("}".fmt(f));
}
Json::Arr(ref arr) => {
try!("[".fmt(f));
let mut items = arr.iter();
if let Some(item) = items.next() {
try!(item.fmt(f));
}
for item in items {
try!(write!(f, ",{}", item));
}
try!("]".fmt(f));
}
}
Ok(())
}
}
impl fmt::Display for JsonVal {
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
match *self {
JsonVal::Str(ref string) => write!(f, r#""{}""#, string),
JsonVal::Num(number) => number.fmt(f),
JsonVal::Composite(ref json) => json.fmt(f),
JsonVal::Bool(boolean) => boolean.fmt(f),
JsonVal::Null => "null".fmt(f),
}
}
}
fn main() {
}
the warning suppression before the implementation of the Display trait
We mentioned this one already; good call on thinking that something was wonky here.
In the
Display
implementation ofJsonVal
I use theformat!
macro to basically make a copy of the string In theDisplay
implementation ofJsonVal
I use theString::from
associated function to return the value of JSON's null.
Again, good call on both of these points. Creating an allocated object is sometimes the right thing, but in this case we can just delegate to the existing Display
implementations of various types. In case you weren't aware, the {}
corresponds to items that implement Display
. That's what makes it possible to just call out to those.
I put all the refs because the compiler hinted me to do so and I understand more or less that it has to do with the fact that I'm just borrowing what's inside the struct to use it
Yes, this is the right intuition. When formatting something, you don't have ownership of the item, only a reference. That means you can't take ownership of a struct member, only further references. The ref
keyword in a pattern match is conceptually equivalent to taking another reference on the right side:
struct Foo(String);
let f = Foo("hello".to_string());
let String(ref s) = f;
let s = &f.0; // equivalent
The problem is that you can't always write code like the second case, like in a multi-way pattern match or in one arm of a match, so the ref
keyword is needed.
-
\$\begingroup\$ Thank you so much for the detailed review and for the tips, @Shepmaster! I've addressed points 1 through 6 and will now explore the final 3. Regarding the tests, there was actually one but I didn't paste it, good to know it's a good practice to paste them along the code. Also, I didn't know about raw string literals, very handy! \$\endgroup\$stefanobaghino– stefanobaghino2016年02月21日 17:43:09 +00:00Commented Feb 21, 2016 at 17:43
-
\$\begingroup\$ I have used
test::Bencher
in the nightly channel and you're "peek-and-roll" strategy performs better. Added it as well. Now I'll think about the unescaped strings bug; regarding theOption
, thanks for the tip, I'll keep it in mind as I play around. Thanks again for the tips! \$\endgroup\$stefanobaghino– stefanobaghino2016年02月21日 19:29:58 +00:00Commented Feb 21, 2016 at 19:29 -
1\$\begingroup\$ I've come up with another solution for the duplicated
Ok
: I've removed it entirely and left the last call tofmt
in each branch as an expression, what do you think? \$\endgroup\$stefanobaghino– stefanobaghino2016年02月21日 20:28:03 +00:00Commented Feb 21, 2016 at 20:28 -
\$\begingroup\$ strategy performs better — good to know! I stole the idea from somewhere that I can't remember at the moment. It's a bit longer, but there's less overall conditional branches. left the last call to fmt in each branch as an expression — yes, that's a good idea too. I often try to do something similar, but the
for
loops prevent it. in this case, you have the closing delimiter, so it works nicely! \$\endgroup\$Shepmaster– Shepmaster2016年02月21日 20:37:16 +00:00Commented Feb 21, 2016 at 20:37
Explore related questions
See similar questions with these tags.