I am playing with Rust, and am trying to write a simple parser.
I need to parse the string:
"0123,456789"
into a structure:
Passport { label : i8 = 0, body : i32 = 123456789 }
I am using parser_combinators, and my code works, but it is very ugly.
How can I rewrite this code?
extern crate parser_combinators as pc;
use pc::*;
use pc::primitives::{State, Stream};

fn main() {
    match parser(parse_passport).parse("a123,456") {
        Ok((r,l)) => println!("{} {} {}", r.label, r.body, l),
        Err(e) => println!("{}", e)
    }
}

struct Passport {
    label : i8,
    body : i32,
}

fn parse_passport<I>(input: State<I>) -> ParseResult<Passport, I> where I: Stream<Item=char> {
    let mut label = digit().map(|i : char| (i as i8) - ('0' as i8));
    let mut fst = many1(digit()).map(|string : String| string.parse::<i32>().unwrap());
    let (i,input) = match label.parse_state(input) {
        Ok((x,n)) => (x,n.into_inner()),
        Err(e) => return Err(e)
    };
    let (f,input) = match fst.parse_state(input) {
        Ok((x,n)) => (x,n.into_inner()),
        Err(e) => return Err(e)
    };
    let (_,input) = match satisfy(|c| c == ',').parse_state(input) {
        Ok((x,n)) => (x,n.into_inner()),
        Err(e) => return Err(e)
    };
    let (s,input) = match fst.parse_state(input) {
        Ok((x,n)) => (x,n),
        Err(e) => return Err(e)
    };
    let p = Passport { label : i, body : f * 1000000 + s };
    Ok((p, input))
}
1 Answer
Some small style nits to fix before we dive into the meat. When specifying a variable and a type, there should be a space after the colon, but not before:
// label : i8,
label: i8,
I break where clauses onto the next line, with each type restriction on its own line as well:
// fn parse_passport<I>(input: State<I>) -> ParseResult<Passport, I> where I: Stream<Item=char> {
fn parse_passport<I>(input: State<I>) -> ParseResult<Passport, I>
    where I: Stream<Item=char>
{
Let the compiler use type inference as much as possible. You don't always need to specify the type of a closure variable, for instance:
// let mut label = digit().map(|i : char| (i as i8) - ('0' as i8));
let mut label = digit().map(|i| (i as i8) - ('0' as i8));
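To show the same inference outside the parser library, here is a minimal sketch that uses only the standard library. The function name digit_values is made up for this example; the closure body is the same digit conversion as above, and the compiler works out that c is a char from the iterator that map is called on.

```rust
/// Converts each ASCII digit in `s` to its numeric value.
/// The closure parameter `c` carries no type annotation: it is inferred
/// as `char` because `map` is called on the `Chars` iterator.
/// (Hypothetical helper, for illustration only; it assumes the input
/// contains nothing but ASCII digits.)
fn digit_values(s: &str) -> Vec<i8> {
    s.chars().map(|c| (c as i8) - ('0' as i8)).collect()
}
```

Calling digit_values("0123") yields the four values 0, 1, 2 and 3.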
Using try! is going to be your biggest win for clarity:
// let (i,input) = match label.parse_state(input) {
//     Ok((x,n)) => (x,n.into_inner()),
//     Err(e) => return Err(e)
// };
let (i, input) = try!(label.parse_state(input));
let input = input.into_inner();
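For readers on a recent toolchain: the ? operator does the same early-return job as try!, and try! itself is deprecated. The sketch below is not tied to parser_combinators; take_digit and both two_digits functions are hypothetical, hand-rolled helpers that only use the standard library, so that the before and after can be compared side by side.

```rust
/// Hypothetical hand-rolled step: reads one leading ASCII digit and returns
/// its value together with the remaining input.
fn take_digit(input: &str) -> Result<(i8, &str), String> {
    let mut chars = input.chars();
    match chars.next() {
        Some(c) if c.is_ascii_digit() => Ok(((c as i8) - ('0' as i8), chars.as_str())),
        Some(c) => Err(format!("expected a digit, found '{}'", c)),
        None => Err("expected a digit, found end of input".to_string()),
    }
}

/// Verbose form: every step matches on the result and returns early on error.
fn two_digits_verbose(input: &str) -> Result<(i8, i8), String> {
    let (a, rest) = match take_digit(input) {
        Ok(pair) => pair,
        Err(e) => return Err(e),
    };
    let (b, _rest) = match take_digit(rest) {
        Ok(pair) => pair,
        Err(e) => return Err(e),
    };
    Ok((a, b))
}

/// Same logic with `?`, which propagates the error the way `try!` does.
fn two_digits(input: &str) -> Result<(i8, i8), String> {
    let (a, rest) = take_digit(input)?;
    let (b, _rest) = take_digit(rest)?;
    Ok((a, b))
}
```

Both functions return the same result for any input; the second one just leaves the error plumbing to the operator.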
That's as far as I got with generic Rust knowledge. After reading the docs a bit, I came up with this though (note: this is my first time using this library, so I make no claims to use it well!):
fn parse_passport<I>(input: State<I>) -> ParseResult<Passport, I>
    where I: Stream<Item=char>
{
    let label = digit().map(|i| (i as i8) - ('0' as i8));
    fn to_number(string: String) -> i32 { string.parse().unwrap() }
    let fst = many1(digit());
    let comma = satisfy(|c| c == ',');
    let mut parser = label.and(fst.clone().map(to_number)).and(comma).and(fst.map(to_number));
    let ((((i, f), _), s), input) = try!(parser.parse_state(input));
    let p = Passport { label: i, body: f * 1000000 + s };
    Ok((p, input))
}
I'm certain there's a better way though. The nested tuples are a bit off-putting. Also, I feel like there should be a nice way to map over the result of the parsing so we don't have to unpack and repack the Result, including the input. There's the straightforward but awkward:
parser.parse_state(input).map(|((((i, f), _), s), input)| {
    (Passport { label: i, body: f * 1000000 + s }, input)
})
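To see that mapping pattern in isolation, here is a rough standard-library sketch. raw_parse is a hypothetical stand-in for parser.parse_state(input): it is not the library's API, it only produces the same nested-tuple shape so that Result::map and the destructuring closure can be tried without the crate. The remaining input is simplified to an empty string.

```rust
#[derive(Debug, PartialEq)]
struct Passport {
    label: i8,
    body: i32,
}

/// Hypothetical stand-in for `parser.parse_state(input)`. It yields the same
/// nested shape, `((((label, first), comma), second), rest)`, using only std.
fn raw_parse(input: &str) -> Result<((((i8, i32), char), i32), &str), String> {
    let (head, tail) = input
        .split_once(',')
        .ok_or_else(|| "missing comma".to_string())?;
    let mut head_chars = head.chars();
    // First character is the label digit.
    let label = head_chars
        .next()
        .and_then(|c| c.to_digit(10))
        .ok_or_else(|| "expected a leading digit".to_string())? as i8;
    // The rest of the head and the whole tail are the two number fields.
    let first = head_chars.as_str().parse::<i32>().map_err(|e| e.to_string())?;
    let second = tail.parse::<i32>().map_err(|e| e.to_string())?;
    Ok(((((label, first), ','), second), ""))
}

/// `Result::map` rewrites only the success value; an `Err` passes through
/// untouched, so there is no need to unpack and repack the `Result` by hand.
fn parse_passport(input: &str) -> Result<(Passport, &str), String> {
    raw_parse(input).map(|((((label, first), _comma), second), rest)| {
        (Passport { label, body: first * 1_000_000 + second }, rest)
    })
}
```

With the input "0123,456789" this gives label 0 and body 123456789, while an input such as "a123,456" comes back as an Err because the first character is not a digit.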
This is the final version I ended up with:
extern crate parser_combinators as pc;
use pc::*;
use pc::primitives::{State, Stream};

fn main() {
    match parser(parse_passport).parse("0123,456789") {
        Ok((r, l)) => println!("{} {} {}", r.label, r.body, l),
        Err(e) => println!("{}", e)
    }
}

struct Passport {
    label: i8,
    body: i32,
}

fn parse_passport<I>(input: State<I>) -> ParseResult<Passport, I>
    where I: Stream<Item=char>
{
    let label = digit().map(|i| (i as i8) - ('0' as i8));
    fn to_number(string: String) -> i32 { string.parse().unwrap() }
    let fst = many1(digit());
    let comma = satisfy(|c| c == ',');
    let mut parser = label.and(fst.clone().map(to_number)).and(comma).and(fst.map(to_number));
    parser.parse_state(input).map(|((((i, f), _), s), input)| {
        (Passport { label: i, body: f * 1000000 + s }, input)
    })
}
Note that main in the question parses "a123,456", not "0123,456".