1
\$\begingroup\$

Usually, parser combinators in Haskell based on the parser type like

newtype Parser t r = Parser ([t] -> [(r,[t])])

This allows back tracking and multiple results. However I think it would be good to have it in "push" style, means the parser will have a set of result when no more input given, and have a transition function that for a given token, results in a new state of the parser. This results in the defintion

newtype PushParser t r = PushParser ([r], Maybe (t -> PushParser t r))

(We could have a data constructor that have two fields, however I prefer to use the more efficient newtype keyword so have to have only one field as a tuple)

To parse a list of tokens, we define the parse function as following:

parse :: PushParser t r -> [t] -> [r]
--No input, rs is the result
parse ~(PushParser (rs, _)) [] = rs
--No parsing function, this is an error state. No result
parse (PushParser (_,Nothing)) _ = []
--Otherwise, drop the results, apply the pasing function with the input and continue
parse (PushParser (_,Just f)) (t:ts) = parse (f t) ts

We can then define a fmap to make it a Functor like the following:

instance Functor (PushParser t) where
 fmap f ~(PushParser (r,mg)) = PushParser ((map f r), do {g<-mg; return (fmap f.g)})

Instance of Applicative is more tricky, but the easiest way is to make it depend on the not yet implemented instance of Monad.

instance Applicative (PushParser t) where
 pure x = PushParser ([x],Nothing)
 pf <*> pv = do 
 f <- pf
 v <- pv
 return (f v)

But before giving the definition of Monad I want to introduce the definition of Alternative:

instance Alternative (PushParser t) where
 empty = PushParser ([],Nothing)
 (PushParser (r1,Nothing)) <|> (PushParser (r2,f)) = PushParser ((r1++r2), f)
 (PushParser (r1,f)) <|> (PushParser (r2,Nothing)) = PushParser ((r1++r2), f)
 (PushParser (r1,Just f1)) <|> (PushParser (r2,Just f2)) =
 PushParser ((r1 ++ r2), Just (\t -> f1 t <|> f2 t))

I have tried out different ways for the instance of Monad, but finally I believe the best way to give definition of (>>=) is to define unwrap (or join in the monad library) first.

instance Monad (PushParser t) where
 p >>= f = unwrap (fmap f p) where 
 unwrap ~(PushParser (prs, mf)) = 
 foldr (<|>) (PushParser ([], do { f<-mf; return (unwrap.f)})) prs

The above works perfectally fine, and although it does not make sense to make it an instance of Parser (in Text.Parser.Combinators) as we don't allow trace back, the other combinators that only requires Alternative worked perfectally fine.

However, the performance seems to be too bad. It is possibly as it keeps all possible parsing path and so consumes too much memory. For example, a simple parse (many (some (char '.'))) "...." will return 8 results, with all combinations of dots.

But I think there should be some way to improve the performance. Please advise me how to improve on this.

asked Dec 16, 2016 at 12:31
\$\endgroup\$
3
  • \$\begingroup\$ Did you mean to write newtype PushParser t r = PushParser ([r], Maybe (t -> PushParser t r))? \$\endgroup\$ Commented Dec 16, 2016 at 13:29
  • 1
    \$\begingroup\$ But when the second argument of parse is [], how can you return the [t] from the first argument as an [r]? Also tuples can't possibly be faster than data-constructed types: Tuples are defined using data \$\endgroup\$ Commented Dec 17, 2016 at 0:05
  • \$\begingroup\$ That's correct. Fixed. \$\endgroup\$ Commented Dec 17, 2016 at 6:31

1 Answer 1

2
\$\begingroup\$

It'll lazily take no more memory than it takes time to find the first result if you do not try to enumerate all possible results.

For many/some to only try one option, you want Maybe's Alternative/MonadPlus behavior, not []'s. You could parametrize your Parser type in the Alternative used so the user can change it on the fly.

yoctoparsec uses this "push" style.

parseString (hoistFreeT maybeToList $ many $ some $ mfilter (=='.') token) "...."
answered Dec 16, 2016 at 15:23
\$\endgroup\$

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.