4

I'm trying to write a simple parser via Haskell, but stuck at an infinite loop. the code is:

import Control.Applicative (Alternative, empty, many, (<|>))
data Parser a = Parser {runParser :: String -> [(a, String)]}
instance Functor Parser where
 fmap f (Parser p) = Parser $ \s -> [(f x', s') | (x', s') <- p s]
instance Applicative Parser where
 pure x = Parser $ \s -> [(x, s)]
 (Parser pf) <*> (Parser p) = Parser $ \s -> [(f' x, ss') | (f', ss) <- pf s, (x, ss') <- p ss]
instance Alternative Parser where
 empty = Parser $ \s -> []
 (Parser p1) <|> (Parser p2) = Parser $ \s ->
 case p1 s of
 [] -> p2 s
 xs -> xs
singleSpaceParser :: Parser Char
singleSpaceParser = Parser $ \s ->
 ( case s of
 x : xs -> if x == ' ' then [(' ', xs)] else []
 [] -> []
 )
multiSpaceParser :: Parser [Char]
multiSpaceParser = many singleSpaceParser

I just load this file in ghci, and run:

runParser multiSpaceParser " 123"

I expect it to get [(" ", "123")], but actually it got an infinite loop

I used trace to debug, and it seems that many is wrong

How can I fix this bug?

asked Jun 27, 2022 at 6:36
3
  • many deals with Alternative, so it means it will first try a singleSpaceParser, if that fails, it will attempt another singleSpaceParser etc. Commented Jun 27, 2022 at 6:39
  • so the problem is multiSpaceParser will call infinite singleSpaceParser? why it will attempt another singleSpaceParser if the first singleSpaceParser fails? Commented Jun 27, 2022 at 6:43
  • 1
    It's a rather subtle laziness-induced problem. If you replace data by newtype, it should work. Commented Jun 27, 2022 at 7:30

1 Answer 1

3

Let's assume

many p = (:) <$> p <*> many p <|> pure []

and consider the call

many singleSpaceParser " 123"

(The string does not actually matter here, the many singleSpaceParser call will always loop.)

One reduction step yields

 ((:) <$> singleSpaceParser <*> many singleSpaceParser <|> pure []) " 123"

Now observe that, in order to reduce the call to (<|>), we have to evaluate both arguments of (<|>) to be of the shape Parser ....

Let's consider doing that for (:) <$> singleSpaceParser <*> many singleSpaceParser. As both (<$>) and (<*>) are infixl 4, this is an application of <*> at the outermost level.

But now observe that in order to reduce (<*>), we again have to evaluate both arguments of (<*>) to be of the shape Parser ..., so in particular the recursive call many singleSpaceParser.

This is where we get the infinite loop.

By switching data to newtype (or alternatively, at least avoiding aggressively pattern-matching on the Parser constructor in all the second arguments), these problems can be avoided.

answered Jun 27, 2022 at 7:40
Sign up to request clarification or add additional context in comments.

4 Comments

after changing data to newtype I solve the problem, thank you so much :)
Then do deriving (Functor, Applicative, Alternative, Monad) via StateT String []
You can also make <*> lazier with - pf <*> p = Parser $ \s -> [(f' x, ss') | (f', ss) <- runParser pf s, (x, ss') <- runParser p ss]
Alternatively, make <*> lazier with ~: ~(Parser pf) <*> ~(Parser p) = ...

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.