I have been working through the exercises in the Parsing chapter of Write Yourself a Scheme in 48 Hours and hacked together something to get parseString to comprehend escaped characters. I also took some inspiration from Real World Haskell chapter 16, minus the applicative parts.
    parseString :: Parser LispVal
    parseString = do char '"'
                     x <- many $ chars
                     char '"'
                     return $ String x
        where chars = escaped <|> noneOf "\""
              escaped = choice $ map tryEscaped escapedChars
              tryEscaped c = try $ char '\\' >> char (fst c) >> return (snd c)
              escapedChars = zip "bnfrt\\\"/" "\b\n\f\r\t\\\"/"
This works but I am not fond of my 'tryEscaped' definition. What are the alternatives?
2 Answers
Some suggestions:
- Replace fst and snd with a pattern match or explicit function arguments.
- Extract the common char '\\' parser. You can then avoid the try.
- Strings with lots of escaped characters are hard to read, and it's hard to visually verify that the escape codes are correctly matched with their replacements. Consider spelling them out and aligning them to make this easy to see.
Here's what I came up with:
    escaped = char '\\' >> choice (zipWith escapedChar codes replacements)
    escapedChar code replacement = char code >> return replacement
    codes        = ['b',  'n',  'f',  'r',  't',  '\\', '\"', '/']
    replacements = ['\b', '\n', '\f', '\r', '\t', '\\', '\"', '/']
Comment (Alex Hart, Sep 8, 2015 at 14:21): Hey, please note that in GHC 7.10 you will get an error message suggesting you enable FlexibleContexts. This is due to not having an explicit type signature on the function escapedChar. The proper fix, IMO, is to give it an explicit type signature, escapedChar :: Char -> Char -> Parser Char, or you may be able to use an anonymous function to the same effect.
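Putting the suggestions above together with the explicit type signature from the comment, a self-contained version might look like the sketch below. The LispVal type here is a one-constructor stand-in for the tutorial's type, and parseLispString is a hypothetical helper added only to make the example easy to try; neither comes from the original posts.

```haskell
import Text.Parsec (ParseError, char, choice, many, noneOf, parse, (<|>))
import Text.Parsec.String (Parser)

-- Assumption: a minimal stand-in for the tutorial's LispVal type.
data LispVal = String String
  deriving (Eq, Show)

-- Match one escape code character and yield its replacement.
-- The explicit signature avoids the FlexibleContexts complaint.
escapedChar :: Char -> Char -> Parser Char
escapedChar code replacement = char code >> return replacement

-- Consume the backslash once, then dispatch on the character after it.
escaped :: Parser Char
escaped = char '\\' >> choice (zipWith escapedChar codes replacements)
  where
    codes        = ['b',  'n',  'f',  'r',  't',  '\\', '"', '/']
    replacements = ['\b', '\n', '\f', '\r', '\t', '\\', '"', '/']

parseString :: Parser LispVal
parseString = do
  _ <- char '"'
  x <- many (escaped <|> noneOf "\"")
  _ <- char '"'
  return (String x)

-- Hypothetical helper: run parseString on a whole input string.
parseLispString :: String -> Either ParseError LispVal
parseLispString = parse parseString "example"
```

One behavioural difference from the version with try: because the backslash is consumed before the choice runs, a backslash followed by an unlisted character (for example \q) makes the whole parse fail instead of falling through to noneOf and being kept as a literal backslash. Whether that is preferable depends on how strict the string syntax should be.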
The OP is doing this as an exercise, but for anyone who found this question on Google like I did and wants a fast solution: import Parsec's Token and Language modules.
Then you can do this in GHCi:
    >>> let lexer = makeTokenParser haskellDef
    >>> let p = stringLiteral lexer
    >>> runParser p () "" "\"Hi\\n\""
    Right "Hi\n"
According to the Parsec docs:
the literal string is parsed according to the grammar rules defined in the Haskell report (which matches most programming languages quite closely)
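For use outside GHCi, the same idea can be written as a small module. This is only a sketch: parseLiteral is a hypothetical wrapper name, and the qualified import is a style choice rather than something the answer prescribes.

```haskell
import Text.Parsec (ParseError, parse)
import Text.Parsec.Language (haskellDef)
import qualified Text.Parsec.Token as Token

-- Hypothetical wrapper: parse one Haskell-style string literal.
parseLiteral :: String -> Either ParseError String
parseLiteral = parse (Token.stringLiteral lexer) "example"
  where
    -- Token parser built from Parsec's Haskell language definition.
    lexer = Token.makeTokenParser haskellDef
```

Calling parseLiteral "\"Hi\\n\"" should give the same Right "Hi\n" as the GHCi session above. Note that the Haskell report's escape set is not the same as the table in the question; \/ is not a Haskell escape, for instance, so this parser is expected to reject it.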
Comment (Damian Nadales, Nov 9, 2017 at 13:34): I was looking for an out-of-the-box solution. This works great!
Comment: Is \/ really a valid escape sequence in Scheme?