\$\begingroup\$

My intent is to perform "mail merge" where I can write strings like "hi <<name>>" and format them according to a HashMap.

Specifically, the string contains keys formatted as <<key>>, and the map contains the corresponding values.

My main concern is the parsing strategy. I think it would be best to perform the parsing in multiple stages:

  • first find the Keys
  • second find the remaining Chunks
  • and for more complicated parsing tasks, perhaps more stages

I couldn't figure out how to do that, so instead I parse in a single pass using the more expensive lookahead function notFollowedBy. That obviously wouldn't work well if I had a slightly more complicated need.

import Data.Functor.Identity (Identity)
import Data.HashMap.Lazy as HM
import Text.Parsec
import Text.Parsec.String

-- Parsing ----------
data Merge a = Chunk a | Key a deriving (Show)

key :: Parser (Merge String)
key = Key <$> between (string "<<") (string ">>") (many1 letter)

chunk :: Parser (Merge String)
chunk = Chunk <$> many1 (notFollowedBy key >> anyChar)

prose :: ParsecT String () Identity [Merge String]
prose = many1 $ key <|> chunk

-- Formatting ----------
format :: HM.HashMap String String -> [Merge String] -> String
format _ [] = ""
format hmap (Chunk x : xs) = x ++ format hmap xs
format hmap (Key k : xs) =
  case HM.lookup k hmap of
    -- I could obviate the `error` by working within a failure monad
    Nothing -> error $ "missing key: " ++ k
    Just v -> v ++ format hmap xs

-- Testing ----------
testString = "Hi <<name>>! Do you like <<thing>>?"
testMap = HM.fromList [("name", "Adam"), ("thing", "Apples")]
main = print $ format testMap <$> parse prose "" testString
Jamal
asked Sep 20, 2017 at 3:40
\$\endgroup\$
  • \$\begingroup\$ Do you allow nested <<? If they are not allowed, your language is regular. \$\endgroup\$ Commented Sep 20, 2017 at 8:59
  • \$\begingroup\$ @Zeta Like <<<<>>? No, there should just be alpha characters inside a Key. Or if you meant Keys within Keys, still no, just <<alpha>>. \$\endgroup\$ Commented Sep 20, 2017 at 15:39
  • \$\begingroup\$ I'm really looking for some guidance on parsing in multiple passes, I couldn't quite figure out a convenient way to do that. Eg, how do I do a parse into tokens, and then parse my tokens (not strings). \$\endgroup\$ Commented Sep 20, 2017 at 15:45
  • \$\begingroup\$ You want to split your current implementation into lexing+parsing, instead of using a parser combinator as you do at the moment? \$\endgroup\$ Commented Sep 21, 2017 at 7:52
  • \$\begingroup\$ Isn't a parser for this way overkill? Sounds like breakOn would do the job perfectly well. \$\endgroup\$ Commented Sep 22, 2017 at 13:31

1 Answer

\$\begingroup\$

Instead of notFollowedBy, you can use noneOf in chunk:

chunk :: Parser (Merge String)
chunk = Chunk <$> (many1 (noneOf "<") <|> try (sequence [char '<', noneOf "<"]))

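That avoids running the whole key parser as a lookahead before every character (try only backtracks over a single '<' here), but it is no longer that nice to look at. Note that in key <|> chunk the key parser has to be wrapped in try: string "<<" consumes the first '<' before failing on input such as "<b", and after a consuming failure <|> never tries chunk.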
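To make the noneOf variant concrete, here is a minimal, self-contained sketch of the parser alone. It is an illustration rather than a drop-in replacement: the Eq instance, the try around key, the trailing eof and the second test input are additions of mine that are not in the question's code.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

data Merge a = Chunk a | Key a deriving (Show, Eq)

-- A key is <<letters>>. The `try` is an addition: when the attempt fails
-- part-way (for example on "<b"), the consumed '<' is given back so that
-- `chunk` can still be tried.
key :: Parser (Merge String)
key = try (Key <$> between (string "<<") (string ">>") (many1 letter))

-- A chunk is either a run of characters other than '<', or a single '<'
-- followed by one character that is not '<'.
chunk :: Parser (Merge String)
chunk = Chunk <$> (many1 (noneOf "<") <|> try (sequence [char '<', noneOf "<"]))

-- `eof` (also an addition) turns unparsed trailing input into a parse error.
prose :: Parser [Merge String]
prose = many1 (key <|> chunk) <* eof

main :: IO ()
main = do
  -- Right [Chunk "Hi ",Key "name",Chunk "! Do you like ",Key "thing",Chunk "?"]
  print (parse prose "" "Hi <<name>>! Do you like <<thing>>?")
  -- Right [Chunk "a ",Chunk "<b",Chunk "> c"]
  print (parse prose "" "a <b> c")
```

The second input shows the cost of this approach: text around a lone '<' is split into several small chunks, which is harmless for format because the chunks are concatenated again anyway.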


format can be rewritten without explicit recursion if we use concatMap:

format :: HM.HashMap String String -> [Merge String] -> String
format hmap xs = concatMap go xs
  where
    go (Chunk c) = c
    go (Key k) = maybe (error $ "missing key: " ++ k) id (HM.lookup k hmap)

However, this version is still partial because of the call to error, which is something we try to avoid. So let's use Either instead:

format :: HM.HashMap String String -> [Merge String] -> Either String String
format hmap xs = concat <$> mapM go xs
  where
    go (Chunk c) = Right c
    go (Key k) = maybe (Left $ "missing key: " ++ k) Right (HM.lookup k hmap)
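To see how the Either-based version behaves, here is a small standalone sketch. It makes two assumptions of mine for the sake of a self-contained example: Data.Map.Strict from containers stands in for the HashMap (M.lookup is used exactly like HM.lookup), and the list of parsed pieces is written out by hand instead of coming from the parser.

```haskell
import qualified Data.Map.Strict as M

data Merge a = Chunk a | Key a deriving (Show)

-- Same shape as the Either-based format; only the map type differs
-- (assumption: Data.Map in place of HashMap to avoid an extra dependency).
format :: M.Map String String -> [Merge String] -> Either String String
format table xs = concat <$> mapM go xs
  where
    go (Chunk c) = Right c
    go (Key k)   = maybe (Left ("missing key: " ++ k)) Right (M.lookup k table)

main :: IO ()
main = do
  let table  = M.fromList [("name", "Adam"), ("thing", "Apples")]
      -- Hand-written stand-in for the result of parsing the test string.
      parsed = [Chunk "Hi ", Key "name", Chunk "! Do you like ", Key "thing", Chunk "?"]
  print (format table parsed)                     -- Right "Hi Adam! Do you like Apples?"
  print (format (M.delete "thing" table) parsed)  -- Left "missing key: thing"
```

Because mapM runs in the Either monad, formatting stops at the first missing key and reports only that one. With the question's main, the overall result would be an Either ParseError (Either String String), so both failure modes stay visible in the type.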
answered Nov 5, 2017 at 17:39
\$\endgroup\$
