My intent is to perform "mail merge" where I can write strings like "hi <<name>>" and format them according to a HashMap. Specifically, the string contains keys formatted as <<key>>, and the map contains the corresponding values.
I have a main concern that I'd like help with. I think it would be best to perform the parsing in multiple stages:
- first find the Keys
- second find the remaining Chunks
- and for more complicated parsing tasks, perhaps more stages
I couldn't figure that out, so instead I use the more expensive lookahead function notFollowedBy in a single pass. That obviously wouldn't work well if I had a slightly more complicated need.
import Data.Functor.Identity (Identity)
import Data.HashMap.Lazy as HM
import Text.Parsec
import Text.Parsec.String
-- Parsing ----------
data Merge a = Chunk a | Key a deriving (Show)
key :: Parser (Merge String)
key = Key <$> between (string "<<") (string ">>") (many1 letter)
chunk :: Parser (Merge String)
chunk = Chunk <$> many1 (notFollowedBy key >> anyChar)
prose :: ParsecT String () Identity [Merge String]
prose = many1 $ key <|> chunk
-- Formatting ----------
format :: HM.HashMap String String -> [Merge String] -> String
format _ [] = ""
format hmap (Chunk x : xs) = x ++ format hmap xs
format hmap (Key k : xs) =
  case HM.lookup k hmap of
    -- I could obviate the `error` by working within a failure monad
    Nothing -> error $ "missing key: " ++ k
    Just v  -> v ++ format hmap xs
-- Testing ----------
testString = "Hi <<name>>! Do you like <<thing>>?"
testMap = HM.fromList [("name", "Adam"), ("thing", "Apples")]
main = print $ format testMap <$> parse prose "" testString
1 Answer
Instead of notFollowedBy, you can use noneOf in chunk:
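chunk :: Parser (Merge String)
-- Parentheses are needed: <$> binds tighter than <|>, so Chunk must wrap both alternatives.
-- In prose, key has to be wrapped in try (try key <|> chunk) for a lone '<' to reach the second branch.
chunk = Chunk <$> (many1 (noneOf "<") <|> try (sequence [char '<', noneOf "<"]))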
That avoids the notFollowedBy lookahead (only a one-character try remains), but it is no longer that nice to look at.
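A minimal self-contained sketch of this noneOf variant, using only parsec. Two details go beyond the answer's text and are assumptions: the two alternatives are parenthesized so that `Chunk` wraps both of them, and `key` is wrapped in `try` inside `prose`, because `string "<<"` fails after consuming input when it sees a single `<`. The helper `parseMerge` is a hypothetical name introduced for illustration.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

data Merge a = Chunk a | Key a deriving (Show, Eq)

key :: Parser (Merge String)
key = Key <$> between (string "<<") (string ">>") (many1 letter)

-- Either a run of non-'<' characters, or a single '<' followed by a non-'<'.
chunk :: Parser (Merge String)
chunk = Chunk <$> (many1 (noneOf "<") <|> try (sequence [char '<', noneOf "<"]))

-- `try key` backtracks when "<<...>>" does not match, so `chunk` gets a turn.
prose :: Parser [Merge String]
prose = many1 (try key <|> chunk)

-- Hypothetical helper: run `prose` on a plain string.
parseMerge :: String -> Either ParseError [Merge String]
parseMerge = parse prose ""
```

Loaded in GHCi, `parseMerge "a < b"` should yield three chunks, with the lone `<` handled by the second branch of `chunk`.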
format can be rewritten without explicit recursion if we use concatMap:
format :: HM.HashMap String String -> [Merge String] -> String
format hmap xs = concatMap go xs
  where
    go (Chunk c) = c
    go (Key k)   = maybe (error $ "missing key: " ++ k) id (HM.lookup k hmap)
However, this function is still partial (it calls error on a missing key), which we try to avoid. So let's use Either instead:
format :: HM.HashMap String String -> [Merge String] -> Either String String
format hmap xs = concat <$> mapM go xs
  where
    go (Chunk c) = Right c
    go (Key k)   = maybe (Left $ "missing key: " ++ k) Right (HM.lookup k hmap)
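To illustrate how the Either version behaves, here is a small sketch that applies it to a hand-written list of chunks and keys, once with a complete map and once with a key missing. The names `merged`, `exampleOk` and `exampleMissing` are made up for this example, and the HashMap import is qualified, which is an assumption rather than what the question's code does.

```haskell
import qualified Data.HashMap.Lazy as HM

data Merge a = Chunk a | Key a deriving (Show)

format :: HM.HashMap String String -> [Merge String] -> Either String String
format hmap xs = concat <$> mapM go xs
  where
    go (Chunk c) = Right c
    go (Key k)   = maybe (Left $ "missing key: " ++ k) Right (HM.lookup k hmap)

-- Example input, written out by hand instead of being parsed.
merged :: [Merge String]
merged = [Chunk "Hi ", Key "name", Chunk "! Do you like ", Key "thing", Chunk "?"]

-- Every key is present: Right "Hi Adam! Do you like Apples?"
exampleOk :: Either String String
exampleOk = format (HM.fromList [("name", "Adam"), ("thing", "Apples")]) merged

-- "thing" is absent: Left "missing key: thing"
exampleMissing :: Either String String
exampleMissing = format (HM.fromList [("name", "Adam")]) merged
```

Because mapM in the Either monad stops at the first Left, only the first missing key is reported.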
[...] <<? If they are not allowed, your language is regular.
[...] <<<<>>? No, there should just be alpha characters inside a Key. Or if you meant Keys within Keys, still no, just <<alpha>>.
[...] breakOn would do the job perfectly well.
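The last comment does not say which breakOn is meant. Assuming it refers to `Data.Text.breakOn` from the text package, a parser-free version could look roughly like the sketch below. The function name `splitMerge` is hypothetical, the result holds `Text` rather than `String`, and unlike the Parsec `key` this sketch does not check that a key consists of letters only.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import qualified Data.Text as T

data Merge a = Chunk a | Key a deriving (Show, Eq)

-- Split the input at each "<<" ... ">>" pair.
-- An unterminated "<<" is kept as plain text.
splitMerge :: T.Text -> [Merge T.Text]
splitMerge t
  | T.null t = []
  | otherwise =
      case T.breakOn "<<" t of
        (before, rest)
          | T.null rest -> [Chunk before]          -- no "<<" left
          | otherwise ->
              let (k, after) = T.breakOn ">>" (T.drop 2 rest)
                  pre = [Chunk before | not (T.null before)]
              in if T.null after
                   then pre ++ [Chunk rest]        -- no closing ">>"
                   else pre ++ Key k : splitMerge (T.drop 2 after)
```

For the test string from the question, `splitMerge "Hi <<name>>! Do you like <<thing>>?"` should produce the same chunk/key sequence as the Parsec version, only with `Text` values.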