5
\$\begingroup\$

I've written a rudimentary parser for INI files:

{-# LANGUAGE OverloadedStrings #-}
import qualified Data.Map as M
import Data.Maybe (fromMaybe)
import qualified Data.Text as T
type Ini = M.Map T.Text Section
data Section =
 Section
 { name :: T.Text
 , properties :: M.Map T.Text T.Text
 }
 deriving (Show)
main :: IO ()
main = parseIni iniFilePath >>= \ini -> putStrLn $ "Parsed INI: " ++ show ini
 where
 iniFilePath = "/home/me/test.ini"
parseIni :: FilePath -> IO Ini
parseIni iniFilePath = parseToIni . T.pack <$> readFile iniFilePath
parseToIni :: T.Text -> Ini
parseToIni stringToParse =
 -- We return the parsed Ini, not the helper values
 firstOfTriple $
 foldr
 (\line (ini, currentSectionMaybe, lineIndex) ->
 -- We're at a new section start or the end of the file → add the previous section
 -- to the parsed Ini value and create a new section
 if isSectionHeader line || lineIndex >= length lines - 1
 then let updatedIni = addSection ini currentSectionMaybe
 in (updatedIni, Just $ Section (getSectionName line) M.empty, 1 + lineIndex)
 else (ini, updateSection currentSectionMaybe line, 1 + lineIndex))
 (M.empty, Nothing, 0) $
 -- Since foldr is right associative we would process the lines starting with the last one, that's
 -- why we reverse the list of lines
 reverse lines
 where
 lines :: [T.Text]
 lines = T.splitOn "\n" stringToParse
firstOfTriple :: (a, b, c) -> a
firstOfTriple (x, _, _) = x
parseProperty :: T.Text -> Maybe (T.Text, T.Text)
parseProperty line =
 case T.splitOn "=" line of
 [name, value] -> Just (T.strip name, T.strip value)
 _ -> Nothing
updateSection :: Maybe Section -> T.Text -> Maybe Section
updateSection sectionMaybe line = fmap updateSection' sectionMaybe
 where
 updateSection' :: Section -> Section
 updateSection' section =
 -- Add the property to the section if the property can be parsed.
 -- Otherwise, leave the section as it were
 maybe
 section
 (\(propName, value) -> Section (name section) (M.insert propName value (properties section)))
 (parseProperty line)
getSectionName :: T.Text -> T.Text
getSectionName line = fromMaybe line headerWithoutBracketsMaybe
 where
 headerWithoutBracketsMaybe = T.stripPrefix "[" line >>= T.stripSuffix "]"
isSectionHeader :: T.Text -> Bool
isSectionHeader line = T.isPrefixOf "[" strippedLine && T.isSuffixOf "]" strippedLine
 where
 strippedLine = T.strip line
addSection :: Ini -> Maybe Section -> Ini
addSection ini sectionMaybe = maybe ini (\section -> M.insert (name section) section ini) sectionMaybe

I'd love to get feedback on how to simplify the code and/or make it more readable.

Things I'm aware of and ok with at the moment:

  • The parser doesn't support comments
  • addSection could be eta-reduced
  • I don't use a parsing library like Parsec
  • I don't use lenses
200_success
145k22 gold badges190 silver badges478 bronze badges
asked May 18, 2019 at 11:14
\$\endgroup\$

2 Answers 2

3
\$\begingroup\$

Just a few comments – not a full review:

  1. parseToIni :: T.Text -> Ini indicates that from any random string, parseToIni can produce an Ini. This makes me wonder how it would handle an invalid .ini file, or e.g. the string foo.

  2. In updateSection :: Maybe Section -> T.Text -> Maybe Section the Maybes obfuscate what the function is supposed to do. Can the function produce a Nothing if the first argument is a Just? Better remove the Maybes and fmap the whole thing if needed. addSection is similar.

  3. main would IMHO be more readable if it simply used do-notation.

  4. In parseToIni, the worker function for foldr is complex enough that it should have a type annotation. It's ok to to simply call inner worker functions f IMHO.

  5. I think it's a bit confusing that the section names appear both as the keys of Ini and in Section's name field. I'd probably remove the name field.

  6. A few type synonyms for keys, values, section names etc might help with readability.

  7. The order of parameters in addSection and updateSection is a bit unconventional. The usual a -> b -> b ordering is a bit nicer for partial applications.

  8. IMHO, maybe (and similar functions like either) don't aid readability. If you don't want to come up with a variable name, try the LambdaCase extension.

  9. Try Data.Text.IO.readFile.

  10. Instead of getSectionName and isSectionName, have a single function of type Text -> Maybe SectionName.

All in all I think your code is pretty readable. It's mostly the types that could be a bit better.

answered May 21, 2019 at 21:07
\$\endgroup\$
4
\$\begingroup\$

I know you said you were fine with not using a parser combinator library like parsec, but I thought you might like to see the how the same thing might look using one, so I wrote an Attoparsec based parser for your data types:

{-# LANGUAGE LambdaCase #-}
{-# LANGUAGE OverloadedStrings #-}
module IniParser where
import Control.Monad ( void )
import Data.Attoparsec.Text ( Parser
 , char
 , endOfInput
 , endOfLine
 , many'
 , many1
 , notInClass
 , parseOnly
 , satisfy
 , space
 )
import Data.Map.Strict ( Map )
import qualified Data.Map.Strict as Map
import Data.Text ( Text
 , pack
 )
import System.Environment ( getArgs )
type Ini = Map Text Section
data Section = Section
 { name :: Text
 , properties :: Map Text Text
 } deriving (Show)
main :: IO ()
main = do
 [path] <- getArgs
 parseIniFile path >>= \case
 Right ini -> putStrLn $ "Parsed INI: " ++ show ini
 Left err -> putStrLn $ "ERROR parsing ini: " ++ err
parseIniFile :: FilePath -> IO (Either String Ini)
parseIniFile iniFilePath = parseIni . pack <$> readFile iniFilePath
parseIni :: Text -> Either String Ini
parseIni = parseOnly ini
ini :: Parser Ini
ini = do
 defaultSection <- lexeme (Section "" <$> (Map.fromList <$> many' property))
 namedSections <- lexeme (many' section)
 void $ endOfInput
 let allSections | null (properties defaultSection) = namedSections
 | otherwise = defaultSection:namedSections
 pure . Map.fromList . map (\section -> (name section, section))
 $ allSections
section :: Parser Section
section = Section <$> sectionName <*> (Map.fromList <$> many' (lexeme property))
sectionName :: Parser Text
sectionName = char '[' *> sectionNameChars <* char ']' <* endOfLine
sectionNameChars :: Parser Text
sectionNameChars = pack <$> many' (satisfy $ notInClass "]\r\n")
property :: Parser (Text, Text)
property = (,) <$> propertyName <*> (lexeme (char '=') *> propertyValue)
propertyName :: Parser Text
propertyName = pack <$> many' (satisfy $ notInClass "=\r\n\t ")
propertyValue :: Parser Text
propertyValue = pack <$> many' (satisfy $ notInClass "\r\n")
lexeme :: Parser a -> Parser a
lexeme p = whitespace *> p <* whitespace
whitespace :: Parser String
whitespace = many' space

I think the main strength of the approach is self pretty self-evident. It eliminates all the multi-line lambdas, the entire foldr, etc. which (at least IMHO) really obscures the essence of what the code is expressing.

Additionally I've restricted the use of qualified imports to a single one usage where I think it makes the code more readable, though your taste may vary.

You can see the whole stack based project here if you're interested.

answered May 22, 2019 at 4:57
\$\endgroup\$

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.