This module allows to parse indented text file into Map
from node to child nodes identified by Coordinate
. This module seems way too complex - mainly because of managing stack that allows to "jump back" with decreasing indentation. How can I redesign it?
sample input:
a
b
c
d
e
f
produces:
*Parse> parse tree
fromList [("a",["f","c","b"]),("c",["d"]),("d",["e"])]
implementation:
module Parse where
import System.Environment (getArgs)
import Data.List (find, isInfixOf, intercalate)
import Data.Char (isAlpha)
import Control.Lens ((&))
import Data.List.Split (splitWhen)
import Control.Arrow
import qualified Data.Map.Strict as Map
import Data.Maybe
import Debug.Trace
import Control.Applicative
type Coordinate = String
parse :: String -> Map.Map Coordinate [Coordinate]
parse = snd . go
where
go = foldl addDependency ([], Map.empty) . filter ((>0) . length) . lines
addDependency (stack, acc) line = if null stack then
(,) [(0, line)] $ Map.singleton line []
else let children = fromMaybe [] . flip Map.lookup acc
lineDepth = countIndent line
indent = signum $ lineDepth - (fst . head $ stack)
coord = toCoordinate line
addChild c = Map.insert c (coord : children c) acc
parent = tail stack
in
case indent of
0 -> (,) ((lineDepth,coord):parent) $ addChild (snd . head $ parent)
1 -> (,) ((lineDepth,coord):stack) (addChild $ snd . head $ stack)
_ -> let newStack = back stack lineDepth
newHead = snd . head $ newStack
in (,) ((lineDepth,coord):newStack) (addChild newHead)
back stack@((i,c):ss) indent = if indent > i then stack else back ss indent
toCoordinate = dropWhile (not . special)
special = (||) <$> isAlpha <*> (=='(')
countIndent = length . takeWhile (not . special)
1 Answer 1
Make things easier on yourself by separating concerns. E.g., each line encodes a position in its amount of leading whitespace followed by an identifier. Parse this representation into a more manipulable form in a single function invocation, don't thread it through your algorithm logic which shouldn't care about serialized representation.
type SerializedGraph = String -- Aliases for pedagogical clarity
type Indentation = Int
type Symbol = String
parse :: SerializedGraph -> [(Indentation, Symbol)]
parse = map (first length . span (== ' ')) . lines
where first :: (a -> b) -> (a, c) -> (b, c)
first f (a, c) = (f a, c)
Now consider intermediate representations that can get you closer to your goal. Remember that it's easy to construct a map from an association list with fromList
, so let's make that our end goal. Write out the types for a roadmap from where we are to where we want to be.
[(Indentation, Symbol)] -- [(Hint, Key)]
==> ???
==> [(Symbol, [Symbol])] -- [(Key, [Value])]
If that final representation has only unique keys, then a reasonable intermediary would be one that hasn't had values which share a key accumulated yet. I.e., [(Symbol, Symbol)] -- [(Key, Value)]
.
associate :: [(Int, String)] -> [(String, String)]
associate [] = []
associate ((i, s):ss) = [(s, t) | (j, t) <- descendants, j == i + 1]
++ associate descendants
++ associate siblings
where (descendants, siblings) = span ((> i) . fst) ss
Now all that's left to do is create a Map
. We could accumulate values on our own, but really that's what fromListWith
is for.
toMap :: [(String, String)] -> Map String [String]
toMap = fromListWith (++) . map (second (:[]))
where second f (a, b) = (a, f b)
All that's left is composing these functions together.
readGraph :: String -> Map String [String]
readGraph = toMap . associate . parse