3
\$\begingroup\$

I'm not sure if a Markov Chain generator is the proper term for this, really all it does is create the chain, it doesn't generate any text from it. The code is below, and I'd appreciate any feedback: bad practices, code smells, poor naming, anything at all really.

module Main where
import Data.List(find, delete)
type Chain = [Node]
data Node = Node {nPrefix :: String, nSuffixes :: [String]} deriving (Show, Eq, Read)
generateChain :: Int -> [String] -> Chain
generateChain n = foldl (learnSample n) []
learnSample :: Int -> Chain -> String -> Chain
learnSample n chain sample = foldl learnGroup chain groups
 where groups = (groupings (n+1) . words) sample
learnGroup :: Chain -> [String] -> Chain
learnGroup chain group = 
 case find (\node -> nPrefix node == prefix) chain of
 Just node -> delete node chain ++ [Node prefix (nSuffixes node ++ [suffix])]
 Nothing -> chain ++ [Node prefix [suffix]]
 where 
 prefix = unwords $ init group
 suffix = last group
groupings :: Int -> [a] -> [[a]]
groupings n list 
 | length list >= n = take n list : groupings n (drop 1 list)
 | otherwise = []
main :: IO ()
main = print $ generateChain 1 ["a b a b a c a d"]
200_success
145k22 gold badges190 silver badges478 bronze badges
asked Dec 3, 2014 at 16:14
\$\endgroup\$
0

2 Answers 2

1
\$\begingroup\$

Looks good, the only thing I'd suggest is that since you don't really rely on having strings as nodes, you can also generalise your implementation to arbitrary chains, i.e. Chain Char for this particular main function. You'll need to change to a GADT (I think) but otherwise the code would just need a few minor additional changes.

answered Dec 3, 2014 at 23:29
\$\endgroup\$
0
\$\begingroup\$

This is very nice haskell code. I was slightly irritated by the use of foldl, which should usuablly be replaced by foldl'. Using foldl seems perfectly valid here, but it might warrant a comment as to why it was used.

Something I'm unsure about is your choice of data structures to represent the chain. My understanding of Markov chains is that what's called suffixes here is a set of Nodes reachable from this node. A Data.Set or Data.Map.Strict.Map might be a better choice to capture this. The latter choice would also give us access to functions such as insertWith, which could be used to simplify learnGroup. But still, nice code.

answered Aug 1, 2015 at 9:18
\$\endgroup\$

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.