Type Validation using State Monad

Question 1

I am attempting to build a compiler for a stack-based (or concatenative) programming language in Haskell. I've done fine getting the lexer and parser up and running, but now that I'm into the meat of the semantic analyzer, implementing the algorithms idiomatically is getting trickier. I finally had to force myself to learn monads for this latest problem: validating a function type. I don't want to bore anyone with the details of the type system I'm working on, but I did include some cursory comments for clarity. The algorithm fragment below does produce my expected output in all of my test cases:

module Types where
import qualified Data.Map as Map
import Control.Monad.State
data PrimitiveType = Bool
 | Byte | Short | Int | Long
 | UByte | UShort | UInt | ULong
 | Single | Double
 | Address
 deriving (Eq, Show, Read)
-- 'Verbs' are the equivalent of functions, and the type of a verb
-- is represented by two sequences of types, called a stack effect.
-- Hence a verb can be thought of as a function from a tuple to a tuple.
-- Stack types are like generic types, except they can replace an entire
-- sequence or subsequence of types rather than just a single type.
-- Stack expressions generally only contain stack types and expressions, especially
-- in the parsing phase. Concatenation is simply represented by having writing
-- two stack types consecutively in a type sequence.
data Type = Primitive PrimitiveType
 | GenericType Int
 | StackType Int
 | VerbType [Type] [Type]
 | StackExpression Type Type
 | TupleType [Type]
 | VectorType Type
 | PointerType Type
 deriving (Eq, Show)
-- Top level verb validation has a special kink. The output sequence
-- must not introduce any new stack types. This is ONLY THE CASE for
-- the outermost verb type.
validVerbType :: Type -> Bool
validVerbType (VerbType l r) = fst $ runState (do
 vl <- validateSequence (reverse l) False
 vr <- validateSequence (reverse r) True
 return (vl && vr)) []
validVerbType _ = False
-- Type validation: used for making sure type declarations constructed by
-- the programmer are actually valid. Due to the rather finicky nature of
-- stack types in this type system, I'm not actually sure this is a proper
-- definition of what a 'valid' type is.
-- Right now, what it does is make sure that only one stack type is introduced
-- in each stack effect or tuple (recursively). Due to how the language is
-- parsed, only verb stack effects and tuples can contain stack types (and this
-- is really what makes the most sense to me).
-- Verbs have a special mechanism to accomplish this: the shortest of the left
-- or right effects is validated first. All sequences of types are validated
-- in reverse order.
-- The State monad is used to thread a map through validator. If the map doesn't
-- contain an encountered stack type, it is added to the map if it is the first
-- stack type encountered in this sequence, otherwise it is invalid. If the map
-- does contain the stack type, 
validate :: Type -> State [Int] Bool
validate (StackType i) = do
 frame <- get
 return (elem i frame)
validate (StackExpression l r) = do
 vl <- validate l
 vr <- validate r
 return (vl && vr)
validate (TupleType t) = do validateSequence t False
validate (VerbType l r) = do
 if (length l) > (length r) then do
 vr <- validateSequence (reverse r) False
 vl <- validateSequence (reverse l) False
 return (vr && vl)
 else do
 vl <- validateSequence (reverse l) False
 vr <- validateSequence (reverse r) False
 return (vl && vr)
validate _ = return True
validateSequence :: [Type] -> Bool -> State [Int] Bool
validateSequence [] _ = return True
validateSequence ((StackType t):ts) False = do
 frame <- get
 if notElem t frame then do
 put (t:frame)
 v <- (validateSequence ts True)
 return v
 else do
 v <- (validateSequence ts False)
 return v
validateSequence ((StackType t):ts) True = do
 frame <- get
 if notElem t frame then return False
 else do
 v <- (validateSequence ts True)
 return v
validateSequence (t:ts) i = do
 vt <- validate t
 vts <- validateSequence ts i
 return (vt && vts)

Review Requests

Are there segments that could be more idiomatically written?
Could the code be structured to be more readable?
Could validateSequence be replaced by a map of some kind?
The function validateSequence passes a Bool as a stateful argument: it starts as False until the first 'unseen' stack type is encountered and is then passed as True for the rest of the recursion. It is used to make sure there is only one 'unseen' stack type per sequence.
Could that be programmed in a better way? It seems hacky to me.

Question 2

Welcome to Code Review! I hope you get some great answers!

Question 3

I concentrate on

Are there segments that could be more idiomatically written?

and probably overdo it.

fst $ runState can be written as evalState.

Look at this:

vl <- validateSequence (reverse l) False
vr <- validateSequence (reverse r) True
return (vl && vr)

You extract two monadic values, apply && and put them back into the monad. This can be written as the surely less readable

liftM2 (&&) (validateSequence (reverse l) False) (validateSequence (reverse r) True)

but given how often you use it, define

(<&&>) :: (Applicative f) => f Bool -> f Bool -> f Bool 
(<&&>) = liftA2 (&&)

and it becomes

validateSequence (reverse l) False <&&> validateSequence (reverse r) True

Don't worry too much about the difference between Applicative and Monad too much, liftM2 will do, too.

Instead of

x <- something
return x

you can always write just something. So

frame <- get
if notElem t frame then do
 put (t:frame)
 v <- (validateSequence ts True)
 return v
else do
 v <- (validateSequence ts False)
 return v

can be written as

notIn <- notElem t <$> get
when notIn $ modify (t:)
validateSequence ts False

which uses the fact that monads are functors, <$> is fmap: Here I apply notElem t to the monadic value of get. when is in Control.Monad and is equivalent to

when :: (Monad m) => Bool -> m () -> m ()
when True x = x
when False _ = return

Franky Franky 6853 silver badges13 bronze badges · Accepted Answer · 2015-07-07 16:09:41Z

I concentrate on

Are there segments that could be more idiomatically written?

and probably overdo it.

fst $ runState can be written as evalState.

Look at this:

vl <- validateSequence (reverse l) False
vr <- validateSequence (reverse r) True
return (vl && vr)

You extract two monadic values, apply && and put them back into the monad. This can be written as the surely less readable

liftM2 (&&) (validateSequence (reverse l) False) (validateSequence (reverse r) True)

but given how often you use it, define

(<&&>) :: (Applicative f) => f Bool -> f Bool -> f Bool 
(<&&>) = liftA2 (&&)

and it becomes

validateSequence (reverse l) False <&&> validateSequence (reverse r) True

Don't worry too much about the difference between Applicative and Monad too much, liftM2 will do, too.

Instead of

x <- something
return x

you can always write just something. So

frame <- get
if notElem t frame then do
 put (t:frame)
 v <- (validateSequence ts True)
 return v
else do
 v <- (validateSequence ts False)
 return v

can be written as

notIn <- notElem t <$> get
when notIn $ modify (t:)
validateSequence ts False

which uses the fact that monads are functors, <$> is fmap: Here I apply notElem t to the monadic value of get. when is in Control.Monad and is equivalent to

when :: (Monad m) => Bool -> m () -> m ()
when True x = x
when False _ = return

Stack Exchange Network

Type Validation using State Monad

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

Type Validation using State Monad

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions