5
\$\begingroup\$

I am new to Haskell and currently trying to port my solutions for the 2019 installment of the coding challenge AdventOfCode to Haskell. So, I would very much appreciate any suggestions how to make the code more readable and more idiomatic. In particular, I am interested in whether my use of the State monad is sensible here.

This post is about the combined result from day 2, day 5, and day 9, an interpreter for a simple assembly language called IntCode. If you have not solved these problems and still intend to do so, stop reading immediately.

I have kept the entire solution for each part of each day in a single module with a single exported function that prints the solution. Without this selfimposed restriction, I would have split this code into different modules. For day 9 part 1 it starts as follows.

module AdventOfCode20191209_1
 (
 systemCheck
 ) where
import System.IO
import Data.List.Split
import Data.List
import Data.Maybe
import qualified Data.HashMap.Strict as Map
import Control.Monad.State
import Data.Digits
systemCheck :: IO ()
systemCheck = do
 inputText <- readFile "Advent20191209_1_input.txt"
 let code = toIntCode inputText
 let result = executeCode code [1]
 print result

Now, let me explain the setting, i.e. the IntCode language, and how I have implemented the different features.

As the name might suggest, the input is a stream/list of integer values. In the problems, it is provided as a comma separated string, which I read as follows.

type IntCode = [Int]
toIntCode :: String -> IntCode
toIntCode = map read . splitOn ","

Process State

This code also functions as the bases for the process memory, which is essentially infinite and initialized by the code itself inside its index range and by zero outside.

newtype Memory = Memory (Map.HashMap Int Int) deriving (Eq, Show)
readMemory :: Int -> State Memory Int
readMemory pointer = gets $ \(Memory m) -> Map.lookupDefault 0 pointer m
writeToMemory :: Int -> Int -> State Memory ()
writeToMemory pointer 0 = modify $ \(Memory m) -> Memory $ Map.delete pointer m
writeToMemory pointer value = modify $ \(Memory m) -> Memory $ Map.alter (\_ -> Just value) pointer m
fromIntCode :: IntCode -> Memory
fromIntCode = Memory . Map.fromList . indexedIntCode
indexedIntCode :: IntCode -> [(Int, Int)]
indexedIntCode = zip [0..]

In addition to memory, there is an input stream used by the Get operation,

newtype InputStream = InputStream [Int] deriving (Eq, Show)
addInput :: Int -> State InputStream ()
addInput input = modify $ \(InputStream xs) -> InputStream $ xs ++ [input]
addInputs :: [Int] -> State InputStream ()
addInputs inputs = modify $ \(InputStream xs) -> InputStream $ xs ++ inputs
popInput :: State InputStream (Maybe Int)
popInput = state $ \input@(InputStream xs) -> case xs of
 [] -> (Nothing, input)
 y:ys -> (Just y, InputStream ys)

There are two more stateful objects involved in the computation, an instruction pointer and a relative base pointer, both starting at zero. The instruction pointer is what the name suggests and the relative base pointer is used for one of the methods to read arguments described later.

I have combined these into a ProcessState type.

data ExecutionStatus = Running | Blocked | Terminated | Error deriving (Eq, Show, Enum)
data ProcessState = ProcessState {
 memory :: Memory,
 inputs :: InputStream,
 instructionPointer :: Int,
 relativeBasePointer :: Int,
 status :: ExecutionStatus
} deriving (Eq, Show)
processStatus :: State ProcessState ExecutionStatus
processStatus = gets $ status
hasShutDown :: State ProcessState Bool
hasShutDown = do
 currentStatus <- processStatus
 case currentStatus of
 Terminated -> return True
 Error -> return True
 _ -> return False
isRunning :: State ProcessState Bool
isRunning = do
 currentStatus <- processStatus
 case currentStatus of
 Running -> return True
 _ -> return False
setProcessStatus :: ExecutionStatus -> State ProcessState ()
setProcessStatus processStatus = do
 stopped <- hasShutDown
 if stopped
 then return ()
 else modify $ \s -> s{status = processStatus}
terminateProcess :: State ProcessState ()
terminateProcess = setProcessStatus Terminated
abortProcess :: State ProcessState ()
abortProcess = setProcessStatus Error
setInstructionPointer :: Int -> State ProcessState ()
setInstructionPointer pointer = modify $ \s -> s {instructionPointer = pointer}
processInstructionPointer :: State ProcessState Int
processInstructionPointer = gets $ instructionPointer
incrementInstructionPointer :: Int -> State ProcessState ()
incrementInstructionPointer offset = do
 instructionPointer <- processInstructionPointer
 setInstructionPointer (instructionPointer + offset)
setRelativeBasePointer :: Int -> State ProcessState ()
setRelativeBasePointer pointer = modify $ \s -> s {relativeBasePointer = pointer}
processRelativeBasePointer :: State ProcessState Int
processRelativeBasePointer = gets $ relativeBasePointer
incrementRelativeBasePointer :: Int -> State ProcessState ()
incrementRelativeBasePointer offset = do
 relativeBasePointer <- processRelativeBasePointer
 setRelativeBasePointer (relativeBasePointer + offset)
readProcessMemory :: Int -> State ProcessState Int
readProcessMemory pointer = gets $ \ProcessState{memory = m} -> evalState (readMemory pointer) m
writeToProcessMemory :: Int -> Int -> State ProcessState ()
writeToProcessMemory pointer value = modify $ \s@ProcessState{memory = m} -> s {memory = execState (writeToMemory pointer value) m}
addProcessInput :: Int -> State ProcessState ()
addProcessInput additionalInput = modify $ \s@ProcessState{inputs = inputStream} -> s {inputs = execState (addInput additionalInput) inputStream}
addProcessInputs :: [Int] -> State ProcessState ()
addProcessInputs additionalInputs = modify $ \s@ProcessState{inputs = inputStream} -> s {inputs = execState (addInputs additionalInputs) inputStream}
popProcessInput :: State ProcessState (Maybe Int)
popProcessInput = state $ \s@ProcessState{inputs = inputStream} ->
 let (input, newInputs) = runState popInput inputStream
 in (input, s {inputs = newInputs})
initializeProcess :: IntCode -> [Int] -> ProcessState
initializeProcess code initialInputs = ProcessState {
 memory = fromIntCode code,
 inputs = InputStream initialInputs,
 instructionPointer = 0,
 relativeBasePointer = 0,
 status = Running}

Most of the associated methods are basically plumbing between the stateful operations on the process state and the underlying stateful operations. I have added an execution status that is used to indicate that the process has terminated, via hasShutDown. Moreover, it will be used to halt execution in case the process blocks due to missing input. This is necessary for applications as in part 2 of day 7 or on day 23.

Instructions

Each step of a computation starts by reading the value in memory the instruction pointer points to. This encodes two kinds of information: the opcode for the operation and the modes used to read the arguments from memory.

data IntCodeInstruction = IntCodeInstruction {
 opcode :: OpCode,
 argumentSpecifications :: [ArgumentSpecification]
} deriving (Eq, Show)
data ArgumentSpecification = ArgumentSpecification {
 argumentMode :: ArgumentMode,
 operationMode :: OperationMode
} deriving (Eq, Show)

Reading Arguments

There are three basic modes how to read data.

toArgumentMode :: Int -> Maybe ArgumentMode
toArgumentMode 0 = Just Pointer
toArgumentMode 1 = Just Value
toArgumentMode 2 = Just Relative
toArgumentMode _ = Nothing

In general, the information relating to the nth argument is at an offset of n from the instruction pointer. In value mode, this is the argument, in pointer mode, it is the value of the pointer to the data and in relative mode, it is the offset from the relative base pointer where to look for the data. Unfortunately, this is not all. If the argument specifies a location to write to in memory, it the argument is the pointer instead of the data at the location the pointer points to. To encode this, I introduced an operation mode, which depends on the opcode.

data OperationMode = Read | Write deriving (Eq, Show)

Putting this together, I read arguments as follows.

instructionArguments :: IntCodeInstruction -> State ProcessState Arguments
instructionArguments instruction = do
 basePointer <- processInstructionPointer
 let enumeratedArgumentSpecifications = zip [1..] (argumentSpecifications instruction)
 in mapM (instructionArgument basePointer) enumeratedArgumentSpecifications
instructionArgument :: Int -> (Int, ArgumentSpecification) -> State ProcessState Int
instructionArgument basePointer (offset, argumentSpec) =
 let evaluationPointer = basePointer + offset
 in case argumentMode argumentSpec of
 Value -> readProcessMemory evaluationPointer
 Pointer -> case operationMode argumentSpec of
 Write -> readProcessMemory evaluationPointer
 Read -> do
 transitiveEvaluationPointer <- readProcessMemory evaluationPointer
 readProcessMemory transitiveEvaluationPointer
 Relative -> do
 relativeBase <- processRelativeBasePointer
 baseIncrement <- readProcessMemory evaluationPointer
 let targetPointer = relativeBase + baseIncrement
 in case operationMode argumentSpec of
 Write -> return targetPointer
 Read -> readProcessMemory targetPointer

Decoding Instructions

The instruction itself is encoded as follows. The last two digits of the value in memory the instruction pointer points to represent the opcode. the higher digits represent the argument modes to be used, in inverted order when read, i.e. the lowest digit belongs to the first argument. Missing argument modes default to pointer mode.

intCodeInstruction :: State ProcessState (Maybe IntCodeInstruction)
intCodeInstruction = do
 instructionPointer <- processInstructionPointer
 instructionValue <- readProcessMemory instructionPointer
 return (do -- Maybe
 opcode <- toOpCode (instructionValue `mod` 100)
 argumentSpecs <- toArgumentSpecifications opcode (instructionValue `div` 100)
 return (IntCodeInstruction opcode argumentSpecs))
toArgumentSpecifications :: OpCode -> Int -> Maybe [ArgumentSpecification]
toArgumentSpecifications opcode argumentSpecifier =
 do -- Maybe
 maybeSpecifiedArgumentModes <- argumentModesFromSpecifier argumentSpecifier
 specifiedArgumentModes <- sequence maybeSpecifiedArgumentModes
 let
 operationModes = associatedOperationModes opcode
 numberOfMissingElements = length operationModes - length specifiedArgumentModes
 in if numberOfMissingElements < 0
 then Nothing
 else let paddedArgumentsModes = specifiedArgumentModes ++ replicate numberOfMissingElements Pointer
 in return (zipWith ArgumentSpecification paddedArgumentsModes operationModes)
argumentModesFromSpecifier :: Int -> Maybe [Maybe ArgumentMode]
argumentModesFromSpecifier 0 = Just []
argumentModesFromSpecifier x
 | x < 0 = Nothing
 | otherwise = Just (map toArgumentMode (reverse (digits 10 x)))

OpCodes

Now, let me come to the opcodes; there are 10 of them.

data OpCode = Add | Multiply | Get | Put | JumpIfTrue | JumpIfFalse | LessThan | Equals | IncrementRelativeBase | Stop deriving (Eq, Show, Enum)
toOpCode :: Int -> Maybe OpCode
toOpCode 1 = Just Add
toOpCode 2 = Just Multiply
toOpCode 3 = Just Get
toOpCode 4 = Just Put
toOpCode 5 = Just JumpIfTrue
toOpCode 6 = Just JumpIfFalse
toOpCode 7 = Just LessThan
toOpCode 8 = Just Equals
toOpCode 9 = Just IncrementRelativeBase
toOpCode 99 = Just Stop
toOpCode _ = Nothing

Each opcode is accociated with one operation and hence also with one set of requires arguments.

--operation modes in order of arguments
associatedOperationModes :: OpCode -> [OperationMode]
associatedOperationModes Add = [Read, Read, Write]
associatedOperationModes Multiply = [Read, Read, Write]
associatedOperationModes Get = [Write]
associatedOperationModes Put = [Read]
associatedOperationModes JumpIfTrue = [Read, Read]
associatedOperationModes JumpIfFalse = [Read, Read]
associatedOperationModes LessThan = [Read, Read, Write]
associatedOperationModes Equals = [Read, Read, Write]
associatedOperationModes IncrementRelativeBase = [Read]
associatedOperationModes Stop = []
type Arguments = [Int]
associatedOperation :: OpCode -> (Arguments -> State ProcessState (Maybe Int))
associatedOperation Add = handleTerminationAndRun . add
associatedOperation Multiply = handleTerminationAndRun . multiply
associatedOperation Get = handleTerminationAndRun . getOperation
associatedOperation Put = handleTerminationAndRun . putOperation
associatedOperation JumpIfTrue = handleTerminationAndRun . jumpIfTrue
associatedOperation JumpIfFalse = handleTerminationAndRun . jumpIfFalse
associatedOperation LessThan = handleTerminationAndRun . lessThan
associatedOperation Equals = handleTerminationAndRun . equals
associatedOperation IncrementRelativeBase = handleTerminationAndRun . incrementRelativeBase
associatedOperation Stop = handleTerminationAndRun . stop

Code Execution

Code execution generally works by reading input instructions, executing the associated operations and then advancing the instruction pointer to after the last argument, unless the operation alters the instruction pointer itself.

Operations

All operations share the same signature, Arguments -> State ProcessState (Maybe Int), in order to easily associate them with opcodes. The general operations assume that the process has not terminated. To guard against that, there is a special method used in the accociation with the opcodes.

handleTerminationAndRun :: State ProcessState (Maybe Int) -> State ProcessState (Maybe Int)
handleTerminationAndRun state = do
 stopped <- hasShutDown
 if stopped
 then return Nothing
 else state

We always return Nothing and do not alter the process state if the process has already shut down.

The individual operations work as follows.

Arithmetic operations apply the associated operator to their first two arguments and then write to the third.

add :: Arguments -> State ProcessState (Maybe Int)
add = applyBinaryOperationAndWrite (+)
multiply :: Arguments -> State ProcessState (Maybe Int)
multiply = applyBinaryOperationAndWrite (*)
applyBinaryOperationAndWrite :: (Int -> Int -> Int) -> (Arguments -> State ProcessState (Maybe Int))
applyBinaryOperationAndWrite binaryOp arguments = do
 let
 targetPointer = arguments!!2
 value = binaryOp (head arguments) (arguments!!1)
 in writeToProcessMemory targetPointer value
 incrementInstructionPointer 4
 setProcessStatus Running
 return Nothing

Binary comparisons are similar. However, they encode the return value by 1 for True and 0 for False.

lessThan :: Arguments -> State ProcessState (Maybe Int)
lessThan = applyBinaryComparisonAndWrite (<)
equals :: Arguments -> State ProcessState (Maybe Int)
equals = applyBinaryComparisonAndWrite (==)
applyBinaryComparisonAndWrite :: (Int -> Int -> Bool) -> (Arguments -> State ProcessState (Maybe Int))
applyBinaryComparisonAndWrite binaryComp arguments = do
 let
 targetPointer = arguments!!2
 value = if (head arguments) `binaryComp` (arguments!!1)
 then 1
 else 0
 in writeToProcessMemory targetPointer value
 incrementInstructionPointer 4
 setProcessStatus Running
 return Nothing

The two jump instructions set the instruction pointer to their second argument if the first arguments represents the corresponding truthyness. (again with 0 == False, /= 0 == True)

jumpIfTrue :: Arguments -> State ProcessState (Maybe Int)
jumpIfTrue = jumpIf (/= 0)
jumpIfFalse :: Arguments -> State ProcessState (Maybe Int)
jumpIfFalse = jumpIf (== 0)
jumpIf :: (Int -> Bool) -> (Arguments -> State ProcessState (Maybe Int))
jumpIf test arguments = do
 if test (head arguments)
 then setInstructionPointer (arguments!!1)
 else incrementInstructionPointer 3
 setProcessStatus Running
 return Nothing

In addition to these operations to modify the instruction pointer, there is an operation that increments the relative base pointer by its first argument.

incrementRelativeBase :: Arguments -> State ProcessState (Maybe Int)
incrementRelativeBase arguments = do
 incrementRelativeBasePointer $ head arguments
 incrementInstructionPointer 2
 setProcessStatus Running
 return Nothing

To read from the input stream, there is the Get operation, which i called getOperation because of the name clash with get from State. It reads the first value from the input stream and writes it to its only argument. If the nput stream is empty, it blocks.

getOperation :: Arguments -> State ProcessState (Maybe Int)
getOperation arguments = do
 maybeInput <- popProcessInput
 case maybeInput of
 Nothing -> do
 setProcessStatus Blocked
 return Nothing
 Just input -> do
 let
 targetPointer = head arguments
 in writeToProcessMemory targetPointer input
 incrementInstructionPointer 2
 setProcessStatus Running
 return Nothing

The only operation providing output from the process is the Put operation, which I again named putOperation because of the clash with put from State. It simply outputs its first argument.

putOperation arguments = do
 incrementInstructionPointer 2
 setProcessStatus Running
 let newOutputValue = head arguments
 in return $ Just newOutputValue

Finally, there is the stop operation to terminte the process.

stop :: Arguments -> State ProcessState (Maybe Int)
stop _ = do
 terminateProcess
 return Nothing

Code Execution Coordination

Now that all the pieces of the computation are specified, it only needs to be wired up. I do this using the following stateful computations.

continueExecution :: State ProcessState [Int]
continueExecution = do
 maybeResult <- executeNextInstruction
 running <- isRunning
 if running
 then do
 remainingResult <- continueExecution
 case maybeResult of
 Nothing -> return remainingResult
 Just result -> return (result:remainingResult)
 else return []
executeNextInstruction :: State ProcessState (Maybe Int)
executeNextInstruction = do
 maybeInstruction <- intCodeInstruction
 case maybeInstruction of
 Nothing -> do
 abortProcess
 return Nothing
 Just instruction -> executeInstruction instruction
executeInstruction :: IntCodeInstruction -> State ProcessState (Maybe Int)
executeInstruction instruction = do
 arguments <- instructionArguments instruction
 let operation = associatedOperation (opcode instruction)
 in operation arguments

As a convenience function, I add a function that can be used to initialize a process state and run the computation, throwing away the final state.

executeCode :: IntCode -> [Int] -> [Int]
executeCode code initialInputs =
 let initialState = initializeProcess code initialInputs
 in evalState continueExecution initialState

For day 9, this function is sufficient. However, in problems like part 2 of day 7, continueExecution needs to be used direcly in order to wire up a stateful computation with resuming after blocking temporarily.

asked Feb 25, 2020 at 12:19
\$\endgroup\$

1 Answer 1

4
\$\begingroup\$

Disclaimer: It has been a while since I've written Haskell in production. Also, while I usually like to review all the code, I have to admit that there is too much for me in this case. Instead, I'll try to focus on what I've seen from a short glance, raise concerns and show alternatives where applicable.

But first, let's give praise.

Types are everywhere

This is great. You've put a type signature on every top-level value, and never one on an intermediate binding. This enables me to reason about your code even without a compiler that would usually help me with the review.

You also introduced proper newtypes instead of HashMap Int Int or other non-semantic types. Overall, well-done.

Stateless vs stateful functions

Next, we come into a territory that's probably subject to personal opinion: should one write functions in terms of the State monad, or without?

Let's have a look at our first Stateful function:

writeToMemory :: Int -> Int -> State Memory ()
writeToMemory pointer 0 = modify $ \(Memory m) -> Memory $ Map.delete pointer m
writeToMemory pointer value = modify $ \(Memory m) -> Memory $ Map.alter (\_ -> Just value) pointer m

While seemingly innocent, those functions provide some problems later on. For example in writeToProcessMemory we have to conjure a new State:

writeToProcessMemory :: Int -> Int -> State ProcessState ()
writeToProcessMemory pointer value = 
 modify $ \s@ProcessState{memory = m} -> 
 s {memory = execState (writeToMemory pointer value) m} -- <- execState

Our stateful functions force us to create and exec a new State just to apply a function. That's cumbersome.

What happened if we used another writeToMemory?

writeToMemory :: Int -> Int -> Memory -> Memory
writeToMemory pointer 0 (Memory m) = Memory $ Map.delete pointer m
writeToMemory pointer value (Memory m) = Memory $ Map.insert pointer value m
writeToProcessMemory :: Int -> Int -> State ProcessState ()
writeToProcessMemory pointer value = 
 modify $ \s@ProcessState{memory = m} -> 
 s {memory = writeToMemory pointer value m}

No more execState, and we don't have to wrap around State to understand this function.

Some more examples for simpler building blocks

Another example where the Stateful function is much more verbose than a non-Stateful one is hasShutDown:

hasShutDown :: State ProcessState Bool
hasShutDown = do
 currentStatus <- processStatus
 case currentStatus of
 Terminated -> return True -- alignment added by me
 Error -> return True
 _ -> return False

Those seven lines need to get carefully processed by a reviewer. However, we can simply split it into two functions:

hasShutDown' :: ProcessState -> Bool
hasShutDown' Terminated = True
hasShutDown' Error = True
hasShutDown' _ = False
hasShutDown :: State ProcessState Bool
hasShutDown = hasShutDown' <$> processStatus

Even with the additional empty line and types, the overall length stayed at 7 lines. However, it is now a lot easier to examine hasShutDown' in my point of view.

My general rule of thumb

If you have a function with type a -> b for some a and b, keep it out of the State monad as much as possible to make it easier for reuse. If you have a function with type a -> (b, a), then it's a lot easier to handle with State.

However, that is my general rule of thumb. Your style might vary.

Consider lenses

Let's get back to our new writeToProcessMemory:

writeToProcessMemory :: Int -> Int -> State ProcessState ()
writeToProcessMemory pointer value = 
 modify $ \s@ProcessState{memory = m} -> 
 s {memory = writeToMemory pointer value m}

While I usually dislike lenses when overused, as they introduce a new dependency (use microlens-platform instead of lens if you don't need prisms, isomorphisms or similar), they can make this function even shorter:

writeToProcessMemory :: Int -> Int -> State ProcessState ()
writeToProcessMemory pointer value = memory %= writeToMemory pointer value

However, this needs memory to be a lens; it's a design decision whether to use them, but they can simplify code a lot.

answered Jun 27, 2020 at 11:05
\$\endgroup\$
2
  • \$\begingroup\$ Thank you very much for your review, especially on the guideline regarding the use, or rather avoidance, of the State monad for building blocks. \$\endgroup\$ Commented Jun 30, 2020 at 20:03
  • \$\begingroup\$ @M.Doerner You're welcome. Sorry that it took so long, given that you've asked this question in February. \$\endgroup\$ Commented Jul 1, 2020 at 10:35

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.