I am new to Haskell and currently trying to port my solutions for the 2019 installment of the coding challenge AdventOfCode to Haskell. So, I would very much appreciate any suggestions how to make the code more readable and more idiomatic. In particular, I am interested in whether my use of the State
monad is sensible here.
This post is about the combined result from day 2, day 5, and day 9, an interpreter for a simple assembly language called IntCode. If you have not solved these problems and still intend to do so, stop reading immediately.
I have kept the entire solution for each part of each day in a single module with a single exported function that prints the solution. Without this selfimposed restriction, I would have split this code into different modules. For day 9 part 1 it starts as follows.
module AdventOfCode20191209_1
(
systemCheck
) where
import System.IO
import Data.List.Split
import Data.List
import Data.Maybe
import qualified Data.HashMap.Strict as Map
import Control.Monad.State
import Data.Digits
systemCheck :: IO ()
systemCheck = do
inputText <- readFile "Advent20191209_1_input.txt"
let code = toIntCode inputText
let result = executeCode code [1]
print result
Now, let me explain the setting, i.e. the IntCode language, and how I have implemented the different features.
As the name might suggest, the input is a stream/list of integer values. In the problems, it is provided as a comma separated string, which I read as follows.
type IntCode = [Int]
toIntCode :: String -> IntCode
toIntCode = map read . splitOn ","
Process State
This code also functions as the bases for the process memory, which is essentially infinite and initialized by the code itself inside its index range and by zero outside.
newtype Memory = Memory (Map.HashMap Int Int) deriving (Eq, Show)
readMemory :: Int -> State Memory Int
readMemory pointer = gets $ \(Memory m) -> Map.lookupDefault 0 pointer m
writeToMemory :: Int -> Int -> State Memory ()
writeToMemory pointer 0 = modify $ \(Memory m) -> Memory $ Map.delete pointer m
writeToMemory pointer value = modify $ \(Memory m) -> Memory $ Map.alter (\_ -> Just value) pointer m
fromIntCode :: IntCode -> Memory
fromIntCode = Memory . Map.fromList . indexedIntCode
indexedIntCode :: IntCode -> [(Int, Int)]
indexedIntCode = zip [0..]
In addition to memory, there is an input stream used by the Get
operation,
newtype InputStream = InputStream [Int] deriving (Eq, Show)
addInput :: Int -> State InputStream ()
addInput input = modify $ \(InputStream xs) -> InputStream $ xs ++ [input]
addInputs :: [Int] -> State InputStream ()
addInputs inputs = modify $ \(InputStream xs) -> InputStream $ xs ++ inputs
popInput :: State InputStream (Maybe Int)
popInput = state $ \input@(InputStream xs) -> case xs of
[] -> (Nothing, input)
y:ys -> (Just y, InputStream ys)
There are two more stateful objects involved in the computation, an instruction pointer and a relative base pointer, both starting at zero. The instruction pointer is what the name suggests and the relative base pointer is used for one of the methods to read arguments described later.
I have combined these into a ProcessState
type.
data ExecutionStatus = Running | Blocked | Terminated | Error deriving (Eq, Show, Enum)
data ProcessState = ProcessState {
memory :: Memory,
inputs :: InputStream,
instructionPointer :: Int,
relativeBasePointer :: Int,
status :: ExecutionStatus
} deriving (Eq, Show)
processStatus :: State ProcessState ExecutionStatus
processStatus = gets $ status
hasShutDown :: State ProcessState Bool
hasShutDown = do
currentStatus <- processStatus
case currentStatus of
Terminated -> return True
Error -> return True
_ -> return False
isRunning :: State ProcessState Bool
isRunning = do
currentStatus <- processStatus
case currentStatus of
Running -> return True
_ -> return False
setProcessStatus :: ExecutionStatus -> State ProcessState ()
setProcessStatus processStatus = do
stopped <- hasShutDown
if stopped
then return ()
else modify $ \s -> s{status = processStatus}
terminateProcess :: State ProcessState ()
terminateProcess = setProcessStatus Terminated
abortProcess :: State ProcessState ()
abortProcess = setProcessStatus Error
setInstructionPointer :: Int -> State ProcessState ()
setInstructionPointer pointer = modify $ \s -> s {instructionPointer = pointer}
processInstructionPointer :: State ProcessState Int
processInstructionPointer = gets $ instructionPointer
incrementInstructionPointer :: Int -> State ProcessState ()
incrementInstructionPointer offset = do
instructionPointer <- processInstructionPointer
setInstructionPointer (instructionPointer + offset)
setRelativeBasePointer :: Int -> State ProcessState ()
setRelativeBasePointer pointer = modify $ \s -> s {relativeBasePointer = pointer}
processRelativeBasePointer :: State ProcessState Int
processRelativeBasePointer = gets $ relativeBasePointer
incrementRelativeBasePointer :: Int -> State ProcessState ()
incrementRelativeBasePointer offset = do
relativeBasePointer <- processRelativeBasePointer
setRelativeBasePointer (relativeBasePointer + offset)
readProcessMemory :: Int -> State ProcessState Int
readProcessMemory pointer = gets $ \ProcessState{memory = m} -> evalState (readMemory pointer) m
writeToProcessMemory :: Int -> Int -> State ProcessState ()
writeToProcessMemory pointer value = modify $ \s@ProcessState{memory = m} -> s {memory = execState (writeToMemory pointer value) m}
addProcessInput :: Int -> State ProcessState ()
addProcessInput additionalInput = modify $ \s@ProcessState{inputs = inputStream} -> s {inputs = execState (addInput additionalInput) inputStream}
addProcessInputs :: [Int] -> State ProcessState ()
addProcessInputs additionalInputs = modify $ \s@ProcessState{inputs = inputStream} -> s {inputs = execState (addInputs additionalInputs) inputStream}
popProcessInput :: State ProcessState (Maybe Int)
popProcessInput = state $ \s@ProcessState{inputs = inputStream} ->
let (input, newInputs) = runState popInput inputStream
in (input, s {inputs = newInputs})
initializeProcess :: IntCode -> [Int] -> ProcessState
initializeProcess code initialInputs = ProcessState {
memory = fromIntCode code,
inputs = InputStream initialInputs,
instructionPointer = 0,
relativeBasePointer = 0,
status = Running}
Most of the associated methods are basically plumbing between the stateful operations on the process state and the underlying stateful operations.
I have added an execution status that is used to indicate that the process has terminated, via hasShutDown
. Moreover, it will be used to halt execution in case the process blocks due to missing input. This is necessary for applications as in part 2 of day 7 or on day 23.
Instructions
Each step of a computation starts by reading the value in memory the instruction pointer points to. This encodes two kinds of information: the opcode for the operation and the modes used to read the arguments from memory.
data IntCodeInstruction = IntCodeInstruction {
opcode :: OpCode,
argumentSpecifications :: [ArgumentSpecification]
} deriving (Eq, Show)
data ArgumentSpecification = ArgumentSpecification {
argumentMode :: ArgumentMode,
operationMode :: OperationMode
} deriving (Eq, Show)
Reading Arguments
There are three basic modes how to read data.
toArgumentMode :: Int -> Maybe ArgumentMode
toArgumentMode 0 = Just Pointer
toArgumentMode 1 = Just Value
toArgumentMode 2 = Just Relative
toArgumentMode _ = Nothing
In general, the information relating to the nth argument is at an offset of n from the instruction pointer. In value mode, this is the argument, in pointer mode, it is the value of the pointer to the data and in relative mode, it is the offset from the relative base pointer where to look for the data. Unfortunately, this is not all. If the argument specifies a location to write to in memory, it the argument is the pointer instead of the data at the location the pointer points to. To encode this, I introduced an operation mode, which depends on the opcode.
data OperationMode = Read | Write deriving (Eq, Show)
Putting this together, I read arguments as follows.
instructionArguments :: IntCodeInstruction -> State ProcessState Arguments
instructionArguments instruction = do
basePointer <- processInstructionPointer
let enumeratedArgumentSpecifications = zip [1..] (argumentSpecifications instruction)
in mapM (instructionArgument basePointer) enumeratedArgumentSpecifications
instructionArgument :: Int -> (Int, ArgumentSpecification) -> State ProcessState Int
instructionArgument basePointer (offset, argumentSpec) =
let evaluationPointer = basePointer + offset
in case argumentMode argumentSpec of
Value -> readProcessMemory evaluationPointer
Pointer -> case operationMode argumentSpec of
Write -> readProcessMemory evaluationPointer
Read -> do
transitiveEvaluationPointer <- readProcessMemory evaluationPointer
readProcessMemory transitiveEvaluationPointer
Relative -> do
relativeBase <- processRelativeBasePointer
baseIncrement <- readProcessMemory evaluationPointer
let targetPointer = relativeBase + baseIncrement
in case operationMode argumentSpec of
Write -> return targetPointer
Read -> readProcessMemory targetPointer
Decoding Instructions
The instruction itself is encoded as follows. The last two digits of the value in memory the instruction pointer points to represent the opcode. the higher digits represent the argument modes to be used, in inverted order when read, i.e. the lowest digit belongs to the first argument. Missing argument modes default to pointer mode.
intCodeInstruction :: State ProcessState (Maybe IntCodeInstruction)
intCodeInstruction = do
instructionPointer <- processInstructionPointer
instructionValue <- readProcessMemory instructionPointer
return (do -- Maybe
opcode <- toOpCode (instructionValue `mod` 100)
argumentSpecs <- toArgumentSpecifications opcode (instructionValue `div` 100)
return (IntCodeInstruction opcode argumentSpecs))
toArgumentSpecifications :: OpCode -> Int -> Maybe [ArgumentSpecification]
toArgumentSpecifications opcode argumentSpecifier =
do -- Maybe
maybeSpecifiedArgumentModes <- argumentModesFromSpecifier argumentSpecifier
specifiedArgumentModes <- sequence maybeSpecifiedArgumentModes
let
operationModes = associatedOperationModes opcode
numberOfMissingElements = length operationModes - length specifiedArgumentModes
in if numberOfMissingElements < 0
then Nothing
else let paddedArgumentsModes = specifiedArgumentModes ++ replicate numberOfMissingElements Pointer
in return (zipWith ArgumentSpecification paddedArgumentsModes operationModes)
argumentModesFromSpecifier :: Int -> Maybe [Maybe ArgumentMode]
argumentModesFromSpecifier 0 = Just []
argumentModesFromSpecifier x
| x < 0 = Nothing
| otherwise = Just (map toArgumentMode (reverse (digits 10 x)))
OpCodes
Now, let me come to the opcodes; there are 10 of them.
data OpCode = Add | Multiply | Get | Put | JumpIfTrue | JumpIfFalse | LessThan | Equals | IncrementRelativeBase | Stop deriving (Eq, Show, Enum)
toOpCode :: Int -> Maybe OpCode
toOpCode 1 = Just Add
toOpCode 2 = Just Multiply
toOpCode 3 = Just Get
toOpCode 4 = Just Put
toOpCode 5 = Just JumpIfTrue
toOpCode 6 = Just JumpIfFalse
toOpCode 7 = Just LessThan
toOpCode 8 = Just Equals
toOpCode 9 = Just IncrementRelativeBase
toOpCode 99 = Just Stop
toOpCode _ = Nothing
Each opcode is accociated with one operation and hence also with one set of requires arguments.
--operation modes in order of arguments
associatedOperationModes :: OpCode -> [OperationMode]
associatedOperationModes Add = [Read, Read, Write]
associatedOperationModes Multiply = [Read, Read, Write]
associatedOperationModes Get = [Write]
associatedOperationModes Put = [Read]
associatedOperationModes JumpIfTrue = [Read, Read]
associatedOperationModes JumpIfFalse = [Read, Read]
associatedOperationModes LessThan = [Read, Read, Write]
associatedOperationModes Equals = [Read, Read, Write]
associatedOperationModes IncrementRelativeBase = [Read]
associatedOperationModes Stop = []
type Arguments = [Int]
associatedOperation :: OpCode -> (Arguments -> State ProcessState (Maybe Int))
associatedOperation Add = handleTerminationAndRun . add
associatedOperation Multiply = handleTerminationAndRun . multiply
associatedOperation Get = handleTerminationAndRun . getOperation
associatedOperation Put = handleTerminationAndRun . putOperation
associatedOperation JumpIfTrue = handleTerminationAndRun . jumpIfTrue
associatedOperation JumpIfFalse = handleTerminationAndRun . jumpIfFalse
associatedOperation LessThan = handleTerminationAndRun . lessThan
associatedOperation Equals = handleTerminationAndRun . equals
associatedOperation IncrementRelativeBase = handleTerminationAndRun . incrementRelativeBase
associatedOperation Stop = handleTerminationAndRun . stop
Code Execution
Code execution generally works by reading input instructions, executing the associated operations and then advancing the instruction pointer to after the last argument, unless the operation alters the instruction pointer itself.
Operations
All operations share the same signature, Arguments -> State ProcessState (Maybe Int)
, in order to easily associate them with opcodes. The general operations assume that the process has not terminated. To guard against that, there is a special method used in the accociation with the opcodes.
handleTerminationAndRun :: State ProcessState (Maybe Int) -> State ProcessState (Maybe Int)
handleTerminationAndRun state = do
stopped <- hasShutDown
if stopped
then return Nothing
else state
We always return Nothing
and do not alter the process state if the process has already shut down.
The individual operations work as follows.
Arithmetic operations apply the associated operator to their first two arguments and then write to the third.
add :: Arguments -> State ProcessState (Maybe Int)
add = applyBinaryOperationAndWrite (+)
multiply :: Arguments -> State ProcessState (Maybe Int)
multiply = applyBinaryOperationAndWrite (*)
applyBinaryOperationAndWrite :: (Int -> Int -> Int) -> (Arguments -> State ProcessState (Maybe Int))
applyBinaryOperationAndWrite binaryOp arguments = do
let
targetPointer = arguments!!2
value = binaryOp (head arguments) (arguments!!1)
in writeToProcessMemory targetPointer value
incrementInstructionPointer 4
setProcessStatus Running
return Nothing
Binary comparisons are similar. However, they encode the return value by 1 for True and 0 for False.
lessThan :: Arguments -> State ProcessState (Maybe Int)
lessThan = applyBinaryComparisonAndWrite (<)
equals :: Arguments -> State ProcessState (Maybe Int)
equals = applyBinaryComparisonAndWrite (==)
applyBinaryComparisonAndWrite :: (Int -> Int -> Bool) -> (Arguments -> State ProcessState (Maybe Int))
applyBinaryComparisonAndWrite binaryComp arguments = do
let
targetPointer = arguments!!2
value = if (head arguments) `binaryComp` (arguments!!1)
then 1
else 0
in writeToProcessMemory targetPointer value
incrementInstructionPointer 4
setProcessStatus Running
return Nothing
The two jump instructions set the instruction pointer to their second argument if the first arguments represents the corresponding truthyness. (again with 0 == False, /= 0 == True)
jumpIfTrue :: Arguments -> State ProcessState (Maybe Int)
jumpIfTrue = jumpIf (/= 0)
jumpIfFalse :: Arguments -> State ProcessState (Maybe Int)
jumpIfFalse = jumpIf (== 0)
jumpIf :: (Int -> Bool) -> (Arguments -> State ProcessState (Maybe Int))
jumpIf test arguments = do
if test (head arguments)
then setInstructionPointer (arguments!!1)
else incrementInstructionPointer 3
setProcessStatus Running
return Nothing
In addition to these operations to modify the instruction pointer, there is an operation that increments the relative base pointer by its first argument.
incrementRelativeBase :: Arguments -> State ProcessState (Maybe Int)
incrementRelativeBase arguments = do
incrementRelativeBasePointer $ head arguments
incrementInstructionPointer 2
setProcessStatus Running
return Nothing
To read from the input stream, there is the Get
operation, which i called getOperation
because of the name clash with get
from State
. It reads the first value from the input stream and writes it to its only argument. If the nput stream is empty, it blocks.
getOperation :: Arguments -> State ProcessState (Maybe Int)
getOperation arguments = do
maybeInput <- popProcessInput
case maybeInput of
Nothing -> do
setProcessStatus Blocked
return Nothing
Just input -> do
let
targetPointer = head arguments
in writeToProcessMemory targetPointer input
incrementInstructionPointer 2
setProcessStatus Running
return Nothing
The only operation providing output from the process is the Put
operation, which I again named putOperation
because of the clash with put
from State
. It simply outputs its first argument.
putOperation arguments = do
incrementInstructionPointer 2
setProcessStatus Running
let newOutputValue = head arguments
in return $ Just newOutputValue
Finally, there is the stop
operation to terminte the process.
stop :: Arguments -> State ProcessState (Maybe Int)
stop _ = do
terminateProcess
return Nothing
Code Execution Coordination
Now that all the pieces of the computation are specified, it only needs to be wired up. I do this using the following stateful computations.
continueExecution :: State ProcessState [Int]
continueExecution = do
maybeResult <- executeNextInstruction
running <- isRunning
if running
then do
remainingResult <- continueExecution
case maybeResult of
Nothing -> return remainingResult
Just result -> return (result:remainingResult)
else return []
executeNextInstruction :: State ProcessState (Maybe Int)
executeNextInstruction = do
maybeInstruction <- intCodeInstruction
case maybeInstruction of
Nothing -> do
abortProcess
return Nothing
Just instruction -> executeInstruction instruction
executeInstruction :: IntCodeInstruction -> State ProcessState (Maybe Int)
executeInstruction instruction = do
arguments <- instructionArguments instruction
let operation = associatedOperation (opcode instruction)
in operation arguments
As a convenience function, I add a function that can be used to initialize a process state and run the computation, throwing away the final state.
executeCode :: IntCode -> [Int] -> [Int]
executeCode code initialInputs =
let initialState = initializeProcess code initialInputs
in evalState continueExecution initialState
For day 9, this function is sufficient. However, in problems like part 2 of day 7, continueExecution
needs to be used direcly in order to wire up a stateful computation with resuming after blocking temporarily.
1 Answer 1
Disclaimer: It has been a while since I've written Haskell in production. Also, while I usually like to review all the code, I have to admit that there is too much for me in this case. Instead, I'll try to focus on what I've seen from a short glance, raise concerns and show alternatives where applicable.
But first, let's give praise.
Types are everywhere
This is great. You've put a type signature on every top-level value, and never one on an intermediate binding. This enables me to reason about your code even without a compiler that would usually help me with the review.
You also introduced proper newtype
s instead of HashMap Int Int
or other non-semantic types. Overall, well-done.
Stateless vs stateful functions
Next, we come into a territory that's probably subject to personal opinion: should one write functions in terms of the State
monad, or without?
Let's have a look at our first State
ful function:
writeToMemory :: Int -> Int -> State Memory ()
writeToMemory pointer 0 = modify $ \(Memory m) -> Memory $ Map.delete pointer m
writeToMemory pointer value = modify $ \(Memory m) -> Memory $ Map.alter (\_ -> Just value) pointer m
While seemingly innocent, those functions provide some problems later on. For example in writeToProcessMemory
we have to conjure a new State
:
writeToProcessMemory :: Int -> Int -> State ProcessState ()
writeToProcessMemory pointer value =
modify $ \s@ProcessState{memory = m} ->
s {memory = execState (writeToMemory pointer value) m} -- <- execState
Our stateful functions force us to create and exec
a new State
just to apply a function. That's cumbersome.
What happened if we used another writeToMemory
?
writeToMemory :: Int -> Int -> Memory -> Memory
writeToMemory pointer 0 (Memory m) = Memory $ Map.delete pointer m
writeToMemory pointer value (Memory m) = Memory $ Map.insert pointer value m
writeToProcessMemory :: Int -> Int -> State ProcessState ()
writeToProcessMemory pointer value =
modify $ \s@ProcessState{memory = m} ->
s {memory = writeToMemory pointer value m}
No more execState
, and we don't have to wrap around State
to understand this function.
Some more examples for simpler building blocks
Another example where the State
ful function is much more verbose than a non-State
ful one is hasShutDown
:
hasShutDown :: State ProcessState Bool
hasShutDown = do
currentStatus <- processStatus
case currentStatus of
Terminated -> return True -- alignment added by me
Error -> return True
_ -> return False
Those seven lines need to get carefully processed by a reviewer. However, we can simply split it into two functions:
hasShutDown' :: ProcessState -> Bool
hasShutDown' Terminated = True
hasShutDown' Error = True
hasShutDown' _ = False
hasShutDown :: State ProcessState Bool
hasShutDown = hasShutDown' <$> processStatus
Even with the additional empty line and types, the overall length stayed at 7 lines. However, it is now a lot easier to examine hasShutDown'
in my point of view.
My general rule of thumb
If you have a function with type a -> b
for some a
and b
, keep it out of the State
monad as much as possible to make it easier for reuse. If you have a function with type a -> (b, a)
, then it's a lot easier to handle with State
.
However, that is my general rule of thumb. Your style might vary.
Consider lenses
Let's get back to our new writeToProcessMemory
:
writeToProcessMemory :: Int -> Int -> State ProcessState ()
writeToProcessMemory pointer value =
modify $ \s@ProcessState{memory = m} ->
s {memory = writeToMemory pointer value m}
While I usually dislike lenses when overused, as they introduce a new dependency (use microlens-platform
instead of lens
if you don't need prisms, isomorphisms or similar), they can make this function even shorter:
writeToProcessMemory :: Int -> Int -> State ProcessState ()
writeToProcessMemory pointer value = memory %= writeToMemory pointer value
However, this needs memory
to be a lens; it's a design decision whether to use them, but they can simplify code a lot.
-
\$\begingroup\$ Thank you very much for your review, especially on the guideline regarding the use, or rather avoidance, of the
State
monad for building blocks. \$\endgroup\$M.Doerner– M.Doerner2020年06月30日 20:03:04 +00:00Commented Jun 30, 2020 at 20:03 -
\$\begingroup\$ @M.Doerner You're welcome. Sorry that it took so long, given that you've asked this question in February. \$\endgroup\$Zeta– Zeta2020年07月01日 10:35:39 +00:00Commented Jul 1, 2020 at 10:35