I am very new to Haskell and come from languages like C++, although I do have some experience with Scheme. Here, I wrote a simple brainf*** interpreter, which so far is my largest project. I followed a tutorial for some of the code (such as the Tape
data type), but did most of this on my own. I have one major concern with my approach, which is the way I handle the loops (which in brainf*** are written with []
and repeat until the current cell in the tape is zero. In my implementation, I used the loop as a parameter to the main recursive function, which is called if the current instruction is ]
and the current cell is not 0
. Since these loops modify the tape, I had to make them return an instance of Tape
that would replace the current one. However, since the function also handles IO (and therefore must return an instance of IO
), I made it return an IO (Tape Int)
which is then unpacked in a do
block. This all felt very messy and hacky, so I would really appreciate any help from more experienced Haskell developers.
import Data.Maybe
--The Tape data type and functions
data Tape a = Tape [a] a [a]
newTape :: a -> Tape a
newTape x = Tape r x r
where r = repeat x
moveLeft :: Tape a -> Tape a
moveLeft (Tape (l:ls) x rs) = Tape ls l (x:rs)
moveRight :: Tape a -> Tape a
moveRight (Tape ls x (r:rs)) = Tape (x:ls) r rs
--The Brainf*** instruction data types
data BfInstruction
= MovLeft
| MovRight
| Increment
| Decrement
| Output
| Input
| BeginLoop
| EndLoop
deriving (Show, Eq)
type BfProgram = [BfInstruction]
--Convert string to BfProgram
parseBf :: String -> BfProgram
parseBf = mapMaybe parse
where
parse :: Char -> Maybe BfInstruction
parse x = case x of
'<' -> Just MovLeft
'>' -> Just MovRight
'+' -> Just Increment
'-' -> Just Decrement
',' -> Just Input
'.' -> Just Output
'[' -> Just BeginLoop
']' -> Just EndLoop
x -> Nothing --anything but the above chars is a comment
--Main running function
runBf :: String -> IO ()
runBf p = runBf' (parseBf p) (newTape 0) [] >> return ()
where
runBf' :: BfProgram -> Tape Int -> BfProgram -> IO (Tape Int)
runBf' [] tape _ = return tape
runBf' prog@(p:ps) tape@(Tape ls x rs) loop = case p of
MovLeft -> advance prog (moveLeft tape)
MovRight -> advance prog (moveRight tape)
Increment -> advance prog (Tape ls (x+1) rs)
Decrement -> advance prog (Tape ls (x-1) rs)
Input -> do
char <- getChar
advance prog (Tape ls (fromEnum char) rs)
Output -> putChar (toEnum x) >> advance prog tape
BeginLoop ->
let lp = getLoop 1 ps
in runBf' (length lp `drop` ps) tape lp --Drop so that we are at the ] now
EndLoop ->
if x /= 0
then do
lt <- runBf' loop tape []
runBf' prog lt loop --Copy the tape from the result of the loop into next iteration
else advance prog tape
advance :: BfProgram -> Tape Int -> IO (Tape Int)
advance (p:ps) tape = runBf' ps tape []
getLoop :: Int -> BfProgram -> BfProgram
getLoop _ [] = error "Mismatched brackets in BF program"
getLoop 1 (EndLoop:ps) = []
getLoop n (p:ps) = p:case p of
BeginLoop -> getLoop (n + 1) ps
EndLoop -> getLoop (n - 1) ps
_ -> getLoop n ps
--Simple IO
main = do
program <- readFile "program.bf"
runBf program
-
\$\begingroup\$ Welcome to Code Review. That almost looks like literate Haskell, did you mean to split your code into several snippets? \$\endgroup\$Zeta– Zeta2018年08月15日 19:29:39 +00:00Commented Aug 15, 2018 at 19:29
-
\$\begingroup\$ I did? I separated my data type definitions and functions. What should I have done? \$\endgroup\$mrFoobles– mrFoobles2018年08月15日 19:40:24 +00:00Commented Aug 15, 2018 at 19:40
-
\$\begingroup\$ You can keep it like this, but it's usually better if code is kept copyable (e.g. one file = one code block) so that interested reviewers can test your program (it's not a required, though). Literate Haskell looks very similar to your presentation style, so you might be interested in it. \$\endgroup\$Zeta– Zeta2018年08月15日 19:42:53 +00:00Commented Aug 15, 2018 at 19:42
-
\$\begingroup\$ Ok thank you. I looked around and saw some other people doing it, so I split mine up too. Just adjusted it. \$\endgroup\$mrFoobles– mrFoobles2018年08月15日 19:44:41 +00:00Commented Aug 15, 2018 at 19:44
-
\$\begingroup\$ It's not required per-se to keep the code together, you can keep the code split for presentation purposes, sorry if my comment made it sound like a requirement. Either way, I hope you get some nice reviews. \$\endgroup\$Zeta– Zeta2018年08月15日 19:46:27 +00:00Commented Aug 15, 2018 at 19:46
1 Answer 1
Well done. Overall, there are no big flaws, just some minor issues.
Type annotations
While it's great that all top-level functions have proper type signatures, the local bindings inside those functions usually don't. After all, their types should get inferred, e.g.
example :: [Int] -> [Int]
example = map inc
where
inc x = 1 + x
doesn't need a type signature since x
's type is already fixed to Int
. It makes refactoring also a lot easier if we change the type later. If we started with
example :: [Int] -> [Int]
example = map inc
where
inc :: Int -> Int
inc x = 1 + x
and later want to generalize, we might forget the second type signature and end up with an error message:
example :: Num a => [a] -> [a]
example = map inc
where
inc :: Int -> Int -- whoops, GHC will yell about that
inc x = 1 + x
Therefore, type signatures for local functions are usually not written out. There are some instances where they're necessary, but that's usually with RankNTypes
or other extensions.
The tape
The tape works well, and is pretty much how you would expect it.
Infinite tapes and debugging
That being said, an infinite tape has the slight inconvenience that you can never inspect it for debugging purposes.
Also, if you ever create a module from your code, you must not export the Tape
data constructor, as it would enable Tape [] 0 []
and therefore break assertions.
A finite tape circumvents those issues, but needs slightly more effort in the move
ments.
Working on the current value
In runBf
we can find several spots where we advance
the program after we worked on the current value, e.g.:
Increment -> advance prog (Tape ls (x+1) rs)
Decrement -> advance prog (Tape ls (x-1) rs)
That's now a possible source of errors, since we could have used
Increment -> advance prog (Tape ls (x+1) ls)
Decrement -> advance prog (Tape ls (x-1) ls)
by accident. A small helper can prevent that issue:
onCurrent :: (a -> a) -> Tape a -> Tape a
onCurrent f (Tape ls x rs) = Tape ls (f x) rs
current :: Tape a -> a
current (Tape _ x _ ) = x
Then we end up with
MovLeft -> advance prog (moveLeft tape)
MovRight -> advance prog (moveRight tape)
Increment -> advance prog (onCurrent (+1) tape)
Decrement -> advance prog (onCurrent (subtract 1) tape)
Input -> do
char <- getChar
advance prog (onCurrent (const (fromEnum char)) tape)
Naming and scope
As neither advance
nor getLoop
use any of the bindings in their scope, they're candidates for top-level functions.
runBf'
can be called go
or another short name. Calling the inner worker just go
is really common and won't alienate other readers.
Make interfaces hard to use wrong
getLoop
uses an Int
as first argument that's not properly documented. Types only go so far as documentation, and we could accidentally use getLoop 0
in BeginLoop
.
Instead, we should make it impossible to misuse getLoop
:
getLoop :: BfProgram -> BfProgram
getLoop = go 1
where
go _ [] = error "Mismatched brackets in BF program"
go 1 (EndLoop:ps) = []
go n (p:ps) = p:case p of
BeginLoop -> go (n + 1) ps
EndLoop -> go (n - 1) ps
_ -> go n ps
Similarly, runBf
should probably take a BfProgram
, not an arbitrary String
, as this doesn't decrease the strength of your program, we can recreate the previous behaviour with
runBf . parseBf
However, speaking of parsing...
Loop validation
A drawback with our current BfProgram
is that we might end up with mismatched brackets, e.g.
parseBf "]["
parses fine and leads to a runtime error. However, we could easily detect that during parsing. Our parseBf
needs a way to report errors:
type ParserError = String
parseBf :: String -> Either ParserError BfProgram
parseBf = go
where
go [] = Right []
go (x:xs) = case x of
'<' -> MovLeft <$:> go xs
'>' -> MovRight <$:> go xs
'+' -> Increment <$:> go xs
'-' -> Decrement <$:> go xs
',' -> Input <$:> go xs
'.' -> Output <$:> go xs
'[' -> -- exercise ; use `getLoop`-like function
']' -> -- exercise ; easier if previous one done correctly.
x -> go xs
x <$:> xs = fmap (x:) xs
but afterwards, we can be sure that parseBf
only returns BfProgram
s with valid brackets.
Unfortunately, we still need to use getLoop
, as BeginLoop
and EndLoop
are still in our instruction set. If we change the instruction set, we can get rid of that too:
data BfInstruction
= MovLeft
| MovRight
| Increment
| Decrement
| Output
| Input
| Loop BfProgram
deriving (Show, Eq)
I go into more details in some of my previous Bf reviews, feel free to read them if you get stuck on Loop
.
Final remarks
Other than the usual re-evaluation of loops (which is a common scenario in Haskell Bf interpreters), your code was fine, so all the issues are really minor. Again: well done.
Explore related questions
See similar questions with these tags.