A simple brainf*** interpreter in Haskell

Question 1

I am very new to Haskell and come from languages like C++, although I do have some experience with Scheme. Here, I wrote a simple brainf*** interpreter, which so far is my largest project. I followed a tutorial for some of the code (such as the Tape data type), but did most of this on my own. I have one major concern with my approach, which is the way I handle the loops (which in brainf*** are written with [] and repeat until the current cell in the tape is zero. In my implementation, I used the loop as a parameter to the main recursive function, which is called if the current instruction is ] and the current cell is not 0. Since these loops modify the tape, I had to make them return an instance of Tape that would replace the current one. However, since the function also handles IO (and therefore must return an instance of IO), I made it return an IO (Tape Int) which is then unpacked in a do block. This all felt very messy and hacky, so I would really appreciate any help from more experienced Haskell developers.

import Data.Maybe
--The Tape data type and functions
data Tape a = Tape [a] a [a] 
newTape :: a -> Tape a 
newTape x = Tape r x r 
 where r = repeat x
moveLeft :: Tape a -> Tape a 
moveLeft (Tape (l:ls) x rs) = Tape ls l (x:rs) 
moveRight :: Tape a -> Tape a
moveRight (Tape ls x (r:rs)) = Tape (x:ls) r rs 
--The Brainf*** instruction data types
data BfInstruction 
 = MovLeft
 | MovRight
 | Increment
 | Decrement
 | Output
 | Input 
 | BeginLoop
 | EndLoop 
 deriving (Show, Eq) 
type BfProgram = [BfInstruction]
--Convert string to BfProgram
parseBf :: String -> BfProgram
parseBf = mapMaybe parse
 where 
 parse :: Char -> Maybe BfInstruction
 parse x = case x of
 '<' -> Just MovLeft
 '>' -> Just MovRight 
 '+' -> Just Increment
 '-' -> Just Decrement
 ',' -> Just Input
 '.' -> Just Output
 '[' -> Just BeginLoop
 ']' -> Just EndLoop
 x -> Nothing --anything but the above chars is a comment
--Main running function
runBf :: String -> IO () 
runBf p = runBf' (parseBf p) (newTape 0) [] >> return ()
 where 
 runBf' :: BfProgram -> Tape Int -> BfProgram -> IO (Tape Int) 
 runBf' [] tape _ = return tape 
 runBf' prog@(p:ps) tape@(Tape ls x rs) loop = case p of
 MovLeft -> advance prog (moveLeft tape)
 MovRight -> advance prog (moveRight tape)
 Increment -> advance prog (Tape ls (x+1) rs)
 Decrement -> advance prog (Tape ls (x-1) rs)
 Input -> do 
 char <- getChar 
 advance prog (Tape ls (fromEnum char) rs) 
 Output -> putChar (toEnum x) >> advance prog tape 
 BeginLoop -> 
 let lp = getLoop 1 ps
 in runBf' (length lp `drop` ps) tape lp --Drop so that we are at the ] now
 EndLoop -> 
 if x /= 0 
 then do 
 lt <- runBf' loop tape [] 
 runBf' prog lt loop --Copy the tape from the result of the loop into next iteration
 else advance prog tape 
 advance :: BfProgram -> Tape Int -> IO (Tape Int) 
 advance (p:ps) tape = runBf' ps tape [] 
 getLoop :: Int -> BfProgram -> BfProgram
 getLoop _ [] = error "Mismatched brackets in BF program"
 getLoop 1 (EndLoop:ps) = [] 
 getLoop n (p:ps) = p:case p of 
 BeginLoop -> getLoop (n + 1) ps 
 EndLoop -> getLoop (n - 1) ps 
 _ -> getLoop n ps
--Simple IO
main = do 
 program <- readFile "program.bf"
 runBf program

Question 2

Welcome to Code Review. That almost looks like literate Haskell, did you mean to split your code into several snippets?

Question 3

I did? I separated my data type definitions and functions. What should I have done?

Question 4

You can keep it like this, but it's usually better if code is kept copyable (e.g. one file = one code block) so that interested reviewers can test your program (it's not a required, though). Literate Haskell looks very similar to your presentation style, so you might be interested in it.

Question 5

Ok thank you. I looked around and saw some other people doing it, so I split mine up too. Just adjusted it.

Question 6

It's not required per-se to keep the code together, you can keep the code split for presentation purposes, sorry if my comment made it sound like a requirement. Either way, I hope you get some nice reviews.

Question 7

Well done. Overall, there are no big flaws, just some minor issues.

Type annotations

While it's great that all top-level functions have proper type signatures, the local bindings inside those functions usually don't. After all, their types should get inferred, e.g.

example :: [Int] -> [Int]
example = map inc
 where
 inc x = 1 + x

doesn't need a type signature since x's type is already fixed to Int. It makes refactoring also a lot easier if we change the type later. If we started with

example :: [Int] -> [Int]
example = map inc
 where
 inc :: Int -> Int
 inc x = 1 + x

and later want to generalize, we might forget the second type signature and end up with an error message:

example :: Num a => [a] -> [a]
example = map inc
 where
 inc :: Int -> Int -- whoops, GHC will yell about that
 inc x = 1 + x

Therefore, type signatures for local functions are usually not written out. There are some instances where they're necessary, but that's usually with RankNTypes or other extensions.

The tape

The tape works well, and is pretty much how you would expect it.

Infinite tapes and debugging

That being said, an infinite tape has the slight inconvenience that you can never inspect it for debugging purposes.

Also, if you ever create a module from your code, you must not export the Tape data constructor, as it would enable Tape [] 0 [] and therefore break assertions.

A finite tape circumvents those issues, but needs slightly more effort in the movements.

Working on the current value

In runBf we can find several spots where we advance the program after we worked on the current value, e.g.:

 Increment -> advance prog (Tape ls (x+1) rs)
 Decrement -> advance prog (Tape ls (x-1) rs)

That's now a possible source of errors, since we could have used

 Increment -> advance prog (Tape ls (x+1) ls)
 Decrement -> advance prog (Tape ls (x-1) ls)

by accident. A small helper can prevent that issue:

onCurrent :: (a -> a) -> Tape a -> Tape a
onCurrent f (Tape ls x rs) = Tape ls (f x) rs
current :: Tape a -> a
current (Tape _ x _ ) = x

Then we end up with

 MovLeft -> advance prog (moveLeft tape)
 MovRight -> advance prog (moveRight tape)
 Increment -> advance prog (onCurrent (+1) tape)
 Decrement -> advance prog (onCurrent (subtract 1) tape)
 Input -> do 
 char <- getChar 
 advance prog (onCurrent (const (fromEnum char)) tape)

Naming and scope

As neither advance nor getLoop use any of the bindings in their scope, they're candidates for top-level functions.

runBf' can be called go or another short name. Calling the inner worker just go is really common and won't alienate other readers.

Make interfaces hard to use wrong

getLoop uses an Int as first argument that's not properly documented. Types only go so far as documentation, and we could accidentally use getLoop 0 in BeginLoop.

Instead, we should make it impossible to misuse getLoop:

getLoop :: BfProgram -> BfProgram
getLoop = go 1
 where
 go _ [] = error "Mismatched brackets in BF program"
 go 1 (EndLoop:ps) = [] 
 go n (p:ps) = p:case p of 
 BeginLoop -> go (n + 1) ps 
 EndLoop -> go (n - 1) ps 
 _ -> go n ps

Similarly, runBf should probably take a BfProgram, not an arbitrary String, as this doesn't decrease the strength of your program, we can recreate the previous behaviour with

runBf . parseBf

However, speaking of parsing...

Loop validation

A drawback with our current BfProgram is that we might end up with mismatched brackets, e.g.

parseBf "]["

parses fine and leads to a runtime error. However, we could easily detect that during parsing. Our parseBf needs a way to report errors:

type ParserError = String
parseBf :: String -> Either ParserError BfProgram
parseBf = go
 where 
 go [] = Right []
 go (x:xs) = case x of
 '<' -> MovLeft <$:> go xs
 '>' -> MovRight <$:> go xs
 '+' -> Increment <$:> go xs
 '-' -> Decrement <$:> go xs
 ',' -> Input <$:> go xs
 '.' -> Output <$:> go xs
 '[' -> -- exercise ; use `getLoop`-like function
 ']' -> -- exercise ; easier if previous one done correctly.
 x -> go xs
 x <$:> xs = fmap (x:) xs

but afterwards, we can be sure that parseBf only returns BfPrograms with valid brackets.

Unfortunately, we still need to use getLoop, as BeginLoop and EndLoop are still in our instruction set. If we change the instruction set, we can get rid of that too:

data BfInstruction 
 = MovLeft
 | MovRight
 | Increment
 | Decrement
 | Output
 | Input 
 | Loop BfProgram
 deriving (Show, Eq)

I go into more details in some of my previous Bf reviews, feel free to read them if you get stuck on Loop.

Final remarks

Other than the usual re-evaluation of loops (which is a common scenario in Haskell Bf interpreters), your code was fine, so all the issues are really minor. Again: well done.

Zeta Zeta 19.6k2 gold badges57 silver badges90 bronze badges · Accepted Answer · 2018-12-29 08:19:31Z

Well done. Overall, there are no big flaws, just some minor issues.

Type annotations

While it's great that all top-level functions have proper type signatures, the local bindings inside those functions usually don't. After all, their types should get inferred, e.g.

example :: [Int] -> [Int]
example = map inc
 where
 inc x = 1 + x

doesn't need a type signature since x's type is already fixed to Int. It makes refactoring also a lot easier if we change the type later. If we started with

example :: [Int] -> [Int]
example = map inc
 where
 inc :: Int -> Int
 inc x = 1 + x

and later want to generalize, we might forget the second type signature and end up with an error message:

example :: Num a => [a] -> [a]
example = map inc
 where
 inc :: Int -> Int -- whoops, GHC will yell about that
 inc x = 1 + x

Therefore, type signatures for local functions are usually not written out. There are some instances where they're necessary, but that's usually with RankNTypes or other extensions.

The tape

The tape works well, and is pretty much how you would expect it.

Infinite tapes and debugging

That being said, an infinite tape has the slight inconvenience that you can never inspect it for debugging purposes.

Also, if you ever create a module from your code, you must not export the Tape data constructor, as it would enable Tape [] 0 [] and therefore break assertions.

A finite tape circumvents those issues, but needs slightly more effort in the movements.

Working on the current value

In runBf we can find several spots where we advance the program after we worked on the current value, e.g.:

 Increment -> advance prog (Tape ls (x+1) rs)
 Decrement -> advance prog (Tape ls (x-1) rs)

That's now a possible source of errors, since we could have used

 Increment -> advance prog (Tape ls (x+1) ls)
 Decrement -> advance prog (Tape ls (x-1) ls)

by accident. A small helper can prevent that issue:

onCurrent :: (a -> a) -> Tape a -> Tape a
onCurrent f (Tape ls x rs) = Tape ls (f x) rs
current :: Tape a -> a
current (Tape _ x _ ) = x

Then we end up with

 MovLeft -> advance prog (moveLeft tape)
 MovRight -> advance prog (moveRight tape)
 Increment -> advance prog (onCurrent (+1) tape)
 Decrement -> advance prog (onCurrent (subtract 1) tape)
 Input -> do 
 char <- getChar 
 advance prog (onCurrent (const (fromEnum char)) tape)

Naming and scope

As neither advance nor getLoop use any of the bindings in their scope, they're candidates for top-level functions.

runBf' can be called go or another short name. Calling the inner worker just go is really common and won't alienate other readers.

Make interfaces hard to use wrong

getLoop uses an Int as first argument that's not properly documented. Types only go so far as documentation, and we could accidentally use getLoop 0 in BeginLoop.

Instead, we should make it impossible to misuse getLoop:

getLoop :: BfProgram -> BfProgram
getLoop = go 1
 where
 go _ [] = error "Mismatched brackets in BF program"
 go 1 (EndLoop:ps) = [] 
 go n (p:ps) = p:case p of 
 BeginLoop -> go (n + 1) ps 
 EndLoop -> go (n - 1) ps 
 _ -> go n ps

Similarly, runBf should probably take a BfProgram, not an arbitrary String, as this doesn't decrease the strength of your program, we can recreate the previous behaviour with

runBf . parseBf

However, speaking of parsing...

Loop validation

A drawback with our current BfProgram is that we might end up with mismatched brackets, e.g.

parseBf "]["

parses fine and leads to a runtime error. However, we could easily detect that during parsing. Our parseBf needs a way to report errors:

type ParserError = String
parseBf :: String -> Either ParserError BfProgram
parseBf = go
 where 
 go [] = Right []
 go (x:xs) = case x of
 '<' -> MovLeft <$:> go xs
 '>' -> MovRight <$:> go xs
 '+' -> Increment <$:> go xs
 '-' -> Decrement <$:> go xs
 ',' -> Input <$:> go xs
 '.' -> Output <$:> go xs
 '[' -> -- exercise ; use `getLoop`-like function
 ']' -> -- exercise ; easier if previous one done correctly.
 x -> go xs
 x <$:> xs = fmap (x:) xs

but afterwards, we can be sure that parseBf only returns BfPrograms with valid brackets.

Unfortunately, we still need to use getLoop, as BeginLoop and EndLoop are still in our instruction set. If we change the instruction set, we can get rid of that too:

data BfInstruction 
 = MovLeft
 | MovRight
 | Increment
 | Decrement
 | Output
 | Input 
 | Loop BfProgram
 deriving (Show, Eq)

I go into more details in some of my previous Bf reviews, feel free to read them if you get stuck on Loop.

Final remarks

Other than the usual re-evaluation of loops (which is a common scenario in Haskell Bf interpreters), your code was fine, so all the issues are really minor. Again: well done.

Stack Exchange Network

A simple brainf*** interpreter in Haskell

1 Answer 1

Type annotations

The tape

Infinite tapes and debugging

Working on the current value

Naming and scope

Make interfaces hard to use wrong

Loop validation

Final remarks

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Linked

Hot Network Questions

A simple brainf*** interpreter in Haskell

1 Answer 1

Type annotations

The tape

Infinite tapes and debugging

Working on the current value

Naming and scope

Make interfaces hard to use wrong

Loop validation

Final remarks

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Linked

Related

Hot Network Questions