Brainf*** interpreter in Rust

Question 1

I'm learning Rust coming from an intermediate background in Python. I've completed the first 8 chapters of the book and I wanted a project that would solidify the concepts I learned, so I made a Brainf*** interpreter designed around the things I wanted to make sure I knew. This meant that I wanted it to use structs, enums, functions, collections, and the module system. Writing and debugging this program was a great learning experience and I'm hoping that with the help of the CRSE I could learn even more from it.

The program accepts some BF code as input as well as a string containing all the inputs the code will use. Each time the code takes input it uses a character from the provided input and then moves to the next one. It also supports some very light (and I mean minimal) error handling. This program was tested using this test and to test the input I used a simple program* that prints the user's input.

A primary thing I would like to know is the quality of my comments. I did not comment too much but I hope a lot of the code is explained well enough with variable and method names. However, I have always had trouble knowing when to comment so advice on that in the context of this code would be appreciated. I would also like to know how I could make the design fit better with the idiosyncrasies of Rust. Thanks in advance, here's the code:

main.rs

use std::{collections::HashMap,
io::stdin};
mod interpreter;
use interpreter::Interpreter;
fn main() {
 let mut code = String::new();
 println!("Brainf code:");
 stdin().read_line(&mut code).unwrap();
 let mut input = String::new();
 println!("Inputs for the program:");
 stdin().read_line(&mut input).unwrap();
 let mut program = Interpreter {
 instructions: Vec::new(),
 pointer: 0,
 code,
 loops: HashMap::new(),
 input: input.trim().to_string(),
 tape: [0; 3000]
 };
 program.build_instructions();
 program.run();
}

interpreter.rs

use std::collections::HashMap;
fn throw (error: String) {
 println!("{}", error);
 std::process::exit(0);
}
pub enum Instruction {
 Increment,
 Decrement,
 MoveLeft,
 MoveRight,
 StartLoop(u32), // u32 is the index of the end of the loop
 EndLoop(u32), // u32 is the index of the start of the loop
 Output,
 Input,
 Halt,
}
pub struct Interpreter {
 pub instructions: Vec<Instruction>,
 pub pointer: usize,
 pub code: String,
 pub loops: HashMap<u32, u32>,
 pub input: String,
 pub tape: [u8; 3000]
}
impl Interpreter {
 fn build_loop_map (&mut self) {
 /* Creates mapping of loops and their endpoints for easy jumping around the code at loops
 * endpoints and startpoints */
 let mut open_loops: Vec<u32> = Vec::new();
 for (i, c) in self.code.chars().enumerate() {
 let i: u32 = i as u32;
 if c == '[' {
 open_loops.push(i)
 } else if c == ']' {
 let start = match open_loops.pop() {
 Some(n) => n,
 None => {
 throw(format!("Loop Error: Closure of nonexistent loop\nChar: ']' Pos: {}", i));
 0
 }
 };
 self.loops.insert(start, i);
 self.loops.insert(i, start);
 }
 }
 if open_loops.len() > 0 {
 throw(format!("Loop Error: At least one unclosed loop\nChar: '[', Pos: {}", open_loops[0]));
 }
 }
 pub fn build_instructions (&mut self) { 
 /* Converts the code string into a vector of instructions */
 self.build_loop_map();
 for (i, c) in self.code.chars().enumerate() {
 let i: u32 = i as u32;
 if c == '+' {
 self.instructions.push(Instruction::Increment);
 } else if c == '-' {
 self.instructions.push(Instruction::Decrement);
 } else if c == '<' {
 self.instructions.push(Instruction::MoveLeft);
 } else if c == '>' {
 self.instructions.push(Instruction::MoveRight);
 } else if c == '[' {
 let end = self.loops.get(&i).unwrap();
 self.instructions.push(Instruction::StartLoop(*end));
 } else if c == ']' {
 let start = self.loops.get(&i).unwrap();
 self.instructions.push(Instruction::EndLoop(*start));
 } else if c == '.' {
 self.instructions.push(Instruction::Output);
 } else if c == ',' {
 self.instructions.push(Instruction::Input);
 }
 }
 self.instructions.push(Instruction::Halt);
 }
 pub fn run(&mut self) {
 /* Loops over the vec of instructions and executes the corresponding code */
 let tape_size: usize = self.tape.len() - 1; // Represents the last index of the tape rather than the len
 let mut input_pointer: usize = 0;
 let inputs: Vec<u8> = self.input.clone().into_bytes();
 let mut instruction_pointer: usize = 0;
 loop {
 let instruction = &self.instructions[instruction_pointer]; 
 let cell = &mut self.tape[self.pointer]; 
 match *instruction {
 Instruction::Increment => {
 *cell = cell.overflowing_add(1).0; 
 },
 Instruction::Decrement => {
 *cell = cell.overflowing_sub(1).0;
 },
 Instruction::MoveRight => {
 self.pointer = self.pointer.overflowing_sub(1).0;
 if self.pointer > tape_size {
 self.pointer = tape_size;
 }
 }
 Instruction::MoveLeft => {
 self.pointer += 1;
 if self.pointer > tape_size {
 self.pointer = 0;
 }
 },
 Instruction::StartLoop(n) => {
 if *cell == 0 {
 instruction_pointer = n as usize;
 } 
 },
 Instruction::EndLoop(n) => {
 if *cell != 0 {
 instruction_pointer = n as usize;
 }
 },
 Instruction::Output => {
 print!("{}", *cell as char);
 },
 Instruction::Input => {
 let c = match inputs.get(input_pointer) {
 Some(a) => *a,
 None => 0
 };
 *cell = c;
 input_pointer += 1;
 },
 Instruction::Halt => {
 break;
 }
 }
 instruction_pointer += 1;
 }
 }
}

*The code is simply >+[>,]<[<]>>[.>]

Question 2

Formatting

Your code deviates from the official Rust coding style in a few small aspects:

use std::{collections::HashMap,
io::stdin};

The line break should not be there:

use std::{collections::HashMap, io::stdin};

fn throw (error: String) {
 // --snip--
}

Functions should not be followed by a space.

pub struct Interpreter {
 // --snip
 pub tape: [u8; 3000]
}

Put a trailing comma for consistency.

`use`

use std::{collections::HashMap,
io::stdin};

In Rust, functions are conventionally accessed by bringing their parent modules into scope using use (see the relevant section of the book):

use std::collections::HashMap;
use std::io;

Error handling

fn throw (error: String) {
 println!("{}", error);
 std::process::exit(0);
}

Rust already has panic! for this.

let mut code = String::new();
println!("Brainf code:");
stdin().read_line(&mut code).unwrap();

Provide a friendly error message:

io::stdin()
 .read_line(&mut code)
 .expect("Failed to read code");

Interaction

let mut code = String::new();
println!("Brainf code:");
stdin().read_line(&mut code).unwrap();
let mut input = String::new();
println!("Inputs for the program:");
stdin().read_line(&mut input).unwrap();

Prefer to print interaction code on stderr by using eprintln! instead of println!. This allows users to conveniently redirect output to files.

This pattern can be extracted into a function:

use std::io;
fn ask_input(prompt: &str) -> io::Result<String> {
 eprintln!("{}", prompt);
 let mut input = String::new();
 io::stdin().read_line(&mut input)?;
 Ok(input)
}

This function returns a io::Result, the result type of I/O functions, for customizable error handling. Then the main function can be simplified and mutable variables are no longer necessary:

fn main() {
 let code = ask_input("Brainf code:").expect("Failed to read code");
 let input = ask_input("Input:").expect("Failed to read input");
 // --snip--
}

Types and constants

pub enum Instruction {
 // --snip--
 StartLoop(u32), // u32 is the index of the end of the loop
 EndLoop(u32), // u32 is the index of the start of the loop
 // --snip--
}

In my opinion, a dedicated type alias for positions makes the code clearer:

type Pos = usize;
pub enum Instruction {
 // --snip--
 StartLoop(Pos),
 EndLoop(Pos),
}

Similarly for cells: (I avoided Cell to reduce confusion)

type TapeCell = u8;
const TapeLength: usize = 3000;
pub struct Interpreter {
 // --snip--
 pub tape: [TapeCell; TapeLength],
}

`Interpreter`

The fields should be private, not pub.

I think the code having balanced brackets is an invariant of the struct, and the build functions belong to the construction process. I would define a specialized type for syntax errors:

use fmt::{self, Display};
// better names welcome
#[derive(Copy, Debug)]
pub enum SyntaxErrorKind {
 BadClosure,
 OpenLoop,
}
#[derive(Debug)]
pub struct SyntaxError {
 pub kind: SyntaxErrorKind,
 pub pos: Pos,
}
impl Display for SyntaxError {
 fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
 match self.kind {
 BadClosure => write!(f, "Closure of nonexistent loop at position {}", self.pos),
 OpenLoop => write!(f, "Unclosed loop at position {}", self.pos),
 }
 }
}

refactor build_loop_map and build_instructions to associated functions:

fn build_instructions(code: &str) -> Vec<Instructions> {
 // --snip--
}
fn build_loop_map(code: &str) -> Result<HashMap<Pos, Pos>, SyntaxError> {
 // --snip--
}

and define an associated function new for Interpreter:

pub fn new(code: String, input: String) -> Result<Interpreter, SyntaxError> {
 let instructions = build_instructions(&code);
 let loops = build_loop_map(&code)?;
 Ok(Interpreter {
 instructions,
 pointer: 0,
 code,
 loops,
 input,
 tape: [0; 3000],
 })
}

Storing loop information in both the map and instructions is redundant; keep one of them. You also don't need to store the code.

L. F. L. F. 9,6952 gold badges27 silver badges69 bronze badges · Answer 1 · 2020-05-19 09:11:53Z

Formatting

Your code deviates from the official Rust coding style in a few small aspects:

use std::{collections::HashMap,
io::stdin};

The line break should not be there:

use std::{collections::HashMap, io::stdin};

fn throw (error: String) {
 // --snip--
}

Functions should not be followed by a space.

pub struct Interpreter {
 // --snip
 pub tape: [u8; 3000]
}

Put a trailing comma for consistency.

`use`

use std::{collections::HashMap,
io::stdin};

In Rust, functions are conventionally accessed by bringing their parent modules into scope using use (see the relevant section of the book):

use std::collections::HashMap;
use std::io;

Error handling

fn throw (error: String) {
 println!("{}", error);
 std::process::exit(0);
}

Rust already has panic! for this.

let mut code = String::new();
println!("Brainf code:");
stdin().read_line(&mut code).unwrap();

Provide a friendly error message:

io::stdin()
 .read_line(&mut code)
 .expect("Failed to read code");

Interaction

let mut code = String::new();
println!("Brainf code:");
stdin().read_line(&mut code).unwrap();
let mut input = String::new();
println!("Inputs for the program:");
stdin().read_line(&mut input).unwrap();

Prefer to print interaction code on stderr by using eprintln! instead of println!. This allows users to conveniently redirect output to files.

This pattern can be extracted into a function:

use std::io;
fn ask_input(prompt: &str) -> io::Result<String> {
 eprintln!("{}", prompt);
 let mut input = String::new();
 io::stdin().read_line(&mut input)?;
 Ok(input)
}

This function returns a io::Result, the result type of I/O functions, for customizable error handling. Then the main function can be simplified and mutable variables are no longer necessary:

fn main() {
 let code = ask_input("Brainf code:").expect("Failed to read code");
 let input = ask_input("Input:").expect("Failed to read input");
 // --snip--
}

Types and constants

pub enum Instruction {
 // --snip--
 StartLoop(u32), // u32 is the index of the end of the loop
 EndLoop(u32), // u32 is the index of the start of the loop
 // --snip--
}

In my opinion, a dedicated type alias for positions makes the code clearer:

type Pos = usize;
pub enum Instruction {
 // --snip--
 StartLoop(Pos),
 EndLoop(Pos),
}

Similarly for cells: (I avoided Cell to reduce confusion)

type TapeCell = u8;
const TapeLength: usize = 3000;
pub struct Interpreter {
 // --snip--
 pub tape: [TapeCell; TapeLength],
}

`Interpreter`

The fields should be private, not pub.

I think the code having balanced brackets is an invariant of the struct, and the build functions belong to the construction process. I would define a specialized type for syntax errors:

use fmt::{self, Display};
// better names welcome
#[derive(Copy, Debug)]
pub enum SyntaxErrorKind {
 BadClosure,
 OpenLoop,
}
#[derive(Debug)]
pub struct SyntaxError {
 pub kind: SyntaxErrorKind,
 pub pos: Pos,
}
impl Display for SyntaxError {
 fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
 match self.kind {
 BadClosure => write!(f, "Closure of nonexistent loop at position {}", self.pos),
 OpenLoop => write!(f, "Unclosed loop at position {}", self.pos),
 }
 }
}

refactor build_loop_map and build_instructions to associated functions:

fn build_instructions(code: &str) -> Vec<Instructions> {
 // --snip--
}
fn build_loop_map(code: &str) -> Result<HashMap<Pos, Pos>, SyntaxError> {
 // --snip--
}

and define an associated function new for Interpreter:

pub fn new(code: String, input: String) -> Result<Interpreter, SyntaxError> {
 let instructions = build_instructions(&code);
 let loops = build_loop_map(&code)?;
 Ok(Interpreter {
 instructions,
 pointer: 0,
 code,
 loops,
 input,
 tape: [0; 3000],
 })
}

Storing loop information in both the map and instructions is redundant; keep one of them. You also don't need to store the code.

Stack Exchange Network

Brainf*** interpreter in Rust

1 Answer 1

Formatting

`use`

Error handling

Interaction

Types and constants

`Interpreter`

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

Brainf*** interpreter in Rust

1 Answer 1

Formatting

use

Error handling

Interaction

Types and constants

Interpreter

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions

`use`

`Interpreter`