I'm learning Rust coming from an intermediate background in Python. I've completed the first 8 chapters of the book and I wanted a project that would solidify the concepts I learned, so I made a Brainf*** interpreter designed around the things I wanted to make sure I knew. This meant that I wanted it to use structs, enums, functions, collections, and the module system. Writing and debugging this program was a great learning experience and I'm hoping that with the help of the CRSE I could learn even more from it.
The program accepts some BF code as input as well as a string containing all the inputs the code will use. Each time the code takes input it uses a character from the provided input and then moves to the next one. It also supports some very light (and I mean minimal) error handling. This program was tested using this test and to test the input I used a simple program* that prints the user's input.
A primary thing I would like to know is the quality of my comments. I did not comment too much but I hope a lot of the code is explained well enough with variable and method names. However, I have always had trouble knowing when to comment so advice on that in the context of this code would be appreciated. I would also like to know how I could make the design fit better with the idiosyncrasies of Rust. Thanks in advance, here's the code:
main.rs
use std::{collections::HashMap,
io::stdin};
mod interpreter;
use interpreter::Interpreter;
fn main() {
let mut code = String::new();
println!("Brainf code:");
stdin().read_line(&mut code).unwrap();
let mut input = String::new();
println!("Inputs for the program:");
stdin().read_line(&mut input).unwrap();
let mut program = Interpreter {
instructions: Vec::new(),
pointer: 0,
code,
loops: HashMap::new(),
input: input.trim().to_string(),
tape: [0; 3000]
};
program.build_instructions();
program.run();
}
interpreter.rs
use std::collections::HashMap;
fn throw (error: String) {
println!("{}", error);
std::process::exit(0);
}
pub enum Instruction {
Increment,
Decrement,
MoveLeft,
MoveRight,
StartLoop(u32), // u32 is the index of the end of the loop
EndLoop(u32), // u32 is the index of the start of the loop
Output,
Input,
Halt,
}
pub struct Interpreter {
pub instructions: Vec<Instruction>,
pub pointer: usize,
pub code: String,
pub loops: HashMap<u32, u32>,
pub input: String,
pub tape: [u8; 3000]
}
impl Interpreter {
fn build_loop_map (&mut self) {
/* Creates mapping of loops and their endpoints for easy jumping around the code at loops
* endpoints and startpoints */
let mut open_loops: Vec<u32> = Vec::new();
for (i, c) in self.code.chars().enumerate() {
let i: u32 = i as u32;
if c == '[' {
open_loops.push(i)
} else if c == ']' {
let start = match open_loops.pop() {
Some(n) => n,
None => {
throw(format!("Loop Error: Closure of nonexistent loop\nChar: ']' Pos: {}", i));
0
}
};
self.loops.insert(start, i);
self.loops.insert(i, start);
}
}
if open_loops.len() > 0 {
throw(format!("Loop Error: At least one unclosed loop\nChar: '[', Pos: {}", open_loops[0]));
}
}
pub fn build_instructions (&mut self) {
/* Converts the code string into a vector of instructions */
self.build_loop_map();
for (i, c) in self.code.chars().enumerate() {
let i: u32 = i as u32;
if c == '+' {
self.instructions.push(Instruction::Increment);
} else if c == '-' {
self.instructions.push(Instruction::Decrement);
} else if c == '<' {
self.instructions.push(Instruction::MoveLeft);
} else if c == '>' {
self.instructions.push(Instruction::MoveRight);
} else if c == '[' {
let end = self.loops.get(&i).unwrap();
self.instructions.push(Instruction::StartLoop(*end));
} else if c == ']' {
let start = self.loops.get(&i).unwrap();
self.instructions.push(Instruction::EndLoop(*start));
} else if c == '.' {
self.instructions.push(Instruction::Output);
} else if c == ',' {
self.instructions.push(Instruction::Input);
}
}
self.instructions.push(Instruction::Halt);
}
pub fn run(&mut self) {
/* Loops over the vec of instructions and executes the corresponding code */
let tape_size: usize = self.tape.len() - 1; // Represents the last index of the tape rather than the len
let mut input_pointer: usize = 0;
let inputs: Vec<u8> = self.input.clone().into_bytes();
let mut instruction_pointer: usize = 0;
loop {
let instruction = &self.instructions[instruction_pointer];
let cell = &mut self.tape[self.pointer];
match *instruction {
Instruction::Increment => {
*cell = cell.overflowing_add(1).0;
},
Instruction::Decrement => {
*cell = cell.overflowing_sub(1).0;
},
Instruction::MoveRight => {
self.pointer = self.pointer.overflowing_sub(1).0;
if self.pointer > tape_size {
self.pointer = tape_size;
}
}
Instruction::MoveLeft => {
self.pointer += 1;
if self.pointer > tape_size {
self.pointer = 0;
}
},
Instruction::StartLoop(n) => {
if *cell == 0 {
instruction_pointer = n as usize;
}
},
Instruction::EndLoop(n) => {
if *cell != 0 {
instruction_pointer = n as usize;
}
},
Instruction::Output => {
print!("{}", *cell as char);
},
Instruction::Input => {
let c = match inputs.get(input_pointer) {
Some(a) => *a,
None => 0
};
*cell = c;
input_pointer += 1;
},
Instruction::Halt => {
break;
}
}
instruction_pointer += 1;
}
}
}
*The code is simply >+[>,]<[<]>>[.>]
1 Answer 1
Formatting
Your code deviates from the official Rust coding style in a few small aspects:
use std::{collections::HashMap, io::stdin};
The line break should not be there:
use std::{collections::HashMap, io::stdin};
fn throw (error: String) { // --snip-- }
Functions should not be followed by a space.
pub struct Interpreter { // --snip pub tape: [u8; 3000] }
Put a trailing comma for consistency.
use
use std::{collections::HashMap, io::stdin};
In Rust, functions are conventionally accessed by bringing their parent modules into scope using use
(see the relevant section of the book):
use std::collections::HashMap;
use std::io;
Error handling
fn throw (error: String) { println!("{}", error); std::process::exit(0); }
Rust already has panic!
for this.
let mut code = String::new(); println!("Brainf code:"); stdin().read_line(&mut code).unwrap();
Provide a friendly error message:
io::stdin()
.read_line(&mut code)
.expect("Failed to read code");
Interaction
let mut code = String::new(); println!("Brainf code:"); stdin().read_line(&mut code).unwrap(); let mut input = String::new(); println!("Inputs for the program:"); stdin().read_line(&mut input).unwrap();
Prefer to print interaction code on stderr by using eprintln!
instead of println!
. This allows users to conveniently redirect output to files.
This pattern can be extracted into a function:
use std::io;
fn ask_input(prompt: &str) -> io::Result<String> {
eprintln!("{}", prompt);
let mut input = String::new();
io::stdin().read_line(&mut input)?;
Ok(input)
}
This function returns a io::Result
, the result type of I/O functions, for customizable error handling. Then the main
function can be simplified and mutable variables are no longer necessary:
fn main() {
let code = ask_input("Brainf code:").expect("Failed to read code");
let input = ask_input("Input:").expect("Failed to read input");
// --snip--
}
Types and constants
pub enum Instruction { // --snip-- StartLoop(u32), // u32 is the index of the end of the loop EndLoop(u32), // u32 is the index of the start of the loop // --snip-- }
In my opinion, a dedicated type alias for positions makes the code clearer:
type Pos = usize;
pub enum Instruction {
// --snip--
StartLoop(Pos),
EndLoop(Pos),
}
Similarly for cells: (I avoided Cell
to reduce confusion)
type TapeCell = u8;
const TapeLength: usize = 3000;
pub struct Interpreter {
// --snip--
pub tape: [TapeCell; TapeLength],
}
Interpreter
The fields should be private, not pub
.
I think the code having balanced brackets is an invariant of the struct, and the build
functions belong to the construction process. I would define a specialized type for syntax errors:
use fmt::{self, Display};
// better names welcome
#[derive(Copy, Debug)]
pub enum SyntaxErrorKind {
BadClosure,
OpenLoop,
}
#[derive(Debug)]
pub struct SyntaxError {
pub kind: SyntaxErrorKind,
pub pos: Pos,
}
impl Display for SyntaxError {
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
match self.kind {
BadClosure => write!(f, "Closure of nonexistent loop at position {}", self.pos),
OpenLoop => write!(f, "Unclosed loop at position {}", self.pos),
}
}
}
refactor build_loop_map
and build_instructions
to associated functions:
fn build_instructions(code: &str) -> Vec<Instructions> {
// --snip--
}
fn build_loop_map(code: &str) -> Result<HashMap<Pos, Pos>, SyntaxError> {
// --snip--
}
and define an associated function new
for Interpreter
:
pub fn new(code: String, input: String) -> Result<Interpreter, SyntaxError> {
let instructions = build_instructions(&code);
let loops = build_loop_map(&code)?;
Ok(Interpreter {
instructions,
pointer: 0,
code,
loops,
input,
tape: [0; 3000],
})
}
Storing loop information in both the map and instructions is redundant; keep one of them. You also don't need to store the code.