This is my first attempt at writing a "large program", and I think that I've got the foundations for the emulator down. However, as I'm still a beginner, I might have made some questionable design choices (too many functions/too verbose/inefficient and so on), so I'd like to know if I can make the code more readable and/or efficient.
For now, I implemented only Immediate, Absolute, and Zero Page addressing modes, and only for LDA. I just waited to post this before going on and having to modify more code for a new design.
main.rs
use cpu::CPU;
mod cpu;
mod instruction;
mod flags;
fn main() {
let mut cpu = CPU::new();
// values to load
cpu.ram[0xA00B] = 0x0A;
cpu.ram[0x00DD] = 0x80;
// 0xA9 -> Load 0x80 into a
// 0xAD -> Load content of 0xA00B into a
// 0xA5 -> Load content of 0x00DD into a
cpu.run(vec![0xA9, 0x80, 0xAD, 0x0B, 0xA0, 0xA5, 0xDD]);
}
cpu.rs
use crate::{instruction::{Instruction, AddrMode}, flags::{Flags, FLAGS}};
pub struct CPU {
a: u8,
p: Flags,
pc: u16,
pub ram: [u8; 0xffff],
// unneeded, but it makes it harder to make mistakes + easier to format for a vector of states
curr_instr: Instruction,
// for printing the operative address in the log function
op_addr: u16
}
impl CPU {
pub fn new() -> Self {
CPU {
a: 0,
p: Flags::new(),
pc: 0,
ram: [0; 0xffff],
curr_instr: Instruction::new(),
op_addr: 0
}
}
// just for formatting in the print_state() function
fn get_flag_str(&self, flag: FLAGS) -> &str {
match flag {
FLAGS::Z => {
if self.p.z == true { return "T"; } else { return "F"; }
}
FLAGS::N => {
if self.p.n == true { return "T"; } else { return "F"; }
}
_ => { return "U"; }
}
}
pub fn print_state(&self) {
println!("┏━━━━━━━━━━━━━━━━━━━━━┓");
println!("┃ CPU STATE ┃");
println!("┃ ┃");
println!("┃ REGISTERS: ┃");
println!("┃ A: {:#04X?} ┃", self.a);
println!("┃ PC: {:#06X?} ┃", self.pc);
println!("┃ ┃");
println!("┃ FLAGS: ┃");
println!("┃ C: {} ┃", self.get_flag_str(FLAGS::C));
println!("┃ Z: {} ┃", self.get_flag_str(FLAGS::Z));
println!("┃ I: {} ┃", self.get_flag_str(FLAGS::I));
println!("┃ D: {} ┃", self.get_flag_str(FLAGS::D));
println!("┃ V: {} ┃", self.get_flag_str(FLAGS::V));
println!("┃ N: {} ┃", self.get_flag_str(FLAGS::N));
println!("┃ ┃");
println!("┃ INSTRUCTION: ┃");
println!("┃ MNEMONIC: {} ┃", self.curr_instr.mnemonic);
println!("┃ ADDR MODE: {} ┃", self.curr_instr.addr_mode.get_addr_mode_string(self.curr_instr.opcode));
println!("┃ OP ADDR: {:#06X?} ┃", self.op_addr);
println!("┃ OPCODE: {:#04X?} ┃", self.curr_instr.opcode);
println!("┗━━━━━━━━━━━━━━━━━━━━━┛");
}
fn read_byte(&self, addr: u16) -> u8 {
self.ram[addr as usize]
}
fn write_byte(&mut self, addr: u16, data: u8) {
self.ram[addr as usize] = data
}
fn read_word(&self, addr: u16) -> u16 {
let hi = self.read_byte(addr + 1);
let lo = self.read_byte(addr);
(hi as u16).wrapping_shl(8) | lo as u16
}
fn fetch_address(&mut self, mode: AddrMode) {
match mode {
AddrMode::IMM => {
self.op_addr = self.pc + 1;
self.pc += 2;
}
AddrMode::ABS => {
self.op_addr = self.read_word(self.pc + 1);
self.pc += 3;
}
AddrMode::ZPG => {
self.op_addr = 0x0000 | (self.read_byte(self.pc + 1)) as u16;
self.pc += 2;
}
}
}
fn execute(&mut self) {
let opcode = self.read_byte(self.pc);
self.curr_instr.fetch_instruction(opcode);
self.fetch_address(self.curr_instr.addr_mode);
match self.curr_instr.mnemonic.as_str() {
"LDA" => {
self.a = self.read_byte(self.op_addr);
self.p.toggle_flags(self.a, vec![FLAGS::Z, FLAGS::N]);
}
"BRK" => {
std::process::exit(0);
}
_ => { println!("Mnemonic unrecognized {}", self.curr_instr.mnemonic.as_str());}
}
}
fn load(&mut self, program: Vec<u8>) {
let mut addr = 0x0000;
for data in program {
self.write_byte(addr, data);
addr += 1;
}
}
pub fn run(&mut self, program: Vec<u8>) {
self.load(program);
loop {
self.execute();
self.print_state();
}
}
}
flags.rs
pub enum FLAGS {
C, Z, I, D, V, N
}
pub struct Flags {
pub c: bool,
pub z: bool,
pub i: bool,
pub d: bool,
// b flag
pub v: bool,
pub n: bool
}
impl Flags {
pub fn new() -> Self {
Flags { c: false, z: false, i: false, d: false, v: false, n: false }
}
fn toggle_zero_flag(&mut self, value: u8) {
if value == 0 {
self.z = true;
} else {
self.z = false;
}
}
fn toggle_negative_flag(&mut self, value: u8) {
if value & 0b1000_0000 != 0 {
self.n = true;
} else {
self.n = false;
}
}
pub fn toggle_flags(&mut self, value: u8, flags: Vec<FLAGS>) {
for flag in flags {
match flag {
FLAGS::Z => { self.toggle_zero_flag(value) }
FLAGS::N => { self.toggle_negative_flag(value) }
_ => { println!("Flag still not implemented") }
}
}
}
}
instruction.rs
#[derive(Copy, Clone)]
pub enum AddrMode {
IMM,
ABS,
ZPG
}
impl AddrMode {
pub fn get_addr_mode_string(&self, opcode: u8) -> String {
match opcode {
0xA9 | 0x00 => { String::from("IMM") }
0xAD => { String::from("ABS") }
0xA5 => { String::from("ZPG") }
_ => { String::from("IMM") }
}
}
}
pub struct Instruction {
pub opcode: u8,
pub mnemonic: String,
pub addr_mode: AddrMode
}
impl Instruction {
pub fn new() -> Self {
Instruction { opcode: 0x00, mnemonic: String::from("NOP"), addr_mode: AddrMode::IMM }
}
fn set_instruction(&mut self, new_opcode: u8, new_mnemonic: String, new_addr_mode: AddrMode) {
self.opcode = new_opcode;
self.mnemonic = new_mnemonic;
self.addr_mode = new_addr_mode;
}
fn fetch_mnemonic(&self, opcode: u8) -> String {
match opcode {
0xA9 | 0xAD | 0xA5 => { String::from("LDA") }
0x00 => { String::from("BRK") }
_ => { todo!() }
}
}
fn fetch_addr_mode(&self, opcode: u8) -> AddrMode {
match opcode {
0xA9 => { AddrMode::IMM }
0xAD => { AddrMode::ABS }
0xA5 => { AddrMode::ZPG }
// TODO: should be implied
0x00 => { AddrMode::IMM }
_ => { todo!() }
}
}
pub fn fetch_instruction(&mut self, opcode: u8) {
let new_opcode = opcode;
let new_mnemonic = self.fetch_mnemonic(opcode);
let new_addr_mode = self.fetch_addr_mode(opcode);
// set the current instruction with new data
self.set_instruction(new_opcode, new_mnemonic, new_addr_mode);
}
}
The instructions that I have implemented all work as intended, but I'm afraid the code isn't all that good :( Thanks in advance for the feedback
1 Answer 1
This is an attempt at my answer. Since I am not familiar with the 6502, I can only comment on more obvious code smells.
I already left a comment under your post regarding clippy; I will therefore not reiterate its output here. I'm confident you can work through this yourself. :)
I must say that I am impressed -- for a beginner, this is already quite good code!
You have short and concise functions with descriptive names -- that's nice!
Naming
You have named quite some of structs with all-caps. The convention in rust is that structs and enums use SnakeCase. Seeing your main
start with CPU::new()
will confuse people who first see your code, since ALLCAPS is reserved for constants and statics.
This affects CPU
, FLAGS
, the fields of AddrMode
, and likely more.
Incomplete abstraction, publicness
The field CPU.ram
is public. Structs with public fields can be an indicator for incomplete abstraction.
You make use of the public field in main
, to initialize the RAM. But you aleady have a nice write_byte
method that you could use. Make that public, and then use it in main.
let mut cpu = Cpu::new();
// values to load
cpu.write_byte(0xA00B, 0x0A);
cpu.write_byte(0x00DD, 0x80);
Missing abstraction, primitive obsession, magic numbers
From what I can understand, your cpu.run
gets a vector of instructions and address words. Apparently 0x80
, 0xDD
, and (0x0B
, 0xA0
) are words, and the other are instructions / opcodes. Your words also seem to be either 16 bit or 32 bit wide, but as a user I don't see when one or the other applies unless I already understand the architecture.
Providing the opcodes as u8
values is a missing abstraction, as well as primitive obession (using primities instead of value-objects). You actually want to allow only certain opcodes. I would suggest defining a pub enum OpCodes
somewhere.
Now you would have a vector of opcodes and addresses, which is not good. But hey, we are missing another abstraction that can help!
Every opcode and address belong together and form ... something (I'm missing the right word -- would you call that an instruction?). So make a pub struct Instruction { opcode: OpCodes, data: u16 }
. Feed those into your CPU.
This is also the point where you should deal with 16- and 32-bit words.
The method's signature should be this.
impl Cpu {
pub fn run(&mut self, program: Vec<Instruction>) { todo!() }
}
Since the opcodes are also not used by any identifier, this makes them magic numbers.
fn get_flag_str
can be written a lot shorter. You don't need to compare a bool
to true
, just use the bool
. Also, booleans can be used in match
expressions. And you don't need return
statements in the last instruction of blocks and in match arms, so we can get rid of those.
impl Cpu {
fn get_flag_str(&self, flag: Flags) -> &str {
match flag {
Flags::Z => match self.p.z {
true => "T",
false => "F",
},
Flags::N => match self.p.n {
true => "T",
false => "F",
},
_ => "U",
}
}
}
You now also see that you use the exact same logic twice to convert a bool to a str of length 1. That's a function.
impl Cpu {
fn get_flag_str(&self, flag: FLAGS) -> &str {
match flag {
FLAGS::Z => bool_to_str(self.p.z),
FLAGS::N => bool_to_str(self.p.n),
_ => "U",
}
}
}
fn bool_to_str(value: bool) -> &'static str {
match value {
true => "T",
false => "F",
}
}
Furthermore, it seems you don't need really need a &str
, since the result is always of length 1. Just use a char
.
derive Debug
For libraries in Rust, it is convention that every struct
and enum
gets a Debug
implementation. You have an application, but it still makes sense, because you can then easily throw a dbg!
in your code and show the state of whatever object you are interested in. And luckily in Rust: you don't need to implement this yourself. Just change your declaration like this, and you are done.
#[derive(Debug)]
pub struct Cpu { ... }
formatting
Use the integrated formatter, rustfmt
, to format your code. The moment you start collaborating with other programmers, you don't need to fight over indentation, whitespace, etc. Just use it, and everyone can read your code more easily since the formatting is closer than to what they would expect.
Missing tests
Your code does not include any tests. That is a problem, since after a refactor I cannot be sure that I did not break anything. This like read_word
would likely be a good candidate to introduce some tests.
I think I will leave it here. There's probably more to be said when you dive into the actual implementation. I will leave that to others. :)
-
\$\begingroup\$ I don't think your recommendation to take what I'd call "decoded instructions" as an input makes sense: it does not match reality. The real CPU executes raw bytes, and that can matter. Code may reuse the same byte as different things, for example one (rarely used) trick is to use a non-taken branch to skip a one-byte instruction (that instruction would be interpreted as the branch offset that way, but as an instruction when jumped to directly). Also, branch offsets are in terms of bytes, not instructions. And self-modifying code would be difficult to support. \$\endgroup\$user555045– user5550452022年09月09日 17:50:55 +00:00Commented Sep 9, 2022 at 17:50
-
\$\begingroup\$ Interesting point, and your insight in assembly is beyond my understanding. I think the question is what OP ultimately intends to do with this. Feed it some byte and check what the CPU does with that? Then I would keep it that way. But the intent is that it executes some operations on data. IMO code should signify intent. Of course, after passing the hypothetical
Instruction
vec to the CPU it would just convert it to bytes and place it into RAM. So we are just talking about a nicer interface. OP should maybe choose what makes most sense for his project. \$\endgroup\$sarema– sarema2022年09月09日 18:03:43 +00:00Commented Sep 9, 2022 at 18:03 -
\$\begingroup\$ Thank you for the review :) I think that I will try to implement the Instruction struct actually, it sounds like it would be much easier to implement instructions in that way. Also I really only used u8 and u16 because (as far as I understand) that's all that the 6502 could work with, using u16 for addresses/program counter and u8 for everything else \$\endgroup\$Leon9343– Leon93432022年09月11日 11:08:43 +00:00Commented Sep 11, 2022 at 11:08
cargo clippy
in your project and work through the lints. This is something you should do habitually. Clippy wants to be your friend! \$\endgroup\$