1
\$\begingroup\$

I'm fairly new to Verilog, hardware design and computer architecture. Nevertheless, I've had a go at designing a simplified MIPS processor. It seems to mostly work fine but whenever I simulate it, it hangs on a BEQ instruction.

I'm trying to get it to run the following program (R1 is preloaded with 32'd10 and R4 is preloaded with 32'd1):

0 ADD R0 R1 R5; // R5 = R0 + R1 = 10
4 ADD R0 R1 R5; // R5 = 10 (redundant)
8 ADD R5 R6 R6; // R6 = R5 + R6
C SUB R5 R4 R5; // R5 = R5 - 1
10 SLT R4 R5 R7; // R7 = (R4 < R5)
14 BEQ R7 R4 -3; // Loop back to 8 if R5 > 1 

Here is the GTKWave output:

[Image: GTKWave waveform of the simulation]

As I understand it, the low 16 bits of the instruction go to sign_extend, where they are prefixed with 16 copies of the sign bit. The result goes to shift_left2, where it is multiplied by 4, and is then added to the incremented address (current address + 4).
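The address calculation described above can be sketched in a single module. This is only an illustration, not the design from the question: the module name `branch_target` and its port names are hypothetical, and it assumes the PC + 4 value is available as an input.

```verilog
// Hypothetical helper: computes a BEQ target from the 16-bit immediate.
module branch_target(
    output wire [31:0] target,
    input  wire [15:0] imm,       // instruction[15:0]
    input  wire [31:0] pc_plus4   // address of the branch instruction + 4
);
    // Sign-extend: replicate bit 15 sixteen times, then append the immediate.
    wire [31:0] extended = {{16{imm[15]}}, imm};
    // Convert the word offset to a byte offset (multiply by 4).
    wire [31:0] offset = extended << 2;
    // The target is relative to the incremented PC.
    assign target = pc_plus4 + offset;
endmodule
```

For example, with `imm = 16'hFFFC` (-4) and `pc_plus4 = 28`, the offset is -16 and the target is 12.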

The two operands sent to the ALU are equal, so the z0 (zero) signal is asserted. The Branch signal is asserted by the control module, based on bits [31:26] of the instruction. Branch and zero are ANDed in branch_and, causing branchnd to be asserted. This is fed into a mux that selects the branch address instead of the incremented address. The selected address is fed back into the program counter, assigned to next_address on the next clock edge, and fed into the instruction memory to fetch the next instruction.
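The branch decision described above is purely combinational and can be sketched as follows. The module name `next_pc_select` and its port names are hypothetical; the sketch assumes the ALU result for BEQ is the difference of the two operands.

```verilog
// Hypothetical combinational next-PC selection for a single-cycle datapath.
module next_pc_select(
    output wire [31:0] next_pc,
    input  wire [31:0] pc_plus4,        // incremented PC
    input  wire [31:0] branch_address,  // PC + 4 + (offset << 2)
    input  wire [31:0] alu_result,      // rs - rt for BEQ
    input  wire        branch           // asserted by control for BEQ
);
    // zero is high when the two compared registers are equal.
    wire zero = (alu_result == 32'd0);
    // The branch is taken only for a BEQ whose operands match.
    wire take_branch = branch & zero;
    // 2-to-1 mux feeding the program counter input.
    assign next_pc = take_branch ? branch_address : pc_plus4;
endmodule
```

Nothing in this module stores state; the only place the new address should be captured is the program counter register on the clock edge.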

It looks like all of this happens in my simulation. The branched address can be seen on addin at 12 seconds in the above picture, so it appears to function correctly up to that point. I can't understand why it hangs though. Surely this should just be fed into the program counter, assigned to next_address on the next posedge of the clock, and used to fetch the instruction. If I change the program so that the branch isn't taken, then it continues to execute until it runs out of instructions, so it does seem to be related to taking the branch.

I've tried changing sensitivity lists, adding/removing the clock from modules and just about anything else I can think of. Maybe I am missing something obvious. I'd greatly appreciate it if anyone could help. I'm new here so if this isn't the correct way to go about asking a question or if there is anything I can do to make this question easier to answer, please let me know.

Thanks

module MIPS_tb();
wire [4:0] rs1, rt1, rd1, writereg1;
wire [31:0] ins, register0, register1, register4, register5, register6, register7, addin, aluout, nextadd,
 op1, op2, rfoutput2, dmreaddata, se_address, wdata, braddr, shaddr, incPCaddr;
wire reg_write, branch, branchnd, z0, regds, memtoreg, memwrite, alusrc;
reg clock;
reg reset;
MIPS mips1(clock, reset, rs1, rt1, rd1, writereg1, reg_write, ins,
 register1, register4, register5, register6, register7, branch, branchnd, z0, addin, aluout, nextadd,
 op1, op2, rfoutput2, dmreaddata, se_address, wdata, braddr, shaddr, incPCaddr, regds, memtoreg, memwrite, alusrc);
initial
begin
 $dumpfile("MIPS_MIPS_tb.vcd");
 $dumpvars(0, clock, reset, rs1, rt1, rd1, writereg1, reg_write, ins,
 register1, register4, register5, register6, register7, branch, branchnd, z0, addin, aluout, nextadd,
 op1, op2, rfoutput2, dmreaddata, se_address, wdata, braddr, shaddr, incPCaddr, regds, memtoreg, memwrite, alusrc);
 reset = 1'b1;
 clock = 1'b0;
#1
 reset = 1'b0;
 repeat(250)
#1 
 clock = !clock;
#1
 $finish;
end
endmodule 

MIPS

module MIPS(clock, reset, rs1, rt1, rd1, writereg1, reg_write, ins,
 register1, register4, register5, register6, register7, branch, branchnd, z0, addin, aluout, nextadd,
 op1, op2, rfoutput2, dmreaddata, se_address, wdata, braddr, shaddr, incPCaddr, regds, memtoreg, memwrite, alusrc);
input clock, reset;
output [4:0] rs1, rt1, rd1, writereg1;
output [31:0] ins, addin, aluout;
output reg_write;
output [31:0] register1, register4, register5, register6, register7, nextadd, op1, op2,
 rfoutput2, dmreaddata, se_address, wdata, braddr, shaddr, incPCaddr;
output branch, branchnd, z0, regds, memtoreg, memwrite, alusrc;
wire [31:0] instruction, next_address, address_in, operand1, operand2, reg_file_output2, 
 ALU_result, data_mem_read_data, sign_extended_address, write_data,
 branch_address, shifted_address, incremented_PC_address;
wire RegDst, Branch, MemtoReg, MemWrite, ALUSrc, RegWrite, zero, BEQ;
wire [1:0] ALUOp;
wire [4:0] rs, rt, rd, write_register;
wire [5:0] funct;
wire [15:0] addressop;
wire [3:0] ALU_control_vector;
assign rs1 = rs;
assign rt1 = rt;
assign rd1 = rd;
assign writereg1 = write_register;
assign reg_write = RegWrite;
assign ins = instruction;
assign z0 = zero;
assign branch = Branch;
assign branchnd = BEQ;
assign addin = address_in;
assign aluout = ALU_result;
assign nextadd = next_address;
assign op1 = operand1;
assign op2 = operand2;
assign rfoutput2 = reg_file_output2;
assign dmreaddata = data_mem_read_data;
assign se_address = sign_extended_address;
assign wdata = write_data;
assign braddr = branch_address;
assign shaddr = shifted_address;
assign incPCaddr = incremented_PC_address;
assign regds = RegDst;
assign memtoreg = MemtoReg;
assign memwrite = MemWrite;
assign alusrc = ALUSrc; 
program_counter pc1(next_address, address_in, clock, reset);
register_file rf1(operand1, reg_file_output2, rs, rt, write_register, write_data, RegWrite,
 register1, register4, register5, register6, register7);
ALU alu1(ALU_result, zero, operand1, operand2, ALU_control_vector);
ALU_control aluctrl1(ALU_control_vector, ALUOp, funct);
data_memory dm1(data_mem_read_data, reg_file_output2, ALU_result, MemWrite);
sign_extend se1(sign_extended_address, addressop);
instruction_memory im1(instruction, next_address);
control ctrl1(RegDst, Branch, MemtoReg, ALUOp, MemWrite, ALUSrc, RegWrite, instruction,
 rs, rt, rd, funct, addressop); 
branch_adder ba1(branch_address, shifted_address, incremented_PC_address);
PC_increment pci(incremented_PC_address, next_address);
branch_and brand(BEQ, Branch, zero);
shift_left2 shl2(shifted_address, sign_extended_address);
mux_2to1 branch_mux(address_in, BEQ, incremented_PC_address, branch_address);
mux_2to1_5bit write_reg_mux(write_register, RegDst, rt, rd);
mux_2to1 write_data_mux(write_data, MemtoReg, ALU_result, data_mem_read_data);
mux_2to1 ALU_source_mux(operand2, ALUSrc, reg_file_output2, sign_extended_address);
endmodule

Control Unit

module control(RegDst, Branch, MemtoReg, ALUOp, MemWrite, ALUSrc, RegWrite, instruction,
 rs, rt, rd, funct, addressop); 
output RegDst, Branch, MemtoReg, MemWrite, ALUSrc, RegWrite;
output [4:0] rs, rt, rd;
output [5:0] funct;
output [15:0] addressop;
output [1:0] ALUOp;
input [31:0] instruction;
wire [5:0] Op = instruction[31:26];
assign rs = instruction[25:21];
assign rt = instruction[20:16];
assign rd = instruction[15:11];
assign funct = instruction[5:0];
assign addressop = instruction[15:0];
reg [7:0] Control;
assign RegDst = Control[7];
assign RegWrite = Control[6];
assign ALUSrc = Control[5];
assign MemWrite = Control[4];
assign MemtoReg = Control[3];
assign Branch = Control[2];
assign ALUOp = Control[1:0];
initial
 Control = 8'd0;
always @(*)
 casex(Op)
 6'd0 : Control = 8'b11000010; // R-TYPE
 6'd35 : Control = 8'b01101000; // LW
 6'd43 : Control = 8'bx011x000; // SW
 6'd4 : Control = 8'bx000x101; // BEQ
 default : Control = 8'b00000000; // NOP 
 endcase
endmodule 

Instruction Memory

module instruction_memory(instruction, address);
output reg [31:0] instruction;
input [31:0] address;
reg [31:0] prog [40:0];
initial
begin
 prog[0] <= 32'b000000_00000_00001_00101_00000000000;
 prog[4] <= 32'b000000_00000_00001_00101_00000000000;
 prog[8] <= 32'b000000_00000_00000_00110_00000000000;
 prog[12] <= 32'b000000_00101_00110_00110_00000000000;
 prog[16] <= 32'b000000_00101_00100_00101_00000000010;
 prog[20] <= 32'b000000_00100_00101_00111_00000001010;
 prog[24] <= 32'b000100_00111_00100_11111_11111111100; 
 prog[28] <= 32'b000000_00101_00100_00101_00000000010;
 prog[32] <= 32'b000000_00101_00100_00111_00000001010; 
 prog[36] <= 32'b000000_00100_00110_00110_00000000000;
 prog[40] <= 32'b000000_00100_00110_00110_00000000000;
end
always @(address)
 instruction = prog[address];
endmodule

Sign Extend

module sign_extend(sign_extended_address, instruction_addr);
output reg [31:0] sign_extended_address;
input [15:0] instruction_addr;
always @(*)
begin
 sign_extended_address[15:0] = instruction_addr;
 if(instruction_addr[15]==1'b1)
 sign_extended_address[31:16] = 16'b1111_1111_1111_1111;
 else
 sign_extended_address[31:16] = 16'b0000_0000_0000_0000;
end
endmodule

ALU Control Unit

module ALU_control(ALU_control_vector, ALUOp, funct);
output reg [3:0] ALU_control_vector;
input [1:0] ALUOp;
input [5:0] funct;
initial
 ALU_control_vector = 4'b0000;
always @(*)
begin
 case(ALUOp[1])
 1'b0 : case(ALUOp[0])
 1'b0 : ALU_control_vector = 4'b0010; // ADD for LW, SW
 1'b1 : ALU_control_vector = 4'b0110; // SUB for BEQ
 endcase
 1'b1 : case(funct[3:0])
 4'b0000 : ALU_control_vector = 4'b0010; // ADD
 4'b0100 : ALU_control_vector = 4'b0000; // AND
 4'b0101 : ALU_control_vector = 4'b0001; // OR
 4'b0010 : ALU_control_vector = 4'b0110; // SUB
 4'b1010 : ALU_control_vector = 4'b0111; // SLT
 default : ALU_control_vector = 4'b0010; // ADD (doesn't matter)
 endcase
 endcase
end
endmodule

ALU

module ALU(ALU_result, zero, operand1, operand2, ALU_control_vector);
output reg [31:0] ALU_result;
output reg zero;
input [31:0] operand1, operand2;
input [3:0] ALU_control_vector;
initial
begin
 ALU_result = 32'd0;
 zero = 0;
end
always @(*)
begin
 case(ALU_control_vector)
 4'b0000 : ALU_result = operand1 & operand2;
 4'b0001 : ALU_result = operand1 | operand2;
 4'b0010 : ALU_result = operand1 + operand2;
 4'b0110 : ALU_result = operand1 - operand2;
 4'b0111 : ALU_result = (operand1 < operand2) ? 32'd1 : 32'd0;
 endcase
 if (ALU_result == 0)
 zero = 1'b1;
 else
 zero = 1'b0;
end
endmodule

Register File

module register_file(operand1, reg_file_output2, rs, rt, write_register, write_data, RegWrite,
 register1, register4, register5, register6, register7);
output reg [31:0] operand1, reg_file_output2;
output [31:0] register1, register4, register5, register6, register7;
input [31:0] write_data;
input [4:0] rs, rt, write_register;
input RegWrite;
reg [31:0] registers [31:1]; // Register 0 reserved for "0"
assign register1 = registers[1];
assign register4 = registers[4];
assign register5 = registers[5];
assign register6 = registers[6];
assign register7 = registers[7];
initial
begin
 registers[1] = 10;
 registers[4] = 1;
 registers[5] = 0;
 registers[6] = 0;
 registers[7] = 0;
 operand1 = 0;
 reg_file_output2 = 0;
end
always @(*)
begin
 operand1 = (rs == 0) ? 32'd0 : registers[rs];
 reg_file_output2 = (rt == 0) ? 32'd0 : registers[rt];
 if(RegWrite)
 registers[write_register] = write_data;
end
endmodule

Program Counter

module program_counter(next_address, address, clock, reset);
output [31:0] next_address;
input [31:0] address;
input clock, reset;
reg [31:0] pc_next;
assign next_address = pc_next;
always @(posedge clock, posedge reset)
begin
if(reset)
 pc_next = 32'd0;
else
 pc_next = address;
end
endmodule

Branch Adder

module branch_adder(branch_address, shifted_addr_instruction, incremented_PC_addr);
output [31:0] branch_address;
input [31:0] shifted_addr_instruction, incremented_PC_addr;
assign branch_address = shifted_addr_instruction + incremented_PC_addr;
endmodule

Mux 1

module mux_2to1(out, select, a, b);
output reg [31:0] out;
input select;
input [31:0] a, b;
initial
 out = 0;
always @(*)
begin
 if(select)
 out = b;
 else
 out = a;
end
endmodule

Mux 2

module mux_2to1_5bit(out, select, a, b);
output reg [4:0] out;
input select;
input [4:0] a, b;
initial
 out = 0;
always @(*)
begin
 if(select)
 out = b;
 else
 out = a;
end
endmodule

PC Incrementer

module PC_increment(incremented_PC_address, current_address);
output [31:0] incremented_PC_address;
input [31:0] current_address;
assign incremented_PC_address = current_address + 4; 
endmodule

Branch/Zero AND

module branch_and(BEQ, branch, zero);
output BEQ;
input branch, zero;
and(BEQ, branch, zero);
endmodule

Left Shift Address

module shift_left2(shifted_address, sign_extended_address);
output [31:0] shifted_address;
input [31:0] sign_extended_address;
assign shifted_address = sign_extended_address << 2;
endmodule

Data Memory

module data_memory(data_mem_read_data, data_mem_write_data, ALU_result, MemWrite);
output reg [31:0] data_mem_read_data;
input [31:0] ALU_result, data_mem_write_data;
input MemWrite;
wire MemRead = ~MemWrite;
reg [31:0] data_registers [255:0];
initial
 data_mem_read_data = 0;
always @(ALU_result, data_mem_write_data)
begin
 if(MemWrite)
 data_registers[ALU_result] = data_mem_write_data;
 if(MemRead)
 data_mem_read_data = data_registers[ALU_result];
end
endmodule
asked Sep 2, 2018 at 15:39
\$\endgroup\$
  • \$\begingroup\$ Please edit your question to explain, in words, how you believe branching is supposed to work in your design. Particularly include how you have implemented the MIPS branch delay slot. \$\endgroup\$ Commented Sep 2, 2018 at 16:43
  • \$\begingroup\$ @ChrisStratton - I have edited to include an explanation of how I expect branching to work. I should have mentioned that this is supposed to be a single-cycle processor. It's my understanding that I don't need to worry about branch delay in this case, but maybe I'm wrong? As I say, I'm quite new to this so I don't actually know what a branch delay slot is, just that it seems to be mentioned as being necessary for a pipelined processor. \$\endgroup\$ Commented Sep 2, 2018 at 17:19
  • \$\begingroup\$ I don't know where your bug is, but I have some general Verilog tips for you. 1. Never use position-based port lists. Always use name-based ones. You'll end up adding more ports and messing everything up. 2. You have way more modules than you need. A bunch of your modules just do one line of combinational logic, so do it inline. 3. You can simplify a lot of logic. You have a case statement checking if a signal is all 0 or not, and assigning one output to that. You could just as easily do "assign foo = (bar!=0);". \$\endgroup\$ Commented Sep 2, 2018 at 17:34
  • \$\begingroup\$ And if your simulation is stalling, there's a combinational loop somewhere. Equivalent of a = ~a;. \$\endgroup\$ Commented Sep 2, 2018 at 17:36
  • \$\begingroup\$ "MIPS" in combination with "single cycle branch" is an oxymoron. At the very least, by failing to implement a branch delay slot, you've broken binary compatibility. Even if your design ultimately works, you'll be confined to hand assembly, as you won't be able to use compilers which assume this. \$\endgroup\$ Commented Sep 2, 2018 at 22:56
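The name-based port connections suggested in the comments can be sketched like this. Both modules are hypothetical and exist only to show the syntax; they are not taken from the question's design.

```verilog
// Hypothetical leaf module, used only to demonstrate the connection syntax.
module and_gate(output wire y, input wire a, input wire b);
    assign y = a & b;
endmodule

module top(output wire beq, input wire branch, input wire zero);
    // Named connections: the order no longer matters, and adding a port
    // to and_gate later cannot silently shift the existing connections.
    and_gate u_and(.a(branch), .b(zero), .y(beq));
endmodule
```

With positional connections, `and_gate u_and(beq, branch, zero);` would behave the same today, but it depends on the port order never changing.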

2 Answers

2
\$\begingroup\$

In any sequential always block, you must use non-blocking assignment (the less-than/equals sign, <=). If it's not the cause of this problem, it will cause one later for sure.
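Applied to a program counter, the advice above might look like the following minimal sketch. The module name `program_counter_nba` and its port names are hypothetical; only the use of `<=` inside the clocked block is the point.

```verilog
// Hypothetical program counter using non-blocking assignments.
module program_counter_nba(
    output reg  [31:0] pc,
    input  wire [31:0] next_pc,
    input  wire        clock,
    input  wire        reset
);
    // Sequential logic: every assignment in this block uses <=.
    always @(posedge clock or posedge reset) begin
        if (reset)
            pc <= 32'd0;     // asynchronous reset to address 0
        else
            pc <= next_pc;   // capture the selected address once per edge
    end
endmodule
```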

answered Sep 2, 2018 at 17:41
\$\endgroup\$
  • \$\begingroup\$ It works! Looks like changing to non blocking assignments and also making the register file outputs wires and continuously assigning from the selected registers fixed the problem. Thanks very much \$\endgroup\$ Commented Sep 2, 2018 at 18:42
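A register file along the lines described in the comment above could be sketched as follows. This is an assumption-laden illustration, not the asker's final code: the module name `register_file_sketch`, the `clock` port (absent from the question's register file), and the guard against writing register 0 are all additions made here.

```verilog
// Hypothetical register file: combinational reads, clocked writes.
module register_file_sketch(
    output wire [31:0] read_data1,
    output wire [31:0] read_data2,
    input  wire [4:0]  rs,
    input  wire [4:0]  rt,
    input  wire [4:0]  write_register,
    input  wire [31:0] write_data,
    input  wire        RegWrite,
    input  wire        clock   // assumption: not present in the question's module
);
    reg [31:0] registers [31:1];  // register 0 is hard-wired to zero

    // Reads are continuous assignments from the selected registers.
    assign read_data1 = (rs == 5'd0) ? 32'd0 : registers[rs];
    assign read_data2 = (rt == 5'd0) ? 32'd0 : registers[rt];

    // Writes happen once per clock edge, with a non-blocking assignment.
    always @(posedge clock) begin
        if (RegWrite && write_register != 5'd0)
            registers[write_register] <= write_data;
    end
endmodule
```

Keeping the write out of the combinational block matters for an instruction such as SUB R5 R4 R5, where the destination is also a source: a write inside `always @(*)` can feed the new value straight back into the ALU within the same cycle.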
1
\$\begingroup\$

The rule is simply: if one process writes to a variable synchronized to an event, and another process reads the same variable synchronized to the same event, you need to write using a non-blocking assignment (NBA), which ensures that the reading process uses the old value of the variable. If instead you use a blocking assignment in this situation, there's a race condition between getting the new or old value of the variable, because the execution ordering between the reading and writing processes is indeterminate.

By "event" I mean an edge of a clock or enable. And by "same" I mean the identical signal or some combinational expression of that signal.
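A two-stage shift register is a small case of the rule above: one process writes `q1` and another reads it on the same clock edge. The module name `nba_shift` and its ports are hypothetical.

```verilog
// Hypothetical two-stage shift register illustrating the NBA rule.
module nba_shift(
    output reg  q1,
    output reg  q2,
    input  wire d,
    input  wire clock
);
    // Writer: updates q1 on the clock edge.
    always @(posedge clock)
        q1 <= d;

    // Reader: samples q1 on the same edge. Because the writer uses <=,
    // this process always sees the value q1 held before the edge.
    always @(posedge clock)
        q2 <= q1;
endmodule
```

If the writer used `q1 = d;` instead, the simulator would be free to run either block first, so `q2` could receive either the old `q1` or the new `d`.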

answered Sep 2, 2018 at 21:06
\$\endgroup\$
