\$\begingroup\$

I have a pipelined CPU design in Verilog that uses a memory block whose reads are asynchronous. I used to have separate memories for instructions and data, and everything worked fine, but recently I merged them into a single memory block. This block therefore has an extra input for the PC and an extra output to read the instruction. All of this memory's reads are asynchronous (implementation provided below).

This new memory design produces several hold-time violations. I have been investigating and found several sources, this post among them, which explain that memories should not be read asynchronously.

I then tried changing to synchronous reads. The hold-time violations go away but, of course, the CPU no longer works, as my design relies entirely on getting data from memory in the same cycle in which the read address becomes known.

What is the problem with asynchronous-read memory in FPGAs?

Memory design

/**
 * Byte-enableable word-addressable memory. Allows to perform read/write
 * operations in only the specified bytes within the word whose
 * address is @param{addr}.
 *
 * In other words, instead of just allowing to read/write full words,
 * allows to read/write one or more bytes within the specified word.
 *
 * @param addr Address of the word whose bytes will be read/written.
 *
 * @param addr2_word Address to read (only read) an extra word. This is meant
 * to be used by program counters or similar. This address
 * must be 4-byte aligned.
 *
 * @param wd Data to be written (if @param{we} is enabled).
 *
 * @param be Byte selector. Allows to act only over the specified bytes.
 * Allowed values are:
 * 0001, 0010, 0100, 1000: to access individual bytes.
 * 0011, 1100: to access the lower/upper half-word respectively.
 * 1111: to access the full word.
 *
 * @param we Write enable. Enable to write, disable to read. In combination
 * with @param{be} allows to read/write subsets of the word.
 *
 * @param se Sign extend. On read, if enabled and not all bytes are selected, the
 * upper bits will be filled with 0 or 1 depending on the selected bytes'
 * MSB.
 *
 * @param rd Read value.
 * @param rd2_word Second word read. @see{addr2_word}
 * @param clk Clock signal.
 *
 * @tparam N Optional parameter. Sets the memory maximum size in words.
 * @tparam INIT_VALS See notes.
 *
 * @warning Write is sync. with the provided clock signal. Read is
 * asynchronous.
 *
 * @warning Address must be word-aligned when accessing the full word
 * (@see{be}) and half-word-aligned when accessing half words.
 *
 * @note This module allows to pre-load information at synthesis time by
 * enabling `CONFIG_ENABLE_MEM_DEFAULT_VALS`, setting the optional parameter
 * `INIT_VALS` to a number different than 0 and by defining the macro
 * `INIT_MEM_F` in a file named `mem_default_vals.svh`. The macro must be
 * like:
 *
 * `define INIT_MEM_F(mem_reg) \
 * mem_reg[0] = 32'haa; \
 * mem_reg[1] = 32'hbb;
 *
 * Beware that the above macro must not go out of bounds with regard to
 * the parameter @param{N}.
 */
module mem_be #(parameter N = 64, INIT_VALS = 0)(
    input  wire  [31:0] addr,
    input  wire  [31:0] addr2_word,
    input  wire  [31:0] wd,
    input  wire  [3:0]  be,
    input  wire         we,
    input  wire         se,
    output logic [31:0] rd,
    output wire  [31:0] rd2_word,
    input  wire         clk
);
    reg [31:0] _mem [N-1:0];

    //
    // Write logic
    //
    always_ff @(posedge clk) begin
        if (we) begin
            case (be)
                4'b0001: _mem[addr[31:2]][7:0]   <= wd[7:0];
                4'b0010: _mem[addr[31:2]][15:8]  <= wd[7:0];
                4'b0100: _mem[addr[31:2]][23:16] <= wd[7:0];
                4'b1000: _mem[addr[31:2]][31:24] <= wd[7:0];
                4'b0011: _mem[addr[31:2]][15:0]  <= wd[15:0];
                4'b1100: _mem[addr[31:2]][31:16] <= wd[15:0];
                4'b1111: _mem[addr[31:2]]        <= wd;
                default: _mem[addr[31:2]]        <= 32'hffffffff;
            endcase
        end
    end

    //
    // Read logic
    //
    wire [31:0] word;
    wire [7:0]  b0, b1, b2, b3;
    wire        s0, s1, s2, s3;

    assign word = _mem[addr[31:2]];

    assign b0 = word[7:0];
    assign b1 = word[15:8];
    assign b2 = word[23:16];
    assign b3 = word[31:24];

    assign s0 = se ? b0[7] : 1'b0;
    assign s1 = se ? b1[7] : 1'b0;
    assign s2 = se ? b2[7] : 1'b0;
    assign s3 = se ? b3[7] : 1'b0;

    always_comb begin
        case (be)
            4'b0001: rd = {{24{s0}}, b0};
            4'b0010: rd = {{24{s1}}, b1};
            4'b0100: rd = {{24{s2}}, b2};
            4'b1000: rd = {{24{s3}}, b3};
            4'b0011: rd = {{16{s1}}, b1, b0};
            4'b1100: rd = {{16{s3}}, b3, b2};
            4'b1111: rd = word;
            default: rd = 32'hffffffff;
        endcase
    end

    assign rd2_word = _mem[addr2_word[31:2]];
endmodule
asked Nov 21, 2022 at 11:38
\$\endgroup\$

1 Answer

\$\begingroup\$

There is no problem with asynchronous reads in FPGAs as such, provided the device has hardware blocks that map well to the logic you've designed.

As it is now, the compiler cannot find a hardware equivalent for the behavior you implemented. As a result, the RAM will be built out of logic elements, will have long combinational paths, and will fail timing at frequencies we consider "child's play" today.

If you really want asynchronous memory blocks (no inputs or outputs registered), see your FPGA vendor's IP generators. Xilinx devices can use LUTs as small memory blocks (distributed RAM), which behave as true asynchronous-read RAM.
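
As a rough illustration of the LUT-based option, here is a minimal sketch of a memory with a synchronous write and an asynchronous read, written so that a synthesis tool may map it to LUT RAM (called distributed RAM in Xilinx documentation). The module name `lutram_async` and its ports are hypothetical, and the `ram_style` attribute is the Vivado spelling; other tools use different attributes, so treat it as an assumption to check against your tool's documentation.

```systemverilog
// Sketch only: hypothetical module, not taken from the question's design.
module lutram_async #(parameter int N = 64, parameter int W = 32)(
    input  wire                 clk,
    input  wire                 we,
    input  wire [$clog2(N)-1:0] waddr,  // write address (word index)
    input  wire [W-1:0]         wd,     // write data
    input  wire [$clog2(N)-1:0] raddr,  // read address (word index)
    output wire [W-1:0]         rd      // read data, combinational
);
    // Assumption: Vivado-style attribute requesting LUT-based RAM.
    (* ram_style = "distributed" *) logic [W-1:0] mem [N-1:0];

    // Synchronous write
    always_ff @(posedge clk)
        if (we) mem[waddr] <= wd;

    // Asynchronous (combinational) read
    assign rd = mem[raddr];
endmodule
```

A second read port, such as the instruction fetch in the question, would need either a second copy of the memory written in parallel or a multi-port LUT RAM primitive, if the target device offers one. LUT RAM is only economical for small memories.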

But you would still do better to rethink the approach. See if you can have the address lines registered. Maybe the address is already registered somewhere in the hierarchy above the RAM block; in that case the RAM isn't really asynchronous, and hard RAM blocks could have been used. However, the additional functionality you designed into your RAM prevents the compiler from inferring hard RAM blocks. Look into your FPGA's documentation, see which hardware primitive is closest to what you need, and adjust your overall system design to rely on the available hardware features.
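
To make that concrete, here is a minimal sketch of a memory shaped like a typical hard RAM block: the address and read data are registered, and the only extra feature is a per-byte write enable. The module name `bram_be_sync` and its ports are hypothetical. Whether a given tool actually maps it to block RAM is an assumption that should be confirmed in the synthesis report.

```systemverilog
// Sketch only: hypothetical module, not taken from the question's design.
module bram_be_sync #(parameter int N = 64)(
    input  wire                 clk,
    input  wire [$clog2(N)-1:0] addr,  // word address
    input  wire [3:0]           be,    // per-byte write enable, 0 = no write
    input  wire [31:0]          wd,    // write data, already lane-aligned
    output logic [31:0]         rd     // read data, valid one cycle after addr
);
    logic [31:0] mem [N-1:0];

    always_ff @(posedge clk) begin
        // Each byte lane is written only if its enable bit is set.
        for (int i = 0; i < 4; i++)
            if (be[i]) mem[addr][8*i +: 8] <= wd[8*i +: 8];

        // Registered read: returns the word stored before this edge's write.
        rd <= mem[addr];
    end
endmodule
```

In this arrangement the byte selection and sign extension from the question's read path, and the shifting of `wd[7:0]` into the right byte lane on writes, would live in ordinary logic outside the memory. The pipeline then has to tolerate one cycle of read latency, for example by presenting the next PC to the memory one stage earlier.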

answered Nov 21, 2022 at 19:08
\$\endgroup\$
