FPGA BRAM initialization

Question 1

I need to create a lot of BRAM blocks in my (Altera) design. Each one has unique memory contents, determined a priori using an algorithm.

Previously, I set a parameter on each BRAM instance to load its contents from a .MIF file, but this made compilation take forever.

Another approach I concocted was to allow "dynamic" population of the memory; the host controller would be able to send symbols to the FPGA to populate its BRAM blocks with. This is a little more complicated than I would like.

I was hoping there was a way to initialize BRAMs with a literal. Each block is only 1 bit x 256, so the resulting HDL code wouldn't even look that ghastly.

Does anyone know how to do this with Altera's BRAM IP, or perhaps even Xilinx's?

UPDATE 8/31/2016:

Hey guys, I actually found a very easy almost "turn key" solution to BRAM initialization on Altera. In Quartus, there are built-in VHDL and Verilog templates which can automatically infer BRAM. These templates have memory initialization utilities built-in which the user can modify to populate with whatever data they want (such as a bit vector from a generic). See this Quartus help page.

Question 2

It's definitely possible to infer block RAMs directly from HDL. Depending on the size and the configuration, the toolchain could place them in block RAM or in distributed RAM.

Here is a very simple Verilog example:

module rom
(
    input        clk,
    input  [3:0] addr,
    output [7:0] data
);

// 16 x 8-bit memory array
reg [7:0] mem[2**4-1:0];
reg [7:0] output_reg = 8'd0;

// Contents are fixed at synthesis time
initial begin
    mem[4'h0] = 8'h00;
    // ...
    mem[4'hF] = 8'h00;
end

assign data = output_reg;

// Registered (synchronous) read
always @(posedge clk) begin
    output_reg <= mem[addr];
end

endmodule

I have used similar code in both the Xilinx and Altera toolchains, and in general both tools will infer the proper memory components. The same approach also works for RAMs. It can also be used to fill a lookup table with values that are computed at synthesis time, though Quartus has historically had some serious issues with this, particularly with floating-point and trig calculations at synthesis time.

Example of precomputing ROM contents in Verilog: https://github.com/alexforencich/verilog-dsp/blob/master/rtl/sine_dds_lut.v
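As a rough illustration of computing ROM contents at synthesis time, here is a minimal sketch that fills a 16-entry table with squares using a loop in the initial block. The module name square_rom and the table contents are hypothetical, chosen only for illustration. It sticks to integer arithmetic, on the assumption that this avoids the floating-point trouble mentioned above.

```verilog
// Hypothetical example: 16 x 8-bit ROM holding the squares of 0..15.
// The loop is evaluated when the design is elaborated, so it should
// not turn into any logic.
module square_rom
(
    input            clk,
    input      [3:0] addr,
    output reg [7:0] data = 8'd0
);

reg [7:0] mem [0:15];
integer i;

initial begin
    for (i = 0; i < 16; i = i + 1)
        mem[i] = i * i;   // largest value is 15*15 = 225, which fits in 8 bits
end

// Registered read, as in the example above
always @(posedge clk) begin
    data <= mem[addr];
end

endmodule
```

Whether a table this small ends up in block RAM or in LUTs is up to the tool, as noted at the top of this answer.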

Question 3

Funny you should ask. I have recently been fighting with Xilinx's Vivado toolchain, trying to migrate an old ISE design that includes some BRAM-based ROMs whose contents are defined by .coe (coefficient) files.

I finally had to give up on it — the Vivado IP generator tools kept losing track of the .coe files, causing synthesis to fail altogether. I never found a workaround that worked twice in a row...

I ended up writing a relatively simple Perl script that converts a .coe file into a pure-RTL case statement, and that has been working just fine. Vivado correctly infers a BRAM-based ROM from it and everything is good. I don't have the script on this machine at the moment, but if you're interested in it, I'll attach it later. It would be easy to adapt it to take .mif files as input as well.

Here's a snippet from the script documentation that explains what it does:

# The standard .coe file contains two statements:
# =========
# memory_initialization_radix=2;
# memory_initialization_vector=
# 0000000000000000,
# 0000010100000010,
# ....
# 0000000000000000;
# =========
# The argument to memory_initialization_radix gives the radix (in decimal)
# used for the individual values of the memory_initialization_vector list.
#
# For now, we're just going to make the following assumptions:
# * The .coe file is laid out exactly as shown above, with one word per line.
# * The radix is either 2 or 16.
# * The number of words indicates the address size of the ROM.
# * The width of the words indicates the data size of the ROM.
#
# We will convert the .coe data into a Verilog source file with the following
# structure:
# 
# module <filename> (
#     input [5:0] addra,
#     output reg [15:0] douta,
#     input clka
# );
#
# always @(posedge clka) begin
#     case (addra)
#         6'h00: douta <= 16'b0000000000000000;
#         6'h01: douta <= 16'b0000010100000010;
#         ....
#         6'h3F: douta <= 16'b0000000000000000;
#     endcase
# end
# endmodule

Note that the port names are the same as those used by the IP generator, so this behavioral code module is a direct replacement for the generated module.

Question 4

Could you elaborate on this case statement? I didn't know Vivado was that smart! I've always thought that if you wanted to use BRAM, then you had to instantiate an IP. Could you send me a code snippet?

Question 5

Since the script's own documentation shows such an example, I've added it above. I want to tweak a few minor things before showing the code, however.

Question 6

You can always infer a BRAM by putting the 'block' attribute on a memory declaration (in the Xilinx tools this is ram_style = "block", which maps the memory onto BRAM in the FPGA architecture). You don't need any IP core to do this, and you can initialize it with any data you want, in much the same way you'd assign a value to a signal or vector.

I've been using this for Xilinx FPGA for some time now and I think this will also work for the Altera domain as this is a piece of code and not an IP core parameter.
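A minimal sketch of what this might look like in Verilog, for a 256 x 1 memory like the one in the question. The module name, port names and fill pattern are made up for illustration. The attribute shown is the Xilinx spelling; for Altera, I'm assuming the corresponding attribute is ramstyle with a device-specific block name such as "M9K", so check the tool documentation before relying on it. For a purely read-only memory, the Xilinx tools also document a separate rom_style attribute.

```verilog
// Sketch only: 256 x 1 single-port RAM with initial contents,
// steered into block RAM with a synthesis attribute.
module bram_256x1
(
    input       clk,
    input       we,
    input [7:0] addr,
    input       din,
    output reg  dout = 1'b0
);

// Xilinx-style attribute. Assumed Altera equivalent:
// (* ramstyle = "M9K" *), with the block name depending on the family.
(* ram_style = "block" *) reg mem [0:255];

integer i;
initial begin
    // Placeholder contents (alternating 0/1); replace with the real data.
    for (i = 0; i < 256; i = i + 1)
        mem[i] = i[0];
end

always @(posedge clk) begin
    if (we)
        mem[addr] <= din;
    dout <= mem[addr];   // registered read; returns the old value during a write
end

endmodule
```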

Question 7

I have used Xilinx BRAMs initialised to constants contained within the programming stream. That was back in the days when V4 had just come out, but I hope the facility hasn't been removed in the current tools.

There is a flag in Xilinx CoreGen on the memory which you set to 'ROM' instead of 'RAM', and then put the initialisation tables in line in the VHDL to be compiled.

I know nothing of Altera, but it's such an obvious facility to provide it must be there, just need to dig around for it.

Question 8

I've been digging around for a week now and it seems like for Altera you can only initialize BRAM either "online" or with MIF/HEX files. :\ I'll keep looking, though.

Question 9

As well as the other answers, you can also do the initialisation from a single parameter. Something like this Verilog example should suffice:

module paramInitialisedROM #(
    parameter WIDTH = 1,  //Width of each word in bits
    parameter DEPTH = 8,  //Address width in bits, so the ROM holds 2**DEPTH words
    parameter MEM_INIT = 256'd120310230123 //Or whatever, key thing is to make it the correct size (WIDTH * 2**DEPTH bits).
)(
    input clock,
    input [DEPTH-1:0] address,
    output reg [WIDTH-1:0] data
);

localparam WORDS = 1<<DEPTH; //Number of words in ROM.

// Create an inferred ROM of the correct size
reg [WIDTH-1:0] rom [WORDS-1:0]; //It is possible to add an altera directive
                                 //to this to specify BRAM or MLAB, but I forget
                                 //the syntax of that.

// ROM Initialisation
integer idx;
integer offset;
initial begin
    for (idx = 0; idx < WORDS; idx=idx+1) begin //Count through each word in the rom
        offset = idx * WIDTH;               //The offset into the parameter is the current index times the width
        rom[idx] = MEM_INIT[offset+:WIDTH]; //Set the current rom word to the correct chunk in the parameter
    end
end

//Clocked Read from ROM
always @ (posedge clock) begin
    data <= rom[address];
end

endmodule

Basically, when you instantiate the module, make sure to provide a parameter of the correct size. Note that DEPTH is the number of address bits, not the number of words. For example, if you make it a 16x4b ROM (WIDTH = 4, DEPTH = 4), you would need the parameter to be specified as 64 bits or larger, e.g.:

paramInitialisedROM #(
    .WIDTH(4),
    .DEPTH(4),
    .MEM_INIT(64'd120310230123)
) aRomInstance (
    .clock(clock),
    .address(address),
    .data(data)
);

The data in the parameter is organised so that the WIDTH LSBs are used for the first word. The next WIDTH chunk is the next word, and so on until all words are filled.
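To make the ordering concrete, here is a small simulation-only sketch that slices a 16-bit literal into four 4-bit words with the same indexed part-select. The module name and the literal 16'hDCBA are arbitrary and purely illustrative.

```verilog
// Simulation-only illustration of how a packed init value is split into words.
module mem_init_slice_demo;

localparam        WIDTH    = 4;
localparam [15:0] MEM_INIT = 16'hDCBA;   // four 4-bit words packed together

integer idx;

initial begin
    // Word 0 comes from bits [3:0] (4'hA), word 1 from bits [7:4] (4'hB),
    // word 2 from bits [11:8] (4'hC) and word 3 from bits [15:12] (4'hD).
    for (idx = 0; idx < 4; idx = idx + 1)
        $display("word %0d = %h", idx, MEM_INIT[idx*WIDTH +: WIDTH]);
end

endmodule
```

So when writing the literal by hand, the word for address 0 goes at the right-hand end.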

The for loop during initialisation will be fully optimised out by the compiler, so it doesn't cost any logic. In fact the whole initial block is converted to be the initial value of the memory.

As it is inferred memory, it should work perfectly well in both Altera and Xilinx tools.

Note: I haven't test-compiled this, but the principle should work fine. If you find any syntax errors, feel free to edit in corrections.

score 3 · Accepted Answer · 2016-08-30 17:32:05Z

