SystemVerilog FIFO problem with 6 bits in and 4 bits out

Question 1

I am trying to implement the following functionality in SystemVerilog, but I can't think of an efficient way to do it:

The picture shows two 6-bit input packets that arrive one after the other (each triggered by a clock edge). From these I need to deliver three 4-bit output packets.

These output packets need to be sent one after the other. From the first input packet I need to send out one 4-bit output packet, store the last 2 bits of that packet, and concatenate them with the first 2 bits of the next packet. Finally, the last 4 bits are sent as one 4-bit packet.
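To make the repacking concrete, here is a minimal simulation-only sketch. It assumes the first-arriving packet occupies the lower bits of the combined stream (the original picture is not available, so the bit order is an assumption), and the names `pkt0`, `pkt1`, `stream` and `out0..out2` are made up for illustration.

```systemverilog
// Sketch only: two 6-bit packets viewed as one 12-bit stream, cut into three 4-bit packets.
module repack_demo;
    logic [5:0]  pkt0, pkt1;        // first and second input packet
    logic [11:0] stream;            // both packets back to back
    logic [3:0]  out0, out1, out2;  // the three output packets

    initial begin
        pkt0 = 6'b101100;
        pkt1 = 6'b011010;

        // Assumption: the packet that arrives first sits in the low bits.
        stream = {pkt1, pkt0};

        out0 = stream[0 +: 4];  // pkt0[3:0]
        out1 = stream[4 +: 4];  // {pkt1[1:0], pkt0[5:4]}: the 2-bit residue plus 2 new bits
        out2 = stream[8 +: 4];  // pkt1[5:2]

        $display("%b %b %b", out0, out1, out2);  // prints "1100 1010 0110"
    end
endmodule
```

The middle slice is where the 2 stored bits of the first packet meet the first 2 bits of the second packet.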


N.B.: This is an example. The number of input packets can go up to around 1000, and the packet width can change, which means the 'residue' (the left-over bits) would change too.

If anyone can guide me on how to approach this problem, I'll be really grateful.

Edit: I think I've confused everyone. The problem is quite complex, so I didn't include everything for the sake of simplicity. Here it goes:

The module I'm trying to make has a higher input parallelism than output parallelism, which means that more data goes into the module than comes out of it. This leads to an accumulation of data inside the module with every transaction.
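The accumulation can be quantified with a short back-of-the-envelope calculation. The symbols are introduced here for illustration only: $W_{\text{in}}$ and $W_{\text{out}}$ are the packet widths, $\rho$ is the fraction of clock cycles that carry an input packet, and $B(n)$ is the number of bits still buffered after $n$ cycles, assuming one output packet leaves on every cycle and $\rho\,W_{\text{in}} \ge W_{\text{out}}$.

```latex
\begin{align*}
B(n) &= n\,\bigl(\rho\,W_{\text{in}} - W_{\text{out}}\bigr), \\
\rho = 1,\; W_{\text{in}} = 6,\; W_{\text{out}} = 4 \;&\Rightarrow\; B(n) = 2n \text{ bits}, \\
B(n) \text{ stays bounded} \;&\iff\; \rho \le \frac{W_{\text{out}}}{W_{\text{in}}} = \frac{4}{6} = \frac{2}{3}.
\end{align*}
```

Under these assumptions, a finite buffer on a single clock can only keep up if input packets arrive on at most two out of every three cycles, or if the input can be stalled.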

The input packets arrive continuously. Each input packet lasts one clock cycle. The requirement is to push output packets (smaller than the input packets) continuously as well. Each output packet should also last one clock cycle. The N.B. sentence I added is about parameterization: I want to be able to choose an arbitrary input and output parallelism. The 6-bit input and 4-bit output packets are just an illustration. It could just as well be a 16-bit input and a 10-bit output.

This would clearly be done with a state machine, but I don't know how to concatenate the bits from the previous messages with the incoming messages and maintain a continuous flow of output packets on every clock cycle. Hopefully this clears everything up.

Question 2

How fast are the data clocks? What is the relationship between the clock that loads the input packets vs the clock that updates the output packets? Are input packets coming in continuously? Does the packet size change during operation or is it fixed at design time? What is receiving the packets that are output by this block?

Question 3

The system is synchronous (same clock for everything). Yes, input packets arrive continuously. The packet size is fixed during operation.

Question 4

Gonna need some more concrete details. This is on the bit level? The packets are an arbitrary number of bits in length? What are the input and output interfaces? If the output is narrower than the input, is the output clock fast enough to transfer one per cycle, or do you need to transition to a faster clock?

Question 5

How can you take two packets in and write three packets out on the same clock? Your description does not make sense.

Question 6

Please elaborate on the "its width can change" section. Apart from that it is a trivial problem.

Question 7

What you need is a FIFO with different input and output bit widths. This can be achieved with an array using two index pointers and a register keeping track of the number of stored bits. The pointers will wrap around. The bit width of the storage needs to be a common multiple of the input and output widths.
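For a parameterized design, the storage width could be derived from the two packet widths instead of being typed in by hand. The following is a hedged sketch, not part of the original answer: the package `gearbox_pkg`, the functions `gcd` and `lcm`, and the parameters `W_IN`, `W_OUT` and `IDX_W` are all assumed names, and whether a given synthesis tool accepts a loop inside a constant function should be checked.

```systemverilog
// Sketch only: derive a legal storage width from the packet widths.
package gearbox_pkg;
    // Greatest common divisor by Euclid's algorithm.
    function automatic int unsigned gcd(input int unsigned a, input int unsigned b);
        int unsigned x = a;
        int unsigned y = b;
        int unsigned t;
        while (y != 0) begin
            t = y;
            y = x % y;
            x = t;
        end
        return x;
    endfunction

    // Least common multiple: the smallest width that both packet sizes divide evenly.
    function automatic int unsigned lcm(input int unsigned a, input int unsigned b);
        return (a / gcd(a, b)) * b;
    endfunction
endpackage

module gearbox_params #(
    parameter int unsigned W_IN  = 6,
    parameter int unsigned W_OUT = 4
);
    import gearbox_pkg::*;

    // 12 bits for the 6-in / 4-out example; any larger common multiple also works.
    localparam int unsigned STORE_BITS = lcm(W_IN, W_OUT);
    // Width of a pointer that can address every bit of the storage.
    localparam int unsigned IDX_W = $clog2(STORE_BITS);
endmodule
```

Because the storage width is a multiple of both packet widths, neither a `W_IN`-wide write nor a `W_OUT`-wide read ever straddles the wrap-around point.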

Here is some SystemVerilog code to get you started. Not fully tested or optimized, and I omitted the logic for empty / full / error (overflow) for you to figure out.

always_ff @(posedge clk) begin
    if (!rst_n) begin
        in_idx  <= '0; // input pointer
        out_idx <= '0; // output pointer
        bitcnt  <= '0; // number of stored bits
        out_vld <= 1'b0;
    end
    else begin
        bitcnt <= next_bitcnt;
        if (in_vld) begin
            store[ in_idx +: 6] <= in;
            in_idx <= (in_idx + 6) % STORE_BITS;
        end
        out_vld <= (bitcnt >= 4 || in_vld);
        if (out_vld) begin
            out_idx <= (out_idx + 4) % STORE_BITS;
        end
    end
end

always_comb begin
    next_bitcnt = bitcnt;
    if (in_vld) begin
        next_bitcnt += 6;
    end
    if (bitcnt >= 4 || in_vld) begin
        next_bitcnt -= 4;
    end
end

assign out = store[ out_idx +: 4 ];
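For readers who want something they can compile, the fragment above can be wrapped into a self-contained, parameterized module. This is only a sketch under stated assumptions, not the answer author's code: the module name, the parameters `W_IN`, `W_OUT` and `CNT_W`, the signals `avail`, `accept` and `emit`, and in particular the `in_rdy` backpressure flag are additions made here, and the "full" condition shown is just one possible choice.

```systemverilog
// Sketch only: parameterized wrapper around the pointer / bit-count scheme.
module gearbox #(
    parameter int unsigned W_IN       = 6,
    parameter int unsigned W_OUT      = 4,
    parameter int unsigned STORE_BITS = 12  // must be a common multiple of W_IN and W_OUT
) (
    input  logic             clk,
    input  logic             rst_n,
    input  logic             in_vld,
    input  logic [W_IN-1:0]  in,
    output logic             in_rdy,   // assumed flag: high when a packet can be stored
    output logic             out_vld,
    output logic [W_OUT-1:0] out
);
    localparam int unsigned CNT_W = $clog2(STORE_BITS + 1);

    logic [STORE_BITS-1:0] store;
    logic [CNT_W-1:0]      in_idx, out_idx, bitcnt, avail, next_bitcnt;
    logic                  accept, emit;

    always_comb begin
        // Refuse a packet that would overwrite bits not yet scheduled for output.
        in_rdy      = (bitcnt + W_IN <= STORE_BITS);
        accept      = in_vld && in_rdy;
        // Bits available once this cycle's packet (if any) is counted.
        avail       = accept ? CNT_W'(bitcnt + W_IN) : bitcnt;
        // Schedule an output packet for the next cycle if enough bits exist.
        emit        = (avail >= W_OUT);
        next_bitcnt = emit ? CNT_W'(avail - W_OUT) : avail;
    end

    always_ff @(posedge clk) begin
        if (!rst_n) begin
            in_idx  <= '0;
            out_idx <= '0;
            bitcnt  <= '0;
            out_vld <= 1'b0;
        end
        else begin
            bitcnt  <= next_bitcnt;
            out_vld <= emit;
            if (accept) begin
                store[in_idx +: W_IN] <= in;
                in_idx <= CNT_W'((in_idx + W_IN) % STORE_BITS);
            end
            if (out_vld) begin
                out_idx <= CNT_W'((out_idx + W_OUT) % STORE_BITS);
            end
        end
    end

    assign out = store[out_idx +: W_OUT];
endmodule
```

With the default parameters and `in_vld` held high, this sketch accepts packets until `bitcnt` reaches 8 and then deasserts `in_rdy` for a cycle, so the upstream source has to be able to stall. A low `out_vld` plays the role of the "empty" indication.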

Question 8

+1, also if the buffer is very large you might get better area/performance with a shift register shifting by 4b each time to avoid a big combinational mux? The 6b input could probably still use bit indexing, though the index increment would be different.
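One way to read the shift-register suggestion in the comment above is sketched below. It is an assumption-laden illustration, not code from the thread: the module name, `DEPTH`, `fill`, `base`, `push` and `in_rdy` are invented here, and the sketch assumes the consumer takes every valid output packet.

```systemverilog
// Sketch only: the output is always the lowest W_OUT bits; the buffer shifts down on each read.
module gearbox_shift #(
    parameter int unsigned W_IN  = 6,
    parameter int unsigned W_OUT = 4,
    parameter int unsigned DEPTH = 12   // assumed capacity in bits, at least W_IN
) (
    input  logic             clk,
    input  logic             rst_n,
    input  logic             in_vld,
    input  logic [W_IN-1:0]  in,
    output logic             in_rdy,
    output logic             out_vld,
    output logic [W_OUT-1:0] out
);
    localparam int unsigned CNT_W = $clog2(DEPTH + 1);

    logic [DEPTH-1:0] sr, sr_next;
    logic [CNT_W-1:0] fill, base, fill_next;  // fill = number of valid bits in sr
    logic             push;

    // Fixed read position: no read-side multiplexer is needed.
    assign out_vld = (fill >= W_OUT);
    assign out     = sr[W_OUT-1:0];

    always_comb begin
        // Fill level after this cycle's read; this is also the write position.
        base      = out_vld ? CNT_W'(fill - W_OUT) : fill;
        in_rdy    = (base + W_IN <= DEPTH);
        push      = in_vld && in_rdy;
        sr_next   = out_vld ? (sr >> W_OUT) : sr;
        if (push) sr_next[base +: W_IN] = in;
        fill_next = push ? CNT_W'(base + W_IN) : base;
    end

    always_ff @(posedge clk) begin
        if (!rst_n) begin
            sr   <= '0;
            fill <= '0;
        end
        else begin
            sr   <= sr_next;
            fill <= fill_next;
        end
    end
endmodule
```

The input side still uses a variable bit index, as the comment anticipates, but here the write position follows the fill level rather than a wrapping pointer, so `DEPTH` no longer has to be a common multiple of the two widths.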

Question 9

As I said, it is not optimized. Since I suspect this is a class assignment, I'm intentionally leaving certain things out. I tried to synthesize my code with Yosys (v0.3.0) on EDA Playground, which couldn't optimize store[ in_idx +: 6] <= in; (it generated logic for in_idx equal to 0,1,2,3,4,... even though only 0,6,12,... are reachable). I had to add a for-loop and an if-condition to get better synthesis results.
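The actual for-loop version is not shown in the thread. One plausible shape for it, offered purely as a guess, is to enumerate only the reachable write positions with constant indices. The module name, ports and parameters below are hypothetical, and whether this synthesizes better depends on the tool.

```systemverilog
// Sketch only: write port that decodes just the reachable offsets 0, W_IN, 2*W_IN, ...
module aligned_write #(
    parameter int unsigned W_IN       = 6,
    parameter int unsigned STORE_BITS = 12  // assumed to be a multiple of W_IN
) (
    input  logic                          clk,
    input  logic                          in_vld,
    input  logic [$clog2(STORE_BITS)-1:0] in_idx,
    input  logic [W_IN-1:0]               in,
    output logic [STORE_BITS-1:0]         store
);
    always_ff @(posedge clk) begin
        if (in_vld) begin
            // After unrolling, i is a constant in every iteration, so each
            // part-select has a fixed position and needs no barrel shifter.
            for (int unsigned i = 0; i < STORE_BITS; i += W_IN) begin
                if (in_idx == i) begin
                    store[i +: W_IN] <= in;
                end
            end
        end
    end
endmodule
```

The same idea would apply to the read side, looping over multiples of the output width.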

Greg · Answer 1 · 2019-08-20 20:04:48Z

