Implementing a 4-tap FIR with unity coefficients: is this code efficient in power and area?
always@(posedge Clk)
begin
//unit delays using flip flops
xn0<=Xin; //x[n]
xn1<=xn0; //x[n-1]
xn2<=xn1; //x[n-2]
xn3<=xn2; //x[n-3]
add01<=xn0+xn1;
add23<=xn2+xn3;
add_all<=add01+add23;
Yout<=add_all;
end//always
Are you okay with only getting the output 4 cycles after the inputs? What I mean is this filter is giving \$y[n] = x[n-4] + x[n-5] + x[n-6] + x[n-7]\$ (if I calculated correctly). Is that the filter you want? Because often we want the output to depend on the most recent inputs available. But other times a delay may be acceptable to save power or area. – The Photon, Sep 18, 2017 at 15:02
No, I want \$x[n]+x[n-1]+x[n-2]+x[n-3]\$. – Adigh, Sep 18, 2017 at 21:17
2 Answers
You are creating additional pipeline steps for the intermediates. This introduces an additional delay, as The Photon suggested in the comments, and the output is \$x[n-7] + x[n-6] + x[n-5] + x[n-4]\$.
The additional pipeline steps can give you a higher \$f_{max}\$, but that is probably not what you want.
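The four-cycle delay can be checked by tracing each register in the question's code. The sketch below assumes \$x[n]\$ is the value on Xin during cycle \$n\$, so a register holds during cycle \$n\$ whatever was at its input during cycle \$n-1\$; add01 and add23 denote the two registered partial sums.

```latex
\begin{align}
\mathit{xn0}[n] &= x[n-1], \quad \mathit{xn1}[n] = x[n-2], \quad \mathit{xn2}[n] = x[n-3], \quad \mathit{xn3}[n] = x[n-4] \\
\mathit{add01}[n] &= \mathit{xn0}[n-1] + \mathit{xn1}[n-1] = x[n-2] + x[n-3] \\
\mathit{add23}[n] &= \mathit{xn2}[n-1] + \mathit{xn3}[n-1] = x[n-4] + x[n-5] \\
\mathit{add\_all}[n] &= \mathit{add01}[n-1] + \mathit{add23}[n-1] = x[n-3] + x[n-4] + x[n-5] + x[n-6] \\
\mathit{Yout}[n] &= \mathit{add\_all}[n-1] = x[n-4] + x[n-5] + x[n-6] + x[n-7]
\end{align}
```

Each registered stage after the delay line (the partial sums, the total, and the output) adds one cycle, which is where the extra latency comes from.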
The minimum delay variant \$x[n-3] + x[n-2] + x[n-1] + x[n]\$ is more complex, because it would end in a combinatorial stage, and adding more combinatorial outputs would reduce \$f_{max}\$. That stage would have
assign Yout = Xin + xn1 + xn2 + xn3;
to calculate the output, and
always @ (posedge Clk)
begin
xn1 <= Xin;
xn2 <= xn1;
xn3 <= xn2;
end
You see that there is no register stage for xn0 here, because that is part of the component feeding Xin, which is expected to be synchronous to Clk as well.
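For reference, the two fragments above could be put together into one module roughly as follows. This is only a sketch: the module name fir4_min_delay, the WIDTH parameter, and the assumption of unsigned samples are not from the original code. The output is two bits wider than the input because the sum of four WIDTH-bit values needs WIDTH+2 bits.

```verilog
// Sketch only: module name, WIDTH parameter and unsigned data are assumptions.
module fir4_min_delay #(
    parameter WIDTH = 8                 // assumed input sample width
) (
    input  wire             Clk,
    input  wire [WIDTH-1:0] Xin,        // x[n], assumed synchronous to Clk
    output wire [WIDTH+1:0] Yout        // two extra bits for the four-term sum
);
    reg [WIDTH-1:0] xn1, xn2, xn3;      // x[n-1], x[n-2], x[n-3]

    // Delay line: three registers, no register on Xin itself.
    always @(posedge Clk) begin
        xn1 <= Xin;
        xn2 <= xn1;
        xn3 <= xn2;
    end

    // Combinational sum; operands are zero-extended so no carry is lost.
    assign Yout = {2'b00, Xin} + {2'b00, xn1} + {2'b00, xn2} + {2'b00, xn3};
endmodule
```

Because Yout depends combinationally on Xin, whatever drives Xin and whatever consumes Yout share one clock period with the adders, which is the \$f_{max}\$ cost mentioned above.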
A one-clock delay compromise would be
assign Yout = xn0 + xn1 + xn2 + xn3;
always @ (posedge Clk)
begin
xn0 <= Xin;
xn1 <= xn0;
xn2 <= xn1;
xn3 <= xn2;
end
This reduces routing complexity at the cost of one cycle delay. Whether that is a good trade-off is an engineering decision.
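One way to confirm the latency of the one-clock variant is a small self-checking testbench. The sketch below is an assumption-laden example, not part of the original answer: the module names fir4_one_delay and fir4_one_delay_tb, the 8-bit width, the initial register values, and the ramp stimulus are all made up for illustration. It keeps a reference copy of the last four samples that have been clocked in and compares their sum with Yout on every falling clock edge.

```verilog
`timescale 1ns/1ps

// Sketch only: module names, WIDTH and the stimulus pattern are assumptions.
module fir4_one_delay #(
    parameter WIDTH = 8
) (
    input  wire             Clk,
    input  wire [WIDTH-1:0] Xin,
    output wire [WIDTH+1:0] Yout
);
    // Initial values so that the first sums are defined in simulation.
    reg [WIDTH-1:0] xn0 = 0, xn1 = 0, xn2 = 0, xn3 = 0;

    always @(posedge Clk) begin
        xn0 <= Xin;
        xn1 <= xn0;
        xn2 <= xn1;
        xn3 <= xn2;
    end

    assign Yout = xn0 + xn1 + xn2 + xn3;
endmodule

module fir4_one_delay_tb;
    reg        Clk = 1'b0;
    reg  [7:0] Xin = 8'd0;
    wire [9:0] Yout;

    reg  [7:0] hist [0:3];   // reference copy of xn0..xn3
    reg  [9:0] expected;
    integer    i, errors;

    fir4_one_delay #(.WIDTH(8)) dut (.Clk(Clk), .Xin(Xin), .Yout(Yout));

    always #5 Clk = ~Clk;    // rising edges at 5, 15, 25, ...

    initial begin
        errors = 0;
        for (i = 0; i < 4; i = i + 1) hist[i] = 8'd0;

        for (i = 0; i < 20; i = i + 1) begin
            @(negedge Clk);
            // Yout should equal the sum of the four samples already clocked in.
            expected = hist[0] + hist[1] + hist[2] + hist[3];
            if (Yout !== expected) begin
                errors = errors + 1;
                $display("mismatch at %0t: Yout=%0d expected=%0d", $time, Yout, expected);
            end
            // Apply the next sample; it is captured on the next rising edge.
            Xin = Xin + 8'd3;
            hist[3] = hist[2];
            hist[2] = hist[1];
            hist[1] = hist[0];
            hist[0] = Xin;
        end

        if (errors == 0) $display("PASS");
        else             $display("FAIL: %0d mismatches", errors);
        $finish;
    end
endmodule
```

The same testbench structure could be pointed at the other variants by changing how many cycles the reference history lags behind Xin.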
Do it as follows:
assign add01 = xn0+xn1;
assign add23 = xn2+xn3;
assign add_all = add01+add23;
assign Yout = add_all;
always @ (posedge Clk or negedge reset_n)
begin
if(~reset_n)
begin
//Yout is driven by the assign above, so it is not reset here
xn0 <= 'd0;
xn1 <= 'd0;
xn2 <= 'd0;
xn3 <= 'd0;
end
else
begin
//unit delays using flip flops
xn0 <=Xin; //x[n]
xn1 <=xn0; //x[n-1]
xn2 <=xn1; //x[n-2]
xn3 <=xn2; //x[n-3]
end
end //always