I am trying to create an output-layer classifier for a neural network implemented on an FPGA (in VHDL). The classifier should simply return the index of the largest std_logic_vector in the output_counters array. Currently the design works as in the code snippet below, but I would like it to be configurable by generics like the rest of the network. Instead of two output classes (0 and 1 in the case below), there could be up to 10 output classes, handled by the synthesis tools without the designer having to modify the RTL code. Ideally there is one flip-flop per output class, and the classifier writes a '1' to the winning class's flip-flop.
--- output layer basic classifier -----------------------------------------------------------------------------
class_0_winner <= '1' when output_counters(0) > output_counters(1) else '0';
class_1_winner <= '1' when output_counters(1) > output_counters(0) else '0';

process (clk_i)
begin
    if rising_edge(clk_i) then
        if (rstn_i = '0' or classifier_regs_rst = '1') then
            class_0 <= '0';
            class_1 <= '0';
        elsif (classifier_regs_en = '1') then
            class_0 <= class_0_winner;
            class_1 <= class_1_winner;
        end if;
    end if;
end process;

class_0_o <= class_0;
class_1_o <= class_1;
---------------------------------------------------------------------------------------------------------------
I created the process shown below to return the array index of the largest std_logic_vector. The array index can then be used to write to the corresponding output class flip-flop. However, if two std_logic_vectors are equal, the classifier reports the first one as the largest, which leads to an incorrect classification.
--- finding largest vector ------------------------------------------------------------------------------------
process (clk_100MHz)
    variable max_slv_temp   : std_logic_vector(3 downto 0);
    variable max_index_temp : std_logic_vector(3 downto 0);
begin
    if rising_edge(clk_100MHz) then
        max_slv_temp := "0000";
        for i in 0 to output_counters'length-1 loop
            if (rstn_i = '0') then
                max_slv_temp   := "0000";
                max_index_temp := "0000";
            elsif (output_counters(i) > max_slv_temp) then
                -- strictly greater: on a tie the earlier index is kept
                max_slv_temp   := output_counters(i);
                max_index_temp := std_logic_vector(to_unsigned(i, max_index_temp'length));
            end if;
        end loop;
        max_index <= max_index_temp;
    end if;
end process;
---------------------------------------------------------------------------------------------------------------
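For illustration, one way to make the tie case detectable is to also count how many counters equal the running maximum and raise a flag when the winner is not unique. This is only a sketch: tie_flag is a new signal I am introducing here, and the comparisons are assumed to behave as unsigned, as in the code above.

--- tie detection sketch (tie_flag is an assumed new signal) ----------------------------------------------------
process (clk_100MHz)
    variable max_slv_temp : std_logic_vector(3 downto 0);
    variable match_count  : natural range 0 to output_counters'length;
begin
    if rising_edge(clk_100MHz) then
        max_slv_temp := "0000";
        match_count  := 0;
        for i in 0 to output_counters'length-1 loop
            if (output_counters(i) > max_slv_temp) then
                -- new maximum found: restart the match count
                max_slv_temp := output_counters(i);
                match_count  := 1;
            elsif (output_counters(i) = max_slv_temp) then
                -- another counter equals the current maximum
                match_count := match_count + 1;
            end if;
        end loop;
        if match_count > 1 then
            tie_flag <= '1';    -- no unique winner this cycle
        else
            tie_flag <= '0';
        end if;
    end if;
end process;
---------------------------------------------------------------------------------------------------------------

The winner flip-flops could then be gated with tie_flag so that no class is asserted when there is no unique maximum.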
I could also implement an FSM and a single comparator that loops through the array, since time is not much of an issue here (100 cycles would be fine, for example) and this would save resources, but I would prefer a solution that does not involve that. How would you folks implement this classifier in a generic way? (By the way, I already have the generics working, and there is one for num_outputs.)
1 Answer
I would implement it as a clocked process. You said you do not want to use an FSM, so my implementation runs continuously. This is my code:
library ieee;
use ieee.std_logic_1164.all;

package find_largest_package is
    -- VHDL-2008: array of unconstrained std_logic_vector,
    -- constrained at instantiation by the generics
    type t_std_logic_vector_array is array (natural range <>) of std_logic_vector;
end package;

library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std_unsigned.all;  -- VHDL-2008: unsigned ">" for std_logic_vector

library work;
use work.find_largest_package.all;

entity find_largest is
    generic (
        g_num_outputs : natural := 10;
        g_data_width  : natural := 16
    );
    port (
        clk_i             : in  std_logic;
        output_counters_i : in  t_std_logic_vector_array(g_num_outputs-1 downto 0)(g_data_width-1 downto 0);
        res_i             : in  std_logic;
        class_vector_o    : out std_logic_vector(g_num_outputs-1 downto 0)
    );
end entity find_largest;

architecture struct of find_largest is
    -- constrained ranges keep the counters as narrow as possible
    signal index         : natural range 0 to g_num_outputs;
    signal index_maximum : natural range 0 to g_num_outputs-1;
    signal maximum       : std_logic_vector(g_data_width-1 downto 0);
begin
    process(res_i, clk_i)
    begin
        if res_i = '1' then                      -- asynchronous reset
            class_vector_o <= (others => '0');
            maximum        <= (others => '0');
            index          <= 0;
            index_maximum  <= 0;
        elsif rising_edge(clk_i) then
            if index < g_num_outputs then
                -- compare one counter per clock cycle
                index <= index + 1;
                if output_counters_i(index) > maximum then
                    index_maximum <= index;
                    maximum       <= output_counters_i(index);
                end if;
            else
                -- scan finished: drive a one-hot winner vector, then restart
                class_vector_o                <= (others => '0');
                class_vector_o(index_maximum) <= '1';
                index   <= 0;
                maximum <= (others => '0');
            end if;
        end if;
    end process;
end architecture;
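A hypothetical instantiation for the ten-class, 4-bit case from the question could look like this (the connected signal names are placeholders, not part of the answer):

-- Hypothetical instantiation; the connected signal names are placeholders.
classifier_inst : entity work.find_largest
    generic map (
        g_num_outputs => 10,
        g_data_width  => 4
    )
    port map (
        clk_i             => clk_i,
        output_counters_i => output_counters,
        res_i             => classifier_regs_rst,
        class_vector_o    => class_winner_vector
    );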
- David777 (Jun 25, 2024 at 15:49): Thanks for the answer. Would you implement this with an FSM if you had the choice?
- Matthias Schweikart (Jun 25, 2024 at 16:13): If there is already a signal which can be used to start the FSM, then yes. With an FSM it is easier to verify, because there would be a ready signal that could be used to trigger checking the result. And of course, only running the comparison when it is really needed is better than running it all the time. (A sketch of such a handshake follows these comments.)
- David777 (Jun 25, 2024 at 16:23): Thanks, an FSM can be easily integrated, so I am considering implementing it instead.
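As a possible variation on the answer, here is a minimal sketch of the start/ready handshake described in these comments. The start_i and ready_o ports and the state signal are hypothetical additions, not part of the answer's entity.

-- Sketch only: the answer's scan wrapped in a start/ready handshake.
-- Assumed extra declarations:
--   type t_state is (idle, scanning);
--   signal state : t_state;
--   ports start_i : in std_logic; ready_o : out std_logic;
process(res_i, clk_i)
begin
    if res_i = '1' then
        state          <= idle;
        index          <= 0;
        maximum        <= (others => '0');
        ready_o        <= '0';
        class_vector_o <= (others => '0');
    elsif rising_edge(clk_i) then
        case state is
            when idle =>
                ready_o <= '0';
                if start_i = '1' then            -- begin a scan on request
                    index   <= 0;
                    maximum <= (others => '0');
                    state   <= scanning;
                end if;
            when scanning =>
                if index < g_num_outputs then
                    index <= index + 1;
                    if output_counters_i(index) > maximum then
                        index_maximum <= index;
                        maximum       <= output_counters_i(index);
                    end if;
                else
                    -- scan finished: publish the one-hot result
                    class_vector_o                <= (others => '0');
                    class_vector_o(index_maximum) <= '1';
                    ready_o <= '1';              -- result valid for one cycle
                    state   <= idle;
                end if;
        end case;
    end if;
end process;

A testbench can then wait for ready_o instead of guessing when the scan completes.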