I want to convert the IEEE Double value computed in my code to Integer.
E.g. I have computed: X = 64'hxxxxxxxxxxxxxxxx; Now I want to use it as the index of an array: some_array[X];
How can I do it? Is there an IP core or a third-party core for this conversion? Or a synthesizable method/algorithm?
- Use (int)floor(X) or (int)ceil(X) – Alexxx, Feb 26, 2015 at 16:51
- Is it for Verilog? @Alex – user263210, Feb 26, 2015 at 16:56
- Sorry. I thought you were talking about C. – Alexxx, Feb 26, 2015 at 17:47
- Is X intentionally a real (Double)? – pre_randomize, Feb 26, 2015 at 20:35
- You have cross-posted the same question on StackOverflow. Please don't cross-post; StackExchange policy is against cross-posting. You can flag a question for the moderators to migrate it to another stack, if you so choose. – Nick Alexeev, Feb 26, 2015 at 20:39
4 Answers
Assuming your number is normalized, positive, and not a NaN, infinity, or some other non-convertible pattern, you have to take the mantissa, add a "1" bit at the left of it, and keep as many of its leftmost bits as the unbiased exponent plus one (the leading "1" plus one mantissa bit per power of two). The resulting number is the integer version of your number. To round it, check the first bit discarded (the one at the right of the last bit taken from the mantissa) and add it to the integer (using integer addition).
Something like this:
module double2int(
    input clk,
    input rst,               // hold high for one clock to load vin and start a conversion
    input [63:0] vin,
    output reg [52:0] vout,
    output reg done,
    output reg error
    );
    wire sign = vin[63];
    wire [10:0] exponent = vin[62:52];
    wire [51:0] binaryfraction = vin[51:0];
    wire [52:0] mantissa = {1'b1,binaryfraction};
    reg [5:0] cnt;
    reg start = 1'b0;
    reg round;
    always @(posedge clk) begin
        if (rst) begin
            if (sign==1'b0 && exponent >= 11'd1023 && exponent <= 11'd1075) begin
                // only convert positive numbers from 1.0 up to (but not including) 2^53
                cnt <= 52 - (exponent - 11'd1023); // how many bits to discard from mantissa
                {vout,round} <= {mantissa,1'b0};
                start <= 1'b1;
                done <= 1'b0;
                error <= 1'b0;
            end
            else begin
                start <= 1'b0;
                done <= 1'b0;   // clear any stale "done" from a previous conversion
                error <= 1'b1;
            end
        end
        else if (start) begin
            if (cnt != 0) begin // not finished yet?
                cnt <= cnt - 1; // count one bit to discard
                {vout,round} <= {1'b0, vout[52:0]}; // and discard it (bit just discarded goes into "round")
            end
            else begin // finished discarding bits then?
                if (round) // if last bit discarded was high, increment vout
                    vout <= vout + 1;
                start <= 1'b0;
                done <= 1'b1; // signal we're done
            end
        end
    end
endmodule
I've used the following test bench to exercise the module. Just use this webpage to find the hexadecimal representation of a given number and place it into the test bench source code. Simulate the circuit and you will get the plain binary value of the closest integer to your double number:
module tb_double2int;
    // Inputs
    reg clk;
    reg rst;
    reg [63:0] vin;
    // Outputs
    wire [52:0] vout;
    wire error;
    wire done;
    // Instantiate the Unit Under Test (UUT)
    double2int uut (
        .clk(clk),
        .rst(rst),
        .vin(vin),
        .vout(vout),
        .done(done),
        .error(error)
    );
    initial begin
        // Initialize Inputs
        clk = 0;
        rst = 0;
        vin = 0;
        // Add stimulus here
        vin = 64'h4058F22D0E560419; // Example: 99.784 . Must return 100d in vout (binary 0000....00000001100100)
        rst = 1;
        #20;
        rst = 0;
        if (!error)
            @(posedge done);
        @(posedge clk);
        $finish;
    end
    always begin
        clk = #5 !clk;
    end
endmodule
If you wish to truncate the value to the next lower integer, first observe whether the exponent makes the number be less than 1.0 or greater than the size of your array, and handle those values suitably.
If the exponent is between those values, feed the leftmost part of the mantissa, with a "1" concatenated to its left, into a shifter such that the maximum exponent would result in no shifting, an exponent that's one smaller would shift right one place, etc. The output of the shifter will be the array index.
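The shifter described above might look roughly like the following combinational sketch. The module name `double2index_trunc`, its ports, and the `IDX_W` parameter are hypothetical; it assumes an array depth of 2**IDX_W with IDX_W between 2 and 53, and it simply reports out-of-range inputs on `in_range` rather than deciding how to handle them.

```verilog
// Sketch only: truncate a positive, normalized IEEE 754 double to an
// IDX_W-bit array index with a single right shift (no clock needed).
// All names here are hypothetical and not taken from any existing core.
module double2index_trunc #(
    parameter IDX_W = 8                  // array depth = 2**IDX_W (2 <= IDX_W <= 53)
) (
    input  wire [63:0]      vin,         // IEEE 754 double bit pattern
    output wire [IDX_W-1:0] index,       // floor(vin) when in_range, else 0
    output wire             in_range     // 1 when 1.0 <= vin < 2**IDX_W
);
    localparam [10:0] EXP_MIN = 11'd1023;             // biased exponent of 1.0
    localparam [10:0] EXP_MAX = 11'd1023 + IDX_W - 1; // largest exponent that fits

    wire        sign     = vin[63];
    wire [10:0] exponent = vin[62:52];

    // Leftmost part of the mantissa with the implicit "1" concatenated on the left
    wire [IDX_W-1:0] top = {1'b1, vin[51 -: (IDX_W-1)]};

    // Maximum exponent: no shift; one smaller: shift right one place; etc.
    wire [10:0] shamt = EXP_MAX - exponent;

    assign in_range = !sign && (exponent >= EXP_MIN) && (exponent <= EXP_MAX);
    assign index    = in_range ? (top >> shamt) : {IDX_W{1'b0}};
endmodule
```

With `IDX_W = 8`, the input `64'h4058F22D0E560419` (99.784) yields index 99. Inputs below 1.0, negative values, NaN/infinity and values of 2**IDX_W or more leave `in_range` low, so the surrounding logic can choose what to do with them.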
If you wish to approximate rounding, you should scale your desired index value up by a factor of two; after computing the scaled up value, add one and divide by two. This will round 0.5 to 1, 1.5 to 2, and 2.5 to 3.
If you wish to support IEEE-accurate rounding (round half to even), then in addition to the above you'll need to "OR" together all the bits that were too small to be worth including in the shift, as well as any bits that "fell off the end" of the shifter. Instead of unconditionally adding one to the scaled value, only add one if the aforementioned "OR" yields true or if the integer part is odd (bit 1 of the scaled value is set). This will make 0.5 round to 0, and both 1.5 and 2.5 round to 2.
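Both rounding variants can be sketched in one combinational module. This is an illustration under stated assumptions, not a verified core: the module name `double2index_round`, its ports, and the `IDX_W` and `TIES_EVEN` parameters are hypothetical, negative inputs are simply flagged as invalid, and the round-half-to-even path adds one when the sticky "OR" is set or when the integer part is odd.

```verilog
// Sketch only: round a non-negative IEEE 754 double to an IDX_W-bit index.
// TIES_EVEN = 0: add one to the doubled value, then halve (0.5 -> 1, 2.5 -> 3).
// TIES_EVEN = 1: round half to even (0.5 -> 0, 1.5 -> 2, 2.5 -> 2).
// All names here are hypothetical.
module double2index_round #(
    parameter IDX_W     = 8,   // array depth = 2**IDX_W
    parameter TIES_EVEN = 1
) (
    input  wire [63:0]      vin,
    output wire [IDX_W-1:0] index,
    output wire             valid   // 0 for negative, NaN/inf or results >= 2**IDX_W
);
    wire        sign     = vin[63];
    wire [10:0] exponent = vin[62:52];
    wire [52:0] sig      = {1'b1, vin[51:0]};        // mantissa with implicit 1

    // 1022 is the biased exponent of 0.5
    wire below_half = (exponent < 11'd1022);         // value < 0.5, rounds to 0
    wire too_big    = (exponent > 11'd1022 + IDX_W); // value >= 2**IDX_W, NaN, inf

    // Fixed-point view with 53 fraction bits: fixed / 2**53 equals the value
    wire [10:0]       shamt = exponent - 11'd1022;
    wire [IDX_W+52:0] fixed = {{IDX_W{1'b0}}, sig} << shamt;

    wire [IDX_W:0] scaled = fixed[IDX_W+52:52];      // index scaled up by two
    wire           sticky = |fixed[51:0];            // OR of all bits below the half bit

    wire             add_one = TIES_EVEN ? (sticky | scaled[1]) : 1'b1;
    wire [IDX_W+1:0] sum     = scaled + add_one;
    wire [IDX_W:0]   rounded = sum[IDX_W+1:1];       // divide by two

    assign valid = !sign && !too_big && (below_half || !rounded[IDX_W]);
    assign index = (!valid || below_half) ? {IDX_W{1'b0}} : rounded[IDX_W-1:0];
endmodule
```

Note that rounding can carry out of the index range: with `IDX_W = 8`, 255.5 rounds to 256, which no longer fits, so `valid` goes low for that input.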
I have used vectors as pointers into arrays all the time... just never had to deal with a pointer that's 64b wide. What's the depth of your array?
For example:
wire [7:0] buffer [255:0]; // Here we have an array of, say, 256 bytes
wire [7:0] ptr; // here's our pointer into that array (needs to be log2(256) = 8 bits wide)
wire [7:0] byte_read;
assign ptr = 8'h05;
assign byte_read = buffer[ptr]; // grab the byte at index 5 from the array
- My array index is not 64b wide. The matter is that I am computing the array index after performing several operations. The resultant index I am getting is in IEEE Double format. You can google that; it's the sign, exponent, mantissa format. Now I want to convert it into a simple integer to use it as an array index. – user263210, Feb 26, 2015 at 16:51
- Just to clarify, I am assuming that you mean array index when you say pointer. – user263210, Feb 26, 2015 at 16:52
You need to convert the 64-bit double-precision number to an integer index within your array's index bounds. How to define the conversion function is up to you and depends on the range of your real numbers and the bounds of some_array. Ideally you want a one-to-one mapping, but if your index range is small, you will obviously have overlaps.
Let's say you define some_array as:
logic [N-1:0] some_array;
Then later in your code:
index = ConvertToIndex(X);
some_array[index] = 1'b1;
ConvertToIndex is a function. For example, it can simply multiply the input by a constant value and then round it to the closest integer, which fits into the index of your array.
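function logic [$clog2(N)-1:0] ConvertToIndex (real doubleInput);
    logic [$clog2(N)-1:0] returnValue;
    real modifiedInput;
    modifiedInput = doubleInput * 10;
    returnValue = modifiedInput; // rounding happens here (real-to-integer assignment rounds to nearest)
    return returnValue;          // the original omitted the return, so the result was never passed back
endfunction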
I used the real data type which I believe is compatible with IEEE 754 double-precision.
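If X is held as a 64-bit vector rather than as a real, the standard system function `$bitstoreal` reinterprets the bit pattern as a real in simulation. The sketch below is simulation-only (it is not synthesizable), and the module and variable names are hypothetical.

```verilog
// Simulation-only sketch: reinterpret a 64-bit pattern as a real, then
// convert it to an integer in two ways. Names are hypothetical.
module tb_bitstoreal;
    reg  [63:0] x_bits;
    real        x_real;
    integer     nearest;
    integer     truncated;
    initial begin
        x_bits    = 64'h4058F22D0E560419;  // bit pattern of 99.784
        x_real    = $bitstoreal(x_bits);   // same bits, viewed as a real
        nearest   = x_real;                // implicit conversion rounds to nearest: 100
        truncated = $rtoi(x_real);         // $rtoi truncates toward zero: 99
        $display("%f -> nearest %0d, truncated %0d", x_real, nearest, truncated);
    end
endmodule
```

The reverse direction is `$realtobits`, which can be handy for generating test vectors for a synthesizable converter.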
In order to make this synthesizable, you really need to decide how the conversion should be done and what to do in case of overlaps.