Casting between integer types is very straightforward: when narrowing, the extra high-order bits simply disappear.

But is it important to understand what is happening under the hood when casting floating-point values? I've tried to read up on how floating-point values are calculated, but I have yet to find a source that explains it well. At least that's my excuse. I get the basic idea, although the calculation of the mantissa is a bit difficult.

At least up to Java 7, I understand that floating-point values cannot be used in bitwise operations, which makes sense because of how they are stored internally. Is there anything important to know about how floating-point values operate or are cast?

So, to summarize:

Is it as important to understand the internal workings of floating point as it is for integers?

What is the internal process of casting a floating point to an integer?

Peter O.
asked Jul 20, 2014 at 18:10
  • In general: yes, it is important to understand how floating point values are represented. This is mandatory reading for you: What Every Computer Scientist Should Know About Floating-Point Arithmetic Commented Jul 20, 2014 at 18:12
  • I think you should at least know the basics of how floating-point numbers are represented internally in the various formats. Otherwise you may introduce bugs into your program simply by treating them incorrectly (comparing to zero directly, expecting unrealistically high accuracy, etc.). Commented Jul 20, 2014 at 18:16
  • This blog post shows one way to convert a floating-point number to an integer without using the cast construct. It avoids the cast construct because the intention is to round to nearest, but that makes it close to what would be implemented in hardware, or what you might implement in Java if there weren't better functions already available for this: blog.frama-c.com/index.php?post/2013/05/03/nearbyintf2 Commented Jul 20, 2014 at 18:24
  • When you cast a floating point value from double to float it also drops the extra bits (the least significant ones) Commented Jul 20, 2014 at 20:06
  • @PeterLawrey The conversion from double-precision to single-precision is a little more complicated than dropping the extra bits because that conversion is to the nearest, and because the exponent actually needs to be recomputed (the result may be zero, the result may be a denormal, the result may be infinite). Commented Jul 20, 2014 at 21:26

2 Answers

What is the internal process of casting a floating point to an integer?

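The JVM typically uses the machine code instruction that performs this conversion, with the result defined by the Java Language Specification on top of IEEE-754 rounding toward zero. There is little for Java itself to do beyond handling the special cases (NaN and out-of-range values). If you want to know exactly how casting works, I suggest you read the standard.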

Basically, the mantissa is shifted by the exponent and the sign applied, i.e. a floating-point number is sign * 2^exponent * mantissa, and all the cast does is perform this calculation and drop any fractional part.
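
As a rough illustration of the shift described above, here is a minimal sketch that decodes a double by hand and truncates it. The class name `ManualTruncate` and the `truncate` helper are made up for this example; it only handles finite values whose magnitude is below 2^63, and it is not meant to reflect how any particular JVM implements the cast.

```java
final class ManualTruncate {
    // Hypothetical helper: truncates a double toward zero using only bit operations.
    // Assumes the value is finite and its magnitude is below 2^63.
    static long truncate(double d) {
        long bits = Double.doubleToLongBits(d);
        boolean negative = (bits >>> 63) != 0;            // 1 sign bit
        int biasedExp = (int) ((bits >>> 52) & 0x7FF);    // 11 exponent bits
        long fraction = bits & 0x000FFFFFFFFFFFFFL;       // 52 stored mantissa bits

        int exp = biasedExp - 1023;                       // remove the exponent bias
        if (exp < 0) {
            return 0;                                     // |d| < 1, including zero and subnormals
        }
        if (exp > 62) {
            throw new IllegalArgumentException("NaN, infinity, or too large for this sketch");
        }

        long mantissa = fraction | (1L << 52);            // restore the implicit leading 1
        // The mantissa is an integer scaled by 2^52, so shift by the difference.
        long magnitude = (exp >= 52) ? mantissa << (exp - 52)
                                     : mantissa >>> (52 - exp);
        return negative ? -magnitude : magnitude;
    }

    public static void main(String[] args) {
        System.out.println(truncate(3.75));   // 3
        System.out.println(truncate(-3.75));  // -3
        System.out.println(truncate(0.5));    // 0
    }
}
```

For in-range inputs the result should match a plain `(long) d` cast; the right shift is where the fractional bits are discarded.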

answered Jul 20, 2014 at 20:09

First, you need to understand that a floating point number is essentially an approximation. You can put in, say, 1.23 and get out 1.229998 (or some such), because 1.23 cannot be represented exactly in binary. Regardless of whether you will be doing any casts, you need to understand this, and how it affects computations (and especially comparisons).
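
A small sketch of what this approximation looks like in practice. The class `FloatApprox` and its helpers are illustrative names, and the tolerance passed to `nearlyEqual` is an arbitrary choice that would need tuning for real code.

```java
import java.math.BigDecimal;

final class FloatApprox {
    // Returns the exact decimal expansion of the binary value stored in a float.
    static String exactValue(float f) {
        return new BigDecimal(f).toString();
    }

    // Compares two doubles with an absolute tolerance instead of ==.
    static boolean nearlyEqual(double a, double b, double eps) {
        return Math.abs(a - b) <= eps;
    }

    public static void main(String[] args) {
        // The stored value is close to 1.23, but not exactly 1.23.
        System.out.println(exactValue(1.23f));

        // Direct comparison is fragile because each operand is already rounded.
        System.out.println(0.1 + 0.2 == 0.3);                  // false
        System.out.println(nearlyEqual(0.1 + 0.2, 0.3, 1e-9)); // true
    }
}
```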

From the standpoint of cast, casting a float to a double causes no loss of information, since a double can contain every value that a float can contain. But casting from double to float can cause loss of precision (and, for very large or small numbers, exponent overflow/underflow), since there's simply more information in a 64-bit value than in a 32-bit one, so some data's going to end up "on the floor".
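
The following sketch shows the three effects mentioned here: lost precision, exponent overflow, and exponent underflow. `DoubleToFloat` and `narrow` are names invented for the example.

```java
final class DoubleToFloat {
    // Narrowing conversion: rounds to the nearest representable float.
    static float narrow(double d) {
        return (float) d;
    }

    public static void main(String[] args) {
        // Precision loss: widening the float back does not recover the original double.
        System.out.println((double) narrow(0.1) == 0.1);  // false

        // Overflow: 1e40 is beyond the float range, so the result is infinite.
        System.out.println(narrow(1e40));                 // Infinity

        // Underflow: 1e-50 is below the smallest float subnormal, so the result is zero.
        System.out.println(narrow(1e-50));                // 0.0

        // Widening float to double is always exact.
        float f = 1.23f;
        System.out.println(narrow((double) f) == f);      // true
    }
}
```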

Similarly, casting from an int to a double causes no loss of information, since a double can contain every value an int can contain and then some. But casting from int to float or from long to double or float can result in loss of precision (though there can never be an exponent overflow/underflow).
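
To make the integer-to-floating-point case concrete, here is a short sketch using values just past the 24-bit float mantissa and the 53-bit double mantissa. The class and method names are illustrative only.

```java
final class IntToFloat {
    static float toFloat(int i) {
        return (float) i;
    }

    static double toDouble(long l) {
        return (double) l;
    }

    public static void main(String[] args) {
        // int -> double is exact for every int.
        System.out.println((int) (double) Integer.MAX_VALUE == Integer.MAX_VALUE); // true

        // int -> float: 2^24 + 1 does not fit in a 24-bit mantissa.
        System.out.println((int) toFloat(16_777_217));               // 16777216

        // long -> double: 2^53 + 1 does not fit in a 53-bit mantissa.
        System.out.println((long) toDouble(9_007_199_254_740_993L)); // 9007199254740992
    }
}
```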

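Casting from float or double to int or long can easily lose most of the value. If the float or double has a large positive exponent, the result is out of range (Java clamps it to the minimum or maximum value of the target type, and NaN becomes 0); if it has a negative exponent, its magnitude is below 1 and the result is 0. And, of course, when you cast from floating-point to fixed the fractional part of the number is truncated, i.e. rounded toward zero (a "floor" for positive values, a "ceiling" for negative ones).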
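
A brief sketch of how Java's narrowing cast behaves at the edges. Note that the cast rounds toward zero, which differs from `Math.floor` for negative values. The wrapper names `toInt` and `toLong` exist only for this example.

```java
final class FloatToIntCast {
    static int toInt(double d) {
        return (int) d;
    }

    static long toLong(double d) {
        return (long) d;
    }

    public static void main(String[] args) {
        // The fractional part is dropped, rounding toward zero.
        System.out.println(toInt(3.9));           // 3
        System.out.println(toInt(-3.9));          // -3
        System.out.println(Math.floor(-3.9));     // -4.0

        // Out-of-range values are clamped rather than wrapped.
        System.out.println(toInt(1e20));          // 2147483647
        System.out.println(toInt(-1e20));         // -2147483648

        // NaN becomes 0; tiny magnitudes also become 0.
        System.out.println(toInt(Double.NaN));    // 0
        System.out.println(toInt(1e-10));         // 0
    }
}
```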

answered Jul 21, 2014 at 0:41
