Floating point equality in "Numbers.java"

Question 1

Floating point inaccuracies are really annoying. I understood that in its true sense while developing the next version of Point (this time I'm actually foolproofing my code). Before I upload it for review, I want a class that I wrote, called Numbers to be reviewed.

Numbers.java

package library.util;
/**
 * This class provides a useful interface for easy (approximate) floating
 * point equality checks. This is necessary for several other classes.
 * 
 * @author Subhomoy Haldar (ambigram_maker)
 * @version 1.0
 */
public class Numbers {
 /**
 * The tolerance value for comparing the {@code float} values.
 */
 public static double FLOAT_TOLERANCE = 5E-8;
 /**
 * The tolerance value for comparing the {@code double} values.
 */
 public static double DOUBLE_TOLERANCE = 5E-16;
 /**
 * Returns {@code true} if the arguments are <i>exactly</i> equal.
 *
 * @param bytes The arguments to check.
 * @return {@code true} if the arguments are <i>exactly</i> equal.
 */
 public static boolean areEqual(byte... bytes) {
 int length = bytes.length;
 checkLength(length);
 byte d = bytes[0];
 for (int i = 1; i < length; i++) {
 if (d != bytes[i]) {
 return false;
 }
 }
 return true;
 }
 /**
 * Returns {@code true} if the arguments are <i>exactly</i> equal.
 *
 * @param shorts The arguments to check.
 * @return {@code true} if the arguments are <i>exactly</i> equal.
 */
 public static boolean areEqual(short... shorts) {
 int length = shorts.length;
 checkLength(length);
 short d = shorts[0];
 for (int i = 1; i < length; i++) {
 if (d != shorts[i]) {
 return false;
 }
 }
 return true;
 }
 /**
 * Returns {@code true} if the arguments are <i>exactly</i> equal.
 *
 * @param ints The arguments to check.
 * @return {@code true} if the arguments are <i>exactly</i> equal.
 */
 public static boolean areEqual(int... ints) {
 int length = ints.length;
 checkLength(length);
 int d = ints[0];
 for (int i = 1; i < length; i++) {
 if (d != ints[i]) {
 return false;
 }
 }
 return true;
 }
 /**
 * Returns {@code true} if the arguments are <i>exactly</i> equal.
 *
 * @param longs The arguments to check.
 * @return {@code true} if the arguments are <i>exactly</i> equal.
 */
 public static boolean areEqual(long... longs) {
 int length = longs.length;
 checkLength(length);
 long d = longs[0];
 for (int i = 1; i < length; i++) {
 if (d != longs[i]) {
 return false;
 }
 }
 return true;
 }
 /**
 * Returns {@code true} if the arguments are <i>exactly</i> equal.
 *
 * @param chars The arguments to check.
 * @return {@code true} if the arguments are <i>exactly</i> equal.
 */
 public static boolean areEqual(char... chars) {
 int length = chars.length;
 checkLength(length);
 char d = chars[0];
 for (int i = 1; i < length; i++) {
 if (d != chars[i]) {
 return false;
 }
 }
 return true;
 }
 /**
 * Returns {@code true} if the arguments are <i>approximately</i> equal.
 *
 * @param floats The arguments to check.
 * @return {@code true} if the arguments are <i>approximately</i> equal.
 */
 public static boolean areEqual(float... floats) {
 int length = floats.length;
 checkLength(length);
 float d = floats[0];
 for (int i = 1; i < length; i++) {
 if (!areEqual(d, floats[i])) {
 return false;
 }
 }
 return true;
 }
 /**
 * Returns {@code true} if the arguments are <i>approximately</i> equal.
 *
 * @param doubles The arguments to check.
 * @return {@code true} if the arguments are <i>approximately</i> equal.
 */
 public static boolean areEqual(double... doubles) {
 int length = doubles.length;
 checkLength(length);
 double d = doubles[0];
 for (int i = 1; i < length; i++) {
 if (!areEqual(d, doubles[i])) {
 return false;
 }
 }
 return true;
 }
 private static void checkLength(int length) throws
 IllegalArgumentException {
 if (length < 2) {
 throw new IllegalArgumentException
 ("At least two arguments required.");
 }
 }
 private static boolean areEqual(float f1, float f2) {
 // the corner cases first:
 if (Float.isNaN(f1) && Float.isNaN(f2)) {
 return true;
 }
 if (Float.isInfinite(f1) || Float.isInfinite(f2)) {
 return f1 == f2;
 }
 float abs;
 if (f1 == f2 || (abs = Math.abs(f1 - f2)) <= FLOAT_TOLERANCE) {
 return true;
 }
 // compare using the larger ulp
 float ulp1 = Math.ulp(f1);
 float ulp2 = Math.ulp(f2);
 return abs <= (ulp1 > ulp2 ? ulp1 : ulp2);
 }
 private static boolean areEqual(double d1, double d2) {
 // the corner cases first:
 if (Double.isNaN(d1) && Double.isNaN(d2)) {
 return true;
 }
 if (Double.isInfinite(d1) || Double.isInfinite(d2)) {
 return d1 == d2;
 }
 double abs;
 if (d1 == d2 || (abs = Math.abs(d1 - d2)) <= DOUBLE_TOLERANCE) {
 return true;
 }
 // compare using the larger ulp
 double ulp1 = Math.ulp(d1);
 double ulp2 = Math.ulp(d2);
 return abs <= (ulp1 > ulp2 ? ulp1 : ulp2);
 }
}

I want you to hit it with your best shot and check the shortcomings of the implementation (especially the float and double methods).

Reason (and usage) for `Numbers.java`

I would recommend using the methods as follows:

boolean example = Numbers.areEqual(60, Math.toDegrees(Math.acos(0.5)));

This is (in my opinion) easy to read and understand what it does.

Question 2

Note that floating point arithmetic is not inaccurate. It just doesn't match people's expectations for real numbers. You have to take it on its own terms, just as you accept that (1/3)*3 is not 1.

Question 3

@PeteBecker Yes... But do all programs need to have a workaround for matching people's expectations? Wouldn't you rather use a library to make your life easier?

Question 4

No, you have to adjust your expectations, just as you do with integer arithmetic. The problem is that most people (legitimately) don't understand what floating-point arithmetic does. "Nearly equal" tries to work around that lack of understanding by pretending it's something that it isn't.

Question 5

I couldn't agree more. But I'll surely be kicked out of my class if try to say tan 90 is not infinity. ;-)

Question 6

tan 90 is infinity (loosely speaking), as is tan π/2; those are statements about real numbers. tan(3.14159/2) is probably a large finite value; that's a statement about floating-point arithmetic.

Question 7

Your mathematical criterion for equality has some issues.

One important point to consider is that your "equality" is not transitive. In your code, you use a threshold of \$\epsilon\$ (\$\epsilon = 5\times10^{-16}\$ in your code, when the values have type double); you declare \$x\$ and \$y\$ to be equal to each other if \$|x-y|\leq \epsilon\$. Now suppose that you have three values \$a\,ドル \$b\$ and \$c\$ such that \$b = a + 0.9\epsilon\$ and \$c = a - 0.9\epsilon\$. In that case, areEqual(a, b, c) will return true, but areEqual(b, a, c) will return false. This is surprising behaviour -- the documentation of areEqual() does not say that the result depends on the order of parameters.

To fix some of the symptoms, you need to compare all pairs of values: areEqual(a, b, c) shall return true only if areEqual(a, b), areEqual(a, c) and areEqual(b, c) all return true. In all generality, this is quadratic: with 10 parameters, you end up doing 45 comparisons; with 20 parameters, 190 comparisons... This can become computationally prohibitive. And while this would avoid the problem of the result depending on the order of parameters, it would still not make the relation transitive: areEqual(a, b) and areEqual(a, c) could return true while areEqual(b, c) would return false.

Moreover, using a difference is usually not the "correct" way to approximate. If \$x = 10^{-30}\$ and \$y = 10^{-36}\$ then they normally should not be considered equal to each other, even approximately, since \$x\$ is one million times bigger than \$y\$ -- but your areEqual() will declare them equal to each other. You get the dual problem with big numbers: when \$x\$ is large (close to \2ドル^{53}\$ for double values), \$x\$ and \$x+1\$ (for instance) have a difference equal to \1ドル\$ (thus greater than \$\epsilon\$) but they really differ in their least significant bit only; you explicitly added some code using Math.ulp() to handle that case.

There are usage contexts where a difference is what makes most sense; but in general, a ratio is more appropriate. In physics, values are measures, with a given precision which is more or less the number of significant digits. A ratio captures that notion. Mathematically, this means that you declare \$x\$ and \$y\$ to be approximately equal to each other if \$|(x/y)-1| \leq \epsilon\$ (and also \$|(y/x)-1| \leq \epsilon\,ドル to make the relation at least symmetric). This will not catch all cases, and you will have to do something about values which are very close to \0ドル\$; but that criterion will be a more correct notion of "approximate equality", it will scale up and down smoothly, and avoid the kludgy business with Math.ulp().

Question 8

Welcome to Code Review! That's a very nice example of a no-code answer, thanks for contributing :-)

Question 9

The first time I read your answer, I was trying to defend my implementation. The second time I started thinking. The third time, I was cursing myself for being such an idiot. I will definitely take your advice. BTW, the TOLERANCE can be modified by the user as necessary. So, if user does this: Numbers.DOUBLE_TOLERANCE = 1e-40;, then Numbers.areEqual(1e-30, 1e-36) returns false.

Question 10

"In all generality, this is quadratic" - Finding a linear algorithm is not hard; First deal with the edge cases like NaN and infinity, then return max(a,b,c)-min(a,b,c)<tolerance.

Question 11

I would also note that OP's implementation treats NaN as equal to NaN - This is against the standard for floating points.

Question 12

Do you really need a new implementation for floating-point numbers comparison? java.math.BigDecimal may be useful for FP numbers comparison with its compareTo(arg) method.
areEqual(args...) overloading seems too repetitive. You may consider redefining them using generics in a single method. An example with minor changes in the original code:
```
public static <T extends Number> boolean areEqual(T... numbers) {
 int length = numbers.length;
 checkLength(length);
 T d = numbers[0];
 for (int i = 1; i < length; i++) {
 if (!d.equals(numbers[i])) {
 return false;
 }
 }
 return true;
}
```
Of course, in this case you will loose primitives, because they will be boxed, but that will save multiple lines of code.

Question 13

Hi! Welcome to Code Review! Might I suggest going on the tour and reading our help center :-). In the meantime, I've made your post a little easier to read by formatting the list properly.

Question 14

Another small concern (more programing than mathematic) about FLOAT_TOLERANCEand DOUBLE_TOLERANCE:

If the outside world of Numbers need to know these values, make them final (avoiding someone to modify them and altering the rightness of future numbers comparisons)
Otherwise, make them just private

Update

public class Numbers {
 /**
 * The tolerance value for comparing the {@code float} values.
 */
 public static final float FLOAT_TOLERANCE = 5E-8f;
 /**
 * The tolerance value for comparing the {@code double} values.
 */
 public static final double DOUBLE_TOLERANCE = 5E-16;
 ...
 /**
 * Returns {@code true} if the arguments are <i>approximately</i> equal. The default delta used for comparisons is {@value #FLOAT_TOLERANCE}.
 *
 * @param floats The arguments to check.
 * @return {@code true} if the arguments are <i>approximately</i> equal.
 */
 public static boolean areEquals(float... floats) {
 return areEqual(FLOAT_TOLERANCE, floats);
 }
 /**
 * Returns {@code true} if the arguments are <i>approximately</i> equal.
 *
 * @param tolerance The delta for comparisons.
 * @param floats The arguments to check.
 * @return {@code true} if the arguments are <i>approximately</i> equal.
 */
 public static boolean areEqual(float tolerance, float... floats) {
 int length = floats.length;
 checkLength(length);
 float d = floats[0];
 for (int i = 1; i < length; i++) {
 if (!areEqual_impl(tolerance, d, floats[i])) {
 return false;
 }
 }
 return true;
 }
 private static boolean areEqual_impl(float tolerance, float f1, float f2) {
 // the corner cases first:
 if (Float.isNaN(f1) && Float.isNaN(f2)) {
 return true;
 }
 if (Float.isInfinite(f1) || Float.isInfinite(f2)) {
 return f1 == f2;
 }
 float abs;
 if (f1 == f2 || (abs = Math.abs(f1 - f2)) <= tolerance) {
 return true;
 }
 // compare using the larger ulp
 float ulp1 = Math.ulp(f1);
 float ulp2 = Math.ulp(f2);
 return abs <= (ulp1 > ulp2 ? ulp1 : ulp2);
 }
 //Same principle for double
}

Question 15

Well, I intend to provide flexibility to the user. Like I comment on Tom Leek's answer: "the TOLERANCE can be modified by the user as necessary. So, if user does this: Numbers.DOUBLE_TOLERANCE = 1e-40;, then Numbers.areEqual(1e-30, 1e-36) returns false."

Question 16

@ambigram_maker I updated my answer with a solution that meets your requirements (flexibility) plus making it more reliable (no side-effects, easier to test) by adding a method taking in argument the tolerance. The FLOAT_TOLERANCE and DOUBLE_TOLERANCE are declared public static final (note the final) to avoid any user to modify them.

Question 17

Well, I like the idea. But there's one catch: It. Doesn't. Compile. :-/

Question 18

Also, why use a double for the FLOAT_TOLERANCE?

Question 19

@ambigram_maker you are right, I didn't notice you declared FLOAT_TOLERANCE as a double in the code you provided. I changed it to float.

Question 20

For me this is clean code.

There is much duplicated code, but this is because Java doesn't support generics for primitives. And even though it is duplicated it is easy to read and understand. Straight forward.

I think ckeckLength() is the wrong name for this method. Maybe assertAtLeastTwoElements(array) or something more descriptive would be better.

The equals method for floating points seems to be correct. I hope you have written some test cases.

But the biggest thing I was thinking about were the corner cases.
NaN is an error case for me. Mathematical you can't say 1/0 equals 2/0, because you don't really know what some number divided by zero is. It depends on the use cases, but in general An application should explode with a big error message if a result of a calculation is NaN, in my opinion.

Also infinity is... Well... Mathematical something is equal if this holds: \$x = y \Leftrightarrow |x-y|=0\$. This is what you are doing. And for floating points: \$x = y \Leftrightarrow |x-y|< \epsilon\$. Nevertheless for infinity it is \$|\infty - \infty| = \infty\$ (as well for NaN). So it doesn't hold.

There might be a bigger use case for the infinity use case to be the wanted behavior, so it could be fine, but this behavior should be documented.

At the end you would say: nice and clean code. Well done!

Question 21

Do you need the modulo? 1 - 1 = 0 as much as |1 - 1| = 10

Question 22

You see... 1/0 leads to a ArithmeticException, but 1.0/0 returns Double.POSITIVE_INFINITY. It is 0.0/0.0 that gives Double.NaN.

Tom Leek Tom Leek 3461 silver badge4 bronze badges · Accepted Answer · 2015-09-02 00:50:12Z

Your mathematical criterion for equality has some issues.

One important point to consider is that your "equality" is not transitive. In your code, you use a threshold of \$\epsilon\$ (\$\epsilon = 5\times10^{-16}\$ in your code, when the values have type double); you declare \$x\$ and \$y\$ to be equal to each other if \$|x-y|\leq \epsilon\$. Now suppose that you have three values \$a\,ドル \$b\$ and \$c\$ such that \$b = a + 0.9\epsilon\$ and \$c = a - 0.9\epsilon\$. In that case, areEqual(a, b, c) will return true, but areEqual(b, a, c) will return false. This is surprising behaviour -- the documentation of areEqual() does not say that the result depends on the order of parameters.

To fix some of the symptoms, you need to compare all pairs of values: areEqual(a, b, c) shall return true only if areEqual(a, b), areEqual(a, c) and areEqual(b, c) all return true. In all generality, this is quadratic: with 10 parameters, you end up doing 45 comparisons; with 20 parameters, 190 comparisons... This can become computationally prohibitive. And while this would avoid the problem of the result depending on the order of parameters, it would still not make the relation transitive: areEqual(a, b) and areEqual(a, c) could return true while areEqual(b, c) would return false.

Moreover, using a difference is usually not the "correct" way to approximate. If \$x = 10^{-30}\$ and \$y = 10^{-36}\$ then they normally should not be considered equal to each other, even approximately, since \$x\$ is one million times bigger than \$y\$ -- but your areEqual() will declare them equal to each other. You get the dual problem with big numbers: when \$x\$ is large (close to \2ドル^{53}\$ for double values), \$x\$ and \$x+1\$ (for instance) have a difference equal to \1ドル\$ (thus greater than \$\epsilon\$) but they really differ in their least significant bit only; you explicitly added some code using Math.ulp() to handle that case.

There are usage contexts where a difference is what makes most sense; but in general, a ratio is more appropriate. In physics, values are measures, with a given precision which is more or less the number of significant digits. A ratio captures that notion. Mathematically, this means that you declare \$x\$ and \$y\$ to be approximately equal to each other if \$|(x/y)-1| \leq \epsilon\$ (and also \$|(y/x)-1| \leq \epsilon\,ドル to make the relation at least symmetric). This will not catch all cases, and you will have to do something about values which are very close to \0ドル\$; but that criterion will be a more correct notion of "approximate equality", it will scale up and down smoothly, and avoid the kludgy business with Math.ulp().

Welcome to Code Review! That's a very nice example of a no-code answer, thanks for contributing :-)
The first time I read your answer, I was trying to defend my implementation. The second time I started thinking. The third time, I was cursing myself for being such an idiot. I will definitely take your advice. BTW, the TOLERANCE can be modified by the user as necessary. So, if user does this: Numbers.DOUBLE_TOLERANCE = 1e-40;, then Numbers.areEqual(1e-30, 1e-36) returns false.
"In all generality, this is quadratic" - Finding a linear algorithm is not hard; First deal with the edge cases like NaN and infinity, then return max(a,b,c)-min(a,b,c)<tolerance.
I would also note that OP's implementation treats NaN as equal to NaN - This is against the standard for floating points.

Stack Exchange Network

Floating point equality in "Numbers.java"

Numbers.java

Reason (and usage) for `Numbers.java`

4 Answers 4

Update

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Linked

Hot Network Questions

Floating point equality in "Numbers.java"

Numbers.java

Reason (and usage) for Numbers.java

4 Answers 4

Update

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Linked

Related

Hot Network Questions

Reason (and usage) for `Numbers.java`