34

The misunderstanding of floating point arithmetic and its shortcomings is a major cause of surprise and confusion in programming (consider the number of questions on Stack Overflow pertaining to "numbers not adding correctly"). Considering many programmers have yet to understand its implications, it has the potential to introduce many subtle bugs (especially into financial software). What can programming languages do to avoid its pitfalls for those who are unfamiliar with the concepts, while still offering its speed, when accuracy is not critical, to those who do understand the concepts?

asked Mar 28, 2011 at 16:36
community wiki

3
  • 30
    The only thing a programming language can do to avoid the pitfalls of floating-point processing is to ban it. Note that this includes base-10 floating-point as well, which is just as inaccurate in general, except that financial applications are pre-adapted to it. Commented Mar 28, 2011 at 16:46
  • 4
    This is what "Numerical Analysis" is for. Learn how to minimize precision loss - aka floating point pitfalls. Commented Jul 12, 2012 at 23:00
  • A good example of a floating point issue: stackoverflow.com/questions/10303762/0-0-0-0-0 Commented Aug 1, 2012 at 13:14

18 Answers

58

You say "especially for financial software", which brings up one of my pet peeves: money is not a float, it's an int.

Sure, it looks like a float. It has a decimal point in there. But that's just because you're used to units that confuse the issue. Money always comes in integer quantities. In America, it's cents. (In certain contexts I think it can be mills, but ignore that for now.)

So when you say 1ドル.23, that's really 123 cents. Always, always, always do your math in those terms, and you will be fine.

Answering the question directly, programming languages should just include a Money type as a reasonable primitive.
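
A minimal sketch of that idea in C# (the Money type and its members here are hypothetical, not an existing library class): keep the amount as a whole number of cents and only format it as dollars at the edges of the system.

using System;

// Hypothetical Money type: amounts are stored as whole cents (long),
// so addition and comparison are exact integer operations.
readonly struct Money
{
    public long Cents { get; }
    public Money(long cents) => Cents = cents;

    // Rounding policy when converting from a decimal amount is a business
    // decision; banker's rounding is used here purely as an example.
    public static Money FromDollars(decimal dollars) =>
        new Money((long)Math.Round(dollars * 100m, MidpointRounding.ToEven));

    public static Money operator +(Money a, Money b) => new Money(a.Cents + b.Cents);

    public override string ToString() => $"${Cents / 100m:0.00}";
}

class MoneyDemo
{
    static void Main()
    {
        var price = Money.FromDollars(1.23m);   // stored as 123 cents
        var total = price + price + price;
        Console.WriteLine(total);               // 3ドル.69, with no rounding surprises
    }
}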

update

Ok, I should have only said "always" twice, rather than three times. Money is indeed always an int; those who think otherwise are welcome to try sending me 0.3 cents and showing me the result on their bank statement. But as commenters point out, there are rare exceptions when you need to do floating point math on money-like numbers, e.g. certain kinds of prices or interest calculations. Even then, those should be treated as exceptions. Money comes in and goes out as integer quantities, so the closer your system hews to that, the saner it will be.

15
  • 22
    @JoelFan: you're mistaking a concept for a platform specific implementation. Commented Mar 28, 2011 at 21:54
  • 17
    It's not quite that simple. Interest calculations, among others, do produce fractional cents, and have to be rounded at some point according to a specified method. Commented Mar 28, 2011 at 23:16
  • 28
    Fictional -1, since I lack the rep for a downvote :) ...This might be correct for whatever's in your wallet but there are plenty of accounting situations where you could well be dealing with tenths of a cent, or smaller fractions. Decimal is the only sane system for dealing with this, and your comment "ignore that for now" is the harbinger of doom for programmers everywhere :P Commented Mar 29, 2011 at 1:18
  • 12
    @kevin cline: There are fractional cents in calculations, but there are conventions on how to handle them. The goal for financial calculations is not mathematical correctness, but getting the exact same results that a banker with a calculator would. Commented Mar 29, 2011 at 13:42
  • 7
    Everything will be perfect by replacing the word "integer" with "rational". Commented Jul 12, 2012 at 16:48
17

Providing support for a Decimal type helps in many cases. Many languages have a decimal type, but they are underused.

Understanding the approximation that occurs when working with representations of real numbers is important. With both decimal and binary floating-point types, an expression like 9 * (1/9) == 1 cannot be relied upon to hold. When the operands are compile-time constants, an optimizer may fold the calculation so that it does come out equal.

Providing an approximately-equals operator would help. However, such comparisons are problematic. Note that .9999 trillion dollars is approximately equal to 1 trillion dollars. Could you please deposit the difference in my bank account?
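
A small C# illustration of both points, assuming nothing beyond the standard library: the decimal type makes the rounding visible, and a naive "approximately equal" check gives different verdicts depending on whether the tolerance is absolute or relative (the trillion-dollar figures are just the example from above).

using System;

class ApproxDemo
{
    static void Main()
    {
        // decimal carries 28-29 significant digits, so the rounding is visible:
        Console.WriteLine(9m * (1m / 9m));        // 0.9999999999999999999999999999
        Console.WriteLine(9m * (1m / 9m) == 1m);  // False

        // An "approximately equal" operator needs a tolerance, and the right
        // tolerance depends on the scale of the numbers involved:
        double a = 0.9999e12;   // 0.9999 trillion dollars
        double b = 1.0e12;      // 1 trillion dollars
        Console.WriteLine(Math.Abs(a - b) < 1e-6);                // False with an absolute tolerance
        Console.WriteLine(Math.Abs(a - b) < 1e-3 * Math.Abs(b));  // True with a relative tolerance
        // ...which is why "approximately one trillion dollars" can still
        // mean being off by a hundred million dollars.
    }
}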

5
  • 2
    0.9999... trillion dollars is precisely equal to 1 trillion dollars actually. Commented Mar 29, 2011 at 7:00
  • 7
    @JUST: Yes but I haven't encountered any computers with registers that will hold 0.99999.... They all truncate at some point resulting in an inequality. 0.9999 is equal enough for engineering. For financial purposes it isn't. Commented Mar 29, 2011 at 14:08
  • 2
    But what kind of system used trillions of dollars as the base unit instead of ones of dollars? Commented Aug 1, 2012 at 18:23
  • @Brad Try calculating (1 Trillion / 3) * 3 on your calculator. What value do you get? Commented Aug 4, 2012 at 13:07
  • Not in the banking system... Commented Apr 17 at 16:42
12

We were told what to do in the first-year (sophomore) lecture in computer science when I went to university (this course was a prerequisite for most science courses as well).

I recall the lecturer saying "Floating point numbers are approximations. Use integer types for money. Use FORTRAN or another language with BCD numbers for accurate computation." (and then he pointed out the approximation, using that classic example of 0.2 being impossible to represent exactly in binary floating point). This also turned up that week in the laboratory exercises.

Same lecture: "If you must get more accuracy from floating point, sort your terms. Add small numbers together first, rather than adding them to big numbers." That stuck in my mind.

A few years ago I had some spherical geometry that needed to be very accurate, and still fast. The 80-bit double on PCs was not cutting it, so I added some types to the program that sorted terms before performing commutative operations. Problem solved.
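
A rough C# illustration of the "add the small numbers first" advice, using single-precision float so the effect shows up quickly (the exact printed digits may vary slightly by runtime):

using System;

class SummationOrder
{
    static void Main()
    {
        // Big number first: each tiny term is below half an ulp of the
        // running sum, so every one of them is rounded away.
        float bigFirst = 1.0f;
        for (int i = 0; i < 1000; i++) bigFirst += 1e-8f;

        // Small numbers first: the tiny terms accumulate into something
        // large enough to survive the final addition.
        float smallFirst = 0.0f;
        for (int i = 0; i < 1000; i++) smallFirst += 1e-8f;
        smallFirst += 1.0f;

        Console.WriteLine(bigFirst);    // 1 (the accumulated 1e-5 was lost)
        Console.WriteLine(smallFirst);  // roughly 1.00001
    }
}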

Before you complain about the quality of the guitar, learn to play.

I had a co-worker four years ago who'd worked for JPL. He expressed disbelief that we used FORTRAN for some things. (We needed super accurate numerical simulations calculated offline.) "We replaced all that FORTRAN with C++" he said proudly. I stopped wondering why they missed a planet.

4
  • 2
    +1 the right tool for the right job. Although I don't actually use FORTRAN. Thankfully neither do I work on our financial systems at work. Commented Aug 2, 2012 at 0:32
  • "If you must get more accuracy from floating point, sort your terms. Add small numbers together, not to big numbers." Any sample on this? Commented Apr 25, 2014 at 21:11
    @mamcx Imagine a decimal floating point number having just one digit of precision. The computation 1.0 + 0.1 + ... + 0.1 (repeated 10 times) returns 1.0 as every intermediate result gets rounded. Doing it the other way round, you get intermediate results of 0.2, 0.3, ..., 1.0 and finally 2.0. This is an extreme example, but with realistic floating point numbers, similar problems happen. The base idea is that adding numbers similar in size leads to the smallest error. Start with the smallest numbers as their sum is bigger and therefore better suited for addition to bigger ones. Commented Jun 15, 2017 at 19:25
  • Floating point stuff in Fortran and C++ is going to be mostly identical though. Both are accurate and offline, and I'm pretty sure Fortran has no native BCD reals... Commented Mar 31, 2018 at 17:53
7

By default, languages should use arbitrary-precision rationals for non-integer numbers.

Those who need to optimize can always ask for floats. Using them as a default made sense in C and other systems programming languages, but not in most languages popular today.
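
C# has no built-in arbitrary-precision rational, but a minimal sketch on top of System.Numerics.BigInteger shows what such a default could look like (the Rational type and its members are illustrative, not a real library):

using System;
using System.Numerics;

// Minimal arbitrary-precision rational: numerator/denominator kept in lowest terms.
readonly struct Rational
{
    public BigInteger Num { get; }
    public BigInteger Den { get; }

    public Rational(BigInteger num, BigInteger den)
    {
        if (den.IsZero) throw new DivideByZeroException();
        if (den.Sign < 0) { num = -num; den = -den; }
        var g = BigInteger.GreatestCommonDivisor(num, den);
        Num = num / g;
        Den = den / g;
    }

    public static Rational operator +(Rational a, Rational b) =>
        new Rational(a.Num * b.Den + b.Num * a.Den, a.Den * b.Den);

    public static Rational operator *(Rational a, Rational b) =>
        new Rational(a.Num * b.Num, a.Den * b.Den);

    public override string ToString() => $"{Num}/{Den}";
}

class RationalDemo
{
    static void Main()
    {
        var tenth = new Rational(1, 10);
        Console.WriteLine(tenth + tenth + tenth);   // 3/10, exactly
        Console.WriteLine(0.1 + 0.1 + 0.1 == 0.3);  // False with binary doubles
    }
}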

11
  • 2
    How do you deal with irrational numbers then? Commented Mar 28, 2011 at 18:24
  • 3
    You do it the same way as with floats: approximation. Commented Mar 28, 2011 at 19:09
  • 2
    Computations with arbitrary-precision rationals will often be orders of magnitude slower (possibly MANY orders of magnitude slower) than computations with a hardware-supported double. If a calculation needs to be accurate to a part per million, it's better to spend a microsecond computing it to within a few parts per billion, than to spend a second computing it absolutely precisely. Commented Aug 1, 2012 at 15:54
  • 8
    @supercat: What you're suggesting is just a poster-child of premature optimisation. The current situation is that the vast majority of programmers have no need whatsoever for fast math, and then get bitten by hard to understand floating-point (mis)behaviour, so that the relatively tiny number of programmers who need fast math gets it without having to type a single extra character. This made sense in the seventies, now it's just nonsense. The default should be safe. Those who need fast should ask for it. Commented Aug 1, 2012 at 19:09
  • 3
    @Waquo: You vastly underestimate the cost of arbitrary-precision rational arithmetic. Since repeated operations upon fractions will generally cause denominators to increase, use of unlimited-precision rational types can easily turn an algorithm which should run in linear time and constant space into one which takes exponential time and exponential space. That doesn't sound very "safe" to me. Commented Aug 1, 2012 at 21:40
7

Warning: The floating-point type System.Double lacks the precision for direct equality testing.

double x = CalculateX();
if (x == 0.1)   // may never be true: 0.1 has no exact binary floating-point representation
{
    // ...
}

I don't believe anything can or should be done at a language level.

9
  • 1
    I haven't used a float or double in a long time, so I'm curious. Is that an actual existing compiler warning, or just one you'd like to see? Commented Mar 28, 2011 at 18:00
  • 1
    @Karl - Personally I haven't seen it or needed it but I imagine it could be useful to dedicated but green developers. Commented Mar 28, 2011 at 19:28
  • 1
    The binary floating point types are no better or worse qualitatively than Decimal when it comes to equality testing. The difference between 1.0m/7.0m*7.0m and 1.0m may be many orders of magnitude less than the difference between 1.0/7.0*7.0 and 1.0, but it's not zero. Commented Jul 12, 2012 at 22:11
  • 3
    @Patrick - I'm not sure what you are getting at. There is a huge difference between something being true for one case and being true for all cases. Commented Aug 1, 2012 at 14:15
  • 1
    @ChaosPandion The problem with the example in this post isn't the equality-comparison, it's the floating-point literal. There is no float with the exact value 1.0/10. Floating point maths results in 100% accurate results when computing with integer numbers fitting within the mantissa. Commented Aug 1, 2012 at 20:01
6

I find it strange that nobody has pointed out the Lisp family's rational number trick.

Seriously, open sbcl, and do this: (+ 1 3) and you get 4. If you do (* 3 2) you get 6. Now try (/ 5 3) and you get 5/3, or 5 thirds.

That should help somewhat in some situations, shouldn't it?


Additional example from androclus:

@Waquo in another answer here makes the great suggestion that "...languages should use arbitrary-precision rationals for non-integer numbers." and "Those who need to optimize can always ask for floats."

And indeed, @Haakon here provides a clear and simple example of one such implementation. (I am only going to add a slightly more complicated demonstration of how this plays out with a "real world" problem..)

As @Haakon points out: (Common) LISP decades ago decided to solve the problem (posed here by @Adam Paynter ) by preserving everything internally as a rational (a ratio of 2 integers, i.e. a fraction), which loses no accuracy.

However, at any time the programmer can take this internal representation and print it out as an approximate decimal form with any number of decimal places they request.

Below is an example program in LISP which demonstrates just that. It uses the Leibniz method to find a decimal approximation for Pi. Pi is an irrational number of course, and therefore no rational (fraction or decimal) will capture it. But representations of required precision for whatever purpose can be generated.

Actual Pi in decimal form starts out with "3.1415926...". The more iterations you run the Leibniz algorithm (and this program) through, the closer your rational (whether fraction or decimal) will get to the real thing.

But the Leibniz algorithm is extremely slow to converge, even just to 10 decimal places(*2), and there are faster methods. So here we use only 11 iterations just to make the point here:

(defvar *sum* 0)
(loop for x from 0 to 10
 do (setf *sum* (+ *sum* (/ (expt -1 x) (+ (* 2 x) 1)))))
(setf *sum* (* *sum* 4))
(format t "Current Internal (rational): ~d~%" *sum*)
(format t "Current decimal approximation (to 4 digits): ~,4f~%" *sum*)

Notice that when you run this with LISP (*1), you'll get the following output:

Current Internal (rational): 47028692/14549535
Current decimal approximation (to 4 digits): 3.2323

The only thing is that as you iterate more (not just 11 iterations, but millions, billions, etc.), the decimal printout will get closer and closer to what we know as Pi (oscillating around it with + and - error). For example, adding just 5 more iterations gets you a much larger numerator and denominator:

Current Internal (rational): 13895021563328/4512611027925
Current decimal approximation (to 4 digits): 3.0792

and so on: As you can guess, internally the numerator and denominator of the fraction grow quickly, each becoming HUGE. It's actually fun to play with and see how much your computer can handle.(*2)

As a LISP fanboy, my guess is that way back, some very smart people might have noticed LISP's ability to do this and were inspired to carry it further (such as including irrationals and unknowns to be held internally as well). This would have indeed launched the symbolic and numerical analysis software family (Maxima, Mathematica, Maple, MatLab, MathCAD, SageMath, etc). But I've never seen it written that such was actually the case...

(*1) and I like the SBCL version too for its speed and the fact it is free and community-developed

(*2) Accuracy to 10 digits with Leibniz reputedly takes ~5 billion iterations. More than my 20 x 12th Gen Intel(R) Core(TM) i7-12700H running Linux and 32GB of RAM could do in a day and a half when I left it running. The fraction gets truly monstrous, but she just kept chuggin' and never complained.

3
  • I wonder, if possible to know if a result need to be represented as 1/3 or could be a exact decimal? Commented Apr 25, 2014 at 21:10
  • good suggestion Commented Jul 10, 2015 at 21:36
  • @Androclus this may be a wiki, but 16 edits changing the answer completely are a bit strange. Commented Apr 14 at 14:03
4

The two biggest problems involving floating point numbers are:

  • inconsistent units applied to the calculations (note this also affects integer arithmetic in the same way)
  • failure to understand that FP numbers are an approximation and how to intelligently deal with rounding.

The first type of failure can only be remedied by providing a composite type that includes value and unit information. For example, a length or area value that incorporates the unit (meters or square meters or feet and square feet respectively). Otherwise you have to be diligent about always working with one type of unit of measurement and only converting to another when we share the answer with a human.
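
A sketch of the "value plus unit" idea in C# (the Meters and Feet wrappers are purely illustrative): the point is that mixing units becomes a compile error rather than a silent numeric bug.

using System;

// Illustrative unit-carrying wrappers: you cannot add Meters to Feet by
// accident, and conversion only happens where it is spelled out.
readonly struct Meters
{
    public double Value { get; }
    public Meters(double value) => Value = value;
    public static Meters operator +(Meters a, Meters b) => new Meters(a.Value + b.Value);
    public Feet ToFeet() => new Feet(Value * 3.28084);
}

readonly struct Feet
{
    public double Value { get; }
    public Feet(double value) => Value = value;
}

class UnitsDemo
{
    static void Main()
    {
        var a = new Meters(2.0);
        var b = new Meters(0.5);
        Console.WriteLine((a + b).Value);        // 2.5
        // Console.WriteLine(a + new Feet(1.0)); // does not compile: mismatched units
    }
}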

The second type of failure is a conceptual failure. The failures manifest themselves when people think of them as absolute numbers. It affects equality operations, cumulative rounding errors, etc. For example, it may be correct that for one system two measurements are equivalent within a certain margin of error. I.e. .999 and 1.001 are roughly the same as 1.0 when you don't care about differences that are smaller than +/- .1. However, not all systems are that lenient.

If there is any language level facility needed, then I would call it equality precision. In NUnit, JUnit, and similarly constructed testing frameworks you can control the precision that is considered correct. For example:

Assert.That(.999, Is.EqualTo(1.001).Within(10).Percent);
// -- or --
Assert.That(.999, Is.EqualTo(1.001).Within(.1));

If, for example, C# or Java were altered to include a precision operator, it might look something like this:

if(.999 == 1.001 within .1) { /* do something */ }

However, if you supply a feature like that, you also have to consider the case where equality is good if the +/- sides are not the same. For example, +1/-10 would consider two numbers equivalent if one of them was within 1 more, or 10 less than the first number. To handle this case, you might need to add a range keyword as well:

if(.999 == 1.001 within range(.001, -.1)) { /* do something */ }
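
As a commenter below notes, this does not strictly need new syntax; a small library helper can cover both the symmetric and the asymmetric case today (the method names and the exact semantics of the asymmetric check are made up for illustration):

using System;

static class ApproximateEquality
{
    // Symmetric tolerance: |a - b| <= delta.
    public static bool IsAbout(this double a, double b, double delta) =>
        Math.Abs(a - b) <= delta;

    // Asymmetric tolerance: b may be at most 'plus' above a
    // and at most 'minus' below a.
    public static bool IsAboutWithin(this double a, double b, double plus, double minus) =>
        b - a <= plus && a - b <= minus;
}

class PrecisionDemo
{
    static void Main()
    {
        double x = 0.999;
        Console.WriteLine(x.IsAbout(1.001, 0.1));               // True: within +/- 0.1
        Console.WriteLine(x.IsAboutWithin(0.95, 0.001, 0.1));   // True: at most 0.001 above, 0.1 below
        Console.WriteLine(x.IsAboutWithin(1.1, 0.001, 0.1));    // False: more than 0.001 above
    }
}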
12
  • 3
    I'd switch the order. The conceptual problem is pervasive. The units conversion issue is relatively minor by comparison. Commented Mar 28, 2011 at 16:58
  • I like the concept of a precision operator but as you mention further on it would definitely need to be well thought out. Personally I would be more inclined to see it as its own complete syntactical construct. Commented Mar 28, 2011 at 16:59
  • It could also very easily be done in a library. Commented Mar 28, 2011 at 17:12
  • 1
    @dan04: I was thinking more in terms of "all calculations accurate to within one percent" or the like. I've seen the tar-pit that is unit of measure handling and I'm staying well away. Commented Mar 29, 2011 at 12:05
  • 1
    About 25 years ago, I saw a numeric package featuring a type consisting of a pair of floating-point numbers representing the maximum and minimum possible values for a quantity. As numbers passed through calculations, the difference between maximum and minimum would grow. Effectively, this provided a means of knowing how much real precision was present in a calculated value. Commented Jul 12, 2012 at 22:18
2

What can programming languages do? Don't know if there's one answer to that question, because anything the compiler/interpreter does on the programmer's behalf to make his/her life easier usually works against performance, clarity, and readability. I think both the C++ way (pay only for what you need) and the Perl way (principle of least surprise) are valid, but it depends on the application.

Programmers still need to work with the language and understand how it handles floating points, because if they don't, they'll make assumptions, and one day the prescribed behavior won't match up with their assumptions.

My take on what the programmer needs to know:

  • What floating-point types are available on the system and in the language
  • What type is needed
  • How to express the intentions of what type is needed in the code
  • How to correctly take advantage of any automatic type promotion to balance clarity and efficiency while maintaining correctness
community wiki

2

What can programming languages do to avoid [floating point] pitfalls...?

Use sensible defaults, e.g. built-in support for decimals.

Groovy does this quite nicely, although with a bit of effort you can still write code to introduce floating point imprecision.

2

I agree there's nothing to do at the language level. Programmers must understand that computers are discrete and limited, and that many of the mathematical concepts represented in them are only approximations.

Never mind floating point: to avoid the typical problems with integer arithmetic, one has to understand that half of the bit patterns are used for negative numbers and that 2^64 is actually quite small.
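
For instance, in C# (default unchecked context), wrap-around is only one addition away, and 64 bits run out faster than intuition suggests once the unit is small:

using System;

class IntegerPitfalls
{
    static void Main()
    {
        int a = int.MaxValue;
        Console.WriteLine(a + 1);   // -2147483648: silent wrap-around in the default unchecked context

        // A signed 64-bit count of nanoseconds covers only about 292 years.
        Console.WriteLine(long.MaxValue / (365L * 24 * 60 * 60 * 1_000_000_000));  // 292
    }
}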

4
  • disagree, most languages currently give too much support for binary floating point types (why is == even defined for floats?) and not enough support for rationals or decimals Commented Aug 1, 2012 at 7:29
  • @jk: Even if the result of any computation would never be guaranteed equal to the result of any other computation, equality comparison would still be useful for the case where the same value gets assigned to two variables (though the equality rules commonly implemented are perhaps too loose, since x==y does not imply that performing a computation on x will yield the same result as performing the same computation on y). Commented Aug 1, 2012 at 15:57
  • @supercat you still need comparison, but i'd rather the language required me to specify a tolerance for each floating point comparison, i can then still get back to equality by choosing tolerance = 0, but i'm at least forced to make that choice Commented Aug 1, 2012 at 17:35
  • == yields true if two floating-point values are considered the same, and false if not. But NaN == x is false for all x, while +0 == -0 is true even though the two values are distinct. Commented Apr 17 at 14:15
2

One thing I would like to see would be a recognition that double to float should be regarded as a widening conversion, while float to double is narrowing(*). That may seem counter-intuitive, but consider what the types actually mean:

  • 0.1f means "13,421,773.5/134,217,728, plus or minus 1/268,435,456 or so".
  • 0.1 really means "3,602,879,701,896,397/36,028,797,018,963,968, plus or minus 1/72,057,594,037,927,936 or so".

If one has a double which holds the best representation of the quantity "one-tenth" and converts it to float, the result will be "13,421,773.5/134,217,728, plus or minus 1/268,435,456 or so", which is a correct description of the value.

By contrast, if one has a float which holds the best representation of the quantity "one-tenth" and converts it to double, the result will be "13,421,773.5/134,217,728, plus or minus 1/72,057,594,037,927,936 or so"--a level of implied accuracy which is wrong by a factor of over 53 million.
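
A short C# illustration of the false precision that appears when a float is widened to double (the printed digits may vary slightly between runtimes):

using System;

class WideningDemo
{
    static void Main()
    {
        float f = 0.1f;     // best float approximation of one tenth
        double d = f;       // implicit "widening" conversion

        Console.WriteLine(d == 0.1);  // False: d is the float's exact value, not one tenth
        Console.WriteLine(d);         // prints something like 0.10000000149011612
    }
}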

Although the IEEE 754 standard requires that floating-point maths be performed as though every floating-point number represents the exact numerical quantity precisely at the center of its range, that should not be taken to imply that floating-point values actually represent those exact numerical quantities. Rather, the requirement that the values be assumed to be at the center of their ranges stems from three facts: (1) calculations must be performed as though the operands have some particular precise values; (2) consistent and documented assumptions are more helpful than inconsistent or undocumented ones; (3) if one is going to make a consistent assumption, no other consistent assumption is apt to be better than assuming a quantity represents the center of its range.

Incidentally, I remember some 25 years or so ago, someone came up with a numerical package for C which used "range types", each consisting of a pair of 128-bit floats; all calculations would be done in such fashion as to compute the minimum and maximum possible value for each result. If one performed a big long iterative calculation and came up with a value of [12.53401391134 12.53902812673], one could be confident that while many digits of precision were lost to rounding errors, the result could still be reasonably expressed as 12.54 (and it wasn't really 12.9 or 53.2). I'm surprised I haven't seen any support for such types in any mainstream languages, especially since they would seem a good fit with math units that can operate on multiple values in parallel.
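
A toy version of such a range type in C# (a serious implementation would also round the lower bound down and the upper bound up on every operation; that detail is omitted here):

using System;

// Toy interval type: tracks the minimum and maximum possible value of a quantity.
readonly struct Interval
{
    public double Min { get; }
    public double Max { get; }
    public Interval(double min, double max) { Min = min; Max = max; }

    public static Interval operator +(Interval a, Interval b) =>
        new Interval(a.Min + b.Min, a.Max + b.Max);

    public static Interval operator -(Interval a, Interval b) =>
        new Interval(a.Min - b.Max, a.Max - b.Min);

    public override string ToString() => $"[{Min}, {Max}]";
}

class IntervalDemo
{
    static void Main()
    {
        var x = new Interval(12.53, 12.54);  // a value known only to within +/- 0.005
        var y = new Interval(0.99, 1.01);
        Console.WriteLine(x + y);            // roughly [13.52, 13.55]: the width shows the remaining precision
    }
}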

(*) In practice, it's often helpful to use double-precision values to hold intermediate computations when working with single-precision numbers, so having to use a typecast for all such operations could be annoying. Languages could help by having a "fuzzy double" type, which would perform computations as double, and could be freely cast to and from single; this would be especially helpful if functions which take parameters of type double and return double could be marked so that they would automatically generate an overload which accepts and returns "fuzzy double" instead.

1

If more programming languages took a page from databases and allowed developers to specify the length and precision of their numeric data types, they could substantially reduce the probability of floating point related errors. If a language allowed a developer to declare a variable as a Float(2), indicating that they needed a floating point number with two decimal digits of precision, it could perform mathematical operations much more safely. If it did so by representing the variable as an integer internally and dividing by 100 before exposing the value, it could improve speed by using the faster integer arithmetic paths. The semantics of a Float(2) would also let developers avoid the constant need to round data before outputting it since a Float(2) would inherently round data to two decimal points.

Of course, you'd need to allow a developer to ask for a maximum-precision floating point value when the developer needs to have that precision. And you would introduce problems where slightly different expressions of the same mathematical operation produce potentially different results because of intermediate rounding operations when developers don't carry enough precision in their variables. But at least in the database world, that doesn't seem to be too big a deal. Most people aren't doing the sorts of scientific calculations that require lots of precision in intermediate results.

4
  • Specifying length and precision would do very little that is useful. Having fixed-point base 10 would be useful for financial processing, which would remove much of the surprise people get from floating-point. Commented Mar 28, 2011 at 17:43
  • @David - Perhaps I'm missing something but how is a fixed-point base 10 data type different than what I'm proposing here? A Float(2) in my example would have a fixed 2 decimal digits and would automatically round to the nearest hundredth which is what you'd likely use for simple financial calculations. More complex calculations would require that the developer allocated a larger number of decimal digits. Commented Mar 28, 2011 at 17:52
  • 2
    What you're advocating is a fixed-point base 10 data type with programmer-specified precision. I'm saying that the programmer-specified precision is mostly pointless, and will just lead to the sorts of errors I used to run into in COBOL programs. (For example, when you change the precision of variables, it's real easy to miss one variable the value runs through. For another, it will take a lot more thinking about intermediate result size than is good.) Commented Mar 28, 2011 at 17:59
  • 4
    A Float(2) like you propose should not be called Float, since there is nothing floating here, certainly not the "decimal point". Commented Mar 28, 2011 at 19:59
1

One thing languages could do: remove the equality comparison from floating point types, other than a direct comparison to the NaN values.

Equality testing would only exist as a function call that takes the two values and a delta, or, for languages like C# that allow types to have methods, an EqualsTo that takes the other value and the delta.

1

As other answers have noted, the only real way to avoid floating point pitfalls in financial software is not to use it there. This may actually be feasible -- if you provide a well-designed library dedicated to financial math.

Functions designed to import floating-point estimates should be clearly labelled as such, and provided with parameters appropriate to that operation, e.g.:

Finance.importEstimate(float value, Finance roundingStep)

The only real way to avoid floating point pitfalls in general is education -- programmers need to read and understand something like What Every Programmer Should Know About Floating-Point Arithmetic.

A few things that might help, though:

  • I'll second those who ask "why is exact equality testing for floating point even legal?"
  • Instead, use an isNear() function.
  • Provide, and encourage use of, floating-point accumulator objects (which add sequences of floating point values more stably than simply adding them all into a regular floating point variable).
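
One well-known form of such an accumulator is Kahan (compensated) summation; a minimal C# sketch:

using System;

// Kahan (compensated) summation: carries a running correction term so that
// many small addends are not simply lost against a large running total.
sealed class KahanAccumulator
{
    private double sum;
    private double compensation;

    public void Add(double value)
    {
        double y = value - compensation;
        double t = sum + y;              // low-order bits of y may be lost here...
        compensation = (t - sum) - y;    // ...but are recovered into the compensation term
        sum = t;
    }

    public double Sum => sum;
}

class KahanDemo
{
    static void Main()
    {
        double naive = 0.0;
        var kahan = new KahanAccumulator();
        for (int i = 0; i < 10_000_000; i++)
        {
            naive += 0.1;
            kahan.Add(0.1);
        }
        Console.WriteLine(naive);      // noticeably off from 1000000
        Console.WriteLine(kahan.Sum);  // very close to 1000000
    }
}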
1

Education is the only solution. Pitfalls cannot be avoided because there are infinitely many real numbers but a digital computer can only represent a finite number of bits. So in any numeric format some numbers cannot be represented precisely, which leads to a loss of precision, which may lead to incorrect output.

Binary floating point has the well known problem where:

> 0.1d + 0.2d
0.30000000000000004 

(Using C# here; the d postfix indicates the double-precision floating-point type)

Decimal types handle the above problem nicely, but have different issues:

> 0.1m + 0.2m
0.3
> (1.0m/3.0m) * 3.0m
0.9999999999999999999999999999

(The m postfix indicates the decimal type)

Decimal types have the advantage that numbers which can be written as decimals with a finite number of digits can be represented precisely, so they are preferred when numbers are presented in decimal form to end-users. But you still have precision issues with numbers like 1/3, so it is not a panacea.

Note that these problems are not specific to floating point numbers. Fixed-point numbers exhibit the same problems. For example, one answer suggests representing numbers as integers with an implicit scale factor of 0.01. But integer arithmetic certainly has its fair share of pitfalls:

> 1 / 3
0
> (1 / 3) * 3
0

Rational types handle the above examples nicely, but they do not support irrational numbers like pi or e, and cannot support operations like square root. Or, depending on the language, they will just fall back to using floating point types in those cases, which is an additional pitfall.

Another problem with rational types is that repeated operations may result in very large numerator or denominator values, which will either result in an overflow (if using fixed-size integers) or run out of memory.

Bottom line is that every numeric format has its limitations and pitfalls, and a programmer needs to understand this.

I don't think there is much a programming language can do about this fundamental problem, but a few things could perhaps avoid some pitfalls:

  • Integer should be the default numeric type.
  • The language should provide a decimal type along with the binary floating point type, and it should be just as easy to use. Neither should be the default; you should always specify which one to use.
  • There should be no automatic conversion from integer to floating-point. A conversion should always be explicit.
  • Binary floating point values should not print in decimal notation by default, but should require an explicit operation to print as decimal.
  • Division should not be an allowed operation for integers. Integer division should be supported, but through a distinct operator or function.
  • Equality comparison should not be a supported operation for floating-point types (or at least it should not use the default == operator).

This might avoid some of the obvious pitfalls (like integer division) and force the programmer to be explicit about the choice of numeric format.

A more radical approach could be to introduce an "imprecise" runtime flag on numeric values. If an operation causes rounding, the resulting value gets the "imprecise" flag. Any operation involving imprecise inputs will always have an imprecise result, except for rounding operations. The type system could indicate where imprecise values are allowed. This would probably have severe performance implications, but might make sense for a teaching language.
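
A sketch of how such a tagged value might look in C# (purely illustrative: this version only propagates the flag and crudely marks every division result as imprecise, rather than detecting rounding exactly):

using System;

// Illustrative "imprecise" tagging: once a value has been rounded (here,
// crudely, any division result), everything derived from it stays tagged.
readonly struct Tagged
{
    public double Value { get; }
    public bool Imprecise { get; }
    public Tagged(double value, bool imprecise) { Value = value; Imprecise = imprecise; }

    public static Tagged Exact(double value) => new Tagged(value, false);

    public static Tagged operator +(Tagged a, Tagged b) =>
        new Tagged(a.Value + b.Value, a.Imprecise || b.Imprecise);

    public static Tagged operator /(Tagged a, Tagged b) =>
        new Tagged(a.Value / b.Value, true);   // assume the division rounded

    public override string ToString() => Imprecise ? $"{Value} (imprecise)" : $"{Value}";
}

class TaggedDemo
{
    static void Main()
    {
        var third = Tagged.Exact(1) / Tagged.Exact(3);
        Console.WriteLine(third + Tagged.Exact(1));  // prints something like 1.3333333333333333 (imprecise)
    }
}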

answered Apr 17 at 12:47
community wiki

1
  • "The language should provide a decimal type along with the binary floating point type" This is something I think would really help. A built-in decimal type that is just as prominent (if not more) as the floating-point types. For the most part, I think if a developer doesn't understand the difference between floats and decimals, they probably should be using decimal. Commented Apr 17 at 14:50
0
  • Languages have Decimal type support; of course this doesn't really solve the problem, since you still have no exact and finite representation of, for example, 1⁄3;
  • Some DBs and frameworks have Money type support; this is basically storing the number of cents as an integer;
  • There are some libraries for rational number support; that solves the problem of 1⁄3, but doesn't solve the problem of, for example, √2.

The above are applicable in some cases, but are not really a general solution for dealing with float values. The real solution is to understand the problem and learn how to deal with it. If you're using floating point calculations, you should always check whether your algorithms are numerically stable. There is a huge field of mathematics/computer science which relates to this problem. It's called Numerical Analysis.

0

The problem is that many programmers haven't learned their basic tools and have no idea what floating point numbers and decimal numbers actually mean.

64 bit double precision floating point numbers with 53 mantissa bits can represent every real number less than 2^54 with an error of at most 1, and every real number less than 2^47 with an error less than 1/100. If these numbers are dollars, then any amount less than 140 trillion dollars can be represented with an error of less than one cent. Does anyone have a problem with that? Probably not.

There is a problem that two different calculations can produce two different numbers, each precise enough. A simple clean-up step, round(100*x) / 100, makes sure there is only one number left. An awful lot of problems are gone at almost no cost if you use dollars and cents. (There are some countries where the larger money unit is divided into 1000 parts, so you'd multiply and divide by 1000.) Of course, if you work with arbitrary real numbers, then don't make such changes.
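
In C#, that clean-up step is a one-liner; shown here on the classic 0.1 + 0.2 example:

using System;

class RoundToCents
{
    static void Main()
    {
        double a = 0.1 + 0.2;   // stored as 0.30000000000000004...
        double b = 0.3;         // stored as 0.29999999999999998...

        Console.WriteLine(a == b);                      // False
        Console.WriteLine(Math.Round(100 * a) / 100
                       == Math.Round(100 * b) / 100);   // True: both snap to the same 0.30
    }
}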

Decimal numbers can handle multiples of 10^-k for one fixed k. Sometimes that is enough for handling money values, sometimes it's not. For example, UK VAT cannot be calculated using two decimal places. You cannot exactly divide an annual money amount by 12 or 360 or 365. What you need to do is find the exact legal rules you have to follow and implement those rules, whatever it takes. There are no easy tricks in a programming language for that.

answered Apr 16 at 12:23
community wiki

-2

Most programmers would be surprised that COBOL got that right... in the first version of COBOL there was no floating point, only decimal, and the tradition in COBOL continued until today that the first thing you think of when declaring a number is decimal... floating point would only be used if you really needed it. When C came along, for some reason, there was no primitive decimal type, so in my opinion, that's where all the problems started.

12
  • 2
    C didn't have a decimal type because it isn't primitive, very few computers having any sort of hardware decimal instructions. You might ask why BASIC and Pascal didn't have it, since they weren't designed to conform closely to the metal. COBOL and PL/I are the only languages I know of the time that had anything like that. Commented Mar 28, 2011 at 21:54
  • 6
    @JoelFan: so how do you write 1⁄3 in COBOL? Decimal doesn't solve any problems, base 10 is just as inaccurate as base 2. Commented Mar 28, 2011 at 23:01
  • 2
    Decimal solves the problem of exactly representing dollars and cents, which is useful for a "Business Oriented" language. But otherwise, decimal is useless; it has the same kinds of errors (e.g., 1/3*3=0.99999999) while being much slower. Which is why it's not the default in languages that weren't specifically designed for accounting. Commented Mar 29, 2011 at 0:12
  • 1
    And FORTRAN, which predates C by more than a decade, doesn't have standard decimal support either. Commented Mar 29, 2011 at 5:09
  • 2
    @JoelFan: if you have quarterly value and you need per month value, guess what do you have to multiply it by... no, it's not 0.33, it's 1⁄3. Commented Mar 29, 2011 at 16:24
