Frequently, in my programming experience, I need to decide whether I should use float or double for my real numbers. Sometimes I go for float, sometimes I go for double, but really the choice feels subjective. If I were asked to defend my decision, I probably could not give sound reasons.
When do you use float and when do you use double? Do you always use double, and go for float only when memory constraints are present? Or do you always use float unless the precision requirement forces you to use double? Are there substantial differences in the computational cost of basic arithmetic between float and double? What are the pros and cons of using float or double? And have you ever used long double?
- In many cases you want to use neither, but rather a decimal floating-point or fixed-point type. Binary floating-point types can't represent most decimals exactly. – CodesInChaos, Feb 28, 2013 at 11:20
- Related to "What causes floating point rounding errors?". @CodesInChaos my answer there suggests resources to help you make that determination; there is no one-size-fits-all solution. – Mark Booth, Feb 28, 2013 at 13:26
- What exactly do you mean by "decimals"? If you need to represent values like 0.01 exactly (say, for money), then (binary) floating-point is not the answer. If you merely mean non-integer numbers, then floating-point is likely OK, but then "decimals" is not the best word to describe what you need. – Keith Thompson, Feb 28, 2013 at 16:22
- Considering that (as of today) most graphics cards accept floats over doubles, graphics programming often uses single precision. – Thomas Eding, Aug 19, 2014 at 17:01
- You don't always have a choice. For example, on the Arduino platform, both double and float equate to float. You need to find an add-in library to handle real doubles. – kiwiron, Apr 29, 2016 at 5:22
8 Answers
The default choice for a floating-point type should be double. This is also the type that you get with floating-point literals without a suffix or (in C) standard functions that operate on floating-point numbers (e.g. exp, sin, etc.).
float should only be used if you need to operate on a lot of floating-point numbers (think on the order of thousands or more) and analysis of the algorithm has shown that the reduced range and accuracy don't pose a problem.
long double can be used if you need more range or accuracy than double, and if it provides this on your target platform.
In summary, float and long double should be reserved for use by the specialists, with double for "every-day" use.
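The point about literals and standard functions can be checked at compile time. The following is a minimal sketch, assuming a C11 compiler (for _Generic); the TYPE_NAME macro and the function names are made up for illustration and are not part of any library.

```c
#include <math.h>

/* Map an expression to the name of its type at compile time (C11 _Generic).
   The controlling expression is not evaluated; only its type is inspected. */
#define TYPE_NAME(x) _Generic((x),  \
    float: "float",                 \
    double: "double",               \
    long double: "long double",     \
    default: "other")

/* An unsuffixed floating-point literal has type double. */
const char *type_of_plain_literal(void) { return TYPE_NAME(1.5); }

/* The f suffix gives a float, the L suffix a long double. */
const char *type_of_f_literal(void) { return TYPE_NAME(1.5f); }
const char *type_of_l_literal(void) { return TYPE_NAME(1.5L); }

/* sin() from <math.h> takes and returns double, even for a float argument. */
const char *type_of_sin_result(void) { return TYPE_NAME(sin(1.5f)); }

/* Mixing a float with an unsuffixed literal promotes the result to double. */
const char *type_of_mixed_product(void) { return TYPE_NAME(2.0f * 1.5); }
```

So code that is meant to stay in float has to say so explicitly, with f-suffixed literals and the float variants of the math functions (sinf, expf and so on).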
- I would probably not consider float for a few thousand values unless there were a performance problem related to floating point caching and data transfer. There is usually a substantial cost to doing the analysis to show that float is precise enough. – Patricia Shanahan, Feb 28, 2013 at 15:35
- As an addendum, if you need compatibility with other systems, it can be advantageous to use the same data types. – zzzzBov, Feb 28, 2013 at 16:30
- I'd use floats for millions of numbers, not thousands. Also, some GPUs do better with floats; in that specialized case use floats. Else, as you say, use doubles. – user949300, Aug 19, 2014 at 16:57
- @PatriciaShanahan 'performance problem related to..' A good example is if you are planning to use SSE2 or similar vector instructions: you can do 4 ops/vector in float (vs 2 per double), which can give a significant speed improvement (half as many ops and half as much data to read & write). This can significantly lower the threshold where using floats becomes attractive, and worth the trouble to sort out the numeric issues. – greggo, Sep 9, 2014 at 19:03
- I endorse this answer with one additional piece of advice: when one is operating with RGB values for display, it is acceptable to use float (and occasionally half-precision) because neither the human eye, the display, nor the color system has that many bits of precision. This advice is applicable to, say, OpenGL. It does not apply to medical images, which have stricter precision requirements. – rwong, Nov 17, 2014 at 22:00
There is rarely cause to use float instead of double in code targeting modern computers. The extra precision reduces (but does not eliminate) the chance of rounding errors or other imprecision causing problems.
The main reasons I can think of to use float are:
- You are storing large arrays of numbers and need to reduce your program's memory consumption.
- You are targeting a system that doesn't natively support double-precision floating point. Until recently, many graphics cards only supported single-precision floating point. I'm sure there are plenty of low-power and embedded processors that have limited floating-point support too.
- You are targeting hardware where single-precision is faster than double-precision, and your application makes heavy use of floating point arithmetic. On modern Intel CPUs I believe all floating point calculations are done in double precision, so you don't gain anything here.
- You are doing low-level optimization, for example using special CPU instructions that operate on multiple numbers at a time.
So, basically, double is the way to go unless you have hardware limitations or unless analysis has shown that storing double precision numbers is contributing significantly to memory usage.
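To make "reduces (but does not eliminate)" concrete, here is a small sketch that adds one tenth to an accumulator many times, once in float and once in double. The function names are invented for this illustration, and the behaviour described assumes IEEE 754 single and double precision without extended-precision intermediates.

```c
#include <stddef.h>

/* Add 0.1 to a float accumulator n times. Each addition rounds to
   24 significant bits, so the error grows as the total gets larger. */
float sum_tenths_float(size_t n) {
    float total = 0.0f;
    for (size_t i = 0; i < n; i++) {
        total += 0.1f;
    }
    return total;
}

/* Same loop with a double accumulator (53 significant bits). */
double sum_tenths_double(size_t n) {
    double total = 0.0;
    for (size_t i = 0; i < n; i++) {
        total += 0.1;
    }
    return total;
}
```

Called with n = 1000000, the float version drifts far from 100000, while the double version is off only in the later decimal places. Neither result is exact, because 0.1 has no finite binary representation.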
- "Modern computers" meaning Intel x86 processors. Some of the machines the Ancients used provided perfectly adequate precision with the basic float type. (The CDC 6600 used a 60-bit word, 48 bits of normalized floating-point mantissa, 12 bits of exponent. That's ALMOST what the x86 gives you for double precision.) – John R. Strohm, Aug 19, 2014 at 17:03
- @John.R.Strohm: agreed, but C compilers did not exist on CDC6600. It was Fortran IV... – Basile Starynkevitch, Aug 19, 2014 at 20:41
- By "modern computers" I mean any processor built in the last decade or two, or really, since the IEEE floating point standard was widely implemented. I'm perfectly aware that non-x86 architectures exist and had that in mind with my answer - I mentioned GPUs and embedded processors, which are typically not x86. – Tim Armstrong, Jan 28, 2015 at 21:43
- That's simply not true, though. SSE2 can manipulate 4 floats or 2 doubles in one operation, AVX can manipulate 8 floats or 4 doubles, AVX-512 can manipulate 16 floats or 8 doubles. For any kind of high performance computing, math on floats should be thought of as twice the speed of the same operations on doubles on x86. – Larry Gritz, Sep 20, 2016 at 18:19
- And it's even worse than that, since you can fit twice as many floats in processor cache as you can with doubles, and memory latency is likely to be the main bottleneck in many programs. Keeping a whole working set of floats warm in cache may be literally an order of magnitude faster than using doubles and having them spill to RAM. – Larry Gritz, Sep 20, 2016 at 18:20
Use double for all your calculations and temp variables. Use float when you need to maintain an array of numbers, float[] (if precision is sufficient), and you are dealing with tens of thousands of float numbers or more.
Many/most math functions or operators convert to or return double, and you don't want to cast the numbers back to float for any intermediate steps.
E.g. if you have an input of 100,000 numbers from a file or a stream and need to sort them, put the numbers in a float[].
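The "float for storage, double for calculations" pattern might look like the sketch below. The function name mean_of_floats is hypothetical and chosen only for this example.

```c
#include <stddef.h>

/* Mean of a float array. The data stays in float to save memory, but the
   running sum is a double so the intermediate steps keep double precision. */
double mean_of_floats(const float *values, size_t count) {
    if (count == 0) {
        return 0.0;  /* convention chosen here for an empty input */
    }
    double sum = 0.0;
    for (size_t i = 0; i < count; i++) {
        sum += values[i];  /* each float is widened to double before adding */
    }
    return sum / (double)count;
}
```

The array costs 4 bytes per element instead of 8 on typical platforms, while the accumulation itself avoids the rounding that a float running sum would introduce.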
Some platforms (ARM Cortex-M2, Cortex-M4 etc) don't support double in hardware. (You can always check this in the reference manual for your processor. The absence of compilation warnings or errors does not mean that the code is optimal, because double can be emulated in software.) That is why you may need to stick to int or float.
If that is not the case, I would use double.
You can check the famous article by D. Goldberg ("What Every Computer Scientist Should Know About Floating-Point Arithmetic"). You should think twice before using floating-point arithmetic. There is a pretty big chance it is not needed at all in your particular situation.
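As a partial check from within the code, the macros in <float.h> report which formats the compiler uses for each type. This is only a sketch with made-up function names, and it says nothing about whether the FPU implements the format in hardware or the compiler emulates it; for that the processor's reference manual remains the source of truth.

```c
#include <float.h>
#include <stdbool.h>

/* True when double carries more significand bits than float on this target.
   On a target where double is merely an alias for float this returns false. */
bool double_is_wider_than_float(void) {
    return DBL_MANT_DIG > FLT_MANT_DIG;
}

/* Decimal digits guaranteed to survive a round trip through each type. */
int float_decimal_digits(void)  { return FLT_DIG; }
int double_decimal_digits(void) { return DBL_DIG; }
```

A project that relies on real double precision could turn the same comparison into a compile-time guard, for example `_Static_assert(DBL_MANT_DIG >= 53, "double is too narrow")` in C11.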
- This question was already pretty well answered a year ago... but in any case, I'd say any time you're using double on platforms with double-precision FPU acceleration, you should be using it on any other, even if that means letting the compiler emulate it instead of taking advantage of an FPU that supports float only (note that FPUs aren't required on all platforms either; in fact the Cortex-M4 architecture defines them as an optional feature [was M2 a typo?]). – Selali Adobor, Sep 22, 2014 at 23:23
- The key to that logic is that, while it's true one should be wary of floating-point arithmetic and its many "quirks", one should definitely not take the presence of FPU support for doubles to mean "simply use doubles instead of floats". Floats are very generally faster than doubles and take less memory (FPU features vary). The volume of usage precludes this point from being a matter of premature optimization, as does the fact that doubles are clearly overkill for a lot of (maybe even most) applications. Do the elements on this page really need to have their relative positions and sizes calculated to 13 decimal places? – Selali Adobor, Sep 22, 2014 at 23:36
- When including a link to an off site page or document, please copy the relevant information, or summary, from the document into your answer. Off site links have a tendency to disappear over time. – Adam Zuckerman, Sep 23, 2014 at 0:10
For real world problems the sampling threshold of your data is important when answering this question. Similarly, the noise floor is also important. If either is exceeded by your data type selection, no benefit will come from increasing precision.
Most real-world samplers are limited to 24-bit ADCs. This suggests that a 32-bit float, whose significand has 24 bits of precision, should be adequate for calculations on real-world data.
Double precision comes at the cost of 2x memory. Therefore limiting the use of doubles over floats could drastically cut the memory footprint/bandwidth of running applications.
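The 24-bit argument can be tested directly: an integer sample fits in a float without loss as long as it needs no more than 24 significant bits. The helper below is a sketch with an invented name, assuming an IEEE 754 single-precision float.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if an integer sample survives conversion to float and back unchanged.
   The comparison is done in int64_t so that a float which rounded up to 2^31
   can still be converted back without overflow. */
bool survives_float_round_trip(int32_t sample) {
    float as_float = (float)sample;
    return (int64_t)as_float == (int64_t)sample;
}
```

Every value a signed 24-bit sampler can produce (from -8388608 to 8388607) passes this check, whereas 16777217, which needs 25 significant bits, does not.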
A very simple rule: You use double unless you, personally, can give reasons that you can defend, why you would use float.
Consequently, if you ask "should I use double or float", the answer is "use double".
The choice between float and double depends on the accuracy required. If the answer must differ only negligibly from the true value, many decimal places are needed, which dictates the use of double. float will chop off some of those decimal places, reducing the accuracy.
- This answer doesn't add anything new to the question, and fails to say anything of actual use. – Martijn Pieters, Feb 7, 2015 at 11:26
Usually, I use the float type when I don't need much precision, for example for money. That is wrong, but it is what I am used to doing.
On the other hand, I use double when I need more precision, for example for complex mathematical algorithms.
The C99 standard says this:
There are three floating point types: float, double, and long double. The type double provides at least as much precision as float, and the type long double provides at least as much precision as double. The set of values of the type float is a subset of the set of values of the type double; the set of values of the type double is a subset of the set of values of the type long double.
I never really used long double, but I don't use C/C++ so much. Usually I use dynamically typed languages like Python, where you don't have to care about the types.
For further information about Double vs Float, see this question at SO.
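To illustrate why binary floating point is the wrong tool for money, the sketch below adds ten cents repeatedly, once as a double and once as an integer count of cents. The function names are hypothetical, and integer cents are just one possible exact representation; a decimal type is another.

```c
#include <stdint.h>

/* Add ten cents n times using binary floating point.
   0.10 has no exact binary representation, so rounding error accumulates. */
double add_dimes_double(int n) {
    double total = 0.0;
    for (int i = 0; i < n; i++) {
        total += 0.10;
    }
    return total;
}

/* Add ten cents n times using an integer number of cents: always exact. */
int64_t add_dimes_cents(int n) {
    int64_t total_cents = 0;
    for (int i = 0; i < n; i++) {
        total_cents += 10;
    }
    return total_cents;
}
```

On IEEE 754 hardware, add_dimes_double(10) does not compare equal to 1.0, while add_dimes_cents(10) is exactly 100. Switching from float to double shrinks the error but does not remove it.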
- Using floating point for serious money calculations is probably a mistake. – Bart van Ingen Schenau, Feb 28, 2013 at 10:53
- float is exactly the wrong type for money. You need to be using the highest precision possible. – ChrisF, Feb 28, 2013 at 10:56
- @BartvanIngenSchenau Floating point for money is usually okay; binary floating point is not. For example, .NET's Decimal is a floating-point type and it's typically a good choice for money calculations. – CodesInChaos, Feb 28, 2013 at 11:21
- @ChrisF You don't need "high precision" for money, you need exact values. – Sean McSomething, Feb 28, 2013 at 19:37
- @SeanMcSomething Fair point. However, floats are still the wrong type, and given the floating-point types available in most languages you need "high precision" to get "exact values". – ChrisF, Mar 1, 2013 at 8:38