I am having some trouble with packing and unpacking binary floats in Python when writing a binary file. Here is what I have done:
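import struct
value = 1.23456
# Pack the value as a 4-byte C float and write it out.
with open('file.bin', 'wb') as f:
    data = struct.pack('f', value)
    f.write(data)
# Read the 4 bytes back and unpack them (unpack returns a tuple).
with open('file.bin', 'rb') as f:
    print(struct.unpack('f', f.read(4)))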
The result I get is the following:
(1.2345600128173828,)
What is going on with the extra digits? Is this a rounding error? How does this work?
- Yes, floating point numbers are, by their nature, imprecise. – Martijn Pieters, Apr 23, 2013 at 9:16
- For the full why, see "What Every Computer Scientist Should Know About Floating-Point Arithmetic". – Martijn Pieters, Apr 23, 2013 at 9:17
- The Python tutorial summarizes the representation problems that you encountered. – Martijn Pieters, Apr 23, 2013 at 9:20
- If you want to avoid losing precision, you could pickle a Decimal instead. – Aya, Apr 23, 2013 at 9:33
2 Answers
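On most platforms, Python floats are what C would call a double (64 bits, with a 53-bit significand), but you wrote your data out as a C float instead, which is only 32 bits wide with a 24-bit significand. That keeps about 7 significant decimal digits rather than about 15 to 17.
If you pack the value as a double ('d') instead, it round-trips unchanged: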
>>> data = struct.pack('d',value)
>>> struct.unpack('d',data)
(1.23456,)
>>> data = struct.pack('f',value)
>>> struct.unpack('f',data)
(1.2345600128173828,)
The float struct format offers only single precision (a 24-bit significand).
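As a rough illustration of the difference, the sketch below round-trips the same value through both formats and prints the exact decimal expansion of what each one stores. It assumes the usual platform where 'f' is 4 bytes and 'd' is 8 bytes; the variable names are made up for this example.

```python
import struct
from decimal import Decimal

value = 1.23456

# Round-trip the same Python float through single and double precision.
as_single = struct.unpack('f', struct.pack('f', value))[0]
as_double = struct.unpack('d', struct.pack('d', value))[0]

# Size in bytes of each format (4 and 8 on common platforms).
print(struct.calcsize('f'), struct.calcsize('d'))

# Decimal(float) converts the stored binary value exactly, with no rounding.
print(Decimal(as_single))
print(Decimal(as_double))

print(as_double == value)  # the double comes back unchanged
print(as_single == value)  # the single was rounded to 24 significant bits
```

Neither printed expansion is exactly 1.23456; the double is simply close enough that Python's shortest-repr printing shows 1.23456 again.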
6 Comments
0.8388607, where did you get that number from? The print function isn't showing "more precision"; it's just converting the value that is actually stored to base ten.
It's a decimal to binary problem.
You know how some fractions in decimal are repeating? For instance, 1/3 is 0.3333333... forever, and 1/7 is 0.142857142857[142857]... forever.
So here's the kicker: the repeating fractions are those whose denominator (in lowest terms) has a prime factor that is not a factor of 10, i.e. anything other than 2 or 5.
- 1/2 divides evenly
- 1/3 repeats
- 1/4 divides evenly
- 1/5 divides evenly
- 1/6 repeats
- 1/7 repeats
- 1/8 divides evenly
- 1/9 repeats
- 1/10 divides evenly
- 1/11 repeats
- and so forth
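The list above can be checked mechanically. This is a minimal sketch, not a library routine: `terminates` is a hypothetical helper that strips out every prime factor the denominator shares with the base and reports whether anything is left over.

```python
from math import gcd

def terminates(denominator, base):
    """Return True if 1/denominator has a finite expansion in the given base."""
    # Repeatedly divide out the factors shared with the base.
    g = gcd(denominator, base)
    while g > 1:
        while denominator % g == 0:
            denominator //= g
        g = gcd(denominator, base)
    # If only 1 remains, every prime factor was also a factor of the base.
    return denominator == 1

for n in range(2, 12):
    label = "divides evenly" if terminates(n, 10) else "repeats"
    print("1/%d %s" % (n, label))
```

Calling the same helper with `base=2` answers the question for binary fractions instead.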
So how does that work in binary? Not as nicely, because the only prime factor of the base is 2. A fraction terminates in binary only if its denominator (in lowest terms) is a power of 2; any other prime factor gives an expansion that repeats forever. That includes tenths, hundredths, etc., which all have a factor of 5 in the denominator. 1.2345 is 12345/10000 (2469/2000 in lowest terms), whose denominator has factors 2 and 5, and that 5 means its binary expansion repeats forever.
But you can't store digits forever. The expansion has to be rounded off to fit in the binary digits that encode your float.
When you convert back to decimal, the rounding error is revealed.
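To see the rounding directly, the standard `fractions` and `decimal` modules can show the exact value a float holds. The sketch below uses the 1.2345 example from above; the names `wanted` and `stored` are just for illustration.

```python
from decimal import Decimal
from fractions import Fraction

wanted = Fraction(12345, 10000)  # the exact decimal value 1.2345
stored = Fraction(1.2345)        # the exact value of the nearest double

print(wanted)                    # reduced to lowest terms: 2469/2000
print(stored.denominator)        # a power of two, with no factor of 5
print(stored == wanted)          # False: the stored value was rounded

# The same stored value, written out in full in base ten.
print(Decimal(1.2345))
```

The difference between `stored` and `wanted` is tiny, but it is exactly what shows up as "extra digits" when the value is printed with enough precision.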
The upshot for coding is: calculate divisions as late as possible to keep these errors from accumulating with each calculation.
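A small, hedged illustration of that advice: adding one tenth ten times rounds at every step, while counting in whole numbers and dividing once at the end rounds only once. The variable names are invented for this sketch.

```python
# Divide early: each 0.1 is already rounded, and every addition rounds again.
early = 0.0
for _ in range(10):
    early += 1.0 / 10

# Divide late: integer counting is exact, so only the final division can round.
late = 0
for _ in range(10):
    late += 1
late = late / 10.0

print(early)  # 0.9999999999999999
print(late)   # 1.0
```

How much this matters depends on the calculation; the point is only that each intermediate rounding is another chance for error to build up.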