This is Python 3.5 code.
I have attempted to develop my own frexp() function that mimics the functionality of the Python standard library's math.frexp().
I'm interested in suggestions on how to make my code more Pythonic, and on how to achieve better performance and precision for this function.
My code:
import ctypes

def frexp(x):
    *binstr, = bin(ctypes.c_int.from_buffer(ctypes.c_float(x)).value)
    if binstr[0] == '-' : binstr.pop(0)
    exponent = int ( (''.join(binstr))[2:10], 2 ) - 127
    mantissa = 1
    for index, item in enumerate(''.join(binstr)[10:]):
        if item == '1':
            mantissa += 2 ** ( -(index + 1) )
    return mantissa, exponent
Precision: not accurate
>>> import random
>>> testednumber = random.random() + random.random() * ( 10 ** 10 )
>>> testednumber
5478324268.848498
>>> # This is my code's output
>>> import mymath
>>> mymath.frexp(testednumber)
(1.2755217552185059, 32)
>>> # This is Python's frexp() output
>>> import math
>>> math.frexp(testednumber)
(0.6377608828303053, 33)
>>> # my result:
>>> (1.2755217552185059) * ( 2 ** 32 )
5478324224.0
>>> # Python's frexp() result:
>>> (0.6377608828303053) * ( 2 ** 33 )
5478324268.848498
Speed: 55 times slower than math.frexp()
>>> # This is my minimum speed
>>> print(min(timeit.Timer("mymath.frexp(testednumber)", setup="import mymath; testednumber = 5478324268.848498").repeat(7,1000)))
0.01864783600103692
>>> print(min(timeit.Timer("mymath.frexp(testednumber)", setup="import mymath; testednumber = 5478324268.848498").repeat(7,1000)))
0.01870606800002861
>>> print(min(timeit.Timer("mymath.frexp(testednumber)", setup="import mymath; testednumber = 5478324268.848498").repeat(7,1000)))
0.018558342000687844
>>> # This is Python's frexp() minimum speed
>>> print(min(timeit.Timer("math.frexp(testednumber)", setup="import math; testednumber = 5478324268.848498").repeat(7,1000)))
0.0003347589990880806
>>> print(min(timeit.Timer("math.frexp(testednumber)", setup="import math; testednumber = 5478324268.848498").repeat(7,1000)))
0.0003345459990669042
>>> print(min(timeit.Timer("math.frexp(testednumber)", setup="import math; testednumber = 5478324268.848498").repeat(7,1000)))
0.0003347729998495197
2 Answers
1. Review
The ctypes module is dangerous (it's possible to crash Python if you get it wrong) and so it should be used as a last resort, when no other facilities will do the job. In this case we could use the struct module instead:

>>> import struct
>>> # Reinterpret a double-precision float as a 64-bit unsigned integer
>>> struct.unpack('Q', struct.pack('d', 5478324268.848498))
(4752538446597034867,)
The code in the post gets at the bits in the representation by converting to a string of 0s and 1s. It would be better to use Python's bitwise operators instead, to avoid converting to a string and back again. Let's start by declaring named constants for the shifts and masks:
# IEEE-754 double-precision floating-point format
SIGN_SHIFT = 63
EXPONENT_SHIFT = 52
EXPONENT_MASK = (1 << 11) - 1
MANTISSA_MASK = (1 << 52) - 1
and then:

Q, = struct.unpack('Q', struct.pack('d', 5478324268.848498))
sign = Q >> SIGN_SHIFT
biased_exponent = (Q >> EXPONENT_SHIFT) & EXPONENT_MASK
mantissa = Q & MANTISSA_MASK
To assemble the result, normalize the exponent and reverse the disassembly process:
NORMAL_EXPONENT = 1023
R = (sign << SIGN_SHIFT) + (NORMAL_EXPONENT << EXPONENT_SHIFT) + mantissa
normalized, = struct.unpack('d', struct.pack('Q', R))
return normalized, biased_exponent - NORMAL_EXPONENT
(Use 1022 instead of 1023 if you want to match math.frexp exactly.)
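For example, here is a quick check (not part of the original answer; it reuses the constants defined above and the sample number from the question) that a bias of 1022 reproduces math.frexp exactly:

>>> import math, struct
>>> Q, = struct.unpack('Q', struct.pack('d', 5478324268.848498))
>>> sign = Q >> SIGN_SHIFT
>>> biased_exponent = (Q >> EXPONENT_SHIFT) & EXPONENT_MASK
>>> mantissa = Q & MANTISSA_MASK
>>> R = (sign << SIGN_SHIFT) + (1022 << EXPONENT_SHIFT) + mantissa
>>> normalized, = struct.unpack('d', struct.pack('Q', R))
>>> (normalized, biased_exponent - 1022) == math.frexp(5478324268.848498)
True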
2. Revised code
import struct
from math import isinf, isnan

# IEEE-754 double-precision floating-point format
SIGN_SHIFT = 63
EXPONENT_SHIFT = 52
EXPONENT_MASK = (1 << 11) - 1
MANTISSA_MASK = (1 << 52) - 1
NORMAL_EXPONENT = 1023

def frexp(x):
    """Return the mantissa and exponent of x, as pair (m, e).

    m is a float and e is an int, such that x = m * 2.**e.
    If x is 0, m and e are both 0. Else 1.0 <= abs(m) < 2.0.

    """
    x = float(x)
    if x == 0.0 or isinf(x) or isnan(x):
        return x, 0
    Q, = struct.unpack('Q', struct.pack('d', x))
    sign = Q >> SIGN_SHIFT
    biased_exponent = (Q >> EXPONENT_SHIFT) & EXPONENT_MASK
    mantissa = Q & MANTISSA_MASK
    R = (sign << SIGN_SHIFT) + (NORMAL_EXPONENT << EXPONENT_SHIFT) + mantissa
    normalized, = struct.unpack('d', struct.pack('Q', R))
    return normalized, biased_exponent - NORMAL_EXPONENT
(Note that this doesn't get denormals right. I could fix it with some more work, but I think I'd rather just use math.frexp.)
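A quick sanity check (not part of the original answer; it uses the sample number from the question) comparing this version with math.frexp:

>>> import math
>>> m, e = frexp(5478324268.848498)
>>> M, E = math.frexp(5478324268.848498)
>>> m * 2.0 ** e == M * 2.0 ** E
True
>>> (m, e) == (2 * M, E - 1)   # same value, just normalized to [1.0, 2.0) instead of [0.5, 1.0)
True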
3. Use cases for frexp
frexp is useful when computing mathematical functions, where we need to get the argument into a known range in order to avoid overflow, or to use a particular solution technique. Two examples:
- To compute \$\log a\$ we can use frexp to find \$a = m2^e\$ with \${1\over2} \le m < 1\$, and then $$\log a = \log m2^e = \log m + e\log 2$$ and since \$m\$ is close to 1 we can use a polynomial approximation (and \$\log 2\$ can be hard-coded).
- The geometric mean of \$a_1, a_2, \dots, a_n\$ is \$\sqrt[n]{a_1a_2\dotsb a_n}\$, but the multiplication might overflow the floating-point range. Using frexp we can replace \$a_i\$ by \$m_i2^{e_i}\$, which gives us $$\eqalign{\sqrt[n]{a_1a_2\dotsb a_n} &= \sqrt[n]{m_12^{e_1}m_22^{e_2}\dotsb m_n2^{e_n}} \\ &= 2^{(e_1 + e_2 + \dotsb + e_n)/n}\sqrt[n]{m_1m_2\dotsb m_n}}$$ where \${1\over2} \le m_i < 1\$. We can now divide the multiplication into chunks of length 1023 or less, knowing that each chunk can be multiplied and remain in range. We then use frexp on the results of the chunks to extract some more powers of 2, and repeat. (A simplified sketch in code appears after this list.)

(An alternative approach to computing the geometric mean while keeping the intermediate results in range would be to use logarithms and exponentiation: $$\sqrt[n]{a_1a_2\dotsb a_n} = \exp\left({\log a_1 + \log a_2 + \dotsb + \log a_n \over n}\right)$$ but computing a logarithm is much more expensive than multiplying.)
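Here is a minimal sketch of the second example (not from the original answer, and simplified: rather than multiplying in chunks of up to 1023 values, it renormalizes the running product with frexp after every multiplication):

import math

def geometric_mean(values):
    """Geometric mean of positive floats, keeping the running product
    in floating-point range by pulling powers of 2 out with frexp."""
    total_exponent = 0
    product = 1.0
    for value in values:
        m, e = math.frexp(value)      # value == m * 2**e with 0.5 <= m < 1
        total_exponent += e
        # Renormalize so the running product never under- or overflows.
        product, extra = math.frexp(product * m)
        total_exponent += extra
    n = len(values)
    q, r = divmod(total_exponent, n)  # whole and fractional parts of the mean exponent
    # The product of the inputs is product * 2**total_exponent, so the
    # geometric mean is its n-th root.
    return math.ldexp(product ** (1.0 / n) * 2.0 ** (r / n), q)

Renormalizing on every step costs one extra frexp per element; the chunked version described above trades that for one frexp per chunk.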
- Thanks! Your algorithm is 10 times slower than math.frexp(). I have added it to the main post. – Ruan, Jul 25, 2017 at 17:03
- Why use if x == 0.0 or isinf(x) or isnan(x) instead of if x == 0.0 or x == float("inf") or x == float("-inf") or x == float("nan")? Won't this import math slow down the code? – Ruan, Jul 25, 2017 at 21:22
- x == float('nan') is always false, so that won't work. – Gareth Rees, Jul 25, 2017 at 21:28
- Just curious, what would be the practical application of frexp()? How do people use it in real life? – Ruan, Jul 26, 2017 at 10:50
The reason that your version is 55 times slower than math.frexp() is that the math module in Python makes calls to C functions. When Python imports the math module, it loads the extension built from mathmodule.c, which calls the underlying C function directly. You will probably never be able to match the speed of the built-in math library (or even get close).
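If the goal is just to shave some of the Python-level overhead rather than match C, one small thing to try (a sketch, not from either answer; the name frexp_struct is just for illustration, and the gain may be modest because the struct module already caches compiled format strings) is precompiling the formats with struct.Struct:

import math
import struct

# Variant of the revised frexp from the first answer, with the struct
# formats compiled once at import time instead of looked up on every call.
# As in that answer, denormals are not handled correctly.
_DOUBLE = struct.Struct('d')
_UINT64 = struct.Struct('Q')

def frexp_struct(x):
    x = float(x)
    if x == 0.0 or math.isinf(x) or math.isnan(x):
        return x, 0
    Q, = _UINT64.unpack(_DOUBLE.pack(x))
    sign = Q >> 63
    biased_exponent = (Q >> 52) & ((1 << 11) - 1)
    mantissa = Q & ((1 << 52) - 1)
    # Bias of 1022 so the mantissa lands in [0.5, 1.0), matching math.frexp.
    R = (sign << 63) + (1022 << 52) + mantissa
    normalized, = _DOUBLE.unpack(_UINT64.pack(R))
    return normalized, biased_exponent - 1022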
- I know that, but I just wanted to get closer to math.frexp(). We went from 55 times slower to 10 times slower, so that's a great improvement. – Ruan, Jul 25, 2017 at 21:24