This is Python 3.5 code.
I have attempted to develop my own frexp() function that mimics the functionality of the Python standard library's math.frexp().
I'm interested in suggestions on how to make my code more Pythonic, and on how to achieve better performance and precision for this function.
My code:
import ctypes

def frexp(x):
    *binstr, = bin(ctypes.c_int.from_buffer(ctypes.c_float(x)).value)
    if binstr[0] == '-' : binstr.pop(0)
    exponent = int ( (''.join(binstr))[2:10], 2 ) - 127
    mantissa = 1
    for index, item in enumerate(''.join(binstr)[10:]):
        if item == '1':
            mantissa += 2 ** ( -(index + 1) )
    return mantissa, exponent
Precision: not accurate
>>> import random
>>> testednumber = random.random() + random.random() * ( 10 ** 10 )
>>> testednumber
5478324268.848498
>>> # This is my code's output
>>> import mymath
>>> mymath.frexp(testednumber)
(1.2755217552185059, 32)
>>> # This is Python's frexp() output
>>> import math
>>> math.frexp(testednumber)
(0.6377608828303053, 33)
>>> # my result:
>>> (1.2755217552185059) * ( 2 ** 32 )
5478324224.0
>>> # Python's frexp() result:
>>> (0.6377608828303053) * ( 2 ** 33 )
5478324268.848498
Speed: 55 times slower than math.frexp()
>>> # This is my minimum speed
>>> print(min(timeit.Timer("mymath.frexp(testednumber)", setup="import mymath; testednumber = 5478324268.848498").repeat(7,1000)))
0.01864783600103692
>>> print(min(timeit.Timer("mymath.frexp(testednumber)", setup="import mymath; testednumber = 5478324268.848498").repeat(7,1000)))
0.01870606800002861
>>> print(min(timeit.Timer("mymath.frexp(testednumber)", setup="import mymath; testednumber = 5478324268.848498").repeat(7,1000)))
0.018558342000687844
>>> # This is Python's frexp() minimum speed
>>> print(min(timeit.Timer("math.frexp(testednumber)", setup="import math; testednumber = 5478324268.848498").repeat(7,1000)))
0.0003347589990880806
>>> print(min(timeit.Timer("math.frexp(testednumber)", setup="import math; testednumber = 5478324268.848498").repeat(7,1000)))
0.0003345459990669042
>>> print(min(timeit.Timer("math.frexp(testednumber)", setup="import math; testednumber = 5478324268.848498").repeat(7,1000)))
0.0003347729998495197
2 Answers
1. Review
The ctypes module is dangerous (it's possible to crash Python if you get it wrong) and so it should be used as a last resort, when no other facilities will do the job. In this case we could use the struct module instead:

>>> import struct
>>> # Reinterpret a double-precision float as a 64-bit unsigned integer
>>> struct.unpack('Q', struct.pack('d', 5478324268.848498))
(4752538446597034867,)
The code in the post gets at the bits in the representation by converting to a string of 0s and 1s. It would be better to use Python's bitwise operators instead, to avoid converting to a string and back again. Let's start by declaring named constants for the shifts and masks:
# IEEE-754 double-precision floating-point format
SIGN_SHIFT = 63
EXPONENT_SHIFT = 52
EXPONENT_MASK = (1 << 11) - 1
MANTISSA_MASK = (1 << 52) - 1
and then:

Q, = struct.unpack('Q', struct.pack('d', 5478324268.848498))
sign = Q >> SIGN_SHIFT
biased_exponent = (Q >> EXPONENT_SHIFT) & EXPONENT_MASK
mantissa = Q & MANTISSA_MASK
To assemble the result, normalize the exponent and reverse the disassembly process:
NORMAL_EXPONENT = 1023
R = (sign << SIGN_SHIFT) + (NORMAL_EXPONENT << EXPONENT_SHIFT) + mantissa
normalized, = struct.unpack('d', struct.pack('Q', R))
return normalized, biased_exponent - NORMAL_EXPONENT
(Use 1022 instead of 1023 if you want to match math.frexp exactly.)
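For example, here is a quick check (not part of the original answer; it reuses the constants defined above and the sample number from the question) that a bias of 1022 reproduces math.frexp exactly:

>>> import math, struct
>>> Q, = struct.unpack('Q', struct.pack('d', 5478324268.848498))
>>> sign = Q >> SIGN_SHIFT
>>> biased_exponent = (Q >> EXPONENT_SHIFT) & EXPONENT_MASK
>>> mantissa = Q & MANTISSA_MASK
>>> R = (sign << SIGN_SHIFT) + (1022 << EXPONENT_SHIFT) + mantissa
>>> normalized, = struct.unpack('d', struct.pack('Q', R))
>>> (normalized, biased_exponent - 1022) == math.frexp(5478324268.848498)
True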
2. Revised code
import struct
from math import isinf, isnan

# IEEE-754 double-precision floating-point format
SIGN_SHIFT = 63
EXPONENT_SHIFT = 52
EXPONENT_MASK = (1 << 11) - 1
MANTISSA_MASK = (1 << 52) - 1
NORMAL_EXPONENT = 1023

def frexp(x):
    """Return the mantissa and exponent of x, as pair (m, e).

    m is a float and e is an int, such that x = m * 2.**e.
    If x is 0, m and e are both 0. Else 1.0 <= abs(m) < 2.0.

    """
    x = float(x)
    if x == 0.0 or isinf(x) or isnan(x):
        return x, 0
    Q, = struct.unpack('Q', struct.pack('d', x))
    sign = Q >> SIGN_SHIFT
    biased_exponent = (Q >> EXPONENT_SHIFT) & EXPONENT_MASK
    mantissa = Q & MANTISSA_MASK
    R = (sign << SIGN_SHIFT) + (NORMAL_EXPONENT << EXPONENT_SHIFT) + mantissa
    normalized, = struct.unpack('d', struct.pack('Q', R))
    return normalized, biased_exponent - NORMAL_EXPONENT
(Note that this doesn't get denormals right. I could fix it with some more work, but I think I'd rather just use math.frexp.)
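A quick sanity check (not part of the original answer; it uses the sample number from the question) comparing this version with math.frexp:

>>> import math
>>> m, e = frexp(5478324268.848498)
>>> M, E = math.frexp(5478324268.848498)
>>> m * 2.0 ** e == M * 2.0 ** E
True
>>> (m, e) == (2 * M, E - 1)   # same value, just normalized to [1.0, 2.0) instead of [0.5, 1.0)
True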
3. Use cases for frexp
frexp is useful when computing mathematical functions, where we need to get the argument into a known range in order to avoid overflow, or to use a particular solution technique. Two examples:
- To compute \$\log a\$ we can use frexp to find \$a = m2^e\$ with \${1\over2} \le m < 1\$, and then $$\log a = \log m2^e = \log m + e\log 2$$ and since \$m\$ is close to 1 we can use a polynomial approximation (and \$\log 2\$ can be hard-coded).
- The geometric mean of \$a_1, a_2, \dots, a_n\$ is \$\sqrt[n]{a_1a_2\dotsb a_n}\$, but the multiplication might overflow the floating-point range. Using frexp we can replace \$a_i\$ by \$m_i2^{e_i}\$, which gives us $$\eqalign{\sqrt[n]{a_1a_2\dotsb a_n} &= \sqrt[n]{m_12^{e_1}m_22^{e_2}\dotsb m_n2^{e_n}} \\ &= 2^{(e_1 + e_2 + \dotsb + e_n)/n}\sqrt[n]{m_1m_2\dotsb m_n}}$$ where \${1\over2} \le m_i < 1\$. We can now divide the multiplication into chunks of length 1023 or less, knowing that each chunk can be multiplied and remain in range. We then use frexp on the results of the chunks to extract some more powers of 2, and repeat. (A simplified sketch in code appears after this list.)

(An alternative approach to computing the geometric mean while keeping the intermediate results in range would be to use logarithms and exponentiation: $$\sqrt[n]{a_1a_2\dotsb a_n} = \exp\left({\log a_1 + \log a_2 + \dotsb + \log a_n \over n}\right)$$ but computing a logarithm is much more expensive than multiplying.)
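Here is a minimal sketch of the second example (not from the original answer, and simplified: rather than multiplying in chunks of up to 1023 values, it renormalizes the running product with frexp after every multiplication):

import math

def geometric_mean(values):
    """Geometric mean of positive floats, keeping the running product
    in floating-point range by pulling powers of 2 out with frexp."""
    total_exponent = 0
    product = 1.0
    for value in values:
        m, e = math.frexp(value)      # value == m * 2**e with 0.5 <= m < 1
        total_exponent += e
        # Renormalize so the running product never under- or overflows.
        product, extra = math.frexp(product * m)
        total_exponent += extra
    n = len(values)
    q, r = divmod(total_exponent, n)  # whole and fractional parts of the mean exponent
    # The product of the inputs is product * 2**total_exponent, so the
    # geometric mean is its n-th root.
    return math.ldexp(product ** (1.0 / n) * 2.0 ** (r / n), q)

Renormalizing on every step costs one extra frexp per element; the chunked version described above trades that for one frexp per chunk.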
- Thanks! Your algorithm is 10 times slower than math.frexp(). I have added it to the main post. – Ruan, Jul 25, 2017 at 17:03
- Why use if x == 0.0 or isinf(x) or isnan(x) instead of if x == 0.0 or x == float("inf") or x == float("-inf") or x == float("nan")? Won't this import math slow down the code? – Ruan, Jul 25, 2017 at 21:22
- x == float('nan') is always false, so that won't work. – Gareth Rees, Jul 25, 2017 at 21:28
- Just curious, what would be the practical application of frexp()? How do people use it in real life? – Ruan, Jul 26, 2017 at 10:50
The reason that your version is 55 times slower than math.frexp() is that the math module in Python makes calls to C functions. When Python imports the math module, it loads the extension built from mathmodule.c, which calls the underlying C function directly. You will probably never be able to match the speed of the built-in math library (or even get close).
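If the goal is just to shave some of the Python-level overhead rather than match C, one small thing to try (a sketch, not from either answer; the name frexp_struct is just for illustration, and the gain may be modest because the struct module already caches compiled format strings) is precompiling the formats with struct.Struct:

import math
import struct

# Variant of the revised frexp from the first answer, with the struct
# formats compiled once at import time instead of looked up on every call.
# As in that answer, denormals are not handled correctly.
_DOUBLE = struct.Struct('d')
_UINT64 = struct.Struct('Q')

def frexp_struct(x):
    x = float(x)
    if x == 0.0 or math.isinf(x) or math.isnan(x):
        return x, 0
    Q, = _UINT64.unpack(_DOUBLE.pack(x))
    sign = Q >> 63
    biased_exponent = (Q >> 52) & ((1 << 11) - 1)
    mantissa = Q & ((1 << 52) - 1)
    # Bias of 1022 so the mantissa lands in [0.5, 1.0), matching math.frexp.
    R = (sign << 63) + (1022 << 52) + mantissa
    normalized, = _DOUBLE.unpack(_UINT64.pack(R))
    return normalized, biased_exponent - 1022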
- I know that, but I just wanted to get closer to math.frexp(). We went from 55 times slower to 10 times slower, so that's a great improvement. – Ruan, Jul 25, 2017 at 21:24