This issue tracker has been migrated to GitHub ,
and is currently read-only.
For more information,
see the GitHub FAQs in the Python's Developer Guide.
Created on 2015年12月22日 22:18 by johnwalker, last changed 2022年04月11日 14:58 by admin. This issue is now closed.
| Files | ||||
|---|---|---|---|---|
| File name | Uploaded | Description | Edit | |
| issue25928.diff | skrah, 2015年12月28日 14:12 | review | ||
| Messages (27) | |||
|---|---|---|---|
| msg256873 - (view) | Author: John Walker (johnwalker) * | Date: 2015年12月22日 22:18 | |
In statistics, there is a FIXME on Line 250 above _decimal_to_ratio that says:
# FIXME This is faster than Fraction.from_decimal, but still too slow.
Half of the time is spent in a conversion in d.as_tuple(). Decimal internally stores the digits as a string, but in d.as_tuple(), the digits are individually cast to integers and returned as a tuple of integers.
This is OK, but _decimal_to_ratio undoes the work that was done in d.as_tuple() by adding them all back into an integer. A similar, but slightly different approach is taken in Fractions.from_decimal, where the tuple is cast into a string and then parsed into an integer. We can be a lot faster if we use the _int instance variable directly.
In the case of _decimal_to_ratio, the new code seems to be twice as fast with usage _decimal_to_ratio(Decimal(str(random.random()))):
def _decimal_to_ratio(d):
sign, exp = d._sign, d._exp
if exp in ('F', 'n', 'N'): # INF, NAN, sNAN
assert not d.is_finite()
return (d, None)
num = int(d._int)
if exp < 0:
den = 10**-exp
else:
num *= 10**exp
den = 1
if sign:
num = -num
return (num, den)
If the performance improvement is considered worthwhile, here are a few solutions I see.
1) Use _int directly in fractions and statistics.
2) Add a digits_as_str method to Decimal. This prevents exposing _int as an implementation detail, and makes sense to me since I suspect there is a lot of code casting the tuple of int to a string anyway.
3) Add a parameter to as_tuple for determining whether digits should be returned as a string or a tuple.
4) Deprecate _int in Decimal and add a new reference str_digits.
There are probably more solutions. I lean towards 4, because it makes usage easier and avoids cluttering Decimal with methods.
Here is what I used for benchmarks:
========
import timeit
old_setup = """
import random
from decimal import Decimal
def _decimal_to_ratio(d):
sign, digits, exp = d.as_tuple()
if exp in ('F', 'n', 'N'): # INF, NAN, sNAN
assert not d.is_finite()
return (d, None)
num = 0
for digit in digits:
num = num*10 + digit
if exp < 0:
den = 10**-exp
else:
num *= 10**exp
den = 1
if sign:
num = -num
return (num, den)
def run_it():
dec = Decimal(str(random.random()))
_decimal_to_ratio(dec)
"""
new_setup = """
import random
from decimal import Decimal
def _decimal_to_ratio(d):
sign, exp = d._sign, d._exp
if exp in ('F', 'n', 'N'): # INF, NAN, sNAN
assert not d.is_finite()
return (d, None)
num = int(d._int)
if exp < 0:
den = 10**-exp
else:
num *= 10**exp
den = 1
if sign:
num = -num
return (num, den)
def run_it():
dec = Decimal(str(random.random()))
_decimal_to_ratio(dec)
"""
if __name__ == '__main__':
print("Testing proposed implementation")
print("number = 10000")
print(timeit.Timer(stmt='run_it()', setup=new_setup).timeit(number=10000))
print("number = 100000")
print(timeit.Timer(stmt='run_it()', setup=new_setup).timeit(number=100000))
print("number = 1000000")
print(timeit.Timer(stmt='run_it()', setup=new_setup).timeit(number=1000000))
print("Testing old implementation")
print("number = 10000")
print(timeit.Timer(stmt='run_it()', setup=old_setup).timeit(number=10000))
print("number = 100000")
print(timeit.Timer(stmt='run_it()', setup=old_setup).timeit(number=100000))
print("number = 1000000")
print(timeit.Timer(stmt='run_it()', setup=old_setup).timeit(number=1000000))
|
|||
| msg256874 - (view) | Author: John Walker (johnwalker) * | Date: 2015年12月22日 22:21 | |
Heres the output of running the benchmark on my machine: Testing proposed implementation number = 10000 0.07098613299967838 number = 100000 0.6952260910002224 number = 1000000 6.948197601999709 Testing current implementation number = 10000 0.1418162760000996 number = 100000 1.350394603001405 number = 1000000 13.625065807000283 |
|||
| msg256914 - (view) | Author: Stefan Krah (skrah) * (Python committer) | Date: 2015年12月23日 15:07 | |
I guess there's some version mixup here: From Python 3.3 on the integrated C version of decimal does not store the digits as a string and does not have the private _int method. |
|||
| msg256917 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) | Date: 2015年12月23日 15:18 | |
May be implement the as_integer_ratio() method and/or numerator and denominator attributes in the Decimal class? |
|||
| msg256932 - (view) | Author: Steven D'Aprano (steven.daprano) * (Python committer) | Date: 2015年12月23日 20:57 | |
On Wed, Dec 23, 2015 at 03:18:28PM +0000, Serhiy Storchaka wrote: > May be implement the as_integer_ratio() method and/or numerator and > denominator attributes in the Decimal class? That would also be good as it will decrease the API differences between floats and Decimals and make it easier to duck-type one for the other. |
|||
| msg256933 - (view) | Author: John Walker (johnwalker) * | Date: 2015年12月23日 21:01 | |
> I guess there's some version mixup here: From Python 3.3 on > the integrated C version of decimal does not store the digits > as a string and does not have the private _int method. Stefan, _int is a slot in Lib/_pydecimal.py. It should be defined on python 3.5 and tip, unsure about other versions. Python 3.5.1 (default, Dec 7 2015, 12:58:09) [GCC 5.2.0] on linux Type "help", "copyright", "credits" or "license" for more information. >>> from decimal import Decimal >>> Decimal("100.00") Decimal('100.00') >>> Decimal("100.00")._int '10000' |
|||
| msg256934 - (view) | Author: Stefan Krah (skrah) * (Python committer) | Date: 2015年12月23日 21:30 | |
On Wed, Dec 23, 2015 at 09:01:22PM +0000, John Walker wrote: > Stefan, _int is a slot in Lib/_pydecimal.py. It should be defined on python 3.5 and tip, unsure about other versions. > > Python 3.5.1 (default, Dec 7 2015, 12:58:09) > [GCC 5.2.0] on linux > Type "help", "copyright", "credits" or "license" for more information. > >>> from decimal import Decimal > >>> Decimal("100.00") > Decimal('100.00') > >>> Decimal("100.00")._int > '10000' That should only happen if the C version did not build for some reason: Python 3.6.0a0 (default:323c10701e5d, Dec 14 2015, 14:28:41) [GCC 4.8.4] on linux Type "help", "copyright", "credits" or "license" for more information. >>> from decimal import Decimal >>> Decimal("100.00")._int Traceback (most recent call last): File "<stdin>", line 1, in <module> AttributeError: 'decimal.Decimal' object has no attribute '_int' >>> |
|||
| msg256936 - (view) | Author: John Walker (johnwalker) * | Date: 2015年12月23日 21:46 | |
> That should only happen if the C version did not build for some reason: Ahh, gotcha. I assume one instance where this happens is when the machine doesn't have libmpdec.so |
|||
| msg256937 - (view) | Author: Stefan Krah (skrah) * (Python committer) | Date: 2015年12月23日 21:50 | |
No, the regular build uses the libmpdec that is shipped with Python. The external libmpdec.so only comes into play if you compile --with-system-libmpdec. |
|||
| msg256938 - (view) | Author: John Walker (johnwalker) * | Date: 2015年12月23日 22:05 | |
> No, the regular build uses the libmpdec that is shipped with > Python. The external libmpdec.so only comes into play if you > compile --with-system-libmpdec. Oh, OK. I see whats happening. My distro deletes the shipped version and compiles --with-system-libmpdec. We're on the same page now, thanks. https://projects.archlinux.org/svntogit/packages.git/tree/trunk/PKGBUILD?h=packages/python |
|||
| msg256940 - (view) | Author: Stefan Krah (skrah) * (Python committer) | Date: 2015年12月23日 22:25 | |
Let's re-target this issue:
Implementing as_integer_ratio() sounds reasonable, but let's hope
people won't try to convert Decimal('1E+999999999999999999').
Mark, was there perhaps a reason not to add the method?
|
|||
| msg257003 - (view) | Author: Terry J. Reedy (terry.reedy) * (Python committer) | Date: 2015年12月25日 17:58 | |
A visible new feature is an enhancement (even if performance is the reason for the new feature) and can only go in 3.6. |
|||
| msg257073 - (view) | Author: Stefan Krah (skrah) * (Python committer) | Date: 2015年12月27日 12:13 | |
Previously the method was rejected in #8947. But the speedup is quite dramatic. This is a benchmark for converting "1.91918261298362195e-100", 1000000 repetitions: float.as_integer_ratio: 0.548023 Decimal.as_integer_ratio: normalize=False: 2.661191 Decimal.as_integer_ratio: normalize=True: 4.382903 Fraction.from_decimal: 29.436584 |
|||
| msg257088 - (view) | Author: Mark Dickinson (mark.dickinson) * (Python committer) | Date: 2015年12月27日 19:12 | |
> Mark, was there perhaps a reason not to add the method? In the past, there's been a desire to keep the decimal module minimal in scope, implementing little more than what's required for the specification. A number of proposed additions have been rejected on this basis. Ah, and now I read Stefan's reference to #8947. I'm not sure anything has really changed since that discussion. Count me as -0E0 on the proposal. Off-topic: I wonder whether `int.as_integer_ratio` should be implemented (for the same reasons that `int.numerator`, `int.real` and so on are). |
|||
| msg257090 - (view) | Author: Stefan Krah (skrah) * (Python committer) | Date: 2015年12月27日 19:53 | |
Keeping the API small is a good reason against it. There aren't many ways to destructure a Decimal though: As far as I know, the return value of as_tuple() dates back from the time when the coefficient was actually a tuple. The function is slow and not really nice to use. Probably many people would be happy if we added as_integer_ratio() and an as_triple() function that represents the coefficient as an integer. > Off-topic: I wonder whether `int.as_integer_ratio` should be implemented (for the same reasons that `int.numerator`, `int.real` and so on are). If users want to duck-type, I think yes. |
|||
| msg257094 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) | Date: 2015年12月27日 20:10 | |
Arguments against as_integer_ratio() look weighty. But may be there are less arguments against numerator and denominator? |
|||
| msg257097 - (view) | Author: Raymond Hettinger (rhettinger) * (Python committer) | Date: 2015年12月27日 20:30 | |
Now that there is more than one use case for Decimal.as_integer_ratio(), I'll add my support to this feature request. |
|||
| msg257115 - (view) | Author: Stefan Krah (skrah) * (Python committer) | Date: 2015年12月28日 14:12 | |
Here's a patch. The Python and doc parts are from Mark (#8947). I did not optimize the normalization yet, in C the code is less clean and there were some corner cases where the gcd is actually faster. |
|||
| msg257137 - (view) | Author: Roundup Robot (python-dev) (Python triager) | Date: 2015年12月28日 22:14 | |
New changeset f3b09c269af0 by Stefan Krah in branch 'default': Issue #25928: Add Decimal.as_integer_ratio(). Python parts and docs by https://hg.python.org/cpython/rev/f3b09c269af0 |
|||
| msg257141 - (view) | Author: Stefan Krah (skrah) * (Python committer) | Date: 2015年12月28日 22:37 | |
Hopefully I wasn't moving too fast, but I won't have time in the next days/weeks. Serhiy and Alexander (#8947) would prefer more homogeneous error messages and docstrings between float/pydecimal/cdecimal. I wouldn't mind a patch in another issue (no argument clinic!), provided that we agree on something and have input from the native English speakers. Thanks for the review! |
|||
| msg257147 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) | Date: 2015年12月28日 22:52 | |
> Hopefully I wasn't moving too fast, but I won't have time in the next days/weeks. No, the patch is pretty clear. Thanks. > Serhiy and Alexander (#8947) would prefer more homogeneous error messages and docstrings between float/pydecimal/cdecimal. This would help to optimize creating a Fraction. Opened issue25971 for this. |
|||
| msg257149 - (view) | Author: Roundup Robot (python-dev) (Python triager) | Date: 2015年12月28日 23:12 | |
New changeset 510ff609cb4f by Stefan Krah in branch 'default': Issue #25928: Temporarily disable some tests in test_statistics in order https://hg.python.org/cpython/rev/510ff609cb4f |
|||
| msg257150 - (view) | Author: Stefan Krah (skrah) * (Python committer) | Date: 2015年12月28日 23:15 | |
Steven, could you have a look at the failures in test_statistics? Some tests fail because they assume non-normalized fractions, I still have to look at the other assumptions. |
|||
| msg257151 - (view) | Author: Stefan Krah (skrah) * (Python committer) | Date: 2015年12月28日 23:47 | |
Ah yes, the test_statistics failures look like #18841 again. |
|||
| msg257179 - (view) | Author: Stefan Krah (skrah) * (Python committer) | Date: 2015年12月29日 12:51 | |
I've opened #25974 for the statistics.py issues. |
|||
| msg262887 - (view) | Author: Raymond Hettinger (rhettinger) * (Python committer) | Date: 2016年04月05日 02:39 | |
I don't think a new public method should have been added. Historically, we've been careful to not grow the API beyond what is in the spec or the dunder methods required to interface with standard Python. The feature creep is at odds with the intended goals for the module that have been present since the outset. As long as the spec remains unchanged, the API for this module should be treated as stable. Another issue is that the API for the module is already so large that it impairs usability. Please don't make it worse by adding new methods and inventing details that aren't in the spec. |
|||
| msg262896 - (view) | Author: Stefan Krah (skrah) * (Python committer) | Date: 2016年04月05日 10:10 | |
Raymond, you added your support in msg257097. I'm not very happy to spend my time implementing the feature and then rehashing everything after 3 months. |
|||
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022年04月11日 14:58:25 | admin | set | github: 70116 |
| 2018年03月13日 19:04:17 | belopolsky | link | issue8947 superseder |
| 2016年04月05日 10:10:41 | skrah | set | nosy:
+ gvanrossum messages: + msg262896 |
| 2016年04月05日 02:39:13 | rhettinger | set | nosy:
+ tim.peters messages: + msg262887 |
| 2015年12月29日 12:51:14 | skrah | set | status: open -> closed resolution: fixed messages: + msg257179 stage: patch review -> resolved |
| 2015年12月28日 23:47:09 | skrah | set | messages: + msg257151 |
| 2015年12月28日 23:15:21 | skrah | set | messages: + msg257150 |
| 2015年12月28日 23:12:04 | python-dev | set | messages: + msg257149 |
| 2015年12月28日 22:52:50 | serhiy.storchaka | set | messages: + msg257147 |
| 2015年12月28日 22:37:29 | skrah | set | messages: + msg257141 |
| 2015年12月28日 22:14:15 | python-dev | set | nosy:
+ python-dev messages: + msg257137 |
| 2015年12月28日 16:47:16 | serhiy.storchaka | set | stage: test needed -> patch review |
| 2015年12月28日 14:12:26 | skrah | set | files:
+ issue25928.diff keywords: + patch messages: + msg257115 |
| 2015年12月27日 20:30:25 | rhettinger | set | messages: + msg257097 |
| 2015年12月27日 20:10:57 | serhiy.storchaka | set | messages: + msg257094 |
| 2015年12月27日 19:53:37 | skrah | set | messages: + msg257090 |
| 2015年12月27日 19:12:46 | mark.dickinson | set | messages: + msg257088 |
| 2015年12月27日 12:13:24 | skrah | set | nosy:
+ rhettinger, belopolsky messages: + msg257073 |
| 2015年12月25日 17:58:40 | terry.reedy | set | versions:
+ Python 3.6, - Python 3.5 nosy: + terry.reedy messages: + msg257003 type: performance -> enhancement stage: test needed |
| 2015年12月23日 22:25:16 | skrah | set | nosy:
+ mark.dickinson title: Improve performance of statistics._decimal_to_ratio and fractions.from_decimal -> Add Decimal.as_integer_ratio() messages: + msg256940 assignee: skrah |
| 2015年12月23日 22:05:14 | johnwalker | set | messages: + msg256938 |
| 2015年12月23日 21:50:32 | skrah | set | messages: + msg256937 |
| 2015年12月23日 21:46:39 | johnwalker | set | messages: + msg256936 |
| 2015年12月23日 21:30:05 | skrah | set | messages: + msg256934 |
| 2015年12月23日 21:01:22 | johnwalker | set | messages:
+ msg256933 versions: + Python 3.5 |
| 2015年12月23日 20:57:13 | steven.daprano | set | messages: + msg256932 |
| 2015年12月23日 15:18:28 | serhiy.storchaka | set | nosy:
+ serhiy.storchaka messages: + msg256917 |
| 2015年12月23日 15:08:50 | skrah | set | versions: - Python 3.5 |
| 2015年12月23日 15:07:35 | skrah | set | nosy:
+ steven.daprano |
| 2015年12月23日 15:07:15 | skrah | set | nosy:
+ skrah messages: + msg256914 |
| 2015年12月22日 22:49:34 | johnwalker | set | type: performance |
| 2015年12月22日 22:21:02 | johnwalker | set | messages: + msg256874 |
| 2015年12月22日 22:18:25 | johnwalker | create | |