Issue 1811: True division of integers could be more accurate

➜

This issue tracker has been migrated to GitHub , and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

This issue has been migrated to GitHub: https://github.com/python/cpython/issues/46136

classification

Title:	True division of integers could be more accurate
Type:	behavior	Stage:	resolved
Components:	Interpreter Core	Versions:	Python 3.2, Python 2.7

process

Dependencies:	Superseder:
Status:	closed	Resolution:	fixed
Assigned To:	mark.dickinson	Nosy List:	christian.heimes, mark.dickinson, terry.reedy, tim.peters
Priority:	normal	Keywords:	patch

Created on 2008年01月12日 05:20 by mark.dickinson, last changed 2022年04月11日 14:56 by admin. This issue is now closed.

Files
File name	Uploaded	Description	Edit
int_truediv.py	mark.dickinson, 2008年01月12日 05:20	More accurate truediv for integers: example code
long_division.patch	mark.dickinson, 2008年05月18日 17:04	patch for Objects/longobject.c
long_division2.patch	mark.dickinson, 2009年12月23日 15:33

Messages (12)
msg59798 - (view)	Author: Mark Dickinson (mark.dickinson) * (Python committer)	Date: 2008年01月12日 05:20
Division of two longs can produce results that are needlessly inaccurate: >>> from __future__ import division >>> 1040/1039 10.000000000000002 The correct result is, of course, 10.0, which is exactly representable as a float. The attached snippet of Python code shows that things don't have to be this way.
msg59824 - (view)	Author: Christian Heimes (christian.heimes) * (Python committer)	Date: 2008年01月12日 16:57
How fast is your algorithm compared to the current algorithm and where starts the break even zone? Most users use only small integers so it might be a good idea to use a simpler algorithm for longs with Py_SIZE() == 1. This is important for py3k.
msg59830 - (view)	Author: Mark Dickinson (mark.dickinson) * (Python committer)	Date: 2008年01月12日 17:18
It would be easy and safe to just use a/b = float(a)/float(b) if both a and b are less than 2**53 (assuming IEEE doubles). Then there wouldn't be any loss of speed for small integers. For large integers the algorithm I posted should run in time linear in the number of digits of max(a, b), at least in the worst case (though with appropriate optimizations it could be made much faster for 'random' inputs). The current algorithm has essentially O(1) runtime. To get proper timings I'd have to code this up properly. I'll do this, unless there's a consensus that it would be a waste of time :)
msg59838 - (view)	Author: Christian Heimes (christian.heimes) * (Python committer)	Date: 2008年01月12日 20:49
Mark Dickinson wrote: > To get proper timings I'd have to code this up properly. I'll do this, unless there's > a consensus that it would be a waste of time :) Why don't you ask Tim? He is the right person for the discussion. I'm only an interested amateur mathematician. Christian
msg59839 - (view)	Author: Mark Dickinson (mark.dickinson) * (Python committer)	Date: 2008年01月12日 21:05
Tim: is this worth fixing?
msg59988 - (view)	Author: Mark Dickinson (mark.dickinson) * (Python committer)	Date: 2008年01月16日 00:22
A related problem is that float(n) isn't always correctly rounded for an integer n. A contrived example: >>> n = 268 + 216 - 1 >>> float(n) 2.9514790517935283e+20 Here the difference between float(n) and the true value of n is around 0.99998 ulps; a correctly rounded float() would have error at most 0.5 ulps. I don't regard this as terribly serious: from looking at the code, I think it's always true that the error is strictly less than 1 ulp, which is just enough to guarantee that float(n) == n whenever n is exactly representable as a float. In contrast, the division of two integers can produce results that are up to 3.5 ulps out from the true value. This is, in my opinion, a worryingly large error for a simple calculation.
msg64034 - (view)	Author: Terry J. Reedy (terry.reedy) * (Python committer)	Date: 2008年03月19日 04:28
To my mind, the inaccurate result is a bug that should be fixed. Note: (3.0a3) >>> 10e40/10e39 10.0 The rationale for the division change is that (as far as reasonably possible) arithmetic operations with same values should give same result regardless of types. I have not looked at either algorithm, but if long/long started by finding divmod(), but added fractional value when remainer is non-zero instead of tossing it, exact quotients would easily be exact (unless too large).
msg67031 - (view)	Author: Mark Dickinson (mark.dickinson) * (Python committer)	Date: 2008年05月18日 17:04
Here's a patch that fixes the rounding of integer division. It includes a fast path for the case where both integers are small (less than about 3.5e12).
msg91020 - (view)	Author: Mark Dickinson (mark.dickinson) * (Python committer)	Date: 2009年07月28日 22:29
An example of a case that's almost 3.5 ulps out (Python 2.6): Python 2.6.2 (r262:71600, Jul 8 2009, 09:56:31) [GCC 4.0.1 (Apple Inc. build 5490)] on darwin Type "help", "copyright", "credits" or "license" for more information. >>> from __future__ import division >>> m = 295147931372582273023 >>> n = 295147932265116303360 >>> m/n 0.99999999697597697 The correctly rounded result would be the float given by 0.9999999969759773.
msg96834 - (view)	Author: Mark Dickinson (mark.dickinson) * (Python committer)	Date: 2009年12月23日 10:08
Stealing this from Tim, with the intention of acting on it in the next couple of weeks.
msg96840 - (view)	Author: Mark Dickinson (mark.dickinson) * (Python committer)	Date: 2009年12月23日 15:33
Here's an updated patch, against py3k. On my machine, a/b is a touch faster with this patch when abs(a), abs(b) are smaller than 1e15 or so; it's (inevitably) slower than the existing implementation for larger a and b. For 'random' a and b, average running time is proportional to the size of b, and is independent of the size of a; worst-case running time (which occurs when a has many trailing zero bits) grows as max(size(a), size(b)). Changing versions to 2.7 and 3.2, but I'm mostly aiming for 3.2. It may not be worth backporting to 2.7, given the extra effort required to deal correctly with ints as well as with longs.
msg96910 - (view)	Author: Mark Dickinson (mark.dickinson) * (Python committer)	Date: 2009年12月27日 15:10
Fixed in r77062 (trunk), r77063 (py3k).

History
Date	User	Action	Args
2022年04月11日 14:56:29	admin	set	github: 46136
2009年12月27日 15:10:11	mark.dickinson	set	status: open -> closed resolution: fixed messages: + msg96910 stage: patch review -> resolved
2009年12月23日 15:33:51	mark.dickinson	set	files: + long_division2.patch stage: patch review messages: + msg96840 versions: + Python 2.7, Python 3.2, - Python 2.6, Python 3.0
2009年12月23日 10:08:22	mark.dickinson	set	assignee: tim.peters -> mark.dickinson messages: + msg96834
2009年07月28日 22:29:05	mark.dickinson	set	messages: + msg91020
2008年05月18日 17:04:25	mark.dickinson	set	files: + long_division.patch keywords: + patch messages: + msg67031
2008年03月19日 04:28:40	terry.reedy	set	nosy: + terry.reedy messages: + msg64034
2008年01月16日 00:22:46	mark.dickinson	set	messages: + msg59988
2008年01月12日 23:33:04	rhettinger	set	assignee: tim.peters
2008年01月12日 21:05:34	mark.dickinson	set	nosy: + tim.peters messages: + msg59839
2008年01月12日 20:49:42	christian.heimes	set	messages: + msg59838
2008年01月12日 17:18:54	mark.dickinson	set	messages: + msg59830
2008年01月12日 16:57:16	christian.heimes	set	priority: normal nosy: + christian.heimes messages: + msg59824
2008年01月12日 05:20:56	mark.dickinson	set	components: + Interpreter Core versions: + Python 2.6, Python 3.0
2008年01月12日 05:20:28	mark.dickinson	create

homepage