Issue 22486: Add math.gcd()

➜

This issue tracker has been migrated to GitHub , and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

This issue has been migrated to GitHub: https://github.com/python/cpython/issues/66676

classification

Title:	Add math.gcd()
Type:	performance	Stage:	resolved
Components:	Library (Lib)	Versions:	Python 3.5

process

Dependencies:	Superseder:
Status:	closed	Resolution:	fixed
Assigned To:	Nosy List:	belopolsky, gladman, mark.dickinson, pitrou, python-dev, scoder, serhiy.storchaka, vstinner, wolma
Priority:	normal	Keywords:	patch

Created on 2014年09月24日 20:40 by scoder, last changed 2022年04月11日 14:58 by admin. This issue is now closed.

Files
File name	Uploaded	Description	Edit
lehmer_gcd_4.patch	serhiy.storchaka, 2014年09月25日 12:23	review
lehmer_gcd_5.patch	serhiy.storchaka, 2014年09月25日 12:53	review
lehmer_gcd_6.patch	serhiy.storchaka, 2014年09月25日 20:13	review
lehmer_gcd_7.patch	serhiy.storchaka, 2014年09月27日 08:23	review
euclidean_gcd.patch	serhiy.storchaka, 2014年09月27日 08:28	review
lehmer_gcd_8.patch	serhiy.storchaka, 2014年12月12日 11:13	review

Pull Requests
URL	Status	Linked	Edit
PR 18021	merged	vstinner, 2020年01月16日 08:17

Messages (48)
msg227487 - (view)	Author: Stefan Behnel (scoder) * (Python committer)	Date: 2014年09月24日 20:40
fractions.gcd() is required for normalising numerator and denominator of the Fraction data type. Some speed improvements were applied to Fraction in issue 22464, now the gcd() function takes up about half of the instantiation time in the benchmark in issue 22458, which makes it quite a heavy part of the overall Fraction computation time. The current implementation is def gcd(a, b): while b: a, b = b, a%b return a Reimplementing it in C would provide for much faster calculations. Here is a Cython version that simply drops the calculation loop into C as soon as the numbers are small enough to fit into a C long long int: def _gcd(a, b): # Try doing all computation in C space. If the numbers are too large # at the beginning, retry until they are small enough. cdef long long ai, bi while b: try: ai, bi = a, b except OverflowError: pass else: # switch to C loop while bi: ai, bi = bi, ai%bi return ai a, b = b, a%b return a It's substantially faster already because the values will either be small enough right from the start or quickly become so after a few iterations with Python objects. Further improvements should be possible with a dedicated PyLong implementation based on Lehmer's GCD algorithm: https://en.wikipedia.org/wiki/Lehmer_GCD_algorithm
msg227507 - (view)	Author: Mark Dickinson (mark.dickinson) * (Python committer)	Date: 2014年09月25日 06:22
In case it's useful, see issue #1682 for my earlier Lehmer gcd implementation. At the time, that approach was dropped as being premature optimisation.
msg227522 - (view)	Author: Antoine Pitrou (pitrou) * (Python committer)	Date: 2014年09月25日 10:44
If Python grows an optimized implementation, how about exposing it in the math module?
msg227524 - (view)	Author: Mark Dickinson (mark.dickinson) * (Python committer)	Date: 2014年09月25日 11:06
> If Python grows an optimized implementation, how about exposing it in the math module? +1.
msg227525 - (view)	Author: Stefan Behnel (scoder) * (Python committer)	Date: 2014年09月25日 11:28
That's what the patch does anyway. +1
msg227527 - (view)	Author: Antoine Pitrou (pitrou) * (Python committer)	Date: 2014年09月25日 11:45
Hmm... which patch?
msg227528 - (view)	Author: Stefan Behnel (scoder) * (Python committer)	Date: 2014年09月25日 11:49
http://bugs.python.org/file9486/lehmer_gcd.patch (see #1682)
msg227529 - (view)	Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer)	Date: 2014年09月25日 12:20
Here is updated Mark's patch from issue1682. It is ported to 3.5, slightly simplified and optimized (I did not touched the main algorithm still), utilized in the fractions module, added tests and documentation. It speeds up Stefan's fractions benchmark about 20%.
msg227531 - (view)	Author: Stefan Behnel (scoder) * (Python committer)	Date: 2014年09月25日 12:29
The problem is that this changes the behaviour of fractions.gcd() w.r.t. negative numbers. It's a public function, so we should keep it for backwards compatibility reasons, especially when adding a new function in the math module.
msg227532 - (view)	Author: Stefan Behnel (scoder) * (Python committer)	Date: 2014年09月25日 12:32
Oh, and thanks for working on it, Serhiy! :)
msg227533 - (view)	Author: Wolfgang Maier (wolma) *	Date: 2014年09月25日 12:36
see issue22477 for a discussion of whether the behavior of fractions.gcd should be changed or not
msg227534 - (view)	Author: Wolfgang Maier (wolma) *	Date: 2014年09月25日 12:38
sorry, forgot to format the link: issue<22477>
msg227535 - (view)	Author: Stefan Behnel (scoder) * (Python committer)	Date: 2014年09月25日 12:48
The thing is, if we add something new in a substantially more exposed place (the math module), then why break legacy code in addition? Just leaving it the way it is won't harm anyone, really.
msg227536 - (view)	Author: Wolfgang Maier (wolma) *	Date: 2014年09月25日 12:51
I wasn't arguing for or against anything, just providing a link to the relevant discussion.
msg227537 - (view)	Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer)	Date: 2014年09月25日 12:53
Well, here is a patch which keeps the same weird behavior of fractions.gcd().
msg227538 - (view)	Author: (gladman)	Date: 2014年09月25日 13:04
I am inclined to think that a maths.gcd() makes sense as this would be where I would go first to find this function. And the prospect of better performance is attractive since the gcd is an important operation in work with number theory algorithms. Would it co-exist with fractions.gcd(), with identical semantics? Or would it co-exist with fractions.gcd(), with the 'less surprising' semantics that are under discussion in the 'GCD in Fractions' thread? Would it take on the suggestion of operating on one or more input parameters?
msg227554 - (view)	Author: Mark Dickinson (mark.dickinson) * (Python committer)	Date: 2014年09月25日 16:56
> Or would it co-exist with fractions.gcd(), with the 'less surprising' > semantics that are under discussion in the 'GCD in Fractions' thread? Yes, exactly. math.gcd will always give a nonnegative result. The output of fractions.gcd remains unchanged for integer inputs, for backwards compatibility.
msg227555 - (view)	Author: Mark Dickinson (mark.dickinson) * (Python committer)	Date: 2014年09月25日 16:57
Serhiy: thank you! I've been meaning to update that patch for a long time, but hadn't found the courage or time to face the inevitable bitrot.
msg227570 - (view)	Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer)	Date: 2014年09月25日 20:13
Now I spent more time on the patch. Changes in updated patch: * Removed code duplication for odd and even k. * Temporary buffers c and d no longer allocated on every iteration. * Long result now compacted. No longer unused allocated size. * Added checks for results of long_abs() (it can fail). * Merged _PyLong_GCD and long_gcd. Fast path for small negative integers no longer need to copy long objects in long_abs(). * Added tests for large negative numbers and for case Py_SIZE(a) - Py_SIZE(b) > 3.
msg227619 - (view)	Author: Stefan Behnel (scoder) * (Python committer)	Date: 2014年09月26日 15:01
Thanks, Serhiy. However, something is wrong with the implementation. The benchmark runs into an infinite loop (it seems). And so do the previous patches. Does it work for you?
msg227620 - (view)	Author: Stefan Behnel (scoder) * (Python committer)	Date: 2014年09月26日 15:06
I compiled it with 30 bit digits, in case that's relevant. (It might be.)
msg227623 - (view)	Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer)	Date: 2014年09月26日 15:21
It works to me (compiled with 15-bit digits). Cold you please add debugging prints (before and after the call of math.gcd()) and find which operation is looping (math.gcd() itself, and for what arguments, or some Python code)?
msg227629 - (view)	Author: Stefan Behnel (scoder) * (Python committer)	Date: 2014年09月26日 15:56
This is what hangs for me: math.gcd(1216342683557601535506311712, 436522681849110124616458784) "a" and "b" keep switching between both values, but otherwise, the loop just keeps running. The old fractions.gcd() gives 32 for them.
msg227630 - (view)	Author: Stefan Behnel (scoder) * (Python committer)	Date: 2014年09月26日 16:03
I can confirm that it works with 15 bit digits.
msg227637 - (view)	Author: Mark Dickinson (mark.dickinson) * (Python committer)	Date: 2014年09月26日 17:28
To avoid regressions, please can we leave the old `fractions.gcd` exactly as it was? For example, the current `fractions.gcd` does work for Fraction instances [1]. That's certainly not its intended use, but I wouldn't be surprised if there's code out there that uses it in that way. It also just happens to work for nonnegative finite float inputs, because a % b gives exact results when a and b are positive floats, so no error is introduced at any point. I'd also worry about breaking existing uses involving integer-like objects (instances of numpy.int64, for example) in place of instances of ints. [1] By "works", I mean that if a and b are Fractions then gcd(a, b) returns a Fraction such that (1) a and b are integer multiples of gcd(a, b), and (2) gcd(a, b) is an integer multiple of any other number with this property.
msg227638 - (view)	Author: Mark Dickinson (mark.dickinson) * (Python committer)	Date: 2014年09月26日 17:28
> This is what hangs for me: Uh-oh. Sounds like I screwed up somewhere. I'll take a look this weekend, unless Serhiy beats me too it.
msg227640 - (view)	Author: Mark Dickinson (mark.dickinson) * (Python committer)	Date: 2014年09月26日 17:30
> too it. Bah. "to it". Stupid fingers.
msg227641 - (view)	Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer)	Date: 2014年09月26日 17:42
Thank you Stefan. I confirm that it hangs with 30-bit digits. One existing bug is in the use of PyLong_AsLong() before simple Euclidean loop. It should be PyLong_AsLongLong() if the long is not enough for two digits. But there is another bug in inner loop...
msg227663 - (view)	Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer)	Date: 2014年09月27日 08:23
Here is fixed patch. There was integer overflow. In C shortshort is extended to int, but intint results int.
msg227664 - (view)	Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer)	Date: 2014年09月27日 08:28
And for comparison here is simpler patch with Euclidean algorithm.
msg227670 - (view)	Author: Stefan Behnel (scoder) * (Python committer)	Date: 2014年09月27日 13:06
Patch 7 works for me. Why are the two Py_ABS() calls at the end needed when we start off the algorithm with long_abs()? The Lehmer code is complex (I guess that's why you added the pure Euclidean implementation), but it's the right algorithm to use here, so I'd say we should. It's 4% faster than the Euclidean code for the fractions benchmark when using 30 bit digits, but (surprisingly enough) about the same speed with 15 bit digits. There is no major difference to expect here as the numbers are perpetually normalised in Fractions and thus kept small (usually small enough to fit into a 64bit integer), i.e. Euclid should do quite well on them. The difference for big numbers is substantial though: Euclid: $ ./python -m timeit -s 'from math import gcd; a = 2123 + 3653 + 523 + 749; b = 2653 + 2123 + 523 + 1134' 'gcd(a,b)' 10000 loops, best of 3: 71 usec per loop Lehmer: $ ./python -m timeit -s 'from math import gcd; a = 2123 + 3653 + 523 + 749; b = 2653 + 2123 + 523 + 1134' 'gcd(a,b)' 100000 loops, best of 3: 11.6 usec per loop
msg227672 - (view)	Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer)	Date: 2014年09月27日 13:39
> Why are the two Py_ABS() calls at the end needed when we start off the > algorithm with long_abs()? Because long_abs()'s are omitted for small enough numbers (common case). So we avoid a copying for negative numbers or int subclasses. > I guess that's why you added the pure Euclidean implementation Euclidean algorithm is required step at the end of Lehmer algorithm. > It's 4% faster than the Euclidean code for the fractions benchmark > when using 30 bit digits, but (surprisingly enough) about the same speed > with 15 bit digits. May be because Lehmer code uses 64-bit computation for 30-bit digits, and Euclidean code always uses 32-bit computation. > The difference for big numbers is substantial though: 1000-bit integers are big, but can be encountered in real word (e.g. in cryptography). So may be there is need in Lehmer algorithm.
msg227711 - (view)	Author: Stefan Behnel (scoder) * (Python committer)	Date: 2014年09月27日 18:42
My personal take is: if there is an implementation in the stdlib, it should be the one that's most widely applicable. And that includes large numbers. We have a working implementation that is algorithmically faster for large numbers, so I can't see why we should drop it unused. I'm for merging patch 7.
msg228544 - (view)	Author: Stefan Behnel (scoder) * (Python committer)	Date: 2014年10月05日 08:46
Any objections to merging the last patch?
msg228545 - (view)	Author: Mark Dickinson (mark.dickinson) * (Python committer)	Date: 2014年10月05日 08:49
> Any objections to merging the last patch? Yes! Please don't make these changes to `Fractions.gcd`: they'll cause regressions for existing uses of `Fractions.gcd` with objects not of type `int`.
msg228546 - (view)	Author: Stefan Behnel (scoder) * (Python committer)	Date: 2014年10月05日 08:52
There are not such changes in patch 7. The fractions.gcd() function is unchanged but no longer used by the Fraction type, which now uses math.gcd() internally instead.
msg228547 - (view)	Author: Mark Dickinson (mark.dickinson) * (Python committer)	Date: 2014年10月05日 08:54
Ah, I misread; thanks. What happens with this patch if a Fraction has been created with Integrals that aren't of type int? (E.g., with NumPy int32 instances, for example?)
msg228548 - (view)	Author: STINNER Victor (vstinner) * (Python committer)	Date: 2014年10月05日 08:58
Why patching fraction.Fraction constructor instead of fractions.gcd()? I don't like the idea of having two functions, math.gcd and fractions.gcd, which do almost the same, but one is slow, whereas the other is fast. It's harder to write efficient code working on Python < 3.5 (use fractions) and Python >= 3.5 (use math or fractions?). I suggest to modify fractions.gcd() to use math.gcd() if the two parameters are int. We just have to adjust the sign: if the second parameter is negative, return -math.gcd(a, b). (I guess that we have unit tests for fractions.gcd checking the 4 cases for signed parameters.)
msg228549 - (view)	Author: Mark Dickinson (mark.dickinson) * (Python committer)	Date: 2014年10月05日 09:03
> I suggest to modify fractions.gcd() to use math.gcd() if the two parameters are int. Sounds fine to me, so long as the code (both fractions.gcd and the fractions.Fraction implementation) continues to function as before for objects that don't have exact type int.
msg228550 - (view)	Author: Stefan Behnel (scoder) * (Python committer)	Date: 2014年10月05日 09:05
+1 I mean, there is already such a type check in Fraction.__init__(), but I can see a case for also optimising fraction.gcd() for exact ints.
msg228551 - (view)	Author: Mark Dickinson (mark.dickinson) * (Python committer)	Date: 2014年10月05日 09:30
One other suggestion: I think math.gcd should work with arbitrary Python objects implementing __index__, and not just with instances of int.
msg228552 - (view)	Author: Mark Dickinson (mark.dickinson) * (Python committer)	Date: 2014年10月05日 09:32
> I mean, there is already such a type check in Fraction.__init__() That type-check doesn't protect us from non-int Integrals, though, as far as I can tell. It looks to me as though doing `Fraction(numpy.int32(3), numpy.int32(2))` would fail with a TypeError after this patch. (It works in Python 3.4.)
msg228556 - (view)	Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer)	Date: 2014年10月05日 10:26
I suggest just add deprecation warning in fractions.gcd(). Or at least add notes which recommend math.gcd() in the docstring and the documentation of fractions.gcd(). > One other suggestion: I think math.gcd should work with arbitrary Python > objects implementing __index__, and not just with instances of int. Agree.
msg232521 - (view)	Author: STINNER Victor (vstinner) * (Python committer)	Date: 2014年12月12日 08:57
What's the status of this issue? See also the issue #22477.
msg232523 - (view)	Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer)	Date: 2014年12月12日 11:12
Here is a patch which addresses both Mark's suggestions. * math.gcd() now work with arbitrary Python objects implementing __index__. * fractions.gcd() and Fraction's constructor now use math.gcd() if both arguments are int, but also support non-ints (e.g. Fractions or floats). * fractions.gcd() now is deprecated. But before committing I want to experiment with simpler implementation and compare it with current complex implementation. If the difference will be not too large, we could use simpler implementation.
msg242048 - (view)	Author: Stefan Behnel (scoder) * (Python committer)	Date: 2015年04月26日 07:16
Any more comments on this? The deadlines for new features in Py3.5 are getting closer. It seems we're just discussing details here, but pretty much everyone wants this feature. So, what are the things that still need to be done? Serhiy submitted working patches months ago.
msg243015 - (view)	Author: Roundup Robot (python-dev) (Python triager)	Date: 2015年05月12日 21:20
New changeset 34648ce02bd4 by Serhiy Storchaka in branch 'default': Issue #22486: Added the math.gcd() function. The fractions.gcd() function now is https://hg.python.org/cpython/rev/34648ce02bd4
msg360112 - (view)	Author: STINNER Victor (vstinner) * (Python committer)	Date: 2020年01月16日 10:02
New changeset 4691a2f2a2b8174a6c958ce6976ed5f3354c9504 by Victor Stinner in branch 'master': bpo-39350: Remove deprecated fractions.gcd() (GH-18021) https://github.com/python/cpython/commit/4691a2f2a2b8174a6c958ce6976ed5f3354c9504

History
Date	User	Action	Args
2022年04月11日 14:58:08	admin	set	github: 66676
2020年01月16日 10:02:58	vstinner	set	messages: + msg360112
2020年01月16日 08:17:24	vstinner	set	pull_requests: + pull_request17417
2015年05月13日 06:52:44	berker.peksag	set	stage: patch review -> resolved
2015年05月12日 23:26:09	benjamin.peterson	set	status: open -> closed resolution: fixed
2015年05月12日 21:20:24	python-dev	set	nosy: + python-dev messages: + msg243015
2015年04月26日 07:16:47	scoder	set	messages: + msg242048
2014年12月12日 11:13:19	serhiy.storchaka	set	files: + lehmer_gcd_8.patch
2014年12月12日 11:12:36	serhiy.storchaka	set	messages: + msg232523
2014年12月12日 08:57:02	vstinner	set	messages: + msg232521
2014年10月05日 10:26:35	serhiy.storchaka	set	messages: + msg228556
2014年10月05日 09:32:45	mark.dickinson	set	messages: + msg228552
2014年10月05日 09:30:00	mark.dickinson	set	messages: + msg228551
2014年10月05日 09:05:16	scoder	set	messages: + msg228550
2014年10月05日 09:03:17	mark.dickinson	set	messages: + msg228549
2014年10月05日 08:58:44	vstinner	set	nosy: + vstinner messages: + msg228548
2014年10月05日 08:54:20	mark.dickinson	set	messages: + msg228547
2014年10月05日 08:52:26	scoder	set	messages: + msg228546
2014年10月05日 08:49:22	mark.dickinson	set	messages: + msg228545
2014年10月05日 08:46:57	scoder	set	messages: + msg228544
2014年09月29日 19:17:16	belopolsky	set	nosy: + belopolsky
2014年09月27日 18:42:25	scoder	set	messages: + msg227711
2014年09月27日 13:39:24	serhiy.storchaka	set	messages: + msg227672
2014年09月27日 13:06:01	scoder	set	messages: + msg227670
2014年09月27日 08:28:06	serhiy.storchaka	set	files: + euclidean_gcd.patch messages: + msg227664
2014年09月27日 08:23:26	serhiy.storchaka	set	files: + lehmer_gcd_7.patch messages: + msg227663
2014年09月26日 17:42:01	serhiy.storchaka	set	messages: + msg227641
2014年09月26日 17:30:27	mark.dickinson	set	messages: + msg227640
2014年09月26日 17:28:54	mark.dickinson	set	messages: + msg227638
2014年09月26日 17:28:05	mark.dickinson	set	messages: + msg227637
2014年09月26日 16:03:17	scoder	set	messages: + msg227630
2014年09月26日 15:56:51	scoder	set	messages: + msg227629
2014年09月26日 15:21:26	serhiy.storchaka	set	messages: + msg227623
2014年09月26日 15:06:22	scoder	set	messages: + msg227620
2014年09月26日 15:01:57	scoder	set	messages: + msg227619
2014年09月25日 20:13:12	serhiy.storchaka	set	files: + lehmer_gcd_6.patch messages: + msg227570
2014年09月25日 16:57:20	mark.dickinson	set	messages: + msg227555
2014年09月25日 16:56:00	mark.dickinson	set	messages: + msg227554
2014年09月25日 13:04:09	gladman	set	nosy: + gladman messages: + msg227538
2014年09月25日 12:53:53	serhiy.storchaka	set	files: + lehmer_gcd_5.patch messages: + msg227537
2014年09月25日 12:51:00	wolma	set	messages: + msg227536
2014年09月25日 12:48:47	scoder	set	messages: + msg227535
2014年09月25日 12:38:48	wolma	set	messages: + msg227534
2014年09月25日 12:36:23	wolma	set	messages: + msg227533
2014年09月25日 12:32:36	scoder	set	messages: + msg227532
2014年09月25日 12:29:40	scoder	set	type: enhancement -> performance messages: + msg227531 components: - Extension Modules
2014年09月25日 12:23:04	serhiy.storchaka	set	files: + lehmer_gcd_4.patch keywords: + patch
2014年09月25日 12:20:57	serhiy.storchaka	set	type: performance -> enhancement components: + Extension Modules title: Speed up fractions.gcd() -> Add math.gcd() nosy: + serhiy.storchaka messages: + msg227529 stage: patch review
2014年09月25日 11:53:29	wolma	set	nosy: + wolma
2014年09月25日 11:49:27	scoder	set	messages: + msg227528
2014年09月25日 11:45:52	pitrou	set	messages: + msg227527
2014年09月25日 11:28:37	scoder	set	messages: + msg227525
2014年09月25日 11:06:32	mark.dickinson	set	messages: + msg227524
2014年09月25日 10:44:23	pitrou	set	nosy: + pitrou messages: + msg227522
2014年09月25日 06:22:43	mark.dickinson	set	nosy: + mark.dickinson messages: + msg227507
2014年09月24日 20:40:35	scoder	create

homepage