I have programmed the extended euclidean algorithm. Is this a good approach?
def ext_ggT(a, b):
x,y, u,v = 0,1, 1,0
while a != 0:
q, r = b//a, b%a
m, n = x-u*q, y-v*q
b,a, x,y, u,v = a,r, u,v, m,n
gcd = b
return gcd, x, y
2 Answers 2
Not being familiar with the algorithm in question, it was very non-obvious to me what the code did, and in general I think it's good to write code in such a way that it's comprehensible with minimal reasonable effort even if the reader isn't a domain expert (i.e. they should be able to look up documentation, look at your code, and see how the two relate). The main hindrances I faced were:
- It wasn't immediately obvious that
x, y, u, v
represented two distinct series (where u was always a prior value of x, etc). In general there are a lot of variables in this code to keep track of, and not a lot of explanation of what they're all for. - Dependencies between the different values likewise were not obvious. Having all the assignments smushed onto one line made it hard to discern this visually; it's nice to use compact tuple assignments when the relationships are obvious, but it doesn't necessarily always improve readability.
- There're no doc/comments explaining what's going on.
- The name ext_ggT doesn't follow Python's snake_case naming convention and is a bit cryptic.
After reading the wiki link (thx Peilon) I was able to sort of reverse-engineer it and then I made some changes so that the code matches up more with my understanding based on the wiki article (and is commented so that anyone looking at this code side by side with the wiki article will immediately see what goes with what).
from collections import deque
from typing import Tuple
def extended_euclidean(a: int, b: int) -> Tuple[int, int, int]:
"""
Returns (gcd, x, y) such that:
gcd = greatest common divisor of (a, b)
x, y = coefficients such that ax + by = gcd
"""
# We only need to keep the last two elements of each series.
r = deque([b, a], 2)
s = deque([0, 1], 2)
t = deque([1, 0], 2)
# The next element of each series is a function of the previous two.
# We stop building these series once r (the remainder) is zero;
# the final result comes from the iteration prior to that one.
while r[-1] != 0:
q = r[-2] // r[-1]
r.append(r[-2] % r[-1])
s.append(s[-2] - s[-1] * q)
t.append(t[-2] - t[-1] * q)
return r[-2], s[-2], t[-2]
assert extended_euclidean(240, 46) == (2, -9, 47)
The biggest change is that I've represented the various series described in the wiki article as iterables, rather than representing each as two scalars; this doesn't make much difference to the way the code actually runs, but the fact that these six values (previously a, b, x, y, u, and v) represent three distinct series is now very obvious to the reader. The three series are initialized and extended in such a way as to "make alike look alike" -- you can see at a glance how each successive element is computed from the prior two, and easily discern where there are and aren't dependencies between these values.
You could initialize these series as simply:
r = [b, a]
s = [0, 1]
t = [1, 0]
and the code would return the correct result, but to preserve the behavior of only keeping the last two elements (which I agree is a good space optimization) I've converted them to deque
s with maxlen=2
. The deque abstracts away the business of automatically popping the unneeded values off of the left side, which helps declutter the "interesting" part of the implementation while still preserving the property of having it use constant space.
-
6\$\begingroup\$ I strongly disagree with using deques. There is so much unnecessary overhead for a simple algorithm. The original implementation follows the mathematical formulation but this version strays from it. \$\endgroup\$qwr– qwr2020年04月10日 04:27:17 +00:00Commented Apr 10, 2020 at 4:27
-
1\$\begingroup\$ I timed things, and this is rougly twice as slow as the original code with
python3
, and ten times as slow withpypy3
. \$\endgroup\$the default.– the default.2020年04月10日 07:17:43 +00:00Commented Apr 10, 2020 at 7:17 -
1\$\begingroup\$ Good points! In most real-world situations, it's more important to optimize reading and debugging a particular function than constant factors of its actual runtime, because human time is much more expensive than CPU time. But in a situation where every nanosecond counts because you're operating at massive scale, you may not want to use Python at all (say, use Rust to write the performance-sensitive part of your code and call the compiled library via CFFI from your Python code). \$\endgroup\$Samwise– Samwise2020年04月10日 15:07:09 +00:00Commented Apr 10, 2020 at 15:07
-
1\$\begingroup\$ I'd be very interested in seeing other implementations that follow the description of the algorithm as described in Wiki (i.e. expressing it as the r, s, t series). On the way to this one I did a version with 2-tuples that probably performs better than deques, but I didn't like the way that it read. \$\endgroup\$Samwise– Samwise2020年04月10日 15:08:36 +00:00Commented Apr 10, 2020 at 15:08
-
\$\begingroup\$ Inspite of all reasons, simple commenting and meaningful variable names would have been enough. Spirit of simplicity is lost. In fact, yours needs original one to show (a Mathl. newbie) how simple it is. Might be both complement each other, so in a sense both implementations are needed. \$\endgroup\$jiten– jiten2021年05月16日 12:22:29 +00:00Commented May 16, 2021 at 12:22
Give your variables meaningful names reflecting their role in the algorithm.
-
6\$\begingroup\$ In the context of mathematics it can be hard to find meaningful variable names \$\endgroup\$northerner– northerner2020年04月10日 00:22:59 +00:00Commented Apr 10, 2020 at 0:22
-
2\$\begingroup\$ I can only think of good names for quotient and remainder. Nothing else really makes anything clearer \$\endgroup\$qwr– qwr2020年04月10日 04:28:21 +00:00Commented Apr 10, 2020 at 4:28
-
1\$\begingroup\$ @northerner It's hard to find meaningful names in computer programming too. Just seems like a cop out. \$\endgroup\$2020年04月10日 10:09:52 +00:00Commented Apr 10, 2020 at 10:09
-
2\$\begingroup\$ @Peilonrayz you seriously seem to miss the point \$\endgroup\$northerner– northerner2020年04月10日 10:12:16 +00:00Commented Apr 10, 2020 at 10:12
-
1\$\begingroup\$ @Peilonrayz you don't backup your assertion at all. Take the Euclidean Algorithm for instance. It takes two arguments, commonly called
a
andb
. Please tell me a better name for them? \$\endgroup\$northerner– northerner2020年04月10日 10:15:05 +00:00Commented Apr 10, 2020 at 10:15
Is this a good approach?
approach to what exactly? Or do you want opinions if you coded it well, advice how to code it better? \$\endgroup\$