RSA algorithm implementation in Python 3

Question 1

First, a reminder of the RSA algorithm and what my program implements:

Take two distinct, large primes p and q
- Ideally these have a similar byte-length
Multiply p and q and store the result in n
Find the totient for n using the formula $$\varphi(n)=(p-1)(q-1)$$
Take an e coprime that is greater, than 1 and less than n
Find d using the formula $$d\cdot e\equiv1\mod\varphi(n)$$

At this point, the pair (e, n) is the public key and the private key (d, n) is the private key.

Conception:

Implement the RSA algorithm
Ask the user for necessary data (primes, coprime greater than 1 and less than n, string)
Encrypt and decrypt the given string by the user using the RSA algorithm

What do you think about my Python 3 implementation of the RSA algorithm? What's the performance of this program?

try:
 input = raw_input
except NameError:
 pass
try:
 chr = unichr
except NameError:
 pass
p=int(input('Enter prime p: '))
q=int(input('Enter prime q: '))
print("Choosen primes:\np=" + str(p) + ", q=" + str(q) + "\n")
n=p*q
print("n = p * q = " + str(n) + "\n")
phi=(p-1)*(q-1)
print("Euler's function (totient) [phi(n)]: " + str(phi) + "\n")
def gcd(a, b):
 while b != 0:
 c = a % b
 a = b
 b = c
 return a
def modinv(a, m):
 for x in range(1, m):
 if (a * x) % m == 1:
 return x
 return None
def coprimes(a):
 l = []
 for x in range(2, a):
 if gcd(a, x) == 1 and modinv(x,phi) != None:
 l.append(x)
 for x in l:
 if x == modinv(x,phi):
 l.remove(x)
 return l
print("Choose an e from a below coprimes array:\n")
print(str(coprimes(phi)) + "\n")
e=int(input())
d=modinv(e,phi)
print("\nYour public key is a pair of numbers (e=" + str(e) + ", n=" + str(n) + ").\n")
print("Your private key is a pair of numbers (d=" + str(d) + ", n=" + str(n) + ").\n")
def encrypt_block(m):
 c = modinv(m**e, n)
 if c == None: print('No modular multiplicative inverse for block ' + str(m) + '.')
 return c
def decrypt_block(c):
 m = modinv(c**d, n)
 if m == None: print('No modular multiplicative inverse for block ' + str(c) + '.')
 return m
def encrypt_string(s):
 return ''.join([chr(encrypt_block(ord(x))) for x in list(s)])
def decrypt_string(s):
 return ''.join([chr(decrypt_block(ord(x))) for x in list(s)])
s = input("Enter a message to encrypt: ")
print("\nPlain message: " + s + "\n")
enc = encrypt_string(s)
print("Encrypted message: " + enc + "\n")
dec = decrypt_string(enc)
print("Decrypted message: " + dec + "\n")

Question 2

Instead of using "some text" + str(value_I_Want) + "more text"... you should try to use the .format function. E.G. "This is my string and {} is my value".format(value) It makes code more managable and readable.

Question 3

Also, you should notice that your encrypt and decrypt funciton are identical, maybe you can cut out some extraneous code with an extra parameter? ;) I will also add that I generally like to keep my function definitions and code execution separated for readability. Keep all of your inputs and prints below function definitions (preferrably inside __name__ == "__main__" conditional).

Question 4

I think your Modular Inverse implementation is slow. We can apply this Extended GCD algorithm recursive implementation which shows quite a dramatic speed improvement (at least on my machine):

def egcd(a, b):
 if a == 0:
 return b, 0, 1
 else:
 g, y, x = egcd(b % a, a)
 return g, x - (b // a) * y, y
def modinv(a, m):
 g, x, y = egcd(a, m)
 if g != 1:
 raise Exception('modular inverse does not exist')
 else:
 return x % m

Here is a simple benchmark for some a and m which have a modular inverse of 16013:

In [24]: a = 108 ** 151
In [25]: m = 22499
In [26]: %timeit modinv_OP(a, m)
100 loops, best of 3: 11.2 ms per loop
In [27]: %timeit modinv_new(a, m)
100000 loops, best of 3: 5.62 μs per loop

I also wonder if adding an LRU cache via functools.lru_cache() would make an impact - probably not, but please experiment with that.

Some other improvements:

you can avoid converting s to a list in order to iterate character by character:
```
''.join(str(chr(encrypt_block(ord(x)))) for x in s)
```
I've also switched to a "lazy" join when joined from a generator.
if on Python 3.6+, you can use f-strings for your print messages

Question 5

You can use math.gcd:

from math import gcd

One must not compare variables with None using equality operator

Comparisons to singletons like None should always be done with is or is not, never the equality operators.

modinv for prime mod may be implemented like this:

def prime_mod_inv(x, mod):
 return pow(x, mod - 2, mod)

You can use print with multiple arguments.

No:

print("Choosen primes:\np=" + str(p) + ", q=" + str(q) + "\n")

Yes:

print("Choosen primes:")
print("p =", p, ", q =", q)

No:

print("Euler's function (totient) [phi(n)]: " + str(phi) + "\n")

Yes:

print("Euler's function (totient) [phi(n)]:", phi)

No:

print(str(coprimes(phi)) + "\n")

Yes:

print(coprimes(phi), "\n")

You can replace this

l = []
for x in range(2, a):
 if gcd(a, x) == 1 and modinv(x,phi) != None:
 l.append(x)
for x in l:
 if x == modinv(x,phi):
 l.remove(x)

With list comprehension

l = [x for x in range(2, a) 
 if gcd(a, x) == 1 
 and modinv(x, phi) is not None
 and modinv(x, phi) != x]

Question 6

also, please don't name lists l it looks way too much like 1 in a lot of fonts.

Question 7

Actually, it is better not to use variable at all. List comprehension allows to write coprimes() function with a single return statement :)

Question 8

Rather than using Euler's Totient formula to compute d, it's recommended to use the Carmichael Lambda Function instead. The Carmichael Lambda(CML) divides the Totient. The CML provides the smallest possible secure private key.

Luckily, if your primes are truly primes, one can EASILY compute the CML. It's the lcm of p-1, q-1. Then use the same Modular inverse (definitely use the extended GCD ) algorithm for the modular inverse, a beauty of an algorithm.

Question 9

Instead of choosing an e, you can randomize it

z = coprimes(phi)
e = random.choice(z)

Question 10

Welcome to CodeReview@SE. It's entirely OK for an answer to raise but one valid point. You present an alternative choice in interface. I could easily see this as a code review if you compared the relative merits.

alecxe alecxe 17.5k8 gold badges52 silver badges93 bronze badges · Accepted Answer · 2017-08-29 19:58:07Z

I think your Modular Inverse implementation is slow. We can apply this Extended GCD algorithm recursive implementation which shows quite a dramatic speed improvement (at least on my machine):

def egcd(a, b):
 if a == 0:
 return b, 0, 1
 else:
 g, y, x = egcd(b % a, a)
 return g, x - (b // a) * y, y
def modinv(a, m):
 g, x, y = egcd(a, m)
 if g != 1:
 raise Exception('modular inverse does not exist')
 else:
 return x % m

Here is a simple benchmark for some a and m which have a modular inverse of 16013:

In [24]: a = 108 ** 151
In [25]: m = 22499
In [26]: %timeit modinv_OP(a, m)
100 loops, best of 3: 11.2 ms per loop
In [27]: %timeit modinv_new(a, m)
100000 loops, best of 3: 5.62 μs per loop

I also wonder if adding an LRU cache via functools.lru_cache() would make an impact - probably not, but please experiment with that.

Some other improvements:

you can avoid converting s to a list in order to iterate character by character:
```
''.join(str(chr(encrypt_block(ord(x)))) for x in s)
```
I've also switched to a "lazy" join when joined from a generator.
if on Python 3.6+, you can use f-strings for your print messages

Stack Exchange Network

RSA algorithm implementation in Python 3

4 Answers 4

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

RSA algorithm implementation in Python 3

4 Answers 4

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions