Fast Number Factorization in Python

Question 1

Here's my implementation to factorize a positive integer:

def is_prime(n):
 if n == 1:
 return False
 if n % 2 == 0:
 return False
 i = 3
 while i * i <= n:
 if n % i == 0:
 return False
 i += 2
 return True
# @timeit
def prime_factors(n):
 prime_factor_list = []
 while not n % 2:
 prime_factor_list.append(2)
 n //= 2
 while not n % 3:
 prime_factor_list.append(3)
 n //= 3
 i = 5
 while n != 1:
 if is_prime(i):
 while not n % i:
 prime_factor_list.append(i)
 n //= i
 i += 2
 return prime_factor_list

How can I improve this code in speed and make it pythonic?

Question 2

I think that the best way to do fast number factorization in python is just call an external library (which does not rely on Python iteration) such as sympy. See related answer on StackOverflow stackoverflow.com/questions/4643647/… ).

Question 3

`prime_factors()`

The single greatest improvement to performance that you could make would be to remove the call to is_prime() altogether. If all factors smaller than i have been removed from n already, then i could not possibly be composite.

The next improvement you could make would be to turn prime_factors() into a generator. The function would no longer need to resize and recopy prime_factor_list whenever the list needs to grow beyond its estimated size. Furthermore, the caller could start making use of initial results before the entire factorization is finished.

Using three lines (i = 5, while n != 1, and i += 2) to structure the loop — something that would be succinctly written in C as a for (i = 5; n != 1; i += 2) header — is un-Pythonic. Such counting loops are usually better written using some kind of range(), xrange(), enumerate(), or something from itertools. Here, itertools.chain() and itertools.count() would be appropriate.

Without changing the spirit of your algorithm, I would write:

import itertools
def prime_factors(n):
 for i in itertools.chain([2], itertools.count(3, 2)):
 if n <= 1:
 break
 while n % i == 0:
 n //= i
 yield i

`is_prime()`

As mentioned above, you don't need is_prime() at all, if your only goal is to implement prime_factors().

However, if you wanted to implement is_prime() anyway, you could tighten it up a bit. My main recommendation would be to avoid performing the i * i multiplication with each iteration. You would probably be better off computing the \$\sqrt{n}\$ limit just once.

Then, as with the other function, you should express the loop more Pythonically using range() or xrange(). Actually, you wouldn't need an explicit loop; you could use any() instead.

So, if we stick to your algorithm in spirit, we could tighten up the implementation like this:

def is_prime(n):
 if n < 3 or n % 2 == 0:
 return n == 2
 else:
 return not any(n % i == 0 for i in range(3, int(n**0.5 + 2), 2))

Question 4

using is_prime was really dumb. Good Observation.

Question 5

@200_success @user2305 "If all factors smaller than i have been removed from n already, then i could not possibly be composite." i can and will be composite (it's just going through odd numbers, and most of them are composite). It's just that, if i is composite, it will not divide the remaining n, and I think that's what the OP was trying to skip. However, I do agree with removing the call to is_prime(): trying to avoid a O(1) operation with a O(log(n)) check is not exactly an optimization! :)

Question 6

@200_success: you can break the loop if n < i * i and yield n as the largest prime factor.

Question 7

Your code is written fairly well, yet in is_prime you forgot to check whether the input integer is 2; just add this:

if n == 2:
 return True

Also, for some negative integers is_prime returns True. I suggest you add the following statement to prune them away:

if n <= 1:
 return False

Question 8

prime_factors never calls is_prime with a number that's divisible by 2 or 3. is_prime could be faster if it used this knowledge. However, if is_prime will use this knowledge, then it may be a bit misleading in global scope. So you could move the definition of is_prime inside of prime_factors.

Although the code is Pythonic alright, it would be more idiomatic to use a prime generator to get the next prime factor, instead of iterating from 5 upwards. Given a prime generator gen, the final loop of the code would become:

while n != 1:
 prime = next(gen)
 while not n % prime:
 prime_factor_list.append(prime)
 n //= prime

Question 9

Few additional thoughts to Fast Number Factorization in Python answer.

is_prime()

In case if you have multiple consequent calls you should use something like Sieve_of_Eratosthenes. If you will, time to generate sieve will depend on maximum value of number to factorize but total time will be reduces.

prime_factors()

There is one thing you miss in your code. Lets take prime number, let is be \$ 10007 \$ and multiply it by \$ 2 \,ドル we will receive \$ 20014 \$. Its factorization will be \$ 20014 = 10007 \times 2 \$.

Now lets analyze your prime_factors. You will find that \$ 2 \$ is prime divisor of \$ 20014 \$ and will continue to iterate through all prime numbers up to \$ 10007 \$ but actually you have to iterate up to square root of initial number, because if you cross this line and reminder still not equal \$ 1 \$ than it is prime.

This trick can reduce time for such cases.

As the result, you can assume that you need to generate sieve up to square root of highest number.

200_success 200_success 145k22 gold badges190 silver badges478 bronze badges · Accepted Answer · 2016-03-04 09:00:06Z

`prime_factors()`

The single greatest improvement to performance that you could make would be to remove the call to is_prime() altogether. If all factors smaller than i have been removed from n already, then i could not possibly be composite.

The next improvement you could make would be to turn prime_factors() into a generator. The function would no longer need to resize and recopy prime_factor_list whenever the list needs to grow beyond its estimated size. Furthermore, the caller could start making use of initial results before the entire factorization is finished.

Using three lines (i = 5, while n != 1, and i += 2) to structure the loop — something that would be succinctly written in C as a for (i = 5; n != 1; i += 2) header — is un-Pythonic. Such counting loops are usually better written using some kind of range(), xrange(), enumerate(), or something from itertools. Here, itertools.chain() and itertools.count() would be appropriate.

Without changing the spirit of your algorithm, I would write:

import itertools
def prime_factors(n):
 for i in itertools.chain([2], itertools.count(3, 2)):
 if n <= 1:
 break
 while n % i == 0:
 n //= i
 yield i

`is_prime()`

As mentioned above, you don't need is_prime() at all, if your only goal is to implement prime_factors().

However, if you wanted to implement is_prime() anyway, you could tighten it up a bit. My main recommendation would be to avoid performing the i * i multiplication with each iteration. You would probably be better off computing the \$\sqrt{n}\$ limit just once.

Then, as with the other function, you should express the loop more Pythonically using range() or xrange(). Actually, you wouldn't need an explicit loop; you could use any() instead.

So, if we stick to your algorithm in spirit, we could tighten up the implementation like this:

def is_prime(n):
 if n < 3 or n % 2 == 0:
 return n == 2
 else:
 return not any(n % i == 0 for i in range(3, int(n**0.5 + 2), 2))

@200_success @user2305 "If all factors smaller than i have been removed from n already, then i could not possibly be composite." i can and will be composite (it's just going through odd numbers, and most of them are composite). It's just that, if i is composite, it will not divide the remaining n, and I think that's what the OP was trying to skip. However, I do agree with removing the call to is_prime(): trying to avoid a O(1) operation with a O(log(n)) check is not exactly an optimization! :)
@200_success: you can break the loop if n < i * i and yield n as the largest prime factor.

Stack Exchange Network

Fast Number Factorization in Python

4 Answers 4

`prime_factors()`

`is_prime()`

is_prime()

prime_factors()

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Linked

Hot Network Questions

Fast Number Factorization in Python

4 Answers 4

prime_factors()

is_prime()

is_prime()

prime_factors()

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Linked

Related

Hot Network Questions

`prime_factors()`

`is_prime()`