This is my code for project euler #35 (in Python):
The number, 197, is called a circular prime because all rotations of the digits: 197, 971, and 719, are themselves prime.
There are thirteen such primes below 100: 2, 3, 5, 7, 11, 13, 17, 31, 37, 71, 73, 79, and 97.
How many circular primes are there below one million?
The program takes about 2 minutes to run, and I don't really know how to improve the efficiency of it. I'm using a sieve to generate the primes, and I'm only checking rotations until the rotation isn't prime.
def primes_sieve(limit):
limitn = limit+1
not_prime = [False] * limitn
primes = []
for i in range(2, limitn):
if not_prime[i]:
continue
for f in range(i*2, limitn, i):
not_prime[f] = True
primes.append(i)
return primes
primes = set(primes_sieve(1000000));
exclude = ['2','4','5','6','8','0']
def circularPrime(n):
ss = str(n);
for i in range(n):
if ss[-1] in exclude:
return 0;
if int(ss) not in primes:
return 0;
ss = ss[-1] + ss[:-1];
return 1;
gg = 0;
for num in primes:
gg += circularPrime(num);
print gg;
Any help is greatly appreciated.
-
\$\begingroup\$ I have no time for an actual review at the moment but I think my answer for a question linked to the same problem might interest you : codereview.stackexchange.com/questions/58610/… . I am not sure all optimisations have been covered by other people's answer. \$\endgroup\$SylvainD– SylvainD2015年11月17日 14:40:36 +00:00Commented Nov 17, 2015 at 14:40
5 Answers 5
BUG
Your program gives the wrong answer, yielding 53 instead of 55. The reason for this is you over-eagerly exclude rotations in circularPrime
if anything ends in an even digit or 5
. But 2
and 5
are prime, yet end in a digit that you exclude!
It turns out that the exclusion check (in addition to giving you the wrong answer) doesn't actually help you anyway. Timing your function run ten times:
with check: 6.094s
w/o check: 6.075s
So drop it, and you get the correct answer, and it runs faster (or the same, within noise).
Semicolons
Semicolons are a C/C++/Java/etc. thing. While they're technically correct Python syntax, you don't need them, and they look weird, so you can simply drop them.
Sieve
Typically, we'd start our sieve with something like is_prime = [True] * limitn
. It's a little weird to see it backwards, but I guess that's ok.
More efficient sieving
Let's take a look at the sieve part of your code, swapping bools like I suggested:
for i in range(2, limitn):
if not is_prime[i]:
continue
for f in range(i*2, limitn, i):
is_prime[f] = False
primes.append(i)
First of all, having the continue
makes it harder to read - better to keep the flow mono-directional:
for i in range(2, limitn):
if is_prime[i]:
primes.append(i)
for f in range(i*2, limitn, i):
is_prime[f] = False
Now, there's two issues here. In Python 2.7, range()
gives you a list. The full list. That's hugely wasteful since you never want the full list, only the next element one at a time. For that, there's xrange()
. Just swapping out those two functions:
range() 6.075s
xrange() 4.913s
Now, let's take a look at your inner loop:
for f in xrange(i*2, limitn, i):
is_prime[f] = False
f
is a bad variable name for this, prefer something like j
, but that's not important. Consider the bounds. At this point, we know that \$i\$ is prime. We also know that this is the first time we've gotten to it, so every multiple of \$i\$ is still marked as being potentially prime, except for those divisible by some other prime \$p < i\$. So \2ドルi\$ we already know isn't prime, because it's divisible by \2ドル\$. \3ドルi\$ likewise. And so on. The first composite number that we have to cross off will be the first one not divisible by any such prime \$p\$... which is \$i^2\$.
for j in xrange(i*i, limitn, i):
is_prime[j] = False
That improves us to:
OP, w/ bugfix 6.075s
range() -> xrange() 4.913s
inner loop from i^2 4.343s
Bring back the bug
I love deliberately provocative headings. Anyway, you had the right idea of checking for the even digits, but we can be more aggressive with it. Before we check any of the rotations, let's check for the presence of any of those digits first:
def circularPrime(n):
ss = str(n)
if any(c in ss for c in '024568'):
return 0
# now check everything else
This way, we can avoid a whole bunch of work for all of our 6-digit primes that start with an even number. However, from the very first thing I wrote, we know that this will give us the wrong answer. However, we know how to fix it. We know we are erroneously excluding 2 and 5, so let's just add them back:
gg = 2 # because 2,5
for num in primes:
gg += circularPrime(num)
return gg
That drops us down to:
OP, w/ bugfix 6.075s
range() -> xrange() 4.913s
inner loop from i^2 4.343s
excluding again 3.721s
I had originally said your check didn't give you any performance benefit, but this one does - since building up all the string rotations and doing all the int
casting is expensive and in this way we're able to avoid all of that up front. Since the number of circular primes is sparse (only 55 out of ~75k primes!), anything we can do to aggressively weed our set down is good.
There are things to comment upon in your code, and there are things to optimize.
Style review
- Avoid global code in between functions – Your code is hard to read as you sprinkle code and functions in between each other. This also makes it harder to reuse your code as a module, at some point in the time.
- Use good
snake_case
names for variables and functions – The pythonic way is to use name likecircular_prime
orlimit_n
. And what isss
andgg
? They don't describe the content or purpose of the variable - Constants can be declared at top, using
SNAKE_CASE
names – At leastexcludes
could be declared as a constant. Theprimes
is an edge case, which I'm a little ambivalent about. But anyway if you can avoid globals, that would be the best. exclude
as string, not list – In your code you useexclude
as a list of single characters, and you do ass[-1] in exclude
. The same effect an be used withexclude = '245680'
. It looks a little nicer- Using
sum()
on a list comprehension – Instead of your loop usinggg
, you could simply dogg = sum(circularPrime(num) for num in primes)
. - Change
circularPrime()
into a boolean function – I think it's more natural if this function returnedTrue
orFalse
, instead of 0 and 1. You could still sum up all the true values as indicated in previous bullet point - Avoid semicolons in Python – You don't need the semicolons, so strip them away
- Describe your functions – Commenting using docstrings or ordinary comments is good when the functions or code snippet becomes a little complicated.
Code optimisation
I'll not touch in on the sieve optimisation, as Barry has covered that already. However there are a few points I would like to make otherwise:
- In Python 2 you are much better of using
xrange()
, rather thanrange()
as the latter generates the entire list in memory, whilst the first only defines a generator which gives you the next number when you ask for it. Especially when getting to larger numbers, this really makes a difference - Try avoid repeated or unneccessary calculations – Barry has touched in on this already as he discusses the limits to use for the different loops. This can further be extended by noting that in your case if the number contains the even digits (or 5), one of the rotations will be a non-prime. You do return early if you find one, but you could also build a range of only the needed candidates, and not loop over all primes
- Similarily your correction from
range(n)
torange(len(ss))
, and even better toxrange(len(ss))
is a good catch. You do however test for being prime one time extra as when you callcircularPrime
you know it is a prime. However this is not extremely costly as it one extra lookup in your table - Another point, not verified in performance testings, could be to consider whether it would be faster to build
primes
as a list of strings, since your primary test against it is with strings which you has to convert back to an int. Similarily, it could be that using a plain list instead of a set could be worth considering. As a general principle it good advice to try to avoid to many convertions between different types. - Let functions return values, and don't print directly – To enhance reuse of functions, a function should normally return a value, and not print it out. That is of course given that it is not a presentation function of sorts, in which case it shouldn't do calculation in a greater extent.
Here is some code to test for circular primes, testing only possible candidates:
def all_digit_combinations(allowed_digits, maximum):
"""Yield all combinations of allowed_digits below maximum.
For allowed_digits being a list of single digits, i.e. [1, 3, 7, 9],
combine all variants of these digits until we pass the maximum value.
"""
no_of_digits = 1
while True:
for digit_tuple in itertools.product(allowed_digits, repeat=no_of_digits):
new_number = reduce(lambda rst, d: rst * 10 + d, digit_tuple)
if new_number < maximum:
yield new_number
else:
raise StopIteration
no_of_digits += 1
def is_circular_prime(n):
"""Verifies that n and all rotations are prime numbers."""
def are_rotations_prime(n):
ss = str(n);
for i in xrange(len(ss)):
if int(ss) not in PRIMES:
return False;
ss = ss[-1] + ss[:-1];
return True
return are_rotations_prime(n)
def count_circular_primes_holroy(n):
"""Count all circular primes lower than n.
A circular primes is a number which in all rotations of digits are
still prime numbers. That is, 197 is a circular primes at both 197,
971, and 719 are all primes. However 271, is not a circular prime,
since 712 are non-prime, even though 271 and 127 are prime numbers.
With exception of 2 and 5, all circular primes does indeed only
consist of the numbers 1, 3, 7, and 9.
"""
# Initialise count, remembering that 2 and 5 are circular primes
count = 2 if n > 5 else 1 if n > 2 else 0
return sum(is_circular_prime(n)
for n in all_digit_combinations([1, 3, 7, 9], n)) + count
if __name__ == '__main__':
n = 1000000
PRIMES = primes_sieve(n)
print('Count of circular primes below {}: {}'.format(n, count_circular_primes(n)))
This version runs substantially faster than your version. Here are some timing tests, where I left out the buidling of prime list, and left out your versions when it took to long:
count_circular_primes_org = 100 in 0.001 ms
count_circular_primes_mod = 100 in 0.000 ms
count_circular_primes_holroy = 100 in 0.000 ms
count_circular_primes_org = 1000 in 0.022 ms
count_circular_primes_mod = 1000 in 0.001 ms
count_circular_primes_holroy = 1000 in 0.001 ms
count_circular_primes_org = 10000 in 0.951 ms
count_circular_primes_mod = 10000 in 0.030 ms
count_circular_primes_holroy = 10000 in 0.012 ms
count_circular_primes_mod = 100000 in 2.098 ms
count_circular_primes_holroy = 100000 in 0.358 ms
count_circular_primes_holroy = 1000000 in 21.666 ms
The mod-version is your code, but using xrange(len(ss))
. The org-version is your code, modified to use xrange(n)
. And the last one is mine version. Notice how the time of my version doesn't grow as fast as your version due to not testing each and every number. Also do note that it could probably be even faster, if I implemented some memoisation (or similar) to avoid retesting whether all rotations are prime numbers. This effect would increase the more digits in the candidates there are.
I'm kinda stupid. This:
def circularPrime(n):
ss = str(n);
for i in range(n):
if ss[-1] in exclude:
return 0;
if int(ss) not in primes:
return 0;
ss = ss[-1] + ss[:-1];
return 1;
Needs to be this:
def circularPrime(n):
ss = str(n);
for i in range(len(ss)):
if ss[-1] in exclude:
return 0;
if int(ss) not in primes:
return 0;
ss = ss[-1] + ss[:-1];
return 1;
I'm unnecessarily running iterations in the loop. The code now runs in 2 seconds instead of 2 minutes.
-
3\$\begingroup\$ I'm not sure if it will be any faster, but I would check if number contains "even" digits first (
if any(i in exlude for i in ss): return False
) and then use itertools.permutations (return all(int(i) in primes for i in itertools.permutations(ss))
). Actually, I think it would be faster, as comprehensions are faster than loops, and check for containing "even" digits shouldn't take long comparing toint(i) in primes
\$\endgroup\$Alissa– Alissa2015年11月16日 15:43:46 +00:00Commented Nov 16, 2015 at 15:43 -
\$\begingroup\$ @Alissa That's a good suggestion, you could put it in an answer. You are correct that comprehensions/generators would be faster. \$\endgroup\$SuperBiasedMan– SuperBiasedMan2015年11月16日 16:45:10 +00:00Commented Nov 16, 2015 at 16:45
-
\$\begingroup\$ This gives the wrong answer, I ran your code and got 53. \$\endgroup\$Barry– Barry2015年11月16日 17:17:00 +00:00Commented Nov 16, 2015 at 17:17
Here is another way: use itertools.product to generale all possible candidates using digits 1,3,7,9 with number length from 1 to 6. Then check if all the rotations of the candidates are prime. If they are, add that to answer, if not, continue with next candidate.
To prevent calculating duplicates as we might have a rotation of another number in candidates, we also use a used dictionary to check if we have seen any of the rotations of the number we are about to check.
As an identifier, I used the max number of the possible rotations, but could be min as well. Here is the code with this version:
import math
import itertools
def get_rotations(num): # returns rotations of a number
string_num = str(num)
arr=[]
for i in range(len(string_num)):
arr.append(int(string_num[i:] + string_num[0:i]))
return arr
def isPrime(n): # we do not check a lot of numbers, so sieve might not be worth it
if n == 1 or n % 2 == 0:
return False
x=3
while(x*x <= n):
if n % x == 0:
return False
x+=2;
return True
answer = 2 # 2 and 5
used = {}
for x in range(1,7):
candidates = set([int(''.join(map(str,uni))) for uni in itertools.product([1,3,7,9], repeat=x)])
for candidate in candidates:
rotations = set(get_rotations(candidate))
st = max(rotations)
if st not in used:
used[st] = True
isCircular = True # we assume its a circular prime
counter = 0;
for x in rotations:
if not isPrime(x):
isCircular = False
break;
if isCircular: # if it was a circular Prime, add to answer the unique permutations
answer+=len(rotations)
print answer
-
\$\begingroup\$ This is the best suggestion in this thread. You are testing only 4096 items instead of several millions. \$\endgroup\$ArekBulski– ArekBulski2015年11月30日 14:24:03 +00:00Commented Nov 30, 2015 at 14:24
-
\$\begingroup\$ @ArekBulski thanks! Holroy solution is quite similar though \$\endgroup\$juvian– juvian2015年11月30日 16:46:11 +00:00Commented Nov 30, 2015 at 16:46
This should be a bit faster: UPD: permutations won't work here, but we can make a lovely cycle function
def cycle(str_num):
return [str_num[i:]+str_num[:i] for i in range(len(str_num))]
def circularPrime(n):
ss = str(n);
if any(i in exlude for i in ss): return 0
return all(int(i) in primes for i in cycle(ss))
It makes one extra loop, but it's fixed-size, and shouldn't take much time comparing to the loop iterating through permutations. But it is known that permutations work faster than usual for-loops (explanation).
Another thing to do is to remove primes = set(...)
as your generator takes each prime only once (and making set out of a list is pretty expensive)
Also there is one more thing to fasten: checking if a permutation is prime. Let's use the fact that we only use prime generating function once, and the fact that accessing a list element by index is O(1) while checking x in list
is O(len(list))
limit = 1000000+1
#I inverted this list, but it doesn't make much difference
is_prime = [True] * limit
primes = []
for i in range(2, limit):
if not is_prime[i]:
continue
for f in range(i*2, limit, i):
is_prime[f] = False
primes.append(i)
evens = ['2','4','5','6','8','0']
# just an option, not sure which is faster, we can make cycle function to be generator
def cycle(str_num):
for i in range(len(str_num)):
yield str_num[i:]+str_num[:i]
def is_circular(n):
str_number = str(n)
#I've added a check here, because otherwise 2 isn't counted (contains '2')
if n == 2: return True
if any(i in evens for i in str_number): return False
return all(is_prime[int(i)] for i in cycle(str_number))
count = 0
for num in primes:
count += 1 if is_circular(num) else 0
print count
Please note also naming fixes I made (clear names are better than short ones), and remember: we don't need semicolons in Python.
-
\$\begingroup\$ You also needs to add an exception for
n==5
, I think \$\endgroup\$holroy– holroy2015年11月16日 22:24:26 +00:00Commented Nov 16, 2015 at 22:24 -
\$\begingroup\$ You forgot
5
for the2
check. Also, I'm not sure about how "known" it is that this code is faster. I just timed it and it's slower than OP's. \$\endgroup\$Barry– Barry2015年11月16日 22:34:16 +00:00Commented Nov 16, 2015 at 22:34 -
\$\begingroup\$ I think it might be slower because for small numbers there are not much combos to check, much less than number of
evens
which we also check. I think that complexity of my version is lower, but it will be faster only on bigger numbers. I haven't thought about the fact that a million is not that big. \$\endgroup\$Alissa– Alissa2015年11月17日 09:27:51 +00:00Commented Nov 17, 2015 at 9:27
Explore related questions
See similar questions with these tags.