Count SemiPrimes task from Codility

Question 1

I'm trying to solve the following exercise from Codility:

A prime is a positive integer X that has exactly two distinct divisors: 1 and X. The first few prime integers are 2, 3, 5, 7, 11 and 13.

A semiprime is a natural number that is the product of two (not necessarily distinct) prime numbers. The first few semiprimes are 4, 6, 9, 10, 14, 15, 21, 22, 25, 26.

You are given two non-empty arrays P and Q, each consisting of M integers. These arrays represent queries about the number of semiprimes within specified ranges.

Query K requires you to find the number of semiprimes within the range (P[K], Q[K]), where 1 ≤ P[K] ≤ Q[K] ≤ N.

For example, consider an integer N = 26 and arrays P, Q such that:
P[0] = 1 Q[0] = 26
P[1] = 4 Q[1] = 10
P[2] = 16 Q[2] = 20
The number of semiprimes within each of these ranges is as follows:

(1, 26) is 10, (4, 10) is 4, (16, 20) is 0. Write a function:

def solution(N, P, Q)

that, given an integer N and two non-empty arrays P and Q consisting of M integers, returns an array consisting of M elements specifying the consecutive answers to all the queries.

For example, given an integer N = 26 and arrays P, Q such that:
P[0] = 1 Q[0] = 26
P[1] = 4 Q[1] = 10
P[2] = 16 Q[2] = 20
the function should return the values [10, 4, 0], as explained above.

Write an efficient algorithm for the following assumptions:

N is an integer within the range [1..50,000]; M is an integer within the range [1..30,000]; each element of arrays P and Q is an integer within the range [1..N]; P[i] ≤ Q[i]. https://app.codility.com/programmers/lessons/11-sieve_of_eratosthenes/count_semiprimes/

My idea is to first find the primes in a given range, then find the semiprimes by multiplying the primes with each other, calculate the prefix sum of the semiprimes and finally return the number of semiprimes in a given slice by subtracting the prefix sum of the lower index from the higher.

My code is:

def find_primes(n):
 primes = set(range(1, n+1))
 i = 2
 while i * i <= n:
 if i in primes:
 k = i * i
 while k <=n :
 if k in primes:
 primes.remove(k)
 k += i
 i += 1
 return list(primes)
def find_semi_primes(n):
 primes = find_primes(n)
 p = len(primes)
 semi_primes = set()
 for i in range(1, p):
 for j in range(i, p):
 p1 = primes[i]
 p2 = primes[j]
 if p1 * p2 > n:
 break
 semi_primes.add(p1 * p2)
 return semi_primes
def solution(N, P, Q):
 semi_primes = find_semi_primes(N)
 prefix_count = (N+1) * [0]
 for n in range(1, N+1):
 prefix_count[n] = prefix_count[n-1]
 if n in semi_primes:
 prefix_count[n] += 1
 q = len(P)
 result = q * [0]
 for i in range(q):
 result[i] = prefix_count[Q[i]] - prefix_count[P[i]-1]
 return result

Question 2

Here are some suggestions, mainly regarding coding style.

Syntax

When I copied your code from the question and tried to run it, I got a compile error due to the missing indentation in the find_primes function:

if k in primes:
primes.remove(k)

Perhaps there was a problem when you copied the code into the post. This works:

if k in primes:
 primes.remove(k)

Let's start from the top:

def find_primes(n):

That's a good name for a function. However, it doesn't specifically convey what it is doing. That could be improved with a docstring as recommended by the PEP 8 style guide. The docstring could summarize the function's purpose as well as specify the input type and return type:

def find_primes(n):
 """
 Find all primes numbers less than or equal to the input.
 The input (n) is expected to be a positive integer.
 Return a list of primes.
 """

primes is a good name for an array variable:

primes = set(range(1, n+1))

However, it at first seems a little misleading because you are initially setting it to all integers, not just prime numbers. I recommend adding a comment stating that the non-primes will be removed from the array.

It would also be helpful to add comments regarding your algorithm, explaining your nested while loops.

This line uses inconsistent spacing around the operators:

while k <=n :

The black program can be used to automatically reformat the code with consistency:

while k <= n:

The common definition of prime numbers does not include 1, yet your find_primes function returns 1 in its primes array. Even the Codility description shows the first prime number as 2. I suggest adding a comment in the code for this.

You should also add a docstring for the find_semi_primes function, which should include a description of what a "semiprime" is. It should also include a description of the algorithm, explaining the nested loops.

The expression p1 * p2 is repeated a couple times. You can avoid the DRY using a new variable named product:

for j in range(i, p):
 product = primes[i] * primes[j]
 if product > n:
 break
 semi_primes.add(product)

Question 3

This code seems very "procedural" in nature, looking like the work of a C or C++ programmer rather than that of a fluent Pythonista. For example, I see a lot of for i in range(): loops where the idiomatic Python would use a comprehension or transform.

Let's start by correcting find_primes() - 1 is not a prime, so the initial set should start at 2:

 primes = set(range(2, n+1))

Then instead of ignoring the first element in find_semi_primes(), we should index from zero:

 for i in range(0, p):

That makes the code more understandable for readers used to working with primes.

Consider the final loop:

q = len(P)
result = q * [0]
for i in range(q):
 result[i] = prefix_count[Q[i]] - prefix_count[P[i]-1]
return result

My Pythonic replacement uses map to evaluate each query and produce the result:

def semiprimes_in_range(lower, upper):
 return prefix_count[upper] - prefix_count[lower-1]
return list(map(semiprimes_in_range, p, q))

It's possible to write find_semiprimes can be written more readably as:

def find_semiprimes(n):
 primes = find_primes((n+1)//2)
 return {x * y for x, y in itertools.combinations_with_replacement(primes, 2 )}

However, I don't recommend this because it generates many semiprimes greater than n, hugely impacting performance.

But do note the ceil division by 2 - since the smallest prime is 2, we know that's the smallest possible factor of the semiprimes.

We can use itertools.accumulate() to generate prefix_count - that's another chunk of code we don't have to write ourselves:

 prefix_count = list(itertools.accumulate(map(lambda x: x in semiprimes, range(n+1))))

My modified code

It's not perfect, but demonstrates my suggestions as a complete program.

import itertools
def find_primes(n):
 '''
 Generate prime numbers up to (and including) 'n'.
 '''
 # Implementation is sieve of Eratosthenes.
 primes = set(range(2, n+1))
 i = 2
 while i * i <= n:
 if i in primes:
 k = i * i
 while k <= n:
 if k in primes:
 primes.remove(k)
 k += i
 i += 1
 return sorted(primes)
def find_semiprimes(n):
 '''
 Generate semiprime numbers up to (and including) 'n'.
 '''
 primes = find_primes(n)
 p = len(primes)
 semiprimes = set()
 for i in range(0, p):
 for j in range(i, p):
 p1 = primes[i]
 p2 = primes[j]
 if p1 * p2 > n:
 break
 semiprimes.add(p1 * p2)
 return semiprimes
def solution(n, p, q):
 '''
 Create a list containing the following elements:
 For every element of p and corresponding element of q (both ≤ n),
 count how many semiprimes in the range [p, q).
 '''
 semiprimes = find_semiprimes(n)
 prefix_count = list(itertools.accumulate(map(lambda x: x in semiprimes, range(n+1))))
 def semiprimes_in_range(lower, upper):
 return prefix_count[upper] - prefix_count[lower-1]
 return list(map(semiprimes_in_range, p, q))
if __name__ == '__main__':
 print(solution(26, [1, 4, 16], [26, 10, 20]))

Question 4

A bug: find_semi_primes assumes that the list of primes is in sorted order (it skips index 0 where it assumes the number 1 is, and it breaks the inner loop when the semiprime becomes too large). But find_primes returns list(primes) where primes is a set, and sets are unordered. So you might get any arbitrary order instead of sorted order. (In CPython you currently do get sorted order, but that's due to implementation details that should not be relied on.)

And I'd change

if k in primes:
 primes.remove(k)

to

primes.discard(k)

for simplicity.

toolic toolic 14.9k5 gold badges29 silver badges208 bronze badges · Answer 1 · 2025-04-30 10:51:32Z

Here are some suggestions, mainly regarding coding style.

Syntax

When I copied your code from the question and tried to run it, I got a compile error due to the missing indentation in the find_primes function:

if k in primes:
primes.remove(k)

Perhaps there was a problem when you copied the code into the post. This works:

if k in primes:
 primes.remove(k)

Let's start from the top:

def find_primes(n):

That's a good name for a function. However, it doesn't specifically convey what it is doing. That could be improved with a docstring as recommended by the PEP 8 style guide. The docstring could summarize the function's purpose as well as specify the input type and return type:

def find_primes(n):
 """
 Find all primes numbers less than or equal to the input.
 The input (n) is expected to be a positive integer.
 Return a list of primes.
 """

primes is a good name for an array variable:

primes = set(range(1, n+1))

However, it at first seems a little misleading because you are initially setting it to all integers, not just prime numbers. I recommend adding a comment stating that the non-primes will be removed from the array.

It would also be helpful to add comments regarding your algorithm, explaining your nested while loops.

This line uses inconsistent spacing around the operators:

while k <=n :

The black program can be used to automatically reformat the code with consistency:

while k <= n:

The common definition of prime numbers does not include 1, yet your find_primes function returns 1 in its primes array. Even the Codility description shows the first prime number as 2. I suggest adding a comment in the code for this.

You should also add a docstring for the find_semi_primes function, which should include a description of what a "semiprime" is. It should also include a description of the algorithm, explaining the nested loops.

The expression p1 * p2 is repeated a couple times. You can avoid the DRY using a new variable named product:

for j in range(i, p):
 product = primes[i] * primes[j]
 if product > n:
 break
 semi_primes.add(product)

Toby Speight Toby Speight 87.8k14 gold badges104 silver badges325 bronze badges · Answer 2 · 2025-05-01 07:17:50Z

This code seems very "procedural" in nature, looking like the work of a C or C++ programmer rather than that of a fluent Pythonista. For example, I see a lot of for i in range(): loops where the idiomatic Python would use a comprehension or transform.

Let's start by correcting find_primes() - 1 is not a prime, so the initial set should start at 2:

 primes = set(range(2, n+1))

Then instead of ignoring the first element in find_semi_primes(), we should index from zero:

 for i in range(0, p):

That makes the code more understandable for readers used to working with primes.

Consider the final loop:

q = len(P)
result = q * [0]
for i in range(q):
 result[i] = prefix_count[Q[i]] - prefix_count[P[i]-1]
return result

My Pythonic replacement uses map to evaluate each query and produce the result:

def semiprimes_in_range(lower, upper):
 return prefix_count[upper] - prefix_count[lower-1]
return list(map(semiprimes_in_range, p, q))

It's possible to write find_semiprimes can be written more readably as:

def find_semiprimes(n):
 primes = find_primes((n+1)//2)
 return {x * y for x, y in itertools.combinations_with_replacement(primes, 2 )}

However, I don't recommend this because it generates many semiprimes greater than n, hugely impacting performance.

But do note the ceil division by 2 - since the smallest prime is 2, we know that's the smallest possible factor of the semiprimes.

We can use itertools.accumulate() to generate prefix_count - that's another chunk of code we don't have to write ourselves:

 prefix_count = list(itertools.accumulate(map(lambda x: x in semiprimes, range(n+1))))

My modified code

It's not perfect, but demonstrates my suggestions as a complete program.

import itertools
def find_primes(n):
 '''
 Generate prime numbers up to (and including) 'n'.
 '''
 # Implementation is sieve of Eratosthenes.
 primes = set(range(2, n+1))
 i = 2
 while i * i <= n:
 if i in primes:
 k = i * i
 while k <= n:
 if k in primes:
 primes.remove(k)
 k += i
 i += 1
 return sorted(primes)
def find_semiprimes(n):
 '''
 Generate semiprime numbers up to (and including) 'n'.
 '''
 primes = find_primes(n)
 p = len(primes)
 semiprimes = set()
 for i in range(0, p):
 for j in range(i, p):
 p1 = primes[i]
 p2 = primes[j]
 if p1 * p2 > n:
 break
 semiprimes.add(p1 * p2)
 return semiprimes
def solution(n, p, q):
 '''
 Create a list containing the following elements:
 For every element of p and corresponding element of q (both ≤ n),
 count how many semiprimes in the range [p, q).
 '''
 semiprimes = find_semiprimes(n)
 prefix_count = list(itertools.accumulate(map(lambda x: x in semiprimes, range(n+1))))
 def semiprimes_in_range(lower, upper):
 return prefix_count[upper] - prefix_count[lower-1]
 return list(map(semiprimes_in_range, p, q))
if __name__ == '__main__':
 print(solution(26, [1, 4, 16], [26, 10, 20]))

Kelly Bundy Kelly Bundy 3,2467 silver badges21 bronze badges · Answer 3 · 2025-05-01 14:05:29Z

A bug: find_semi_primes assumes that the list of primes is in sorted order (it skips index 0 where it assumes the number 1 is, and it breaks the inner loop when the semiprime becomes too large). But find_primes returns list(primes) where primes is a set, and sets are unordered. So you might get any arbitrary order instead of sorted order. (In CPython you currently do get sorted order, but that's due to implementation details that should not be relied on.)

And I'd change

if k in primes:
 primes.remove(k)

to

primes.discard(k)

for simplicity.

Stack Exchange Network

Count SemiPrimes task from Codility

3 Answers 3

Syntax

My modified code

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

Count SemiPrimes task from Codility

3 Answers 3

Syntax

My modified code

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions