I have devised my own version for a Goldbach verification project which uses sets to compare already sieved primes. My implementation also uses generators to "sieve a segment" since storing base 10 numbers in such an extensive list is memory intensive. However, as in most cases, it is not as efficient as I would wish it to be. I hope to shorten the time complexity sufficiently by optimizing one of the implementations found in the program. In addition, to understand my program further, I used the difference in primes, which would be subtracted from the even number consecutively until the difference in the final set was zero.
Set comprehension
f_stack = {i for i in range(6, ((seg + 1)) * seg_size, 2)}
This piece is particularly slow in comparison to the rest of the program when n
is sufficiently large. I garner that the loops themselves are also slow when n
gets sufficiently large so I suppose some extreme loop unrolling in assembly would fix this, but there must be some more practical solution.
General algorithm
The implementation I made I believe could have some improvements in the general algorithm that offers more performance, maybe some bitwise implementation, or another data structure that has the comparison functionalities of a set, yet maintains a more inordinate capacity for iterating . And maybe even an implementation that takes advantage of the clock speed on the cpu to maximize performance. The most obvious performance increase would be to implement multiprocessing, but I wish to improve on the progression of the algorithm itself.
And yes I know that I do not have any if name main, and I also may have messed up on some of my type hints.
def primesieve (n:int) -> tuple:
sieve = [True] * (n // 2)
for i in range(3, int(n ** 0.5) + 1, 2):
if sieve[i // 2]:
sieve[i * i // 2::i] = [False] * ((n - i * i - 1) // (2 * i) + 1)
return [2] + [2 * i + 1 for i in range(1, n// 2) if sieve[i]]
def split_list(a_list):
half = len(a_list)//2
return [0] + a_list[1:half]
def difference(primes: list):
result = [b - a for a, b in zip(primes[:-1], primes[1:])]
return result
def stack(seg: int, seg_size: int,pm: object, fuprime: set):
if seg == 0:
f_stack = {i for i in range(6, ((seg + 1)) * seg_size, 2)}
else:
f_stack = {i for i in range(seg * seg_size, ((seg + 1)) * seg_size, 2)}
for j in pm:
f_stack = {(i - j) for i in f_stack}
f_stack = f_stack.difference(fuprime)
yield f_stack
def prime_gen(prime:list):
yield 0
for i in prime:
yield i
def main():
limit = 10 ** 6
fuprime = primesieve(limit)
seg_size = 10 ** 6
prime = split_list(fuprime)
prime = difference(prime)
for i in range(0, (limit // seg_size)):
pm = prime_gen(prime)
stacks = stack(i, seg_size, pm, fuprime)
for sets in stacks:
if sets.difference(fuprime) == set():
break
pass
main()
-
1\$\begingroup\$ To use less memory you can optimize the sieve. You can take inspiration from this. \$\endgroup\$user140242– user1402422023年01月26日 13:49:57 +00:00Commented Jan 26, 2023 at 13:49
1 Answer 1
Layout
There is inconsistent vertical whitespace and space around operators. The black program can be used to automatically format the code for consistency.
Simpler
These lines in the difference
function:
result = [b - a for a, b in zip(primes[:-1], primes[1:])]
return result
can be combined into one line:
return [b - a for a, b in zip(primes[:-1], primes[1:])]
This eliminates the generically-named result
variable.
In the main
function this line can be deleted:
pass
Main guard
Add the "main" guard:
if __name__ == "__main__":
main()
Documentation
The PEP 8 style guide recommends adding docstrings for functions.
Also, add a header docstring at the top of the file to summarize the code's purpose and to describe the algorithm:
"""
Goldbach verification project which uses sets to compare already sieved primes.
"""
Naming
PEP 8 recommends snake_case for function and variable names.
primesieve
would be prime_sieve
.
For fuprime
, you should change fu
to be something more meaningful --
and less offensive :)
DRY
These 2 lines are nearly identical:
f_stack = {i for i in range(6, ((seg + 1)) * seg_size, 2)}
f_stack = {i for i in range(seg * seg_size, ((seg + 1)) * seg_size, 2)}
You could create a variable for the start of the range:
range_start = 6 if seg == 0 else seg * seg_size
f_stack = {i for i in range(range_start, ((seg + 1)) * seg_size, 2)}
Lint check
Since you mentioned set comprehensions as a concern, pylint identified a few issues:
R1721: Unnecessary use of a comprehension, use set(range(6, (seg + 1) * seg_size, 2)) instead. (unnecessary-comprehension)
R1721: Unnecessary use of a comprehension, use set(range(seg * seg_size, (seg + 1) * seg_size, 2)) instead. (unnecessary-comprehension)