I have created a program to find prime numbers up to the 32nd power of 2. It uses the sieve of Eratosthenes. I would appreciate it if you could let me know any bugs or improvements I can make.
/*
* prime32.c
*/
#include <stdio.h>
char isprime1[0x10000];
char isprime2[0x10000];
typedef unsigned int UINT;
void solve1() {
int i, j;
for (i = 0; i < 0x10000; ++i) {
isprime1[i] = 1;
}
isprime1[0] = 0;
for (i = 2; i <= 0x10000; ++i) {
if (!isprime1[i-1]) continue;
for (j = 2 * i - 1; j < 0x10000; j += i) {
isprime1[j] = 0;
}
}
}
void solve2(UINT start) {
long long int i, j, offset;
for (i = 0xFFFF; i >= 0; --i)
isprime2[i] = 1;
for (i = 1; i <= 0x10000; ++i) {
if (!isprime1[i-1]) continue;
offset = (start + 0xFFFF) % i;
for (j = 0xFFFF - offset; j >= 0; j -= i) {
isprime2[j] = 0;
}
}
}
void print1(FILE *fp) {
int i;
for (i = 0; i < 0x10000; ++i) {
if (!isprime1[i]) continue;
fprintf(fp, "%d\n", i + 1);
}
}
void print2(FILE *fp, UINT start) {
long long int i;
for (i = 0; i < 0x10000; ++i) {
if (!isprime2[i]) continue;
fprintf(fp, "%lld\n", start + i);
}
}
int main() {
FILE *fp;
UINT i;
solve1();
fp = fopen("prime32.out", "w");
print1(fp);
for (i = 0x10001; i != 1; i += 0x10000) {
solve2(i);
print2(fp, i);
}
fclose(fp);
return 0;
}
2 Answers 2
int i, j; for (i = 0; i < 0x10000; ++i) {
Largest int
may be as small as 32767 (0x7FFF), so this loop may be infinite when targeting such platforms. I recommend you include <stdint.h>
and use (e.g.) uint_fast32_t i
.
fp = fopen("prime32.out", "w"); print1(fp);
When fopen()
fails (as it may well do if this program is run with a working directory that is not writable by the user), then it returns a null pointer. We should test that fp
is non-null before attempting to use it.
I recommend simply writing to standard output, rather than to a hard-coded file name - that's more flexible, as a user can connect that stream to anywhere they want - a different file, or input to another program, for example.
The isprime1[i] = 1
loop of each function would be more readable if replaced with memset(isprime1, 1, sizeof isprime1)
. memset()
is declared in <string.h>
.
I don't like the magic numbers 0x10000 and 0xFFFF sprinkled all over the place. If we changed one of these, would we easily know which others to change, and which to leave? Defining relevant constants would help with that.
There are more efficient implementations of the Sieve of Eratosthenes - for example, we only need to start marking off multiples of prime i
beginning at i*i
, as lower values have already been accounted for earlier in the process.
I don't really see the point of solve2()
instead of simply making isprime1
larger - perhaps some explanatory comments would help?
-
4\$\begingroup\$ Toby's idea of writing to
stdout
also applies to reading from a file. When I write small programs like this I like to take values fromstdin
instead of reading from files. That way you don't have to worry about opening and closing files or dealing with file paths/permissions, and as a bonus if decide to keep your mini-project you have a nice Unix-y utility. \$\endgroup\$Ajinkya Kamat– Ajinkya Kamat2023年12月24日 21:59:22 +00:00Commented Dec 24, 2023 at 21:59
Breaking issue: 1 isn't prime, but solve1 seems to treat it as prime (sets it to 1 in the first loop, then skips it in the second because it starts at i=2).
Breaking issue: If I understand right, on 64 bit machines, for (i = 0x10001; i != 1; i += 0x10000) {
with is set to UINT (unsigned int), seems to be looping from 0x0001 0001
to 0xFFFF 0001
which could be the 64th power of 2, not the 32nd as stated in the description. This will take likely some time to execute.
Breaking issue: solve2
takes a UINT (unsigned int), but offset
is a signed long long int. On a 64 bit machine, both may well be 64 bits, in which case you'll have problems once offset = 0x7FFF 0000
.
Minor nit: There are no comments. Unless you are forbidden from commenting, comment your code. If you are forbidden from commenting, still comment your code, then submit a copy with the comments stripped to whomever forbids them.
Minor nit: The "isprime" arrays are misnamed, since they do not say whether stuff in them is prime, until processing is complete and they are no longer used. "isUntestedOrPrime" would be more truthful.
Minor nit: "isprime" is two words, so for readability should be in camelCase or snake_case: isPrime or is_prime, depending on your style guide. If you have no style guide to follow, snake_case is generally preferred because it helps translation for readers who are not fluent in English.
Minor nit: You set your arrays with ints, but test them as boolean. Consistency either way would be more readable.
Minor nit: Your first loop in solve1 starts at 0, but then immediately after the loop, you overwrite the zeroth item. Consider changing to start at i=1, or as Toby Speight sagely points out, just use memset. You could also just initialize isprime to all 1s at creation time, which would also make it clear that this is an array that is populated only once (by solve1).
Minor nit: Your second loop in solve1 starts at 2, so you're assuming the primeness of 0 and 1. At that point, why not just hardcode all of them? At the very least, the first non-prime is 4, so since you've preset everything to prime in the memset, you can start at 3 and have i+=2 to skip all the even numbers higher than 2 (but then you'd need to zero them our separately; or you could make your arrays half the size and have them only track odd numbers, but that'd take fiddlier math).
-
1\$\begingroup\$
unsigned int
should be 32bit in both 32bit and 64bit environments. At least that was the case withgcc
. \$\endgroup\$Haruo Wakakusa– Haruo Wakakusa2023年12月25日 13:47:31 +00:00Commented Dec 25, 2023 at 13:47 -
4\$\begingroup\$ @HaruoWakakusa I think int datatype sizes are mandatory minimums, rather than absolute sizes. Everything larger than byte can be 64 bits and still meet the requirements that
long long >= long >= int >= short
. So fixed-width integer types (C99, stdint.h) should ideally be used when mixing int types, especially where we're explicitly relying on overflow, as here. Unfortunately, compilers make very poor sources of truth for standards, and relying on them can result in issues like this. But so does the internet, so don't trust me: or at least, trust if you like, but always verify if important! \$\endgroup\$Dewi Morgan– Dewi Morgan2023年12月25日 14:29:44 +00:00Commented Dec 25, 2023 at 14:29 -
\$\begingroup\$ Minor nit:
isprime()
is fine.is_prime()
is OK.isPrime()
is OK, ... Too minor an issue. \$\endgroup\$chux– chux2023年12月26日 16:42:43 +00:00Commented Dec 26, 2023 at 16:42 -
1\$\begingroup\$ @chux,
isprime()
is not fine in my projects, because it's reserved for future library use (begins withis
followed by lowercase letter). \$\endgroup\$Toby Speight– Toby Speight2024年01月03日 17:21:30 +00:00Commented Jan 3, 2024 at 17:21 -
\$\begingroup\$ @TobySpeight Fair enough: "7.33.2 Character handling <ctype.h> Function names that begin with either
is
orto
, and a lowercase letter are potentially reserved identifiers and may be added to the declarations in the <ctype.h> header." IMO, Wishy-washy "potentially reserved". \$\endgroup\$chux– chux2024年01月03日 19:17:09 +00:00Commented Jan 3, 2024 at 19:17