I'm doing an Objective-C project and I need to do some prime factorization, so I came up with the following algorithm, which is 99% C except for the array part, which was easiest to implement in Objective-C because it dynamically allocates new memory for elements as they are added. The Objective-C part is pretty self-explanatory.
I'm mainly looking for feedback on the algorithm, not necessarily the coding style, but I will gladly take feedback on anything I can improve! On my machine with -O3 optimizations, I get just under two seconds for factorizing 1234567890. I really don't know if that is good or bad. One thing I would like to add is a table of previously calculated factorizations, since with large numbers you will most likely encounter the same values more than once (e.g. 4, 6, etc.). I could also make the function append a 2 to the factors whenever n is even and skip the division/prime check, which may improve performance.
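For reference, a memo table along those lines might look like the sketch below; the names factorCache and cachedFactorize are hypothetical, and this is not part of the posted code:

#import <Foundation/Foundation.h>

// Hypothetical memo table: maps each boxed n to its previously computed
// factor list, so repeated inputs are factorized only once.
static NSMutableDictionary *factorCache;

static NSArray *cachedFactorize(long n) {
    NSNumber *key = @(n);
    NSArray *hit = factorCache[key];
    if (hit) return hit;                         // reuse an earlier result
    NSMutableArray *result = [NSMutableArray array];
    long m = n;
    for (long i = 2; i * i <= m; i += (i == 2 ? 1 : 2)) {
        while (m % i == 0) {                     // peel off each prime factor
            [result addObject:@(i)];
            m /= i;
        }
    }
    if (m > 1) [result addObject:@(m)];          // whatever remains is prime
    factorCache[key] = result;
    return result;
}

factorCache would need to be initialized once, e.g. factorCache = [NSMutableDictionary dictionary]; at the top of main(), before the first call.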
#import <Foundation/Foundation.h>
#include <math.h>     // sqrt()
#include <stdbool.h>  // bool

// Declare a global array for the factors
NSMutableArray *factors;

bool isPrime(long n) {
    // Check if even or ends in a five
    if ((n % 2 == 0 && n > 2) || (n > 5 && n % 10 == 5)) {
        return 0;
    }
    if (n == 2) return 1;
    // Set upper bound to square root
    long u = sqrt(n);
    long c = 0;
    for (long i = 3; i <= u; i += 2)
        if (n % i == 0) {
            c++;
            break;
        }
    return !c;
}
void factorize(long n) {
    // If prime, just record it and return
    if (isPrime(n)) {
        [factors addObject:@(n)];
        return;
    }
    for (long i = n/2; i > 1; i--) {
        if (n % i) continue; // Skip i if it does not divide n
        long d = n/i;
        bool iP = isPrime(i), dP = isPrime(d);
        if (iP)
            [factors addObject:@(i)];
        else
            factorize(i);
        if (dP)
            [factors addObject:@(d)];
        else
            factorize(d);
        return;
    }
}

int main(int argc, const char * argv[]) {
    factors = [NSMutableArray array];
    long n = 1234567890;
    factorize(n);
    NSLog(@"%@", factors);
    return 0;
}
- Why bother optimizing a slow algorithm when you can just replace it with another (easy to implement) algorithm that will instantly factor all 64-bit integers in under a few thousand or so simple iterations? Work smart, not hard: en.wikipedia.org/wiki/Pollard%27s_rho_algorithm – Thomas, Sep 5, 2014

- @awashburn This is prime factorization, not primality testing. – Thomas, Sep 5, 2014

- @Thomas I'd still argue against a stochastic algorithm... – recursion.ninja, Sep 5, 2014

- @awashburn The algorithm I linked is deterministic, and randomizing it doesn't really gain anything. Most nontrivial factorization algorithms are probabilistic, though; that's just the reality of things. For 64-bit inputs, I find that deterministic Miller-Rabin and rho (after trial division by the first few primes, for efficiency) is as fast as it gets. I haven't tried ECM, but I suspect it has too much overhead in this case. – Thomas, Sep 5, 2014

- @Thomas But the function you linked to uses stochastic principles to define a partial (non-total) function. You're right that it is deterministic, but I still wouldn't recommend a partial function... – recursion.ninja, Sep 5, 2014
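For context on the Pollard's rho suggestion in the first comment, here is a compact C sketch of the idea (Floyd's cycle-finding variant; it assumes a composite input and uses GCC/Clang's __int128 for overflow-free modular multiplication; none of this is code from the thread):

// Euclid's gcd for the sketch below.
static long gcd_long(long a, long b) {
    while (b) { long t = a % b; a = b; b = t; }
    return a;
}

// Pseudo-random step f(x) = (x*x + c) mod n; __int128 avoids overflow.
static long step(long x, long c, long n) {
    return (long)(((__int128)x * x + c) % n);
}

// Returns a nontrivial factor of a composite n. It would not terminate on a
// prime, so callers are expected to run a primality test first.
long pollard_rho(long n) {
    if (n % 2 == 0) return 2;
    for (long c = 1; ; c++) {               // retry with a new constant on failure
        long x = 2, y = 2, d = 1;
        while (d == 1) {
            x = step(x, c, n);              // tortoise: one step
            y = step(step(y, c, n), c, n);  // hare: two steps
            d = gcd_long(x > y ? x - y : y - x, n);
        }
        if (d != n) return d;               // found a proper factor
    }
}

In practice one first runs a primality test (such as the deterministic Miller-Rabin that Thomas mentions) and calls this only on numbers known to be composite.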
4 Answers
Your algorithm tries to find the largest factors first. This is ineffective because you have to test each possible factor for primality. It is much more effective to start with the smallest factor. If you divide the number by each factor found this way, you will get only prime factors, and the prime-testing function isPrime() becomes obsolete. You start by dividing by 2 and then test possible odd factors in increasing order. (If you have a pre-computed list of primes, this can be improved further.)

On my computer this reduced the runtime for the factorization of 1234567890 from 1.6 seconds to 0.00005 seconds.

In addition, with this change a global factors variable is no longer necessary, and your method could look like this:
NSMutableArray * primeFactorization(long n) {
    NSMutableArray *factors = [NSMutableArray array];

    // Divide by 2:
    while (n > 1 && n % 2 == 0) {
        [factors addObject:@(2)];
        n /= 2;
    }

    // Divide by 3, 5, 7, ...
    //
    // i is a possible *smallest* factor of the (remaining) number n.
    // If i * i > n then n is either 1 or a prime number.
    for (long i = 3; i * i <= n; i += 2) {
        while (n > 1 && n % i == 0) {
            [factors addObject:@(i)];
            n /= i;
        }
    }

    if (n > 1) {
        // Append last prime factor:
        [factors addObject:@(n)];
    }
    return factors;
}
int main(int argc, const char * argv[]) {
    long n = 1234567890;
    NSDate *startTime = [NSDate date];
    NSArray *factors = primeFactorization(n);
    NSLog(@"time: %f", -[startTime timeIntervalSinceNow]);
    NSLog(@"%@", factors);
    return 0;
}
- Since you don't use argc & argv, I would put void in their spot. At least, that is what I would do in C. Other than that, very good advice: +1 – syb0rg, Sep 5, 2014
Feedback on the algorithm
//Check if even or ends in a five
if ((n % 2 == 0 && n > 2) || (n > 5 && n % 10 == 5)) {
    return 0;
}
The part that checks whether the number ends in a five adds very little optimization, since divisibility by 5 is tested anyway on the second iteration of the loop that follows.
for (long i = 3; i <= u; i += 2)
    if (n % i == 0) {
        c++;
        break;
    }
I suggest creating a cache of all prime numbers up to, say, 100000, and rewriting the loop like this:
for (long i = 0; i < primeCacheSize && primeCache[i] < n; i++)
    if (n % primeCache[i] == 0) {
        return 0;  // a cached prime divides n, so n is not prime
    }
for (long i = 100001; i <= u; i++)
    if (n % i == 0) {
        return 0;
    }
return 1;
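The answer doesn't show how primeCache would be built; one common way is a sieve of Eratosthenes (a sketch, reusing the answer's primeCache and primeCacheSize names; buildPrimeCache is a hypothetical helper):

#include <stdlib.h>

#define CACHE_LIMIT 100000

long primeCache[CACHE_LIMIT];   // generously sized; ~9592 primes fit below 100000
long primeCacheSize = 0;

// Fill primeCache with all primes up to CACHE_LIMIT via a sieve.
void buildPrimeCache(void) {
    char *composite = calloc(CACHE_LIMIT + 1, 1);
    for (long p = 2; p * p <= CACHE_LIMIT; p++)
        if (!composite[p])
            for (long q = p * p; q <= CACHE_LIMIT; q += p)
                composite[q] = 1;               // mark every multiple of p
    for (long p = 2; p <= CACHE_LIMIT; p++)
        if (!composite[p])
            primeCache[primeCacheSize++] = p;   // keep the survivors
    free(composite);
}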
This is a bit strange:
//If prime return
if (isPrime(n)) {
    [factors addObject:@(n)];
    return;
}
for (long i = n/2; i > 1; i--) {
    if (n % i) continue; //Check if indivisible
How will this code work on 208907 * 208907 * 208907 (9117147423118643)? Your isPrime function will determine that n is divisible by 208907 and return false, and you'll then start iterating from 9117147423118643 / 2 downwards. That is a practically infinite (and useless) loop.
The solution I suggest: make your isPrime function return the number itself if n is prime, or a divisor of the number if it isn't:
for (long i = 0; i < primeCacheSize; i++)
    if (n % primeCache[i] == 0) {
        return primeCache[i];
    }
for (long i = 100001; i <= u; i++)
    if (n % i == 0) {
        return i;
    }
return n;
And rewrite factorize like this:

void factorize(long n) {
    while (n > 1) {
        long divider = isPrime(n);  // a divisor of n, or n itself if n is prime
        [factors addObject:@(divider)];
        n /= divider;
    }
}
- I'd limit the prime cache loops with u as well to avoid unnecessary checks: i < primeCacheSize && i <= u. – DarkDust, Sep 5, 2014

- Also, with the suggestion of having isPrime return the divider: the name isPrime is now misleading, as I'd expect a boolean return value, so it should be changed to something like firstPrimeOf or whatever. Also, I'd recommend using an NSMutableSet instead of an NSMutableArray for factors. – DarkDust, Sep 5, 2014

- A note on your prime cache: if you have 10000 primes cached, and the number is not in the cache and is greater than the upper limit of the cache, you should not start the next loop. – recursion.ninja, Sep 5, 2014

- If the number is not in the cache, you should start the second loop not at i=cacheSize+1 but at i=cache[cacheSize]+2 (+2 so it's an odd number), and the loop should increment by i+=2. This ensures that you are only checking odd numbers. – recursion.ninja, Sep 5, 2014
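Reading these comments together (and taking the bound in the first one as primeCache[i] <= u, with primeCacheSize as the element count, so the last cached prime is primeCache[primeCacheSize - 1]), a corrected helper might look like this sketch; smallestDivisor is a hypothetical name, not from the answer:

// Smallest divisor of n up to u, or n itself if none is found (n is prime).
long smallestDivisor(long n, long u) {
    for (long i = 0; i < primeCacheSize && primeCache[i] <= u; i++)
        if (n % primeCache[i] == 0)
            return primeCache[i];
    // Resume just past the largest cached prime, stepping over even numbers.
    for (long i = primeCache[primeCacheSize - 1] + 2; i <= u; i += 2)
        if (n % i == 0)
            return i;
    return n;
}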
Given that this question is marked with the c tag, and the user quite appropriately points out that the code is 99% C, I decided that it might be worthwhile to demonstrate how simple the C solution to this problem is.

There are only two aspects of Objective-C the user uses:

- NSMutableArray. This is convenient, and it handles the resizing of the array for us. However, it's important to note that there's nothing magical about how an NSMutableArray resizes; Objective-C mutable arrays are resized in much the same way as a C array.
- NSNumber. We don't actually see NSNumber anywhere directly, but that's what the @() syntax is doing: creating NSNumber objects. And why? Because NSArray cannot store primitive data types; it must store pointers.

Unless the actual intent is to use this array in some other Objective-C code where the NSNumber objects will actually be easier to deal with, there's not a particularly good reason to use Objective-C here, except perhaps being uncomfortable working with C-style arrays.
So, here's my solution to this problem in C.
As a starting point, I used Martin R's answer, which is also very similar to the algorithm I used for a slightly different problem here.
Please note, the following code includes the code from Martin's answer so that the two solutions can be compared side by side in case he ever edits or removes his answer.
Martin's very good answer:
NSMutableArray * primeFactorization(long n) {
    NSMutableArray *factors = [NSMutableArray array];

    // Divide by 2:
    while (n > 1 && n % 2 == 0) {
        [factors addObject:@(2)];
        n /= 2;
    }

    // Divide by 3, 5, 7, ...
    //
    // i is a possible *smallest* factor of the (remaining) number n.
    // If i * i > n then n is either 1 or a prime number.
    for (long i = 3; i * i <= n; i += 2) {
        while (n > 1 && n % i == 0) {
            [factors addObject:@(i)];
            n /= i;
        }
    }

    if (n > 1) {
        // Append last prime factor:
        [factors addObject:@(n)];
    }
    return factors;
}
My approach in straight C:
long * cPrimeFactorization(long n, long *factorCount) {
    long currentSize = 2;
    long currentIndex = 0;
    long *factors = malloc(sizeof(long) * currentSize);

    while (n > 1 && n % 2 == 0) {
        factors[currentIndex++] = 2;
        if (currentIndex >= currentSize) {
            currentSize *= 2;
            long *reallocFactors = realloc(factors, currentSize * sizeof(long));
            if (reallocFactors) {
                factors = reallocFactors;
            } else {
                printf("realloc failed");
                free(factors);
                return NULL;
            }
        }
        n /= 2;
    }

    for (long i = 3; i * i <= n; i += 2) {
        while (n > 1 && n % i == 0) {
            factors[currentIndex++] = i;
            if (currentIndex >= currentSize) {
                currentSize *= 2;
                long *reallocFactors = realloc(factors, currentSize * sizeof(long));
                if (reallocFactors) {
                    factors = reallocFactors;
                } else {
                    printf("realloc failed");
                    free(factors);
                    return NULL;
                }
            }
            n /= i;
        }
    }

    if (n > 1) {
        factors[currentIndex++] = n;
    }
    *factorCount = currentIndex;
    return factors;
}
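One readability note on the above, in the spirit of the "room for improvement" caveat later in this answer: the grow-and-append logic appears twice and could be pulled into a helper. A sketch with a hypothetical name (appendFactor), which also leaves error reporting to the caller as a comment below suggests:

#include <stdbool.h>
#include <stdlib.h>

// Hypothetical helper: append value to *factors, doubling the buffer when
// full. On allocation failure it frees the buffer and returns false, leaving
// error reporting to the caller.
static bool appendFactor(long **factors, long *size, long *count, long value) {
    if (*count == *size) {
        long newSize = *size * 2;
        long *grown = realloc(*factors, newSize * sizeof(long));
        if (!grown) {
            free(*factors);
            return false;
        }
        *factors = grown;
        *size = newSize;
    }
    (*factors)[(*count)++] = value;
    return true;
}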
Running them side by side:
int main(int argc, const char * argv[]) {
    long n = 1234567890;

    NSDate *startTime = [NSDate date];
    NSArray *oFactors = primeFactorization(n);
    NSLog(@"time (Objective-C): %f", -[startTime timeIntervalSinceNow]);

    startTime = [NSDate date];
    long factorCount;
    long *cFactors = cPrimeFactorization(n, &factorCount);
    NSLog(@"time (C): %f", -[startTime timeIntervalSinceNow]);

    NSLog(@"%@", oFactors);
    for (long i = 0; i < factorCount; ++i) {
        printf("%li\n", cFactors[i]);
    }
    return 0;
}
For the sake of benchmarking consistency, I just used the Objective-C timing approach for both versions (because Martin already had it and I didn't feel like looking up a C timing approach). I tried a handful of different numbers and ran them all multiple times. The C approach runs faster in every case.
More than speed, the C approach actually takes less total memory than the Objective-C approach. Not only does the Objective-C array have more overhead than the C array (each index is the same size, 8 bytes), but the Objective-C array is an array of pointers to objects. So the Objective-C array is already slightly bigger than the C array, plus there's several magnitudes more memory needed to hold all of the NSNumber objects out on the heap as well.
And the fact that we have to spend so much time creating these NSNumber objects out on the heap is part of the reason the Objective-C solution is slower. It's also slower because passing messages to objects (the addObject: we call several times) is very slow compared to directly inserting a value at a memory location (which is what we do with the C array).
Please note, I'm definitely an Objective-C programmer, and not a C programmer in the slightest. My C code almost certainly has room for improvement, I'd imagine... but as Objective-C programmers, we must always keep in mind that pure C should always be an option.
As a note, this speed (and memory) difference becomes particularly noticeable with very large powers of two. For example, with 2^49, which is 562,949,953,421,312, the Objective-C solution takes almost 3 times as long on my computer.
- With regards to printf("realloc failed");, the caller should handle reporting errors. – Corbin, Sep 5, 2014

- Yeah, probably. I'm not familiar with the C standards or what should be done in a case like this... I just know it's possible for realloc to fail... – nhgrif, Sep 5, 2014

- @nhgrif: My last comment was wrong and I have deleted it. But it would be interesting whether, after you added the necessary allocation to the while (n > 1 && n % 2 == 0) loop, the difference is still particularly noticeable for powers of two. – Martin R, Sep 7, 2014
There are potential math problems with double sqrt(double):
bool isPrime(long n) {
    ...
    long u = sqrt(n);
    ...
    for (long i = 3; i <= u; i += 2)
- Conversion from long to double may lose precision. Although a double can typically represent every value of an int54_t, long could be wider, such as 64 bits.

- A good sqrt() will return the exact value for a perfect square, but not all sqrt() implementations behave so nicely. Code should be prepared for something like sqrt(121) --> 10.99999....

Since the for() loop has no trouble iterating a little past the ideal bound, I recommend a simple solution to both issues above: increase the upper limit by 1.

long u = sqrt(n);
u++;
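An alternative, not from this answer, is to sidestep floating point entirely by comparing i against n / i, which is equivalent to i * i <= n without any conversion or overflow risk (a sketch; isPrimeNoSqrt is a hypothetical name):

#include <stdbool.h>

// Trial division with an integer-only bound: i <= n / i avoids both
// sqrt() precision issues and i * i overflow.
bool isPrimeNoSqrt(long n) {
    if (n < 2) return false;
    if (n % 2 == 0) return n == 2;
    for (long i = 3; i <= n / i; i += 2)
        if (n % i == 0)
            return false;
    return true;
}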