Factorials, loops, break and professional code

Question 1

The program below correctly calculates factorials, until the limit for an unsigned long long integer is exceeded.

My question has to do with loops and exiting loops. The program uses one infinite while loop and the break statement to exit 2 while loops.

Do working, professional programmers handle loops this way? If not, how would they write and exit these loops?

/* factorial.c
 program to calculate factorials until the limit for an unsigned long long
 integer is exceeded
 20! = 2,432,902,008,176,640,000
 limit = 18,446,744,073,709,551,615
 21! = 51,090,942,171,709,440,000
*/
#include <stdio.h>
#include <limits.h>
unsigned long long int factorial (int i)
{
 // function returns 0 when limit exceeded
 int j;
 unsigned long long int result;
 if ( i == 0 )
 result = 1;
 else {
 result = i; 
 j = i - 1;
 while ( j >= 1 ) {
 if ( result <= ULLONG_MAX / j ) 
 result *= j;
 else {
 result = 0;
 break;
 }
 --j; 
 }
 } 
 return result;
}
int main (void)
{
 int i;
 unsigned long long int number;
 printf ("\nTABLE OF FACTORIALS\n\n");
 printf (" n n!\n");
 printf ("-- ------------------------\n");
 i = 0;
 while ( 1 ) {
 number = factorial (i);
 if ( number == 0 )
 break; 
 printf ("%2i %24llu\n", i, number);
 ++i;
 }
 return 0;
}

Question 2

I'd have to agree with Evert in my workplace its perfectly acceptable to break out of loops on the assurance that you know the break will take place.

Question 3

While your code is quite nice, I would argue that professional coders will not do any more work than necessary. Since you know the domain you have define factorial in, why not precompiled the first 20 factorials and return the appropriate one? It may not scale, but since you have the restriction of no factorial bigger than 20, it is a useful idea to consider

Question 4

I guess it's as much a matter of taste as it is of existing coding guidelines that you have to abide by.

I'd be fine with early returns. Thus, I wouldn't have an else clause, just if (i == 0) { return 1; } followed by what's inside your else clause.
Ditto for the case where result > ULLONG_MAX / j: just an early return.

I guess early returns are easier to grok in short functions than in longer functions; and on the other hand, there is certainly an argument to be made to have just a single return point*.

For the second while loop in main, I would perhaps replace it with a for loop, as I feel that states the intention a bit better. Arguably, the condition may need to be constant there, or perhaps the maximum allowed value for i.
And you could shuffle a few lines around to have the check for number != 0 inside the loop (that does put an extra call to factorial just outside such a for loop).

All in all, it largely feels like picking nits. Pick a coding style you're happy with, possibly abiding by some good standards that people have drafted, and stick with it (until people give you a good explanation why you should do something else).

I would slightly worry about the implicit int to unsigned long long cast; I haven't bothered looking up casting rules, but you are casting a potentially negative value to an unsigned variable. Are there any compiler warnings?

*) One case I could imagine, is when you want to do some clean-up before exiting the function: free allocated memory, close files etc. By always exiting at the end of the function, were this clean-up happens, you don't repeat the clean-up code elsewhere. This also involves one of the very few cases where goto serves a good purpose in C (well, not everyone will agree on that): it becomes the C equivalent of a defer or finally statement. Of course, here it's not needed.

Question 5

My compiler is gcc and I have all All warnings enabled -- it did not issue any warnings. It hadn't occurred to me that setting result = i was mixing data types.

Question 6

Sadly, for a reason I don't know (historical?), -Wall in gcc doesn't mean all. If you can use clang instead, you can use -Weverything, and you will get warnings about those implicit conversions. (Searching with these flags and compilers as keywords will probably lead you to the bunch of flags to add to gcc to get all warnings.)

Question 7

The other reviews have covered most items, so I'll just add a few.

Use constant string concatenation

The current code includes this:

printf("\nTABLE OF FACTORIALS\n\n");
printf(" n n!\n");
printf("-- ------------------------\n");

But we don't really need to make 3 separate calls to printf here. Instead, let the compiler automatically concatenate string literals together so we can write this:

printf("\nTABLE OF FACTORIALS\n\n"
 " n n!\n"
 "-- ------------------------\n");

I'd also suggest that puts would be a more appropriate call than printf since all we're printing is a string literal.

Be careful with signed vs. unsigned

If you don't intend for your code to handle negative numbers, I'd suggest that you indicate this clearly by changing the function signature to this:

unsigned long long factorial(unsigned i)

Minimize the scope of variable

In modern C, we don't tend to declare all variables at the top of each function. Instead, they're declared as they're needed and in as small a scope as possible, as in the following suggestion:

Prefer `for` to `while` loops

In the main code, the while(1) suggests that the loop continues forever but it really doesn't. In fact, we increment i each time and exit when the factorial code detects an overflow. Why don't we just write it that way instead of making the reader work so hard?

int main(void)
{
 printf("\nTABLE OF FACTORIALS\n\n"
 " n n!\n"
 "-- ------------------------\n");
 unsigned long long number;
 // factorial() returns 0 on overflow
 for (unsigned i; (number = factorial(i)); ++i) {
 printf ("%2u %24llu\n", i, number);
 }
}

Simplify your code

The current factorial code has the same problem with its loop, which is that it's not made clear when the loop ends. Also, the code for detecting overflow is much more complex than it needs to be. Instead, we can simply note that when the multiplication results in overflow, the resulting product is less than the previous value. Using that, I'd write the code like this:

// return the factorial of i or
// 0 if multiplication overflows
unsigned long long factorial(unsigned i)
{
 if (i == 0) {
 return 1;
 }
 unsigned long long res;
 for (res = i--; i; --i) {
 unsigned long long prev = res;
 res *= i;
 if (res < prev) { // oveflow?
 return 0;
 }
 }
 return res;
}

Omit `return 0`

When a C or C++ program reaches the end of main the compiler will automatically generate code to return 0, so there is no need to put return 0; explicitly at the end of main.

Note: when I make this suggestion, it's almost invariably followed by one of two kinds of comments: "I didn't know that." or "That's bad advice!" My rationale is that it's safe and useful to rely on compiler behavior explicitly supported by the standard. For C, since C99; see ISO/IEC 9899:1999 section 5.1.2.2.3:

[...] a return from the initial call to the main function is equivalent to calling the exit function with the value returned by the main function as its argument; reaching the } that terminates the main function returns a value of 0.

For C++, since the first standard in 1998; see ISO/IEC 14882:1998 section 3.6.1:

If control reaches the end of main without encountering a return statement, the effect is that of executing return 0;

All versions of both standards since then (C99 and C++98) have maintained the same idea. We rely on automatically generated member functions in C++, and few people write explicit return; statements at the end of a void function. Reasons against omitting seem to boil down to "it looks weird". If, like me, you're curious about the rationale for the change to the C standard read this question. Also note that in the early 1990s this was considered "sloppy practice" because it was undefined behavior (although widely supported) at the time.

So I advocate omitting it; others disagree (often vehemently!) In any case, if you encounter code that omits it, you'll know that it's explicitly supported by the standard and you'll know what it means.

Question 8

Your K&R bracing is flawlessly consistent, which is awesome =)

This is less so:

 if ( number == 0 )
 break;

You're consistently omitting the braces everywhere you can. While consistency is excellent, omitting optional braces has caused real, serious problems - in a nutshell:

if ((err = SSLHashSHA1.update(&hashCtx, &signedParams)) != 0)
 goto fail;
 goto fail;

Don't goto fail - enclose all scopes with proper braces:

 if ( number == 0 ) {
 break;
 }

The whitespace inside the parentheses is unnecessary and feels distracting to me, and my own personal preference would be to remove the space between a function call and its argument list:

if ( i == 0 ) // if (i == 0) {
while ( 1 ) { // while (1) {
number = factorial (i); // number = factorial(i);
printf ("%2i %24llu\n", i, number); // printf("%2i %24llu\n", i, number);

But as others have said this is nit-picking: consistency is king. Writing professional code is writing code that nicely blends into the existing code base, because it doesn't use a different bracing or spacing style. And it's consistently consistent.

Variables, unless they're trivial loop counters, should have a meaningful name. It is a well-established convention to use i and j in for loops, for example. But i in this case is more than a trivial loop counter: it's an argument to factorial.

In mathematics, the factorial of a non-negative integer \$n\,ドル denoted by \$n!\,ドル is the product of all positive integers less than or equal to \$n\$.

_{https://en.wikipedia.org/wiki/Factorial}

If you can't quite put a name on what that number \$n\$ would be called (I can't), IMO n would make a perfectly fine identifier for it. If there's a domain-specific name for it, it's probably a good idea to use that name instead. For example a divide function would preferably have dividend and divisor parameters, instead of x and y.

I like that there's a comment that immediately states that we're returning 0 when the return type is overflowed.

The problem is with returning 0 when the type is overflowed: it gives two different meanings to what the function returns - sometimes it's your result, and sometimes it's an error code. Yet using 0 for an error code doesn't feel right: typically a function that runs successfully to completion returns that value - and you already know that:

int main (void)
{
 //...
 return 0;
}

This Stack Overflow Q&A describes some good ways to deal with errors and return values.

If your function signature looked like this:

FactorialError factorial (int n, unsigned long long int *result)

It returns one of these:

typedef enum {NONE, OVERFLOW} FactorialError;

If all your functions consistently take a pointer to their result(s), and return an error value, then the day you have a function that needs to return different error codes, you don't need to start giving yet another meaning to what you're returning - and when you write code, the day you need to make a change to it is never really far - you could decide to make the function safe to call outside of main by handling a negative n parameter value and returning a different error code:

typedef enum {NONE, OVERFLOW, ILLEGAL_NEGATIVE} FactorialError;

(or in this case you could simply change the signature to take an unsigned int n.. that was just an example)

The beauty of this is that you don't need the comment anymore, and it becomes crystal-clear why the calling code needs that if block.. which, since we're returning an enum, would be better off as a switch block:

int DONE = 1;
int completed = 0;
while (completed == 0) {
 FactorialError result;
 unsigned long long int number;
 result = factorial(i, &number);
 switch (result) {
 case FactorialError.OVERFLOW:
 printf ("%2i! overflows the largest representable value.", i);
 completed = DONE;
 break;
 case FactorialError.ILLEGAL_NEGATIVE:
 printf ("%2i! cannot be computed.", i);
 completed = DONE;
 break;
 default:
 // assert result is FactorialError.NONE
 printf ("%2i %24llu\n", i, number);
 }
 ++i;
}

As for the actual implementation, as already mentioned returning early is never a bad idea; it removes the noise and lets you focus on the "clean" execution path:

// 0! is always 1
if (i == 0) {
 result = 1;
 return FactorialError.NONE;
}
// -1! cannot be computed
if (i < 0) {
 return FactorialError.ILLEGAL_NEGATIVE;
}
// 12! is the largest factorial that fits a 32-bit integer
if (i > 12) {
 return FactorialError.OVERFLOW;
}

Notice the error paths leave the pointer unassigned; reading that pointer in these error paths would be undefined behavior, which should be expected from the caller.

In this case we know that \$n!\$ needs to fit an int - assuming that's a 32-bit integer (I'm not familiar with c), the largest value of n we can work with is known to be 12 - so we can know we're going to overflow before we even start looping, leaving the body of the loop with a higher signal-to-noise ratio.

Arguably since the number of iterations is known from the start, a for loop would have been more appropriate than a while(1) infinite loop.

Question 9

Having the factorial function accept a pointer to the result as an argument and then use the return value as a (separate) error indicator is certainly better than a kind of combined value/error code. Thanks for the considered answered -- particularly the Apple fail example when using braces.

Question 10

Here's a rewrite of the factorial program incorporating several suggestions -- most notably to separate the value returned and any error code into two distinct variables. Also not to mix signed and unsigned data types. I tried to drop the extra printf statements in the headings and my compiler complained.

One last question (about pointers): do I need to initialize number to 0 in main before calling the factorial function?

/* factorial.c
 program to calculate factorials until the limit for an unsigned long long
 integer is exceeded
 20! = 2,432,902,008,176,640,000
 limit = 18,446,744,073,709,551,615
 21! = 51,090,942,171,709,440,000
*/
#include <stdio.h>
#include <stdbool.h>
#include <limits.h>
bool factorial (unsigned int n, unsigned long long int *result)
{
 if ( n == 0 )
 *result = 1;
 else {
 *result = n;
 for ( unsigned int i = n - 1; i >= 1; --i ) {
 if ( *result <= ULLONG_MAX / i )
 *result *= i;
 else 
 return false; // result wraps around, exit
 }
 } 
 return true;
}
int main (void)
{
 unsigned long long int number = 0;
 printf ("\nTABLE OF FACTORIALS\n\n");
 printf (" n n!\n");
 printf ("-- ------------------------\n");
 unsigned int i = 0;
 while ( factorial (i, &number) ) {
 printf ("%2i %24llu\n", i, number);
 ++i;
 }
 return 0;
}

Question 11

The pointer being an "out" parameter, IMO it's perfectly fine (perhaps possibly even recommended) to not initialize it before the call - then again I'm a C# guy so.. perhaps post the revised code as a follow-up question? See codereview.stackexchange.com/help/someone-answers for more details. I think some elements in current answers would be repeated if this code were to be reviewed though. All answers favor a for loop, for example.

Question 12

The reason for the while loop, instead of a for loop, is that I was trying to let the program determine the limit -- even though I know the answer in advance (20!). I've thought (granted I am not a professional programmer) that for loops are best suited to instances where a single variable controls the loop and the end point is known.So I'm essentially considering that I don't know the loop's end point at the start of the program as it's a programming exercise.

Question 13

This code incurs the "cost" of a division every iteration of the for loop which make it less efficient than it could be. Also, the for loop end condition doesn't necessarily need to be a loop counter as I've shown in my answer.

user86624user86624 · Accepted Answer · 2016-11-12 03:44:32Z

I guess it's as much a matter of taste as it is of existing coding guidelines that you have to abide by.

I'd be fine with early returns. Thus, I wouldn't have an else clause, just if (i == 0) { return 1; } followed by what's inside your else clause.
Ditto for the case where result > ULLONG_MAX / j: just an early return.

I guess early returns are easier to grok in short functions than in longer functions; and on the other hand, there is certainly an argument to be made to have just a single return point*.

For the second while loop in main, I would perhaps replace it with a for loop, as I feel that states the intention a bit better. Arguably, the condition may need to be constant there, or perhaps the maximum allowed value for i.
And you could shuffle a few lines around to have the check for number != 0 inside the loop (that does put an extra call to factorial just outside such a for loop).

All in all, it largely feels like picking nits. Pick a coding style you're happy with, possibly abiding by some good standards that people have drafted, and stick with it (until people give you a good explanation why you should do something else).

I would slightly worry about the implicit int to unsigned long long cast; I haven't bothered looking up casting rules, but you are casting a potentially negative value to an unsigned variable. Are there any compiler warnings?

*) One case I could imagine, is when you want to do some clean-up before exiting the function: free allocated memory, close files etc. By always exiting at the end of the function, were this clean-up happens, you don't repeat the clean-up code elsewhere. This also involves one of the very few cases where goto serves a good purpose in C (well, not everyone will agree on that): it becomes the C equivalent of a defer or finally statement. Of course, here it's not needed.

My compiler is gcc and I have all All warnings enabled -- it did not issue any warnings. It hadn't occurred to me that setting result = i was mixing data types.
Sadly, for a reason I don't know (historical?), -Wall in gcc doesn't mean all. If you can use clang instead, you can use -Weverything, and you will get warnings about those implicit conversions. (Searching with these flags and compilers as keywords will probably lead you to the bunch of flags to add to gcc to get all warnings.)

Stack Exchange Network

Factorials, loops, break and professional code

4 Answers 4

Use constant string concatenation

Be careful with signed vs. unsigned

Minimize the scope of variable

Prefer `for` to `while` loops

Simplify your code

Omit `return 0`

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

Factorials, loops, break and professional code

4 Answers 4

Use constant string concatenation

Be careful with signed vs. unsigned

Minimize the scope of variable

Prefer for to while loops

Simplify your code

Omit return 0

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions

Prefer `for` to `while` loops

Omit `return 0`