The program below correctly calculates factorials, until the limit for an unsigned long long
integer is exceeded.
My question has to do with loops and exiting loops. The program uses one infinite while loop and the break statement to exit 2 while loops.
Do working, professional programmers handle loops this way? If not, how would they write and exit these loops?
/* factorial.c
program to calculate factorials until the limit for an unsigned long long
integer is exceeded
20! = 2,432,902,008,176,640,000
limit = 18,446,744,073,709,551,615
21! = 51,090,942,171,709,440,000
*/
#include <stdio.h>
#include <limits.h>
unsigned long long int factorial (int i)
{
// function returns 0 when limit exceeded
int j;
unsigned long long int result;
if ( i == 0 )
result = 1;
else {
result = i;
j = i - 1;
while ( j >= 1 ) {
if ( result <= ULLONG_MAX / j )
result *= j;
else {
result = 0;
break;
}
--j;
}
}
return result;
}
int main (void)
{
int i;
unsigned long long int number;
printf ("\nTABLE OF FACTORIALS\n\n");
printf (" n n!\n");
printf ("-- ------------------------\n");
i = 0;
while ( 1 ) {
number = factorial (i);
if ( number == 0 )
break;
printf ("%2i %24llu\n", i, number);
++i;
}
return 0;
}
-
\$\begingroup\$ I'd have to agree with Evert in my workplace its perfectly acceptable to break out of loops on the assurance that you know the break will take place. \$\endgroup\$user122352– user1223522016年11月12日 18:41:19 +00:00Commented Nov 12, 2016 at 18:41
-
\$\begingroup\$ While your code is quite nice, I would argue that professional coders will not do any more work than necessary. Since you know the domain you have define factorial in, why not precompiled the first 20 factorials and return the appropriate one? It may not scale, but since you have the restriction of no factorial bigger than 20, it is a useful idea to consider \$\endgroup\$spyr03– spyr032016年11月12日 19:35:31 +00:00Commented Nov 12, 2016 at 19:35
4 Answers 4
I guess it's as much a matter of taste as it is of existing coding guidelines that you have to abide by.
I'd be fine with early returns. Thus, I wouldn't have an else
clause, just if (i == 0) { return 1; }
followed by what's inside your else
clause.
Ditto for the case where result > ULLONG_MAX / j
: just an early return.
I guess early returns are easier to grok in short functions than in longer functions; and on the other hand, there is certainly an argument to be made to have just a single return point*.
For the second while
loop in main
, I would perhaps replace it with a for
loop, as I feel that states the intention a bit better.
Arguably, the condition may need to be constant there, or perhaps the maximum allowed value for i
.
And you could shuffle a few lines around to have the check for number != 0
inside the loop (that does put an extra call to factorial
just outside such a for loop).
All in all, it largely feels like picking nits. Pick a coding style you're happy with, possibly abiding by some good standards that people have drafted, and stick with it (until people give you a good explanation why you should do something else).
I would slightly worry about the implicit int
to unsigned long long
cast; I haven't bothered looking up casting rules, but you are casting a potentially negative value to an unsigned variable. Are there any compiler warnings?
*) One case I could imagine, is when you want to do some clean-up before exiting the function: free allocated memory, close files etc. By always exiting at the end of the function, were this clean-up happens, you don't repeat the clean-up code elsewhere. This also involves one of the very few cases where goto
serves a good purpose in C (well, not everyone will agree on that): it becomes the C equivalent of a defer
or finally
statement. Of course, here it's not needed.
-
\$\begingroup\$ My compiler is gcc and I have all All warnings enabled -- it did not issue any warnings. It hadn't occurred to me that setting result = i was mixing data types. \$\endgroup\$Verbatim– Verbatim2016年11月12日 05:03:17 +00:00Commented Nov 12, 2016 at 5:03
-
\$\begingroup\$ Sadly, for a reason I don't know (historical?),
-Wall
in gcc doesn't mean all. If you can useclang
instead, you can use-Weverything
, and you will get warnings about those implicit conversions. (Searching with these flags and compilers as keywords will probably lead you to the bunch of flags to add to gcc to get all warnings.) \$\endgroup\$user86624– user866242016年11月12日 07:38:41 +00:00Commented Nov 12, 2016 at 7:38
The other reviews have covered most items, so I'll just add a few.
Use constant string concatenation
The current code includes this:
printf("\nTABLE OF FACTORIALS\n\n");
printf(" n n!\n");
printf("-- ------------------------\n");
But we don't really need to make 3 separate calls to printf
here. Instead, let the compiler automatically concatenate string literals together so we can write this:
printf("\nTABLE OF FACTORIALS\n\n"
" n n!\n"
"-- ------------------------\n");
I'd also suggest that puts
would be a more appropriate call than printf
since all we're printing is a string literal.
Be careful with signed vs. unsigned
If you don't intend for your code to handle negative numbers, I'd suggest that you indicate this clearly by changing the function signature to this:
unsigned long long factorial(unsigned i)
Minimize the scope of variable
In modern C, we don't tend to declare all variables at the top of each function. Instead, they're declared as they're needed and in as small a scope as possible, as in the following suggestion:
Prefer for
to while
loops
In the main
code, the while(1)
suggests that the loop continues forever but it really doesn't. In fact, we increment i
each time and exit when the factorial code detects an overflow. Why don't we just write it that way instead of making the reader work so hard?
int main(void)
{
printf("\nTABLE OF FACTORIALS\n\n"
" n n!\n"
"-- ------------------------\n");
unsigned long long number;
// factorial() returns 0 on overflow
for (unsigned i; (number = factorial(i)); ++i) {
printf ("%2u %24llu\n", i, number);
}
}
Simplify your code
The current factorial
code has the same problem with its loop, which is that it's not made clear when the loop ends. Also, the code for detecting overflow is much more complex than it needs to be. Instead, we can simply note that when the multiplication results in overflow, the resulting product is less than the previous value. Using that, I'd write the code like this:
// return the factorial of i or
// 0 if multiplication overflows
unsigned long long factorial(unsigned i)
{
if (i == 0) {
return 1;
}
unsigned long long res;
for (res = i--; i; --i) {
unsigned long long prev = res;
res *= i;
if (res < prev) { // oveflow?
return 0;
}
}
return res;
}
Omit return 0
When a C or C++ program reaches the end of main
the compiler will automatically generate code to return 0, so there is no need to put return 0;
explicitly at the end of main
.
Note: when I make this suggestion, it's almost invariably followed by one of two kinds of comments: "I didn't know that." or "That's bad advice!" My rationale is that it's safe and useful to rely on compiler behavior explicitly supported by the standard. For C, since C99; see ISO/IEC 9899:1999 section 5.1.2.2.3:
[...] a return from the initial call to the
main
function is equivalent to calling theexit
function with the value returned by themain
function as its argument; reaching the}
that terminates themain
function returns a value of 0.
For C++, since the first standard in 1998; see ISO/IEC 14882:1998 section 3.6.1:
If control reaches the end of main without encountering a return statement, the effect is that of executing return 0;
All versions of both standards since then (C99 and C++98) have maintained the same idea. We rely on automatically generated member functions in C++, and few people write explicit return;
statements at the end of a void
function. Reasons against omitting seem to boil down to "it looks weird". If, like me, you're curious about the rationale for the change to the C standard read this question. Also note that in the early 1990s this was considered "sloppy practice" because it was undefined behavior (although widely supported) at the time.
So I advocate omitting it; others disagree (often vehemently!) In any case, if you encounter code that omits it, you'll know that it's explicitly supported by the standard and you'll know what it means.
Your K&R bracing is flawlessly consistent, which is awesome =)
This is less so:
if ( number == 0 ) break;
You're consistently omitting the braces everywhere you can. While consistency is excellent, omitting optional braces has caused real, serious problems - in a nutshell:
if ((err = SSLHashSHA1.update(&hashCtx, &signedParams)) != 0) goto fail; goto fail;
Don't goto fail
- enclose all scopes with proper braces:
if ( number == 0 ) {
break;
}
The whitespace inside the parentheses is unnecessary and feels distracting to me, and my own personal preference would be to remove the space between a function call and its argument list:
if ( i == 0 ) // if (i == 0) {
while ( 1 ) { // while (1) {
number = factorial (i); // number = factorial(i);
printf ("%2i %24llu\n", i, number); // printf("%2i %24llu\n", i, number);
But as others have said this is nit-picking: consistency is king. Writing professional code is writing code that nicely blends into the existing code base, because it doesn't use a different bracing or spacing style. And it's consistently consistent.
Variables, unless they're trivial loop counters, should have a meaningful name. It is a well-established convention to use i
and j
in for
loops, for example. But i
in this case is more than a trivial loop counter: it's an argument to factorial
.
In mathematics, the factorial of a non-negative integer \$n\,ドル denoted by \$n!\,ドル is the product of all positive integers less than or equal to \$n\$.
If you can't quite put a name on what that number \$n\$ would be called (I can't), IMO n
would make a perfectly fine identifier for it. If there's a domain-specific name for it, it's probably a good idea to use that name instead. For example a divide
function would preferably have dividend
and divisor
parameters, instead of x
and y
.
I like that there's a comment that immediately states that we're returning 0 when the return type is overflowed.
The problem is with returning 0 when the type is overflowed: it gives two different meanings to what the function returns - sometimes it's your result, and sometimes it's an error code. Yet using 0 for an error code doesn't feel right: typically a function that runs successfully to completion returns that value - and you already know that:
int main (void) { //... return 0; }
This Stack Overflow Q&A describes some good ways to deal with errors and return values.
If your function signature looked like this:
FactorialError factorial (int n, unsigned long long int *result)
It returns one of these:
typedef enum {NONE, OVERFLOW} FactorialError;
If all your functions consistently take a pointer to their result(s), and return an error value, then the day you have a function that needs to return different error codes, you don't need to start giving yet another meaning to what you're returning - and when you write code, the day you need to make a change to it is never really far - you could decide to make the function safe to call outside of main
by handling a negative n
parameter value and returning a different error code:
typedef enum {NONE, OVERFLOW, ILLEGAL_NEGATIVE} FactorialError;
(or in this case you could simply change the signature to take an unsigned int n
.. that was just an example)
The beauty of this is that you don't need the comment anymore, and it becomes crystal-clear why the calling code needs that if
block.. which, since we're returning an enum
, would be better off as a switch
block:
int DONE = 1;
int completed = 0;
while (completed == 0) {
FactorialError result;
unsigned long long int number;
result = factorial(i, &number);
switch (result) {
case FactorialError.OVERFLOW:
printf ("%2i! overflows the largest representable value.", i);
completed = DONE;
break;
case FactorialError.ILLEGAL_NEGATIVE:
printf ("%2i! cannot be computed.", i);
completed = DONE;
break;
default:
// assert result is FactorialError.NONE
printf ("%2i %24llu\n", i, number);
}
++i;
}
As for the actual implementation, as already mentioned returning early is never a bad idea; it removes the noise and lets you focus on the "clean" execution path:
// 0! is always 1
if (i == 0) {
result = 1;
return FactorialError.NONE;
}
// -1! cannot be computed
if (i < 0) {
return FactorialError.ILLEGAL_NEGATIVE;
}
// 12! is the largest factorial that fits a 32-bit integer
if (i > 12) {
return FactorialError.OVERFLOW;
}
Notice the error paths leave the pointer unassigned; reading that pointer in these error paths would be undefined behavior, which should be expected from the caller.
In this case we know that \$n!\$ needs to fit an int
- assuming that's a 32-bit integer (I'm not familiar with c), the largest value of n
we can work with is known to be 12
- so we can know we're going to overflow before we even start looping, leaving the body of the loop with a higher signal-to-noise ratio.
Arguably since the number of iterations is known from the start, a for
loop would have been more appropriate than a while(1)
infinite loop.
-
1\$\begingroup\$ Having the factorial function accept a pointer to the result as an argument and then use the return value as a (separate) error indicator is certainly better than a kind of combined value/error code. Thanks for the considered answered -- particularly the Apple fail example when using braces. \$\endgroup\$Verbatim– Verbatim2016年11月12日 17:34:03 +00:00Commented Nov 12, 2016 at 17:34
Here's a rewrite of the factorial program incorporating several suggestions -- most notably to separate the value returned and any error code into two distinct variables. Also not to mix signed and unsigned data types. I tried to drop the extra printf statements in the headings and my compiler complained.
One last question (about pointers): do I need to initialize number to 0 in main before calling the factorial function?
/* factorial.c
program to calculate factorials until the limit for an unsigned long long
integer is exceeded
20! = 2,432,902,008,176,640,000
limit = 18,446,744,073,709,551,615
21! = 51,090,942,171,709,440,000
*/
#include <stdio.h>
#include <stdbool.h>
#include <limits.h>
bool factorial (unsigned int n, unsigned long long int *result)
{
if ( n == 0 )
*result = 1;
else {
*result = n;
for ( unsigned int i = n - 1; i >= 1; --i ) {
if ( *result <= ULLONG_MAX / i )
*result *= i;
else
return false; // result wraps around, exit
}
}
return true;
}
int main (void)
{
unsigned long long int number = 0;
printf ("\nTABLE OF FACTORIALS\n\n");
printf (" n n!\n");
printf ("-- ------------------------\n");
unsigned int i = 0;
while ( factorial (i, &number) ) {
printf ("%2i %24llu\n", i, number);
++i;
}
return 0;
}
-
\$\begingroup\$ The pointer being an "out" parameter, IMO it's perfectly fine (perhaps possibly even recommended) to not initialize it before the call - then again I'm a C# guy so.. perhaps post the revised code as a follow-up question? See codereview.stackexchange.com/help/someone-answers for more details. I think some elements in current answers would be repeated if this code were to be reviewed though. All answers favor a
for
loop, for example. \$\endgroup\$Mathieu Guindon– Mathieu Guindon2016年11月13日 01:35:43 +00:00Commented Nov 13, 2016 at 1:35 -
1\$\begingroup\$ The reason for the while loop, instead of a for loop, is that I was trying to let the program determine the limit -- even though I know the answer in advance (20!). I've thought (granted I am not a professional programmer) that for loops are best suited to instances where a single variable controls the loop and the end point is known.So I'm essentially considering that I don't know the loop's end point at the start of the program as it's a programming exercise. \$\endgroup\$Verbatim– Verbatim2016年11月13日 02:38:08 +00:00Commented Nov 13, 2016 at 2:38
-
\$\begingroup\$ This code incurs the "cost" of a division every iteration of the
for
loop which make it less efficient than it could be. Also, thefor
loop end condition doesn't necessarily need to be a loop counter as I've shown in my answer. \$\endgroup\$Edward– Edward2016年11月13日 13:08:03 +00:00Commented Nov 13, 2016 at 13:08