I have the following code in many places in a large application:
if (datastruct.data_ok &&
cur_time > datastruct.start_time && cur_time < datastruct.end_time)
{
do_something();
}
Now, I wanted to eliminate repeated code by defining a function time_ok(...) which would simplify the repeated code to:
if (time_ok(&datastruct, cur_time))
{
do_something();
}
In my opinion, this is a well-made refactoring and should be therefore done. If the definition of the condition changes in the future, code maintenance becomes easier as only one location needs to be changed.
However, several team members noticed that the function can be abused by giving it a NULL argument, which crashes the program. They also complained that the return value is a boolean and not an error code, and all functions should according to them return an error code.
So, he refactoring would according to them be done in the following way:
bool time_ok_condition;
int err;
err = time_ok(&datastruct, &time_ok_condition, cur_time);
if (err != 0)
{
fprintf(stderr, "Error %d at file %s line %d", err, __FILE__, __LINE__);
exit(1);
}
if (time_ok_condition)
{
do_something();
}
However, in my opinion this results in excessively long code and if I really have to do all this, I'd much rather have the original non-refactored code.
I invented a way around this rule by using a macro:
if (TIME_OK(datastruct, cur_time))
{
do_something();
}
...which can never crash the program on NULL arguments as the macro takes the structure name and not a pointer to the structure as the argument. However, I don't particularly like this approach of using a macro because the macro evaluates its arguments multiple times.
I noticed that although C was developed for Unix and Unix system calls traditionally return error codes, the standard C library functions such as strstr, strchr, strspn do not return an error code and therefore handle NULL arguments by crashing the program.
Now, it is an entirely different question whether the program should crash with coredump or print a meaningful error message for the theoretical error that never happens in this program (see the question How should "this can never happen" style of errors in C be handled? on Stack Overflow).
However, this question is about whether it should be allowed to return something else than an error code. A positive answer would in practice necessitate the use of "crash with coredump" strategy in this very specific case as the range of a boolean return cannot represent errors, but does not necessarily mean that a "crash with coredump" strategy is optimal in all cases.
Who is correct? Should C functions always return an error code like Unix system calls, or should it be allowed to write functions that return something else if there is justified reason for it such as code simplicity?
2 Answers 2
The requirement for an error code arises from the fact the new function now can have 3 outcomes instead of 2: the time struct is ok, it is not ok, or the passed argument is NULL. The contract you have in mind for your function is "do not pass a NULL value in, otherwise the program will crash or show some undefined behaviour", whilst your coworkers believe programs become safer without such functions.
According to this former PSE post, and the topmost answer, the viewpoint of your coworkers is an extremely pedantic one, and in the context you have in mind the function will be called with something extremly unlikely to be NULL. Thus it will be most probably safe (maybe safer!) not to handle NULL by the return of an error code (as long as the program will crash or exit immediately with an error message, so it does not hide any severe error).
Of course, when your "time_ok" function is going to become part of a library, which might be reused in lots of different contexts (and maybe in a context where calling exit
is not a good option), then it is indeed better to return an error code for the NULL argument to let the caller decide how to handle that problem.
-
The time_ok function is part of a 7000-line long component of the system which is a runnable program and not a general-purpose library and there are no plans to make it a general-purpose library. This conflict was eventually solved by printing a clear error message in the time_ok function if argument is NULL and exiting using exit(1), as the coworkers don't like core dumps. Of course, the error condition currently never happens in the software.juhist– juhist03/17/2015 13:11:32Commented Mar 17, 2015 at 13:11
All functions should return an error code?
I can't see a good reason change a function like
int Add( int lhs, int rhs ) {...}
to
// Out parameter: result
error_t Add( int lhs, int rhs, int * result ) {...}
especially as now you are having to deal with the possibility that result is null.
In your specific example, I would use something similar to the first form without the pointer, leaving to the compiler the decision between inlining to save the copying vs the increase in code size (for something this small I expect it to always pick inlining, but it is in a better position to judge)
bool time_ok( data_t datastruct, time_t cur_time ) {
return datastruct.data_ok &&
( cur_time > datastruct.start_time ) &&
( cur_time < datastruct.end_time )
}
where data_t and time_t are the types of datastruct and cur_time
This relies on the fact that time_ok is a Total Function (i.e. not a Partial Function), equivalently that for all data_t
and for all time_t
there is a defined bool
that time_ok can return. This is easiest to prove in cases such as this, where the only thing occurring is values are compared and combined.
assert
to catch errors that logically shouldn't happen (i.e. if an assert failed it's because the programmer messed up)>=
). The upper-bound should still be exclusive, though.