C Program which compares software version strings

Question 1

I wrote the following C program to compare software version strings. Please provide any general feedback you'd like, but with special focus on error handling decisions, user input validation/sanitation, and robustness. My end-goal is for the program to properly handle user input and not have non-graceful crashes, etc... A couple of known issues/concerns I already have (but haven't decided on a solution for yet) is, where should I be handling errors? Should I be printing to stderr only in the driver (main) program and leave all of that out of the main "library" code? Secondly, I have an enum definitionn for FAIL but I never actually use it because when I wrote the switch clause, I thought it may be a better idea to just default to error rather than select it explicitly. I'm not sure if this is the best choice. This was compiled with GCC 7.5.0 for Linux 64-bit on my end.

The program is only intended to work with inputs with 4 version numbers concatenated with a . for now. I also need to build out a better driver which accepts actual user input eventually of course.

#include <stdio.h>
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <stdbool.h>
enum VERSION_RESULT
{
 FAIL = -1,
 LEFT = 1,
 RIGHT = 2,
 EQUAL = 3
};
typedef struct _version
{
 unsigned int major;
 unsigned int minor;
 unsigned int build;
 unsigned int revision;
} VERSION;
VERSION *version_new(char *str);
int version_parse(VERSION *pVer, char *str);
bool validate_version_string(char *str);
void version_print(VERSION *pVer);
void version_compare_print(VERSION *pL, VERSION *pR);
int main() {
 char ver_string[] = {'5','.','2','5','.','5','.','0','0円'}; // CANNOT use read-only memory due to strtok
 char ver_string2[] = {'5','.','5','.','5','.','1','0円'};
 VERSION *ver = version_new(ver_string);
 VERSION *ver2 = version_new(ver_string2);
 version_compare_print(ver, ver2);
 free(ver);
 free(ver2);
 ver = NULL;
 ver2 = NULL;
 return EXIT_SUCCESS;
}
VERSION *version_new(char *str)
{
 assert(str);
 VERSION *pVer = malloc(sizeof(VERSION));
 assert(pVer);
 int result = version_parse(pVer, str);
 if(result == -1)
 {
 fprintf(stderr, "Failed to parse version from string.\n");
 free(pVer);
 return NULL;
 }
 return pVer;
}
int version_parse(VERSION *pVer, char * str)
{
 assert(pVer && str);
 if(!validate_version_string(str))
 {
 fprintf(stderr, "Invalid version #. Format xxx.xxx.xxx.xxx\n");
 return -1;
 }
 char *token;
 char *delim = ".";
 token = strtok(str, delim);
 pVer->major = (unsigned int) atoi(token);
 token = strtok(NULL, delim);
 pVer->minor = (unsigned int) atoi(token);
 token = strtok(NULL, delim);
 pVer->build = (unsigned int) atoi(token);
 token = strtok(NULL, delim);
 pVer->revision = (unsigned int) atoi(token);
}
bool validate_version_string(char *str)
{
 assert(str);
 unsigned int count = 0;
 char *copy;
 char tmp;
 for(copy = str; *copy != '0円'; ++copy)
 {
 tmp = *copy;
 if(tmp == '.') ++count;
 if((tmp > '9' || tmp < '0') && tmp != '.') return false;
 }
 if(count == 3) return true;
 return false;
}
void version_print(VERSION *pVer)
{
 assert(pVer);
 printf("%u.%u.%u.%u", pVer->major,pVer->minor,pVer->build,pVer->revision);
 return;
}
/*
 * Compare 2 versions. Return -1 if error, 1 if left is largest, 2 if right is largest, 3 if equal
 */
enum VERSION_RESULT version_compare(VERSION *pL, VERSION *pR)
{
 assert(pL && pR);
 //Major checks
 if(pL->major > pR->major) return LEFT;
 if(pR->major > pL->major) return RIGHT;
 //Majors are equal
 if(pL->minor > pR->minor) return LEFT;
 if(pR->minor > pL->minor) return RIGHT;
 //Minors are equal
 if(pL->build > pR->build) return LEFT;
 if(pR->build > pL->build) return RIGHT;
 //Builds are equal
 if(pL->revision > pR->revision) return LEFT;
 if(pR->revision > pL->revision) return RIGHT;
 //Revisions are equal
 return EQUAL;
}
void version_compare_print(VERSION *pL, VERSION *pR)
{
 assert(pL && pR);
 switch(version_compare(pL, pR))
 {
 case LEFT:
 version_print(pL);
 printf(" > ");
 version_print(pR);
 putchar('\n');
 break;
 case RIGHT:
 version_print(pL);
 printf(" < ");
 version_print(pR);
 putchar('\n');
 break;
 case EQUAL:
 version_print(pL);
 printf(" == ");
 version_print(pR);
 putchar('\n');
 break;
 default:
 fprintf(stderr, "An error occurred in the version compare function.\n");
 }
 return;
}

Question 2

Don't strtok + atoi. Use strtol, which (a) doesn't need a mutable input, (b) has much better error handling and reporting, and (c) eliminates the need for independent validation. An example of use would be

char * end;
pVer->major = strtol(str, &end, 0);
if (*end != '.') {
 // major is not a number.
 return suitable_failure;
}
str = end + 1;
pVer->minor = strtol(str, &end, 0);
// etc, and keep going with other pieces of a version.
// After revision is computed, *end must be 0

Instead of spelling out the field names, have an array. Then you could wrap the repeated number parsing in a loop.

I don't see the need for a dynamic allocation. VERSION is simple struct, and c would be happy to return it:

VERSION version_new(char * str)
{
 VERSION v;
 ....
 return v;
}
int main()
{
 ....
 VERSION ver1 = version_new(str);
 // etc

Question 3

@vnp to clarify, you mean don't use the struct fields but just use an array? Re: " Instead of spelling out the field names, have an array. Then you could wrap the repeated number parsing in a loop."

Question 4

As major is unsigned, strtoul() makes more sense then strtol() here.

Question 5

Instead of spelling out every single element of the array,

char ver_string[] = {'5','.','2','5','.','5','.','0','0円'};

just use a string literal. It's the same thing:

char ver_string[] = "5.25.5.0";

Should I be printing to stderr only in the driver (main) program and leave all of that out of the main "library" code?

Yes, exactly. You can return error codes and set errno to be more specific.

Secondly, I have an enum definitionn for FAIL but I never actually use it because when I wrote the switch clause, I thought it may be a better idea to just default to error rather than select it explicitly.

That's fine. Anything other than a valid input is an invalid input.

You don't return a value from version_parse even though it looks like you even inserted a newline where it was supposed to go.

Instead of using LEFT and RIGHT, I would use GREATER and LESS. Those names convey more meaning to me.

Usually, functions that don't take values have an unnamed void parameter. I suggest doing that with main, because it might be undefined behavior if you don't:

int main(void)
{

(I also fixed up the formatting to make it more consistent.)

In validate_version_string, tmp and copy are unnecessary. Just use str and *str, like so:

for(; *str != '0円'; ++str)
{
 if(*str == '.') ++count;
 if((*str > '9' || *str < '0') && *str != '.') return false;
}

You may think that removing tmp is bad for performance, but it's not; in any nonzero optimization level, these are equivalent.

Question 6

The validation of the string seems to leave some gaps. For instance, the version numbers seem to be valid whatever size they have. They could be zero digits large, or contain so many digits that they would not fit into an integer.

I'd try not to print to standard error if this is supposed to be used as a library. In that case you might want to use separate header files, c code and test code of course as well.

I'd keep version_compare_print possibly out of the library; at the very least it should otherwise just populate a string pointer rather than print it to standard out. version_print makes more sense, but it should also populate a string. Then the application can print it to standard output. Probably best to call it version_sprint.

Coding remarks

main

I'm not sure if setting:

ver = NULL;

does any good for a local variable, after you've already freed it.

version_new

I'm not sure if this function should be there at all, just let the caller perform the malloc and call version_parse. They'll have to dealloc anyway.

version_parse

unsigned int count = 0;

Count of what? You should make that explicit in the variable name (delimiter_count seems logical).

char *copy;

This doesn't seem to operate on a copy - just on a copy of the reference; just call it position or something similar.

return -1;

You're hiding the error code. Either you should probably upgrade to a more application specific code or just return the code you're given. Currently it is possible to read the error from standard error stream, but the calling method will be left in the dark.

version_compare

Compare almost always use a negative value if the leftmost parameter is smaller than the second, zero if they are equal and a positive value if the leftmost parameter is great than the second.

As you don't seem to use FAIL value anyway (and since you use pointers to the already parsed versions), I think it is best to keep to that.

version_compare_print

I'm not sure if this should be a special function; better keep it with main or at least explain what it does.

There is a lot of repetition in the code. You could just set a variable to "<", "==" or ">" and then perform the print after the switch.

The final return statement seems rather useless; nothing is returned and the method is at the end.

Question 7

Setting freed variables to null instead of letting them dangle makes it much easier to find use after free errors. It's not that uncommon, although usually someone writes a macro for it sooner than later.

Question 8

It seems to me that for local variables it just adds to the clutter; I can see this spill over from pointers in other locations though. Maybe I'm too used to Java where null pointers are commonly the cause of problems, not part of the solution to problems.

Question 9

No not GCed objects (which yes I guess with resurrection that would be a problem), but .NET got me. I meant the Closeable interfaces (try with resources and so) Say using a file reader after it has been closed. Also with try-with-resources not a problem in general because you limit the scope to its lifetime, but if you can't (say it's a member) setting those to null is a good idea too.

Question 10

Well that's the problem trying to find analogies for constructs in other languages (my point wasn't to speed up GC, but to make it harder for someone to access a closed resource afterwards). The main point is that there's no downside to setting the variable to NULL, while it can help avoid a common programming error (use after free errors are a very common security exploit after all)

Question 11

Maarten Bodewes, "I'm not sure if setting: ver = NULL; does any good". This is an OK C idiom. It has marginal cost yet marginal value, so a such if fine to have, fine to not have. Follow group's coding guidelines here. @Voo, In cases where ver is not used later, it may get optimized out, so provides no improved security. Agree about ease for debugging.

Question 12

A small version compare improvement would perform up to n +1 rather than 2n compares.

Instead of

if(pL->major > pR->major) return LEFT;
if(pR->major > pL->major) return RIGHT;
if(pL->minor > pR->minor) return LEFT;
if(pR->minor > pL->minor) return RIGHT;
...

Compare for equality first:

if(pL->major != pR->major) {
 return pL->major > pR->major ? LEFT : RIGHT;
}
if(pL->minor != pR->minor) {
 return pL->minor > pR->minor ? LEFT : RIGHT;
}
...

Yet since such code is not likely to be in a critical path, coding for clarity likely overrides.

vnp vnp 58.6k4 gold badges55 silver badges144 bronze badges · Answer 1 · 2020-03-15 21:29:07Z

Don't strtok + atoi. Use strtol, which (a) doesn't need a mutable input, (b) has much better error handling and reporting, and (c) eliminates the need for independent validation. An example of use would be

char * end;
pVer->major = strtol(str, &end, 0);
if (*end != '.') {
 // major is not a number.
 return suitable_failure;
}
str = end + 1;
pVer->minor = strtol(str, &end, 0);
// etc, and keep going with other pieces of a version.
// After revision is computed, *end must be 0

Instead of spelling out the field names, have an array. Then you could wrap the repeated number parsing in a loop.

I don't see the need for a dynamic allocation. VERSION is simple struct, and c would be happy to return it:

VERSION version_new(char * str)
{
 VERSION v;
 ....
 return v;
}
int main()
{
 ....
 VERSION ver1 = version_new(str);
 // etc

@vnp to clarify, you mean don't use the struct fields but just use an array? Re: " Instead of spelling out the field names, have an array. Then you could wrap the repeated number parsing in a loop."
As major is unsigned, strtoul() makes more sense then strtol() here.

S.S. Anne S.S. Anne 1,78510 silver badges27 bronze badges · Answer 2 · 2020-03-15 20:52:54Z

Instead of spelling out every single element of the array,

char ver_string[] = {'5','.','2','5','.','5','.','0','0円'};

just use a string literal. It's the same thing:

char ver_string[] = "5.25.5.0";

Should I be printing to stderr only in the driver (main) program and leave all of that out of the main "library" code?

Yes, exactly. You can return error codes and set errno to be more specific.

Secondly, I have an enum definitionn for FAIL but I never actually use it because when I wrote the switch clause, I thought it may be a better idea to just default to error rather than select it explicitly.

That's fine. Anything other than a valid input is an invalid input.

You don't return a value from version_parse even though it looks like you even inserted a newline where it was supposed to go.

Instead of using LEFT and RIGHT, I would use GREATER and LESS. Those names convey more meaning to me.

Usually, functions that don't take values have an unnamed void parameter. I suggest doing that with main, because it might be undefined behavior if you don't:

int main(void)
{

(I also fixed up the formatting to make it more consistent.)

In validate_version_string, tmp and copy are unnecessary. Just use str and *str, like so:

for(; *str != '0円'; ++str)
{
 if(*str == '.') ++count;
 if((*str > '9' || *str < '0') && *str != '.') return false;
}

You may think that removing tmp is bad for performance, but it's not; in any nonzero optimization level, these are equivalent.

Maarten Bodewes Maarten Bodewes 6,59920 silver badges53 bronze badges · Answer 3 · 2020-03-15 20:50:14Z

The validation of the string seems to leave some gaps. For instance, the version numbers seem to be valid whatever size they have. They could be zero digits large, or contain so many digits that they would not fit into an integer.

I'd try not to print to standard error if this is supposed to be used as a library. In that case you might want to use separate header files, c code and test code of course as well.

I'd keep version_compare_print possibly out of the library; at the very least it should otherwise just populate a string pointer rather than print it to standard out. version_print makes more sense, but it should also populate a string. Then the application can print it to standard output. Probably best to call it version_sprint.

Coding remarks

main

I'm not sure if setting:

ver = NULL;

does any good for a local variable, after you've already freed it.

version_new

I'm not sure if this function should be there at all, just let the caller perform the malloc and call version_parse. They'll have to dealloc anyway.

version_parse

unsigned int count = 0;

Count of what? You should make that explicit in the variable name (delimiter_count seems logical).

char *copy;

This doesn't seem to operate on a copy - just on a copy of the reference; just call it position or something similar.

return -1;

You're hiding the error code. Either you should probably upgrade to a more application specific code or just return the code you're given. Currently it is possible to read the error from standard error stream, but the calling method will be left in the dark.

version_compare

Compare almost always use a negative value if the leftmost parameter is smaller than the second, zero if they are equal and a positive value if the leftmost parameter is great than the second.

As you don't seem to use FAIL value anyway (and since you use pointers to the already parsed versions), I think it is best to keep to that.

version_compare_print

I'm not sure if this should be a special function; better keep it with main or at least explain what it does.

There is a lot of repetition in the code. You could just set a variable to "<", "==" or ">" and then perform the print after the switch.

The final return statement seems rather useless; nothing is returned and the method is at the end.

Setting freed variables to null instead of letting them dangle makes it much easier to find use after free errors. It's not that uncommon, although usually someone writes a macro for it sooner than later.
It seems to me that for local variables it just adds to the clutter; I can see this spill over from pointers in other locations though. Maybe I'm too used to Java where null pointers are commonly the cause of problems, not part of the solution to problems.
No not GCed objects (which yes I guess with resurrection that would be a problem), but .NET got me. I meant the Closeable interfaces (try with resources and so) Say using a file reader after it has been closed. Also with try-with-resources not a problem in general because you limit the scope to its lifetime, but if you can't (say it's a member) setting those to null is a good idea too.
Well that's the problem trying to find analogies for constructs in other languages (my point wasn't to speed up GC, but to make it harder for someone to access a closed resource afterwards). The main point is that there's no downside to setting the variable to NULL, while it can help avoid a common programming error (use after free errors are a very common security exploit after all)
Maarten Bodewes, "I'm not sure if setting: ver = NULL; does any good". This is an OK C idiom. It has marginal cost yet marginal value, so a such if fine to have, fine to not have. Follow group's coding guidelines here. @Voo, In cases where ver is not used later, it may get optimized out, so provides no improved security. Agree about ease for debugging.

chux chux 36.4k2 gold badges43 silver badges96 bronze badges · Answer 4 · 2020-03-16 13:29:08Z

A small version compare improvement would perform up to n +1 rather than 2n compares.

Instead of

if(pL->major > pR->major) return LEFT;
if(pR->major > pL->major) return RIGHT;
if(pL->minor > pR->minor) return LEFT;
if(pR->minor > pL->minor) return RIGHT;
...

Compare for equality first:

if(pL->major != pR->major) {
 return pL->major > pR->major ? LEFT : RIGHT;
}
if(pL->minor != pR->minor) {
 return pL->minor > pR->minor ? LEFT : RIGHT;
}
...

Yet since such code is not likely to be in a critical path, coding for clarity likely overrides.

Stack Exchange Network

C Program which compares software version strings

4 Answers 4

Coding remarks

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

C Program which compares software version strings

4 Answers 4

Coding remarks

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions