6
\$\begingroup\$

I'm trying to parse a char* into an int without using atoi(). I walk through the string, check if it's a valid digit, then add that digit to my integer by multiplying by 10 and adding the digit. I am not accepting negative integers.

Here's the code I'm using:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define TRUE 1
#define FALSE 0
int main(int argc, char *argv[]) {
 if (argc < 2)
 exit(1);
 char *p = argv[1];
 int amount = 0;
 int len = strlen((const char*) argv[1]); // calculate len beforehand
 int contin = TRUE;
 for (int i = 0; contin && i < len; ++i, ++p) {
 switch (*p) {
 case '0':
 amount = (amount * 10);
 break;
 case '1':
 amount = (amount * 10) + 1;
 break;
 case '2':
 amount = (amount * 10) + 2;
 break;
 case '3':
 amount = (amount * 10) + 3;
 break;
 case '4':
 amount = (amount * 10) + 4;
 break;
 case '5':
 amount = (amount * 10) + 5;
 break;
 case '6':
 amount = (amount * 10) + 6;
 break;
 case '7':
 amount = (amount * 10) + 7;
 break;
 case '8':
 amount = (amount * 10) + 8;
 break;
 case '9':
 amount = (amount * 10) + 9;
 break;
 default:
 contin = FALSE;
 break;
 }
 }
 fprintf(stdout, "amount: %i\n", amount);
 return 0;
}

...which works nicely. But, is there a better/more idiomatic way to do this?

EDIT: Thanks to Groo, I'm able to remove the giant switch statement:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define TRUE 1
#define FALSE 0
int main(int argc, char *argv[]) {
 if (argc < 2)
 exit(1);
 char *p = argv[1];
 int amount = 0;
 int len = strlen((const char*) argv[1]); // calculate len beforehand
 int contin = TRUE;
 for (int i = 0; contin && i < len; ++i, ++p) {
 /* handle negative integers */
 if (*p == '-' && i == 0) {
 fprintf(stderr, "negative integers are invalid.\n");
 exit(1);
 }
 if (*p > 0x2F && *p < 0x3A)
 amount = (amount * 10) + (*p - '0');
 else {
 contin = FALSE;
 --p;
 }
 }
 fprintf(stdout, "amount: %i\n", amount);
 return 0;
}
```
asked Jan 18, 2020 at 16:34
\$\endgroup\$
2
  • 3
    \$\begingroup\$ I wouldn't use 0x2F and 0x3A because most people don't know ASCII by heart and because your intent is to compare then with the digits zero and nine. Use '0' and '9' -- it's easier to understand and better reflects the intent of the code. \$\endgroup\$ Commented Jan 19, 2020 at 7:44
  • 1
    \$\begingroup\$ Also note that every answer so far works because the integer value of numbers 0 to 9 are guaranteed by the C standard to be consecutive. There's no such guarantee for other characters, so you can not reliably do the same thing with hexadecimal digits. \$\endgroup\$ Commented Jan 19, 2020 at 14:07

2 Answers 2

8
\$\begingroup\$
  • Since command line arguments are null-terminated, we can avoid strlen() entirely by just running through the string until we hit '0円'; this also gets rid of contin and i
  • (*p > 0x2F && *p < 0x3A) isn't very obvious, I'd just use '0' and '9' to make it easier to understand the intention and to be consistent with the rest of the code
  • You could use EXIT_SUCCESS and EXIT_FAILURE instead of 0 and 1 respectively
  • For consistency, I'd replace your call to exit() with a simple return
  • I'd remove the amount: part of the print; it should be clear from the context (running this program) what the output actually is; this would also make processing the output easier

Putting it all together, we end up with this short piece:

int main(int argc, char *argv[]) {
 if (argc < 2)
 return EXIT_FAILURE;
 char *p = argv[1];
 int amount = 0;
 while (*p) {
 if (*p < '0')
 return EXIT_FAILURE;
 if (*p > '9')
 return EXIT_FAILURE;
 amount = (amount * 10) + (*p++ - '0');
 }
 fprintf(stdout, "%i\n", amount);
 return EXIT_SUCCESS;
}

Note, however, that this doesn't account for:

  • negative numbers (starting with a -)
  • positive numbers starting with a +
  • any leading whitespace

Furthermore:

  • If the input starts with a number and has a letter following sometimes later, it will break out and print the number it has found until then; but if the first character found is a letter, the program prints 0 and reports success, which could be by design, but could also be considered a bug
  • The use of *p++ can be a bit dangerous if this was code shared with less experienced programmers, as simply changing it to ++*p would increase the value pointed to instead of the pointer itself; changing it to *(p++) might help
  • While this comes down to taste, I would argue to always use curly braces, even for one-liners: otherwise, adding another line later would break the code; also consistency is a plus
  • Adding support for leading whitespace is easy and also supports my previous point, as missing brackets would break the program:
 if (*p == ' ') {
 *p++;
 continue;
 }
  • If you don't want to implement support for negative numbers, it might be reasonable to change the type of amount to unsigned or - even better - unsigned long long, which would give you a possible range of [0, +18446744073709551615]
  • Naive handling of negative integers would be quite easy to implement (but note Roland's answer for caveats with this):
 int sign = 1;
 if (*p == '-') {
 sign = -1;
 *p++;
 }
 /* loop */
 amount *= sign;
answered Jan 18, 2020 at 16:48
\$\endgroup\$
9
  • \$\begingroup\$ yes, the part when the program stops when it encounters an invalid digit is intentional. \$\endgroup\$ Commented Jan 18, 2020 at 18:18
  • \$\begingroup\$ Why write EXIT_FAILURE twice in 2 if statements.? Why not if (*p < '0' || *p > '9') EXIT_FAILURE? \$\endgroup\$ Commented Jan 18, 2020 at 20:24
  • 1
    \$\begingroup\$ Yeah, the whole "1 line if" discussion is highly subjective. You're right though, I do use one line ifs, especially for this kind of "early return" code where it is very unlikely someone wants to add a second line. Your answer is very good, it doesn't need editing. These are subjective discussion points. Your aversion to || was new to me, but there is nothing wrong with it. \$\endgroup\$ Commented Jan 18, 2020 at 20:45
  • 2
    \$\begingroup\$ Handling negative integers is not as easy as you think. There is one integer that is a valid negative number but an invalid positive number. For int32_t that is the number -2147483648. \$\endgroup\$ Commented Jan 18, 2020 at 21:17
  • 1
    \$\begingroup\$ As a learning example, printing the result is perfectly acceptable, but to be usable, the program should be a function returning the parsed int. \$\endgroup\$ Commented Jan 19, 2020 at 7:06
2
\$\begingroup\$

What should the output be if I enter 123456789012345678901234567890? That's producing an integer overflow right now.

answered Jan 18, 2020 at 18:11
\$\endgroup\$
5
  • 2
    \$\begingroup\$ uh, I'm new, but shouldn't this be a comment? Just asking :) \$\endgroup\$ Commented Jan 18, 2020 at 18:16
  • \$\begingroup\$ correct, when I put this into my application I'll add code that will handle an overflow. \$\endgroup\$ Commented Jan 18, 2020 at 18:16
  • 6
    \$\begingroup\$ Granted, this answer is very short. It reveals a bug in your code though and thus might be considered a review. It's only meant as an addition to the other answer, though. \$\endgroup\$ Commented Jan 18, 2020 at 21:15
  • \$\begingroup\$ In this cases we can store the answer by taking this number modulo a prime number i.e 10ドル^9 + 7$. \$\endgroup\$ Commented Jan 18, 2020 at 22:27
  • \$\begingroup\$ @strikersps, doesn't that just change program from Undefined Behaviour to one that produces a wrong answer? It's hard to see how that would be an improvement. You need a better strategy for dealing with overflow - I suggest it's better to return EXIT_FAILURE, since a number that's too large is the same category as one that's too small. \$\endgroup\$ Commented Jan 20, 2020 at 9:02

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.