I'm trying to parse a char*
into an int
without using atoi()
. I walk through the string, check if it's a valid digit, then add that digit to my integer by multiplying by 10 and adding the digit. I am not accepting negative integers.
Here's the code I'm using:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define TRUE 1
#define FALSE 0
int main(int argc, char *argv[]) {
if (argc < 2)
exit(1);
char *p = argv[1];
int amount = 0;
int len = strlen((const char*) argv[1]); // calculate len beforehand
int contin = TRUE;
for (int i = 0; contin && i < len; ++i, ++p) {
switch (*p) {
case '0':
amount = (amount * 10);
break;
case '1':
amount = (amount * 10) + 1;
break;
case '2':
amount = (amount * 10) + 2;
break;
case '3':
amount = (amount * 10) + 3;
break;
case '4':
amount = (amount * 10) + 4;
break;
case '5':
amount = (amount * 10) + 5;
break;
case '6':
amount = (amount * 10) + 6;
break;
case '7':
amount = (amount * 10) + 7;
break;
case '8':
amount = (amount * 10) + 8;
break;
case '9':
amount = (amount * 10) + 9;
break;
default:
contin = FALSE;
break;
}
}
fprintf(stdout, "amount: %i\n", amount);
return 0;
}
...which works nicely. But, is there a better/more idiomatic way to do this?
EDIT: Thanks to Groo, I'm able to remove the giant switch statement:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define TRUE 1
#define FALSE 0
int main(int argc, char *argv[]) {
if (argc < 2)
exit(1);
char *p = argv[1];
int amount = 0;
int len = strlen((const char*) argv[1]); // calculate len beforehand
int contin = TRUE;
for (int i = 0; contin && i < len; ++i, ++p) {
/* handle negative integers */
if (*p == '-' && i == 0) {
fprintf(stderr, "negative integers are invalid.\n");
exit(1);
}
if (*p > 0x2F && *p < 0x3A)
amount = (amount * 10) + (*p - '0');
else {
contin = FALSE;
--p;
}
}
fprintf(stdout, "amount: %i\n", amount);
return 0;
}
```
2 Answers 2
- Since command line arguments are null-terminated, we can avoid
strlen()
entirely by just running through the string until we hit'0円'
; this also gets rid ofcontin
andi
(*p > 0x2F && *p < 0x3A)
isn't very obvious, I'd just use'0'
and'9'
to make it easier to understand the intention and to be consistent with the rest of the code- You could use
EXIT_SUCCESS
andEXIT_FAILURE
instead of0
and1
respectively - For consistency, I'd replace your call to
exit()
with a simplereturn
- I'd remove the
amount:
part of the print; it should be clear from the context (running this program) what the output actually is; this would also make processing the output easier
Putting it all together, we end up with this short piece:
int main(int argc, char *argv[]) {
if (argc < 2)
return EXIT_FAILURE;
char *p = argv[1];
int amount = 0;
while (*p) {
if (*p < '0')
return EXIT_FAILURE;
if (*p > '9')
return EXIT_FAILURE;
amount = (amount * 10) + (*p++ - '0');
}
fprintf(stdout, "%i\n", amount);
return EXIT_SUCCESS;
}
Note, however, that this doesn't account for:
- negative numbers (starting with a
-
) - positive numbers starting with a
+
- any leading whitespace
Furthermore:
- If the input starts with a number and has a letter following sometimes later, it will
break
out and print the number it has found until then; but if the first character found is a letter, the program prints0
and reports success, which could be by design, but could also be considered a bug - The use of
*p++
can be a bit dangerous if this was code shared with less experienced programmers, as simply changing it to++*p
would increase the value pointed to instead of the pointer itself; changing it to*(p++)
might help - While this comes down to taste, I would argue to always use curly braces, even for one-liners: otherwise, adding another line later would break the code; also consistency is a plus
- Adding support for leading whitespace is easy and also supports my previous point, as missing brackets would break the program:
if (*p == ' ') {
*p++;
continue;
}
- If you don't want to implement support for negative numbers, it might be reasonable to change the type of
amount
tounsigned
or - even better -unsigned long long
, which would give you a possible range of [0, +18446744073709551615] - Naive handling of negative integers would be quite easy to implement (but note Roland's answer for caveats with this):
int sign = 1;
if (*p == '-') {
sign = -1;
*p++;
}
/* loop */
amount *= sign;
-
\$\begingroup\$ yes, the part when the program stops when it encounters an invalid digit is intentional. \$\endgroup\$Kied Llaentenn– Kied Llaentenn2020年01月18日 18:18:19 +00:00Commented Jan 18, 2020 at 18:18
-
\$\begingroup\$ Why write EXIT_FAILURE twice in 2 if statements.? Why not
if (*p < '0' || *p > '9') EXIT_FAILURE
? \$\endgroup\$Oliver Schönrock– Oliver Schönrock2020年01月18日 20:24:07 +00:00Commented Jan 18, 2020 at 20:24 -
1\$\begingroup\$ Yeah, the whole "1 line if" discussion is highly subjective. You're right though, I do use one line ifs, especially for this kind of "early return" code where it is very unlikely someone wants to add a second line. Your answer is very good, it doesn't need editing. These are subjective discussion points. Your aversion to
||
was new to me, but there is nothing wrong with it. \$\endgroup\$Oliver Schönrock– Oliver Schönrock2020年01月18日 20:45:14 +00:00Commented Jan 18, 2020 at 20:45 -
2\$\begingroup\$ Handling negative integers is not as easy as you think. There is one integer that is a valid negative number but an invalid positive number. For
int32_t
that is the number-2147483648
. \$\endgroup\$Roland Illig– Roland Illig2020年01月18日 21:17:19 +00:00Commented Jan 18, 2020 at 21:17 -
1\$\begingroup\$ As a learning example, printing the result is perfectly acceptable, but to be usable, the program should be a function returning the parsed int. \$\endgroup\$qwr– qwr2020年01月19日 07:06:40 +00:00Commented Jan 19, 2020 at 7:06
What should the output be if I enter 123456789012345678901234567890? That's producing an integer overflow right now.
-
2\$\begingroup\$ uh, I'm new, but shouldn't this be a comment? Just asking :) \$\endgroup\$Kied Llaentenn– Kied Llaentenn2020年01月18日 18:16:19 +00:00Commented Jan 18, 2020 at 18:16
-
\$\begingroup\$ correct, when I put this into my application I'll add code that will handle an overflow. \$\endgroup\$Kied Llaentenn– Kied Llaentenn2020年01月18日 18:16:42 +00:00Commented Jan 18, 2020 at 18:16
-
6\$\begingroup\$ Granted, this answer is very short. It reveals a bug in your code though and thus might be considered a review. It's only meant as an addition to the other answer, though. \$\endgroup\$Roland Illig– Roland Illig2020年01月18日 21:15:15 +00:00Commented Jan 18, 2020 at 21:15
-
\$\begingroup\$ In this cases we can store the answer by taking this number modulo a prime number i.e 10ドル^9 + 7$. \$\endgroup\$strikersps– strikersps2020年01月18日 22:27:34 +00:00Commented Jan 18, 2020 at 22:27
-
\$\begingroup\$ @strikersps, doesn't that just change program from Undefined Behaviour to one that produces a wrong answer? It's hard to see how that would be an improvement. You need a better strategy for dealing with overflow - I suggest it's better to return
EXIT_FAILURE
, since a number that's too large is the same category as one that's too small. \$\endgroup\$Toby Speight– Toby Speight2020年01月20日 09:02:00 +00:00Commented Jan 20, 2020 at 9:02
0x2F
and0x3A
because most people don't know ASCII by heart and because your intent is to compare then with the digits zero and nine. Use'0'
and'9'
-- it's easier to understand and better reflects the intent of the code. \$\endgroup\$