I implemented the atoi()
function! Here is my code:
int my_atoi(char* pointer)
{
int result = 0;
char* pointer1;
multiplier = 1;
char sign = 1;
if(*pointer == '-')
sign =- 1;
pointer1 = pointer;
while(*pointer != '0円')
{
if(*pointer >= '0' && *pointer <= '9')
multiplier = multiplier * 10;
pointer = pointer + 1;
}
pointer = pointer1;
while(*pointer != '0円')
{
if(*pointer >= '0' && *pointer <= '9')
{
result = result + ( (*pointer%48) * multiplier);
multiplier = multiplier / 10;
}
pointer = pointer+1;
}
return (result * sign) / 10;
}
I wonder if there is any way that I can improve my function. I know there is a problem with my function. What if the user wants to convert from char*
to int
this string: "232-19". What should I do then? Any advice would be really helpful!
-
\$\begingroup\$ how is the problem "string to int: 232-19" connected with the code at hand? \$\endgroup\$Vogel612– Vogel6122014年03月30日 15:59:04 +00:00Commented Mar 30, 2014 at 15:59
-
\$\begingroup\$ Well what if i want to convert from string to int the number -255 and by accident i type "8-255" .Then according to my algorith the number 8255 will be returned.I know it's pretty stupid to worry about these things but what if the user is extremely dumb? Furthermore i know it is really difficult for someone to type 8-255 instead of -255 but you never know, it may happen! \$\endgroup\$RookieCookie– RookieCookie2014年03月30日 16:08:05 +00:00Commented Mar 30, 2014 at 16:08
-
\$\begingroup\$ raise an error. the input format is faulty. you shouldn't guess what the user wanted, but make him make his intent unmistakably clear ;) \$\endgroup\$Vogel612– Vogel6122014年03月30日 16:10:37 +00:00Commented Mar 30, 2014 at 16:10
-
\$\begingroup\$ You only need one pass of the string (not two). \$\endgroup\$Loki Astari– Loki Astari2014年03月30日 18:23:13 +00:00Commented Mar 30, 2014 at 18:23
-
2\$\begingroup\$ Please do not edit your code after it has been reviewed so that it could make any reviews irrelevant. \$\endgroup\$syb0rg– syb0rg2014年03月30日 20:03:53 +00:00Commented Mar 30, 2014 at 20:03
5 Answers 5
Things you could improve
Variables/Initialization
Where do you declare
multiplier
? I assume that since it is not declared within the method, it is declared as a global variable. Try to avoid global variables.The problem with global variables is that since every function has access to these, it becomes increasingly hard to figure out which functions actually read and write these variables.
To understand how the application works, you pretty much have to take into account every function which modifies the global state. That can be done, but as the application grows it will get harder to the point of being virtually impossible (or at least a complete waste of time).
If you don't rely on global variables, you can pass state around between different functions as needed. That way you stand a much better chance of understanding what each function does, as you don't need to take the global state into account.
So instead of using global variables, initialize the variables in
main()
, and pass them as arguments to functions if necessary. In this case, I don't see the need formultiplier
to be used outside of the function at all, so simply keep it declared within the function.sign
should be anint
, and not achar
.
Algorithm
Right now you are implementing a complicated and hard to follow method of converting a character into a number. The easy way is to have
isdigit()
do the hard work for you. This also will help you implement the DRY principle.while(*pointer != '0円') { if(*pointer >= '0' && *pointer <= '9') multiplier = multiplier * 10; pointer = pointer + 1; } pointer = pointer1; while(*pointer != '0円') { if(*pointer >= '0' && *pointer <= '9') { result = result + ( (*pointer%48) * multiplier); multiplier = multiplier / 10; } pointer = pointer+1; }
See how you have two loops doing almost identical things? Here is how I simplified all of that by using
isdigit()
.while (isdigit(*c)) { value *= 10; value += (int) (*c - '0'); c++; }
You loop through the characters in the string as long as they are digits. For each one, add to the counter you're keeping - the value to add is the integer value of the character. This is done by subtracting the ASCII value of
'0'
from the ascii value of the digit in question.Note that this code doesn't handle overflow. If you pass in "89384798719061231" (which won't fit in an
int
), the result is undefined. The fix is simple enough, just use along long int
to mitigate against that. We'll still have issues for extremely long numbers, but fixing that so that the function works as intended is a bit more complicated.
Documentation
Where did all of your comments go? A newer developer would simply gawk at some of your code.
result = result + ( (*pointer%48) * multiplier);
Comments can really go a long way into helping other understand your code. Don't go overboard with them though, you will have to balance how much of them to put into your program.
Syntax/Styling
This looks like a typo.
if(*pointer == '-') sign =- 1;
Add a space for clarity.
if(*pointer == '-') sign = -1;
You should not be modifying your
char*
you accept as a parameter into the function. Therefore, declare the parameter as constant.int my_atoi(const char* pointer)
Use more shorthand operators.
pointer++; // same as pointer = pointer+1; multiplier /= 10; // same as multiplier = multiplier / 10; multiplier *= 10; // same as multiplier = multiplier * 10;
Final Code
#include <stdio.h>
#include <assert.h>
#include <ctype.h>
long long int my_atoi(const char *c)
{
long long int value = 0;
int sign = 1;
if( *c == '+' || *c == '-' )
{
if( *c == '-' ) sign = -1;
c++;
}
while (isdigit(*c))
{
value *= 10;
value += (int) (*c-'0');
c++;
}
return (value * sign);
}
int main(void)
{
assert(5 == my_atoi("5"));
assert(-2 == my_atoi("-2"));
assert(-1098273980709871235 == my_atoi("-1098273980709871235"));
puts("All good."); // I reach this statement on my system
}
-
5\$\begingroup\$ You shouldn't change return types arbitrarily.
atoi()
traditionally returns anint
, somy_atoi()
should, too. If you want to parse along long
, then emulatestrtoll()
. \$\endgroup\$200_success– 200_success2014年03月31日 03:25:39 +00:00Commented Mar 31, 2014 at 3:25 -
4\$\begingroup\$
isdigit(*c)
is not define for*c
values less than 0 (other than EOF). Better towhile (isdigit((unsigned char) (*c) ))
\$\endgroup\$chux– chux2014年04月01日 03:42:52 +00:00Commented Apr 1, 2014 at 3:42 -
\$\begingroup\$ Missed corner: When
my_atoi()
result should beLLONG_MIN
,value += (int) (*c-'0');
is signed integer overflow (UB) as it tries to formLLONG_MAX + 1
. \$\endgroup\$chux– chux2017年11月17日 21:38:41 +00:00Commented Nov 17, 2017 at 21:38 -
\$\begingroup\$ Using
isdigit
is wrong at all, since it doesn't have a related functionnumeric_value
. Therefore, if your character set has two ranges of digits (0 to 9, and ٠ to ٩), the Indic numbers will be parsed wrong. Just stick to'0' <= c && c <= '9'
to be safe. This also avoids the undefined behavior from using the ctype function incorrectly. \$\endgroup\$Roland Illig– Roland Illig2018年01月13日 05:38:46 +00:00Commented Jan 13, 2018 at 5:38 -
\$\begingroup\$ You missed an important point when you wrote "ASCII value of '0'": there's nothing that says the host character set needs to be ASCII (only that 0..9 are contiguous). That's why you write
'0'
rather than an encoding-specific codepoint number. \$\endgroup\$Toby Speight– Toby Speight2018年02月21日 21:10:35 +00:00Commented Feb 21, 2018 at 21:10
[Edit]
Except for the behavior on error, atoi()
is equivalent to (int)strtol(nptr, (char **)NULL, 10)
. strtol()
accepts leading white space. OP's my_atoi(char* pointer)
does not. To remedy:
int my_atoi(const char* pointer) {
while (isspace((unsigned char) *pointer)) {
pointer++;
}
...
The below does describe a good way to handle INT_MIN
.
OTOH, handing values outside [INT_MIN...INT_MAX]
is is not defined by the C spec, so some simplifications can be had. See far below.
When a string represents INT_MIN
, (let's assume 32-bit int
) such as "-2147483648"
, code runs into int
overflow attempting to get to calculate 2147483648
. A simple way to solver this is rather than finding the positive value and then negating it, embrace the negative side of things. By doing the lion share of the math in the INT_MIN
to 0
range, we avoid UB. Down-side: some find this approach more challenging to follow.
Going to a wider integer or unsigned
it not always possible as the integer size of "text--> integer" routine may be the maximum size. Strictly speaking unsigned
does not always have a wider positive range than int
. In any case, all the math can be handled at the desired signed integer size without resorting to other types.
#include <ctype.h>
#include <limits.h>
int my_atoi(const char* pointer) { // good idea to make the `const`
int result = 0;
while (isspace((unsigned char) *pointer)) {
pointer++;
}
char sign = *pointer;
if (*pointer == '-' || *pointer == '+') { // text could lead with a '+'
pointer++;
}
int ch;
// isdigit() expects an unsigned char or EOF, not char
while ((ch = (unsigned char)(*pointer)) != 0) {
if (!isdigit(ch)) break;
ch -= '0';
// Will overflow occur?
if ((result < INT_MIN/10) ||
(result == INT_MIN/10 && ch > -(INT_MIN%10))) Handle_Overflow();
result *= 10;
result -= ch; // - , not +
pointer++;
}
if (sign != '-') {
if (result < -INT_MAX) Handle_Overflow();
result = -result;
}
return result;
}
Notes:
pointer%48
is confusing. What is special about 48? If you mean '0'
, then use pointer % '0'
.
"string: "232-19". What should I do then?" Recommend stopping conversion at "232" and returning the value 232. Could set errno
, but the typical atoi()
function does not do too much error handling.
On overflow, setting errno
, could happen, but again, typical atoi()
function does not do too much error handling. Suggest simple returning INT_MAX
or INT_MIN
.
If you want better error handling, change to something like the following and set an error status.
int my_atoi(const char *s, int *ErrorCode);
or location where things ended. If this are good, they ended at the '0円'
.
int my_atoi(const char *s, const char **endptr);
[Edit] Simplified: Removed out-of-range detection as C spec allows that. "If the value of the result cannot be represented, the behavior is undefined.
int my_atoi(const char* pointer) {
int result = 0;
while (isspace((unsigned char) *pointer)) {
pointer++;
}
char sign = *pointer;
if (*pointer == '-' || *pointer == '+') {
pointer++;
}
while (isdigit((unsigned char)*pointer)) {
result = result*10 - (*pointer++ - '0');
}
if (sign != '-') {
result = -result;
}
return result;
}
-
1\$\begingroup\$
INT_MIN/10
andINT_MIN%10
require C99 behavior. \$\endgroup\$chux– chux2016年04月27日 19:40:42 +00:00Commented Apr 27, 2016 at 19:40
char sign = *pointer;
if (*pointer == '-' || *pointer == '+') {
pointer++;
}
Why de-referencing "pointer" three times? One time is enough:
char sign = *pointer;
if (sign == '-' || sign == '+') {
pointer++;
}
-
1\$\begingroup\$ Welcome to Code Review, your first answer looks good, enjoy your stay! Though I wonder if it makes a difference in the generated code. \$\endgroup\$ferada– ferada2016年11月16日 18:13:46 +00:00Commented Nov 16, 2016 at 18:13
if you are ok with recursion then code could be shortened to one below
#include <string.h>
#include <math.h>
#include <stdbool.h>
int natural_number(const char* string) {
int index = strlen(string) - 1;
int number = pow(10, index) * (*string - '0');
return (index == 0) ? number : number + natural_number(string + 1);
}
int my_atoi(const char* string) {
int sign = (*string == '-') ? -1 : 1;
int offset = (*string == '-') ? 1 : 0;
return sign * natural_number(string + offset);
}
/* test cases */
my_atoi("-100") == -100;
my_atoi("0") == 0;
my_atoi("100") == 100;
Stack exhaustion could be mitigated by -foptimize-sibling-calls
compiler flag, that being supported by both GCC and Clang compilers.
Update:
As noted by Roland Illig implementation does not handle malformed input.
If it is desired closelly follow atoi
semantics then next code should be fine don't forget set Compile Options
to one in comments.
int digit(char symbol) {
return symbol - '0';
}
/* tail call optimized */
int natural_number_tc(const char* string, int number) {
return !isdigit(*string)
? number
: natural_number_tc(string + 1, 10 * number + digit(*string));
}
int natural_number(const char* string) {
return natural_number_tc(string, 0);
}
const char* left_trim_tc(const char* string, const char* symbol) {
return !isspace(*string) ? symbol : left_trim_tc(string + 1, symbol + 1);
}
const char* left_trim(const char* string) {
return left_trim_tc(string, string);
}
int my_atoi(const char* string) {
const char* symbol = left_trim(string);
int sign = (*symbol == '-') ? -1 : 1;
size_t offset = (*symbol == '-' || *symbol == '+') ? 1 : 0;
return sign * natural_number(symbol + offset);
}
This is still chux's code where loops replaced with recursion
int result = 0;
while (isdigit((unsigned char)*pointer)) {
result = 10 * result + (*pointer - '0');
pointer++;
}
// VS
int loop(const char* pointer, int result) {
return !isdigit((unsigned char)*pointer)
? result
: loop(pointer + 1, 10 * result + (*pointer - '0'))
}
-
\$\begingroup\$ Test case:
buf = malloc(65536); buf[0] = '0円'; my_atoi(buf)
will probably crash. \$\endgroup\$Roland Illig– Roland Illig2018年01月13日 05:46:09 +00:00Commented Jan 13, 2018 at 5:46 -
\$\begingroup\$ Test case:
bufsize = 1 << 20; buf = malloc(bufsize); memset(buf, '0', bufsize); buf[bufsize - 1] = '0円'; my_atoi(buf)
will take a very long time. \$\endgroup\$Roland Illig– Roland Illig2018年01月13日 05:50:29 +00:00Commented Jan 13, 2018 at 5:50
For a exercise in leetcode, wrote following impl: atoi cpp code
class Solution {
private:
bool checkMin(int a, int b=10, int c=0, int min_val=INT_MIN) {
/*
accepts a*b + c, min
a>min; b>min; c>min
check a*b+c > min or not
b>0; a<0 -ive; c<0
a!=0
*/
min_val = min_val -c;
//std::cout<<"new min input: "<<a <<" , "<< c<<" iter: "<<b << " "<<min_val <<std::endl;
//compare with a now
if(a<min_val)
return false;
int cur_prod = 0;
if(a==0)
return true;
for(;b>1;b--) {
cur_prod += a;
int curr_diff = min_val-cur_prod;
/*
subtraction possible because
min_val<prod,
min_val-prod<prod-prod
min_val-prod<0 ---1
prod<0
-prod>0
min_val+(-prod )> min_val+0 [x+ (+ive quantity)>x ]
min_val-prod>min_val --2
from 1, 2
min_val< min_val-prod < 0 ---3
from 3, min_val-prod can be expressed in integer
check if curr_diff still can hold a deduction of a
which means: curr_diff<a should hold, for a further a deduction in prod
-5, -6
for ex of min_val = 59, a = -6 at b = 2 (9th iteration) prod = -54
you can't add -6 now, since it will cross definable limit
only b-1 iterations
because at i-1 th iteration, ith product formation is checked
*/
//std::cout<<"check function for input: "<<a <<" , "<< c<<" iter: "<<b << " prod now = "
//<< cur_prod << " diff = " <<curr_diff<<" is curr_dif<a "<<(curr_diff<a)<<std::endl;
if(curr_diff>a) {
//std::cout<<" not possible"<<std::endl;
return false;
}
}
return true;
}
bool checkMax(int a, int b=10, int c=0, int max_val=INT_MAX) {
/*
accepts a*b + c, min
a<max; b<max; c<max
check a*b+c < max or not
b>0; a>0, c>0
*/
max_val = max_val -c;
//std::cout<<"new max input: "<<a <<" , "<< c<<" iter: "<<b << " "<<max_val <<std::endl;
//compare with a now
if(a>max_val) return false;
int cur_prod = 0;
if(a==0) return true;
for(;b>1;b--) {
cur_prod += a;
int curr_diff = max_val-cur_prod;
/*
subtraction possible because
max_val>prod,
max_val-prod>prod-prod
max_val-prod>0 ---1
prod>0
-prod<0
max_val+(-prod )< max_val+0 [x+ (-ive quantity)<x ]
max_val-prod<max_val --2
from 1, 2
0< max_val-prod < max_val ---3
from 3, max_val-prod can be expressed in integer
check if curr_diff still can hold a increment of a
which means: curr_diff>a should hold, for a further a deduction in prod
5>6 fails
for ex of max_val = 59, a = 6 at b = 2 (9th iteration) prod = 54
you can't add 6 now, since it will cross definable limit
only b-1 iterations
because at i-1 th iteration, ith product formation is checked
*/
//std::cout<<"check function for input: "<<a <<" , "<< c<<" iter: "<<b << " prod now = "
// << cur_prod << " diff = " <<curr_diff<<" is curr_dif<a "<<(curr_diff>a)<<std::endl;
if(curr_diff<a) {
//std::cout<<" not possible"<<std::endl;
return false;
}
}
return true;
}
public:
int myAtoi(string str) {
//code to trim string
int i =0, end=str.length()-1;
//std::cout<<i<<" "<<end<<std::endl;
while(i<end && str.at(i)==' ') {i++;continue;}
while(end>-1 && str.at(end)==' ') {end--;continue;}
if(end<i) return 0;
int sign=1;
if(str.at(i)=='-') {sign = -1; i++;}
else if(str.at(i)=='+') {i++;}
string tr_str = str.substr(i, end-i+1);
int num = 0;
for(char& digit : tr_str) {
if(digit<'0' || digit>'9') return num; // not convertable character - exit
int c= digit-'0';
if(sign==-1) {
//std::cout<<"Evaluating "<<c<<std::endl;
//number cannot be lower than INT_MIN
// do a check of num * 10 - c
//num<0 already
if(checkMin(num, 10, -c, INT_MIN))
num = num*10 -c;
else {
num = INT_MIN;
break;
}
//std::cout<<"number is"<<num<<std::endl;
}
else {
if(checkMax(num, 10, c, INT_MAX))
num = num*10 +c;
else {
num = INT_MAX;
break;
}
//std::cout<<"number is"<<num<<std::endl;
}
}
return num;
}
};
-
1\$\begingroup\$ Welcome to Code Review! You have presented an alternative solution, but haven't reviewed the code. Please explain your reasoning (how your solution works and why it is better than the original) so that the author and other readers can learn from your thought process. \$\endgroup\$Toby Speight– Toby Speight2018年02月21日 20:58:26 +00:00Commented Feb 21, 2018 at 20:58
-
\$\begingroup\$ the code uses a method, where checkMin, where no direct multiplication is performed until the result is validated. to be greater than INT_MIN. \$\endgroup\$Panther_live– Panther_live2018年02月23日 04:16:47 +00:00Commented Feb 23, 2018 at 4:16
Explore related questions
See similar questions with these tags.