Checking if each char in a string is a decimal digit

Question 1

I had to write a program that would verify that a string contains any digits beside numbers 0-9 and the first '-' sign to indicate if pos/neg. The algorithm works like I want it to, but is there a better way of writing it, e.g. streamlining it, making it shorter?

bool isNum(string number) {
 // Checks if first char in buffer contains invalid digits 
 // Needed because any negative sign after the one used to indicate negative sign.
 if (!((number[0] >= '0' && number[0] <= '9') || number[0] == '-' || number[0] == '+'))
 return false;
 // Checks if each digit after the first is anything but 0-9
 for (int i = 1; number[i] != '0円'; i++) {
 if (number[i] < '0' || number[i] > '9')
 return false;
 }
 // If neither statement returns false, returns true by default
 return true;
}

Question 2

Your code doesn’t match your text, as the former also accepts a leading + character.

Question 3

In the cctype header, you have the std::isdigit(int) function. You can use this instead of your conditions that check if the character is between '0' and '9'.
You index into a std::string with an int inside the for loop. Use std::string::size_type as that is the proper type that can index into a std::string (int might be too small).
Mark the function as noexcept.
Consider passing the std::string parameter by reference if you're planning on using large number strings (~20 character numbers). This prevents unnecessary copying from dynamically allocated strings.
Check vs. the size instead of indexing into the string and checking if the current character is '0円'.
Qualify standard types with std::. Don't use using namespace std;.
There is no need to negate that first condition.
Bug: isNum("-") and isNum("+") return true. An additional check is required if the first character is '+' or '-'.
Naming is important. Your function is called isNum(), but I believe that isInt() better describes it; the function does not detect floating point numbers. Please also consider this comment by Toby Speight in regards to naming.

After changes are applied:

#include <string>
#include <cctype>
bool isinteger(std::string const& n) noexcept
{
 if (std::isdigit(n[0]) || (n.size() > 1 && (n[0] == '-' || n[0] == '+')))
 {
 for (std::string::size_type i{ 1 }; i < n.size(); ++i)
 if (!std::isdigit(n[i]))
 return false;
 return true;
 }
 return false;
}

Note 1: I've renamed the function to isinteger() to keep consistency with other similar functions such as isdigit(). I've also reordered the initial condition to take advantage of boolean short circuiting.

Note 2: As Rakete1111 points out, if your compiler hasn't caught up to C++11 yet, accessing the first character of an empty string is undefined behaviour.

The condition should be modified like so:

// check that the string is not empty, only if we're working with < C++11
if ((!n.empty() && std::isdigit(n[0])) || (n.size() > 1 && (n[0] == '-' || n[0] == '+')))
{
 // ...
}

However, if your compiler is C++11 (or later) compliant, you can safely leave it; accessing the first character of empty strings returns '0円' in that case.

See here for more information: http://en.cppreference.com/w/cpp/string/basic_string/operator_at

Question 4

If this function is in the global namespace, then a name beginning with is may be a poor choice, as it can't then be safely used in a translation unit that also includes C standard headers (i.e. not through their C++ wrappers).

Question 5

Note that if the string is empty, your code has undefined behavior.

Question 6

@Deduplicator What do you mean it's fine? It is not, std:: isdigit(n[0]) is the first executed statement, and it accesses n[0]. If n is empty, you have undefined behavior

Question 7

@Rakete1111: See en.cppreference.com/w/cpp/string/basic_string/operator_at

Question 8

Note that even before C++11, it's OK since he's using the const form of the operator.

Question 9

The other reviews have hit most of the important points, so I'll just provide a single line alternative that uses the C++11 std::regex_match. That function looks like this:

bool wasInteger(const std::string &num) {
 return std::regex_match(num, std::regex("[+-]?[0-9]+"));
}

With a regex this simple, I might not bother with a wrapper function.

Test your function

I also wrote (and encourage you to write) a simple test program:

#include <iostream>
#include <iomanip>
#include <string>
#include <regex>
bool wasInteger(const std::string &num) {
 return std::regex_match(num, std::regex("[+-]?[0-9]+"));
}
struct Test {
 std::string str;
 bool expected;
};
static const std::string CSI{"\x1b["};
static const std::string RED{CSI + "31m"};
static const std::string GREEN{CSI + "32m"};
static const std::string RESET{CSI + "0m"};
int main() {
 static const Test tests[]{
 {"+", false},
 {"-", false},
 {"0", true},
 {"3", true},
 {"9", true},
 {"a", false},
 {"99a9", false},
 {"9909", true},
 {"", false},
 {"-3.14", false},
 {"+32768", true},
 {"-32768", true},
 };
 static const std::string badgood[]{ 
 RED + "[BAD] " + RESET, 
 GREEN + "[OK] "+ RESET
 }; 
 for (const auto &t : tests) {
 auto result = wasInteger(t.str);
 std::cout << badgood[result == t.expected] 
 << std::boolalpha << "got " << result 
 << ", expected " << t.expected 
 << " from \"" << t.str << "\"\n";
 }
}

Results

When I run the program I get the following output:

[OK] got false, expected false from "+"
[OK] got false, expected false from "-"
[OK] got true, expected true from "0"
[OK] got true, expected true from "3"
[OK] got true, expected true from "9"
[OK] got false, expected false from "a"
[OK] got false, expected false from "99a9"
[OK] got true, expected true from "9909"
[OK] got false, expected false from ""
[OK] got false, expected false from "-3.14"
[OK] got true, expected true from "+32768"
[OK] got true, expected true from "-32768"

Question 10

Note: I've done some simple timing tests and it seems that using regex is orders of magnitude slower than other solutions. 300ms (normal loop) vs 8000ms (regex). Excellent point about testing +1.

Question 11

Yes, it's often the case that a regex is slower. One can use std::regex::optimize and factor out the regex into a static const, and enable compiler optimizations, but that only gets us so far. In cases for which raw performance is imperative, a regex, especially for simple patterns like this one, is often not the optimum choice. However, if the goal is to optimize the programmer's productivity, it's often, as in this case, much simpler (for the human!) than all other correct alternatives.

Question 12

Use the standard library
Pass by const& if you only want to read from a variable (and not copy it)
Mark functions that should not throw noexcept
std::size_t is guaranteed to be large enough to hold every possible index, while int is not

Check if the string is not empty before accessing the first element

bool isNum(const std::string& str) noexcept {
 if (str.empty())
 return false;
 if (std::isdigit(str.front()) || (str.length() > 1 && (str.front() == '+' || str.front() == '-')))
 return std::all_of(str.cbegin() + 1, str.cend(), ::isdigit);
 return false;
}

Question 13

I like your suggestions, but I think the code has a bug. It would accept the string "+" even though it has no digits (because std::all_of returns true if the range is empty).

Question 14

@AdrianMcCarthy You're right! I actually noticed that I'd have to be careful using std::all_of, but then I forgot about it when I wrote this answer :P Thanks! It's fixed now :)

Question 15

There seems to be a bug present. This version returns false for isNum("1") and true for isNum("1s").

Question 16

@user2296177 It doesn't even compile :/ Well, no, I didn't test it. Should have. I tested it now correctly... Thanks

Question 17

Lots of good info in the other answers, but I want to add a style tip I learned for range-checking. I'd argue that this:

number[0] >= '0' && number[0] <= '9'

is harder to read than this:

'0' <= number[0] && number[0] <= '9'

These expressions mean exactly the same thing, but the second makes it obvious at a glance that you're looking to see if the character is inside a range, what the bounds of that range is, and whether the bounds are inclusive or exclusive.

(Obviously, it would be even better to use std::isdigit. But if you had to implement your own isdigit, putting all the values in order would make it easier to read.)

Likewise, to check if something is outside the range, you move the value outside the bounds:

number[0] < '0' || '9' < number[0]

If you read it aloud, it may sound a little like Yoda-speak until you get used to it. But if you're skimming through the code, your visual pattern matcher will immediately recognize this as an out-of-bounds test.

Question 18

is there a better way of writing it, e.g. streamlining it, making it shorter?

Don't repeat yourself. You have some duplication in the digit check between the first character and the rest of the string.

std::string may have null characters embedded in its contents. Unless you intend to read up to the first encountered null character, prefer to use the size/length member function.

#include <cctype>
#include <string>
bool issigned(char c) {
 return c == '-' || c == '+';
}
bool isnumber(const std::string& number) {
 std::string::size_type pos = 0;
 if (issigned(number[pos])) {
 ++pos;
 }
 for (; pos < number.size(); ++pos) {
 if (!std::isdigit(number[pos])) {
 return false;
 }
 }
 return !number.empty();
}

An alternative approach would be to use std::strtol. strtol stops reading the string at the first character it cannot recognize as part of a number. This may be the first encountered null character, or it may be the first numeric character greater than or equal to base.

Note 1: This also handles leading whitespace.

Note 2: c++11 guarantees that std::string is laid out contiguously.

Note 3: c++11 guarantees that the null character is returned when pos == size().

#include <cstdlib>
#include <string>
bool isnumber(const std::string& number) {
 char* first;
 std::strtol(number.c_str(), &first, 10);
 const auto last = number.c_str() + number.size();
 return !number.empty() && !std::isspace(number[0]) && first == last;
}

Question 19

You know std::strtol discards leading whitespace even if you choose base 10? Better make it:

char* first; return !number.empty() && !std::isspace(number[0]) && (std::strtol(number.c_str(), &first, 10), first) == number.c_str() + number.size();

Question 20

I did note that std::strtol does discard leading whitespace. But updated with the isspace check.

Question 21

To "make it shorter" as requested you can rearrange the code so that expressions which are currently repeated (e.g. number[...]) are no longer repeated.

bool isNum(const string& number) {
 // set to true if we find a number
 bool rc = false;
 for (size_t i = 0; ; ++i) {
 // load the character
 char c = number[i];
 // test for end of string
 if (!c)
 return rc;
 // test for digit
 if (c >= '0' && c <= '9')
 // found digit
 rc = true;
 else {
 // sign is allowed on first character only
 if ((i != 0) || (c != '-' && c != '+'))
 return false;
 }
 }
}

Note:

I moved the end-of-string test out of the for and into the loop, so that it too could use c instead of number[i].
This code is slightly inefficient, because it sets rc = true more than once. On the other hand there's less code ... the boolean rc is a result of my combining the handling of the first character into the loop which handles subsequent characters. What you did the OP (i.e. separate code to handle the special case for when i is 0) makes the code longer but faster (see also Loop unrolling).

You could also rewrite the above as a switch expression:

bool isNum(const string& number) {
 // set to true if we find a number
 bool rc = false;
 for (size_t i = 0; ; ++i) {
 // load the character
 switch (number[i]) {
 case '0':
 case '1':
 case '2':
 case '3':
 case '4':
 case '5':
 case '6':
 case '7':
 case '8':
 case '9':
 rc = true;
 break;
 case '+':
 case '-':
 if (i != 0)
 return false;
 break;
 case '0円': // end of string
 return rc;
 default:
 return false;
 }
 }
}

I think this implementation using a switch statement is commendably clear; but it wouldn't generalize well to parsing something more complicated for example a double-precision number.

Question 22

You assume no embedded NULs. That's something which should be tested...

Question 23

Apparently you're right; I didn't know that. Can a std::string contain embedded nulls? To fix that I suppose you do the for loop while i < number.length(), return false if you find a NUL (which counts as a not-a-digit), and return rc after you stop running the loop.

Question 24

A different approach also using the standard library.

bool isnum(const std::string& num)
{
 if ( num.empty() ) return false;
 size_t start=0;
 if ( ( num[start] == '+') || ( num[start] == '-' ) ) start++;
 if ( start == num.length())
 return false;
 return num.find_first_not_of("0123456789",start) == std::string::npos;
}

Not sure what are the requirements, usually one would drop spaces at least from the beginning of the string for this sort of problem.

user2296177 user2296177 3,5831 gold badge15 silver badges37 bronze badges · Answer 1 · 2017-05-05 03:22:10Z

In the cctype header, you have the std::isdigit(int) function. You can use this instead of your conditions that check if the character is between '0' and '9'.
You index into a std::string with an int inside the for loop. Use std::string::size_type as that is the proper type that can index into a std::string (int might be too small).
Mark the function as noexcept.
Consider passing the std::string parameter by reference if you're planning on using large number strings (~20 character numbers). This prevents unnecessary copying from dynamically allocated strings.
Check vs. the size instead of indexing into the string and checking if the current character is '0円'.
Qualify standard types with std::. Don't use using namespace std;.
There is no need to negate that first condition.
Bug: isNum("-") and isNum("+") return true. An additional check is required if the first character is '+' or '-'.
Naming is important. Your function is called isNum(), but I believe that isInt() better describes it; the function does not detect floating point numbers. Please also consider this comment by Toby Speight in regards to naming.

After changes are applied:

#include <string>
#include <cctype>
bool isinteger(std::string const& n) noexcept
{
 if (std::isdigit(n[0]) || (n.size() > 1 && (n[0] == '-' || n[0] == '+')))
 {
 for (std::string::size_type i{ 1 }; i < n.size(); ++i)
 if (!std::isdigit(n[i]))
 return false;
 return true;
 }
 return false;
}

Note 1: I've renamed the function to isinteger() to keep consistency with other similar functions such as isdigit(). I've also reordered the initial condition to take advantage of boolean short circuiting.

Note 2: As Rakete1111 points out, if your compiler hasn't caught up to C++11 yet, accessing the first character of an empty string is undefined behaviour.

The condition should be modified like so:

// check that the string is not empty, only if we're working with < C++11
if ((!n.empty() && std::isdigit(n[0])) || (n.size() > 1 && (n[0] == '-' || n[0] == '+')))
{
 // ...
}

However, if your compiler is C++11 (or later) compliant, you can safely leave it; accessing the first character of empty strings returns '0円' in that case.

See here for more information: http://en.cppreference.com/w/cpp/string/basic_string/operator_at

If this function is in the global namespace, then a name beginning with is may be a poor choice, as it can't then be safely used in a translation unit that also includes C standard headers (i.e. not through their C++ wrappers).
Note that if the string is empty, your code has undefined behavior.
@Deduplicator What do you mean it's fine? It is not, std:: isdigit(n[0]) is the first executed statement, and it accesses n[0]. If n is empty, you have undefined behavior
@Rakete1111: See en.cppreference.com/w/cpp/string/basic_string/operator_at
Note that even before C++11, it's OK since he's using the const form of the operator.

Edward Edward 67.2k4 gold badges120 silver badges284 bronze badges · Answer 2 · 2017-05-05 11:02:32Z

The other reviews have hit most of the important points, so I'll just provide a single line alternative that uses the C++11 std::regex_match. That function looks like this:

bool wasInteger(const std::string &num) {
 return std::regex_match(num, std::regex("[+-]?[0-9]+"));
}

With a regex this simple, I might not bother with a wrapper function.

Test your function

I also wrote (and encourage you to write) a simple test program:

#include <iostream>
#include <iomanip>
#include <string>
#include <regex>
bool wasInteger(const std::string &num) {
 return std::regex_match(num, std::regex("[+-]?[0-9]+"));
}
struct Test {
 std::string str;
 bool expected;
};
static const std::string CSI{"\x1b["};
static const std::string RED{CSI + "31m"};
static const std::string GREEN{CSI + "32m"};
static const std::string RESET{CSI + "0m"};
int main() {
 static const Test tests[]{
 {"+", false},
 {"-", false},
 {"0", true},
 {"3", true},
 {"9", true},
 {"a", false},
 {"99a9", false},
 {"9909", true},
 {"", false},
 {"-3.14", false},
 {"+32768", true},
 {"-32768", true},
 };
 static const std::string badgood[]{ 
 RED + "[BAD] " + RESET, 
 GREEN + "[OK] "+ RESET
 }; 
 for (const auto &t : tests) {
 auto result = wasInteger(t.str);
 std::cout << badgood[result == t.expected] 
 << std::boolalpha << "got " << result 
 << ", expected " << t.expected 
 << " from \"" << t.str << "\"\n";
 }
}

Results

When I run the program I get the following output:

[OK] got false, expected false from "+"
[OK] got false, expected false from "-"
[OK] got true, expected true from "0"
[OK] got true, expected true from "3"
[OK] got true, expected true from "9"
[OK] got false, expected false from "a"
[OK] got false, expected false from "99a9"
[OK] got true, expected true from "9909"
[OK] got false, expected false from ""
[OK] got false, expected false from "-3.14"
[OK] got true, expected true from "+32768"
[OK] got true, expected true from "-32768"

Note: I've done some simple timing tests and it seems that using regex is orders of magnitude slower than other solutions. 300ms (normal loop) vs 8000ms (regex). Excellent point about testing +1.
Yes, it's often the case that a regex is slower. One can use std::regex::optimize and factor out the regex into a static const, and enable compiler optimizations, but that only gets us so far. In cases for which raw performance is imperative, a regex, especially for simple patterns like this one, is often not the optimum choice. However, if the goal is to optimize the programmer's productivity, it's often, as in this case, much simpler (for the human!) than all other correct alternatives.

Rakete1111 Rakete1111 2,5821 gold badge16 silver badges23 bronze badges · Answer 3 · 2017-05-05 10:41:32Z

10

\$\begingroup\$

Use the standard library
Pass by const& if you only want to read from a variable (and not copy it)
Mark functions that should not throw noexcept
std::size_t is guaranteed to be large enough to hold every possible index, while int is not

Check if the string is not empty before accessing the first element

bool isNum(const std::string& str) noexcept {
 if (str.empty())
 return false;
 if (std::isdigit(str.front()) || (str.length() > 1 && (str.front() == '+' || str.front() == '-')))
 return std::all_of(str.cbegin() + 1, str.cend(), ::isdigit);
 return false;
}

Share

edited May 6, 2017 at 10:33

answered May 5, 2017 at 10:41

Rakete1111's user avatar

Rakete1111 Rakete1111

2,5821 gold badge16 silver badges23 bronze badges

\$\endgroup\$

4

\$\begingroup\$ I like your suggestions, but I think the code has a bug. It would accept the string "+" even though it has no digits (because std::all_of returns true if the range is empty). \$\endgroup\$

Adrian McCarthy
– Adrian McCarthy

2017年05月05日 16:50:58 +00:00
Commented May 5, 2017 at 16:50
\$\begingroup\$ @AdrianMcCarthy You're right! I actually noticed that I'd have to be careful using std::all_of, but then I forgot about it when I wrote this answer :P Thanks! It's fixed now :) \$\endgroup\$

Rakete1111
– Rakete1111

2017年05月05日 16:55:10 +00:00
Commented May 5, 2017 at 16:55
\$\begingroup\$ There seems to be a bug present. This version returns false for isNum("1") and true for isNum("1s"). \$\endgroup\$

user2296177
– user2296177

2017年05月06日 00:23:17 +00:00
Commented May 6, 2017 at 0:23
\$\begingroup\$ @user2296177 It doesn't even compile :/ Well, no, I didn't test it. Should have. I tested it now correctly... Thanks \$\endgroup\$

Rakete1111
– Rakete1111

2017年05月06日 10:34:41 +00:00
Commented May 6, 2017 at 10:34

Add a comment |

Adrian McCarthy Adrian McCarthy 3001 silver badge7 bronze badges · Answer 4 · 2017-05-05 16:43:26Z

Lots of good info in the other answers, but I want to add a style tip I learned for range-checking. I'd argue that this:

number[0] >= '0' && number[0] <= '9'

is harder to read than this:

'0' <= number[0] && number[0] <= '9'

These expressions mean exactly the same thing, but the second makes it obvious at a glance that you're looking to see if the character is inside a range, what the bounds of that range is, and whether the bounds are inclusive or exclusive.

(Obviously, it would be even better to use std::isdigit. But if you had to implement your own isdigit, putting all the values in order would make it easier to read.)

Likewise, to check if something is outside the range, you move the value outside the bounds:

number[0] < '0' || '9' < number[0]

If you read it aloud, it may sound a little like Yoda-speak until you get used to it. But if you're skimming through the code, your visual pattern matcher will immediately recognize this as an out-of-bounds test.

Snowhawk Snowhawk 6,7701 gold badge19 silver badges37 bronze badges · Answer 5 · 2017-05-06 07:10:34Z

is there a better way of writing it, e.g. streamlining it, making it shorter?

Don't repeat yourself. You have some duplication in the digit check between the first character and the rest of the string.

std::string may have null characters embedded in its contents. Unless you intend to read up to the first encountered null character, prefer to use the size/length member function.

#include <cctype>
#include <string>
bool issigned(char c) {
 return c == '-' || c == '+';
}
bool isnumber(const std::string& number) {
 std::string::size_type pos = 0;
 if (issigned(number[pos])) {
 ++pos;
 }
 for (; pos < number.size(); ++pos) {
 if (!std::isdigit(number[pos])) {
 return false;
 }
 }
 return !number.empty();
}

An alternative approach would be to use std::strtol. strtol stops reading the string at the first character it cannot recognize as part of a number. This may be the first encountered null character, or it may be the first numeric character greater than or equal to base.

Note 1: This also handles leading whitespace.

Note 2: c++11 guarantees that std::string is laid out contiguously.

Note 3: c++11 guarantees that the null character is returned when pos == size().

#include <cstdlib>
#include <string>
bool isnumber(const std::string& number) {
 char* first;
 std::strtol(number.c_str(), &first, 10);
 const auto last = number.c_str() + number.size();
 return !number.empty() && !std::isspace(number[0]) && first == last;
}

You know std::strtol discards leading whitespace even if you choose base 10? Better make it: char* first; return !number.empty() && !std::isspace(number[0]) && (std::strtol(number.c_str(), &first, 10), first) == number.c_str() + number.size();
I did note that std::strtol does discard leading whitespace. But updated with the isspace check.

ChrisW ChrisW 13.1k1 gold badge35 silver badges76 bronze badges · Answer 6 · 2017-05-07 10:15:50Z

To "make it shorter" as requested you can rearrange the code so that expressions which are currently repeated (e.g. number[...]) are no longer repeated.

bool isNum(const string& number) {
 // set to true if we find a number
 bool rc = false;
 for (size_t i = 0; ; ++i) {
 // load the character
 char c = number[i];
 // test for end of string
 if (!c)
 return rc;
 // test for digit
 if (c >= '0' && c <= '9')
 // found digit
 rc = true;
 else {
 // sign is allowed on first character only
 if ((i != 0) || (c != '-' && c != '+'))
 return false;
 }
 }
}

Note:

I moved the end-of-string test out of the for and into the loop, so that it too could use c instead of number[i].
This code is slightly inefficient, because it sets rc = true more than once. On the other hand there's less code ... the boolean rc is a result of my combining the handling of the first character into the loop which handles subsequent characters. What you did the OP (i.e. separate code to handle the special case for when i is 0) makes the code longer but faster (see also Loop unrolling).

You could also rewrite the above as a switch expression:

bool isNum(const string& number) {
 // set to true if we find a number
 bool rc = false;
 for (size_t i = 0; ; ++i) {
 // load the character
 switch (number[i]) {
 case '0':
 case '1':
 case '2':
 case '3':
 case '4':
 case '5':
 case '6':
 case '7':
 case '8':
 case '9':
 rc = true;
 break;
 case '+':
 case '-':
 if (i != 0)
 return false;
 break;
 case '0円': // end of string
 return rc;
 default:
 return false;
 }
 }
}

I think this implementation using a switch statement is commendably clear; but it wouldn't generalize well to parsing something more complicated for example a double-precision number.

You assume no embedded NULs. That's something which should be tested...
Apparently you're right; I didn't know that. Can a std::string contain embedded nulls? To fix that I suppose you do the for loop while i < number.length(), return false if you find a NUL (which counts as a not-a-digit), and return rc after you stop running the loop.

didiz didiz 1114 bronze badges · Answer 7 · 2017-05-06 09:29:07Z

A different approach also using the standard library.

bool isnum(const std::string& num)
{
 if ( num.empty() ) return false;
 size_t start=0;
 if ( ( num[start] == '+') || ( num[start] == '-' ) ) start++;
 if ( start == num.length())
 return false;
 return num.find_first_not_of("0123456789",start) == std::string::npos;
}

Not sure what are the requirements, usually one would drop spaces at least from the beginning of the string for this sort of problem.

Stack Exchange Network

Checking if each char in a string is a decimal digit

7 Answers 7

Test your function

Results

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Linked

Hot Network Questions

Checking if each char in a string is a decimal digit

7 Answers 7

Test your function

Results

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Linked

Related

Hot Network Questions