I had to write a program that would verify that a string contains any digits beside numbers 0-9 and the first '-' sign to indicate if pos/neg. The algorithm works like I want it to, but is there a better way of writing it, e.g. streamlining it, making it shorter?
bool isNum(string number) {
// Checks if first char in buffer contains invalid digits
// Needed because any negative sign after the one used to indicate negative sign.
if (!((number[0] >= '0' && number[0] <= '9') || number[0] == '-' || number[0] == '+'))
return false;
// Checks if each digit after the first is anything but 0-9
for (int i = 1; number[i] != '0円'; i++) {
if (number[i] < '0' || number[i] > '9')
return false;
}
// If neither statement returns false, returns true by default
return true;
}
7 Answers 7
In the
cctype
header, you have thestd::isdigit(int)
function. You can use this instead of your conditions that check if the character is between'0'
and'9'
.You index into a
std::string
with anint
inside the for loop. Usestd::string::size_type
as that is the proper type that can index into astd::string
(int
might be too small).Mark the function as
noexcept
.Consider passing the
std::string
parameter by reference if you're planning on using large number strings (~20 character numbers). This prevents unnecessary copying from dynamically allocated strings.Check vs. the size instead of indexing into the string and checking if the current character is
'0円'
.Qualify standard types with
std::
. Don't useusing namespace std;
.There is no need to negate that first condition.
Bug:
isNum("-")
andisNum("+")
returntrue
. An additional check is required if the first character is'+'
or'-'
.Naming is important. Your function is called
isNum()
, but I believe thatisInt()
better describes it; the function does not detect floating point numbers. Please also consider this comment by Toby Speight in regards to naming.
After changes are applied:
#include <string>
#include <cctype>
bool isinteger(std::string const& n) noexcept
{
if (std::isdigit(n[0]) || (n.size() > 1 && (n[0] == '-' || n[0] == '+')))
{
for (std::string::size_type i{ 1 }; i < n.size(); ++i)
if (!std::isdigit(n[i]))
return false;
return true;
}
return false;
}
Note 1: I've renamed the function to isinteger()
to keep consistency with other similar functions such as isdigit()
. I've also reordered the initial condition to take advantage of boolean short circuiting.
Note 2: As Rakete1111 points out, if your compiler hasn't caught up to C++11 yet, accessing the first character of an empty string is undefined behaviour.
The condition should be modified like so:
// check that the string is not empty, only if we're working with < C++11
if ((!n.empty() && std::isdigit(n[0])) || (n.size() > 1 && (n[0] == '-' || n[0] == '+')))
{
// ...
}
However, if your compiler is C++11 (or later) compliant, you can safely leave it; accessing the first character of empty strings returns '0円'
in that case.
See here for more information: http://en.cppreference.com/w/cpp/string/basic_string/operator_at
-
1\$\begingroup\$ If this function is in the global namespace, then a name beginning with
is
may be a poor choice, as it can't then be safely used in a translation unit that also includes C standard headers (i.e. not through their C++ wrappers). \$\endgroup\$Toby Speight– Toby Speight2017年05月05日 08:43:42 +00:00Commented May 5, 2017 at 8:43 -
4\$\begingroup\$ Note that if the string is empty, your code has undefined behavior. \$\endgroup\$Rakete1111– Rakete11112017年05月05日 10:42:44 +00:00Commented May 5, 2017 at 10:42
-
1\$\begingroup\$ @Deduplicator What do you mean it's fine? It is not,
std:: isdigit(n[0])
is the first executed statement, and it accessesn[0]
. Ifn
is empty, you have undefined behavior \$\endgroup\$Rakete1111– Rakete11112017年05月05日 11:32:58 +00:00Commented May 5, 2017 at 11:32 -
1\$\begingroup\$ @Rakete1111: See en.cppreference.com/w/cpp/string/basic_string/operator_at \$\endgroup\$Deduplicator– Deduplicator2017年05月05日 11:54:42 +00:00Commented May 5, 2017 at 11:54
-
1\$\begingroup\$ Note that even before C++11, it's OK since he's using the
const
form of the operator. \$\endgroup\$JDługosz– JDługosz2017年05月07日 08:09:04 +00:00Commented May 7, 2017 at 8:09
The other reviews have hit most of the important points, so I'll just provide a single line alternative that uses the C++11 std::regex_match
. That function looks like this:
bool wasInteger(const std::string &num) {
return std::regex_match(num, std::regex("[+-]?[0-9]+"));
}
With a regex this simple, I might not bother with a wrapper function.
Test your function
I also wrote (and encourage you to write) a simple test program:
#include <iostream>
#include <iomanip>
#include <string>
#include <regex>
bool wasInteger(const std::string &num) {
return std::regex_match(num, std::regex("[+-]?[0-9]+"));
}
struct Test {
std::string str;
bool expected;
};
static const std::string CSI{"\x1b["};
static const std::string RED{CSI + "31m"};
static const std::string GREEN{CSI + "32m"};
static const std::string RESET{CSI + "0m"};
int main() {
static const Test tests[]{
{"+", false},
{"-", false},
{"0", true},
{"3", true},
{"9", true},
{"a", false},
{"99a9", false},
{"9909", true},
{"", false},
{"-3.14", false},
{"+32768", true},
{"-32768", true},
};
static const std::string badgood[]{
RED + "[BAD] " + RESET,
GREEN + "[OK] "+ RESET
};
for (const auto &t : tests) {
auto result = wasInteger(t.str);
std::cout << badgood[result == t.expected]
<< std::boolalpha << "got " << result
<< ", expected " << t.expected
<< " from \"" << t.str << "\"\n";
}
}
Results
When I run the program I get the following output:
[OK] got false, expected false from "+"
[OK] got false, expected false from "-"
[OK] got true, expected true from "0"
[OK] got true, expected true from "3"
[OK] got true, expected true from "9"
[OK] got false, expected false from "a"
[OK] got false, expected false from "99a9"
[OK] got true, expected true from "9909"
[OK] got false, expected false from ""
[OK] got false, expected false from "-3.14"
[OK] got true, expected true from "+32768"
[OK] got true, expected true from "-32768"
-
2\$\begingroup\$ Note: I've done some simple timing tests and it seems that using regex is orders of magnitude slower than other solutions. 300ms (normal loop) vs 8000ms (regex). Excellent point about testing +1. \$\endgroup\$user2296177– user22961772017年05月06日 00:08:27 +00:00Commented May 6, 2017 at 0:08
-
\$\begingroup\$ Yes, it's often the case that a regex is slower. One can use
std::regex::optimize
and factor out the regex into astatic const
, and enable compiler optimizations, but that only gets us so far. In cases for which raw performance is imperative, a regex, especially for simple patterns like this one, is often not the optimum choice. However, if the goal is to optimize the programmer's productivity, it's often, as in this case, much simpler (for the human!) than all other correct alternatives. \$\endgroup\$Edward– Edward2017年05月06日 02:40:48 +00:00Commented May 6, 2017 at 2:40
- Use the standard library
- Pass by
const&
if you only want to read from a variable (and not copy it) - Mark functions that should not throw
noexcept
std::size_t
is guaranteed to be large enough to hold every possible index, whileint
is notCheck if the string is not empty before accessing the first element
bool isNum(const std::string& str) noexcept { if (str.empty()) return false; if (std::isdigit(str.front()) || (str.length() > 1 && (str.front() == '+' || str.front() == '-'))) return std::all_of(str.cbegin() + 1, str.cend(), ::isdigit); return false; }
-
\$\begingroup\$ I like your suggestions, but I think the code has a bug. It would accept the string "+" even though it has no digits (because std::all_of returns true if the range is empty). \$\endgroup\$Adrian McCarthy– Adrian McCarthy2017年05月05日 16:50:58 +00:00Commented May 5, 2017 at 16:50
-
\$\begingroup\$ @AdrianMcCarthy You're right! I actually noticed that I'd have to be careful using
std::all_of
, but then I forgot about it when I wrote this answer :P Thanks! It's fixed now :) \$\endgroup\$Rakete1111– Rakete11112017年05月05日 16:55:10 +00:00Commented May 5, 2017 at 16:55 -
\$\begingroup\$ There seems to be a bug present. This version returns
false
forisNum("1")
andtrue
forisNum("1s")
. \$\endgroup\$user2296177– user22961772017年05月06日 00:23:17 +00:00Commented May 6, 2017 at 0:23 -
\$\begingroup\$ @user2296177 It doesn't even compile :/ Well, no, I didn't test it. Should have. I tested it now correctly... Thanks \$\endgroup\$Rakete1111– Rakete11112017年05月06日 10:34:41 +00:00Commented May 6, 2017 at 10:34
Lots of good info in the other answers, but I want to add a style tip I learned for range-checking. I'd argue that this:
number[0] >= '0' && number[0] <= '9'
is harder to read than this:
'0' <= number[0] && number[0] <= '9'
These expressions mean exactly the same thing, but the second makes it obvious at a glance that you're looking to see if the character is inside a range, what the bounds of that range is, and whether the bounds are inclusive or exclusive.
(Obviously, it would be even better to use std::isdigit
. But if you had to implement your own isdigit
, putting all the values in order would make it easier to read.)
Likewise, to check if something is outside the range, you move the value outside the bounds:
number[0] < '0' || '9' < number[0]
If you read it aloud, it may sound a little like Yoda-speak until you get used to it. But if you're skimming through the code, your visual pattern matcher will immediately recognize this as an out-of-bounds test.
is there a better way of writing it, e.g. streamlining it, making it shorter?
Don't repeat yourself. You have some duplication in the digit check between the first character and the rest of the string.
std::string
may have null characters embedded in its contents. Unless you intend to read up to the first encountered null character, prefer to use the size
/length
member function.
#include <cctype>
#include <string>
bool issigned(char c) {
return c == '-' || c == '+';
}
bool isnumber(const std::string& number) {
std::string::size_type pos = 0;
if (issigned(number[pos])) {
++pos;
}
for (; pos < number.size(); ++pos) {
if (!std::isdigit(number[pos])) {
return false;
}
}
return !number.empty();
}
An alternative approach would be to use std::strtol
. strtol
stops reading the string at the first character it cannot recognize as part of a number. This may be the first encountered null character, or it may be the first numeric character greater than or equal to base
.
Note 1: This also handles leading whitespace.
Note 2: c++11 guarantees that std::string
is laid out contiguously.
Note 3: c++11 guarantees that the null character is returned when pos == size()
.
#include <cstdlib>
#include <string>
bool isnumber(const std::string& number) {
char* first;
std::strtol(number.c_str(), &first, 10);
const auto last = number.c_str() + number.size();
return !number.empty() && !std::isspace(number[0]) && first == last;
}
-
\$\begingroup\$ You know
std::strtol
discards leading whitespace even if you choose base 10? Better make it:char* first; return !number.empty() && !std::isspace(number[0]) && (std::strtol(number.c_str(), &first, 10), first) == number.c_str() + number.size();
\$\endgroup\$Deduplicator– Deduplicator2017年05月06日 11:05:54 +00:00Commented May 6, 2017 at 11:05 -
\$\begingroup\$ I did note that
std::strtol
does discard leading whitespace. But updated with theisspace
check. \$\endgroup\$Snowhawk– Snowhawk2017年05月06日 20:26:00 +00:00Commented May 6, 2017 at 20:26
To "make it shorter" as requested you can rearrange the code so that expressions which are currently repeated (e.g. number[...]
) are no longer repeated.
bool isNum(const string& number) {
// set to true if we find a number
bool rc = false;
for (size_t i = 0; ; ++i) {
// load the character
char c = number[i];
// test for end of string
if (!c)
return rc;
// test for digit
if (c >= '0' && c <= '9')
// found digit
rc = true;
else {
// sign is allowed on first character only
if ((i != 0) || (c != '-' && c != '+'))
return false;
}
}
}
Note:
- I moved the end-of-string test out of the
for
and into the loop, so that it too could usec
instead ofnumber[i]
. - This code is slightly inefficient, because it sets
rc = true
more than once. On the other hand there's less code ... the booleanrc
is a result of my combining the handling of the first character into the loop which handles subsequent characters. What you did the OP (i.e. separate code to handle the special case for wheni
is0
) makes the code longer but faster (see also Loop unrolling).
You could also rewrite the above as a switch expression:
bool isNum(const string& number) {
// set to true if we find a number
bool rc = false;
for (size_t i = 0; ; ++i) {
// load the character
switch (number[i]) {
case '0':
case '1':
case '2':
case '3':
case '4':
case '5':
case '6':
case '7':
case '8':
case '9':
rc = true;
break;
case '+':
case '-':
if (i != 0)
return false;
break;
case '0円': // end of string
return rc;
default:
return false;
}
}
}
I think this implementation using a switch statement is commendably clear; but it wouldn't generalize well to parsing something more complicated for example a double-precision number.
-
\$\begingroup\$ You assume no embedded
NUL
s. That's something which should be tested... \$\endgroup\$Deduplicator– Deduplicator2017年05月07日 19:53:15 +00:00Commented May 7, 2017 at 19:53 -
1\$\begingroup\$ Apparently you're right; I didn't know that. Can a std::string contain embedded nulls? To fix that I suppose you do the for loop while
i < number.length()
, return false if you find aNUL
(which counts as a not-a-digit), andreturn rc
after you stop running the loop. \$\endgroup\$ChrisW– ChrisW2017年05月07日 20:15:10 +00:00Commented May 7, 2017 at 20:15
A different approach also using the standard library.
bool isnum(const std::string& num)
{
if ( num.empty() ) return false;
size_t start=0;
if ( ( num[start] == '+') || ( num[start] == '-' ) ) start++;
if ( start == num.length())
return false;
return num.find_first_not_of("0123456789",start) == std::string::npos;
}
Not sure what are the requirements, usually one would drop spaces at least from the beginning of the string for this sort of problem.
+
character. \$\endgroup\$