I want to detect $number
substring in std::string
and then replace it with $number + 1
.
For example, the string Hello9ドルWorld1ドル
should become Hello10ドルWorld2ドル
.
Here's my code:
#include <iostream>
#include <string>
void modifyDollarNumber(std::string &str)
{
for (size_t i = str.length(); i --> 0 ;)
{
if (str[i] == '$')
{
size_t j = i + 1;
while (j < str.length() && isdigit(str[j]))
{
++j;
}
size_t len = j - (i + 1);
if (len)
{
std::string sub = str.substr(i + 1, len);
int num = std::stoi(sub) + 1;
str.erase(i + 1, len);
sub = std::to_string(num);
str.insert(i + 1, sub);
}
}
}
}
int main()
{
std::string str = "!$@#34ドル1ドル%^&5ドル*1ドル$!%91ドル12ドル@3ドル";
modifyDollarNumber(str);
std::cout << "Result : " << str << '\n';
}
And I can get the result I want which is
Result : !$@#35ドル2ドル%^&6ドル*2ドル$!%92ドル13ドル@4ドル
Program ended with exit code: 0
But I want to improve my code so it can be as fast as possible.
How can I simplify modifyDollarNumber()
function?
-
2\$\begingroup\$ 'fast' is generally within a particular usage; does it need to be fast for lots of small strings like the example, or long strings with many numbers...? \$\endgroup\$Phil H– Phil H2018年08月02日 14:02:13 +00:00Commented Aug 2, 2018 at 14:02
-
1\$\begingroup\$ What's the range of the numbers to be replaced? Can they be negative? \$\endgroup\$Toby Speight– Toby Speight2018年08月02日 14:51:45 +00:00Commented Aug 2, 2018 at 14:51
-
\$\begingroup\$ @PhilH I will be processing a long string with many $numbers \$\endgroup\$Zack Lee– Zack Lee2018年08月02日 16:44:35 +00:00Commented Aug 2, 2018 at 16:44
-
\$\begingroup\$ @TobySpeight They can’t be negative so they should be larger than 0 and should be integers only. \$\endgroup\$Zack Lee– Zack Lee2018年08月02日 16:46:08 +00:00Commented Aug 2, 2018 at 16:46
-
1\$\begingroup\$ Good - I've used an unsigned type in my answer, on that assumption. \$\endgroup\$Toby Speight– Toby Speight2018年08月02日 16:52:03 +00:00Commented Aug 2, 2018 at 16:52
3 Answers 3
Few observations.
I. You don't need to scan the whole string, character-by-character. str.find('$')
will do it for you (and remember to start the second search, and all subsequent ones, from the last reference detected.)
II. Rather than taking substrings, parsing them, erasing and inserting—why not increment in-place? You know exactly your number span (from the one past '$' to the one past the last digit.) If this string only consists of '9's, insert a single '1' right after '$' and change all nines into zeroes. If there are digits less than nine, increment the rightmost non-nine, and change all the subsequent nines into zeroes, if any, no insertion/erasure required. Proceed with the next search.
III. As @AJNeufeld mentioned, string reconstruction can be done in a separate buffer, though vast testing is needed to decide if this is a faster option.
-
\$\begingroup\$ thanks for your tips. regarding your second tip, i’m looking for a solution that is easily customizable. for example, it should be easy to change from $(number + 1) to $(number * 2) later. \$\endgroup\$Zack Lee– Zack Lee2018年08月02日 17:09:44 +00:00Commented Aug 2, 2018 at 17:09
You've misspelt std::size_t
throughout, and also std::isdigit
(which is missing the necessary include of <cctype>
- note also that passing plain char
to the character classification functions is risky - cast to unsigned char
first).
The in-place modification of your string involves copying increasing parts of it multiple times (even when the replacement string is of the same length). You can avoid that quite simply by using std::string::replace()
instead of erase()
+insert()
:
std::string sub = str.substr(i + 1, len);
int num = std::stoi(sub) + 1;
str.replace(i + 1, len, std::to_string(num));
This still leaves a lot of copying when the increment adds a digit (9, 99, 999, ...) - I think your test-case should include at least one of each. To avoid that problem (and to make the usage more intuitive to the caller), it may be better to write a function that returns a copy of the string (so accept it by const reference):
#include <algorithm>
#include <cctype>
#include <string>
std::string modifyDollarNumber(const std::string& str)
{
std::string result;
result.reserve(str.length());
auto out = std::back_inserter(result);
auto pos = str.cbegin();
while (pos != str.cend()) {
auto dollar_pos = std::find(pos, str.cend(), '$');
std::copy(pos, dollar_pos, out);
// no more substitutions?
if (dollar_pos == str.cend()) { break; }
// copy the dollar sign
result += '$';
pos = dollar_pos + 1;
// is it followed by a number?
auto digit_end = std::find_if(pos, str.end(),
[](unsigned char c){ return !std::isdigit(c); });
if (digit_end == pos) { continue; }
// copy the incremented number
auto num = std::stoul(std::string{pos, digit_end});
result.append(std::to_string(num+1));
pos = digit_end;
}
return result;
}
#include <iostream>
int main()
{
const std::string str = "1ドル $-22 027ドル $$ $";
std::cout << "Result : " << modifyDollarNumber(str) << '\n';
}
But if raw speed is more important than readability, you'll need to benchmark with some representative inputs to see which is best for you.
-
\$\begingroup\$ is it problematic to write simply ‘size_t’ and ‘isdigit()’ even if it compiles? \$\endgroup\$Zack Lee– Zack Lee2018年08月02日 17:01:17 +00:00Commented Aug 2, 2018 at 17:01
-
1\$\begingroup\$ Yes - even though the compiler you're using right now defines these symbols in the global namespace as well as in
std
, there's no guarantee that it will continue to do so, and it almost certainly won't be compilable by at least some people who want to use your code. Stick to what's guaranteed in the standard, and your code will be much more portable (and therefore useful). \$\endgroup\$Toby Speight– Toby Speight2018年08月02日 17:15:11 +00:00Commented Aug 2, 2018 at 17:15 -
1\$\begingroup\$ Of course, if you really want to miss off the
std
prefix, you canusing std::size_t;
early in your function. But I don't generally recommend that unless you're using it on almost every line. \$\endgroup\$Toby Speight– Toby Speight2018年08月02日 17:16:33 +00:00Commented Aug 2, 2018 at 17:16
String manipulation can be slow. Inserting and deleting characters requires shifting characters in memory, if modifying in place, or allocating and freeing temporaries.
It could be faster to allocate one destination character buffer (char[]
or wchar_t[]
), and copy characters into it, performing the translations as required, and then converting into a std::string
at the end.
The destination buffer would need space for str.length() + std::count(str.begin(), str.end(), '$')
characters, since 9ドル
can become 10ドル
, etc.
-
\$\begingroup\$ Note that using a fresh string as buffer is approx. as effective. \$\endgroup\$bipll– bipll2018年08月02日 15:00:24 +00:00Commented Aug 2, 2018 at 15:00
-
\$\begingroup\$ I would appreciate a working example if possible. \$\endgroup\$Zack Lee– Zack Lee2018年08月02日 16:55:49 +00:00Commented Aug 2, 2018 at 16:55