Here is a simplified sample. Basically, it does checks on a string from a string list. If the check passes, it will remove that string (filterStringOut(i);
), and it is no longer necessary to continue any other checks. Thus continue
to the next string.
void ParsingTools::filterStrings(QStringList &sl)
{
/* Filter string list */
QString s;
for (int i=0; i<sl.length(); i++) {
s = sl.at(i);
// Improper length, remove
if (s.length() != m_Length) {
filterStringOut(i);
continue; // Once removed, can move on to the next string
}
// Lacks a substring, remove
for (int j=0; j<m_Include.length(); j++) {
if (!s.contains(m_Include.at(j))) {
filterStringOut(i);
/* break; and continue; */
}
}
// Contains a substring, remove
for (int j=0; j<m_Exclude.length(); j++) {
if (s.contains(m_Exclude.at(j))) {
filterStringOut(i);
/* break; and continue; */
}
}
}
}
How ought one continue the outer loop from inside a nested loop?
My best guess is to use goto
and place a label at the end of the outer loop. That prompted me to ask this question, given how taboo goto
can be.
In the c++ IRC chat, it was suggested that I place the for
loops in bool functions, that return true if a check passed. thus
if ( containsExclude(s)) continue;
if (!containsInclude(s)) continue;
or that I simply create a local boolean, set it to true break
, check bool and continue if true.
Given that I am using this in a parser, I actually do need to prioritize performance in this example. Is this a situation where goto
is still useful, or is it a case where I need to restructure my code?
7 Answers 7
Don't nest: convert to functions instead. And have those functions return true
if they perform their action and the subsequent steps can be skipped; false
otherwise. That way you completely avoid the whole problem of how to break out of one level, continue within another, etc as you just chain the calls with ||
(this assumes C++ stops processing an expression on a true
; I think it does).
So your code might end up looking like the following (I haven't written C++ in years, so it likely contains syntax errors, but should give you the general idea):
void ParsingTools::filterStrings(QStringList &sl)
{
QString s;
for (int i=0; i<sl.length(); i++) {
s = sl.at(i);
removeIfImproperLength(s, i) ||
removeIfLacksRequiredSubstring(s, i) ||
removeIfContainsInvalidSubstring(s, i);
}
}
bool removeIfImproperLength(QString s, int i) {
if (s.length() != m_Length)
{
filterStringOut(i);
return true;
}
return false;
}
bool removeIfLacksSubstring(QString s, int i) {
for (int j=0; j<m_Include.length(); j++) {
if (!s.contains(m_Include.at(j))) {
filterStringOut(i);
return true;
}
}
return false;
}
bool removeIfContainsInvalidSubstring(QString s, int i) {
for (int j=0; j<m_Exclude.length(); j++) {
if (s.contains(m_Exclude.at(j))) {
filterStringOut(i);
return true;
}
}
return false;
}
-
1"reoveIfImproperLength" Typo. :)Neil– Neil2018年05月04日 08:04:33 +00:00Commented May 4, 2018 at 8:04
-
5It is better if the three condition-checking functions remain side-effect free (i.e. instead of doing "remove-if", just return the boolean condition and let the caller (
ParsingTools::filterStrings
) call thefilterStringOut(i)
function, as shown in dagnelies's answer.rwong– rwong2018年05月04日 09:28:02 +00:00Commented May 4, 2018 at 9:28 -
So you're using function call semantics as the basis for C++'s missing break statements. Very clever.Ryan Reich– Ryan Reich2018年05月15日 05:54:54 +00:00Commented May 15, 2018 at 5:54
From a more bird view perspective, I would refactor the code so that it looks like this... (in pseudo code, it's too long ago I touched C++)
void filterStrings(sl)
{
/* Filter string list */
for (int i=0; i<sl.length(); i++) {
QString s = sl.at(i);
if(!isProperString(s)) {
filterStringOut(i);
}
}
}
bool isProperString(s) {
if (s.length() != m_Length)
return false; // Improper length
for (int j=0; j<m_Include.length(); j++) {
if (!s.contains(m_Include.at(j))) {
return false; // Lacks a substring
}
}
for (int j=0; j<m_Exclude.length(); j++) {
if (s.contains(m_Exclude.at(j))) {
return false; // Contains a substring
}
}
return true; // all tests passed, it's a proper string
}
This is IMHO cleaner because it clearly separates what constitutes a proper string, and what you do when it isn't.
You could even go one step further and use built-in filter methods like myProperStrings = allMyStrings.filter(isProperString)
I really like how @dagnelies starts. Short and to the point. A good use of high level abstraction. I'm only tweaking it's signature and avoiding a needless negative.
void ParsingTools::filterStrings(QStringList &sl)
{
for (int i=0; i<sl.length(); i++) {
QString s = sl.at(i);
if ( isRejectString(s) ) {
filterStringOut(i);
}
}
}
However, I like how @DavidArno breaks out the requirement tests as individual functions. Sure the whole thing becomes longer but every function is wonderfully tiny. Their names avoid the need for comments to explain what they are. I just don't like that they take on the extra responsibility of calling filterStringOut()
.
By the way, yes C++ will stop evaluation of a ||
chain on a true
so long as you haven't overloaded the ||
operator. This is called short circuit evaluation. But this is a trivial micro optimization that you are free to ignore as your read the code so long as the functions are side effect free (such as the ones below).
The following should make the definition of a reject string clear without dragging you through needless details:
bool isRejectString(QString s) {
return isDifferentLength(s, m_Length)
|| sansRequiredSubstring(s, m_Include)
|| hasForbiddenSubstring(s, m_Exclude)
;
}
Relieved of the need to call filterStringOut()
the requirement test functions become shorter and their names are much simpler. I've also put everything they're dependent on in their parameter list to make it easier to understand them without looking inside.
bool isDifferentLength(QString s, int length) {
return ( s.length() != length );
}
bool sansRequiredSubstring(QString s, QStringList &include) {
for (int j=0; j<include.length(); j++) {
QString requiredSubstring = include.at(j);
if ( !s.contains(requiredSubstring) ) {
return true;
}
}
return false;
}
bool hasForbiddenSubstring(QString s, QStringList &exclude) {
for (int j=0; j<exclude.length(); j++) {
QString forbiddenSubstring = exclude.at(j);
if ( s.contains(forbiddenSubstring) ) {
return true;
}
}
return false;
}
I added requiredSubstring
and forbiddenSubstring
for the humans. Will they slow you down? Test and find out. It's easier to make readable code actually fast then to make prematurely optimized code readable or actually fast.
If the functions turn out to slow you down look into inline functions before you subject the humans to unreadable code. Again, don't assume this will give you speed. Test.
I think you'll find any of these more readable than nested for loops. Those, combined with the if
's, were starting to give you a real arrow anti-pattern. I think the lesson here is that tiny functions are a good thing.
-
1Although it’s a conbination of other two answers, this adds a lot of value. Make it "for humans", test performance before you decide to "obscure" the code, and make easy to test. Great stuff!carlossierra– carlossierra2018年05月11日 11:26:28 +00:00Commented May 11, 2018 at 11:26
-
1Actually, my use of
! isProperString
rather thanisImproperString
was intentional. I tend to avoid negations in function names. Imagine you need to check if it's indeed a proper string later on, you'll need!isImproperString
which is IMHO more prone to confusion because of the double negation.dagnelies– dagnelies2018年05月12日 08:36:22 +00:00Commented May 12, 2018 at 8:36 -
@dagnelies better?candied_orange– candied_orange2018年05月15日 01:37:02 +00:00Commented May 15, 2018 at 1:37
Just use a lambda for the predicate, and then use the power of standard algorithms and short-circuiting. No need for any convoluted or exotic control-flow:
void ParsingTools::filterStrings (QStringList& list)
{
for (int i = list.size(); i--;) {
const auto& s = list[i];
auto contains = [&](const QString& x) { return s.contains(x); };
if (s.size() != m_Length
|| !std::all_of(m_Include.begin(), m_Include.end(), contains)
|| std::any_of(m_Exclude.begin(), m_Exclude.end(), contains))
filterStringOut(i);
}
}
There is also the option to make the content of the outer loop (the one you want to continue) a lambda, and simply use return
.
It's surprisingly easy if you know lambdas; you basically start your loop interior with [&]{
and end it with }()
; inside you can use return;
at any time to leave it:
void ParsingTools::filterStrings(QStringList &sl)
{
/* Filter string list */
QString s;
for (int i=0; i<sl.length(); i++) {
[&]{ // start a lamdba defintion
s = sl.at(i);
// Improper length, remove
if (s.length() != m_Length) {
filterStringOut(i);
// continue; // Once removed, can move on to the next string
return; // happily return here, this will continue
}
// Lacks a substring, remove
for (int j=0; j<m_Include.length(); j++) {
if (!s.contains(m_Include.at(j))) {
filterStringOut(i);
/* break; and continue; */ return; // happily return here, this will continue the i-loop
}
}
// Contains a substring, remove
for (int j=0; j<m_Exclude.length(); j++) {
if (s.contains(m_Exclude.at(j))) {
filterStringOut(i);
/* break; and continue; */ return; // happily return here, this will continue the i-loop
}
}
}() // close/end the lambda definition and call it
}
}
-
3(1) To actually call the lambda right away, at the end of the closing curly brace, one must do the call (with a pair of parentheses, potentially with the list of arguments or without). (2) all three places that use
continue
andbreak
must be replaced withreturn
. Your code appears to leave the first place (which usescontinue
) unchanged, but that must be changed as well, because the code is inside the lambda and thecontinue
statement couldn't find a scope that is a loop.rwong– rwong2018年05月05日 06:18:44 +00:00Commented May 5, 2018 at 6:18 -
I wrote that while waiting on red lights. Corrected.Aganju– Aganju2018年05月05日 15:27:14 +00:00Commented May 5, 2018 at 15:27
I think @dganelies has the right idea as a starting point, but I think I'd consider going a step further: write a generic function that can carry out the same pattern for (almost) any container, criteria and action:
template <class Container, class Action, class Condition>
void map_if(Container &container, Action action, Condition cond) {
for (std::size_t i = 0; i < container.length(); i++) {
auto s = container.at(i);
if (cond(s))
action(i);
}
}
Your filterStrings
would then just define the criteria, and pass the appropriate action:
void ParsingTools::filterStrings(QStringList const &sl)
{
auto isBad = [&](QString const &s) {
if (s.length() != m_Length)
return true;
for (int j = 0; j < m_Include.length(); j++) {
if (!s.contains(m_Include.at(j))) {
return true;
}
}
for (int j = 0; j < m_Exclude.length(); j++) {
if (s.contains(m_Exclude.at(j))) {
return true;
}
}
return false;
};
map_if(sl, filterStringOut, isBad);
}
Of course, there are other ways to approach that basic problem as well. For example, using the standard library, you seem to want something on the same general order as std::remove_if
.
Several answers suggest a major refactor of the code. This is probably not a bad way to go, but I wanted to provide an answer which is more in line with the question itself.
Rule #1: Profile before optimizing
Always profile the results before attempting an optimization. You may find yourself wasting a great deal of time if you don't.
That being said...
As it is, I have personally tested this sort of code on MSVC. The booleans are the way to go. Name the boolean something semantically meaningful like containsString
.
...
boo containsString = true; // true until proven false
// Lacks a substring, remove
for (int j=0; j<m_Include.length(); j++) {
if (!s.contains(m_Include.at(j))) {
filterStringOut(i);
/* break; and continue; */
containsString = false;
}
}
if (!containsString)
continue;
On MSVC (2008), in release mode (typical optimizer settings), the compiler optimized a similar loop down to exactly the same set of opcodes as the goto
version. It was smart enough to see that the value of the boolean was directly tied to control flow, and elided everything away. I have not tested gcc, but I presume it can do similar types of optimization.
This has the advantage over goto
of simply not raising any concern by purists who consider goto
to be harmful, without sacrificing a single instruction's worth of performance.
Explore related questions
See similar questions with these tags.
goto
, despite its bad reputation. Don't fear names - fear concepts.