I recently had a discussion with a friend about code maintainability with regards to modifying an iterator inside of the body of a loop (C#
syntax):
List<int> test = new List<int>();
for (int i = 0; i < 30;)
test.AddRange(new int[] { i++, i++, i++ });
The code works fine and, as expected, adds 0 - 29 into the list. However, he pointed out that the execution does look odd (and I agree), and told me about using Enumerable.Range(start, count). I have since switched to using the method and it works as needed.
During our discussion he stated that things like this cause issues with maintainability because they force other developers to pause and examine what the intent is, along with what is actually happening, prior to making changes (we should all be doing this anyway, in my opinion). He stated that things of this nature aren't truly needed and should be refactored to a simpler version for that very reason. I do agree with this statement, but he gave an example of obfuscation that we both agreed would not compile in C# and is undefined behavior in C++. I posted a question on StackOverflow but it was poorly received, thus I believe it may be a better fit for here.
The code he wrote was:
int i = 2;
int c = i++ + ++((i++)++);
I tried to solve it using order of operations, but I get different results when I work it out by hand than when I write it procedurally in code:
int i = 2;
int p1 = i++;
int p2 = p1++;
int p3 = ++p2;
int c = i++ + p3;
The code above compiles in C# and I presume it would have no issues in most languages. This produces a result of c = 6 and i = 4, however trying to solve it by hand gives me c = 10 and i = 6.
Can these issues be solved so the line i++ + ++((i++)++)
will compile and produce a sensible result? If so, what is the result of execution? Maybe there is already a major language (with C-style increment operators) where this works, which I am not aware of?
I've tried the following languages:
- C#
- C++
- Java
- JavaScript
2 Answers
In the languages I'm familiar with (e.g. C, C++, JavaScript, TypeScript, Java), the code <variable>++
captures the value of <variable>
, increments the variable's value, and then yields the captured (pre-increment) value. This only works if <variable>
is something that can be incremented (i.e. an l-value). For instance, (4 + 8)++
would NOT compile, even though you might hope it would yield the value 12 (the value captured before the increment), because (4 + 8)
has no storage that can be incremented.
Your expression ++((i++)++)
looks like it might increment i
three times, yielding (I think) the original value of i
plus one. However, the outer two ++
s are NOT acting on lvalues, and so won't compile.
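As a concrete check of the l-value requirement, here is a minimal C++ sketch (C++ is only one of the languages listed above, and other languages classify these expressions in their own way). It relies on the fact that decltype((expr)) is a reference type exactly when expr is an l-value. The names counter and take_post_increment are hypothetical, chosen for illustration.

```cpp
#include <type_traits>

int counter = 2;  // hypothetical variable, only inspected at compile time below

// decltype((expr)) is T& exactly when expr is an l-value; the operand is never evaluated.
static_assert(std::is_same<decltype((counter)), int&>::value, "a named variable is an l-value");
static_assert(std::is_same<decltype((counter++)), int>::value, "counter++ yields a plain value");
static_assert(std::is_same<decltype((4 + 8)), int>::value, "(4 + 8) has no storage to increment");

// Post-increment: capture the old value, bump the variable, yield the captured value.
int take_post_increment(int& variable) {
    return variable++;
}

// Neither of these compiles, because the operand of ++ is not an l-value:
//   (4 + 8)++;
//   (counter++)++;
```

Calling take_post_increment on a variable holding 2 returns 2 and leaves the variable at 3, which is the capture-then-increment behaviour described above.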
P.S. Although you didn't ask about this, and although I'm not familiar with the details of the C# spec, I'd raise a serious red flag at your initial code:
test.AddRange(new int[] { i++, i++, i++ });
I'd be worried about two things:
- You're expecting the numbers to be added in order, but it's possible that the order of evaluation of the components of { i++, i++, i++ } may not be left-to-right.
- It's even possible that the results may be undefined, with multiple increments happening before the results are captured in the expressions.
Again, I don't know the details of C#, so these may technically not be problems here. However, it's a sign of poor code design when you have to know implementation details before you can be sure an expression does what you want it to.
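For what it's worth, C++ (from C++11 on) does pin this down for braced initializer lists: the clauses are evaluated in the order in which they appear. The sketch below is a C++ analogue of the question's line, not the C# original, and first_three and first_three_explicit are hypothetical names. Function-call arguments are a different matter in C++, where left-to-right order is not guaranteed, so the sketch deliberately does not write f(i++, i++, i++).

```cpp
#include <vector>

// Hypothetical C++ analogue of: new int[] { i++, i++, i++ }
std::vector<int> first_three() {
    int i = 0;
    // Clauses of a braced initializer list are evaluated left to right,
    // each one finished before the next starts (C++11 and later).
    std::vector<int> values{ i++, i++, i++ };
    return values;
}

// A version nobody has to look up: the order is spelled out by the statements.
std::vector<int> first_three_explicit() {
    std::vector<int> values;
    for (int i = 0; i < 3; ++i) {
        values.push_back(i);
    }
    return values;
}
```

Both functions produce 0, 1, 2, but only the second one can be read without knowing the initializer-list rule, which is the design point made above.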
-
I agree completely, hence the reason I posed the question. I originally wrote the line you copied from my post to remove a triplet of test.Add(i++) calls. The execution was as expected, adding 0, 1, 2 to the list, but it didn't look clean enough and I was sure there was a better way, hence Enumerable.Range(start, count). I definitely appreciate the feedback and was afraid this might be the final answer based on all of the languages I know; you explained the concept quite thoroughly (though I already understood that prior). +1 – Taco, Jan 15, 2019 at 3:17
-
In C++, the order of execution of items in a list initialiser is the order in which they appear. It is valid, and the defined behaviour will always produce the expected result; see 8.5.4 (4) here: open-std.org/jtc1/sc22/wg21/docs/papers/2013/n3690.pdf#page=222 – anon, Jan 15, 2019 at 10:20
-
C# guarantees left-to-right evaluation of arguments. – JacquesB, Jan 16, 2019 at 7:31
The fundamental problem is: what do ++i and i++ actually mean?
Each language will give a different answer, so that is important. But let's take a look at the abstract idea first, in pseudo code.
operator (i)++
{
temp = i;
i=i+1;
return temp;
}
operator ++(i)
{
i=i+1;
return i;
}
Now in most languages post-increment will return an r-value, and pre-increment returns an l-value.
- An r-value is essentially a copy of the original variable's value, any mutation to it is lost after the statement/outer expression has evaluated.
- An l-value is the original variable, any mutation is preserved after the statement/outer expression has evaluated.
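The pseudo code and the two definitions above can be sketched in C++, where the distinction shows up in the return types. This is a minimal illustration under that assumption, not a description of how any compiler implements the built-in operators; post_increment and pre_increment are hypothetical names.

```cpp
// Post-increment: returns a copy of the old value (an r-value).
int post_increment(int& i) {
    int temp = i;
    i = i + 1;
    return temp;
}

// Pre-increment: returns the variable itself (an l-value).
int& pre_increment(int& i) {
    i = i + 1;
    return i;
}

// Chaining works only through the l-value:
//   pre_increment(pre_increment(i));    // fine, i is incremented twice
//   post_increment(post_increment(i));  // does not compile: a copy cannot bind to int&
```

Returning int& from pre_increment is what lets a later mutation reach the original variable; the copy returned by post_increment is discarded once the enclosing expression has been evaluated.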
Some languages allow mutations to happen to r-values; this broadens their expressiveness but leads to counter-intuitive state change. If post-increment returns an r-value, then ++((i++)++) is post-incremented twice and pre-incremented once, yet the expression reduces to ++i, when you could reasonably expect that the expression should return i+1 and then update i to be i+3.
Conversely, a language may take the stance that modifying an r-value is an error. This does reduce the expressiveness of the language, but it also removes a whole class of semantically dubious constructs like ++((i++)++). The compiler would say that this is probably not what you meant; fix your thinking.
Some languages may view pre-increment as returning an r-value. This does lead to one of the two problems above: either it's confusing or it's restrictive. Hence it's usually an l-value, as that causes the least surprise.
Now some languages view post-increment as returning an l-value. This does bring into question what the state of the variable is at any given point in the expression. So what does incremented later mean?
- Does it mean that the variable is incremented before the next named reference to it?
- Does it mean that the variable is incremented once at the end-of-the statement/outer-expression?
- Does it mean that the variable is incremented per operator at the end-of-the statement/outer-expression?
Also in which order should the operators be applied?
- Left to right?
- Right to left?
- Most nested to least nested?
- Least nested to most nested?
Whatever the answer is will change the meaning a lot.
Assume that the language uses l-values for both pre and post, applies the increment before the next named reference to it, and evaluates left-to-right then:
i = 2;
c = i++ + ++((i++)++);
would evaluate like:
i = 2;
c = i; //i++ is just i
i = i + 1; //now update i with +1
c = c + ++i; //i++ is just i, twice
i = i + 1 + 1; //now update i with two post-increments.
this gives: i = 6 and c = 6 (the first i++ contributes 2 and ++i contributes 4).
For contrast, keep the other rules but apply the post-increments at the end of the statement/expression, giving:
i = 2;
c = i + ++i; //i++ is just i, thrice
i = i + 1 + 1 + 1; //now update i with three post-increments.
this gives: i = 6
and c = 5
.
These rules, particularly the order of evaluation, will also affect the result from the perspective of pre-increment. Take the previous evaluation with post-increments at the end of the statement, but reverse the order of evaluation to be right-to-left; this would give: i = 6 and c = 6.
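The three hypothetical rule sets above can be traced with ordinary statements. The sketch below is C++ that deliberately avoids writing i++ + ++((i++)++) itself (which does not compile); every increment is an explicit assignment, so the results do not depend on any real language's rules. The struct and function names are made up for illustration. Stepping through the first trace literally gives c = 2 + 4 = 6.

```cpp
struct Outcome {
    int i;
    int c;
};

// Rule set 1: a pending post-increment is applied before the next named
// reference to i; operands are evaluated left to right.
Outcome post_before_next_reference() {
    int i = 2;
    int c = i;      // first i++ yields i, which is 2
    i = i + 1;      // its pending increment is applied: i is 3
    i = i + 1;      // the pre-increment: i is 4
    c = c + i;      // both inner i++ yield i itself: c is 2 + 4
    i = i + 1 + 1;  // the two remaining post-increments: i is 6
    return {i, c};
}

// Rule set 2: all post-increments are applied at the end of the statement;
// operands are evaluated left to right.
Outcome post_at_end_left_to_right() {
    int i = 2;
    int left = i;   // i++ yields i, which is 2
    i = i + 1;      // the pre-increment: i is 3
    int right = i;  // ++((i++)++) yields 3
    int c = left + right;
    i = i + 1 + 1 + 1;  // three post-increments: i is 6
    return {i, c};
}

// Rule set 3: as rule set 2, but operands are evaluated right to left.
Outcome post_at_end_right_to_left() {
    int i = 2;
    i = i + 1;      // the pre-increment runs first: i is 3
    int right = i;  // ++((i++)++) yields 3
    int left = i;   // i++ now sees the updated i, which is 3
    int c = left + right;
    i = i + 1 + 1 + 1;  // three post-increments: i is 6
    return {i, c};
}
```

All three end with i = 6, while c comes out as 6, 5 and 6 respectively, so the same source line means different things depending on rules a reader cannot see in the code.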
To summarise, try to avoid language constructs, even perfectly valid ones, that force you to be a language expert just to read them correctly. To that end I try to avoid state-mutating expressions inside a complex expression. It is generally better to spell out exactly what you mean than to trust that:
- you perfectly understand the current language,
- the compiler agrees with your interpretation of the language,
- future readers of your code have the mental capacity to understand what you meant.
31 instead of 30 and let's assume the outcome is what the author of the code intended - how many people will fail to see what he intended? That's rather the bomb some disgruntled developer would write before leaving the team.