Is there a difference between these two versions of code?
foreach (var thing in things)
{
    int i = thing.number;
    // code using 'i'
    // pay no attention to the uselessness of 'i'
}

int i;
foreach (var thing in things)
{
    i = thing.number;
    // code using 'i'
}
Or does the compiler not care? By "difference" I mean performance and memory usage, or really any difference at all. Do the two end up being the same code after compilation?
- Have you tried compiling the two and looking at the bytecode output? – user40980, Sep 9, 2015 at 14:28
- @MichaelT I don't feel like I'm qualified to compare bytecode output. If I find a difference, I'm not sure I'd be able to understand what it means exactly. – Alternatex, Sep 9, 2015 at 14:34
- If it's the same, you don't need to be qualified. – user40980, Sep 9, 2015 at 14:35
- @MichaelT Though you do need to be qualified enough to make a good guess about whether the compiler could have optimized it, and if so under what conditions it's able to do that optimization. – Ben Aaronson, Sep 9, 2015 at 15:41
- @BenAaronson and that likely requires a non-trivial example to tickle that functionality. – user40980, Sep 9, 2015 at 15:42
3 Answers
TL;DR - they're equivalent examples at the IL layer.
DotNetFiddle makes this pretty easy to answer, as it allows you to see the resulting IL.
I used a slightly different variation of your loop construct in order to make my testing quicker. I used:
Variation 1:
using System;

public class Program
{
    public static void Main()
    {
        Console.WriteLine("Hello World");
        int x;
        int i;
        for (x = 0; x <= 2; x++)
        {
            i = x;
            Console.WriteLine(i);
        }
    }
}
Variation 2:
Console.WriteLine("Hello World");
int x;
for (x = 0; x <= 2; x++)
{
    int i = x;
    Console.WriteLine(i);
}
In both cases, the compiled IL output was the same:
.class public auto ansi beforefieldinit Program
extends [mscorlib]System.Object
{
.method public hidebysig static void Main() cil managed
{
//
.maxstack 2
.locals init (int32 V_0,
int32 V_1,
bool V_2)
IL_0000: nop
IL_0001: ldstr "Hello World"
IL_0006: call void [mscorlib]System.Console::WriteLine(string)
IL_000b: nop
IL_000c: ldc.i4.0
IL_000d: stloc.0
IL_000e: br.s IL_001f
IL_0010: nop
IL_0011: ldloc.0
IL_0012: stloc.1
IL_0013: ldloc.1
IL_0014: call void [mscorlib]System.Console::WriteLine(int32)
IL_0019: nop
IL_001a: nop
IL_001b: ldloc.0
IL_001c: ldc.i4.1
IL_001d: add
IL_001e: stloc.0
IL_001f: ldloc.0
IL_0020: ldc.i4.2
IL_0021: cgt
IL_0023: ldc.i4.0
IL_0024: ceq
IL_0026: stloc.2
IL_0027: ldloc.2
IL_0028: brtrue.s IL_0010
IL_002a: ret
} // end of method Program::Main
So to answer your question: where the variable is declared makes no difference to the generated IL, and the compiler renders the two variations equivalent.
To my understanding, the .NET IL compiler moves all variable declarations to the beginning of the function, but I couldn't find a good source that clearly stated that [2]. In this particular example, you can see that it moved them up with this statement:
.locals init (int32 V_0,
int32 V_1,
bool V_2)
Wherein we get a bit too obsessive in making comparisons....
Case A: Do all variables get moved up?
To dig into this a bit further, I tested the following function:
public static void Main()
{
    Console.WriteLine("Hello World");
    int x = 5;
    if (x % 2 == 0)
    {
        int i = x;
        Console.WriteLine(i);
    }
    else
    {
        string j = x.ToString();
        Console.WriteLine(j);
    }
}
The difference here is that we declare either an int i or a string j based upon the comparison. Again, the compiler moves all the local variables to the top of the function [2] with:
.locals init (int32 V_0,
int32 V_1,
string V_2,
bool V_3)
I found it interesting to note that even though the branch declaring int i never executes in this example (x is 5, so the else branch runs), the local slot to support it is still generated.
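If you would rather not read an IL dump, reflection can list a method's local slots directly. The following is a minimal sketch, not what the answer itself used: the names LocalsDemo, Branchy, and LocalTypes are hypothetical, while MethodBase.GetMethodBody() and MethodBody.LocalVariables are the standard System.Reflection members. The exact set of slots depends on the compiler version and on Debug versus Release settings, so no expected output is shown.

```csharp
using System;
using System.Linq;
using System.Reflection;

public static class LocalsDemo
{
    // Hypothetical method with the same shape as Case A: one local per branch.
    public static void Branchy()
    {
        int x = 5;
        if (x % 2 == 0)
        {
            int i = x;
            Console.WriteLine(i);
        }
        else
        {
            string j = x.ToString();
            Console.WriteLine(j);
        }
    }

    // Returns the type of every local slot in the named method, ordered by slot index.
    public static Type[] LocalTypes(string methodName)
    {
        MethodInfo method = typeof(LocalsDemo).GetMethod(methodName);
        MethodBody body = method.GetMethodBody();
        return body.LocalVariables
                   .OrderBy(local => local.LocalIndex)
                   .Select(local => local.LocalType)
                   .ToArray();
    }

    public static void Main()
    {
        // All slots belong to the method as a whole, whichever branch runs.
        foreach (Type type in LocalTypes("Branchy"))
        {
            Console.WriteLine(type);
        }
    }
}
```

In an unoptimized build you would expect both an int slot and a string slot to appear, matching the .locals init block above. An optimizing build may keep some values on the evaluation stack and drop their slots.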
Case B: What about foreach instead of for?
It was pointed out that foreach has different behavior than for and that I wasn't checking the same thing that had been asked about. So I put in these two sections of code to compare the resulting IL.
int declaration outside of the loop:
Console.WriteLine("Hello World");
List<int> things = new List<int>() { 1, 2, 3, 4, 5 };
int i;
foreach (var thing in things)
{
    i = thing;
    Console.WriteLine(i);
}
int declaration inside of the loop:
Console.WriteLine("Hello World");
List<int> things = new List<int>() { 1, 2, 3, 4, 5 };
foreach (var thing in things)
{
    int i = thing;
    Console.WriteLine(i);
}
The resulting IL with the foreach loop was indeed different from the IL generated using the for loop. Specifically, the init block and the loop section changed.
.locals init (class [mscorlib]System.Collections.Generic.List`1<int32> V_0,
int32 V_1,
int32 V_2,
class [mscorlib]System.Collections.Generic.List`1<int32> V_3,
valuetype [mscorlib]System.Collections.Generic.List`1/Enumerator<int32> V_4,
bool V_5)
...
.try
{
IL_0045: br.s IL_005a
IL_0047: ldloca.s V_4
IL_0049: call instance !0 valuetype [mscorlib]System.Collections.Generic.List`1/Enumerator<int32>::get_Current()
IL_004e: stloc.1
IL_004f: nop
IL_0050: ldloc.1
IL_0051: stloc.2
IL_0052: ldloc.2
IL_0053: call void [mscorlib]System.Console::WriteLine(int32)
IL_0058: nop
IL_0059: nop
IL_005a: ldloca.s V_4
IL_005c: call instance bool valuetype [mscorlib]System.Collections.Generic.List`1/Enumerator<int32>::MoveNext()
IL_0061: stloc.s V_5
IL_0063: ldloc.s V_5
IL_0065: brtrue.s IL_0047
IL_0067: leave.s IL_0078
} // end .try
finally
{
IL_0069: ldloca.s V_4
IL_006b: constrained. valuetype [mscorlib]System.Collections.Generic.List`1/Enumerator<int32>
IL_0071: callvirt instance void [mscorlib]System.IDisposable::Dispose()
IL_0076: nop
IL_0077: endfinally
} // end handler
The foreach approach generated more local variables and required some additional branching. Essentially, on first entry it jumps to the end of the loop to call MoveNext(), then jumps back to near the top of the loop to read Current and execute the loop body. It then continues to loop as you'd expect.
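The try/finally and the MoveNext()/get_Current() calls in that IL correspond to the usual expansion of foreach over a List<int>. As a rough sketch (the class and method names here are hypothetical, and the real compiler output differs in details such as temporaries), the hand-written equivalent looks like this:

```csharp
using System;
using System.Collections.Generic;

public static class ForeachExpansion
{
    // The loop as written in the question.
    public static List<int> ViaForeach(List<int> things)
    {
        var seen = new List<int>();
        foreach (var thing in things)
        {
            int i = thing;
            seen.Add(i);
        }
        return seen;
    }

    // Roughly what the compiler generates: an enumerator local,
    // a MoveNext()/Current loop, and Dispose() in a finally block.
    public static List<int> ViaEnumerator(List<int> things)
    {
        var seen = new List<int>();
        List<int>.Enumerator enumerator = things.GetEnumerator();
        try
        {
            while (enumerator.MoveNext())
            {
                int thing = enumerator.Current;
                int i = thing;
                seen.Add(i);
            }
        }
        finally
        {
            enumerator.Dispose();
        }
        return seen;
    }

    public static void Main()
    {
        var things = new List<int> { 1, 2, 3, 4, 5 };
        Console.WriteLine(string.Join(", ", ViaForeach(things)));
        Console.WriteLine(string.Join(", ", ViaEnumerator(things)));
    }
}
```

The enumerator and the Boolean result of MoveNext() account for the extra locals in the .locals init block.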
But beyond the branching differences caused by using the for and foreach constructs, there was no difference in the IL based upon where the int i declaration was placed. So we're still at the two approaches being equivalent.
Case C: What about different compiler versions?
In a comment that was left [1], there was a link to an SO question regarding a warning about variable access with foreach and using closure. The part that really caught my eye in that question was that there may have been differences in how the .NET 4.5 compiler worked versus earlier versions of the compiler.
And that's where the DotNetFiddle site let me down: all they had available was .NET 4.5 and a version of the Roslyn compiler. So I brought up a local instance of Visual Studio and started testing out the code. To make sure I was comparing the same things, I compared locally built code at .NET 4.5 to the DotNetFiddle code.
The only difference that I noted was in the locals init block and its variable declarations. The local compiler was a bit more specific in naming the variables.
.locals init ([0] class [mscorlib]System.Collections.Generic.List`1<int32> things,
[1] int32 thing,
[2] int32 i,
[3] class [mscorlib]System.Collections.Generic.List`1<int32> '<>g__initLocal0',
[4] valuetype [mscorlib]System.Collections.Generic.List`1/Enumerator<int32> CS$5$0000,
[5] bool CS$4$0001)
But apart from that minor difference, it was so far, so good. I had equivalent IL output between the DotNetFiddle compiler and what my local VS instance was producing.
So I then rebuilt the project targeting .NET 4, .NET 3.5, and for good measure .NET 3.5 Release mode.
And in all three of those additional cases, the generated IL was equivalent. The targeted .NET version had no effect on the IL that was generated in these samples.
To summarize this adventure: I think we can confidently say that the compiler does not care where you declare the primitive type and that there is no effect upon memory or performance with either declaration method. And that holds true regardless of using a for or foreach loop.
I considered running yet another case that incorporated a closure inside of the foreach loop. But you had asked about the effects of where a primitive type variable was declared, so I figured I was delving too far beyond what you were interested in asking about. The SO question I mentioned earlier has a great answer that provides a good overview about closure effects on foreach iteration variables.
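For completeness, here is a small sketch of the closure case that was left out, since it is the one situation where the placement of the declaration changes behavior. This was not part of the tests above, and the names ClosureDemo, CaptureDeclaredOutside, and CaptureDeclaredInside are hypothetical. A lambda captures the variable itself rather than its value, so a variable declared outside the loop is shared by every lambda, while one declared inside the loop is fresh on each iteration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ClosureDemo
{
    // 'i' is declared once, so every lambda captures the same variable.
    public static int[] CaptureDeclaredOutside(IEnumerable<int> things)
    {
        var getters = new List<Func<int>>();
        int i;
        foreach (var thing in things)
        {
            i = thing;
            getters.Add(() => i);
        }
        // By now 'i' holds the last element, and all lambdas see it.
        return getters.Select(getter => getter()).ToArray();
    }

    // 'i' is declared per iteration, so each lambda captures its own variable.
    public static int[] CaptureDeclaredInside(IEnumerable<int> things)
    {
        var getters = new List<Func<int>>();
        foreach (var thing in things)
        {
            int i = thing;
            getters.Add(() => i);
        }
        return getters.Select(getter => getter()).ToArray();
    }

    public static void Main()
    {
        var things = new[] { 1, 2, 3 };
        Console.WriteLine(string.Join(", ", CaptureDeclaredOutside(things))); // 3, 3, 3
        Console.WriteLine(string.Join(", ", CaptureDeclaredInside(things)));  // 1, 2, 3
    }
}
```

Without a capturing lambda, as in the question, none of this applies and the two placements compile to equivalent code.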
[1] Thank you to Andy for providing the original link to the SO question addressing closures within foreach loops.
[2] It's worth noting that the ECMA-335 spec addresses this with section I.12.3.2.2 'Local variables and arguments'. I had to see the resulting IL and then read the section for it to be clear regarding what was going on. Thanks to ratchet freak for pointing that out in chat.
- For and foreach do not behave the same, and the question includes code that is different, which becomes important when there's a closure in the loop. stackoverflow.com/questions/14907987/… – Andy, Sep 10, 2015 at 0:59
- @Andy - thanks for the link! I went ahead and checked the generated output using a foreach loop and also checked the targeted .NET version. – user53019, Sep 10, 2015 at 15:31
Depending on which compiler you use (I don't even know if C# has more than one), your code will be optimised before being turned into a program. A good compiler will see that you're re-initialising the same variable each time with a different value and manage the memory space for it efficiently.
If you were initialising the same variable to a constant each time, the compiler would likewise initialise it before the loop and reference it.
It all depends on how well your compiler is written, but as far as coding standards are concerned, variables should always have the smallest possible scope. So declaring inside the loop is what I've always been taught.
- Whether your last paragraph is true or not depends on two things: the importance of minimizing the scope of the variable within your own program's unique context, and inside knowledge of the compiler as to whether or not it actually optimizes out the multiple assignments. – Robert Harvey, Sep 9, 2015 at 14:53
- And then there's the runtime, which further translates the byte code into machine language, where many of these same optimizations (being discussed here as compiler optimizations) are also performed. – Erik Eidt, Sep 9, 2015 at 19:02
In the first version you are declaring and initializing "i" inside the loop, so it gets reinitialized on every iteration. In the second you are declaring it once outside the loop and only assigning to it inside.
- This doesn't seem to offer anything substantial over points made and explained in the top answer that was posted over 2 years ago. – gnat, Apr 25, 2018 at 8:52
- Thank you for giving an answer, but it doesn't give any new aspects the accepted, top rated answer does not already cover (in detail). – CharonX, Apr 25, 2018 at 9:32