On Wednesday, 19 May 2021 at 14:24:57 UTC, Jesse Phillips wrote:
> for (int i = 0; i < 10; i++) {
    int index = i;
    dgs ~= () {
        import std.stdio;
        writeln(index);
    };
}
When this loop concludes, the value of i is 10 and the value of index is 9 (as shown from your output). This is because within the for logic i was increased and it determined 10 < 10 is false. This means the for body is not executed again leaving index at 9.
I don't know what compiler magic you would expect is "correct" here. We can't say i should be 9 as the loop would not have exited then. We certainly don't want index to be 10 as that would mean the loop executed one more time than it was defined to.
A local variable's lifetime starts at its declaration and ends at the closing brace of the scope where it's declared:
void main() {
    int x; // start of x's lifetime
    {
        int y; // start of y's lifetime
    } // end of y's lifetime
    int z; // start of z's lifetime
} // end of x's and z's lifetimes
This also applies to variables inside loops:
void main() {
    foreach (i; 0 .. 10) {
        int x; // start of x's lifetime
    } // end of x's lifetime
}
We can see that this is the case by declaring a variable with a destructor inside a loop:
import std.stdio;
struct S {
    ~this() { writeln("destroyed"); }
}
void main() {
    foreach (i; 0 .. 10) {
        S s; // start of s's lifetime
    } // end of s's lifetime
}
The above program prints "destroyed" 10 times. At the start of each loop iteration, a new instance of s is initialized; at the end of each iteration, it is destroyed.
Normally, an instance of a variable declared inside a loop cannot outlive the loop iteration in which it was created, so the compiler is free to reuse the same memory for each instance. We can verify that it does so by printing out the address of each instance:
import std.stdio;
struct S
{
    ~this() { writeln("destroyed ", &this); }
}
void main()
{
    foreach (i; 0 .. 10) {
        S s;
    }
}
On run.dlang.io, this prints "destroyed 7FFE478D283C" 10 times.
However, when an instance of a variable declared inside a loop is captured in a closure, it becomes possible to access that instance even after the loop iteration that created it has finished. In this case, the lifetimes of the instances may overlap, and it is no longer a valid optimization to re-use the same memory for each one.
We can see this most clearly by declaring the variable in the loop immutable:
void main() {
    int delegate()[10] dgs;
    foreach (i; 0 .. 10) {
        immutable index = i;
        dgs[i] = () => index;
        assert(dgs[i]() == i);
    }
    foreach (i; 0 .. 10) {
        // if this fails, something has mutated immutable data!
        assert(dgs[i]() == i);
    }
}
If you run the above program, you will see that the assert in the second loop does, in fact, fail. By using the same memory to store each instance of index, the compiler has generated incorrect code that allows us to observe mutation of immutable data--something that the language spec itself says is undefined behavior.
In order to compile this code correctly, the compiler must allocate a separate location in memory for each instance of index. Those locations can be either on the stack (if the closure does not outlive the function) or on the heap; the important part is that they cannot overlap.
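To illustrate what "a separate location for each instance" looks like in practice, here is a minimal sketch that gets the same effect by hand. It assumes the usual workaround of building each delegate inside a helper function, so that every call gets its own closure context. The name makeGetter is hypothetical, chosen only for this example.

```d
// Hypothetical helper: each call has its own `index` parameter, and because
// the returned delegate escapes the function, each call gets its own
// closure context rather than sharing one slot across iterations.
int delegate() makeGetter(int index) {
    return () => index;
}

void main() {
    int delegate()[10] dgs;
    foreach (i; 0 .. 10) {
        dgs[i] = makeGetter(i);
    }
    foreach (i; 0 .. 10) {
        // Each delegate still sees the value it was created with.
        assert(dgs[i]() == i);
    }
}
```

Here the second loop's assert passes, because no two delegates share storage for index. This is the behavior the loop-body version should have without needing the helper.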