May 18, 2021
The general rule for determining "what should happen here" when there are abstractions around pointers (such as arrays, delegates, refs, outs, class references, etc.), is to rewrite it in explicit terms of those pointers. The (sometimes baffling) behavior is then exposed for what it actually is, and the behavior should match.

Granted, there is a special kludge in the compiler to sometimes put the variables referenced by the delegate into a closure allocated by the GC, but that doesn't account for variables that go out of scope before the function scope ends. There is no process to make a closure for them, and adding such a capability would likely be much more complication than added value, so it should just be an error.
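As a concrete illustration of that rule (a sketch; the variable names are invented), here is a loop capture rewritten in explicit pointer terms. Every delegate ends up holding a pointer to the same stack slot, which is why they all observe the final value:

import std.stdio;

void main()
{
    int i;          // one variable for the entire loop
    int* p = &i;    // what each delegate effectively captures
    void delegate()[] dgs;
    for (i = 0; i < 3; i++)
        dgs ~= () { writeln(*p); };

    // All three delegates dereference the same pointer, so all of
    // them print the value i holds after the loop ends: 3.
    foreach (dg; dgs) dg();
}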
May 19, 2021
On Wednesday, 19 May 2021 at 03:09:03 UTC, Walter Bright wrote:
> The general rule for determining "what should happen here" when there are abstractions around pointers (such as arrays, delegates, refs, outs, class references, etc.), is to rewrite it in explicit terms of those pointers. The (sometimes baffling) behavior is then exposed for what it actually is, and the behavior should match.
>

No, this is definitively wrong.

Language constructs provide invariants that both the developers and the compiler can rely on. These invariants ensure that the codebase can scale to larger sizes while keeping bugs, complexity, extensibility, and so on in check.

It is essential to look at language constructs through that lens, or, very quickly, one ends up with sets of invariants that vanish to nothing because they are not provided consistently by the various language constructs.

This results in more complexity for the programmer, more bugs, and slower programs, because the runtime and compiler cannot leverage the invariants either.

Thinking through the way it is all implemented with pointers is definitely useful too, but simply as a tool to know what can be implemented efficiently and what cannot, what can be optimized easily and what cannot, etc. It is not very useful as a design tool, as it leads to unsound language constructs.

In fact, not even C++ works this way, as they went to great lengths to define an abstract machine, one that does not physically exist, that would execute the C++ in their spec.

> Granted, there is a special kludge in the compiler to sometimes put the variables referenced by the delegate into a closure allocated by the GC, but that doesn't account for variables that go out of scope before the function scope ends. There is no process to make a closure for them, and adding such a capability would likely be much more complication than added value, so it should just be an error.

I find it surprising that you call this a kludge. This is how pretty much every language except C++ does it. It is proven. Without this, and without the ability to capture by value like in C++, delegates are effectively useless.

This is not a kludge, this is the very thing that makes delegates useful at all.

That being said, the DIP1000 analysis you mention is a useful tool here. If nothing escapes, then it is possible for the compiler to promote the closure onto the stack rather than the heap.

This is where attacking the problem from first principles helps. It is not about the pointers, it is about the invariants. If the compiler can find a better way to implement these invariants given a set of conditions, then great.
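To illustrate (a sketch, with invented names): when a delegate parameter is annotated scope, the compiler knows the delegate cannot escape the call, so under DIP1000-style analysis the captured variables can stay on the stack instead of being moved to a GC closure.

// `scope` promises that `dg` does not escape eachTwice, so the
// caller's locals need no heap-allocated closure.
void eachTwice(scope void delegate() dg) {
    dg();
    dg();
}

void main() {
    int counter = 0;
    eachTwice(() { counter++; }); // no GC allocation needed for counter
    assert(counter == 2);
}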

May 19, 2021
On 5/19/2021 12:36 AM, deadalnix wrote:
> This is where attacking the problem from first principles helps. It is not about the pointers, it is about the invariants. If the compiler can find a better way to implement these invariants given a set of conditions, then great.

The thing about metaprogramming is users can build things by aggregating simpler pieces (like pointers). If the compiler has special semantics for a higher level type that cannot be assembled from simpler pieces, then the language has composability problems.

(This problem has shown up with the special semantics given to associative arrays.)

May 19, 2021
On Wednesday, 19 May 2021 at 07:53:49 UTC, Walter Bright wrote:
> On 5/19/2021 12:36 AM, deadalnix wrote:
>> This is where attacking the problem from first principles helps. It is not about the pointers, it is about the invariants. If the compiler can find a better way to implement these invariants given a set of conditions, then great.
>
> The thing about metaprogramming is users can build things by aggregating simpler pieces (like pointers). If the compiler has special semantics for a higher level type that cannot be assembled from simpler pieces, then the language has composability problems.
>
> (This problem has shown up with the special semantics given to associative arrays.)

Composability comes from the invariants you can rely on, and, more precisely, from the fact that these invariants do not impose constraints on each other.
May 19, 2021
On Wednesday, 19 May 2021 at 11:06:29 UTC, deadalnix wrote:
> Composability comes from the invariants you can rely on, and, more precisely, from the fact that these invariants do not impose constraints on each other.

To go further, in this specific case, there is no composability.

The notions of loops, immutability, and closures are colliding with each other. They do not compose, because they do not provide a set of invariants that are independent from each other; instead, they step on each other.
May 19, 2021
On 5/18/21 12:47 PM, deadalnix wrote:
> Long story short: https://issues.dlang.org/show_bug.cgi?id=21929
> 
> Closures do not respect scope the way they should. Let's fix it.

Thinking about how this would have to be implemented:

1. If you want to access a variable in a scope from both the closure and the function itself, the variable has to be allocated on the heap.
2. We need one allocation PER loop iteration. If we do this the way normal closures are done (i.e. allocate before the scope is entered), this would be insanely costly for a loop.
3. It *could* allocate on demand. Basically, reserve stack space for the captured variables, have a pointer to that stack space. When a closure is used, copy that stack space to a heap allocation, and switch the pointer to that heap block (so the function itself also refers to the same data). This might be a reasonable tradeoff. But it has some limitations -- like if you ever take the address of one of these variables, that would also have to generate the allocated closure.
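For concreteness, option 2 corresponds roughly to this hand-written lowering (a sketch, not what the compiler currently emits; makeClosure is an invented name): each iteration calls a function whose frame is moved to a fresh GC allocation, so every delegate gets its own copy.

import std.stdio;

// Each call creates a new function frame; because `index` is captured
// by the returned delegate, that frame becomes its own heap
// allocation -- one per call.
void delegate() makeClosure(int index) {
    return () { writeln(index); };
}

void main() {
    void delegate()[] dgs;
    foreach (i; 0 .. 10)
        dgs ~= makeClosure(i); // one allocation per iteration
    foreach (dg; dgs) dg();    // prints 0 through 9
}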

Of course, with Walter's chosen fix, only allowing capture of non-scoped variables, all of this is moot. I kind of feel like that's a much simpler (even if less convenient) solution.

And also, of course, can we please fix the issue where destroyed structs are accessible from a delegate?

-Steve
May 19, 2021

On Tuesday, 18 May 2021 at 16:47:03 UTC, deadalnix wrote:

> Long story short: https://issues.dlang.org/show_bug.cgi?id=21929
>
> Closures do not respect scope the way they should. Let's fix it.

After Walter's post I definitely see what is happening.

for (int i = 0; i < 10; i++) {
    int index = i;
    dgs ~= () {
        import std.stdio;
        writeln(index);
    };
}

When this loop concludes, the value of i is 10 and the value of index is 9 (as shown from your output).

This is because, within the for logic, i was increased and the condition 10 < 10 evaluated to false. This means the loop body is not executed again, leaving index at 9.

I don't know what compiler magic you would expect to be "correct" here. We can't say i should be 9, as the loop would not have exited then. We certainly don't want index to be 10, as that would mean the loop executed one more time than it was defined to.

Untested:

int i;
int index;
for (i = 0; i < 10; i++) {
    index = i;
    dgs ~= () {
        import std.stdio;
        writeln(i);
        writeln(index);
    };
}
May 19, 2021

On Wednesday, 19 May 2021 at 14:24:57 UTC, Jesse Phillips wrote:

> for (int i = 0; i < 10; i++) {
>     int index = i;
>     dgs ~= () {
>         import std.stdio;
>         writeln(index);
>     };
> }
>
> When this loop concludes, the value of i is 10 and the value of index is 9 (as shown from your output).
>
> This is because, within the for logic, i was increased and the condition 10 < 10 evaluated to false. This means the loop body is not executed again, leaving index at 9.
>
> I don't know what compiler magic you would expect to be "correct" here. We can't say i should be 9, as the loop would not have exited then. We certainly don't want index to be 10, as that would mean the loop executed one more time than it was defined to.

A local variable's lifetime starts at its declaration and ends at the closing brace of the scope where it's declared:

void main() {
    int x; // start of x's lifetime
    {
        int y; // start of y's lifetime
    } // end of y's lifetime
    int z; // start of z's lifetime
} // end of x's and z's lifetimes

This also applies to variables inside loops:

void main() {
    foreach (i; 0 .. 10) {
        int x; // start of x's lifetime
    } // end of x's lifetime
}

We can see that this is the case by declaring a variable with a destructor inside a loop:

import std.stdio;

struct S {
    ~this() { writeln("destroyed"); }
}

void main() {
    foreach (i; 0 .. 10) {
        S s; // start of s's lifetime
    } // end of s's lifetime
}

The above program prints "destroyed" 10 times. At the start of each loop iteration, a new instance of s is initialized; at the end of each iteration, it is destroyed.

Normally, an instance of a variable declared inside a loop cannot outlive the loop iteration in which it was created, so the compiler is free to reuse the same memory for each instance. We can verify that it does so by printing out the address of each instance:

import std.stdio;

struct S
{
    ~this() { writeln("destroyed ", &this); }
}

void main()
{
    foreach (i; 0 .. 10) {
        S s;
    }
}

On run.dlang.io, this prints "destroyed 7FFE478D283C" 10 times.

However, when an instance of a variable declared inside a loop is captured in a closure, it becomes possible to access that instance even after the loop iteration that created it has finished. In this case, the lifetimes of the instances may overlap, and it is no longer a valid optimization to reuse the same memory for each one.

We can see this most clearly by declaring the variable in the loop immutable:

void main() {
    int delegate()[10] dgs;

    foreach (i; 0 .. 10) {
        immutable index = i;
        dgs[i] = () => index;
        assert(dgs[i]() == i);
    }

    foreach (i; 0 .. 10) {
        // if this fails, something has mutated immutable data!
        assert(dgs[i]() == i);
    }
}

If you run the above program, you will see that the assert in the second loop does, in fact, fail. By using the same memory to store each instance of index, the compiler has generated incorrect code that allows us to observe mutation of immutable data--something that the language spec itself says is undefined behavior.

In order to compile this code correctly, the compiler must allocate a separate location in memory for each instance of index. Those locations can be either on the stack (if the closure does not outlive the function) or on the heap; the important part is that they cannot overlap.
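Until the compiler does this, one workaround sketch (the parameter name `captured` is invented) is to force a fresh allocation per iteration by capturing inside an immediately-called function literal:

void main() {
    int delegate()[10] dgs;

    foreach (i; 0 .. 10) {
        // The function-literal call creates a new frame each
        // iteration; `captured` lives in that frame, so each delegate
        // gets its own heap-allocated copy.
        dgs[i] = (immutable int captured) { return () => captured; }(i);
    }

    foreach (i; 0 .. 10)
        assert(dgs[i]() == i); // each closure has its own memory
}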

May 19, 2021
On Wednesday, 19 May 2021 at 13:02:59 UTC, Steven Schveighoffer wrote:
> Thinking about how this would have to be implemented:
>
> 1. If you want to access a variable in a scope from both the closure and the function itself, the variable has to be allocated on the heap.

This is definitely what D guarantees.

> 2. We need one allocation PER loop iteration. If we do this the way normal closures are done (i.e. allocate before the scope is entered), this would be insanely costly for a loop.

This is costly, but also the only way to ensure other invariants in the language are respected (immutability, no access after destruction, ...).

This is also consistent with what other languages do.

This is also consistent with the fact that D allows iterating using opDispatch, which should already exhibit this behavior, because it is a function under the hood.

> 3. It *could* allocate on demand. Basically, reserve stack space for the captured variables, have a pointer to that stack space. When a closure is used, copy that stack space to a heap allocation, and switch the pointer to that heap block (so the function itself also refers to the same data). This might be a reasonable tradeoff. But it has some limitations -- like if you ever take the address of one of these variables, that would also have to generate the allocated closure.
>

I suspect this will open a can of worms of edge cases.

May 19, 2021
On 5/19/21 1:26 PM, deadalnix wrote:
> On Wednesday, 19 May 2021 at 13:02:59 UTC, Steven Schveighoffer wrote:
>> Thinking about how this would have to be implemented:
>>
>> 1. If you want to access a variable in a scope from both the closure and the function itself, the variable has to be allocated on the heap.
> 
> This is definitely what D guarantees.

Yes, it is guaranteed... if it compiles. For sure the current behavior is junk and unsafe.

> 
>> 2. We need one allocation PER loop iteration. If we do this the way normal closures are done (i.e. allocate before the scope is entered), this would be insanely costly for a loop.
> 
> This is costly, but also the only way to ensure other invariants in the language are respected (immutability, no access after destruction, ...).
> 
> This is also consistent with what other languages do.

Again, costly as long as it compiles. If a la Walter's suggestion it no longer compiles, then it's moot.

> 
> This is also consistent with the fact that D allows iterating using opDispatch, which should already exhibit this behavior, because it is a function under the hood.

You mean opApply? Not necessarily, if the delegate parameter is scope (and it should be).
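A sketch of what that looks like (Range10 is an invented name): with a scope parameter, the foreach body delegate provably does not escape, so no closure allocation is required.

struct Range10 {
    // foreach over Range10 lowers the loop body to a delegate passed
    // here; `scope` promises it won't escape, so the caller's locals
    // can stay on the stack.
    int opApply(scope int delegate(int) dg) {
        foreach (i; 0 .. 10)
            if (auto r = dg(i)) return r;
        return 0;
    }
}

void main() {
    int sum;
    foreach (i; Range10()) sum += i;
    assert(sum == 45);
}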

> 
>> 3. It *could* allocate on demand. Basically, reserve stack space for the captured variables, have a pointer to that stack space. When a closure is used, copy that stack space to a heap allocation, and switch the pointer to that heap block (so the function itself also refers to the same data). This might be a reasonable tradeoff. But it has some limitations -- like if you ever take the address of one of these variables, that would also have to generate the allocated closure.
>>
> 
> I suspect this will open a can of worm of edge cases.

I don't think a can of worms is opened, but it's not easy to implement for sure. I'm not suggesting that we follow this path. I'm just thinking about "What's the most performant way we can implement closures used inside loops". If a loop *rarely* allocates a closure (i.e. only one element actually allocates a closure), then allocating defensively seems super-costly.

-Steve