September 20, 2021
On Monday, 20 September 2021 at 18:49:12 UTC, Steven Schveighoffer wrote:
> It's just class references that the compiler seems to not care about ensuring stack references stay alive.
>
> Should it be this way?

Hmm.  No opinion on 'should', but you ought to be able to insert a volatileRead late in the function in order to ensure the object stays alive.
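
For reference, D spells this intrinsic `volatileLoad`; it lives in `core.volatile` (older releases expose it from `core.bitop`). Below is a minimal sketch of the suggestion. It assumes that a volatile read of the local's storage slot counts as a use the optimizer cannot drop, which is an untested assumption rather than documented behavior:

```d
import core.memory : GC;
import core.volatile : volatileLoad;
import std.stdio : writeln;

class C
{
    ~this() { writeln("dtor"); }
}

void main()
{
    auto c = new C;
    foreach (i; 0 .. 10_000)
        GC.collect();
    writeln("end of main");

    // Assumption: reading the slot that holds `c` through a volatile load
    // is a use the optimizer must keep, so the reference has to remain in
    // local storage up to this line.
    cast(void) volatileLoad(cast(size_t*) &c);
}
```

The cast to `size_t*` is only there to select one of the integer overloads of `volatileLoad`; the loaded value itself is discarded.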
September 20, 2021

On 9/20/21 3:20 PM, Steven Schveighoffer wrote:

>

Then why are pointers to structs, arrays, structs containing class references not treated the same?

I'm not sure why class references are singled out, but they are for some reason.

If I change the struct initialization to a function it has the same behavior.

e.g.:

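import std.stdio;
import core.memory;

struct S {
  ~this() { writeln("dtor"); }
}

// Allocate the struct on the GC heap through a function call.
auto makes() { return new S; }

void main()
{
   auto s = makes();
   GC.collect();
   GC.collect();
   writeln("end of main");
}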

Also shows option 2.

So it has something to do with how the return value is stored.

-Steve

September 21, 2021

On Monday, 20 September 2021 at 18:26:59 UTC, Steven Schveighoffer wrote:

>

Without any context, what do you think should happen here?

import std.stdio;
import core.memory;
class C
{
   ~this() { writeln("dtor"); }
}

void main()
{
   auto c = new C;
   foreach(i; 0 .. 10000) GC.collect;
   writeln("end of main");
}

Option 1:

end of main
dtor

Option 2:

dtor
end of main

Option 3:

end of main

Option 4:
Option 1 or 2, depending on entropy.

Most likely option 1. May also be option 3 though, as I remember some warnings about trusting the GC destructor to run at all.

September 21, 2021

On Monday, 20 September 2021 at 18:49:12 UTC, Steven Schveighoffer wrote:

>

I feel like this might not necessarily be an issue, because technically, you aren't using c any more, so it can be deallocated immediately. But right in our documentation here it lists ways to alleviate this:

If pointers to D garbage collector allocated memory are passed to C functions, it's critical to ensure that the memory will not be collected by the garbage collector before the C function is done with it. This is accomplished by:

* Making a copy of the data using core.stdc.stdlib.malloc() and passing the copy instead.
* Leaving a pointer to it on the stack (as a parameter or automatic variable), as the garbage collector will scan the stack.
* Leaving a pointer to it in the static data segment, as the garbage collector will scan the static data segment.
* Registering the pointer with the garbage collector with the std.gc.addRoot() or std.gc.addRange() calls.
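
In current druntime the registration calls named in the last bullet are `GC.addRoot` and `GC.addRange` in `core.memory` (`std.gc` is the older module name). Below is a minimal sketch of that option applied to the class example; `useFromC` is a hypothetical stand-in for whatever C function holds on to the pointer:

```d
import core.memory : GC;
import std.stdio : writeln;

class C
{
    ~this() { writeln("dtor"); }
}

// Hypothetical stand-in for a C function that keeps using the pointer.
void useFromC(void* p) { }

void main()
{
    auto c = new C;
    auto p = cast(void*) c;

    GC.addRoot(p);                  // the GC now treats the object as reachable
    scope (exit) GC.removeRoot(p);  // unregister when the scope ends

    useFromC(p);
    foreach (i; 0 .. 10_000)
        GC.collect();
    writeln("end of main");
}
```

While the root is registered, the collector should not reclaim the object regardless of what the optimizer does with the local variable, at the cost of two runtime calls.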

This to me seems like "leaving a pointer to it on the stack". I'm not sure how else I would do that specifically? Plus, this option is the only "free" one -- the others all require much more complication. Adding a pointer to the stack is free. It's just, I don't know how to tell the compiler to do that besides declaring it.

First: the use of "stack" here is wrong and confusing. It should be "local storage" (a notorious error throughout the spec). Indeed, what your example does is put the pointer in local storage. That local storage lasts until the end of the function scope (in your example). I don't think we track the lifetime of variables that way in LDC, so the optimizer just looks at the last point of use. This is similar to how Java behaves:
https://stackoverflow.com/questions/39285108/can-java-garbage-collect-variables-before-end-of-scope

As per the language spec, the D compilers are non-compliant on this point. So a decision is needed: either change the language spec, or ask the D compiler developers to fix the compilers.

-Johan

September 21, 2021

On 9/21/21 6:58 AM, Johan wrote:

>

First: the use of "stack" here is wrong and confusing. It should be "local storage" (a notorious error throughout the spec). Indeed, what your example does is put the pointer in local storage. That local storage lasts until the end of the function scope (in your example). I don't think we track the lifetime of variables that way in LDC, so the optimizer just looks at the last point of use. This is similar to how Java behaves:
https://stackoverflow.com/questions/39285108/can-java-garbage-collect-variables-before-end-of-scope

Yikes, that's quite aggressive. It says even a method can be in progress on the thing and it's collected early.

>

As per the language spec, the D compilers are non-compliant on this point. So a decision is needed: either change the language spec, or ask the D compiler developers to fix the compilers.

I would say that if you can somehow find a way to make the optimizer keep that stack push, in all compilers, we should do that. IMO, the cost of a stack pointer is minimal compared to the surprising result that we currently see. But I don't know enough about compiler implementation to know whether this is a reasonable ask.

Regardless of whether it's spec or implementation, something needs to change. This is why I asked the question without any context first, to have everyone think about what they expect to happen before finding out what actually happens. I'm surprised so many expected the current behavior; I did not.

I just thought of a possible easy and effective way to ensure the thing isn't collected early:

struct Pin(T)
{
   T t;
   @nogc nothrow pure @safe ~this() {}
   alias t this;
}

...
// usage
auto c = Pin!C(new C); // now it needs to be held until the scope ends

This seems to work on LDC with -O3 to prevent the early collection, so maybe it is sound? If this is a valid mechanism to ensure it's saved, maybe it can be added to Phobos and the spec updated to recommend that.
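
For anyone who wants to try it, here is the fragment above assembled into a complete program with the class from the original question. This is only a test harness: whether "dtor" prints before or after "end of main" depends on the compiler and optimization level, so no expected output is claimed:

```d
import core.memory : GC;
import std.stdio : writeln;

class C
{
    ~this() { writeln("dtor"); }
}

// Wrapper with an (empty) destructor, as proposed above.
struct Pin(T)
{
    T t;
    @nogc nothrow pure @safe ~this() {}
    alias t this;
}

void main()
{
    auto c = Pin!C(new C);   // destructor call is scheduled for scope exit
    foreach (i; 0 .. 10_000)
        GC.collect();
    writeln("end of main");
}
```

Building once with and once without optimization (for example `-O3` on LDC) and comparing the order of the two lines shows whether the wrapper changes anything for a given compiler.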

-Steve

September 21, 2021

On Tuesday, 21 September 2021 at 12:02:05 UTC, Steven Schveighoffer wrote:

>

I just thought of a possible easy and effective way to ensure the thing isn't collected early:

struct Pin(T)
{
   T t;
   @nogc nothrow pure @safe ~this() {}
   alias t this;
}

...
// usage
auto c = Pin!C(new C); // now it needs to be held until the scope ends

This seems to work on LDC with -O3 to prevent the early collection, so maybe it is sound? If this is a valid mechanism to ensure it's saved, maybe it can be added to Phobos and the spec updated to recommend that.

-Steve

What about a scope variable that holds an instance of a class? It could be made to mean the same thing, i.e. the object must not be collected until the end of the scope.

September 21, 2021

On Tuesday, 21 September 2021 at 12:02:05 UTC, Steven Schveighoffer wrote:

>

...

I just thought of a possible easy and effective way to ensure the thing isn't collected early:

struct Pin(T)
{
   T t;
   @nogc nothrow pure @safe ~this() {}
   alias t this;
}

...
// usage
auto c = Pin!C(new C); // now it needs to be held until the scope ends

This seems to work on LDC with -O3 to prevent the early collection, so maybe it is sound? If this is a valid mechanism to ensure it's saved, maybe it can be added to Phobos and the spec updated to recommend that.

-Steve

You just rediscovered Go's runtime.KeepAlive(), and C# GC.KeepAlive()

https://pkg.go.dev/runtime#KeepAlive

https://docs.microsoft.com/en-us/dotnet/api/system.gc.keepalive

September 21, 2021

On Tuesday, 21 September 2021 at 12:02:05 UTC, Steven Schveighoffer wrote:

>

I would say that if you can somehow find a way to make the optimizer keep that stack push, in all compilers, we should do that. IMO, the cost of a stack pointer is minimal compared to the surprising result that we currently see. But I don't know enough about compiler implementation to know whether this is a reasonable ask.

I think this is not unreasonable to implement; it is similar to keeping track of which destructors to run: just do a noop/keepalive on the variable at the end of scope.
I can think of hypothetical cases where this would impact performance. For example, the function void foo(S* s) receives the pointer in a register and would have to keep it alive in a register, or push it to the stack, to preserve it for the duration of the function; in a tight loop one may not expect that (and there would be no way to opt out). We also don't want this for just any kind of parameter (e.g. not for an int), so we would need some smartness about which types to apply it to. I think this would cover it: (pointers to, arrays of) struct, class, slice, AA.

Test and see?
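
To make the trade-off concrete, here is a hypothetical sketch (the struct layout and loop body are invented for illustration). With last-use liveness, the register holding `s` is free again after the first statement; a keep-alive-until-end-of-scope rule would force the compiler to hold or spill it for the whole loop:

```d
struct S
{
    int count;
}

int foo(S* s)
{
    int n = s.count;        // last use of `s`
    int sum = 0;
    foreach (i; 0 .. n)
        sum += i;           // `s` is dead here; its register can be reused
    return sum;
}

void main()
{
    auto s = new S(4);
    assert(foo(s) == 6);    // 0 + 1 + 2 + 3
}
```

Comparing the generated code for `foo` with and without a forced keep-alive at the end of the function would be one way to measure the cost.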

>

Regardless of whether it's spec or implementation, something needs to change. This is why I asked the question without any context first, to have everyone think about what they expect to happen before finding out what actually happens. I'm surprised so many expected the current behavior; I did not.

I just thought of a possible easy and effective way to ensure the thing isn't collected early:

struct Pin(T)
{
   T t;
   @nogc nothrow pure @safe ~this() {}
   alias t this;
}

...
// usage
auto c = Pin!C(new C); // now it needs to be held until the scope ends

This seems to work on LDC with -O3 to prevent the early collection, so maybe it is sound?

I don't think it is, and I am surprised it works.
You can trivially inline the destructor, see that it does nothing, and then the liveness of the variable is very short indeed...

-Johan

September 21, 2021
On 9/20/21 11:55 AM, Paul Backus wrote:

>>    auto c = new C;
>>    foreach(i; 0 .. 10000) GC.collect;
>>    writeln("end of main");

>> Option 2:
>> ```
>> dtor
>> end of main
>> ```

> Option 2 at first seems like it should be invalid, but since `c` is
> never accessed after initialization, the compiler is free to remove the
> initialization as a dead store,

Your explanation at first :) seems invalid because it seems to disregard side effects in the constructor and the destructor. However, because destructors for GC objects are not guaranteed to be executed anyway, only the constructor should be considered here.

I love the overwritten register story but I think this is a bug because "local storage" should be sufficient to keep the object alive. However, given the presence of KeepAlive from other languages, perhaps this is a concept that needs to be communicated better.

Ali

September 21, 2021

On 9/21/21 12:19 PM, Johan wrote:

>

On Tuesday, 21 September 2021 at 12:02:05 UTC, Steven Schveighoffer wrote:

>

I would say that if you can somehow find a way to make the optimizer keep that stack push, in all compilers, we should do that. IMO, the cost of a stack pointer is minimal compared to the surprising result that we currently see. But I don't know enough about compiler implementation to know whether this is a reasonable ask.

I think this is not unreasonable to implement; it is similar to keeping track of which destructors to run: just do a noop/keepalive on the variable at the end of scope.
I can think of hypothetical cases where this would impact performance. For example, the function void foo(S* s) receives the pointer in a register and would have to keep it alive in a register, or push it to the stack, to preserve it for the duration of the function; in a tight loop one may not expect that (and there would be no way to opt out). We also don't want this for just any kind of parameter (e.g. not for an int), so we would need some smartness about which types to apply it to. I think this would cover it: (pointers to, arrays of) struct, class, slice, AA.

Test and see?

Probably you don't need to push to the stack unless the last usage is sending the variable to a function, but even that could be more expensive than just keeping it in a register.

The more I think about it (and finding out that other languages have a keepAlive feature), this really should just be changed to something that's opt-in. A way to signal to the compiler to ensure the thing gets onto the stack. And then we change the spec to say "use this feature to keep pointers alive during a scope".

If that's not the thing I posted below, then maybe even a special symbol name can be used to signal to the compiler.

> >

I just thought of a possible easy and effective way to ensure the thing isn't collected early:

struct Pin(T)
{
   T t;
   @nogc nothrow pure @safe ~this() {}
   alias t this;
}

...
// usage
auto c = Pin!C(new C); // now it needs to be held until the scope ends

This seems to work on LDC with -O3 to prevent the early collection, so maybe it is sound?

I don't think it is, and I am surprised it works.
You can trivially inline the destructor, see that it does nothing, and then the liveness of the variable is very short indeed...

Yeah, if the inliner elides the entire function, it could potentially be collected. Maybe there's something about the fact that the struct has a destructor that forces the compiler to store it on the stack?

-Steve