September 24, 2021
On Fri, Sep 24, 2021 at 11:31:46AM -0400, Steven Schveighoffer via Digitalmars-d wrote:
> On 9/24/21 11:25 AM, deadalnix wrote:
> > On Thursday, 23 September 2021 at 19:54:56 UTC, Steven Schveighoffer wrote:
> > > You can. But wouldn't you prefer just pushing something on the stack?
> > > 
> > 
> > Not really. If the optimizer can't remove dead stack pushes, then the program will become 2x slower instantly, in addition to consuming more stack memory.
> > 
> 
> You think pushing on the stack is going to be 2x slower than calling `GC.addRoot`?
[...]

I still prefer GC.addRoot.

For one thing, that's the "official" way to inform the GC that a certain object is still needed and therefore should not be collected.

Secondly, it self-documents the intent of the code, instead of some arcane workaround like struct Pin (that may or may not work in the future depending on how smart optimizers become).

Third, if the overhead of calling GC.addRoot becomes an actual problem, it can always be turned into an intrinsic that the compiler can, based on certain conditions, replace with an equivalent internal flag that ensures the value stays on the stack until the end of the scope.


T

-- 
Three out of two people have difficulties with fractions. -- Dirk Eddelbuettel
September 24, 2021
Off topic, I heard that some people missed this very important thread because they didn't guess from the subject line that the content was important.

Ali

September 24, 2021

On 9/24/21 12:21 PM, H. S. Teoh wrote:

> I still prefer GC.addRoot.
>
> For one thing, that's the "official" way to inform the GC that a certain object is still needed and therefore should not be collected.

The "official" docs also say, put a pointer on the stack if you want it to not be collected.

Note that GC.addRoot performs a different function. It keeps the memory alive until you use GC.removeRoot. Putting a pointer on the stack keeps the thing alive until the end of the stack frame. They are not the same thing.
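
To make the difference concrete, here is a minimal sketch of the addRoot/removeRoot pairing. The `Payload` class and the `survivesCollection` helper are hypothetical names used only for illustration. The reference is parked in C heap memory, which the GC does not scan, so the root is what keeps the object reachable.

```d
import core.memory : GC;
import core.stdc.stdlib : free, malloc;

// Hypothetical payload type, for illustration only.
class Payload
{
    int value = 42;
}

// Stores the reference in malloc'd memory (not scanned by the GC) and
// relies on GC.addRoot to keep the object alive across a collection.
bool survivesCollection()
{
    auto slot = cast(void**) malloc((void*).sizeof);
    if (slot is null)
        return false;
    scope (exit) free(slot);

    *slot = cast(void*) new Payload;
    GC.addRoot(*slot);                 // rooted from here on...
    scope (exit) GC.removeRoot(*slot); // ...until explicitly removed

    GC.collect();

    return (cast(Payload) *slot).value == 42;
}

unittest
{
    assert(survivesCollection());
}
```

The sketch can be compiled with `dmd -unittest -main`. Without the `removeRoot` call the object would stay rooted after the function returns, which is the difference from a pointer that merely sits in the stack frame.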

> Secondly, it self-documents the intent of the code, instead of some arcane workaround like struct Pin (that may or may not work in the future depending on how smart optimizers become).

This is pretty self documenting:

obj.keepAlive;

Or perhaps:

GC.keepAlive(obj);

which means, keep this object alive until this line at least. The name Pin was something I just quickly thought of. But I think keepAlive is much more descriptive (and has precedent).
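
As a rough sketch of what a library-level `keepAlive` might look like: nothing below exists in druntime, and the helper, its name, and the `keepAliveSink` variable are all hypothetical. It assumes a druntime that ships `core.volatile`, and it assumes that forcing a volatile store of the reference is enough to make the compiler keep the reference around up to the call. Whether that holds for every compiler and optimization level is not something this sketch proves.

```d
import core.volatile : volatileStore;

// Hypothetical sink, not part of druntime. Thread-local by default in D.
private size_t keepAliveSink;

// Hypothetical helper: forces the reference to exist at the call site.
void keepAlive(T)(T obj) if (is(T == class))
{
    // A volatile store may not be elided, so the reference has to be
    // materialized at this point in the program.
    volatileStore(&keepAliveSink, cast(size_t) cast(void*) obj);
    // Clear the sink again so it does not itself keep the object reachable.
    volatileStore(&keepAliveSink, size_t(0));
}

unittest
{
    auto obj = new Object;
    // ... work that only uses data derived from obj ...
    keepAlive(obj); // obj is needed at least until this line
    assert(keepAliveSink == 0);
}
```

An intrinsic would avoid the two stores entirely; the point here is only that the call reads as a statement of intent at the place where the object must still be alive.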

> Third, if the overhead of calling GC.addRoot becomes an actual problem, it can always be turned into an intrinsic that the compiler can, based on certain conditions, replace with an equivalent internal flag that ensures the value stays on the stack until the end of the scope.

  1. GC.addRoot cannot mean "put on the stack". Because it has to be paired with a GC.removeRoot at a later point in the same frame to have the same effect. Sure, an intrinsic is possible for this situation, but it's way more complex, and I feel not as easily deciphered.
  2. If keepAlive is poorly performing, it too can be replaced with an intrinsic (as it is in C# and Go).

-Steve

September 24, 2021
On 9/20/2021 11:26 AM, Steven Schveighoffer wrote:
> [...]

1, 2, or 3 are all valid outcomes.

The lifetime of a GC allocated object ends when there are no longer live references to it. Live references can end before the scope ends, this is called "non-lexical scoping". That does not imply that that is when the class destructor is run. The class destructor runs at some arbitrary time *after* there are no longer references to it.

The GC is not obligated to run a collection cycle upon program termination (a laxity intended to improve shutdown performance), and hence it is not obliged to run the class destructors.

The inevitable consequence of this is:

Do *not* use the GC to manage non-memory resources.

But if you must do it anyway, use the "destroy" and "free" GC special functions. Of course, if you decide to use these functions, it's up to you to ensure resources are free'd exactly once.

https://dlang.org/phobos/object.html#.destroy
https://dlang.org/phobos/core_memory.html#.GC.free
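
A minimal sketch of the `destroy` route, assuming a hypothetical `Resource` class whose destructor stands in for closing an OS handle (the counter is only there to make the effect observable):

```d
// Hypothetical resource wrapper, for illustration only.
class Resource
{
    static int openHandles; // handles not yet released (thread-local)

    this() { ++openHandles; }

    ~this() { --openHandles; } // stand-in for closing an OS handle
}

unittest
{
    immutable before = Resource.openHandles;

    auto r = new Resource;
    assert(Resource.openHandles == before + 1);

    destroy(r); // runs ~this now instead of at some later collection
    assert(Resource.openHandles == before);
}
```

After `destroy(r)` the reference must not be used as a live object any more; keeping track of that is the caller's job.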

P.S. A live reference is one where there is a future use of it. A dead reference is one where there isn't a future use. The `c` variable is a dead reference immediately after it is initialized, which is why optimizers delete the assignment. The `new C` return value is dead as soon as it is created.

P.P.S. Attempting to deduce the GC's rules from observing its behavior is very likely a path to frustration and errors, because its observed behavior will not make sense (and will appear random) unless one understands the above explanation.

P.P.S. Do not conflate class destructors with struct destructors. The latter follow RAII rules, the former does not.
September 24, 2021
On 9/24/21 5:12 PM, Walter Bright wrote:

> "non-lexical scoping"

> live reference

Those are new concepts to me. :(

> dead reference immediately after it is initialized

I am sure I have dead references in my D library that exposes a C API. If it works, it must be because I don't hit a GC cycle in my thin extern(C) function. (Single-threaded too.)

> Do not conflate class destructors with struct destructors. The
> latter follow RAII rules, the former does not.

Some people stress that fact by using the term "finalizer" for classes.

Ali

September 24, 2021
On 9/21/2021 5:02 AM, Steven Schveighoffer wrote:
> I just thought of a possible easy and effective way to ensure the thing isn't collected early:
> 
> ```d
> struct Pin(T)
> {
>     T t;
>     @nogc nothrow pure @safe ~this() {}
>     alias t this;
> }
> 
> ...
> // usage
> auto c = Pin!C(new C); // now it needs to be held until the scope ends
> ```

Use addRoot()/removeRoot() to do this in a documented and supported fashion.

https://dlang.org/phobos/core_memory.html#addRoot
September 24, 2021
On 9/21/2021 3:58 AM, Johan wrote:
> First: the use of "stack" here is wrong and confusing. It should be "local storage" (notorious error throughout the spec). Indeed, what is done in your example is putting the pointer in local storage. The scope of that local storage is until the end of function scope (in your example). I don't think (in LDC) that we track the lifetime of variables in that way, so what is done is that the optimizer just looks at last point of use. This is similar to how Java behaves:
> https://stackoverflow.com/questions/39285108/can-java-garbage-collect-variables-before-end-of-scope 

Data flow analyzers are not scope-based, they are based on first use and last use. The @live pointer tracking does this, too. Destructors happen on going out of scope, but the compiler can move them as long as the code behaves "as if" the destructor happened at the end of scope.

But with garbage collection, the class destructors are run at some arbitrary time after last use, not after going out of scope.
September 25, 2021

On 9/24/21 8:12 PM, Walter Bright wrote:

> On 9/20/2021 11:26 AM, Steven Schveighoffer wrote:
> > [...]
>
> 1, 2, or 3 are all valid outcomes.
>
> The lifetime of a GC allocated object ends when there are no longer live references to it. Live references can end before the scope ends, this is called "non-lexical scoping". That does not imply that that is when the class destructor is run. The class destructor runs at some arbitrary time *after* there are no longer references to it.

Right, but the optimizer is working against that.

For example:

auto c = new C;
.... // a whole bunch of other code

c.method; // not necessarily still live here.

Why would it not be live there? Because the method might be inlined, and the compiler might determine at that point that none of the data inside the method is needed, and therefore the object is no longer needed.

So technically, it's not live after the original allocation. But this is really hard for a person to determine. Imagine having the GC collect your object when the object is clearly "used" later?

This is why I made this thread. It's easy to explain why it's happening, but it's really hard to follow the instructions "leave a pointer on the stack" as noted in the spec.

> The GC is not obligated to run a collection cycle upon program termination (a laxity intended to improve shutdown performance), and hence it is not obliged to run the class destructors.

Yeah, that's not really the focus here, but it's a good point to make.

> The inevitable consequence of this is:
>
> Do *not* use the GC to manage non-memory resources.

When D uses the GC for delegates, classes, etc, and you want to hook to those things via C callbacks, this advice falls flat.

Basically, you are saying, when using your OS primitives, don't use D.

> But if you must do it anyway, use the "destroy" and "free" GC special functions. Of course, if you decide to use these functions, it's up to you to ensure resources are free'd exactly once.
>
> https://dlang.org/phobos/object.html#.destroy
> https://dlang.org/phobos/core_memory.html#.GC.free

Don't recommend free. Just use destroy.

> P.S. A live reference is one where there is a future use of it. A dead reference is one where there isn't a future use. The `c` variable is a dead reference immediately after it is initialized, which is why optimizers delete the assignment. The `new C` return value is dead as soon as it is created.

It would be good for the spec to define what a "live" reference is. Currently, the GC talks about stacks, heap and registers. And I think it should be clear about whether using a reference later should be considered making it "live", as it is not currently when optimized.

> P.P.S. Attempting to deduce the GC's rules from observing its behavior is very likely a path to frustration and errors, because its observed behavior will not make sense (and will appear random) unless one understands the above explanation.

Yes, it is difficult to ensure a collection doesn't occur or does occur. But clearly if it does occur and you don't expect it to, it's not doing what you thought it was. If it's not doing what you thought it was (e.g. keeping the object live), but it doesn't get collected due to some other reason, then it's hard to prove things conclusively.

I've resorted to reading the assembly instead of using the measured collection, at least to see if the compiler works around efforts to keep things live, as that is more reliable. However, it's harder to figure out.

> P.P.S. Do not conflate class destructors with struct destructors. The latter follow RAII rules, the former does not.

Actually, this is not strictly true. Structs allocated on the heap do not follow RAII rules, and stack allocations of structs are a completely different issue.
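
A small sketch of that distinction, using a hypothetical `Guard` struct with a counter so the destructor calls are observable:

```d
// Hypothetical struct, for illustration only.
struct Guard
{
    static int live; // instances constructed but not yet destructed

    this(int id) { ++live; } // id is unused; it only selects this constructor
    ~this() { --live; }

    @disable this(this); // no copies, so the count stays balanced
}

void onStack()
{
    auto g = Guard(1);
} // g's destructor runs here, at the end of the scope

void onHeap()
{
    auto g = new Guard(2);
} // only the pointer goes away; the destructor has not run

unittest
{
    onStack();
    assert(Guard.live == 0);

    onHeap();
    assert(Guard.live == 1); // left to the GC, if it ever finalizes it
}
```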

-Steve

September 25, 2021
On Saturday, 25 September 2021 at 17:46:01 UTC, Steven Schveighoffer wrote:
>> P.P.S. Do not conflate class destructors with struct destructors. The latter follow RAII rules, the former does not.
>
> Actually, this is not strictly true. Structs allocated on the heap do not follow RAII rules, and stack allocations of structs are a completely different issue.

Indeed; additionally, 'scope c = new Class()' _does_ follow RAII.
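
For example, a minimal sketch with a hypothetical `Tracker` class that counts destructor runs:

```d
// Hypothetical class, for illustration only.
class Tracker
{
    static int finalized; // number of destructor runs (thread-local)

    ~this() { ++finalized; }
}

void useScoped()
{
    scope t = new Tracker;
} // ~this runs here, when t goes out of scope

unittest
{
    immutable before = Tracker.finalized;
    useScoped();
    assert(Tracker.finalized == before + 1);
}
```
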
September 25, 2021
On 9/25/2021 10:46 AM, Steven Schveighoffer wrote:
> Right, but the optimizer is working against that.
> 
> For example:
> 
> ```d
> auto c = new C;
> .... // a whole bunch of other code
> 
> c.method; // not necessarily still live here.
> ```
> 
> Why would it not be live there? Because the method might be inlined, and the compiler might determine at that point that none of the data inside the method is needed, and therefore the object is no longer needed.
> 
> So technically, it's not live after the original allocation. But this is really hard for a person to determine. Imagine having the GC collect your object when the object is clearly "used" later?

I understand your point. Its value is never used again, so there is no reason for the GC to hold on to it. After the point when the value is never used again, when the destructor is run is indeterminate. Maybe the real problem is the user is expecting the destructor to run at a specific point in the execution.

The point of putting the variable on the stack (or in a register, it works the same) is so the GC can find it. If D code is being called, D does not allow the hiding of pointers. But C does allow this, such as when doing the singly linked list XOR trick. The GC won't find those pointers, and will collect the memory they point to. But if the pointer is still on the stack, the GC will find it.

If the function is inlined, it is not C code, so hiding the pointer won't be allowed, and it's not a problem.


>> Do *not* use the GC to manage non-memory resources.
> When D uses the GC for delegates, classes, etc, and you want to hook to those things via C callbacks, this advice falls flat.

If the closure for a delegate is located on the stack, then it will be found by the GC and works fine. If the closure is located on the GC heap, and the OS will keep it around past the return, then you'll need to use addRoot.


> Basically, you are saying, when using your OS primitives, don't use D.

addRoot()