September 22, 2021

On Wednesday, 22 September 2021 at 19:14:43 UTC, Steven Schveighoffer wrote:

>

On 9/22/21 3:10 PM, Adam D Ruppe wrote:

>

One thing we might do is if passing a pointer to an extern(C) function, the compiler assumes it shouldn't stomp the memory until end of scope.

What if you are calling D functions that call extern(C) functions?

I don't think this is the answer either.

Either (a) the compiler must assume, pessimistically, that a pointer passed to a function may be stored somewhere the GC can't see it, and therefore must be preserved in local storage until the end of its lexical lifetime, or (b) the user must manually signal to the compiler or the GC that the pointer should be kept alive before passing it to the function.

Given how error-prone option (b) is, I think (a) is the more sensible choice. Users who don't want the pointer kept alive can always opt in to that behavior by enclosing it in a block scope to limit its lifetime, or setting it to null when they're done with it.
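
To make the two opt-outs concrete, here is a minimal sketch of what they might look like under option (a). The `Resource` class and both helper functions are hypothetical names used only for illustration, and whether the GC actually reclaims the object at `GC.collect` is not guaranteed (a conservative scan can still find a stale copy of the reference).

```d
import core.memory : GC;

// Hypothetical class, only for illustration.
class Resource
{
    int id;
    this(int id) { this.id = id; }
}

// Opt-out 1: a nested block limits the lexical lifetime of the reference.
int useInBlock()
{
    int seen;
    {
        auto r = new Resource(1);
        seen = r.id;
    } // r's lexical lifetime ends here, not at the end of the function
    return seen;
}

// Opt-out 2: clear the reference explicitly once it is no longer needed.
bool dropEarly()
{
    auto r = new Resource(2);
    assert(r.id == 2);
    r = null;      // this variable no longer keeps the object alive
    GC.collect();  // the object may be reclaimed now (not guaranteed)
    return r is null;
}

void main()
{
    assert(useInBlock() == 1);
    assert(dropEarly());
}
```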

September 22, 2021

On Wednesday, 22 September 2021 at 19:30:37 UTC, Paul Backus wrote:

>

Either (a) the compiler must assume, pessimistically, that a pointer passed to a function may be stored somewhere the GC can't see it, and therefore must be preserved in local storage until the end of its lexical lifetime, or (b) the user must manually signal to the compiler or the GC that the pointer should be kept alive before passing it to the function.

Given how error-prone option (b) is, I think (a) is the more sensible choice. Users who don't want the pointer kept alive can always opt in to that behavior by enclosing it in a block scope to limit its lifetime, or setting it to null when they're done with it.

The problem with (a) is that it eats up resources such as registers and stack space, preventing better code generation. The compiler might keep the pointer in a register, blocking a register that could otherwise be used for optimization. Register loads from and stores to the stack will also increase.

Freeing after the last use is a perfectly sane assumption. Also, if you are using RC, the compiler should decrease the reference count after the last use in order to free up resources like registers. Therefore I think that requiring the resource to be kept alive for the entire scope hurts code generation while the benefit is low.

Also, option (a) is good in 99% of cases, because as long as the resource is being used it must be somewhere (assuming you are using D all the way), which means the D GC will find it.

The problems described in this thread are really fringe problems, and I think they could be resolved by other means. KeepAlive is one of them, and it is also GC agnostic: it works on any GC. I'd rather say (b) is the better alternative, covering those very special cases.

September 22, 2021

On 9/22/21 3:24 PM, Adam D Ruppe wrote:

>

On Wednesday, 22 September 2021 at 19:14:43 UTC, Steven Schveighoffer wrote:

>

What if you are calling D functions that call extern(C) functions?

Then it will be an argument to that other D function which makes it a local variable there and the same rule can apply.

If the C function stores something beyond the immediate function, you are already supposed to malloc it or addRoot etc, so I don't think the depth of the call stack really makes a difference.
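
As a rough sketch of the addRoot route: the `Handler` class and `roundTrip` function below are hypothetical, and a malloc'd slot stands in for storage inside a C library that the GC does not scan. `GC.addRoot` and `GC.removeRoot` are the existing `core.memory` calls.

```d
import core.memory : GC;
import core.stdc.stdlib : free, malloc;

// Hypothetical class, only for illustration.
class Handler
{
    int id;
    this(int id) { this.id = id; }
}

// Stores a Handler only in C-heap memory, forces a collection,
// then reads it back. Returns the id seen after the collection.
int roundTrip()
{
    // Stand-in for storage owned by a C library; the GC does not scan it.
    auto slot = cast(void**) malloc((void*).sizeof);
    assert(slot !is null, "malloc failed");
    scope(exit) free(slot);

    auto h = new Handler(17);
    GC.addRoot(cast(void*) h); // tell the GC to keep the object alive
    *slot = cast(void*) h;
    h = null;                  // no D-visible reference remains

    GC.collect();              // the root keeps the object alive

    auto back = cast(Handler) *slot;
    int id = back.id;
    GC.removeRoot(*slot);      // hand the object back to normal GC rules
    return id;
}

void main()
{
    assert(roundTrip() == 17);
}
```

The root has to be removed again at the point where the C side drops its copy, otherwise the object is never collected.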

While I don't necessarily disagree with you, this removes all possibility of abstraction. Even a local function cannot be used to factor out initialization of a C resource.

I think code which does not properly clean up the C resources it uses, or does so with the expectation that the cleanup must run after main exits, is very suspect. But the fact that you can't rely on the recommended remedy in the spec needs fixing, and I don't think this fixes it.

This also introduces storage of pointers when it is not necessary (probably the vast majority of extern(C) calls).

-Steve

September 22, 2021

FYI, I filed an issue so it's not forgotten.

https://issues.dlang.org/show_bug.cgi?id=22331

-Steve

September 22, 2021

On Wednesday, 22 September 2021 at 19:30:37 UTC, Paul Backus wrote:

>

Either (a) the compiler must assume, pessimistically, that a pointer passed to a function may be stored somewhere the GC can't see it, and therefore must be preserved in local storage until the end of its lexical lifetime, or (b) the user must manually signal to the compiler or the GC that the pointer should be kept alive before passing it to the function.

Given how error-prone option (b) is, I think (a) is the more sensible choice. Users who don't want the pointer kept alive can always opt in to that behavior by enclosing it in a block scope to limit its lifetime, or setting it to null when they're done with it.

Neither of these solutions would've helped with the original code that used the epoll API.

The pointer passed to C wasn't important, and it would've been fine for its lifetime to end at the end of the function (it was even a pointer to a stack-allocated struct in the function: epoll copies the struct out of the pointer given to it).

The pointer that mattered was a misaligned class reference in the structure passed to C. This pointer was to an object that will eventually get destructed before the end of its function scope. That object wasn't passed anywhere in its scope, it was initialized and then had a method called on it.

At the site of the short-lived object, there's nothing apparently wrong. At the site of the C API call, what can you do? If you tell the GC that the short-lived object's reference is important, the GC still can't see any references to it afterward, and the GC isn't responsible for the object's short lifetime. If you tell the compiler that the short-lived object's reference is important, this could be a completely separate compilation and the decision to expire it early might've already been made.

I think the intuition that's violated here is "the GC is nondeterministic, but the stack is deterministic." And the correct intuition is "class destruction is nondeterministic, but struct destruction is (usually) deterministic."
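
For the struct half of that intuition, a small sketch (the `Guard` struct and the `destroyed` counter are hypothetical names) showing that a struct destructor runs exactly when the enclosing scope ends, without any involvement from the GC:

```d
int destroyed; // incremented by the struct destructor below

// Hypothetical struct, only for illustration.
struct Guard
{
    int id;
    ~this() { ++destroyed; }
}

// Returns how many Guard destructors have run once the inner block exits.
int countAfterScope()
{
    destroyed = 0;
    {
        auto g = Guard(1);
        assert(destroyed == 0); // still alive inside its scope
    } // destructor runs here, deterministically
    return destroyed;
}

void main()
{
    assert(countAfterScope() == 1);
}
```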

This can also happen without calling out to C:

import std.stdio : writeln;
import core.memory : GC;

struct Hidden {
    align(1):
    int spacer;
    C obj;
}

Hidden hidden;

class C {
    int id;

    this(int id) {
        this.id = id;
        hidden.obj = this;
    }

    ~this() {
        id = -id;
        writeln("dtor");
    }
}

void check() {
    writeln(hidden.obj.id);
}

void main() {
    auto c = new C(17);
    check; // 17, c is alive
    GC.collect;
    GC.collect;
    check; // -17, c was destroyed
    writeln("here");
}
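
A sketch of why the reference in `Hidden` is invisible, assuming (as I understand the current conservative GC) that memory is scanned as pointer-sized, pointer-aligned words. `Packed`, `Natural`, and `pointerAligned` are hypothetical names; `Packed` mirrors the layout of `Hidden` above.

```d
import std.stdio : writeln;

// Same layout as Hidden above: align(1) packs obj right after the int.
struct Packed
{
    align(1):
    int spacer;
    Object obj;
}

// Default layout: the compiler pads so that obj is pointer-aligned.
struct Natural
{
    int spacer;
    Object obj;
}

// True if a field at this offset starts on a pointer-sized boundary.
bool pointerAligned(size_t offset)
{
    return offset % (void*).sizeof == 0;
}

void main()
{
    writeln("Packed.obj.offsetof  = ", Packed.obj.offsetof);
    writeln("Natural.obj.offsetof = ", Natural.obj.offsetof);

    assert(Packed.obj.offsetof == 4);
    assert(pointerAligned(Natural.obj.offsetof));

    // On 64-bit targets the packed reference straddles two scanned words.
    version (D_LP64)
        assert(!pointerAligned(Packed.obj.offsetof));
}
```

On a 32-bit target offset 4 is itself pointer-aligned, so the misalignment in this particular layout only shows up on 64-bit builds.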

September 22, 2021

On Wednesday, 22 September 2021 at 19:50:37 UTC, IGotD- wrote:

>

The problems described in this thread are really fringe problems, and I think they could be resolved by other means. KeepAlive is one of them, and it is also GC agnostic: it works on any GC. I'd rather say (b) is the better alternative, covering those very special cases.

If GC.keepAlive is added, how do you know when to use it without first running into this fringe problem in a specific part of your code? In the code that prompted this discussion, the object lifetimes immediately around the C API calls were actually fine. GC.keepAlive in the functions with those API calls, even if applied to the exact object whose lifetime was too short, would not have saved the object.

September 22, 2021

On Wednesday, 22 September 2021 at 12:31:48 UTC, Steven Schveighoffer wrote:

>

And by the way I tried naive usage, and the compiler saw right through that:

auto c = new C;
scope(exit) auto fake = c; // still collected early

That just means you aren't using it properly. It's so in this case, but it isn't always the case. For an example of how D interoperates with the C heap, see the Array container: https://github.com/dlang/phobos/blob/master/std/container/array.d#L604
Another option is to allocate objects in the C heap; then they won't have this problem.
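
A minimal sketch of that second option, using `malloc` and `core.lifetime.emplace`. The class `C` and the two helper functions are hypothetical. It assumes the object holds no references to GC-allocated memory; if it did, the block would also need to be registered with `GC.addRange`.

```d
import core.lifetime : emplace;
import core.stdc.stdlib : free, malloc;

// Hypothetical class, only for illustration.
class C
{
    int id;
    this(int id) { this.id = id; }
}

// Constructs a C in malloc'd memory, so the GC never owns or collects it.
C makeOnCHeap(int id)
{
    enum size = __traits(classInstanceSize, C);
    void* mem = malloc(size);
    assert(mem !is null, "malloc failed");
    return emplace!C(mem[0 .. size], id);
}

// Runs the destructor (if any) and releases the memory manually.
void freeFromCHeap(C c)
{
    destroy(c);
    free(cast(void*) c);
}

void main()
{
    auto c = makeOnCHeap(17);
    assert(c.id == 17);
    freeFromCHeap(c);
}
```

The cost is that the lifetime is now entirely manual: nothing frees the object unless `freeFromCHeap` is called.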

September 22, 2021

On 9/21/21 8:02 AM, Steven Schveighoffer wrote:

>

I just thought of a possible easy and effective way to ensure the thing isn't collected early:

struct Pin(T)
{
    T t;
    @nogc nothrow pure @safe ~this() {}
    alias t this;
}

...
// usage
auto c = Pin!C(new C); // now it needs to be held until the scope ends

I made a package for something like this: https://code.dlang.org/packages/keepalive

Maybe it might find some use.
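
A small usage sketch of the `Pin` wrapper quoted above, showing that `alias t this` lets the wrapped object be used as if it were the object itself. The class `C` and the function `usePinned` are hypothetical, and whether an optimizing compiler actually keeps the reference alive until the end of the scope is not guaranteed by this sketch.

```d
// Hypothetical class, only for illustration.
class C
{
    int id;
    this(int id) { this.id = id; }
    int twice() { return 2 * id; }
}

// The wrapper as quoted in this thread.
struct Pin(T)
{
    T t;
    @nogc nothrow pure @safe ~this() {}
    alias t this;
}

// Members of C are reachable directly through the wrapper.
int usePinned()
{
    auto c = Pin!C(new C(21));
    return c.twice(); // forwarded to c.t.twice() by alias this
}

void main()
{
    assert(usePinned() == 42);
}
```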

-Steve

September 22, 2021

On Wednesday, 22 September 2021 at 20:17:48 UTC, jfondren wrote:

>

Neither of these solutions would've helped with the original code that used the epoll API.

The pointer passed to C wasn't important, and it would've been fine for its lifetime to end at the end of the function (it was even a pointer to a stack-allocated struct in the function: epoll copies the struct out of the pointer given to it).

The pointer that mattered was a misaligned class reference in the structure passed to C. This pointer was to an object that will eventually get destructed before the end of its function scope. That object wasn't passed anywhere in its scope, it was initialized and then had a method called on it.

Finally someone tries to solve the real problem. It's beyond me why people try to count angels dancing on the tip of a needle.

September 23, 2021

On Wednesday, 22 September 2021 at 21:06:11 UTC, Steven Schveighoffer wrote:

>

On 9/21/21 8:02 AM, Steven Schveighoffer wrote:

>

I just thought of a possible easy and effective way to ensure the thing isn't collected early:

struct Pin(T)
{
    T t;
    @nogc nothrow pure @safe ~this() {}
    alias t this;
}

...
// usage
auto c = Pin!C(new C); // now it needs to be held until the scope ends

I made a package for something like this: https://code.dlang.org/packages/keepalive

Maybe it might find some use.

For the simple Pin version above, LDC generates the same machine code with/without Pin (as expected):

https://d.godbolt.org/z/MW7d9Mefe

-Johan