November 13, 2021

On Saturday, 13 November 2021 at 03:03:21 UTC, Steven Schveighoffer wrote:
> It's like saying an OS context switch that happens in the middle of a pure function must somehow be valid pure code.

No, by extending the lifetime of a GC object you guarantee that the finalizer is not executed. That means you prevent a possible side effect from occurring, which in itself is a side effect.

Assume that the finalizer calls exit() or assert(0) or does out of bounds indexing.

If you want truly strong purity you can only allow the function to extend lifetimes of objects with trivial destruction.

This might be more than you wish for, so just clarify what you want to achieve with pure.

November 13, 2021
On Friday, 12 November 2021 at 18:12:14 UTC, Andrei Alexandrescu wrote:
> We discussed this a couple of times. It's interesting. Sadly at this point implicit thread sharing of immutable is so baked into the language, it would take a lot of care to extricate. It would be very difficult even for Timon or Paul.

Just define a new keyword for unshared immutable and make "immutable" an alias.


November 13, 2021

On Friday, 12 November 2021 at 18:12:14 UTC, Andrei Alexandrescu wrote:
> On 2021-11-12 8:03, Timon Gehr wrote:
>> On 12.11.21 13:31, Steven Schveighoffer wrote:
>>> I've come to the conclusion, in order to fix this situation, so __mutable really means mutable and shared is an orthogonal piece, you need to remove the implicit sharing of immutable.
>>
>> I agree, that would be much better.
>
> We discussed this a couple of times. It's interesting. Sadly at this point implicit thread sharing of immutable is so baked into the language, it would take a lot of care to extricate. It would be very difficult even for Timon or Paul.

Why? Just introduce readonly as a qualifier for unshared immutable and define immutable to be a shortcut for shared + readonly.

November 13, 2021

On 11/12/21 11:36 PM, Stanislav Blinov wrote:
> On Saturday, 13 November 2021 at 03:03:21 UTC, Steven Schveighoffer wrote:
>> This is an odd way to look at it. The finalizers are not run directly, and maybe not even run on the same thread as the pure function being run. They also should not be observable (for the most part), because you should not have access to that data any more.
>>
>> It's like saying an OS context switch that happens in the middle of a pure function must somehow be valid pure code.
>
> Not run directly??? As far as I know, GC, on allocation, may hijack the caller to do a collection. Net effect is that it executes arbitrary (not pure, not @safe, etc.) code within caller.

Whether it does this or pushes it off to another thread is incidental. The running of finalizers is a function of the GC, not the caller.

This is no different than monads in other languages doing impure things initiated by pure functions.

> Caller is marked pure. GC may run impure finalizers that mutate some global state. Something as stupid as call to "close" which may set errno, or, I don't know, freeing half of program state, mutating shared globals... There can be no "for the most part" here. You've got pure function mutating global state. That's something pure functions aren't supposed to be doing.

The part you are not getting is that this is not something being called by the pure code, it's being run by the GC. Imagine it like a context switch to another thread that runs the GC code, and then switches back to the pure code. In fact, the GC could do this ALREADY, because it could use one of the other threads that it has paused to do the collection. But it doesn't really make any difference conceptually which thread runs it. One of those other threads could be in the middle of a pure function.

It's similar to running some kernel code, or signal code -- it's initiated by a separate entity, in this case the GC.

> Unless that changed and GC isn't doing that anymore, that's a bug that's been open for some years now.

It should be closed as invalid. Which bug is that?

-Steve

November 13, 2021

On Friday, 12 November 2021 at 12:31:03 UTC, Steven Schveighoffer wrote:
> I've come to the conclusion, in order to fix this situation, so __mutable really means mutable and shared is an orthogonal piece, you need to remove the implicit sharing of immutable. While it can make sense, the conflation of the two concepts causes impossible-to-fix issues. Not just mutable, things like thread-local garbage collection might be easier if you have to explicitly share things.
>
> -Steve

I'd rather keep immutable fully transitive. That's how it's designed to work, and it does at least some things well, like allowing multithreaded access. With some mutable gaps it's going to have many of the problems of C++ const, and also be more complicated. Plus no changes needed.

I do not think having an immutable counted reference is necessary. With present immutable we can still have a mutable counted reference to immutable payload. It does mean some extra complications if the reference is itself stored in immutable data, but the cure would be worse than the disease I think.
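To make the "mutable counted reference to immutable payload" idea concrete, here is a minimal sketch. `CountedInt` is a hypothetical name, not a Phobos or druntime type, and the sketch allocates with the GC for brevity, so it says nothing about the @nogc part of the question. It only shows that the count can live in ordinary mutable memory while the payload stays immutable.

```d
// Hypothetical illustration; `CountedInt` is not a library type.
struct CountedInt
{
    private immutable(int)* payload; // the data itself never changes
    private size_t* count;           // the count is ordinary mutable memory

    this(int value)
    {
        payload = new immutable(int)(value);
        count = new size_t;
        *count = 1;
    }

    // Postblit: every copy of the handle bumps the shared count.
    this(this)
    {
        if (count) ++*count;
    }

    ~this()
    {
        // A real implementation would release the payload when the count
        // reaches zero; here the GC owns both allocations, so the sketch
        // only tracks the number of live handles.
        if (count) --*count;
    }

    int get() const { return *payload; }
    size_t refs() const { return count ? *count : 0; }
}

void main()
{
    auto a = CountedInt(7);
    assert(a.refs == 1);
    {
        auto b = a;          // copy: count goes to 2
        assert(a.refs == 2 && b.get == 7);
    }                        // b destroyed: count back to 1
    assert(a.refs == 1);
}
```

The limitation is the one mentioned above: once the handle itself is stored in immutable data, its copies can no longer update the count.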

If we drop the immutability requirement from the reference, we can have @safe pure @nogc reference counting, or can we?

November 13, 2021

On Saturday, 13 November 2021 at 14:34:43 UTC, Steven Schveighoffer wrote:
> Whether it does this or pushes it off to another thread is incidental. The running of finalizers is a function of the GC, not the caller.

:)

> The part you are not getting is that this is not something being called by the pure code, it's being run by the GC.

import someLibrary;
// someLibrary defines a module-global
// int threadLocal;

void main()
{
    someLibrary.threadLocal = () pure nothrow {
        return someLibrary.blah(42);
    } ();
    auto old = someLibrary.threadLocal;
    auto someInts = () pure { return new int[1000]; } ();
    assert(someLibrary.threadLocal == old);
}

That assert may fail. Or you may even crash before getting to it, and not with an OutOfMemoryError, but with a FinalizeError, depending on the value of threadLocal. Or it can be totally fine if the GC doesn't collect. What am I not getting?..

> Imagine it like a context switch to another thread that runs the GC code, and then switches back to the pure code.

If I imagine that, the assert above should always hold. Because there should be no way that imaginary "another thread" would access main thread's threadLocal. Somehow, reality contradicts imagination.

> In fact, the GC could do this ALREADY, because it could use one of the other threads that it has paused to do the collection. But it doesn't really make any difference conceptually which thread runs it. One of those other threads could be in the middle of a pure function.

Could be != is.

> It's similar to running some kernel code, or signal code -- it's initiated by a separate entity, in this case the GC.
>
>> Unless that changed and GC isn't doing that anymore, that's a bug that's been open for some years now.
>
> It should be closed as invalid. Which bug is that?

https://issues.dlang.org/show_bug.cgi?id=19316

Feel free to close it as invalid, if the code above either:

  • dies with an OutOfMemoryError
  • passes the assert

regardless of value of threadLocal.

someLibrary can be this for testing:

module someLibrary;

int threadLocal;

class Good
{
    int calc(int input) pure nothrow { return input * 2; }
}

class Bad {

    int calc(int input) pure nothrow { return input + 14; }

    ~this() {
        // I agree with Walter, dtors should always be nothrow,
        // alas current language allows this
        if (threadLocal == 56) throw new Exception("ugh");
        threadLocal = 0;
    }
}

int blah(int input) pure nothrow {
    if (input <= 25)
       return (new Good).calc(input);
    else
       return (new Bad).calc(input);
}

Contrived? Maybe. Feel free to substitute threadLocal with errno, and make a syscall in Bad.~this.

November 13, 2021

On 11/13/21 3:01 PM, Stanislav Blinov wrote:
> On Saturday, 13 November 2021 at 14:34:43 UTC, Steven Schveighoffer wrote:
>> Whether it does this or pushes it off to another thread is incidental. The running of finalizers is a function of the GC, not the caller.
>
> :)
>
>> The part you are not getting is that this is not something being called by the pure code, it's being run by the GC.
>
> import someLibrary;
> // someLibrary defines a module-global
> // int threadLocal;
>
> void main()
> {
>     someLibrary.threadLocal = () pure nothrow {
>         return someLibrary.blah(42);
>     } ();
>     auto old = someLibrary.threadLocal;
>     auto someInts = () pure { return new int[1000]; } ();
>     assert(someLibrary.threadLocal == old);
> }
>
> That assert may fail. Or you may even crash before getting to it, and not with an OutOfMemoryError, but with a FinalizeError, depending on the value of threadLocal. Or it can be totally fine if the GC doesn't collect. What am I not getting?..

You are not getting that the GC collecting has nothing to do with the pure function's execution. The GC hijacks the current thread to do its business, and then passes back control to the caller.

>> Imagine it like a context switch to another thread that runs the GC code, and then switches back to the pure code.
>
> If I imagine that, the assert above should always hold. Because there should be no way that imaginary "another thread" would access main thread's threadLocal. Somehow, reality contradicts imagination.

Actually, the GC can run finalizers from ANY thread. So accessing thread locals in a GC finalizer is risky behavior anyway.
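A small sketch of why that is risky, assuming nothing beyond druntime's `core.thread`: module-level variables in D are thread-local by default, so code that runs on a different thread sees a different copy. The names `tl` and `valueSeenByNewThread` are made up for the illustration, and a plain thread stands in for "whichever thread the finalizer happens to run on".

```d
import core.thread : Thread;

int tl; // module-level variables are thread-local by default in D

// Starts a thread, lets it read and then overwrite ITS OWN copy of `tl`,
// and returns the value that thread observed first.
int valueSeenByNewThread()
{
    int seen = -1;
    auto t = new Thread({
        seen = tl;  // the new thread's copy, still at its initial value
        tl = 99;    // changes only the new thread's copy
    });
    t.start();
    t.join();
    return seen;
}

void main()
{
    tl = 1;
    assert(valueSeenByNewThread() == 0); // the other thread never saw our 1
    assert(tl == 1);                     // and its write never reached us
}
```

A finalizer that touches `tl` would therefore read or write whichever copy belongs to the thread it runs on, which need not be the thread that allocated the object.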

>> It's similar to running some kernel code, or signal code -- it's initiated by a separate entity, in this case the GC.
>
>>> Unless that changed and GC isn't doing that anymore, that's a bug that's been open for some years now.
>
>> It should be closed as invalid. Which bug is that?
>
> https://issues.dlang.org/show_bug.cgi?id=19316

Thanks! I closed it.

> Feel free to close it as invalid, if the code above either:
>
>   • dies with an OutOfMemoryError
>   • passes the assert

The assert is incorrectly written, as the thread local can change at any time if the GC happens to run on the current thread, and happens to be finalizing a Bad object (even if it was allocated via a different thread). This is how you set it up.

Honestly, I think accessing thread locals in a GC destructor should be in the spec as implementation-defined behavior.

-Steve

November 13, 2021

On Saturday, 13 November 2021 at 21:55:21 UTC, Steven Schveighoffer wrote:
> You are not getting that the GC collecting has nothing to do with the pure function's execution. The GC hijacks the current thread to do its business, and then passes back control to the caller.

...while introducing side effects, which the caller of the pure function was promised WOULD NOT HAPPEN, by the interface of the pure function.

>>> Imagine it like a context switch to another thread that runs the GC code, and then switches back to the pure code.
>
>> If I imagine that, the assert above should always hold. Because there should be no way that imaginary "another thread" would access main thread's threadLocal. Somehow, reality contradicts imagination.
>
> Actually, the GC can run finalizers from ANY thread. So accessing thread locals in a GC finalizer is risky behavior anyway.

The language allows this, and the runtime does this. There's no need for speculation. Risky or not. If a language guarantees something can't happen, it better not happen. Pure functions cannot mutate global state, except through arguments, in which case it's a weakly pure function. That's what the language guarantees. Pure functions can only call pure functions. That's what the language guarantees.
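For readers less familiar with the terminology, a minimal sketch of the distinction, with made-up function names (`twice`, `bump`, `sneaky`):

```d
int counter; // module-level, thread-local, mutable

// Strongly pure: no mutable indirections in the parameters, so the
// result depends only on the argument value.
int twice(int x) pure nothrow @nogc @safe
{
    return x * 2;
}

// Weakly pure: still `pure`, but may mutate what its argument points to.
void bump(int* p) pure nothrow @nogc @safe
{
    ++*p;
}

// Rejected by the compiler if uncommented, because a pure function
// may not read or write mutable module-level state directly:
// int sneaky() pure { return ++counter; }

void main()
{
    assert(twice(21) == 42);
    bump(&counter);       // the caller, not `bump`, chose to expose `counter`
    assert(counter == 1);
}
```

In both accepted cases the only state that can change is state the caller handed in explicitly, which is the guarantee the argument here relies on.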

>>> It's similar to running some kernel code, or signal code -- it's initiated by a separate entity, in this case the GC.
>
>>>> Unless that changed and GC isn't doing that anymore, that's a bug that's been open for some years now.
>
>>> It should be closed as invalid. Which bug is that?
>
>> https://issues.dlang.org/show_bug.cgi?id=19316
>
> Thanks! I closed it.

Cool. Then it's on you to reopen it back.

> The assert is incorrectly written,

Read a, call pure function, read a. Where is it possible for a to mutate, after first read and before second read? Can't be another thread, a is a thread-local int. Can't be the pure function, it's pure and can't mutate a. So where is it possible for a to mutate?

> as the thread local can change at any time if the GC happens to run on the current thread, and happens to be finalizing a Bad object (even if it was allocated via a different thread).

...Which means that anything that triggers collection (including allocation) cannot be pure. Which is exactly my point. Nor can it be @safe.

> This is how you set it up.

I have no words...

> Honestly, I think accessing thread locals in a GC destructor should be in the spec as implementation-defined behavior.

So, then, should be any other side effects. Good luck with that.

November 14, 2021
On Saturday, 13 November 2021 at 23:08:01 UTC, Stanislav Blinov wrote:
> Read `a`, call pure function, read `a`. Where is it possible for `a` to mutate, after first read and before second read? Can't be the pure function, it's pure and can't mutate `a`. So where is it possible for `a` to mutate?

auto x = new int;
assert(a < 5);
auto y = new int;
assert(a < 5);
*x += *y;

Would you begrudge the compiler the ability to remove the second assert?  Because there is the _exact_ same problem there.

Actually, forget allocation.  A signal can occur at _any_ time, and its handler can change globals.

> Can't be another thread, `a` is a thread-local int.

You can share pointers to thread-local objects.
November 14, 2021
On Sunday, 14 November 2021 at 00:17:29 UTC, Elronnd wrote:
> On Saturday, 13 November 2021 at 23:08:01 UTC, Stanislav Blinov wrote:
>> Read `a`, call pure function, read `a`. Where is it possible for `a` to mutate, after first read and before second read? Can't be the pure function, it's pure and can't mutate `a`. So where is it possible for `a` to mutate?
>
> auto x = new int;
> assert(a < 5);
> auto y = new int;
> assert(a < 5);
> *x += *y;
>
> Would you begrudge the compiler the ability to remove the second assert?  Because there is the _exact_ same problem there.

So long as collection can call any destructors indiscriminately - yes, I would. Because with current runtime and GC spec `a` might mutate in either call to `new`. Even in both.

> Actually, forget allocation.  A signal can occur at _any_ time, and its handler can change globals.

Indeed, forget allocation, given that I'm talking about collection. But anyway, so what? What do signals have to do with `new` pretending to be pure when it calls destructors that aren't?

>> Can't be another thread, `a` is a thread-local int.
>
> You can share pointers to thread-local objects.

Indeed you can. How does that apply here?