April 01, 2018
On Monday, April 02, 2018 00:25:52 Nicholas Wilson via Digitalmars-d wrote:
> On Sunday, 1 April 2018 at 17:08:37 UTC, Andrei Alexandrescu wrote:
> > On 4/1/18 10:59 AM, Nicholas Wilson wrote:
> > [...]
> > int[] sneaky;
> > struct A
> > {
> >     private int[] innocent;
> >     this(this)
> >     {
> >         sneaky = innocent;
> >     }
> > }
> > void main()
> > {
> >     immutable a = A([1, 2, 3]);
> >     auto b = a;
> >     sneaky[1] = 42; // oops
> >     import std.stdio;
> >     writeln(a.innocent); // ooooops
> > }
> >
> > Sadly this (and many similar ones) compiles and runs warning-free on today's compiler. We really need to close this loop, like, five years ago.
>
> How much of this class of bug would be eliminated by requiring that `this(this)` be pure for assignment to const and immutable objects? Arguably this(this) should always be pure in any sane program. The only reason I can think of is if you're trying to perf the number of copies you're making, but there is compiler help for that.

All kinds of things could be done with a postblit constructor that don't actually involve copying. These include logging or printing something, and they could include stuff like reference counting, which may or may not need access to something external. debug statements would solve some use cases but not all. Requiring that this(this) be pure has some of the same issues that requiring opEquals to be pure or const or whatever has. It makes sense in _most_ cases, but occasionally, there are good reasons for it not to be - especially in a systems language.

We have to be _very_ careful about requiring any particular attribute much of anywhere. Doing so typically causes problems - e.g. those we have with Object's opEquals, opCmp, toHash, and toString. That decision really needs to be left up to a particular code base. It's also a big part of why templates infer attributes. Someone can write their code in such a way that their code base requires a particular attribute, but in general, language features shouldn't be requiring any specific attributes, or we'll just be backing ourselves into another corner.

- Jonathan M Davis

April 02, 2018
On 01/04/18 03:32, H. S. Teoh wrote:
> The one nagging question I've been having about pure is: how much are we
> actually taking advantage of the guarantees provided by pure?

My problem is that pure does not provide many guarantees.

>  We have
> developed very clever ways of extending the traditional definition of
> pure and invented creative ways of making more things pure, which is all
> great.

Can anyone explain to me what good are the guarantees provided by a function that is pure but not strictly pure? I couldn't find them.

>  But AFAIK the only place where it's actually taken advantage of
> is to elide some redundant function calls inside a single expression.

You cannot even do that unless the function is strictly pure. For all D's extension of the pure concept, it weakened, rather than enhanced, what it means.

> And perhaps infer uniqueness in some cases for implicit casting to
> immutable.

Can you expand on that one?

Shachar
April 02, 2018
On Monday, April 02, 2018 09:56:19 Shachar Shemesh via Digitalmars-d wrote:
> On 01/04/18 03:32, H. S. Teoh wrote:
> > The one nagging question I've been having about pure is: how much are we actually taking advantage of the guarantees provided by pure?
>
> My problem is that pure does not provide many guarantees.
>
> > We have developed very clever ways of extending the traditional definition of pure and invented creative ways of making more things pure, which is all great.
>
> Can anyone explain to me what good are the guarantees provided by a function that is pure but not strictly pure? I couldn't find them.

Honestly, I think at this point pure is easier to understand if you think of it as @noglobal and don't think about functional purity at all. What pure does is make it so that the function cannot access global, mutable state except through its arguments. So, all that it's working with (except for constants such as enums or immutable, module-level variables) is what it's given via its arguments. As such, the primary benefit of pure is that you know that no global state is being mucked with unless it was passed to the function as an argument. _That_ is the guarantee that pure provides. Everything else about pure is just built on top of that guarantee and what the compiler can infer from it.
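
A minimal sketch of that "@noglobal" reading. The names counter, limit, and capAtLimit are hypothetical and chosen only for illustration; the static asserts record what the compiler is expected to accept and reject.

```d
int counter;               // mutable, module-level state
immutable int limit = 10;  // immutable module-level data

// Reading immutable module-level data is allowed in a pure function.
int capAtLimit(int x) pure
{
    return x > limit ? limit : x;
}

// A pure function literal may read the immutable variable...
static assert(__traits(compiles, () pure { return limit; }));

// ...but reading or writing the mutable global is rejected.
static assert(!__traits(compiles, () pure { return counter; }));
static assert(!__traits(compiles, () pure { counter = 1; }));
```

Nothing here depends on the parameter types, which is why the guarantee holds for every pure function, whether weakly or strongly pure.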

If the function is "strongly" pure (i.e. its parameters are immutable or implicitly convertible to immutable), then the compiler knows that if the function is called multiple times with the exact same arguments, the result will be the same each time (though it does have to take into account the fact that each call could return a newly allocated object - they'd just be equivalent objects every time). So, when you're dealing with a strongly pure function, you're then dealing with actual, functional purity, and the compiler can choose to optimize calls such as foo(5) * foo(5) so that foo(5) is only called once.

Weakly pure functions (i.e. pure functions that aren't strongly pure) don't have those same optimization benefits, because calling them could result in the arguments being mutated, but because weakly pure functions don't access global, mutable state, the compiler can safely call them from a strongly pure function without violating the guarantee that multiple calls with the same arguments to a strongly pure function are supposed to give the same result.
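
As a rough illustration of how the two kinds fit together, the sketch below uses hypothetical names (appendSquare, sumOfSquares). The helper mutates its argument, so it is only weakly pure, yet a strongly pure function can still be built on top of it.

```d
// Weakly pure: mutates its argument but touches no global state.
void appendSquare(ref int[] acc, int x) pure
{
    acc ~= x * x;
}

// Strongly pure: the only parameter is a plain int, so equal
// arguments always produce equal results.
int sumOfSquares(int n) pure
{
    int[] squares;
    foreach (i; 1 .. n + 1)
        appendSquare(squares, i); // calling a weakly pure helper is allowed
    int total = 0;
    foreach (s; squares)
        total += s;
    return total;
}

// 1 + 4 + 9 == 14, checked at compile time.
static assert(sumOfSquares(3) == 14);
```

The mutation stays local to sumOfSquares, so callers cannot observe it and the strongly pure guarantee is preserved.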

Most pure functions are weakly pure (e.g. a pure member function is weakly pure unless it is also marked immutable), so the optimization benefits are pretty minimal in most cases. And even if a function is strongly pure, the optimization is only done within an expression (or maybe statement - I can never remember which), because going beyond that would require data flow analysis, which the compiler rarely does. As such, pretty much the only time you end up with function calls being elided thanks to pure is when you have a strongly pure function where you do something like foo(5) * foo(5). So, the optimization benefits of pure are pretty minimal. It's that guarantee about not touching global, mutable state which is the main benefit. As such, if we were adding pure now, I would strongly argue for calling it something like @noglobal. I think that it would reduce the confusion considerably. As it stands, while it helps make functional purity possible in some cases, it ultimately doesn't have much to do with functional purity in spite of its name.

> > And perhaps infer uniqueness in some cases for implicit casting to immutable.
>
> Can you expand on that one?

int[] foo()
{
    return [1, 2, 3, 4, 5];
}

void main()
{
    immutable arr = foo();
}

does not compile, because the return value is mutable, and you can't implicitly convert int[] to immutable int[]. However,

int[] foo() pure
{
    return [1, 2, 3, 4, 5];
}

void main()
{
    immutable arr = foo();
}

compiles just fine. Because foo is pure, and the compiler knows that the return value could not possibly have come from the function's arguments, it knows that the return value is unique and that it won't violate the type system to implicitly cast it to immutable. As such, you can write a function as complicated as you want to create the return value, and so long as the function is pure, and the compiler can determine that the return value did not come via an argument, you can implicitly convert the return value to immutable rather than having to use something like std.exception.assumeUnique or an explicit cast, which relies on the programmer verifying that the object being cast is indeed unique rather than having the compiler guarantee it.

Whether the implicit cast is allowed ultimately depends on the types of the pure function's parameters and the actual arguments, so it's not always obvious whether it will work or not, but in general, it works very well for initializing complex, immutable objects without having to rely on getting casts right. e.g. this still compiles

int[] foo(int[] a) pure
{
    return [1, 2, 3, 4, 5];
}

void main()
{
    immutable arr = foo([1, 2]);
}

because the compiler can see that the argument being passed is unique and that therefore the return value is unique whether it came from the argument or not, whereas this doesn't compile

int[] foo(int[] a) pure
{
    return [1, 2, 3, 4, 5];
}

void main()
{
    int[] a = [1, 2];
    immutable arr = foo(a);
}

because the compiler can't guarantee that the return value isn't a slice of the argument (at least not without looking at the implementation, but all the compiler looks at is the signature).

- Jonathan M Davis

April 02, 2018
On 02/04/18 10:45, Jonathan M Davis wrote:
> Honestly, I think at this point pure is easier to understand if you think of
> it as @noglobal and don't think about functional purity at all.

That's fine. My point was that the only optimizations possible are possible on strictly pure functions (the immutable cast one included). Weakly pure functions add nothing.

But merely having them around means that when I annotate a function with "pure", I do not promise any guarantees that the compiler can actually use to perform optimizations.

Shachar
April 02, 2018
On Monday, 2 April 2018 at 00:44:14 UTC, Jonathan M Davis wrote:
> On Monday, April 02, 2018 00:25:52 Nicholas Wilson via Digitalmars-d wrote:
>> On Sunday, 1 April 2018 at 17:08:37 UTC, Andrei Alexandrescu wrote:
>> > On 4/1/18 10:59 AM, Nicholas Wilson wrote:
>> > [...]
>> > int[] sneaky;
>> > struct A
>> > {
>> >     private int[] innocent;
>> >     this(this)
>> >     {
>> >         sneaky = innocent;
>> >     }
>> > }
>> > void main()
>> > {
>> >     immutable a = A([1, 2, 3]);
>> >     auto b = a;
>> >     sneaky[1] = 42; // oops
>> >     import std.stdio;
>> >     writeln(a.innocent); // ooooops
>> > }
>> >
>> > Sadly this (and many similar ones) compiles and runs warning-free on today's compiler. We really need to close this loop, like, five years ago.
>>
>> How much of this class of bug would be eliminated by requiring that `this(this)` be pure for assignment to const and immutable objects? Arguably this(this) should always be pure in any sane program. The only reason I can think of is if you're trying to perf the number of copies you're making, but there is compiler help for that.
>
> All kinds of things could be done with a postblit constructor that don't actually involve copying. These include logging or printing something, and they could include stuff like reference counting, which may or may not need access to something external. debug statements would solve some use cases but not all. Requiring that this(this) be pure has some of the same issues that requiring opEquals to be pure or const or whatever has. It makes sense in _most_ cases, but occasionally, there are good reasons for it not to be - especially in a systems language.

I wasn't suggesting this as a global requirement for all this(this)s,
only for the _postblit assignment to const and immutable objects_ (where
being pure would disallow the above bug). Note that this should
still be possible to work around with cast()s (which are un@safe and therefore require
@trusted to work in @safe code), in which case the programmer has presumably
thought about the situation and knows what he's doing.

April 02, 2018
On Monday, April 02, 2018 11:04:08 Shachar Shemesh via Digitalmars-d wrote:
> On 02/04/18 10:45, Jonathan M Davis wrote:
> > Honestly, I think at this point pure is easier to understand if you think of it as @noglobal and don't think about functional purity at all.
>
> That's fine. My point was that the only optimizations possible are possible on strictly pure functions (the immutable cast one included). Weakly pure functions add nothing.
>
> But merely having them around means that when I annotate a function with "pure", I do not promise any guarantees that the compiler can actually use to perform optimizations.

It means that a strongly pure function could call the function, which is the entire reason that the definition of pure was widened to simply mean that it guarantees that the function doesn't access global, mutable state instead of also including the requirements about parameters which are placed on strongly pure functions. So, a weakly pure function _can_ help with optimizations in that it helps to implement strongly pure functions, and without weakly pure functions, what you can do with strongly pure functions can be very limited, making weakly pure functions very important even if all you care about is optimizations, but no, a call to a weakly pure function cannot be elided based on the fact that it's weakly pure. Now, pure combined with const could provide some optimizations in rare cases, since the compiler can guarantee that a pure function doesn't mutate a const argument via another reference if no such mutable reference could be accessed through one of the arguments, but I doubt that such optimizations are done at this point, and it wouldn't involve eliding function calls. But in theory, the fact that the compiler knows that a function can't access anything except through its arguments could allow the compiler to optimize some code, even if it doesn't involve eliding function calls.

Ultimately, I think that it's a mistake to think about pure having much to do with optimizations. Much as such optimizations do exist, they're just too limited. The primary advantage is that when you see that a function is pure, you know that the function is just using what it's given via its arguments, just like when you see a variable is immutable, you know that it can't be mutated. Any optimizations that can be gotten via pure are therefore mostly just gravy.

So, if your only motivation in dealing with pure is optimizations, then there's a good chance that it really isn't worth your time, but personally, I think that it's quite valuable simply for guaranteeing that the function doesn't access global, mutable state.

- Jonathan M Davis

April 02, 2018
By sheer coincidence, I've just stumbled upon another limitation of this(this). I am not sure whether it is already documented.

Let's define struct S1 with no copying allowed, and put it as a member of struct S2. Under C++ as well as under D, this automatically makes S2 also non-copyable.
Under C++, however, I can do this:

struct S1 {
    S1() {}

    // Make S1 non-copyable
    S1(const S1 &that) = delete;
    S1 &operator=(const S1 &that) = delete;
};

struct S2 {
    int a;
    S1 s;

    S2(int _a) : a(_a) {
    }

    S2(const S2 &that) : a(that.a) {
    }
};

int main() {
    S2 s(17);
    S2 v(s); // This compiles, invoking S2's copy ctor
}


In other words, I can tell the compiler that I know how to copy S2 without having to copy the member S1.

Under D, this simply doesn't work. If S1 has @disable this(this), any struct that has S1 as a member will be uncopyable, and this is not overridable.
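
For comparison, a minimal D sketch of the same two structs, reusing the names S1 and S2 from the C++ example above. The static asserts describe the behaviour as I understand it rather than quoting any official test case.

```d
struct S1
{
    @disable this(this); // make S1 non-copyable
}

struct S2
{
    int a;
    S1 s; // a non-copyable member
}

// S1 cannot be copied, and neither can anything that contains it.
static assert(!__traits(compiles, { S1 x; S1 y = x; }));
static assert(!__traits(compiles, { S2 x; S2 y = x; }));

// Constructing an S2 is still fine; only copying is rejected.
static assert(__traits(compiles, { auto x = S2(17); }));
```

A postblit runs after the member-wise copy has already happened, so S2 has no place where it could say "copy a but skip s" the way the C++ copy constructor does.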

Shachar
April 02, 2018
On Saturday, 31 March 2018 at 23:38:06 UTC, Andrei Alexandrescu wrote:
> * should work with mutable, const, immutable, and shared

The problem is that postblit is not overloadable, so make it overloadable, and problems with overloading will be solved.

> * immutable and const are very difficult, but we have an attack (assuming copy construction gets taken care of)

Collections must be filled somehow, so they are inherently mutable; immutable collections need a whole different design approach, and that doesn't look specific to postblit.

> * pure is difficult

Purity depends on written code. Running impure code in copy constructor won't make it pure.
April 02, 2018
On 4/2/18 4:04 AM, Shachar Shemesh wrote:
> On 02/04/18 10:45, Jonathan M Davis wrote:
>> Honestly, I think at this point pure is easier to understand if you think of
>> it as @noglobal and don't think about functional purity at all.
> 
> That's fine. My point was that the only optimizations possible are possible on strictly pure functions (the immutable cast one included). Weakly pure functions add nothing.
> 
> But merely having them around means that when I annotate a function with "pure", I do not promise any guarantees that the compiler can actually use to perform optimizations.
> 
> Shachar

This is a good article motivating the relaxed purity model we have: http://klickverbot.at/blog/2012/05/purity-in-d/
April 02, 2018
On 02.04.2018 08:56, Shachar Shemesh wrote:
> On 01/04/18 03:32, H. S. Teoh wrote:
>> The one nagging question I've been having about pure is: how much are we
>> actually taking advantage of the guarantees provided by pure?
> 
> My problem is that pure does not provide many guarantees.
> ...

It guarantees that no global state is accessed.

>>  We have
>> developed very clever ways of extending the traditional definition of
>> pure and invented creative ways of making more things pure, which is all
>> great.
> 
> Can anyone explain to me what good are the guarantees provided by a function that is pure but not strictly pure? I couldn't find them.
> ...

You can use weakly pure functions to compose strongly pure functions.

>>  But AFAIK the only place where it's actually taken advantage of
>> is to elide some redundant function calls inside a single expression.
> 
> You cannot even do that unless the function is strictly pure. For all D's extension of the pure concept, it weakened, rather than enhanced, what it means.
> ...

There is no such weakening.

>> And perhaps infer uniqueness in some cases for implicit casting to
>> immutable.
> 
> Can you expand on that one?
> 
> Shachar