April 09, 2015
On 2015-04-08 23:10:37 +0000, Walter Bright <newshound2@digitalmars.com> said:

> http://wiki.dlang.org/DIP77

In the definition of a Reference Counted Object:

"""
An object is assumed to be reference counted if it has a postblit and a destructor, and does not have an opAssign marked @system.
"""

Why should it not have an opAssign marked @system?

And what happens if the struct has a postblit but it is @disabled? Will the compiler forbid you from passing it by ref in cases where it'd need to make a copy, or will it just not be a RCO?

More generally, is it right to add implicit copying just because a struct has a postblit and a destructor? If someone implemented a by-value container in D (such as those found in C++), this behaviour of the compiler would trash the performance by silently doing useless unnecessary copies. You won't even get memory-safety as a benefit: if the container allocates from the GC it's safe anyway, otherwise you're referencing deallocated memory with your ref parameter (copying the struct would just make a copy elsewhere, not retain the memory of the original).

I think you're assuming too much from the presence of a postblit and a destructor. This implicit copy behaviour should not be triggered by seemingly unrelated clues. Instead of doing that:

	auto tmp = rc;

the compiler should insert this:

	auto tmp = rc.opPin();

RCArray can implement opPin by returning a copy of itself. A by-value container can implement opPin by returning a dummy struct that retains the container's memory until the dummy struct's destructor is called. Alternatively someone could make a dummy "void opPin() @system {}" to signal it isn't safe to pass internal references around (only in system code would the implicit call to opPin compile). If you were writing a layout-compatible D version of std::vector, you'd likely have to use a @system opPin because there's no way you can "pin" that memory and guarantee memory-safety when passing references around.
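
A minimal sketch of the RCArray case, assuming a hand-rolled reference-counted struct. The layout, the `refs` helper, and the explicit call to `opPin` are illustrative assumptions; `opPin` is only a proposed hook, not an existing D feature, so the call the compiler would insert is written out by hand here.

```d
import core.stdc.stdlib : malloc, free;

// Hypothetical ref-counted array; names and layout are for illustration only.
struct RCArray
{
    private int* data;
    private size_t* count;

    this(size_t n)
    {
        data = cast(int*) malloc(n * int.sizeof);
        count = cast(size_t*) malloc(size_t.sizeof);
        *count = 1;
    }

    // Postblit: a copy shares the payload and bumps the count.
    this(this) { if (count) ++*count; }

    // Destructor: the last owner frees the payload.
    ~this()
    {
        if (count && --*count == 0) { free(data); free(count); }
    }

    // Proposed hook (not part of D): for RCArray, pinning is just a copy.
    RCArray opPin() { return this; }

    // Demo helper: current number of owners.
    size_t refs() const { return count ? *count : 0; }
}

void main()
{
    auto rc = RCArray(4);
    assert(rc.refs == 1);
    {
        auto tmp = rc.opPin();   // what the compiler would insert under the proposal
        assert(rc.refs == 2);    // the payload is retained while tmp lives
    }
    assert(rc.refs == 1);        // tmp's destructor released the pin
}
```

A by-value container would return a different, lighter pin type from the same hook instead of a full copy of itself.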

-- 
Michel Fortin
michel.fortin@michelf.ca
http://michelf.ca

April 09, 2015
On Thursday, 9 April 2015 at 09:05:10 UTC, Ola Fosheim Grøstad wrote:
> 2. How will this work with "yield"?
>

You yield both caller and callee, so you'll get caller's boxing in the yield.

> Why not just implement the more generic solution (shared pointers with move/borrow or WPO) ?

I don't think this is possible.
April 09, 2015
On 4/9/2015 5:05 AM, Michel Fortin wrote:
> Why should it not have an opAssign marked @system?

"Andrei's idea was to not do the copy for @system opAssign's, thus providing C++ equivalence for those folks that need it and don't care about guaranteed memory safety."


> And what happens if the struct has a postblit but it is @disabled? Will the
> compiler forbid you from passing it by ref in cases where it'd need to make a
> copy, or will it just not be a RCO?

It wouldn't be an RCO.


> More generally, is it right to add implicit copying just because a struct has a
> postblit and a destructor? If someone implemented a by-value container in D
> (such as those found in C++), this behaviour of the compiler would trash the
> performance by silently doing useless unnecessary copies. You won't even get
> memory-safety as a benefit: if the container allocates from the GC it's safe
> anyway, otherwise you're referencing deallocated memory with your ref parameter
> (copying the struct would just make a copy elsewhere, not retain the memory of
> the original).

The only real purpose to a postblit is to support ref counting. Why would a by-value container use a postblit and not ref count?


> I think you're assuming too much from the presence of a postblit and a
> destructor. This implicit copy behaviour should not be triggered by seemingly
> unrelated clues. Instead of doing that:
>
>      auto tmp = rc;
>
> the compiler should insert this:
>
>      auto tmp = rc.opPin();
>
> RCArray can implement opPin by returning a copy of itself. A by-value container
> can implement opPin by returning a dummy struct that retains the container's
> memory until the dummy struct's destructor is called. Alternatively someone
> could make a dummy "void opPin() @system {}" to signal it isn't safe to pass
> internal references around (only in system code would the implicit call to opPin
> compile). If you were writing a layout-compatible D version of std::vector,
> you'd likely have to use a @system opPin because there's no way you can "pin"
> that memory and guarantee memory-safety when passing references around.

My first impression is that's too complicated for the user to get right.

April 09, 2015
On Thursday, 9 April 2015 at 18:44:10 UTC, Walter Bright wrote:
> The only real purpose to a postblit is to support ref counting. Why would a by-value container use a postblit and not ref count?

A struct could have a postblit defined if you are implementing something like std::vector, where you copy the memory when the struct is copied. I'm not sure why you would want to do such a thing in D, though. If allocating memory is your concern, you probably don't want any allocation, including malloc.
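
For illustration, a sketch of such a by-value container, assuming malloc-backed storage. `ValueVector` and its members are hypothetical names. It has a postblit and a destructor but no reference count, which is the kind of type the DIP's definition would treat as reference counted.

```d
import core.stdc.stdlib : malloc, free;
import core.stdc.string : memcpy;

// Hypothetical by-value container in the style of std::vector.
struct ValueVector
{
    private int* ptr;
    private size_t len;

    this(size_t n)
    {
        ptr = cast(int*) malloc(n * int.sizeof);
        len = n;
        foreach (i; 0 .. n) ptr[i] = 0;
    }

    // Postblit: every copy gets its own buffer (deep copy, no ref count).
    this(this)
    {
        if (ptr is null) return;
        auto fresh = cast(int*) malloc(len * int.sizeof);
        memcpy(fresh, ptr, len * int.sizeof);
        ptr = fresh;
    }

    // Destructor: each instance frees only its own buffer.
    ~this() { free(ptr); }

    ref int opIndex(size_t i) { assert(i < len); return ptr[i]; }
}

void main()
{
    auto a = ValueVector(3);
    a[0] = 1;
    auto b = a;            // postblit runs: b owns a separate buffer
    b[0] = 42;
    assert(a[0] == 1);     // a is unaffected by writes through b
    assert(b[0] == 42);
}
```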
April 09, 2015
On 4/9/2015 11:53 AM, w0rp wrote:
> On Thursday, 9 April 2015 at 18:44:10 UTC, Walter Bright wrote:
>> The only real purpose to a postblit is to support ref counting. Why would a
>> by-value container use a postblit and not ref count?
>
> A struct could have a postblit defined if you are implementing something like
> std::vector, where you copy the memory when the struct is copied. I'm not
> sure why you would want to do such a thing in D, though.

I'm not sure why you'd do that, either. Just make the memory part ref counted, then when modifying it, make the copy at that point if your ref count > 1.

If you want to interface with std::vector, make opAssign @system, after all, you're dealing with C++ :-)
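
A rough sketch of that copy-on-write approach, assuming a GC-allocated payload for brevity. `CowArray`, `Payload`, and `refs` are hypothetical names, and the count here only tracks sharing (the GC reclaims the memory), so this is not a complete ownership scheme.

```d
// Hypothetical copy-on-write array; intended for stack instances only,
// since the destructor touches GC-managed memory.
struct CowArray
{
    private static struct Payload { size_t refs; int[] items; }
    private Payload* p;

    this(int[] source) { p = new Payload(1, source.dup); }

    // Copying is cheap: share the payload and bump the count.
    this(this) { if (p) ++p.refs; }
    ~this() { if (p) --p.refs; }

    int opIndex(size_t i) const { return p.items[i]; }

    // Writes detach first when the payload is shared.
    void opIndexAssign(int value, size_t i)
    {
        if (p.refs > 1)
        {
            --p.refs;                          // leave the shared payload
            p = new Payload(1, p.items.dup);   // take a private copy
        }
        p.items[i] = value;
    }

    size_t refs() const { return p ? p.refs : 0; }
}

void main()
{
    auto a = CowArray([1, 2, 3]);
    auto b = a;                    // shares storage with a
    assert(a.refs == 2);
    b[0] = 42;                     // count > 1, so b copies before writing
    assert(a[0] == 1 && b[0] == 42);
    assert(a.refs == 1 && b.refs == 1);
}
```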

April 09, 2015
On Thursday, 9 April 2015 at 18:31:24 UTC, deadalnix wrote:
> On Thursday, 9 April 2015 at 09:05:10 UTC, Ola Fosheim Grøstad wrote:
>> 2. How will this work with "yield"?
>>
>
> You yield both caller and callee, so you'll get caller's boxing in the yield.

But the coroutine stack and everything on it will be intact when it yields, including references to array elements...?

>> Why not just implement the more generic solution (shared pointers with move/borrow or WPO) ?
>
> I don't think this is possible.

It should be possible with pointer analysis, but the easier approach is just to ban non-const ref parameters (C++ style) for rc-pointer-objects, so maybe D should provide "head const" after all...

April 09, 2015
On Thursday, 9 April 2015 at 01:35:54 UTC, Walter Bright wrote:
> The same as that of a tmp being returned from a function - to the end of the expression.

But how will this work with anything that makes RCArrays reachable through indirections...? Is the compiler going to do a recursive scan and create temporaries of all RC-pointers that are reachable?

(e.g. RCArrays of RCArrays)


April 09, 2015
On Thursday, 9 April 2015 at 18:44:10 UTC, Walter Bright wrote:
> On 4/9/2015 5:05 AM, Michel Fortin wrote:
>> Why should it not have an opAssign marked @system?
>
> "Andrei's idea was to not do the copy for @system opAssign's, thus providing C++ equivalence for those folks that need it and don't care about guaranteed memory safety."
>

Why not bind this behavior to extern(C++) ?

> The only real purpose to a postblit is to support ref counting. Why would a by-value container use a postblit and not ref count?
>

In C++, the only real purpose of templates was generic containers, and that didn't end well.

It is dangerous, at the language level, to reason from usage instead of first principles (not that usage does not matter; usage should serve as a guideline for the principles), or things like C++ templates happen.

> My first impression is that's too complicated for the user to get right.

Yeah, I don't think opPin is the right way to go. It is always easy and tempting to add new stuff to make X or Y work, but in the end, it only creates an explosion of language complexity.

There are some inefficiencies involved here, but I trust the compiler to be able to optimize them away for the most part, and we have a backdoor for those who want to bypass safety.

More generally, this is why I oppose the return attribute + adding op on reference type in favor of a principled scope proposal. The first one is 2 language features for a less general end result.
April 09, 2015
On 4/9/2015 12:58 PM, "Ola Fosheim Grøstad" <ola.fosheim.grostad+dlang@gmail.com> wrote:
> On Thursday, 9 April 2015 at 01:35:54 UTC, Walter Bright wrote:
>> The same as that of a tmp being returned from a function - to the end of the
>> expression.
>
> But how will this work with anything that makes RCArrays reachable through
> indirections...? Is the compiler going to do a recursive scan and create
> temporaries of all RC-pointers that are reachable?
>
> (e.g. RCArrays of RCArrays)


It examines the types of all mutable values available to the function.

Note the discussion of the effect of purity in the DIP.
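
A sketch of the case the inserted copy is meant to cover, as I read the DIP: a function that can reach both the array (mutably) and a reference to one of its elements. `RCArray`, `fun`, and the `livePayloads` counter are hypothetical names for this demo, and the `auto tmp = arr;` line is written by hand where the compiler would insert it.

```d
import core.stdc.stdlib : malloc, free;

int livePayloads;   // demo-only counter of payloads not yet freed

// Hypothetical ref-counted array, as small as possible.
struct RCArray
{
    private int* data;
    private size_t* count;

    this(size_t n)
    {
        data = cast(int*) malloc(n * int.sizeof);
        count = cast(size_t*) malloc(size_t.sizeof);
        *count = 1;
        ++livePayloads;
    }
    this(this) { if (count) ++*count; }
    ~this()
    {
        if (count && --*count == 0) { free(data); free(count); --livePayloads; }
    }
    ref int opIndex(size_t i) { return data[i]; }
}

// Has mutable access to the array and to one of its elements.
void fun(ref RCArray a, ref int elem)
{
    a = RCArray(2);   // releases a's reference to the old payload...
    elem = 7;         // ...which elem still points into
}

void main()
{
    auto arr = RCArray(4);
    {
        auto tmp = arr;             // the copy that pins the old payload
        fun(arr, arr[0]);
        assert(livePayloads == 2);  // old payload survived the reassignment
        assert(tmp[0] == 7);        // the write through elem landed in live memory
    }
    assert(livePayloads == 1);      // pin released, old payload freed
}
```

Without `tmp`, the assignment inside `fun` would free the old payload and the write through `elem` would touch deallocated memory.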
April 09, 2015
On 4/9/2015 1:21 PM, deadalnix wrote:
> On Thursday, 9 April 2015 at 18:44:10 UTC, Walter Bright wrote:
>> On 4/9/2015 5:05 AM, Michel Fortin wrote:
>>> Why should it not have an opAssign marked @system?
>>
>> "Andrei's idea was to not do the copy for @system opAssign's, thus providing
>> C++ equivalence for those folks that need it and don't care about guaranteed
>> memory safety."
>>
>
> Why not bind this behavior to extern(C++) ?

Because the charter of @system is "the user supplies the memory safety", which fits here perfectly. extern(C++) carries a lot of other stuff with it.


> It is dangerous, at the language level, to reason from usage instead of first
> principles (not that usage does not matter; usage should serve as a guideline
> for the principles), or things like C++ templates happen.

On the other hand, throwing things in the language just because you can, with no idea what they are good for or how they will be used, doesn't end well.


>> My first impression is that's too complicated for the user to get right.
> Yeah, I don't think opPin is the right way to go. It is always easy and
> tempting to add new stuff to make X or Y work, but in the end, it only creates
> an explosion of language complexity.

Glad you agree. I'm afraid of making something that is technically correct, but unusable.


> There are some inefficiencies involved here, but I trust the compiler to be able
> to optimize them away for the most part, and we have a backdoor for those who
> want to bypass safety.

Yes. Interestingly, this is a superset of what Rust does. Rust has the idea of only one mutable reference at a time, for efficiency. As the DIP points out, if the function only has const references available other than the one mutable one, it doesn't do the copy, either.

I was curious how Rust handled the global variable issue. Turns out it doesn't - mutable global variables are marked as "unsafe" and the checker gives up on it.


> More generally, this is why I oppose the return attribute + adding op on
> reference type in favor of a principled scope proposal. The first one is 2
> language features for a less general end result.

I understand where you're coming from on this. The design was picked for being as simple as possible with the least disruption.