May 02, 2009
Robert Jacques wrote:
> On Sat, 02 May 2009 03:39:27 -0400, Rainer Deyke <rainerd@eldwood.com> wrote:
>> You can have non-copyable value types, but they can't be containers.
> 
> No they don't. Iteration can often be destructive. If I iterate over a stack or a queue, I'm left with an empty stack/queue. For value semantics to work non-destructive copying is required.

OK.  I grant that there are non-copyable types that can reasonably be
called containers.  Simple solution: make them non-copyable structs.
You still get most of the benefits of value types:
  - One less layer of indirection.
  - No long distance dependencies.
  - RAII.


-- 
Rainer Deyke - rainerd@eldwood.com
May 02, 2009
On Sat, 02 May 2009 15:17:04 -0400, Rainer Deyke <rainerd@eldwood.com> wrote:

> Robert Jacques wrote:
>> On Sat, 02 May 2009 03:39:27 -0400, Rainer Deyke <rainerd@eldwood.com>
>> wrote:
>>> You can have non-copyable value types, but they can't be containers.
>>
>> No they don't. Iteration can often be destructive. If I iterate over a
>> stack or a queue, I'm left with an empty stack/queue. For value
>> semantics to work non-destructive copying is required.
>
> OK.  I grant that there are non-copyable types that can reasonably be
> called containers.  Simple solution: make them non-copyable structs.
> You still get most of the benefits of value types:
>   - One less layer of indirection.

Again, D arrays are structs with reference semantics. This isn't a pro/con either way.

>   - No long distance dependencies.

Well, if I can't copy it, then I have to use ref everywhere, which is functionally equivalent to reference semantics. I think you've just proved the counter-point.

>   - RAII.

Can be done with structs or classes. Also see point 1. So, this isn't a pro/con either way.


May 02, 2009
Robert Jacques wrote:
> On Sat, 02 May 2009 10:18:41 -0400, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> wrote:
>> Bill Baxter wrote:
>>> On Fri, May 1, 2009 at 6:25 PM, Andrei Alexandrescu
>>>>  Matrix a, b, c; ... c = a; a += b;
>>>>  Does the last operation mutate c as well?
>>>  I said "assignments like a = b are rare" and you put one of those in your example.
>>
>> You said it and I don't buy it one femtosecond. Code does need to copy
>> values around. It's a fundamental operation!
> 
> Andrei, he said that explicit assignments/copies of arrays/containers are rare in numeric computing, which I agree with. Just because it's a fundamental operation doesn't mean it gets used much in his (or I guess NumPy's actually) specific applications. Also, disregarding/disbelieving a use case is not helpful to this discussion.

He said something. That's about as much proof as was seen. I didn't buy it so I replied without assuming the same as him. Then he repeated "but I said that!" which upped the ante from supposition to presupposition. I think presuppositions are particularly pernicious so I felt the need to explicitly say that I don't believe it just because he said it. It's not a use case. It's just something that was said. If some sort of evidence is given, that would be great. Don't put the onus on me to disprove what was said.

>>> Yes, when you have an a=b anywhere you've got to pay attention and
>>> make sure you didn't mean a=b.dup.
>>
>> That is terrible. Anywhere == a portion of the code far, far away.
> 
> No, he was talking about a local decision. i.e. Did I or didn't I mean to make a logical copy here?

The local decision has effects that may go undetected until much later.
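
To make the `c = a; a += b;` question concrete, here is a minimal sketch in D. `RefMatrix` and `ValMatrix` are hypothetical types invented for this illustration (they are not from the thread or from any library); the first is a class and so has reference semantics, the second is a struct whose postblit copies its storage.

```d
// Hypothetical matrix types, for illustration only.
class RefMatrix
{
    double[] data;
    this(double[] d) { data = d.dup; }

    // a += b, element by element
    void opOpAssign(string op)(RefMatrix rhs) if (op == "+")
    {
        data[] += rhs.data[];
    }
}

struct ValMatrix
{
    double[] data;
    this(double[] d) { data = d.dup; }
    this(this) { data = data.dup; }   // postblit: a copy gets its own storage

    void opOpAssign(string op)(ValMatrix rhs) if (op == "+")
    {
        data[] += rhs.data[];
    }
}

// Returns the contents of c after "c = a; a += b;" with reference semantics.
double[] aliasedAfterAdd()
{
    auto a = new RefMatrix([1.0, 2.0]);
    auto b = new RefMatrix([10.0, 20.0]);
    auto c = a;        // c and a now name the same object
    a += b;
    return c.data;     // c was mutated along with a
}

// Returns the contents of c after "c = a; a += b;" with value semantics.
double[] copiedAfterAdd()
{
    auto a = ValMatrix([1.0, 2.0]);
    auto b = ValMatrix([10.0, 20.0]);
    auto c = a;        // c is an independent copy
    a += b;
    return c.data;     // c is unchanged
}

void main()
{
    assert(aliasedAfterAdd() == [11.0, 22.0]);
    assert(copiedAfterAdd() == [1.0, 2.0]);
}
```

In the reference version the surprise only shows up wherever `c` is read next, which may be far from the assignment.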


Andrei
May 02, 2009
Robert Jacques wrote:
> Also, in a value semantics world, refs are third class citizens, but in a reference semantic world, value semantics get their own assignment operator ( []= ), and by convention, their own property ( .dup )

The major problem is not assignment. That can be taken care of. The problem is:

1. Passing an object into a function

2. Making the object a member of another object

3. Yes, assigning to the object (which ought to be congruent with 1 and 2).

I have perused some more searches and documentation, and things don't bode well for references. Consider the NumPy library. I have searched

pynum pass by reference

and found some interesting links. The first reveals differences between numpy and matlab (notably reference semantics). The second is a discussion entitled "beginner confused with numpy". Guess what the confusion was about? Congratulations, you have won a 40'' LCD TV. Reference semantics!

Third hit: https://www-old.cae.wisc.edu/pipermail/octave-maintainers/2009-March/011509.html

says:

>> Octave's pass-by-value mechanism (with lazy copy-on-write) is
>> something that is *far* more simple to grasp than NumPy's inherited
>> everything-is-a-reference semantics. I do regard myself as moderately
>> experienced Python programmer, yet every now and then I get shot in
>> the foot by the reference semantics in Python.

I swear I didn't pay that guy.

Also, getting back to the Perl Data Language (http://tinyurl.com/derlrh), I see that the mention of references is clearly a WARNING, not an INTERESTING AND DESIRABLE FEATURE.

"It is important to keep the ``reference nature'' of piddles in mind when passing piddles into subroutines. If you modify the input pdls you modify the original argument,  not a copy of it. This is different from some other array processing languages but makes for very efficient passing of piddles between subroutines."

So what I'm seeing is that reference semantics is not desirable and not natural but was chosen for efficiency reasons. But you know, I don't want to "keep in mind" extra stuff when coding, I have enough to worry about. If I can get away with ref and/or refcounting, then I have taken care of the efficiency issue and I don't need to keep in mind a weird quirk.
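
As a rough illustration of how reference counting can give value semantics without eager copies (the "lazy copy-on-write" idea from the Octave quote), here is a minimal sketch in D. `CowArray` and all of its members are hypothetical names made up for this example; it is single-threaded, assumes instances live on the stack, and is not how any existing library does it.

```d
// Hypothetical copy-on-write wrapper, for illustration only.
struct CowArray
{
    private int[] data;
    private int* owners;   // shared count of structs using `data`

    this(int[] values)
    {
        data = values.dup;
        owners = new int;
        *owners = 1;
    }

    this(this) { if (owners) ++*owners; }   // copying only bumps the count
    ~this()    { if (owners) --*owners; }

    int opIndex(size_t i) const { return data[i]; }

    void opIndexAssign(int value, size_t i)
    {
        if (owners && *owners > 1)
        {
            --*owners;         // detach from the shared storage
            data = data.dup;   // the real copy happens only on first write
            owners = new int;
            *owners = 1;
        }
        data[i] = value;
    }

    // True while both structs still point at the same storage.
    bool sharesWith(ref const CowArray other) const
    {
        return data.ptr is other.data.ptr;
    }
}

// Copies an array, writes to the copy, and returns the original's first element.
int firstAfterCopyWrite()
{
    auto a = CowArray([1, 2, 3]);
    auto b = a;                 // cheap: no element is copied yet
    assert(a.sharesWith(b));
    b[0] = 9;                   // triggers the deferred copy
    assert(!a.sharesWith(b));
    assert(b[0] == 9);
    return a[0];
}

void main()
{
    assert(firstAfterCopyWrite() == 1);
}
```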

============

About undue copying of data: maybe that could be avoided by having functions manipulate ranges (which are cheap to copy), not the containers that use them. In C++ you need to pass the container more often than not mostly because passing two iterators is amazingly burdensome. In D we can pass ranges easily, so I think much D code that wants to do stuff will just take ranges.
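
A small sketch of what "functions take ranges, not containers" could look like in D. The function name `total` is a hypothetical helper for this example; `isInputRange`, `iota`, and the `empty`/`front`/`popFront` primitives for arrays come from `std.range`.

```d
import std.range;

// Hypothetical helper: sums any input range of ints. It receives a cheap
// copy of the range, never the container that owns the elements.
int total(R)(R r) if (isInputRange!R)
{
    int sum = 0;
    for (; !r.empty; r.popFront())
        sum += r.front;
    return sum;
}

void main()
{
    int[] storage = [1, 2, 3, 4];

    // Pass a view of part of the storage; no elements are copied.
    assert(total(storage[1 .. 3]) == 5);

    // popFront inside total advanced only its local copy of the slice.
    assert(storage.length == 4);

    // The same function works on a range with no container behind it.
    assert(total(iota(1, 4)) == 6);
}
```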


Andrei
May 02, 2009
On Sat, 02 May 2009 17:59:53 -0400, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> wrote:
> Robert Jacques wrote:
>> Also, in a value semantics world, refs are third class citizens, but in a reference semantic world, value semantics get their own assignment operator ( []= ), and by convention, their own property ( .dup )
>
> The major problem is not assignment. That can be taken care of. The problem is:
>
> 1. Passing an object into a function
>
> 2. Making the object a member of another object
>
> 3. Yes, assigning to the object (which ought to be congruent with 1 and 2).
>
> I have perused some more searches and documentation, and things don't bode well for references. Consider the NumPy library. I have searched
>
> pynum pass by reference
>
> and found some interesting links. The first reveals differences between numpy and matlab (notably reference semantics). The second is a discussion entitled "beginner confused with numpy". Guess what the confusion was about? Congratulations, you have won a 40'' LCD TV. Reference semantics!
>
> Third hit: https://www-old.cae.wisc.edu/pipermail/octave-maintainers/2009-March/011509.html
>
> says:
>
>  >> Octave's pass-by-value mechanism (with lazy copy-on-write) is
>  >> something that is *far* more simple to grasp than NumPy's inherited
>  >> everything-is-a-reference semantics. I do regard myself as moderately
>  >> experienced Python programmer, yet every now and then I get shot in
>  >> the foot by the reference semantics in Python.
>
> I swear I didn't pay that guy.
>
> Also, getting back to the Perl Data Language (http://tinyurl.com/derlrh), I see that the mention of references is clearly a WARNING, not an INTERESTING AND DESIRABLE FEATURE.
>
> "It is important to keep the ``reference nature'' of piddles in mind when passing piddles into subroutines. If you modify the input pdls you modify the original argument,  not a copy of it. This is different from some other array processing languages but makes for very efficient passing of piddles between subroutines."
>
> So what I'm seeing is that reference semantics is not desirable and not natural but was chosen for efficiency reasons. But you know, I don't want to "keep in mind" extra stuff when coding, I have enough to worry about. If I can get away with ref and/or refcounting, then I have taken care of the efficiency issue and I don't need to keep in mind a weird quirk.
>

I've said it before and I'll say it again. In high-level interpreted languages with interactive consoles, etc., you're generally doing prototyping work, and value semantics are better for that. But D is a systems language. That doesn't make the arguments any less valid, but you might want to down-weight them a bit.

> ============
>
> About undue copying of data: maybe that could be avoided by having functions manipulate ranges (which are cheap to copy), not the containers that use them. In C++ you need to pass the container more often than not mostly because passing two iterators is amazingly burdensome. In D we can pass ranges easily, so I think much D code that wants to do stuff will just take ranges.

And ranges _are_ reference semantics. If as you say most of D's code would use ranges, doesn't that prove the counter point?

Although, I'm starting to see an interesting story: Here are containers. They have value semantics and are simple to use/prototype with. When you're done, you can move to ranges in the performance critical sections in order to boost performance. And some things, like high performance lock-free queues and stacks, might only exist as ranges.
May 02, 2009
Robert Jacques wrote:
> Again, D arrays are structs with reference semantics. This isn't a pro/con either way.

The D1 dynamic array type does not have reference semantics, nor does it have value semantics.

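void f(int[] a) {
  a.length = 1;  // resizes only f's local copy of the (pointer, length) pair
}

void main() {
  int[] a;       // empty array; "auto a = [];" would not convert to int[]
  f(a);
  assert(a.length == 0);  // the caller's length is unchanged
}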
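
The "neither reference nor value" behaviour can be seen from both sides in a slightly longer sketch. The function names `setFirst`, `grow`, and `growRef` are hypothetical, chosen only for this example.

```d
// Writes through the pointer that the caller's slice shares.
void setFirst(int[] a) { a[0] = 42; }

// Changes only the callee's copy of the (pointer, length) pair.
void grow(int[] a) { a.length = a.length + 1; }

// With ref, the caller's own (pointer, length) pair is updated.
void growRef(ref int[] a) { a.length = a.length + 1; }

void main()
{
    int[] a = [1, 2, 3];

    setFirst(a);
    assert(a[0] == 42);      // element writes are visible: reference-like

    grow(a);
    assert(a.length == 3);   // length changes are not: value-like

    growRef(a);
    assert(a.length == 4);   // explicit ref gives full reference behaviour
    assert(a[0] == 42);
}
```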

>>   - No long distance dependencies.
> 
> Well, if I can't copy it, then I have to use ref everywhere, which is functionally equivalent to reference semantics. I think you've just proved the counter-point.

Given a value type 'T', you have the guarantee that no two variables of type 'T' can alias each other.  This guarantee is preserved when the type 'T' is non-copyable.

An argument of type 'ref T' can obviously alias a variable of type 'T'.
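
A minimal sketch of that distinction, assuming a current D compiler where `@disable this(this)` is available (this syntax postdates the thread). `Unique` and `bump` are hypothetical names for illustration.

```d
// Hypothetical non-copyable value type.
struct Unique
{
    int value;
    @disable this(this);   // copying a Unique is a compile-time error
}

// A ref parameter aliases whichever variable the caller passes in.
void bump(ref Unique u) { ++u.value; }

void main()
{
    // Copy construction from an existing variable does not compile.
    static assert(!__traits(compiles, { Unique x; Unique y = x; }));

    Unique a = Unique(1);
    Unique b = Unique(1);

    bump(a);                 // inside bump, u aliases a
    assert(a.value == 2);
    assert(b.value == 1);    // no other variable can share a's state
}
```

The aliasing introduced by `ref` lasts only for the call; two `Unique` variables, including member variables of other objects, never share state.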

>>   - RAII.
> 
> Can be done with structs or classes. Also see point 1. So, this isn't a pro/con either way.

The D1 dynamic array type does not support RAII.


-- 
Rainer Deyke - rainerd@eldwood.com
May 02, 2009
On Sat, 02 May 2009 17:45:16 -0400, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> wrote:

> Robert Jacques wrote:
>> On Sat, 02 May 2009 10:18:41 -0400, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> wrote:
>>> Bill Baxter wrote:
>>>> On Fri, May 1, 2009 at 6:25 PM, Andrei Alexandrescu
>>>>>  Matrix a, b, c; ... c = a; a += b;
>>>>>  Does the last operation mutate c as well?
>>>>  I said "assignments like a = b are rare" and you put one of those in your example.
>>>
>>> You said it and I don't buy it one femtosecond. Code does need to copy
>>> values around. It's a fundamental operation!
>>  Andrei, he said that explicit assignments/copies of arrays/containers are rare in numeric computing, which I agree with. Just because it's a fundamental operation doesn't mean it gets used much in his (or I guess NumPy's actually) specific applications. Also, disregarding/disbelieving a use case is not helpful to this discussion.
>
> He said something. That's about as much proof as was seen. I didn't buy it so I replied without assuming the same as him. Then he repeated "but I said that!" which upped the ante from supposition to presupposition. I think presuppositions are particularly pernicious so I felt the need to explicitly say that I don't believe it just because he said it. It's not a use case. It's just something that was said. If some sort of evidence is given, that would be great. Don't put the onus on me to disprove what was said.
>

I don't expect you to disprove what was said. I would hope that when you see someone over-generalizing their own or someone else's experience, you take it for what it was: a use case. That being said, I think there was some miscommunication.
Bill seems to be talking about no-op assignments

a = c;

while what you've (probably correctly) hit on is:

a = foo(b);

which results in an assignment and happens everywhere.
May 02, 2009
Robert Jacques wrote:
> Although, I'm starting to see an interesting story: Here are containers. They have value semantics and are simple to use/prototype with. When you're done, you can move to ranges in the performance critical sections in order to boost performance. And some things, like high performance lock-free queues and stacks, might only exist as ranges.

I'm off the phone with Walter. He made a golden point: matrices are not general containers! They are mathematical entities and probably are indeed best off with value semantics (if efficiency issues can be taken care of).

But matrix semantics are not necessarily generalizable to generic container semantics. I think that's a good insight. So probably worrying about matrices when discussing containers is a red herring.


Andrei
May 03, 2009
On Sat, 02 May 2009 19:11:11 -0400, Rainer Deyke <rainerd@eldwood.com> wrote:

> Robert Jacques wrote:
>> Again, D arrays are structs with reference semantics. This isn't a
>> pro/con either way.
>
> The D1 dynamic array type does not have reference semantics, nor does it
> have value semantics.
>
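> void f(int[] a) {
>   a.length = 1;  // resizes only f's local copy of the (pointer, length) pair
> }
>
> void main() {
>   int[] a;       // empty array; "auto a = [];" would not convert to int[]
>   f(a);
>   assert(a.length == 0);  // the caller's length is unchanged
> }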
>
>>>   - No long distance dependencies.
>>
>> Well, if I can't copy it, then I have to use ref everywhere, which is
>> functionally equivalent to reference semantics. I think you've just
>> proved the counter-point.
>
> Given a value type 'T', you have the guarantee that no two variables of
> type 'T' can alias each other.  This guarantee is preserved when the
> type 'T' is non-copyable.
>
> An argument of type 'ref T' can obviously alias a variable of type 'T'.

Okay, if T is not copyable, then I _must_ pass it as ref T, everywhere. Which is reference semantics.

>>>   - RAII.
>>
>> Can be done with structs or classes. Also see point 1. So, this isn't a
>> pro/con either way.
>
> The D1 dynamic array type does not support RAII.

There are two parts to D's arrays. One is a struct 2 words long; the other is a chunk of RAM. The first part is RAII; for the second part it is not possible, since D doesn't allow dynamically sized memory allocation on the stack. And basically all dynamic data structures have to do some heap allocation. I had thought you were talking about the difference between having the managing head be a struct or a class. (See scope classes.)

May 03, 2009
Robert Jacques wrote:
> On Sat, 02 May 2009 19:11:11 -0400, Rainer Deyke <rainerd@eldwood.com> wrote:
>> Given a value type 'T', you have the guarantee that no two variables of type 'T' can alias each other.  This guarantee is preserved when the type 'T' is non-copyable.
>>
>> An argument of type 'ref T' can obviously alias a variable of type 'T'.
> 
> Okay, if T is not copyable, then I _must_ pass it as ref T, everywhere. Which is reference semantics.

When passing arguments, (possibly const) ref is a reasonable default.  I don't care about how arguments are passed.  I care about aliasing between variables, especially member variables.

With reference semantics, two variables of type T can reference each other.  With non-copyable types, they cannot.

>>>>   - RAII.
>>>
>>> Can be done with structs or classes. Also see point 1. So, this isn't a pro/con either way.
>>
>> The D1 dynamic array type does not support RAII.
> 
> There are two parts to D's arrays. One is a struct 2 words long, the other is a chunk of RAM. The first part is RAII, the second part is not possible, since D doesn't allow dynamically sized memory allocation on the stack.

It's meaningless to talk about RAII in the context of the "struct" part of a D1 dynamic array, since it doesn't manage any resources.  If I place a variable of a RAII type in a D1 dynamic array, it is not properly destroyed when the array goes out of scope.  Therefore D1 dynamic arrays do not support RAII.
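
A minimal sketch of the point about elements not being destroyed, written for a current D compiler. `Resource` and the two helper functions are hypothetical names; the garbage collector may still finalize the array's elements at some later time, but nothing happens deterministically when the slice variable leaves scope.

```d
// Hypothetical RAII type that counts how often its destructor runs.
struct Resource
{
    static int released;   // thread-local counter
    ~this() { ++released; }
}

// A stack variable is destroyed at the end of its scope.
int releasedAfterScopedValue()
{
    Resource.released = 0;
    {
        Resource r;
    }                      // r's destructor runs here
    return Resource.released;
}

// A dynamic array's elements are not destroyed when the slice leaves scope.
int releasedAfterScopedArray()
{
    Resource.released = 0;
    {
        auto arr = new Resource[3];
    }                      // only the (pointer, length) pair goes away
    return Resource.released;
}

void main()
{
    assert(releasedAfterScopedValue() == 1);
    assert(releasedAfterScopedArray() == 0);
}
```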

Stack versus heap allocation is an orthogonal issue.


-- 
Rainer Deyke - rainerd@eldwood.com