December 09, 2012
On Sunday, 9 December 2012 at 20:14:18 UTC, Jonathan M Davis wrote:
> Algorithms end up copying all the time


Hmm... like which ones, and copying what kinds of objects?
December 09, 2012
On Sunday, December 09, 2012 21:21:10 Mehrdad wrote:
> On Sunday, 9 December 2012 at 20:14:18 UTC, Jonathan M Davis
> 
> wrote:
> > Algorithms end up copying all the time
> 
> Hmm... like which ones, and copying what kinds of objects?

Just calling front on a range is going to copy the element, and it's not uncommon for front to be called multiple times within a range-based function. Calls to front should probably be minimized anyway, because front could be calculating its value rather than simply returning it, but that's often not accounted for the way it should be. And even when it _is_ accounted for, if front is being called in a loop (as it has to be while iterating), then it's going to be called over and over again. If making a copy is O(n), then each of those calls to front ends up being O(n) instead of O(1) like it's supposed to be, and iterating over a range of m elements becomes O(n * m) instead of O(m). And plenty of algorithms have to do further work on front, which may or may not result in further copies.
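
To make that concrete, here is a hypothetical sketch (the names Expensive and ExpensiveRange are made up for illustration): a struct whose postblit deep-copies a slice, iterated through a range whose front returns by value.

import std.stdio;

struct Expensive
{
    int[] data;
    this(this) { data = data.dup; }  // postblit: O(data.length) on every copy
}

struct ExpensiveRange
{
    Expensive[] items;
    @property bool empty() const { return items.length == 0; }
    @property Expensive front() { return items[0]; }  // returns by value, so the postblit runs
    void popFront() { items = items[1 .. $]; }
}

void main()
{
    auto r = ExpensiveRange([Expensive([1, 2, 3]), Expensive([4, 5, 6])]);
    for (; !r.empty; r.popFront())
    {
        // Two calls to front per iteration means two O(n) copies per element,
        // so walking a range of m elements is O(n * m), not O(m).
        writeln(r.front.data.length, ": ", r.front.data);
    }
}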

The fact that copying can be worse than O(1) is actually pretty devastating to performance in general. But I don't see how it can be avoided other than by advising people that it's a bad idea to declare types which are worse than O(1) to copy (which would mean that such types should instead be COW types or reference types). Trying to write algorithms in a way which minimizes copying certainly helps, but sometimes copying is necessary, and it's very easy to introduce unnecessary copying accidentally - copying whose cost only becomes evident with types that are expensive to copy, while most unit testing is done with primitive types or simple user-defined types. And most unit testing doesn't involve any sort of timing or benchmarking (which is part of why Andrei wants to push for std.benchmark), so the cost wouldn't necessarily be evident even if unit tests _did_ use types which were expensive to copy.

So, I totally sympathize with Andrei and Walter wanting to say that copying must be O(1). I just don't think that it's realistic to expect that to actually hold in the general case. At best, it's something that should be considered best practice.

- Jonathan M Davis
December 09, 2012
On Sunday, December 09, 2012 12:36:09 Jonathan M Davis wrote:
> On Sunday, December 09, 2012 21:21:10 Mehrdad wrote:
> > On Sunday, 9 December 2012 at 20:14:18 UTC, Jonathan M Davis
> > 
> > wrote:
> > > Algorithms end up copying all the time
> > 
> > Hmm... like which ones, and copying what kinds of objects?
> 
> Just calling front on a range is going to copy the element...

Oh, and ranges themselves get copied all over the place, so if someone is foolish enough to make copying _them_ expensive, they're screwed. Of course, if someone actually made a range deep-copy itself with a postblit, that would also violate the semantics of how ranges work, but it's an example of a type that gets copied a lot, and if copying it were expensive, performance would probably tank.
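
A rough sketch of why that matters (countElements is a made-up example function): generic range-based functions take the range by value, so the range itself is copied on every call, and any postblit on the range type runs each time.

import std.range : iota, isInputRange;

// Hypothetical: the range is passed by value, so each call copies it.
// If R had an expensive postblit, every one of these calls (and any
// internal copies) would pay that cost.
size_t countElements(R)(R r)
    if (isInputRange!R)
{
    size_t n;
    for (; !r.empty; r.popFront())   // consumes the copy, not the caller's range
        ++n;
    return n;
}

unittest
{
    assert(countElements(iota(0, 10)) == 10);
}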

- Jonathan M Davis
December 09, 2012
On 12/9/2012 11:16 AM, Jonathan M Davis wrote:
> That's kind of the point of having postblits in the first place,

As I argued previously, the only justification I can think of for postblit is reference counting. The counter-examples posted here were variations on reference counting, or could be done in terms of reference counting.

In fact, I think we could solve the postblit problems with const, immutable, and shared by dispensing with postblit entirely and providing some sort of compiler magic for doing ref counting.
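
For reference, the kind of reference counting meant here can be sketched with today's postblit roughly as follows (a hypothetical, non-thread-safe illustration; RefCounted and Impl are made-up names):

import core.stdc.stdlib : malloc, free;

// Minimal reference-counted payload built out of nothing but a postblit
// and a destructor.
struct RefCounted
{
    private struct Impl { int payload; size_t count; }
    private Impl* impl;

    this(int value)
    {
        impl = cast(Impl*) malloc(Impl.sizeof);
        impl.payload = value;
        impl.count = 1;
    }

    this(this)                              // postblit: a new owner appeared
    {
        if (impl !is null) ++impl.count;
    }

    ~this()                                 // last owner frees the payload
    {
        if (impl !is null && --impl.count == 0)
            free(impl);
    }

    int get() const { return impl.payload; }
}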
December 09, 2012
On Sunday, December 09, 2012 13:11:43 Walter Bright wrote:
> On 12/9/2012 11:16 AM, Jonathan M Davis wrote:
> > That's kind of the point of having postblits in the first place,
> 
> As I argued previously, the only justification I can think of for postblit is reference counting. The counter-examples posted here were variations on reference counting, or could be done in terms of reference counting.
> 
> In fact, I think we could solve the postblit problems with const, immutable, and shared by dispensing with postblit entirely and providing some sort of compiler magic for doing ref counting.

It is incredibly common in C++ to have objects which live on the stack and are deep-copied whenever they're copied. Postblit lets us do the same in D (aside from the issues with it not working with const and immutable), and TDPL even gives examples of doing precisely that (e.g. duping an array in a postblit constructor). I don't see why we should disallow structs which are deep-copied when they're copied. You'd then be forcing everything which contains any kind of reference type to be a reference type itself. And if you're going to do that, why not just make them all classes? Why allow complex structs like we have if you're just going to hamstring them like that? We could have just gone with C#'s limited structs if that's what you're looking for.
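
The TDPL-style example mentioned above looks roughly like this (a minimal sketch; Values is a made-up name):

// The postblit dups the array so that each copy owns its own data
// (plain value semantics, no COW).
struct Values
{
    int[] data;
    this(this) { data = data.dup; }
}

unittest
{
    auto a = Values([1, 2, 3]);
    auto b = a;                 // postblit runs: b gets an independent copy
    b.data[0] = 42;
    assert(a.data[0] == 1);     // a is unaffected
}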

Declaring a struct which contains reference types and does a deep copy with postblit obviously incurs a performance cost, but it's up to the programmer who declared it to decide whether that's worth it or not. I see no reason for the language to try and disallow that. That's overly restrictive.

- Jonathan M Davis
December 09, 2012
On Sunday, December 09, 2012 13:41:46 Jonathan M Davis wrote:
> Declaring a struct which contains reference types and does a deep copy with postblit obviously incurs a performance cost, but it's up to the programmer who declared it to decide whether that's worth it or not. I see no reason for the language to try and disallow that. That's overly restrictive.

Not to mention, we've already discussed fixing current features so that they stop disallowing certain idioms. For instance, in order to make it so that caching and lazy loading and whatnot are not disallowed in classes due to issues with const, we've talked about removing toString, opEquals, toHash, and opCmp from Object. Then the programmer can decide whether they want to use const or not and we'd no longer be disallowing the idioms that are prevented by forcing const on Object.

You're basically suggesting that we disallow any idiom which requires that structs be deep copied, and I think that that's bad policy. It's one thing to encourage programmers to not write such structs and to use other idioms like COW or reference counting. It's another thing entirely to disallow them. It's one of C++'s prime tenets to try and not force the programmer to program in a certain way or in a certain paradigm, and I think that D should do the same. If we want to encourage certain idioms and discourage others, fine. But outright disallowing them is a bad idea IMHO.

- Jonathan M Davis
December 09, 2012
On Sunday, 9 December 2012 at 22:10:50 UTC, Jonathan M Davis
wrote:
> On Sunday, December 09, 2012 13:41:46 Jonathan M Davis wrote:
>> Declaring a struct which contains reference types and does a deep copy with
>> postblit obviously incurs a performance cost, but it's up to the programmer
>> who declared it to decide whether that's worth it or not. I see no reason
>> for the language to try and disallow that. That's overly restrictive.
>
> Not to mention, we've already discussed fixing current features so that they
> stop disallowing certain idioms. For instance, in order to make it so that
> caching and lazy loading and whatnot are not disallowed in classes due to
> issues with const, we've talked about removing toString, opEquals, toHash, and
> opCmp from Object. Then the programmer can decide whether they want to use
> const or not and we'd no longer be disallowing the idioms that are prevented
> by forcing const on Object.
>
> You're basically suggesting that we disallow any idiom which requires that
> structs be deep copied, and I think that that's bad policy. It's one thing to
> encourage programmers to not write such structs and to use other idioms like
> COW or reference counting. It's another thing entirely to disallow them. It's
> one of C++'s prime tenets to try and not force the programmer to program in a
> certain way or in a certain paradigm, and I think that D should do the same.
> If we want to encourage certain idioms and discourage others, fine. But
> outright disallowing them is a bad idea IMHO.
>
> - Jonathan M Davis

I can't agree more
December 10, 2012
On 12/9/2012 2:10 PM, Jonathan M Davis wrote:
> You're basically suggesting that we disallow any idiom which requires that
> structs be deep copied, and I think that that's bad policy. It's one thing to
> encourage programmers to not write such structs and to use other idioms like
> COW or reference counting. It's another thing entirely to disallow them. It's
> one of C++'s prime tenets to try and not force the programmer to program in a
> certain way or in a certain paradigm, and I think that D should do the same.

We already disallow several C++ idioms - like multiple inheritance, using a type as both a value and a reference type, and head const. We believe that these are bad design patterns, despite them being used often in C++.

I do not dispute that deep copy is commonly used in C++. I challenge the idea that it is a good design pattern, i.e. better than using copy-on-write.

December 10, 2012
On Monday, 10 December 2012 at 03:53:04 UTC, Walter Bright wrote:
> I do not dispute that deep copy is commonly used in C++. I challenge the idea that it is a good design pattern, i.e. better than using copy-on-write.

Ignoring the design issue, it's plain faster -- for COW you have to perform checks on every call.
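
A rough sketch of the kind of check being referred to (a hypothetical CowArray type; here only mutating operations pay the check, though a design that hands out references would need it on reads as well):

// Hypothetical copy-on-write int array. Copying it is O(1), but every
// mutating operation first has to check whether the data is shared;
// that check is the per-call cost in question.
struct CowArray
{
    private int[] data;
    private size_t* refs;

    this(int[] initial)
    {
        data = initial.dup;
        refs = new size_t;
        *refs = 1;
    }

    this(this)                                  // copying only bumps the count
    {
        if (refs !is null) ++*refs;
    }

    ~this()
    {
        if (refs !is null) --*refs;
    }

    int opIndex(size_t i) const { return data[i]; }

    void opIndexAssign(int value, size_t i)
    {
        if (refs !is null && *refs > 1)         // the check on every mutating call
        {
            --*refs;                            // detach from the shared data
            data = data.dup;
            refs = new size_t;
            *refs = 1;
        }
        data[i] = value;
    }
}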
December 10, 2012
On Sunday, December 09, 2012 19:52:15 Walter Bright wrote:
> We already disallow several C++ idioms - like multiple inheritance, using a type as both a value and a reference type, and head const. We believe that these are bad design patterns, despite them being used often in C++.

Well, I certainly dispute that deep copying with copy constructors or postblits is as bad as you seem to think it is. It's often sub-optimal, to be sure, but it's also often overkill to do something like COW given its extra level of complexity. I think that there's a world of difference between disallowing multiple inheritance and disallowing objects which are deep-copied when they're copied.

> I do not dispute that deep copy is commonly used in C++. I challenge the idea that it is a good design pattern, i.e. better than using copy-on-write.

It's easier to implement normal, deep copying than COW, and it's less error-prone. It's also less code. Particularly if you don't need the extra speed of COW, it actually seems better to me to simply do a deep copy with a postblit. Why go to the extra effort of making COW work correctly if a simple deep copy does the trick just fine? If you _do_ need the extra efficiency of COW, then you'll do it anyway. But plenty of folks won't need it and won't want to bother. Why force it on them?

Not to mention, unless you can find ways to do everything that you'd need a postblit or copy constructor for without them and then actually get rid of postblits, doing deep copies is going to remain possible. And in the process of getting rid of postblits, you could easily end up disallowing other things which are useful and innocuous (e.g. something as simple as printing out when an object gets copied while debugging).
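
For instance, the debugging trick mentioned at the end is about as simple as postblits get (a hypothetical sketch; Traced and takeByValue are made-up names):

import std.stdio : writeln;

// A postblit used purely to trace copies while debugging.
struct Traced
{
    int id;
    this(this) { writeln("Traced ", id, " was copied"); }
}

void takeByValue(Traced t) {}

void main()
{
    auto t = Traced(1);
    takeByValue(t);     // prints: Traced 1 was copied
}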

- Jonathan M Davis