View mode: basic / threaded / horizontal-split · Log in · Help
April 30, 2009
Re: RFC: naming for FrontTransversal and Transversal ranges
On 2009-04-29 23:01:39 -0400, Rainer Deyke <rainerd@eldwood.com> said:

> Andrei Alexandrescu wrote:
>> 2. Value semantics with reference counting
> 
> I like this optimization and use it all the time in my own code, but I'm
> not convinced that it should be the default.  It's also problematic in
> multithreaded situations.
> 
> I think a generic CopyOnWrite wrapper over arbitrary value types would
> be more useful.  CopyOnWrite!(int[]).

But don't forget that D2 is going to have the shared keyword, possibly 
with thread-local heaps. So multithreading won't be much of a problem: 
if the variable is shared, the compiler should force locks as needed or 
use atomic operations; if the variable is local to a thread, locks 
should become no-ops and atomic operations regular ones, improving 
performance.

Or at least that's what I'm getting from the hints we currently have.


>> 3. Reference semantics
> 
> I'm strongly opposed to this option.  Either of the other options would
> be acceptable.

Andrei mentioned a couple of reasons against 3, which are yours?


-- 
Michel Fortin
michel.fortin@michelf.com
http://michelf.com/
April 30, 2009
Re: RFC: naming for FrontTransversal and Transversal ranges
Rainer Deyke wrote:
> Andrei Alexandrescu wrote:
>> 1. Value semantics (arrays are like int)
> 
> Don't forget:
>  + Supports timely destruction of contents (i.e. RAII).

These are collections. RAII doesn't matter when you have a garbage 
collector and the only resource you have is memory, unless you're quite 
strapped for memory.
April 30, 2009
Re: RFC: naming for FrontTransversal and Transversal ranges
On Wed, 29 Apr 2009 20:45:23 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail@erdani.org> wrote:
>
> It has become increasingly clear we'll need to support arrays in
> addition to slices.

No, Andrei it hasn't. A detailed paragraph (or more) explaining we you  
think so should be included in the full RFC.

> One big question mark for me is, should we define
> for arrays (and associative arrays and containers in general) value
> semantics, reference counted value semantics, or reference semantics? It
> is not clear to me yet what the best option is. Think of arrays as an
> archetypal example.

Here are some additional pros/cons:

1. Value semantics (arrays are like int)
 +  can return static arrays from functions
 -  can be emulated with slice assign and .dup, where needed
 -  requires arrays and slices to be separate types
 +- requires opImplicitCast from container to slice type
 +- requires template argument deduction to be opImplicitCast/alias this  
aware
 -  If you forget to pass by reference (when you meant to), logical bugs  
are produced (i.e. my mutation doesn't get published).
 -  Dis-allows inheritance from all containers (i.e. implemented as  
structs)

I think clarifying 2 as being copy-on-write is useful.
2. Copy-on-write (Value semantics with reference counting)
 +- The added trade-offs of value semantics
 -  arrays may no longer be passed in registers (performance reduction)

3. Reference semantics
 ++ Allows arrays and slices to be the same type
 +  How D's arrays/slices work today
 +  Familiar to C/C++ refugees (Similar to plain old arrays)
 +  Familiar to most OO language refugees: containers are objects
 +  Allows inheritance from most containers (i.e. implemented as objects,  
except for arrays)

> - Hard to reason about: squirrel an array into an object, and way later
> and farther it modifies the array and effects a change at long distance
> - Long-distance effects may as well be some of the toughest bugs after
> memory errors
> - Potentially inefficient as people might start sprinkling copies in the
> hope the eliminate bugs or chances of bugs
> + const system helps: if a function doesn't change something just mark
> it as const and you're home free
> + immutable also helps: immutable data can never change so it can be
> shared no problem. Not sure how interesting immutable containers are  
> though.

All these issues also exist with objects, which people are familiar with.

> - Implementing fast/unsafe semantics with reference semantics
> effectively forces reference counting to be part of containers, which
> complicates implementation.

I'm drawing a blank on this. My thinking is this;
1) garbage-collected:   reference counting isn't needed
2) reference counted:   reference counting is expected and built into  
objects in general.
3) "unsafe" management: safety is explicitly not expected, so reference  
counting isn't needed

My vote is for reference semantics. It's fast, familiar, easy to reason  
about (i.e. like objects) and doesn't require seperate array and slice  
types.
April 30, 2009
Re: RFC: naming for FrontTransversal and Transversal ranges
Michel Fortin wrote:
> On 2009-04-29 23:01:39 -0400, Rainer Deyke <rainerd@eldwood.com> said:
>> Andrei Alexandrescu wrote:
>>> 3. Reference semantics
>> I'm strongly opposed to this option.  Either of the other options would
>> be acceptable.
> 
> Andrei mentioned a couple of reasons against 3, which are yours?

I agree with all of Andrei's reasons.

In addition, there's this: suppose you have a struct containing a
(mutable) array.  When you make a copy of that struct, you will almost
always want to make a copy of the contained array.  Therefore value
semantics should be the default, because it simplifies the most common
use case.


-- 
Rainer Deyke - rainerd@eldwood.com
April 30, 2009
Re: RFC: naming for FrontTransversal and Transversal ranges
Christopher Wright wrote:
> Rainer Deyke wrote:
>> Don't forget:
>>  + Supports timely destruction of contents (i.e. RAII).
> 
> These are collections. RAII doesn't matter when you have a garbage
> collector and the only resource you have is memory, unless you're quite
> strapped for memory.

It also doesn't matter if the resource deallocation pixie deallocates
all my non-memory resources the moment they are no longer being used,
which is about as likely as only having memory resources in arrays.


-- 
Rainer Deyke - rainerd@eldwood.com
April 30, 2009
Re: RFC: naming for FrontTransversal and Transversal ranges
On Thu, 30 Apr 2009 14:40:42 -0400, Rainer Deyke <rainerd@eldwood.com>  
wrote:
> In addition, there's this: suppose you have a struct containing a
> (mutable) array.  When you make a copy of that struct, you will almost
> always want to make a copy of the contained array.  Therefore value
> semantics should be the default, because it simplifies the most common
> use case.

That's what struct ctors/dtors/opAssign are for. Personally, I think  
letting the struct designer decide the behaviour is better.
April 30, 2009
Re: RFC: naming for FrontTransversal and Transversal ranges
Robert Jacques wrote:
> On Thu, 30 Apr 2009 14:40:42 -0400, Rainer Deyke <rainerd@eldwood.com> 
> wrote:
>> In addition, there's this: suppose you have a struct containing a
>> (mutable) array.  When you make a copy of that struct, you will almost
>> always want to make a copy of the contained array.  Therefore value
>> semantics should be the default, because it simplifies the most common
>> use case.
> 
> That's what struct ctors/dtors/opAssign are for. Personally, I think 
> letting the struct designer decide the behaviour is better.

The problem is what to do for the containers in Phobos.

Andrei
April 30, 2009
Re: RFC: naming for FrontTransversal and Transversal ranges
Robert Jacques wrote:
> On Wed, 29 Apr 2009 20:45:23 -0400, Andrei Alexandrescu 
> <SeeWebsiteForEmail@erdani.org> wrote:
>>
>> It has become increasingly clear we'll need to support arrays in
>> addition to slices.
> 
> No, Andrei it hasn't. A detailed paragraph (or more) explaining we you 
> think so should be included in the full RFC.

There are several converging pieces of evidence. Each can be debated in 
separation, yet together they make for a rather strong case.

People have complained about the incredibly bad, unsafe, and slow 
behavior of appending. Even before the considerable recent slowdown of 
~=, there were speed concerns. Worse than that, the completely weird 
behavior of slices that grow to stomp on other slices cannot be defended.

The idiom:

array.length = n;
array.length = 0;
// append to array up to n...

is rather disingenuous and looks odd to the casual reader.

As predicted by the "range theory" which predicates that a range will 
never grow, only shrink, slices should always shrink. Growing is a 
foreign operation for a slice. This is a big problem that goes beyond a 
need for purity or cleanliness - it's just terrible to have a slice 
stomping around, and then uncontrollably being rebound elsewhere and so 
on. People pass slices into functions, functions modify them and also 
append to them, and depending on the phase of the moon, some updates are 
being seen in the caller.

So people have defined various types (such as ArrayCreator or 
ArrayAppender or whatnot) that take care of that aspect. But really what 
we're looking at is an Array type.

Another issue is that we can get (partly) away with disowned slices 
because of the garbage collector: whenever a slice is to be rebound, 
whatever chunk it was bound to is left behind and will be duly 
collected. If, however, we want to support other programming disciplines 
that exercise stricter control over memory, the "parent" arrays must 
become an explicit type that code can keep around and manipulate directly.

A design that has one container type and several slice types that crawl 
it in various ways is very compelling, and I think it will be a huge 
selling point for D. It would be odd, not natural, to have arrays as a 
singularly distinct container that is in fact its own range. We get away 
with container == range for arrays partly because arrays are a simple 
structure, but that only blurs thinking and understanding of more 
complex containers. In fact, ranges do predict that any attempt to grow 
a range will cause topological issues in the owning container; that's 
why SList wisely (IMHO :o)) includes a template parameter that tells 
whether its topology is fixed or flexible.

(Last but not least, .NET has arrays and slices. D's slices that 
actually have an array testis mesh real bad with .NET because in .NET 
you can't have a disowned slice.)


Andrei
April 30, 2009
Re: RFC: naming for FrontTransversal and Transversal ranges
On Thu, 30 Apr 2009 23:57:17 +0400, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> wrote:

> Robert Jacques wrote:
>> On Wed, 29 Apr 2009 20:45:23 -0400, Andrei Alexandrescu  
>> <SeeWebsiteForEmail@erdani.org> wrote:
>>>
>>> It has become increasingly clear we'll need to support arrays in
>>> addition to slices.
>>  No, Andrei it hasn't. A detailed paragraph (or more) explaining we you  
>> think so should be included in the full RFC.
>
> There are several converging pieces of evidence. Each can be debated in  
> separation, yet together they make for a rather strong case.
>
> People have complained about the incredibly bad, unsafe, and slow  
> behavior of appending. Even before the considerable recent slowdown of  
> ~=, there were speed concerns. Worse than that, the completely weird  
> behavior of slices that grow to stomp on other slices cannot be defended.
>
> The idiom:
>
> array.length = n;
> array.length = 0;
> // append to array up to n...
>
> is rather disingenuous and looks odd to the casual reader.
>
> As predicted by the "range theory" which predicates that a range will  
> never grow, only shrink, slices should always shrink. Growing is a  
> foreign operation for a slice. This is a big problem that goes beyond a  
> need for purity or cleanliness - it's just terrible to have a slice  
> stomping around, and then uncontrollably being rebound elsewhere and so  
> on. People pass slices into functions, functions modify them and also  
> append to them, and depending on the phase of the moon, some updates are  
> being seen in the caller.
>
> So people have defined various types (such as ArrayCreator or  
> ArrayAppender or whatnot) that take care of that aspect. But really what  
> we're looking at is an Array type.
>
> Another issue is that we can get (partly) away with disowned slices  
> because of the garbage collector: whenever a slice is to be rebound,  
> whatever chunk it was bound to is left behind and will be duly  
> collected. If, however, we want to support other programming disciplines  
> that exercise stricter control over memory, the "parent" arrays must  
> become an explicit type that code can keep around and manipulate  
> directly.
>
> A design that has one container type and several slice types that crawl  
> it in various ways is very compelling, and I think it will be a huge  
> selling point for D. It would be odd, not natural, to have arrays as a  
> singularly distinct container that is in fact its own range. We get away  
> with container == range for arrays partly because arrays are a simple  
> structure, but that only blurs thinking and understanding of more  
> complex containers. In fact, ranges do predict that any attempt to grow  
> a range will cause topological issues in the owning container; that's  
> why SList wisely (IMHO :o)) includes a template parameter that tells  
> whether its topology is fixed or flexible.
>
> (Last but not least, .NET has arrays and slices. D's slices that  
> actually have an array testis mesh real bad with .NET because in .NET  
> you can't have a disowned slice.)
>
>
> Andrei

I'm all for introducing Array type, but what would using it look like? Is it a plain Array!(T), or is there any shortcut planned. One of the selling points of slices is their ease of use. T[] is freaking awesome!

I also would like to see what would Phobos look like after introducing Arrays.

A good thing is that it's quite easy start implementing Array right now - all you need is to disallow ~ and ~= operations on slices. Then, go through Phobos code and see what got broken and start fixing it using newly written Array template (compiler doesn't have to know anything about it). I beleve it will become immediately clear whether D Arrays should be a reference or value type.
April 30, 2009
Re: RFC: naming for FrontTransversal and Transversal ranges
Robert Jacques wrote:
> On Thu, 30 Apr 2009 14:40:42 -0400, Rainer Deyke <rainerd@eldwood.com>
> wrote:
>> In addition, there's this: suppose you have a struct containing a
>> (mutable) array.  When you make a copy of that struct, you will almost
>> always want to make a copy of the contained array.  Therefore value
>> semantics should be the default, because it simplifies the most common
>> use case.
> 
> That's what struct ctors/dtors/opAssign are for. Personally, I think
> letting the struct designer decide the behaviour is better.

I hope you're not suggesting writing ctor/dtor/opAssign for every single
struct that uses arrays as value types.

It's possible to write a reference wrapper around a value type.  It's
also possible to write a value wrapper around a reference type.
However, the former is both easier and more efficient than the latter.


-- 
Rainer Deyke - rainerd@eldwood.com
1 2 3 4 5 6
Top | Discussion index | About this forum | D home