April 29, 2009
Hello,


I'm defining two ranges that go vertically over a range of ranges. Consider for example an int[][]. That's a range that has a range as its element type.

FrontTransversal goes transversally over the range of ranges and returns the front of each element in turn:

    int[][] x = new int[][2];
    x[0] = [1, 2];
    x[1] = [3, 4];
    auto ror = frontTransversal(x);
    auto witness = [ 1, 3 ];
    uint i;
    foreach (e; ror) assert(e == witness[i++]);
    assert(i == 2);

Then Transversal does a similar thing, just that instead of going through the front of each range, it goes over a column set during construction:

    int[][] x = new int[][2];
    x[0] = [1, 2];
    x[1] = [3, 4];
    auto ror = transversal(x, 1);
    auto witness = [ 2, 4 ];
    uint i;
    foreach (e; ror) assert(e == witness[i++]);
    assert(i == 2);

Question is, what would be two good names for these ranges?


Andrei
April 29, 2009
Andrei Alexandrescu:
> I'm defining two ranges that go vertically over a range of ranges. Consider for example an int[][]. That's a range that has a range as its element type.

Maybe I am not understanding the true purpose of this, but isn't a general 2D/ND slicer much better?

Bye,
bearophile
April 30, 2009
Andrei Alexandrescu wrote:
> Hello,
> 
> 
> I'm defining two ranges that go vertically over a range of ranges. Consider for example an int[][]. That's a range that has a range as its element type.
> 
> FrontTransversal goes transversally over the range of ranges and returns the front of each element in turn:
> 
>     int[][] x = new int[][2];
>     x[0] = [1, 2];
>     x[1] = [3, 4];
>     auto ror = frontTransversal(x);
>     auto witness = [ 1, 3 ];
>     uint i;
>     foreach (e; ror) assert(e == witness[i++]);
>     assert(i == 2);
> 
> Then Transversal does a similar thing, just that instead of going through the front of each range, it goes over a column set during construction:
> 
>     int[][] x = new int[][2];
>     x[0] = [1, 2];
>     x[1] = [3, 4];
>     auto ror = transversal(x, 1);
>     auto witness = [ 2, 4 ];
>     uint i;
>     foreach (e; ror) assert(e == witness[i++]);
>     assert(i == 2);
> 
> Question is, what would be two good names for these ranges?
> 
> 
> Andrei

Question is, how this fits into a collection/container package.

..So frankly, I don't worry about naming  conventions. I am more concerned about ranges within a let's say dynamic d e queue.
Lock free ADTs  are a final proof of product , no ?

Maybe I miss something.
But fact is : D2 (Phobos) still don't have (not even in it's brand new incarnation ) support for simple humpty dumpty collections. So Proof of product is where ?
April 30, 2009
BLS wrote:
> Question is, how this fits into a collection/container package.
> 
> ..So frankly, I don't worry about naming  conventions. I am more concerned about ranges within a let's say dynamic d e queue.
> Lock free ADTs  are a final proof of product , no ?
> 
> Maybe I miss something.
> But fact is : D2 (Phobos) still don't have (not even in it's brand new incarnation ) support for simple humpty dumpty collections. So Proof of product is where ?

You're not missing anything, except for perhaps a little patience :o).

There is vivid discussion about container support in D2. I was actually
thinking of posting a RFC sooner or later here.

It is very exciting to realize that with minimal (and actually
simplifying) changes to the language we can offer the option of
garbage-collected vs. reference counted vs. "unsafe" semi-automatically
managed memory. Moreover, I realized that the global allocation model
only affects the built-in literal types such as T[] and T[U], but still
allows using the other models as library types.

It has become increasingly clear we'll need to support arrays in
addition to slices. One big question mark for me is, should we define
for arrays (and associative arrays and containers in general) value
semantics, reference counted value semantics, or reference semantics? It
is not clear to me yet what the best option is. Think of arrays as an
archetypal example.

1. Value semantics (arrays are like int)

+ Simple to reason about
+ No long-distance sharing of state
+ Familiar to refugees coming from C++ with STL
- Unfamiliar to Java/C# refugees
+ Relatively simple to implement
+ Good behavior as members inside structs and classes
- Bad behavior as function parameters: if you forget to pass by
reference, performance will be harmed

2. Value semantics with reference counting

+- All tradeoffs of value semantics, except as noted below
- Difficult to implement
+ Cheap to copy around, will "divide" only upon mutation attempts
- Innocuous operations may throw/occasionally take a long time

3. Reference semantics

- Hard to reason about: squirrel an array into an object, and way later
and farther it modifies the array and effects a change at long distance
- Long-distance effects may as well be some of the toughest bugs after
memory errors
+ Cheap to copy
+ Familiar to Java/C# refugees
- Less familiar to C++ refugees
- Potentially inefficient as people might start sprinkling copies in the
hope the eliminate bugs or chances of bugs
+ const system helps: if a function doesn't change something just mark
it as const and you're home free
+ immutable also helps: immutable data can never change so it can be
shared no problem. Not sure how interesting immutable containers are though.
- Implementing fast/unsafe semantics with reference semantics
effectively forces reference counting to be part of containers, which
complicates implementation.

Any thoughts, just share! For the record, Walter has a bias in favor
of... I'll better don't bias you guys.

One thing: it would be relatively easy to switch among semantics by defining a wrapper such as Value!(T) or Ref!(T). So whatever choice we make, it won't paint us into a corner.


Andrei
April 30, 2009

Andrei Alexandrescu wrote:
> A lot of interesting stuff.

POLS (Principle of Least Surprise) says they should be reference types.
 That's what C, C++ (T[] doesn't look like it's using STL), C#, Java,
Lua, Python, etc., etc. programmers will expect.  Jumping out from
behind a bush a surprising them with value semantics will probably not
impress them.

And don't say you can just document it: we get enough people coming in who refuse to read the docs until we tell them to.  :P

That said, having a Value!T template [1] that implements value-semantics would be good.  I'll be honest with you: I'm not sure I'd use it for fear of performance issues.

Perhaps making it a copy-on-write type under the hood would help with that ("it only copies if you mutate and someone else has a reference; it's fast AND safe!").

  -- Daniel


[1] I'd be tempted to call it Block!T as in "here's a block of data".
April 30, 2009
Andrei ,
many thanks, ( not only because you are taking  the time to  explain ), I think I have to  say  “Thanks Andrei” because your answer was very gentle, ... ignoring all my grammar and (more important) logical mistakes... let me remain with a : You are listening and this is in my world : very generous.

Still : Container RFC ?
Björn



April 30, 2009
Andrei Alexandrescu Wrote:

> BLS wrote:
> > Question is, how this fits into a collection/container package.
> > 
> > ..So frankly, I don't worry about naming  conventions. I am more concerned about ranges within a let's say dynamic d e queue. Lock free ADTs  are a final proof of product , no ?
> > 
> > Maybe I miss something.
> > But fact is : D2 (Phobos) still don't have (not even in it's brand new
> > incarnation ) support for simple humpty dumpty collections. So Proof of
> > product is where ?
> 
> You're not missing anything, except for perhaps a little patience :o).
> 
> There is vivid discussion about container support in D2. I was actually thinking of posting a RFC sooner or later here.
> 
> It is very exciting to realize that with minimal (and actually simplifying) changes to the language we can offer the option of garbage-collected vs. reference counted vs. "unsafe" semi-automatically managed memory. Moreover, I realized that the global allocation model only affects the built-in literal types such as T[] and T[U], but still allows using the other models as library types.
> 
> It has become increasingly clear we'll need to support arrays in addition to slices. One big question mark for me is, should we define for arrays (and associative arrays and containers in general) value semantics, reference counted value semantics, or reference semantics? It is not clear to me yet what the best option is. Think of arrays as an archetypal example.
> 
> 1. Value semantics (arrays are like int)
> 
> + Simple to reason about
> + No long-distance sharing of state
> + Familiar to refugees coming from C++ with STL
> - Unfamiliar to Java/C# refugees
> + Relatively simple to implement
> + Good behavior as members inside structs and classes
> - Bad behavior as function parameters: if you forget to pass by
> reference, performance will be harmed
> 
> 2. Value semantics with reference counting
> 
> +- All tradeoffs of value semantics, except as noted below
> - Difficult to implement
> + Cheap to copy around, will "divide" only upon mutation attempts
> - Innocuous operations may throw/occasionally take a long time
> 
> 3. Reference semantics
> 
> - Hard to reason about: squirrel an array into an object, and way later
> and farther it modifies the array and effects a change at long distance
> - Long-distance effects may as well be some of the toughest bugs after
> memory errors
> + Cheap to copy
> + Familiar to Java/C# refugees
> - Less familiar to C++ refugees
> - Potentially inefficient as people might start sprinkling copies in the
> hope the eliminate bugs or chances of bugs
> + const system helps: if a function doesn't change something just mark
> it as const and you're home free
> + immutable also helps: immutable data can never change so it can be
> shared no problem. Not sure how interesting immutable containers are though.
> - Implementing fast/unsafe semantics with reference semantics
> effectively forces reference counting to be part of containers, which
> complicates implementation.
> 
> Any thoughts, just share! For the record, Walter has a bias in favor of... I'll better don't bias you guys.
> 
> One thing: it would be relatively easy to switch among semantics by defining a wrapper such as Value!(T) or Ref!(T). So whatever choice we make, it won't paint us into a corner.
> 
> 
> Andrei

I'm a C++ refugee and value semantics make no sense to me. I always declared classes (as part of function declaration) in C++ with a &... Either T& or const T&. There were plenty of bugs where I messed up and forgot the &. D is nice in that it tries to simplify that. You yourself had issues with remembering ref parameters when implementing ranges!
April 30, 2009
Jason House wrote:
> I'm a C++ refugee and value semantics make no sense to me. I always
> declared classes (as part of function declaration) in C++ with a &...
> Either T& or const T&. There were plenty of bugs where I messed up
> and forgot the &. D is nice in that it tries to simplify that. You
> yourself had issues with remembering ref parameters when implementing
> ranges!

Good point, but the issue boils down to this: do we define container members more often, or define container function parameters more often? Because inside a data structure value semantics are useful more often than not, and in a function parameter list reference semantics are useful more often than not.

I do agree that D programmers grown used to cheap-to-pass data and I don't want to cure anyone of hiccups :o).


Andrei
April 30, 2009
Andrei Alexandrescu wrote:
> 1. Value semantics (arrays are like int)

Don't forget:
 + Supports timely destruction of contents (i.e. RAII).

> 2. Value semantics with reference counting

I like this optimization and use it all the time in my own code, but I'm not convinced that it should be the default.  It's also problematic in multithreaded situations.

I think a generic CopyOnWrite wrapper over arbitrary value types would be more useful.  CopyOnWrite!(int[]).

> 3. Reference semantics

I'm strongly opposed to this option.  Either of the other options would be acceptable.


-- 
Rainer Deyke - rainerd@eldwood.com
April 30, 2009
Andrei Alexandrescu wrote:
> Jason House wrote:
>> I'm a C++ refugee and value semantics make no sense to me. I always
>> declared classes (as part of function declaration) in C++ with a &...
>> Either T& or const T&. There were plenty of bugs where I messed up
>> and forgot the &. D is nice in that it tries to simplify that. You
>> yourself had issues with remembering ref parameters when implementing
>> ranges!
> 
> Good point, but the issue boils down to this: do we define container members more often, or define container function parameters more often? Because inside a data structure value semantics are useful more often than not, and in a function parameter list reference semantics are useful more often than not.
> 
> I do agree that D programmers grown used to cheap-to-pass data and I don't want to cure anyone of hiccups :o).

Will it be an incurable mess if you let the programmer choose each time? (Then we'd of course have this same discussion about which is the default. :-) ) But still, I can see situations where it is _obvious_ which should be used.

Somebody into more functional style would have different needs than the more imperative guy. Not to mention more specific situations.
« First   ‹ Prev
1 2 3 4 5 6 7 8 9 10
Top | Discussion index | About this forum | D home