View mode: basic / threaded / horizontal-split · Log in · Help
January 02, 2010
Re: output ranges: by ref or by value?
Steven Schveighoffer wrote:
> On Fri, 01 Jan 2010 18:45:35 -0500, Andrei Alexandrescu 
> <SeeWebsiteForEmail@erdani.org> wrote:
> 
>> Rainer Deyke wrote:
>>> Andrei Alexandrescu wrote:
>>>> If the implementor of consume() forgets to call save(), the 
>>>> situation is
>>>> unpleasant albeit not catastrophic: for most struct ranges things will
>>>> continue to work, but for class ranges the function will fail to 
>>>> perform
>>>> to spec. I don't know how to improve on that.
>>>  Require that all ranges are structs.  If you want to implement a range
>>> as a class, use a wrapper struct that creates a new object in its
>>> postblit function.  The wrapper struct can be made generic and placed in
>>> the standard library.
>>>  Same performance as the current approach, slightly more effort on the
>>> part of the range implementor, much easier and less error-prone on the
>>> side of the range user.
>>
>> Oh, besides it doesn't work for struct ranges that iterate one-pass 
>> streams.
> 
> What does save do in those cases?
> 
> -Steve

It provides a syntactic differentiation between input ranges and forward 
ranges.

STL's input iterators are syntactically indistinguishable from forward 
iterators. Because of that, all STL algorithms that expect forward 
ranges will also compile and run with input ranges, although never with 
the expected result. This has been a lingering problem with C++98, and a 
key motivator for concepts. Since syntactic differences cannot be used 
to distinguish between input and forward iterators, the reasoning went, 
we need something else as a discriminator - and that would be a concept: 
someone who defines an input iterator declares it obeys the input 
iterator concept, whereas someone defining a forward iterator declares 
it obeys the forward iterator concept.

During the meltdown of concepts, a number of people including Bjarne 
Stroustrup and myself have suggested that a simple workable solution 
would be to define an extra function a la "save" that is used in 
algorithms and only supported by forward iterators, but not by input 
iterators. Then, algorithms use save() and will correctly refuse to 
compile calls with input iterators. The remaining risk is that someone 
writes an algorithm and forgets to use save().


Andrei
January 02, 2010
Re: output ranges: by ref or by value?
On Sat, 02 Jan 2010 13:06:25 -0500, Andrei Alexandrescu  
<SeeWebsiteForEmail@erdani.org> wrote:

> Steven Schveighoffer wrote:
>> On Fri, 01 Jan 2010 18:45:35 -0500, Andrei Alexandrescu  
>> <SeeWebsiteForEmail@erdani.org> wrote:
>>
>>> Rainer Deyke wrote:
>>>> Andrei Alexandrescu wrote:
>>>>> If the implementor of consume() forgets to call save(), the  
>>>>> situation is
>>>>> unpleasant albeit not catastrophic: for most struct ranges things  
>>>>> will
>>>>> continue to work, but for class ranges the function will fail to  
>>>>> perform
>>>>> to spec. I don't know how to improve on that.
>>>>  Require that all ranges are structs.  If you want to implement a  
>>>> range
>>>> as a class, use a wrapper struct that creates a new object in its
>>>> postblit function.  The wrapper struct can be made generic and placed  
>>>> in
>>>> the standard library.
>>>>  Same performance as the current approach, slightly more effort on the
>>>> part of the range implementor, much easier and less error-prone on the
>>>> side of the range user.
>>>
>>> Oh, besides it doesn't work for struct ranges that iterate one-pass  
>>> streams.
>>  What does save do in those cases?
>>  -Steve
>
> It provides a syntactic differentiation between input ranges and forward  
> ranges.
>
> STL's input iterators are syntactically indistinguishable from forward  
> iterators. Because of that, all STL algorithms that expect forward  
> ranges will also compile and run with input ranges, although never with  
> the expected result. This has been a lingering problem with C++98, and a  
> key motivator for concepts. Since syntactic differences cannot be used  
> to distinguish between input and forward iterators, the reasoning went,  
> we need something else as a discriminator - and that would be a concept:  
> someone who defines an input iterator declares it obeys the input  
> iterator concept, whereas someone defining a forward iterator declares  
> it obeys the forward iterator concept.
>
> During the meltdown of concepts, a number of people including Bjarne  
> Stroustrup and myself have suggested that a simple workable solution  
> would be to define an extra function a la "save" that is used in  
> algorithms and only supported by forward iterators, but not by input  
> iterators. Then, algorithms use save() and will correctly refuse to  
> compile calls with input iterators. The remaining risk is that someone  
> writes an algorithm and forgets to use save().

Would it not be as useful then to just define attributes on the type and  
save a bunch of useless/weird looking code?

that is, have an enum value inside the type that defines its state can be  
saved.  It seems to me like save is the equivalent of that anyways, since  
its possible to forget to use it, and still have your code compile.

Basically, it appears to me that save either a) doesn't compile or b) is  
the equivalent of assignment.  It doesn't seem to add much to the  
functionality.

This is all except for classes, which I think have no business being  
ranges in the first place.

-Steve
January 02, 2010
Re: output ranges: by ref or by value?
Steven Schveighoffer wrote:
> On Sat, 02 Jan 2010 13:06:25 -0500, Andrei Alexandrescu 
> <SeeWebsiteForEmail@erdani.org> wrote:
> 
>> Steven Schveighoffer wrote:
>>> On Fri, 01 Jan 2010 18:45:35 -0500, Andrei Alexandrescu 
>>> <SeeWebsiteForEmail@erdani.org> wrote:
>>>
>>>> Rainer Deyke wrote:
>>>>> Andrei Alexandrescu wrote:
>>>>>> If the implementor of consume() forgets to call save(), the 
>>>>>> situation is
>>>>>> unpleasant albeit not catastrophic: for most struct ranges things 
>>>>>> will
>>>>>> continue to work, but for class ranges the function will fail to 
>>>>>> perform
>>>>>> to spec. I don't know how to improve on that.
>>>>>  Require that all ranges are structs.  If you want to implement a 
>>>>> range
>>>>> as a class, use a wrapper struct that creates a new object in its
>>>>> postblit function.  The wrapper struct can be made generic and 
>>>>> placed in
>>>>> the standard library.
>>>>>  Same performance as the current approach, slightly more effort on the
>>>>> part of the range implementor, much easier and less error-prone on the
>>>>> side of the range user.
>>>>
>>>> Oh, besides it doesn't work for struct ranges that iterate one-pass 
>>>> streams.
>>>  What does save do in those cases?
>>>  -Steve
>>
>> It provides a syntactic differentiation between input ranges and 
>> forward ranges.
>>
>> STL's input iterators are syntactically indistinguishable from forward 
>> iterators. Because of that, all STL algorithms that expect forward 
>> ranges will also compile and run with input ranges, although never 
>> with the expected result. This has been a lingering problem with 
>> C++98, and a key motivator for concepts. Since syntactic differences 
>> cannot be used to distinguish between input and forward iterators, the 
>> reasoning went, we need something else as a discriminator - and that 
>> would be a concept: someone who defines an input iterator declares it 
>> obeys the input iterator concept, whereas someone defining a forward 
>> iterator declares it obeys the forward iterator concept.
>>
>> During the meltdown of concepts, a number of people including Bjarne 
>> Stroustrup and myself have suggested that a simple workable solution 
>> would be to define an extra function a la "save" that is used in 
>> algorithms and only supported by forward iterators, but not by input 
>> iterators. Then, algorithms use save() and will correctly refuse to 
>> compile calls with input iterators. The remaining risk is that someone 
>> writes an algorithm and forgets to use save().
> 
> Would it not be as useful then to just define attributes on the type and 
> save a bunch of useless/weird looking code?
> 
> that is, have an enum value inside the type that defines its state can 
> be saved.  It seems to me like save is the equivalent of that anyways, 
> since its possible to forget to use it, and still have your code compile.

If you have an enum that says a range's state can be saved, then you 
still need a function to effect that save :o). So you added, not 
removed, bloat.

> Basically, it appears to me that save either a) doesn't compile or b) is 
> the equivalent of assignment.  It doesn't seem to add much to the 
> functionality.

No, for class ranges save() is nontrivial (a sort of clone). I suspect 
even struct ranges for certain file-oriented stuff would do nontrivial 
work (e.g. open the file anew and fseek() on the current position etc.)

> This is all except for classes, which I think have no business being 
> ranges in the first place.

Well then how would the container hierarchy work? It does need range 
interfaces. How does dcollections deal with that?


Andrei
January 02, 2010
Re: output ranges: by ref or by value?
On Sat, 02 Jan 2010 13:51:48 -0500, Andrei Alexandrescu  
<SeeWebsiteForEmail@erdani.org> wrote:

> Steven Schveighoffer wrote:
>>  Would it not be as useful then to just define attributes on the type  
>> and save a bunch of useless/weird looking code?
>>  that is, have an enum value inside the type that defines its state can  
>> be saved.  It seems to me like save is the equivalent of that anyways,  
>> since its possible to forget to use it, and still have your code  
>> compile.
>
> If you have an enum that says a range's state can be saved, then you  
> still need a function to effect that save :o). So you added, not  
> removed, bloat.

The function already exists -- opAssign.  My point is that save would just  
be the same as opAssign in most cases where you'd want to implement save.   
Cases where you'd want to implement save differently than opAssign are  
cases where you wouldn't want to use it in a generic fashion.

>
>> Basically, it appears to me that save either a) doesn't compile or b)  
>> is the equivalent of assignment.  It doesn't seem to add much to the  
>> functionality.
>
> No, for class ranges save() is nontrivial (a sort of clone). I suspect  
> even struct ranges for certain file-oriented stuff would do nontrivial  
> work (e.g. open the file anew and fseek() on the current position etc.)

Classes should not be ranges.  And I hope that any algorithm that uses  
savable ranges would not be used on a file range.  For example, your  
consume function:

bool consume(R1, R2)(ref R1 r1, R2 r2)
        if (isForwardRange!R1 && isInputRange!R2)
{
    auto r = r1.save();  // open an extra file, and seek the file to the  
given point *just in case*?!
    while (!r2.empty && !r.empty && r.front == r2.front) {
        r.popFront();
        r2.popFront();
    }
    if (r2.empty) {
        r1 = r;  // note that this unaliases r1 from any other copies (not  
save()'d copies) that were made of it.
                 // also note that you now have opened more handles to the  
same file.  Calling this many times could
                 // consume quite a few handles.
        return true;
    }
    return false;
}

Here is a good exercise that would help clarify what we need: determine  
all the types of ranges you can think of that would need a special save  
function.

>> This is all except for classes, which I think have no business being  
>> ranges in the first place.
>
> Well then how would the container hierarchy work? It does need range  
> interfaces. How does dcollections deal with that?

A container is not a range.  A range of a container should not be a  
class.  For dcollections, I was planning on ranges being a struct with a  
begin and end cursor defining the part of the container referenced by the  
range.  Such a range can always be copied and modifying the copy through  
the range functions should not modify the original range.

I see a range as being useful for iteration or algorithms, but not for  
general use.  A great example is AAs.  Would you say that an AA *is* a  
range or should *provide* a range?  If it is a range, does that mean you  
remove elements as you call popFront?  Does that make any sense?  If it  
doesn't, then what happens if you add elements through another alias to  
that AA?

-Steve
January 03, 2010
Re: output ranges: by ref or by value?
Steven Schveighoffer wrote:
> On Sat, 02 Jan 2010 13:51:48 -0500, Andrei Alexandrescu 
> <SeeWebsiteForEmail@erdani.org> wrote:
> 
>> Steven Schveighoffer wrote:
>>>  Would it not be as useful then to just define attributes on the type 
>>> and save a bunch of useless/weird looking code?
>>>  that is, have an enum value inside the type that defines its state 
>>> can be saved.  It seems to me like save is the equivalent of that 
>>> anyways, since its possible to forget to use it, and still have your 
>>> code compile.
>>
>> If you have an enum that says a range's state can be saved, then you 
>> still need a function to effect that save :o). So you added, not 
>> removed, bloat.
> 
> The function already exists -- opAssign.  My point is that save would 
> just be the same as opAssign in most cases where you'd want to implement 
> save.  Cases where you'd want to implement save differently than 
> opAssign are cases where you wouldn't want to use it in a generic fashion.

You might mean this(this). Either doesn't help because you'd still be 
unable to differentiate between input ranges and forward ranges. Much of 
the point of save() is to mark a syntactic difference between input 
ranges and forward ranges. Input ranges still need this(this) and 
opAssign - those just have different semantics.

>>> Basically, it appears to me that save either a) doesn't compile or b) 
>>> is the equivalent of assignment.  It doesn't seem to add much to the 
>>> functionality.
>>
>> No, for class ranges save() is nontrivial (a sort of clone). I suspect 
>> even struct ranges for certain file-oriented stuff would do nontrivial 
>> work (e.g. open the file anew and fseek() on the current position etc.)
> 
> Classes should not be ranges.  And I hope that any algorithm that uses 
> savable ranges would not be used on a file range.  For example, your 
> consume function:
> 
> bool consume(R1, R2)(ref R1 r1, R2 r2)
>         if (isForwardRange!R1 && isInputRange!R2)
> {
>     auto r = r1.save();  // open an extra file, and seek the file to the 
> given point *just in case*?!
>     while (!r2.empty && !r.empty && r.front == r2.front) {
>         r.popFront();
>         r2.popFront();
>     }
>     if (r2.empty) {
>         r1 = r;  // note that this unaliases r1 from any other copies 
> (not save()'d copies) that were made of it.
>                  // also note that you now have opened more handles to 
> the same file.  Calling this many times could
>                  // consume quite a few handles.
>         return true;
>     }
>     return false;
> }
> 
> Here is a good exercise that would help clarify what we need: determine 
> all the types of ranges you can think of that would need a special save 
> function.

A range that defines save() is a forward range. save() creates an 
independent range from its source. The file etc. example was 
hypothetical but realistic.

>>> This is all except for classes, which I think have no business being 
>>> ranges in the first place.
>>
>> Well then how would the container hierarchy work? It does need range 
>> interfaces. How does dcollections deal with that?
> 
> A container is not a range.  A range of a container should not be a 
> class.  For dcollections, I was planning on ranges being a struct with a 
> begin and end cursor defining the part of the container referenced by 
> the range.  Such a range can always be copied and modifying the copy 
> through the range functions should not modify the original range.

Struct ranges won't work with a container hierarchy. If you define a 
container hierarchy (classes etc.) you'll also need a range hierarchy. 
Otherwise defining the inheritance relationship is impossible. Consider:

abstract class Container(E) { // most general container
    @property bool empty();
    bool add(E element);
    E removeAny();
    InputRange!E opSlice();
}

That's what I'd expect any container worth its salt to implement: (a) 
test for empty; (b) add an element to the container; (c) remove some 
element from the container; (d) get a range that spans the entire 
container. (Probably removeAll() and a range-positioned remove() would 
make sense, too.)

My point is, how do you inherit this guy? Well by taking advantage of 
covariant return types:

abstract class Array(E) : Container!E {
    @property bool empty();
    bool add(E element);
    E removeAny();
    RandomAccessRange!E opSlice();
    ... more stuff ...
}

Ergo, RandomAccessRange!E must inherit InputRange!E for covariance to 
kick in. The resulting setup is quite interesting: you either know you 
work with an Array and therefore you get a random-access range, or you 
just use the generic Container and get an input range.

> I see a range as being useful for iteration or algorithms, but not for 
> general use.  A great example is AAs.  Would you say that an AA *is* a 
> range or should *provide* a range?  If it is a range, does that mean you 
> remove elements as you call popFront?  Does that make any sense?  If it 
> doesn't, then what happens if you add elements through another alias to 
> that AA?

An AA provides several ranges - among which byKey and byValue.


Andrei
January 03, 2010
Re: output ranges: by ref or by value?
On Sat, 02 Jan 2010 19:51:41 -0500, Andrei Alexandrescu  
<SeeWebsiteForEmail@erdani.org> wrote:

> Steven Schveighoffer wrote:
>>  The function already exists -- opAssign.  My point is that save would  
>> just be the same as opAssign in most cases where you'd want to  
>> implement save.  Cases where you'd want to implement save differently  
>> than opAssign are cases where you wouldn't want to use it in a generic  
>> fashion.
>
> You might mean this(this).

Yes, that's what I meant :)  I haven't gotten used to struct constructors.

> Either doesn't help because you'd still be unable to differentiate  
> between input ranges and forward ranges. Much of the point of save() is  
> to mark a syntactic difference between input ranges and forward ranges.  
> Input ranges still need this(this) and opAssign - those just have  
> different semantics.

That was the point of the enum.  It would be a synthetic difference, but  
IMO so is save.  If it turns out that the only usable times you implement  
save all look like:

auto save() { return *this; }

then save gives you nothing.  It's kind of a proof by induction (I think).

You say that algorithm A requires the ability to save the state of a  
range.  So I define a function save on a range which cannot be saved by a  
simple assignment (i.e. a file).  I use A on that range, and the results  
are not what I expect or kill performance or consume unneeded resources,  
I'd rather not use algorithm A on that range, negating the need to define  
a save function in the first place.

I think that's what we're going to end up with.

> A range that defines save() is a forward range. save() creates an  
> independent range from its source. The file etc. example was  
> hypothetical but realistic.

I meant what actual types of ranges, not categories, I don't know how to  
say this better...  i.e. the file range is one type, what are others?   
What I'm looking for is ranges that would otherwise be input ranges unless  
you define save on them.  For example, a file range without save is an  
input range, cannot be a forward range.  But if you define save, it is a  
forward range.  An array is an example of a range that can be a forward  
range without the save function.

My theory is, given this list of ranges, if you pair them with an  
algorithm that requires save capability, you wouldn't want to use that  
algorithm on it anyways (kinda like the consume example).

> Struct ranges won't work with a container hierarchy. If you define a  
> container hierarchy (classes etc.) you'll also need a range hierarchy.  
> Otherwise defining the inheritance relationship is impossible. Consider:
>
> abstract class Container(E) { // most general container
>      @property bool empty();
>      bool add(E element);
>      E removeAny();
>      InputRange!E opSlice();
> }
>
> That's what I'd expect any container worth its salt to implement: (a)  
> test for empty; (b) add an element to the container; (c) remove some  
> element from the container; (d) get a range that spans the entire  
> container. (Probably removeAll() and a range-positioned remove() would  
> make sense, too.)

The range interface is compile-time only, there is no need to define it in  
the container interface.  opSlice does not need to be part of the  
interface, even if it is defined on all the container types.

opApply makes for a much better interface-friendly iteration anyways.

BTW, the primitives in dcollections are:

clear(); // clear all elements
remove(V v); // remove an element
contains(V v); // returns whether an element is contained in the colleciton
length(); // the length of the collection
dup(); // duplicate the collection
opApply(int delegate(ref V v) dg); // iterate the collection
opApply(int delegate(ref bool doPurge, ref V v) dg); // purge the  
collection

That means it covers only empty in your list of must-have functions (via  
length() == 0).  Add is not a primitive because the Map collections  
shouldn't assign some random key to the new element.  removeAny is defined  
only on sets and multisets, but I'm not sure that it couldn't be moved to  
Collection, in fact, I'll probably do that.

Note that it's missing begin and end which are defined on every single  
container type (i.e. the equivalent of the all-elements range).  This is  
because those primitives return a struct that is different for every  
container type.

It also surpasses opSlice via opApply, since all an input range can do  
anyways is iterate.  In fact, opApply is more powerful because you can  
change elements and remove elements (via purging).  Plus it's more  
efficient than a range-via-interface.

>> I see a range as being useful for iteration or algorithms, but not for  
>> general use.  A great example is AAs.  Would you say that an AA *is* a  
>> range or should *provide* a range?  If it is a range, does that mean  
>> you remove elements as you call popFront?  Does that make any sense?   
>> If it doesn't, then what happens if you add elements through another  
>> alias to that AA?
>
> An AA provides several ranges - among which byKey and byValue.

I misunderstood your statement "[a container hierarchy] does need range  
interfaces."  I thought you meant containers themselves need to implement  
the range interface, I see now that isn't the case, so my bad.

-Steve
January 03, 2010
Re: output ranges: by ref or by value?
On 2010-01-02 21:46:57 -0500, "Steven Schveighoffer" 
<schveiguy@yahoo.com> said:

> That was the point of the enum.  It would be a synthetic difference, 
> but  IMO so is save.  If it turns out that the only usable times you 
> implement  save all look like:
> 
> auto save() { return *this; }
> 
> then save gives you nothing.  It's kind of a proof by induction (I think).
> 
> You say that algorithm A requires the ability to save the state of a  
> range.  So I define a function save on a range which cannot be saved by 
> a  simple assignment (i.e. a file).  I use A on that range, and the 
> results  are not what I expect or kill performance or consume unneeded 
> resources,  I'd rather not use algorithm A on that range, negating the 
> need to define  a save function in the first place.
> 
> I think that's what we're going to end up with.

This is making me think...

First, opening files silently whenever an algorithm feels the need to 
save its state is just madness. What if the operating system decide to 
pop a security window when opening a restricted file? What if the file 
has been deleted or replaced in its directory while your program is 
still hanging on the old inode?

One might want to save the position in a file in order to be able to 
return there later, but when you return there the last thing you 
actually want to do is to open the file a second time: what you might 
realistically want is to set the position to what it was before, 
calling fseek. So perhaps it would be more useful to have both save() 
to save the current state (the position in a file), and restore() which 
would restore the saved state (returning to the saved position in a 
file).

For ranges with reference semantics -- please call them streams! -- 
save() and restore() would work just as fpos and fseek for files, but 
they might also not be available like in a TCP stream or in a 
communication channel between threads.

For ranges such as array slices, save and restore would just copy the 
range in one or the other direction. The only reason to have them is so 
they can be used as streams.

-- 
Michel Fortin
michel.fortin@michelf.com
http://michelf.com/
January 03, 2010
Re: output ranges: by ref or by value?
Steven Schveighoffer wrote:
> My theory is, given this list of ranges, if you pair them with an 
> algorithm that requires save capability, you wouldn't want to use that 
> algorithm on it anyways (kinda like the consume example).

Why gratuitously limit the design? You're asking to replace this:

R save() { return this; }

with:

enum thisIsAForwardRange = true;

Is there a reason? The former leaves in flexibility. The latter doesn't, 
for no good reason.

>> Struct ranges won't work with a container hierarchy. If you define a 
>> container hierarchy (classes etc.) you'll also need a range hierarchy. 
>> Otherwise defining the inheritance relationship is impossible. Consider:
>>
>> abstract class Container(E) { // most general container
>>      @property bool empty();
>>      bool add(E element);
>>      E removeAny();
>>      InputRange!E opSlice();
>> }
>>
>> That's what I'd expect any container worth its salt to implement: (a) 
>> test for empty; (b) add an element to the container; (c) remove some 
>> element from the container; (d) get a range that spans the entire 
>> container. (Probably removeAll() and a range-positioned remove() would 
>> make sense, too.)
> 
> The range interface is compile-time only, there is no need to define it 
> in the container interface.  opSlice does not need to be part of the 
> interface, even if it is defined on all the container types.
> 
> opApply makes for a much better interface-friendly iteration anyways.

I want container users to be able to save iteration state and also to 
open up containers to std.algorithm and other goodies, so I'm shying 
away from opApply.

> BTW, the primitives in dcollections are:
> 
> clear(); // clear all elements
> remove(V v); // remove an element

Search and remove? That's an odd primitive. Why wouldn't you offer an 
interface for iteration that allows an algorithm for search, and a 
primitive for positioned removal?

> contains(V v); // returns whether an element is contained in the colleciton

I don't think this belongs to primitives. It's O(n) for many containers 
and again it's a generic algorithm, not a member.

> length(); // the length of the collection

That's not a primitive either. std.algorithm.walkLength is. For me, all 
sorts of red flags and alarm buzzers go off when primitives are 
guaranteed that can't be implemented efficiently but by a subset of 
containers. You can't discount complexity as a implementation detail.

> dup(); // duplicate the collection
> opApply(int delegate(ref V v) dg); // iterate the collection
> opApply(int delegate(ref bool doPurge, ref V v) dg); // purge the 
> collection
> 
> That means it covers only empty in your list of must-have functions (via 
> length() == 0).

How do you implement length() for a singly-linked list? Is empty() going 
to take O(n)?

> Add is not a primitive because the Map collections 
> shouldn't assign some random key to the new element.  removeAny is 
> defined only on sets and multisets, but I'm not sure that it couldn't be 
> moved to Collection, in fact, I'll probably do that.

add is a primitive that takes Tuple!(K, V) as the element type.

> Note that it's missing begin and end which are defined on every single 
> container type (i.e. the equivalent of the all-elements range).  This is 
> because those primitives return a struct that is different for every 
> container type.

So you can't write container-independent iteration code unless you use 
opApply, in which case composition becomes tenuous.

> It also surpasses opSlice via opApply, since all an input range can do 
> anyways is iterate.  In fact, opApply is more powerful because you can 
> change elements and remove elements (via purging).  Plus it's more 
> efficient than a range-via-interface.

An input range is a subset of other (more powerful) ranges. It's also 
much more flexible. I agree that calling range primitives via interfaces 
can become an efficiency liability.

>>> I see a range as being useful for iteration or algorithms, but not 
>>> for general use.  A great example is AAs.  Would you say that an AA 
>>> *is* a range or should *provide* a range?  If it is a range, does 
>>> that mean you remove elements as you call popFront?  Does that make 
>>> any sense?  If it doesn't, then what happens if you add elements 
>>> through another alias to that AA?
>>
>> An AA provides several ranges - among which byKey and byValue.
> 
> I misunderstood your statement "[a container hierarchy] does need range 
> interfaces."  I thought you meant containers themselves need to 
> implement the range interface, I see now that isn't the case, so my bad.

Yah, they'd offer it as a result of opSlice(). Covariant return types 
will ensure there's no virtual call when you know what container you 
operate on.

Above all: the primitive set for a container must be a small set of 
functions that (a) can be implemented by all containers within 
reasonable efficiency bounds, and (b) are container-specific, not 
generic. IMHO any container design that defines a search(Element) as a 
primitive is misguided. Searching is not a container primitive - it's an 
algorithm. Only more specialized containers (e.g. trees, hashes etc.) 
can afford to define search(Element) as a primitive. Linear search is a 
generic algorithm that works the same for everyone. It does not belong 
as a method of any container.


Andrei
January 03, 2010
Re: output ranges: by ref or by value?
Michel Fortin wrote:
> On 2010-01-02 21:46:57 -0500, "Steven Schveighoffer" 
> <schveiguy@yahoo.com> said:
> 
>> That was the point of the enum.  It would be a synthetic difference, 
>> but  IMO so is save.  If it turns out that the only usable times you 
>> implement  save all look like:
>>
>> auto save() { return *this; }
>>
>> then save gives you nothing.  It's kind of a proof by induction (I 
>> think).
>>
>> You say that algorithm A requires the ability to save the state of a  
>> range.  So I define a function save on a range which cannot be saved 
>> by a  simple assignment (i.e. a file).  I use A on that range, and the 
>> results  are not what I expect or kill performance or consume unneeded 
>> resources,  I'd rather not use algorithm A on that range, negating the 
>> need to define  a save function in the first place.
>>
>> I think that's what we're going to end up with.
> 
> This is making me think...
> 
> First, opening files silently whenever an algorithm feels the need to 
> save its state is just madness. What if the operating system decide to 
> pop a security window when opening a restricted file? What if the file 
> has been deleted or replaced in its directory while your program is 
> still hanging on the old inode?
> 
> One might want to save the position in a file in order to be able to 
> return there later, but when you return there the last thing you 
> actually want to do is to open the file a second time: what you might 
> realistically want is to set the position to what it was before, calling 
> fseek. So perhaps it would be more useful to have both save() to save 
> the current state (the position in a file), and restore() which would 
> restore the saved state (returning to the saved position in a file).

These are implementation matters. save() could lazily save information, 
duplicate the file handle (cheap on many OSs), etc.

> For ranges with reference semantics -- please call them streams! -- 
> save() and restore() would work just as fpos and fseek for files, but 
> they might also not be available like in a TCP stream or in a 
> communication channel between threads.
> 
> For ranges such as array slices, save and restore would just copy the 
> range in one or the other direction. The only reason to have them is so 
> they can be used as streams.

This I don't understand. Array slices' save is:

T[] save(T[] array) { return array; }


Andrei
January 03, 2010
Re: output ranges: by ref or by value?
On 2010-01-03 00:51:46 -0500, Andrei Alexandrescu 
<SeeWebsiteForEmail@erdani.org> said:

>> First, opening files silently whenever an algorithm feels the need to 
>> save its state is just madness. What if the operating system decide to 
>> pop a security window when opening a restricted file? What if the file 
>> has been deleted or replaced in its directory while your program is 
>> still hanging on the old inode?
>> 
>> One might want to save the position in a file in order to be able to 
>> return there later, but when you return there the last thing you 
>> actually want to do is to open the file a second time: what you might 
>> realistically want is to set the position to what it was before, 
>> calling fseek. So perhaps it would be more useful to have both save() 
>> to save the current state (the position in a file), and restore() which 
>> would restore the saved state (returning to the saved position in a 
>> file).
> 
> These are implementation matters. save() could lazily save information, 
> duplicate the file handle (cheap on many OSs), etc.




>> For ranges with reference semantics -- please call them streams! -- 
>> save() and restore() would work just as fpos and fseek for files, but 
>> they might also not be available like in a TCP stream or in a 
>> communication channel between threads.
>> 
>> For ranges such as array slices, save and restore would just copy the 
>> range in one or the other direction. The only reason to have them is so 
>> they can be used as streams.
> 
> This I don't understand. Array slices' save is:
> 
> T[] save(T[] array) { return array; }

The idea is that you would use save and restore like this:

	auto state = range.save();
	// ...
	range.restore(state);

For array slices, save doesn't really have to change, just define restore:

	T[] save(T[] array) { return array; }
	void restore(ref T[] array, T[] state) { array = state; }

But with file handles:

	size_t save(FILE handle) { return fpos(handle); }
	void restore(FILE handle, size_t state) { fseek(handle, state); }

And for a range interface to be used in a class hierarchy:

	interface Range {
		Variant save();
		void restore(Variant state);
	}

This way you don't have to create a new range every time you need to 
start from a previously saved state, you can reuse the existing one 
which is much more efficient.

Also, the saved state can be anything, of any type, and doesn't need to 
be a range. So when implementing a non-trivial range you don't need to 
implement lazy initialization logic and all sort of complicated stuff 
just in case someone might save the state.

Note that with save defined like this you can't really duplicate the 
range in the general case. I'm not sure if this is good or bad. How 
many algorithms really require you to duplicate a range?

-- 
Michel Fortin
michel.fortin@michelf.com
http://michelf.com/
1 2 3 4
Top | Discussion index | About this forum | D home