output ranges: by ref or by value? (page 3) - D Programming Language Discussion Forum

Forums

New users
- Learn
Community
- General
- Announce
Improvements
- DIP Ideas
- DIP Devel.
Ecosystem
- GDC
- LDC
- Debuggers
- IDEs
- DWT
Development
- Internals
- Issues
- Beta
- DMD
- Phobos
- Druntime
- Study
Turkish
- Genel
- Duyuru

Index » General » output ranges: by ref or by value? (page 3)

January 02, 2010

Re: output ranges: by ref or by value?

Posted by Andrei Alexandrescu
in reply to Steven Schveighoffer

Andrei Alexandrescu

Posted in reply to Steven Schveighoffer

Steven Schveighoffer wrote:
> On Fri, 01 Jan 2010 18:45:35 -0500, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> wrote:
> 
>> Rainer Deyke wrote:
>>> Andrei Alexandrescu wrote:
>>>> If the implementor of consume() forgets to call save(), the situation is
>>>> unpleasant albeit not catastrophic: for most struct ranges things will
>>>> continue to work, but for class ranges the function will fail to perform
>>>> to spec. I don't know how to improve on that.
>>>  Require that all ranges are structs.  If you want to implement a range
>>> as a class, use a wrapper struct that creates a new object in its
>>> postblit function.  The wrapper struct can be made generic and placed in
>>> the standard library.
>>>  Same performance as the current approach, slightly more effort on the
>>> part of the range implementor, much easier and less error-prone on the
>>> side of the range user.
>>
>> Oh, besides it doesn't work for struct ranges that iterate one-pass streams.
> 
> What does save do in those cases?
> 
> -Steve

It provides a syntactic differentiation between input ranges and forward ranges.

STL's input iterators are syntactically indistinguishable from forward iterators. Because of that, all STL algorithms that expect forward ranges will also compile and run with input ranges, although never with the expected result. This has been a lingering problem with C++98, and a key motivator for concepts. Since syntactic differences cannot be used to distinguish between input and forward iterators, the reasoning went, we need something else as a discriminator - and that would be a concept: someone who defines an input iterator declares it obeys the input iterator concept, whereas someone defining a forward iterator declares it obeys the forward iterator concept.

During the meltdown of concepts, a number of people including Bjarne Stroustrup and myself have suggested that a simple workable solution would be to define an extra function a la "save" that is used in algorithms and only supported by forward iterators, but not by input iterators. Then, algorithms use save() and will correctly refuse to compile calls with input iterators. The remaining risk is that someone writes an algorithm and forgets to use save().

Andrei

January 02, 2010

Re: output ranges: by ref or by value?

Posted by Steven Schveighoffer
in reply to Andrei Alexandrescu

Steven Schveighoffer

Posted in reply to Andrei Alexandrescu

On Sat, 02 Jan 2010 13:06:25 -0500, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> wrote:

> Steven Schveighoffer wrote:
>> On Fri, 01 Jan 2010 18:45:35 -0500, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> wrote:
>>
>>> Rainer Deyke wrote:
>>>> Andrei Alexandrescu wrote:
>>>>> If the implementor of consume() forgets to call save(), the situation is
>>>>> unpleasant albeit not catastrophic: for most struct ranges things will
>>>>> continue to work, but for class ranges the function will fail to perform
>>>>> to spec. I don't know how to improve on that.
>>>>  Require that all ranges are structs.  If you want to implement a range
>>>> as a class, use a wrapper struct that creates a new object in its
>>>> postblit function.  The wrapper struct can be made generic and placed in
>>>> the standard library.
>>>>  Same performance as the current approach, slightly more effort on the
>>>> part of the range implementor, much easier and less error-prone on the
>>>> side of the range user.
>>>
>>> Oh, besides it doesn't work for struct ranges that iterate one-pass streams.
>>  What does save do in those cases?
>>  -Steve
>
> It provides a syntactic differentiation between input ranges and forward ranges.
>
> STL's input iterators are syntactically indistinguishable from forward iterators. Because of that, all STL algorithms that expect forward ranges will also compile and run with input ranges, although never with the expected result. This has been a lingering problem with C++98, and a key motivator for concepts. Since syntactic differences cannot be used to distinguish between input and forward iterators, the reasoning went, we need something else as a discriminator - and that would be a concept: someone who defines an input iterator declares it obeys the input iterator concept, whereas someone defining a forward iterator declares it obeys the forward iterator concept.
>
> During the meltdown of concepts, a number of people including Bjarne Stroustrup and myself have suggested that a simple workable solution would be to define an extra function a la "save" that is used in algorithms and only supported by forward iterators, but not by input iterators. Then, algorithms use save() and will correctly refuse to compile calls with input iterators. The remaining risk is that someone writes an algorithm and forgets to use save().

Would it not be as useful then to just define attributes on the type and save a bunch of useless/weird looking code?

that is, have an enum value inside the type that defines its state can be saved.  It seems to me like save is the equivalent of that anyways, since its possible to forget to use it, and still have your code compile.

Basically, it appears to me that save either a) doesn't compile or b) is the equivalent of assignment.  It doesn't seem to add much to the functionality.

This is all except for classes, which I think have no business being ranges in the first place.

-Steve

January 02, 2010

Re: output ranges: by ref or by value?

Posted by Andrei Alexandrescu
in reply to Steven Schveighoffer

Andrei Alexandrescu

Posted in reply to Steven Schveighoffer

Steven Schveighoffer wrote:
> On Sat, 02 Jan 2010 13:06:25 -0500, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> wrote:
> 
>> Steven Schveighoffer wrote:
>>> On Fri, 01 Jan 2010 18:45:35 -0500, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> wrote:
>>>
>>>> Rainer Deyke wrote:
>>>>> Andrei Alexandrescu wrote:
>>>>>> If the implementor of consume() forgets to call save(), the situation is
>>>>>> unpleasant albeit not catastrophic: for most struct ranges things will
>>>>>> continue to work, but for class ranges the function will fail to perform
>>>>>> to spec. I don't know how to improve on that.
>>>>>  Require that all ranges are structs.  If you want to implement a range
>>>>> as a class, use a wrapper struct that creates a new object in its
>>>>> postblit function.  The wrapper struct can be made generic and placed in
>>>>> the standard library.
>>>>>  Same performance as the current approach, slightly more effort on the
>>>>> part of the range implementor, much easier and less error-prone on the
>>>>> side of the range user.
>>>>
>>>> Oh, besides it doesn't work for struct ranges that iterate one-pass streams.
>>>  What does save do in those cases?
>>>  -Steve
>>
>> It provides a syntactic differentiation between input ranges and forward ranges.
>>
>> STL's input iterators are syntactically indistinguishable from forward iterators. Because of that, all STL algorithms that expect forward ranges will also compile and run with input ranges, although never with the expected result. This has been a lingering problem with C++98, and a key motivator for concepts. Since syntactic differences cannot be used to distinguish between input and forward iterators, the reasoning went, we need something else as a discriminator - and that would be a concept: someone who defines an input iterator declares it obeys the input iterator concept, whereas someone defining a forward iterator declares it obeys the forward iterator concept.
>>
>> During the meltdown of concepts, a number of people including Bjarne Stroustrup and myself have suggested that a simple workable solution would be to define an extra function a la "save" that is used in algorithms and only supported by forward iterators, but not by input iterators. Then, algorithms use save() and will correctly refuse to compile calls with input iterators. The remaining risk is that someone writes an algorithm and forgets to use save().
> 
> Would it not be as useful then to just define attributes on the type and save a bunch of useless/weird looking code?
> 
> that is, have an enum value inside the type that defines its state can be saved.  It seems to me like save is the equivalent of that anyways, since its possible to forget to use it, and still have your code compile.

If you have an enum that says a range's state can be saved, then you still need a function to effect that save :o). So you added, not removed, bloat.

> Basically, it appears to me that save either a) doesn't compile or b) is the equivalent of assignment.  It doesn't seem to add much to the functionality.

No, for class ranges save() is nontrivial (a sort of clone). I suspect even struct ranges for certain file-oriented stuff would do nontrivial work (e.g. open the file anew and fseek() on the current position etc.)

> This is all except for classes, which I think have no business being ranges in the first place.

Well then how would the container hierarchy work? It does need range interfaces. How does dcollections deal with that?


Andrei

January 02, 2010

Re: output ranges: by ref or by value?

Posted by Steven Schveighoffer
in reply to Andrei Alexandrescu

Steven Schveighoffer

Posted in reply to Andrei Alexandrescu

On Sat, 02 Jan 2010 13:51:48 -0500, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> wrote:

> Steven Schveighoffer wrote:
>>  Would it not be as useful then to just define attributes on the type and save a bunch of useless/weird looking code?
>>  that is, have an enum value inside the type that defines its state can be saved.  It seems to me like save is the equivalent of that anyways, since its possible to forget to use it, and still have your code compile.
>
> If you have an enum that says a range's state can be saved, then you still need a function to effect that save :o). So you added, not removed, bloat.

The function already exists -- opAssign.  My point is that save would just be the same as opAssign in most cases where you'd want to implement save.  Cases where you'd want to implement save differently than opAssign are cases where you wouldn't want to use it in a generic fashion.

>
>> Basically, it appears to me that save either a) doesn't compile or b) is the equivalent of assignment.  It doesn't seem to add much to the functionality.
>
> No, for class ranges save() is nontrivial (a sort of clone). I suspect even struct ranges for certain file-oriented stuff would do nontrivial work (e.g. open the file anew and fseek() on the current position etc.)

Classes should not be ranges.  And I hope that any algorithm that uses savable ranges would not be used on a file range.  For example, your consume function:

bool consume(R1, R2)(ref R1 r1, R2 r2)
        if (isForwardRange!R1 && isInputRange!R2)
{
    auto r = r1.save();  // open an extra file, and seek the file to the given point *just in case*?!
    while (!r2.empty && !r.empty && r.front == r2.front) {
        r.popFront();
        r2.popFront();
    }
    if (r2.empty) {
        r1 = r;  // note that this unaliases r1 from any other copies (not save()'d copies) that were made of it.
                 // also note that you now have opened more handles to the same file.  Calling this many times could
                 // consume quite a few handles.
        return true;
    }
    return false;
}

Here is a good exercise that would help clarify what we need: determine all the types of ranges you can think of that would need a special save function.

>> This is all except for classes, which I think have no business being ranges in the first place.
>
> Well then how would the container hierarchy work? It does need range interfaces. How does dcollections deal with that?

A container is not a range.  A range of a container should not be a class.  For dcollections, I was planning on ranges being a struct with a begin and end cursor defining the part of the container referenced by the range.  Such a range can always be copied and modifying the copy through the range functions should not modify the original range.

I see a range as being useful for iteration or algorithms, but not for general use.  A great example is AAs.  Would you say that an AA *is* a range or should *provide* a range?  If it is a range, does that mean you remove elements as you call popFront?  Does that make any sense?  If it doesn't, then what happens if you add elements through another alias to that AA?

-Steve

January 03, 2010

Re: output ranges: by ref or by value?

Posted by Andrei Alexandrescu
in reply to Steven Schveighoffer

Andrei Alexandrescu

Posted in reply to Steven Schveighoffer

Steven Schveighoffer wrote:
> On Sat, 02 Jan 2010 13:51:48 -0500, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> wrote:
> 
>> Steven Schveighoffer wrote:
>>>  Would it not be as useful then to just define attributes on the type and save a bunch of useless/weird looking code?
>>>  that is, have an enum value inside the type that defines its state can be saved.  It seems to me like save is the equivalent of that anyways, since its possible to forget to use it, and still have your code compile.
>>
>> If you have an enum that says a range's state can be saved, then you still need a function to effect that save :o). So you added, not removed, bloat.
> 
> The function already exists -- opAssign.  My point is that save would just be the same as opAssign in most cases where you'd want to implement save.  Cases where you'd want to implement save differently than opAssign are cases where you wouldn't want to use it in a generic fashion.

You might mean this(this). Either doesn't help because you'd still be unable to differentiate between input ranges and forward ranges. Much of the point of save() is to mark a syntactic difference between input ranges and forward ranges. Input ranges still need this(this) and opAssign - those just have different semantics.

>>> Basically, it appears to me that save either a) doesn't compile or b) is the equivalent of assignment.  It doesn't seem to add much to the functionality.
>>
>> No, for class ranges save() is nontrivial (a sort of clone). I suspect even struct ranges for certain file-oriented stuff would do nontrivial work (e.g. open the file anew and fseek() on the current position etc.)
> 
> Classes should not be ranges.  And I hope that any algorithm that uses savable ranges would not be used on a file range.  For example, your consume function:
> 
> bool consume(R1, R2)(ref R1 r1, R2 r2)
>         if (isForwardRange!R1 && isInputRange!R2)
> {
>     auto r = r1.save();  // open an extra file, and seek the file to the given point *just in case*?!
>     while (!r2.empty && !r.empty && r.front == r2.front) {
>         r.popFront();
>         r2.popFront();
>     }
>     if (r2.empty) {
>         r1 = r;  // note that this unaliases r1 from any other copies (not save()'d copies) that were made of it.
>                  // also note that you now have opened more handles to the same file.  Calling this many times could
>                  // consume quite a few handles.
>         return true;
>     }
>     return false;
> }
> 
> Here is a good exercise that would help clarify what we need: determine all the types of ranges you can think of that would need a special save function.

A range that defines save() is a forward range. save() creates an independent range from its source. The file etc. example was hypothetical but realistic.

>>> This is all except for classes, which I think have no business being ranges in the first place.
>>
>> Well then how would the container hierarchy work? It does need range interfaces. How does dcollections deal with that?
> 
> A container is not a range.  A range of a container should not be a class.  For dcollections, I was planning on ranges being a struct with a begin and end cursor defining the part of the container referenced by the range.  Such a range can always be copied and modifying the copy through the range functions should not modify the original range.

Struct ranges won't work with a container hierarchy. If you define a container hierarchy (classes etc.) you'll also need a range hierarchy. Otherwise defining the inheritance relationship is impossible. Consider:

abstract class Container(E) { // most general container
    @property bool empty();
    bool add(E element);
    E removeAny();
    InputRange!E opSlice();
}

That's what I'd expect any container worth its salt to implement: (a) test for empty; (b) add an element to the container; (c) remove some element from the container; (d) get a range that spans the entire container. (Probably removeAll() and a range-positioned remove() would make sense, too.)

My point is, how do you inherit this guy? Well by taking advantage of covariant return types:

abstract class Array(E) : Container!E {
    @property bool empty();
    bool add(E element);
    E removeAny();
    RandomAccessRange!E opSlice();
    ... more stuff ...
}

Ergo, RandomAccessRange!E must inherit InputRange!E for covariance to kick in. The resulting setup is quite interesting: you either know you work with an Array and therefore you get a random-access range, or you just use the generic Container and get an input range.

> I see a range as being useful for iteration or algorithms, but not for general use.  A great example is AAs.  Would you say that an AA *is* a range or should *provide* a range?  If it is a range, does that mean you remove elements as you call popFront?  Does that make any sense?  If it doesn't, then what happens if you add elements through another alias to that AA?

An AA provides several ranges - among which byKey and byValue.


Andrei

January 03, 2010

Re: output ranges: by ref or by value?

Posted by Steven Schveighoffer
in reply to Andrei Alexandrescu

Steven Schveighoffer

Posted in reply to Andrei Alexandrescu

On Sat, 02 Jan 2010 19:51:41 -0500, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> wrote:

> Steven Schveighoffer wrote:
>>  The function already exists -- opAssign.  My point is that save would just be the same as opAssign in most cases where you'd want to implement save.  Cases where you'd want to implement save differently than opAssign are cases where you wouldn't want to use it in a generic fashion.
>
> You might mean this(this).

Yes, that's what I meant :)  I haven't gotten used to struct constructors.

> Either doesn't help because you'd still be unable to differentiate between input ranges and forward ranges. Much of the point of save() is to mark a syntactic difference between input ranges and forward ranges. Input ranges still need this(this) and opAssign - those just have different semantics.

That was the point of the enum.  It would be a synthetic difference, but IMO so is save.  If it turns out that the only usable times you implement save all look like:

auto save() { return *this; }

then save gives you nothing.  It's kind of a proof by induction (I think).

You say that algorithm A requires the ability to save the state of a range.  So I define a function save on a range which cannot be saved by a simple assignment (i.e. a file).  I use A on that range, and the results are not what I expect or kill performance or consume unneeded resources, I'd rather not use algorithm A on that range, negating the need to define a save function in the first place.

I think that's what we're going to end up with.

> A range that defines save() is a forward range. save() creates an independent range from its source. The file etc. example was hypothetical but realistic.

I meant what actual types of ranges, not categories, I don't know how to say this better...  i.e. the file range is one type, what are others?  What I'm looking for is ranges that would otherwise be input ranges unless you define save on them.  For example, a file range without save is an input range, cannot be a forward range.  But if you define save, it is a forward range.  An array is an example of a range that can be a forward range without the save function.

My theory is, given this list of ranges, if you pair them with an algorithm that requires save capability, you wouldn't want to use that algorithm on it anyways (kinda like the consume example).

> Struct ranges won't work with a container hierarchy. If you define a container hierarchy (classes etc.) you'll also need a range hierarchy. Otherwise defining the inheritance relationship is impossible. Consider:
>
> abstract class Container(E) { // most general container
>      @property bool empty();
>      bool add(E element);
>      E removeAny();
>      InputRange!E opSlice();
> }
>
> That's what I'd expect any container worth its salt to implement: (a) test for empty; (b) add an element to the container; (c) remove some element from the container; (d) get a range that spans the entire container. (Probably removeAll() and a range-positioned remove() would make sense, too.)

The range interface is compile-time only, there is no need to define it in the container interface.  opSlice does not need to be part of the interface, even if it is defined on all the container types.

opApply makes for a much better interface-friendly iteration anyways.

BTW, the primitives in dcollections are:

clear(); // clear all elements
remove(V v); // remove an element
contains(V v); // returns whether an element is contained in the colleciton
length(); // the length of the collection
dup(); // duplicate the collection
opApply(int delegate(ref V v) dg); // iterate the collection
opApply(int delegate(ref bool doPurge, ref V v) dg); // purge the collection

That means it covers only empty in your list of must-have functions (via length() == 0).  Add is not a primitive because the Map collections shouldn't assign some random key to the new element.  removeAny is defined only on sets and multisets, but I'm not sure that it couldn't be moved to Collection, in fact, I'll probably do that.

Note that it's missing begin and end which are defined on every single container type (i.e. the equivalent of the all-elements range).  This is because those primitives return a struct that is different for every container type.

It also surpasses opSlice via opApply, since all an input range can do anyways is iterate.  In fact, opApply is more powerful because you can change elements and remove elements (via purging).  Plus it's more efficient than a range-via-interface.

>> I see a range as being useful for iteration or algorithms, but not for general use.  A great example is AAs.  Would you say that an AA *is* a range or should *provide* a range?  If it is a range, does that mean you remove elements as you call popFront?  Does that make any sense?  If it doesn't, then what happens if you add elements through another alias to that AA?
>
> An AA provides several ranges - among which byKey and byValue.

I misunderstood your statement "[a container hierarchy] does need range interfaces."  I thought you meant containers themselves need to implement the range interface, I see now that isn't the case, so my bad.

-Steve

January 03, 2010

Re: output ranges: by ref or by value?

Posted by Michel Fortin
in reply to Steven Schveighoffer

Michel Fortin

Posted in reply to Steven Schveighoffer

On 2010-01-02 21:46:57 -0500, "Steven Schveighoffer" <schveiguy@yahoo.com> said:

> That was the point of the enum.  It would be a synthetic difference, but  IMO so is save.  If it turns out that the only usable times you implement  save all look like:
> 
> auto save() { return *this; }
> 
> then save gives you nothing.  It's kind of a proof by induction (I think).
> 
> You say that algorithm A requires the ability to save the state of a  range.  So I define a function save on a range which cannot be saved by a  simple assignment (i.e. a file).  I use A on that range, and the results  are not what I expect or kill performance or consume unneeded resources,  I'd rather not use algorithm A on that range, negating the need to define  a save function in the first place.
> 
> I think that's what we're going to end up with.

This is making me think...

First, opening files silently whenever an algorithm feels the need to save its state is just madness. What if the operating system decide to pop a security window when opening a restricted file? What if the file has been deleted or replaced in its directory while your program is still hanging on the old inode?

One might want to save the position in a file in order to be able to return there later, but when you return there the last thing you actually want to do is to open the file a second time: what you might realistically want is to set the position to what it was before, calling fseek. So perhaps it would be more useful to have both save() to save the current state (the position in a file), and restore() which would restore the saved state (returning to the saved position in a file).

For ranges with reference semantics -- please call them streams! -- save() and restore() would work just as fpos and fseek for files, but they might also not be available like in a TCP stream or in a communication channel between threads.

For ranges such as array slices, save and restore would just copy the range in one or the other direction. The only reason to have them is so they can be used as streams.

-- 
Michel Fortin
michel.fortin@michelf.com
http://michelf.com/

January 03, 2010

Re: output ranges: by ref or by value?

Posted by Andrei Alexandrescu
in reply to Steven Schveighoffer

Andrei Alexandrescu

Posted in reply to Steven Schveighoffer

Steven Schveighoffer wrote:
> My theory is, given this list of ranges, if you pair them with an algorithm that requires save capability, you wouldn't want to use that algorithm on it anyways (kinda like the consume example).

Why gratuitously limit the design? You're asking to replace this:

R save() { return this; }

with:

enum thisIsAForwardRange = true;

Is there a reason? The former leaves in flexibility. The latter doesn't, for no good reason.

>> Struct ranges won't work with a container hierarchy. If you define a container hierarchy (classes etc.) you'll also need a range hierarchy. Otherwise defining the inheritance relationship is impossible. Consider:
>>
>> abstract class Container(E) { // most general container
>>      @property bool empty();
>>      bool add(E element);
>>      E removeAny();
>>      InputRange!E opSlice();
>> }
>>
>> That's what I'd expect any container worth its salt to implement: (a) test for empty; (b) add an element to the container; (c) remove some element from the container; (d) get a range that spans the entire container. (Probably removeAll() and a range-positioned remove() would make sense, too.)
> 
> The range interface is compile-time only, there is no need to define it in the container interface.  opSlice does not need to be part of the interface, even if it is defined on all the container types.
> 
> opApply makes for a much better interface-friendly iteration anyways.

I want container users to be able to save iteration state and also to open up containers to std.algorithm and other goodies, so I'm shying away from opApply.

> BTW, the primitives in dcollections are:
> 
> clear(); // clear all elements
> remove(V v); // remove an element

Search and remove? That's an odd primitive. Why wouldn't you offer an interface for iteration that allows an algorithm for search, and a primitive for positioned removal?

> contains(V v); // returns whether an element is contained in the colleciton

I don't think this belongs to primitives. It's O(n) for many containers and again it's a generic algorithm, not a member.

> length(); // the length of the collection

That's not a primitive either. std.algorithm.walkLength is. For me, all sorts of red flags and alarm buzzers go off when primitives are guaranteed that can't be implemented efficiently but by a subset of containers. You can't discount complexity as a implementation detail.

> dup(); // duplicate the collection
> opApply(int delegate(ref V v) dg); // iterate the collection
> opApply(int delegate(ref bool doPurge, ref V v) dg); // purge the collection
> 
> That means it covers only empty in your list of must-have functions (via length() == 0).

How do you implement length() for a singly-linked list? Is empty() going to take O(n)?

> Add is not a primitive because the Map collections shouldn't assign some random key to the new element.  removeAny is defined only on sets and multisets, but I'm not sure that it couldn't be moved to Collection, in fact, I'll probably do that.

add is a primitive that takes Tuple!(K, V) as the element type.

> Note that it's missing begin and end which are defined on every single container type (i.e. the equivalent of the all-elements range).  This is because those primitives return a struct that is different for every container type.

So you can't write container-independent iteration code unless you use opApply, in which case composition becomes tenuous.

> It also surpasses opSlice via opApply, since all an input range can do anyways is iterate.  In fact, opApply is more powerful because you can change elements and remove elements (via purging).  Plus it's more efficient than a range-via-interface.

An input range is a subset of other (more powerful) ranges. It's also much more flexible. I agree that calling range primitives via interfaces can become an efficiency liability.

>>> I see a range as being useful for iteration or algorithms, but not for general use.  A great example is AAs.  Would you say that an AA *is* a range or should *provide* a range?  If it is a range, does that mean you remove elements as you call popFront?  Does that make any sense?  If it doesn't, then what happens if you add elements through another alias to that AA?
>>
>> An AA provides several ranges - among which byKey and byValue.
> 
> I misunderstood your statement "[a container hierarchy] does need range interfaces."  I thought you meant containers themselves need to implement the range interface, I see now that isn't the case, so my bad.

Yah, they'd offer it as a result of opSlice(). Covariant return types will ensure there's no virtual call when you know what container you operate on.

Above all: the primitive set for a container must be a small set of functions that (a) can be implemented by all containers within reasonable efficiency bounds, and (b) are container-specific, not generic. IMHO any container design that defines a search(Element) as a primitive is misguided. Searching is not a container primitive - it's an algorithm. Only more specialized containers (e.g. trees, hashes etc.) can afford to define search(Element) as a primitive. Linear search is a generic algorithm that works the same for everyone. It does not belong as a method of any container.

Andrei

January 03, 2010

Re: output ranges: by ref or by value?

Posted by Andrei Alexandrescu
in reply to Michel Fortin

Andrei Alexandrescu

Posted in reply to Michel Fortin

Michel Fortin wrote:
> On 2010-01-02 21:46:57 -0500, "Steven Schveighoffer" <schveiguy@yahoo.com> said:
> 
>> That was the point of the enum.  It would be a synthetic difference, but  IMO so is save.  If it turns out that the only usable times you implement  save all look like:
>>
>> auto save() { return *this; }
>>
>> then save gives you nothing.  It's kind of a proof by induction (I think).
>>
>> You say that algorithm A requires the ability to save the state of a  range.  So I define a function save on a range which cannot be saved by a  simple assignment (i.e. a file).  I use A on that range, and the results  are not what I expect or kill performance or consume unneeded resources,  I'd rather not use algorithm A on that range, negating the need to define  a save function in the first place.
>>
>> I think that's what we're going to end up with.
> 
> This is making me think...
> 
> First, opening files silently whenever an algorithm feels the need to save its state is just madness. What if the operating system decide to pop a security window when opening a restricted file? What if the file has been deleted or replaced in its directory while your program is still hanging on the old inode?
> 
> One might want to save the position in a file in order to be able to return there later, but when you return there the last thing you actually want to do is to open the file a second time: what you might realistically want is to set the position to what it was before, calling fseek. So perhaps it would be more useful to have both save() to save the current state (the position in a file), and restore() which would restore the saved state (returning to the saved position in a file).

These are implementation matters. save() could lazily save information, duplicate the file handle (cheap on many OSs), etc.

> For ranges with reference semantics -- please call them streams! -- save() and restore() would work just as fpos and fseek for files, but they might also not be available like in a TCP stream or in a communication channel between threads.
> 
> For ranges such as array slices, save and restore would just copy the range in one or the other direction. The only reason to have them is so they can be used as streams.

This I don't understand. Array slices' save is:

T[] save(T[] array) { return array; }


Andrei

January 03, 2010

Re: output ranges: by ref or by value?

Posted by Michel Fortin
in reply to Andrei Alexandrescu

Michel Fortin

Posted in reply to Andrei Alexandrescu

On 2010-01-03 00:51:46 -0500, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> said:

>> First, opening files silently whenever an algorithm feels the need to save its state is just madness. What if the operating system decide to pop a security window when opening a restricted file? What if the file has been deleted or replaced in its directory while your program is still hanging on the old inode?
>> 
>> One might want to save the position in a file in order to be able to return there later, but when you return there the last thing you actually want to do is to open the file a second time: what you might realistically want is to set the position to what it was before, calling fseek. So perhaps it would be more useful to have both save() to save the current state (the position in a file), and restore() which would restore the saved state (returning to the saved position in a file).
> 
> These are implementation matters. save() could lazily save information, duplicate the file handle (cheap on many OSs), etc.




>> For ranges with reference semantics -- please call them streams! -- save() and restore() would work just as fpos and fseek for files, but they might also not be available like in a TCP stream or in a communication channel between threads.
>> 
>> For ranges such as array slices, save and restore would just copy the range in one or the other direction. The only reason to have them is so they can be used as streams.
> 
> This I don't understand. Array slices' save is:
> 
> T[] save(T[] array) { return array; }

The idea is that you would use save and restore like this:

	auto state = range.save();
	// ...
	range.restore(state);

For array slices, save doesn't really have to change, just define restore:

	T[] save(T[] array) { return array; }
	void restore(ref T[] array, T[] state) { array = state; }

But with file handles:

	size_t save(FILE handle) { return fpos(handle); }
	void restore(FILE handle, size_t state) { fseek(handle, state); }

And for a range interface to be used in a class hierarchy:

	interface Range {
		Variant save();
		void restore(Variant state);
	}

This way you don't have to create a new range every time you need to start from a previously saved state, you can reuse the existing one which is much more efficient.

Also, the saved state can be anything, of any type, and doesn't need to be a range. So when implementing a non-trivial range you don't need to implement lazy initialization logic and all sort of complicated stuff just in case someone might save the state.

Note that with save defined like this you can't really duplicate the range in the general case. I'm not sure if this is good or bad. How many algorithms really require you to duplicate a range?

-- 
Michel Fortin
michel.fortin@michelf.com
http://michelf.com/

Top | Forum index | About this forum

Copyright © 1999-2021 by the D Language Foundation