July 09, 2012
On 7/9/12 12:04 PM, Mehrdad wrote:
> On Monday, 9 July 2012 at 15:48:35 UTC, Andrei Alexandrescu wrote:
>> On 7/9/12 11:35 AM, Mehrdad wrote:
>>> If something is both an input range and an output range, then sure, it
>>> can have that capability. But being able to write to something is
>>> _orthogonal_ to whether you can read from it.
>>
>> That is the case right now. The point is, with your design you need to
>> add something extra to allow writing to elements of a single-pass
>> range. So your design does not simplify things as much as it might seem.
>>
>> Andrei
>
>> With your design you need to add something extra to allow writing to
>> elements of a single-pass range.
>
> If that's the case, I'd hate to tell you this, but _unless_ you're
> planning on removing the notion of input/output ranges (and perhaps
> adding single-pass/multi-pass), you're doing it wrong. :-)

Given the smart-aleck nature of the comment I'd say s/hate/love/.

> That capability is simply /not needed/ when your /only/ contract is that
> something is an input range.

As I mentioned, "input range" is a misnomer. Think "one-pass range". The range can be written or not, and a multi-pass range (including random-access range) is also a one-pass range.


Andrei
July 09, 2012
On Monday, 9 July 2012 at 18:51:40 UTC, Andrei Alexandrescu wrote:
>> If that's the case, I'd hate to tell you this, but _unless_ you're planning on removing the notion of input/output ranges (and perhaps adding single-pass/multi-pass), you're doing it wrong. :-)
>
> Given the smart-aleck nature of the comment I'd say s/hate/love/.

lol, nice comment. +1

>> That capability is simply /not needed/ when your /only/ contract is that something is an input range.
>
> As I mentioned, "input range" is a misnomer. Think "one-pass range". The range can be written or not, and a multi-pass range (including random-access range) is also a one-pass range.

Sure, we all agree that a multi-pass range is also a one-pass range. I don't have a problem with that. But if you only use the one-pass range aspect, that doesn't mean it has to use the same syntax as the multi-pass range aspect.

The issue is, while you are /telling/ me that "input range" is a misnomer, that doesn't match what Phobos's is telling me. In Phobos, you've placed a very clear difference between "input ranges" and "output ranges", and /both/ of them are single-pass.
It looks like it was deliberately designed that way...  I don't think you made a mistake when separating input and output ranges like that (unless you do?).
So unless you're planning on trashing all that and redesigning the entire thing from scratch (are you?) I don't understand how you can have two orthogonal bases for working with ranges like this. (Pardon the math lingo, but that's really what they are -- input/output vs random-access/multi-pass-sequential-access/single-pass-sequential-access are two different bases for looking at the issue. You can't really mix them up in the same module...)
July 09, 2012
On Monday, 9 July 2012 at 18:33:26 UTC, jerro wrote:
> On Monday, 9 July 2012 at 16:22:05 UTC, Mehrdad wrote:
>> Perhaps if you could show an actual example of what you expect to be able to do, that would make things clearer?
>
> It is useful to be able to write an algorithm that both reads
> and writes range elements. There are plenty of use cases for
> that, but if you really need an example, here's a simple one:
>
> void transform(alias f, R)(R r)
>     if(isInputRange!R && hasAssignableElements!R)
> {
>     for(; !r.empty; r.popFront())
>         r.front = unaryFun!f(r.front);
> }

Wow, thanks a bunch. That makes it a LOT more clearer than explanations. :)

The trouble here is that the use case is valid, but the design of ranges doesn't really match it.

Why? Because the way it currently stands, isInputRange && hasAssignableElements is basically isOutputRange (with also input capability), which doesn't make any sense.

What we really need is to define a new kind of range -- perhaps an "input/output range", if I had to name it: it's forward-only, but it can be mutated and replaced. It'd be a superset of both input and output ranges, and also a subset of bidirectional ranges.

But without formally defining a new kind of range, we're just saying "I want an input range that's also kinda-sorta an output range. Ooh, I also want to be able to mutate it! But I can't seek backwards, so it's not bidi. Hmm, not sure what to call it, so I'll just put constraints for everything I need and hope it makes sense."

It (obviously) /works/, but it doesn't make sense semantically, the way it's defined right now.
July 09, 2012
On 7/9/12 3:03 PM, Mehrdad wrote:
> Sure, we all agree that a multi-pass range is also a one-pass range. I
> don't have a problem with that. But if you only use the one-pass range
> aspect, that doesn't mean it has to use the same syntax as the
> multi-pass range aspect.

But that works against your goal of simplifying things.

> The issue is, while you are /telling/ me that "input range" is a
> misnomer, that doesn't match what Phobos's is telling me. In Phobos,
> you've placed a very clear difference between "input ranges" and "output
> ranges", and /both/ of them are single-pass.

An output range only needs put(). In that sense it's an endless bag in which you get to put stuff without any control over e.g. navigation. A one-pass range may or may not accept assignment to .front. In particular, put() is written to detect and use that. And that's about it.

> It looks like it was deliberately designed that way... I don't think you
> made a mistake when separating input and output ranges like that (unless
> you do?).

I sense a Jedi trick is tried unto me.

> So unless you're planning on trashing all that and redesigning the
> entire thing from scratch (are you?) I don't understand how you can have
> two orthogonal bases for working with ranges like this.

I agree with your assessment that you don't have a good understanding of how D ranges work. Nevertheless, proposing an alternative design would be a great basis for discussion; isolated fragments of designs look attractive in isolation but often fail to offer the whole picture.


Andrei
July 09, 2012
On Monday, July 09, 2012 21:19:46 Mehrdad wrote:
> On Monday, 9 July 2012 at 18:33:26 UTC, jerro wrote:
> > On Monday, 9 July 2012 at 16:22:05 UTC, Mehrdad wrote:
> >> Perhaps if you could show an actual example of what you expect to be able to do, that would make things clearer?
> > 
> > It is useful to be able to write an algorithm that both reads and writes range elements. There are plenty of use cases for that, but if you really need an example, here's a simple one:
> > 
> > void transform(alias f, R)(R r)
> > 
> >     if(isInputRange!R && hasAssignableElements!R)
> > 
> > {
> > 
> >     for(; !r.empty; r.popFront())
> > 
> >         r.front = unaryFun!f(r.front);
> > 
> > }
> 
> Wow, thanks a bunch. That makes it a LOT more clearer than explanations. :)
> 
> The trouble here is that the use case is valid, but the design of ranges doesn't really match it.
> 
> Why? Because the way it currently stands, isInputRange && hasAssignableElements is basically isOutputRange (with also input capability), which doesn't make any sense.

Except that how output ranges are written to and an input range with assignable elements are fundamentally different. Output ranges use put, which _might_ write to each individual element, or it may just append to the output range (it all depends on the implementation of put). You're essentially generating a range when you use an output range. But with an input range with assignable elements, you're specifically setting each element in an existing range rather than generating a new one. You can think of an output range like an output stream, whereas an input range with assignable elements is like an array where you're assigning values to each of its elements.

- Jonathan M Davis
July 09, 2012
On Monday, 9 July 2012 at 19:22:39 UTC, Andrei Alexandrescu wrote:
> But that works against your goal of simplifying things.

If by "simplification" you mean "fewer methods to call", then yes.

If by "simplification" you mean "easier to understand", then no -- not here.
Having a different (but still simple) syntax for a more capable range is far better than the getNext() proposal here.

> I sense a Jedi trick is tried unto me.

lol wasn't intending to, but whatever.


> An output range only needs put(). In that sense it's an endless bag in which you get to put stuff without any control over e.g. navigation. A one-pass range may or may not accept assignment to .front. In particular, put() is written to detect and use that. And that's about it.
> I agree with your assessment that you don't have a good understanding of how D ranges work.

I certainly don't have a good understanding of how they were _intended_ to work, is what I found out in this thread. The docs (or even the names, as you mentioned yourself) don't at all convey what you guys explained here, so if they were designed one way but intended to work another way, then /of course/ I don't have a good understanding of how they were intended to work!


> Nevertheless, proposing an alternative design would be a great basis for discussion; isolated fragments of designs look attractive in isolation but often fail to offer the whole picture.

I agree. My alternative would be to abandon similar 'hasXYZ' stuff (which doesn't convey the picture and looks hacky), and instead formally define what those are, like I/O range. Sounds good/bad?
July 09, 2012
On Monday, 9 July 2012 at 19:30:25 UTC, Jonathan M Davis wrote:
> Except that how output ranges are written to and an input range with assignable elements are fundamentally different. Output ranges use put, which _might_ write to each individual element, or it may just append to the output range (it all depends on the implementation of put). You're essentially generating a range when you use an output range. But with an  input range with assignable elements, you're specifically setting each element in an existing range rather than generating a new one. You can think of an output range like an output stream, whereas an input range with assignable elements is like an array where you're assigning values to each of its elements.

Oh!! So they're _exactly_ emulating C++ here, with insert_iterators trying to mimic regular iterators and all. I _never_ got that impression from the docs.
The impression the docs give is that the only time when you "add" instead of "overwrite" is when it either doesn't make sense to overwrite (e.g. into hash set, at the end of an array, etc.).
They never implied that you you might also be inserting into somewhere where overwriting is possible, so clearly I misunderstood what was intended.


Btw, I just realized:

With that explanation, hasAssignableElements doesn't make sense for some things that it should make sense for.

How do you "assign" to e.g. a range of a BST? It _certainly_ makes sense to do a transform() on it, but assigning to any particular element doesn't make sense. So I think that confirms that what we really want IS an I/O range -- a range that supports delete-modify-add, but not (necessarily) read-modify-write (which is what hasAssignableElements implies).

Makes sense?
July 09, 2012
On Monday, 9 July 2012 at 07:53:41 UTC, David Piepgrass wrote:
> I don't know if this proposal went anywhere since 2010, but it occurs to me that there is a hidden danger here. alloca will allocate a sequence of separate temporaries. If the collection is large, the stack will overflow, and the client might not have a clue what happened.

Amazing. My post unleashed four pages of comments and not one of them responded to my post :O

I think Mehrdad is right that an in/out range should have its own name to distinguish it from an input range, but that doesn't necessarily mean that the same interface can't be used for both.

I imagine a couple of advantages of:

>     T tmp;
>     for(T* front = r.getNext(ref tmp))
>         // do something with front

instead of:

>     for(; !r.empty; r.popFront())
>         // do something with r.front

- If the range uses late-binding, getNext() is faster because you're only calling one function instead of 3. When I program in C#, I am quite irritated enough that IEnumerator requires 2 interface calls to get each item. Late binding, of course, is necessary across DLL boundaries and can help avoid code bloat.
- If an input-only range has to unpack its elements (e.g. bit array => bool, or anything compressed), the range doesn't need to unpack repeatedly every time 'front' is accessed, nor does it need to reserve memory inside itself for a scratch area (you don't want scratch areas in every range if your app keeps track of thousands of ranges; plus, ranges tend to get passed by value, right?).

That said, it may be unreasonable for the compiler to support the necessary escape analysis (impossible in case you're importing .di files)... and maybe the existing empty/popFront/front is too well established to reconsider? (I am not familiar with the status quo).
July 09, 2012
On 7/9/12 3:30 PM, Mehrdad wrote:
> I agree. My alternative would be to abandon similar 'hasXYZ' stuff
> (which doesn't convey the picture and looks hacky), and instead formally
> define what those are, like I/O range. Sounds good/bad?

You may want to just spell it clearly.

So right now we have the notions:

input range (well one-pass range)
forward range
bidirectional range
random-access range
output range

coupled with the capability queries

isInfinite
hasAssignableElements
hasLength
...

The small set of ranges coupled with the capability queries reflect the orthogonal or near-orthogonal nature of such.

As far as I understand the above, you propose the notions:

input range
input range with assignable elements
infinite input range
infinite input range with assignable elements
forward range
forward range with assignable elements
...

That seems pretty onerous, but then I can't derive other meaning from your post.


Andrei

July 09, 2012
On 07/09/2012 09:46 PM, Mehrdad wrote:
> On Monday, 9 July 2012 at 19:30:25 UTC, Jonathan M Davis wrote:
>> Except that how output ranges are written to and an input range with
>> assignable elements are fundamentally different. Output ranges use
>> put, which _might_ write to each individual element, or it may just
>> append to the output range (it all depends on the implementation of
>> put). You're essentially generating a range when you use an output
>> range. But with an input range with assignable elements, you're
>> specifically setting each element in an existing range rather than
>> generating a new one. You can think of an output range like an output
>> stream, whereas an input range with assignable elements is like an
>> array where you're assigning values to each of its elements.
>
> Oh!! So they're _exactly_ emulating C++ here, with insert_iterators
> trying to mimic regular iterators and all. I _never_ got that impression
> from the docs.
> The impression the docs give is that the only time when you "add"
> instead of "overwrite" is when it either doesn't make sense to overwrite
> (e.g. into hash set, at the end of an array, etc.).
> They never implied that you you might also be inserting into somewhere
> where overwriting is possible, so clearly I misunderstood what was
> intended.
>
>
> Btw, I just realized:
>
> With that explanation, hasAssignableElements doesn't make sense for some
> things that it should make sense for.
>
> How do you "assign" to e.g. a range of a BST? It _certainly_ makes sense
> to do a transform() on it, but assigning to any particular element
> doesn't make sense. So I think that confirms that what we really want IS
> an I/O range -- a range that supports delete-modify-add, but not
> (necessarily) read-modify-write (which is what hasAssignableElements
> implies).
>
> Makes sense?

struct BST(K){
    struct Node{ ... }
    Node root;
    auto opSlice(){
        struct Range{
            ...
            @property front(){ return ...; }
            @property front(K k){
                deleteFront();
                root.add(k);
            }
        }
    }
}