View mode: basic / threaded / horizontal-split · Log in · Help
May 16, 2012
Should range foreach be iterating over an implicit copy?
A small debate has broken out over on D.learn ( 
http://forum.dlang.org/thread/jovicg$jta$1@digitalmars.com#post-jovicg:24jta:241:40digitalmars.com ) 
that I thought I should move here.

Basically, the issue is this: Currently, when you have a struct-based range, 
foreach will iterate over a *copy* of it:

   Range r;
   auto save = r;
   foreach(e; r)
       assert(r == save);
   assert(r == save);

One side of the argument is that this behavior is correct and expected since 
structs are value types, and iterating shouldn't consume the range.

My argument is this:

First of all, foreach is conceptually a flow-control statement, not a 
function. So I'd intuitively expect:

   Range r;
   foreach(e; r) {...}

To work like this:

   Range r;
   for(; !r.empty; r.popFront)
   {
       auto e = r.front;
       ...
   }

Additionally, iterating a range normally *does* consume it. And that's 
expected, as it should be: Imagine, for example, an input range that read 
from stdin. Leaving that range unconsumed would make no sense at all. 
Actually, any input range can be expected to *not* leave an uniterated copy 
behind: if it *could* have an uniterated copy left behind, it would be a 
forward range, not an input range.

It actually seems wrong to use the current foreach over an input range: 
Sometimes it might work by pure luck (as in the original example), but you 
can expect that for some input ranges it would fail miserably, because an 
input range is *not* a forward range and by definition cannot reliably save 
a previous state. So iterating over an input range would leave the original 
input range in an undefined state (actually, that could even be true for 
certain forward ranges if foreach doesn't implicitly start out by calling 
"r.save" and the range doesn't expect to be saved by copying).

Suppose foreach *didn't* iterate over a copy: If you wanted to still have an 
un-iterated version (of an actual forward range, or an input range that you 
knew to be safe), then that's trivial:

   Foo a;
   b = a.save(); // Or "b = a;" if you really know what you're doing.
   foreach(elem; a) {}
   assert(a.empty);
   assert(!b.empty);

Note, however, that there is no such simple way to go the other way around: 
to have the current "foreach over a copy" behavior and have access to the 
actual iterated range. Maybe if we had "foreach(e; ref range)", but AFAIK we 
don't.

One counter-argument that was raised is that TDPL has an example on page 381 
that indicates foreach iterates over an implicit copy. I don't have a copy 
handy ATM, so I can't look at it myself, but I'd love to see Andrei weigh in 
on this: I'm curious if this example in TDPL made that copy deliberately, or 
if the full implications of that copy were just an oversight.
May 16, 2012
Re: Should range foreach be iterating over an implicit copy?
On 5/16/12 4:37 PM, Nick Sabalausky wrote:
> One counter-argument that was raised is that TDPL has an example on page 381
> that indicates foreach iterates over an implicit copy. I don't have a copy
> handy ATM, so I can't look at it myself, but I'd love to see Andrei weigh in
> on this: I'm curious if this example in TDPL made that copy deliberately, or
> if the full implications of that copy were just an oversight.

It is deliberate and the intent is that millions of programmers used to 
foreach from other languages don't scream "where is my range???"

Andrei
May 16, 2012
Re: Should range foreach be iterating over an implicit copy?
On Wednesday, 16 May 2012 at 21:37:54 UTC, Nick Sabalausky wrote:
> A small debate has broken out over on D.learn (
> http://forum.dlang.org/thread/jovicg$jta$1@digitalmars.com#post-jovicg:24jta:241:40digitalmars.com 
> )
> that I thought I should move here.
>
> Basically, the issue is this: Currently, when you have a 
> struct-based range, foreach will iterate over a *copy* of it:

http://forum.dlang.org/post/dskrijudairkopeoyqkc@forum.dlang.org

I've replied in the other one, but I'll repost it here, perhaps 
andrei can correct me.

---

On Wednesday, 16 May 2012 at 20:50:55 UTC, Nick Sabalausky wrote:

> I was initially open to the idea I may have just misunderstood 
> something, but I'm becoming more and more convinced that the 
> current "foreach over an implicit copy" behavior is indeed a 
> bad idea.
>
> I'd love to see Andrei weigh in on this. I'm curious if the 
> example you pointed out in TDPL made the copy deliberately, or 
> if the effects of that copy were just an oversight.

 I remember going over hard and long over that section. I think 
it's deliberate.

 Range can refer to many things, and I'm assuming array is one of 
them. So... If the array is consumed when using foreach, that 
seems dumb right? (An array in the end is just a small struct 
afterall...)

---
int[] x = [1,2,3];
foreach(i; x) {
 //stuff
}

 assert(x.length == 0, "Should have been consumed!");
---

 Structs being value types are by value and so are copied, not 
referenced. Although referencing a copy does seem a bit silly...

int[] x = [1,2,3];
foreach(ref i; x) {
 i = 10;
}

assert(x == [10,10,10], "But i changed them to 10!!");

--

 That means that foreach(ref; ref) if it's considered (And I 
think it makes sense to) would work for structs in these few 
cases. (Course you'd still consume dynamic arrays that way)

 In std.stream are classes, so they are consumed. I'm probably 
just rambling...
May 17, 2012
Re: Should range foreach be iterating over an implicit copy?
On Wednesday, 16 May 2012 at 21:37:54 UTC, Nick Sabalausky wrote:
> A small debate has broken out over on D.learn (
> http://forum.dlang.org/thread/jovicg$jta$1@digitalmars.com#post-jovicg:24jta:241:40digitalmars.com 
> )
> that I thought I should move here.

I am in agreement, a value type rage is confusing to use

http://d.puremagic.com/issues/show_bug.cgi?id=7067

I don't agree with Andrei that we want to stop the "hey, where'd 
my range go" because we spend so much effort to get that to be 
expected behavior (and it is). The hybrid array really doesn't 
help, but I wouldn't want it changed.

If .save is not called on the range then the expectation is that 
it will be consumed. It is clear that calling a function with a 
value type would likely mean it won't be consumed, but that is 
the odd case, the expectation is that the range would be consumed.

And this leads to what is suggested in the referenced bug, a 
wrapper to prevent an implicit forward range.
May 17, 2012
Re: Should range foreach be iterating over an implicit copy?
On Wednesday, 16 May 2012 at 21:40:39 UTC, Andrei Alexandrescu 
wrote:
> On 5/16/12 4:37 PM, Nick Sabalausky wrote:
>> One counter-argument that was raised is that TDPL has an 
>> example on page 381
>> that indicates foreach iterates over an implicit copy. I don't 
>> have a copy
>> handy ATM, so I can't look at it myself, but I'd love to see 
>> Andrei weigh in
>> on this: I'm curious if this example in TDPL made that copy 
>> deliberately, or
>> if the full implications of that copy were just an oversight.
>
> It is deliberate and the intent is that millions of programmers 
> used to foreach from other languages don't scream "where is my 
> range???"
>
> Andrei

I think this is the correct behavior as well. Otherwise, we'd 
have to take extra care when writing generic code to ensure it 
doesn't consume the range but doesn't attempt to call .save on 
non-range types.
May 17, 2012
Re: Should range foreach be iterating over an implicit copy?
I'm a little discouraged that my concern about "input ranges can't save 
their state, and yet that's exactly what happens implicitly" hasn't been 
addressed. I was hoping to at least get a "That's not really a problem and 
here's why..."

However, that said...

"Andrei Alexandrescu" wrote...
> It is deliberate and the intent is that millions of programmers used to 
> foreach from other languages don't scream "where is my range???"

"Era Scarecrow" wrote...
> Range can refer to many things, and I'm assuming array is one of them. 
> So... If the array is consumed when using foreach, that seems dumb right? 
> (An array in the end is just a small struct afterall...)

I admit, those are fair points.

With those in mind, I've given it some more thought, and here are my 
current...umm...thoughts:

- A range is not a collection, it's a *view* of a collection (or of 
something else). This is a necessary distinction because ranges and 
collections work in fundamentally different ways: A range is, *by necessity* 
consumed as it's iterated - that's what popFront *does*, that's 
fundamentally how ranges work. For many collections, OTOH, it makes *no* 
sense to consume the collection while iterating it. Thus, range != 
collection, it's a view of a collection.

- Ranges are D's answer to iterators. I don't think people used to iterators 
from other languages would expect their iterator to magically reset itself 
after being used. So I see no reason why they would expect a range (ie, an 
iterator-pair-with-benefits) to behave differently than that.

- D's arrays conflate the ideas of "collection" and "range", hence the odd 
edge case Era pointed out, and hence the "need" for foreach to automatically 
make a copy. But in my (not super-extensive) experience creating ranges, 
I've found that to be a problematic pattern (due to the fundamental 
distinction between a range and a collection), and learned to prefer making 
my iterable things *return* a range, rather than actually *be* ranges.

- Admittedly, it would be annoying if foreach had to be used like this on 
all collections: "foreach(e; myArray.rangeOf)", so I guess it would make 
sense for a range to automatically be obtained when foreach-ing over a 
collection. However, I'm still not 100% sold on the current method of doing 
that (making foreach automatically copy struct-based ranges), partly because 
of the questionable implications it has for input ranges, and partly because 
(for struct-ranges) it leaves no way to access the range that's actually 
being iterated.

- At the very least, perhaps input ranges just shouldn't be allowed to be 
structs? After all, structs are intended to be copied around, but input 
ranges, by definition, can't have their current state copied.
May 17, 2012
Re: Should range foreach be iterating over an implicit copy?
Nick Sabalausky:

> One side of the argument is that this behavior is correct and 
> expected since structs are value types, and iterating
> shouldn't consume the range.

In that D.learn thread I've shown that iterating on a fixed-size
array, that is a value, doesn't perform a copy of the array.


> Imagine, for example, an input range that read from stdin.
> Leaving that range unconsumed would make no sense at all.
> Actually, any input range can be expected to *not* leave an 
> uniterated copy behind: if it *could* have an uniterated
> copy left behind, it would be a forward range, not an input
> range.

Seems right.

Bye,
bearophile
May 17, 2012
Re: Should range foreach be iterating over an implicit copy?
On Thu, 17 May 2012 01:48:06 -0400, Nick Sabalausky  
<SeeWebsiteToContactMe@semitwist.com> wrote:

> I'm a little discouraged that my concern about "input ranges can't save
> their state, and yet that's exactly what happens implicitly" hasn't been
> addressed. I was hoping to at least get a "That's not really a problem  
> and
> here's why..."

input ranges can save their state if they are also forward ranges.

>
> However, that said...
>
> "Andrei Alexandrescu" wrote...
>> It is deliberate and the intent is that millions of programmers used to
>> foreach from other languages don't scream "where is my range???"
>
> "Era Scarecrow" wrote...
>> Range can refer to many things, and I'm assuming array is one of them.
>> So... If the array is consumed when using foreach, that seems dumb  
>> right?
>> (An array in the end is just a small struct afterall...)
>
> I admit, those are fair points.
>
> With those in mind, I've given it some more thought, and here are my
> current...umm...thoughts:

[snip]

This is not a problem with ranges, this is an issue with value types  
versus reference types.  I believe someone has created a byRef struct that  
wraps a range and iterates it byRef (maybe dsimcha?)

Almost all ranges except true input-only ranges (i.e. input ranges that  
are only input ranges) should be value types.

If we *didn't* do this, you would need heap data for each range.

-Steve
May 17, 2012
Re: Should range foreach be iterating over an implicit copy?
On 17.05.2012 18:18, Steven Schveighoffer wrote:
> On Thu, 17 May 2012 01:48:06 -0400, Nick Sabalausky
> <SeeWebsiteToContactMe@semitwist.com> wrote:
>
>> I'm a little discouraged that my concern about "input ranges can't save
>> their state, and yet that's exactly what happens implicitly" hasn't been
>> addressed. I was hoping to at least get a "That's not really a problem
>> and
>> here's why..."
>
> input ranges can save their state if they are also forward ranges.
>

s/ save state / copy themselves
Fixed :)
That's why I tried to come up with BacktrackRange or some such. An input 
range that you can "shake" to reset to some saved point, but can't have 
two separate states at _the same time_ (nor full copy, it's like ... file)



-- 
Dmitry Olshansky
May 17, 2012
Re: Should range foreach be iterating over an implicit copy?
On Wed, 16 May 2012 17:37:14 -0400, Nick Sabalausky  
<SeeWebsiteToContactMe@semitwist.com> wrote:

> A small debate has broken out over on D.learn (
> http://forum.dlang.org/thread/jovicg$jta$1@digitalmars.com#post-jovicg:24jta:241:40digitalmars.com  
> )
> that I thought I should move here.
>
> Basically, the issue is this: Currently, when you have a struct-based  
> range,
> foreach will iterate over a *copy* of it:
>
>     Range r;
>     auto save = r;
>     foreach(e; r)
>         assert(r == save);
>     assert(r == save);
>
> One side of the argument is that this behavior is correct and expected  
> since
> structs are value types, and iterating shouldn't consume the range.
>
> My argument is this:
>
> First of all, foreach is conceptually a flow-control statement, not a
> function. So I'd intuitively expect:
>
>     Range r;
>     foreach(e; r) {...}
>
> To work like this:
>
>     Range r;
>     for(; !r.empty; r.popFront)
>     {
>         auto e = r.front;
>         ...
>     }

Hm... proposal:

foreach(e; ref r)
{
}

equates to your desired code.  Would this help?

-Steve
« First   ‹ Prev
1 2
Top | Discussion index | About this forum | D home