Jump to page: 1 2
Thread overview
foreach range with index
Jun 13, 2017
Luís Marques
Jun 14, 2017
Luís Marques
Jun 14, 2017
Luís Marques
Jun 14, 2017
Luís Marques
Jun 14, 2017
9il
Jun 14, 2017
Luís Marques
Jun 14, 2017
9il
Jun 14, 2017
Luís Marques
Jun 14, 2017
Luís Marques
Jun 14, 2017
Ali Çehreli
Jun 14, 2017
bachmeier
Jun 14, 2017
Ali Çehreli
Jun 14, 2017
Luís Marques
June 13, 2017
In the documentation for foreach with range concepts (not x..y ranges) (<http://dlang.org/spec/statement.html#foreach-with-ranges>) I did not find anything about foreach supporting iteration indices/tuple unpacking for ranges, such as `foreach(i, elm; range)`. It is "documented" in an example for std.range.enumerate though [1].

I found a bug report asking for that to be documented [2] and one saying that it should be removed [3].

One of the things that I like in D is the plasticity of the code. In the situation that brought me to this issue, I started with a plain slice which I iterated with foreach(i, foo; foos). Then `foos` became a range, and the foreach became foreach(i, foo; foos.enumerate); Thus, I didn't have to restructure my foreach iteration, moving the index variable outside the foreach statement, and manually incrementing the index, making this change less disruptive.

Some thoughts about this:

- It doesn't strike me as very realistic to remove support for foreach(i, e; range.enumerate) now, as probably a lot of people are relying on this. I find it very useful, especially as currently there doesn't seem to be a good alternative.

- The plasticity was not perfect in my example; ideally I wouldn't have to change the foreach statement at all (i.e., add .enumerate). Consider: my range had random access, why couldn't foreach itself generate the index, as it does for slices, instead of relying on .enumerate and tuple unpacking? Maybe it would even make sense for input ranges, I'm not sure.

- Whatever changes you decide for tuple unpacking and foreach, please try to keep in mind this plasticity concern. I'm trying some techniques to make data-oriented designs more flexible, and this feature seems like an important tool for the approach I'm following/devising.

- Since foreach with ranges and unpacking (i.e., indices when using .enumerate) seems here to stay for now, please document it. The code I was writing was supporting material for an article. But I'm afraid to advocate an approach which relies on behavior which is undocumented and which people are arguing should be removed. Not exactly confidence inspiring...

Thank you for your hard work.

Cheers,
Luís

[1] <http://dlang.org/phobos/std_range.html#enumerate>
[2] <https://issues.dlang.org/show_bug.cgi?id=7361>
[3] <https://issues.dlang.org/show_bug.cgi?id=9817>
June 13, 2017
On 6/13/17 1:15 PM, Luís Marques wrote:
> In the documentation for foreach with range concepts (not x..y ranges)
> (<http://dlang.org/spec/statement.html#foreach-with-ranges>) I did not
> find anything about foreach supporting iteration indices/tuple unpacking
> for ranges, such as `foreach(i, elm; range)`. It is "documented" in an
> example for std.range.enumerate though [1].
>
> I found a bug report asking for that to be documented [2] and one saying
> that it should be removed [3].
>
> One of the things that I like in D is the plasticity of the code. In the
> situation that brought me to this issue, I started with a plain slice
> which I iterated with foreach(i, foo; foos). Then `foos` became a range,
> and the foreach became foreach(i, foo; foos.enumerate); Thus, I didn't
> have to restructure my foreach iteration, moving the index variable
> outside the foreach statement, and manually incrementing the index,
> making this change less disruptive.
>
> Some thoughts about this:
>
> - It doesn't strike me as very realistic to remove support for
> foreach(i, e; range.enumerate) now, as probably a lot of people are
> relying on this. I find it very useful, especially as currently there
> doesn't seem to be a good alternative.

We aren't going to as far as I can tell. I closed that bug report.

> - The plasticity was not perfect in my example; ideally I wouldn't have
> to change the foreach statement at all (i.e., add ..enumerate).
> Consider: my range had random access, why couldn't foreach itself
> generate the index, as it does for slices, instead of relying on
> .enumerate and tuple unpacking? Maybe it would even make sense for input
> ranges, I'm not sure.

The issue here is the difference between foreach on arrays (which has nothing to do with ranges), and foreach on a range. In the former, the array index is wholly a concept of the foreach loop itself. The compiler inserts and maintains the index in the loop, as the array structure has no storage or maintenance of an index.

In the latter, the index is maintained by the range, and so stored there. AND it's copied and stored as a temporary for your loop via the unpacking of the tuple.

I feel a change in the way foreach and ranges interact may be needed to make things more efficient and less awkward. opApply has some really nice features, one of them being that you can declare loop-specific data such as an index, and another being that foreach'ing an opApply structure with different types/numbers of parameters works just fine.

But I think leaving the definition of the index up to the range itself is paramount -- I don't want every range to be able to have a size_t index, as that's not always what you want, and it conflicts with other items. What we may need is a smarter way to get from the type parameters at the front of the foreach to an iterable type.

> - Since foreach with ranges and unpacking (i.e., indices when using
> .enumerate) seems here to stay for now, please document it. The code I
> was writing was supporting material for an article. But I'm afraid to
> advocate an approach which relies on behavior which is undocumented and
> which people are arguing should be removed. Not exactly confidence
> inspiring...

It should be documented, 100%.

-Steve
June 14, 2017
On Tuesday, 13 June 2017 at 21:44:43 UTC, Steven Schveighoffer wrote:
> But I think leaving the definition of the index up to the range itself is paramount -- I don't want every range to be able to have a size_t index, as that's not always what you want, and it conflicts with other items. What we may need is a smarter way to get from the type parameters at the front of the foreach to an iterable type.

That's why I mentioned random access ranges, as in that case we can sidestep this discussion; since foreach understands input ranges, why can't foreach understand random access ones, and iterate them as if they were slices, including maintaining a foreach-generated index? That is, it would have nothing to do with tuple unpacking. Plus, it would work transparently in the cases you replace `slice` with `slice.algorithm`, where algorithm maintains random-access. Instead of having to add .enumerate in each place the slice (now a range) is iterated, it would just work.
June 14, 2017
On Wednesday, 14 June 2017 at 02:42:46 UTC, Luís Marques wrote:
> On Tuesday, 13 June 2017 at 21:44:43 UTC, Steven Schveighoffer wrote:
>> But I think leaving the definition of the index up to the range itself is paramount -- I don't want every range to be able to have a size_t index, as that's not always what you want, and it conflicts with other items. What we may need is a smarter way to get from the type parameters at the front of the foreach to an iterable type.
>
> That's why I mentioned random access ranges, as in that case we can sidestep this discussion; since foreach understands input ranges, why can't foreach understand random access ones, and iterate them as if they were slices, including maintaining a foreach-generated index? That is, it would have nothing to do with tuple unpacking. Plus, it would work transparently in the cases you replace `slice` with `slice.algorithm`, where algorithm maintains random-access. Instead of having to add .enumerate in each place the slice (now a range) is iterated, it would just work.

Hmm, I suppose that even if we have all of the good stuff of length, opIndex, opSlice, etc., we can't assume the indices are [0..length], right? But there should be some way for, say, map to preserve that information from the slice, the same way random access is preserved.
June 14, 2017
On Wednesday, 14 June 2017 at 02:42:46 UTC, Luís Marques wrote:
> On Tuesday, 13 June 2017 at 21:44:43 UTC, Steven Schveighoffer wrote:
>> But I think leaving the definition of the index up to the range itself is paramount -- I don't want every range to be able to have a size_t index, as that's not always what you want, and it conflicts with other items. What we may need is a smarter way to get from the type parameters at the front of the foreach to an iterable type.
>
> That's why I mentioned random access ranges, as in that case we can sidestep this discussion; since foreach understands input ranges, why can't foreach understand random access ones, and iterate them as if they were slices, including maintaining a foreach-generated index? That is, it would have nothing to do with tuple unpacking. Plus, it would work transparently in the cases you replace `slice` with `slice.algorithm`, where algorithm maintains random-access. Instead of having to add .enumerate in each place the slice (now a range) is iterated, it would just work.

Random access iteration is more expensive then front/popFront in general case.
For arrays it is not true. But for many other kinds of ranges it is. For example ranges that do not have contiguous memory representation (strided vectors, flattened matrixes and etc)
June 14, 2017
On Wednesday, 14 June 2017 at 02:48:34 UTC, Luís Marques wrote:
> Hmm, I suppose that even if we have all of the good stuff of length, opIndex, opSlice, etc., we can't assume the indices are [0..length], right? But there should be some way for, say, map to preserve that information from the slice, the same way random access is preserved.

A quick look into std.algorithm.iteration suggests that this is not actually true. For instance:

    static if (hasLength!R)
    {
        foreach (k; 0 .. data.length / 16)

So I guess we are assuming that if we have length then indexing is always [0..length]. These are the kinds of details that I never know because there is no spec for ranges :(
June 14, 2017
On Wednesday, 14 June 2017 at 02:54:30 UTC, 9il wrote:
> Random access iteration is more expensive then front/popFront in general case.
> For arrays it is not true. But for many other kinds of ranges it is. For example ranges that do not have contiguous memory representation (strided vectors, flattened matrixes and etc)

I'm not sure I understand what you are saying. I was suggesting that foreach(range) would iterate range[0...length] in that order. Wouldn't that be an equivalent access pattern to front/popFront?
June 14, 2017
On Wednesday, 14 June 2017 at 03:06:26 UTC, Luís Marques wrote:
> On Wednesday, 14 June 2017 at 02:54:30 UTC, 9il wrote:
>> Random access iteration is more expensive then front/popFront in general case.
>> For arrays it is not true. But for many other kinds of ranges it is. For example ranges that do not have contiguous memory representation (strided vectors, flattened matrixes and etc)
>
> I'm not sure I understand what you are saying. I was suggesting that foreach(range) would iterate range[0...length] in that order. Wouldn't that be an equivalent access pattern to front/popFront?

-------A
foreach(__i; 0 .. r.length)
{
    auto e = r[__i];
}


-------B
for (auto __r = range; !__r.empty; __r.popFront())
{
    auto e = __r.front;
    ...
}

They are equivalent, but "A" is slower then "B" in general case.
So, B is preferred.

June 14, 2017
On 6/13/17 10:42 PM, Luís Marques wrote:
> On Tuesday, 13 June 2017 at 21:44:43 UTC, Steven Schveighoffer wrote:
>> But I think leaving the definition of the index up to the range itself
>> is paramount -- I don't want every range to be able to have a size_t
>> index, as that's not always what you want, and it conflicts with other
>> items. What we may need is a smarter way to get from the type
>> parameters at the front of the foreach to an iterable type.
>
> That's why I mentioned random access ranges, as in that case we can
> sidestep this discussion; since foreach understands input ranges, why
> can't foreach understand random access ones, and iterate them as if they
> were slices, including maintaining a foreach-generated index? That is,
> it would have nothing to do with tuple unpacking. Plus, it would work
> transparently in the cases you replace `slice` with `slice.algorithm`,
> where algorithm maintains random-access. Instead of having to add
> .enumerate in each place the slice (now a range) is iterated, it would
> just work.

Yes, foreach could be augmented to do that. However, this is essentially a way to implement enumerate without the .enumerate, and nothing more. It's not a lot of added benefit IMO.

What I'd rather see is a better mechanism for ranges (or things that provide ranges) to hook the foreach syntax to allow better control and power from ranges without using traditional opApply.

For example, if we had:

foreach(size_t a, b; x)

translates to:

for(auto _context = x.opApply!(size_t, void); !context.empty; _context.popFront)
{
   auto (a, b) = _context.front; // the way it works now with tuples, don't think there's a real syntax for this.
   ...
}

-Steve
June 14, 2017
On Wednesday, 14 June 2017 at 14:35:32 UTC, Steven Schveighoffer wrote:
> What I'd rather see is a better mechanism for ranges (or things that provide ranges) to hook the foreach syntax to allow better control and power from ranges without using traditional opApply.
>
> For example, if we had:
>
> foreach(size_t a, b; x)
>
> translates to:
>
> for(auto _context = x.opApply!(size_t, void); !context.empty; _context.popFront)
> {
>    auto (a, b) = _context.front; // the way it works now with tuples, don't think there's a real syntax for this.
>    ...
> }

An approach like that seems fine to me. In any case, why do it in exclusion of a foreach-generated index? Imagine that I define an input range. Can we assume that range.front is *conceptually* equivalent to range[0], even if the range doesn't define random access?  Because if we can, then, if the range doesn't define the new opApply or whatever, why not allow foreach to generate the index?

> Yes, foreach could be augmented to do that. However, this is essentially a way to implement enumerate without the .enumerate, and nothing more. It's not a lot of added benefit IMO.

Consider this case. You create a prototype where foo is a slice. Then you start improving the code and foo becomes a range. In your code you have several places that use foo with foreach, and several places where it is used directly. If you define the new foo as newRangeFoo.enumerate, then the several foreach work automatically, but the other uses don't because they have to be changed to foo.value; if you do it the other way around, the direct uses work automatically but the foreach all have to be changed from foo to foo.enumerate. Thus you have lost a lot of the code plasticity. If the foreach statement generated the indices then it would work automatically.

This could also be solved by the range defining your new version of opApply and foreach understanding that. Assuming that we can wrap old ranges (e.g. from a library you don't want to change) with that opApply, I would be fine with that solution as long as this is something that actually goes forward. I'm just afraid this is something that will linger because it's a more difficult design decision, while having foreach support index generation seems like a more self contained solution that could be agreed upon and implemented quickly.
« First   ‹ Prev
1 2