Repost: make foreach(i, a; range) "just work"
February 20, 2014
I am posting this again because I didn't get any feedback on my idea, which may be because it was TL;DR, or because people think it's a dumb idea and were politely ignoring it :p

My original thought was that things like this should "just work"..

auto range = input.byLine();
while(!range.empty)
{
  range.popFront();
  foreach (i, line; range.take(4))  //Error: cannot infer argument types
  {
    ..etc..
  }
  range.popFront();
}

The reason it fails was best expressed by Steven:

> This is only available using opApply style iteration. Using range iteration does not give you this ability.
> It's not a permanent limitation per se, but there is no plan at the moment to add multiple parameters to range iteration.
>
> One thing that IS a limitation though: we cannot overload on return values. So the obvious idea of overloading front to return tuples of various types, would not be feasible. opApply can do that because the delegate is a parameter.
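
To make Steven's point concrete, here is a minimal sketch of opApply-style iteration with two overloads. The `Lines` type and the `numbered` helper are hypothetical names invented for illustration; the foreach variable types are spelled out so that each loop selects its overload unambiguously.

```d
import std.conv : to;
import std.stdio : writeln;

// Hypothetical container, for illustration only.
struct Lines
{
    string[] data;

    // Selected by: foreach (string line; lines)
    int opApply(scope int delegate(string) dg)
    {
        foreach (line; data)
            if (auto r = dg(line))
                return r;   // non-zero: the loop body did break/return
        return 0;
    }

    // Selected by: foreach (size_t i, string line; lines)
    int opApply(scope int delegate(size_t, string) dg)
    {
        foreach (i, line; data)
            if (auto r = dg(i, line))
                return r;
        return 0;
    }
}

// Builds "index:line" strings using the two-parameter overload.
string[] numbered(Lines lines)
{
    string[] result;
    foreach (size_t i, string line; lines)
        result ~= i.to!string ~ ":" ~ line;
    return result;
}

void main()
{
    auto lines = Lines(["a", "b", "c"]);
    foreach (string line; lines)
        writeln(line);
    writeln(numbered(lines));
}
```

The overloads differ only in the delegate's parameter list, which is the part a range's single `front` cannot vary.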

And Jakob pointed me to this proposed solution:
[1] https://github.com/D-Programming-Language/phobos/pull/1866

Which is a great idea, but, I still feel that this should "just work" as I have written it.  I think this is what people will intuitively expect to work, and having it fail and them scrabble around looking for enumerate is sub-optimal.  I think we can solve it without negatively impacting future plans like what bearophile wants, which is built-in tuples (allowing foreach over AA's etc).

So, the solution I propose for my original problem above is:

Currently the 'i' value in a foreach on an array is understood to be an index into the array.  But, ranges are not always indexable.  So, for us to make this work for all ranges we would have to agree to change the meaning of 'i' from being an "index" to being a "counter, which may also be an index".  This counter would be an index if the source object was indexable.  Another way to look at it is to realise that the counter is always an index into the result set itself, and could be used as such if you were to store the result set in an indexable object.

To implement this, foreach simply needs to keep a counter and increment it after each call to the foreach body - the same way (I assume) it does for arrays and objects with opApply.
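
A rough sketch of that counter, written out by hand. `numberLines` is a hypothetical helper, and the explicit `for` loop is only an assumption about what the proposed foreach (i, line; range) could expand to, not a description of how any compiler implements it.

```d
import std.conv : to;
import std.range;
import std.stdio : writeln;

// Hypothetical helper: iterates any input range while keeping a counter,
// and returns "count: element" strings so the pairing is visible.
string[] numberLines(R)(R r)
{
    string[] result;
    size_t i = 0;                        // the counter foreach would keep
    for (; !r.empty; r.popFront(), ++i)  // incremented after each pass of the body
    {
        auto line = r.front;
        result ~= i.to!string ~ ": " ~ line.to!string;  // stand-in loop body
    }
    return result;
}

void main()
{
    auto input = ["header", "one", "two", "three", "four", "five"];
    input.popFront();                    // skip a line, as in the loop above
    foreach (s; numberLines(input.take(4)))
        writeln(s);
}
```

The counter always starts at zero for the range being iterated, so for `take(4)` it counts positions within the taken slice, not within the underlying input.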

Interestingly, if this had been in place earlier, then the byKey() and byValue() members of AA's would not have been necessary.  Instead keys/values could simply have changed into indexable ranges, and no code breakage would have occurred (AFAICS).

So, to address bearophile's desire for built-in tuples and iteration over AA's, and how this change might affect those plans: it seems to me we could do foreach over AAs/tuples in one of 2 ways, or even a combination of both:

Scheme 1) for AA's/tuples the value given to the foreach body is a voldemort (unnamed) type with a public property member for each component of the AA/tuple.  In the case of AA's this would then be "key" and "value", for tuples it might be a, b, .., z, aa, bb, .. and so on.

foreach(x; AA) {}        // Use x.key and x.value
foreach(i, x; AA) {}     // Use i, x.key and x.value
foreach(int i, x; AA) {} // Use i, x.key and x.value

Extra/better: For non-AA tuples we could allow the members to be named using some sort of syntax, i.e.

foreach(i, (x.bob, x.fred); AA) {} // Use i, x.bob and x.fred
or
foreach(i, x { int bob; string fred }; AA) {} // Use i, x.bob and x.fred
or
foreach(i, new x { int bob; string fred }; AA) {} // Use i, x.bob and x.fred


Let's look at bearophile's examples re-written for scheme #1

foreach (v; AA) {}
foreach (x; AA) { .. use x.value .. } // better? worse?

foreach (k, v; AA) {}
foreach (x; AA) { .. use x.key, x.value .. } // better? worse?

foreach (k; AA.byKeys) {}
same // no voldemort reqd

foreach (i, k; AA.byKeys.enumerate) {}
foreach (i, k; AA.byKeys) {}   // better. note, no voldemort reqd

foreach (i, v; AA.byValues.enumerate) {}
foreach (i, v; AA.byValues) {} // better. note, no voldemort reqd

foreach (k, v; AA.byPairs) {}
foreach (x; AA) { .. use x.key, x.value .. } // better

foreach (i, k, v; AA.byPairs.enumerate) {}
foreach (i, x; AA) { .. use i and x.key, x.value .. } // better

This is my preferred approach TBH, you might call it foreach on "packed" tuples.


Scheme 2) the tuple is unpacked into separate variables given in the foreach.

When no types are given, components are assigned to variables from the right: the rightmost variable gets the last AA/tuple component, each untyped variable to its left gets the preceding component, and an (N+1)th variable, if present, gets the index/count.

foreach (v; AA) {}        // v is "value" (last tuple component)
foreach (k, v; AA) {}     // k is "key"   (2nd to last tuple component), ...
foreach (i, k, v; AA) {}  // i is "index/count" because AA only has 2 tuple components.

So, if you have N tuple components and you supply N+1 variables you get the index/count.  Supplying any more would be an error.

However, if a type is given and the type can be unambiguously matched to a single tuple component then do so.

double[string] AA;
foreach (string k; AA) {} // k is "key"

.. in which case, any additional unmatched untyped or 'int' variable is assigned the index/count. e.g.

foreach (i, double v; AA) {} // i is index, v is "value"
foreach (i, string k; AA) {} // i is index, k is "key"

If more than one typed variable is given, match each unambiguously.

foreach (string k, double v; AA) {} // k is "key", v is "value"

.. and likewise any unmatched untyped or 'int' variable is assigned index/count. e.g.

foreach (i, string k, double v; AA) {}     // i is index, k is "key", v is "value"
foreach (int i, string k, double v; AA) {} // i is index, k is "key", v is "value"

Any ambiguous situation would result in an error requiring the use of one of .keys/values (in the case of an AA), or to specify types (where possible), or to specify them all in the tuple order, e.g.

Using a worst case of..
int[int] AA;

// Error: cannot infer binding of k; could be 'key' or 'value'
foreach (int k; AA) {}

// Solve using .keys/byKey()/values/byValue()
foreach (k; AA.byKeys) {}            // k is "key"
foreach (i, k; AA.byKeys) {}         // i is index/count, k is "key"

// Solve using tuple order
foreach (k, v; AA) {}    // k is "key", v is "value"
foreach (i, k, v; AA) {} // i is index/count, k is "key", v is "value"

So, to bearophile's examples re-written for scheme #2

foreach (v; AA) {}
same

foreach (k, v; AA) {}
same

foreach (k; AA.byKeys) {}
same

foreach (i, k; AA.byKeys.enumerate) {}
foreach (i, k; AA.byKeys) {} // better

foreach (i, v; AA.byValues.enumerate) {}
foreach (i, v; AA.byValues) {} // better

foreach (k, v; AA.byPairs) {}
foreach (k, v; AA) {} // better

foreach (i, k, v; AA.byPairs.enumerate) {}
foreach (i, k, v; AA) {} // better

This scheme is more complicated than #1 so it's not my preferred solution.  But, it does name the components better than #1.


Scheme 3) Combination.  If we were to combine these ideas we would need to prefer one scheme by default.  If we select scheme #1, for example, then any time a foreach is specified we default to assuming #1 where possible, and #2 otherwise.

In which case there are clear cases where scheme #2 is required/used:
 - when more than 2 variables are given, or
 - a specific type is given for the final variable.

Note that #1 only has 3 possible forms (for AA or tuples):

double[string] AA;
foreach (v; AA) {}        // #1 v is voldemort(key, value)/tuple
foreach (i, v; AA) {}     // #1 i is index/count, v is voldemort(key, value)/tuple
foreach (int i, v; AA) {} // #1 i is index/count, v is voldemort(key, value)/tuple

#2 would take effect in these forms..

foreach (i, double v; AA) {} // #2 (type given)  i is index/count, v is value
foreach (i, string k; AA) {} // #2 (type given)  i is index/count, k is key
foreach (i, k, v; AA) {}     // #2 (3 variables) i is index/count, k is key, v is value

Bearophile's examples re-written for scheme #3

With..
(*) any                 // voldemort scheme #1 works with any AA/tuple types even worst case "all one type"
(A) int[int] AA;        // worst case
(B) double[string] AA;

foreach (v; AA) {}
(*) foreach (x; AA)       { .. use x.value .. } // better?
(A) foreach (i, k, v; AA) { }  // worse?
(B) foreach (double v; AA) { } // worse?

foreach (k, v; AA) {}
(*) foreach (x; AA)       { .. use x.key, x.value .. } // better?
(A) foreach (k, double v; AA) { }  // force scheme #2, worse?
(B) foreach (double v; AA) { }     // force scheme #2, worse?

foreach (k; AA.byKeys) {}
same // note, no voldemort reqd

foreach (i, k; AA.byKeys.enumerate) {}
(*) foreach (i, k; AA.byKeys) {}   // better, note; no voldemort reqd

foreach (i, v; AA.byValues.enumerate) {}
(*) foreach (i, v; AA.byValues) {} // better, note; no voldemort reqd

foreach (k, v; AA.byPairs) {}
(*) foreach (x; AA) { .. use x.key, x.value .. } // better

foreach (i, k, v; AA.byPairs.enumerate) {}
(*) foreach (i, x; AA) { .. use i and x.key, x.value .. } // better
(A) foreach (i, k, v; AA) { } // better
(B) foreach (i, k, v; AA) { } // better

This is a trade off between #1 and #2 but on balance I feel it is worse than #1 so is not my preferred solution.

***********

Thoughts?

Regan

-- 
Using Opera's revolutionary email client: http://www.opera.com/mail/
February 20, 2014
IMO, any change needs to be both backwards-compatible (i.e., it should not only "just work", as you phrased, but existing code should "just keep working"), and forward-compatible, so as not to obstruct any potential improvements of tuple handling.

Scheme #1 fails backwards-compatibility.

Scheme #2 doesn't, but I feel the matching rules if a type is specified are too complicated. Instead, I would suggest just to always assign the variables from the right, i.e. you cannot skip variables, and if you specify a type, it must match the type of the value in this position.

If you really want to skip a tuple member (in order to avoid an expensive copy), a special token "_" or "$" could be introduced, as has also been suggested in one of the tuple unpacking/pattern matching DIPs, IIRC.

As for unpacking a tuple value (or key), an additional pair of parentheses can be used, so such a feature would still be possible in the future:

foreach(i, k, (a,b,c); ...)

(Scheme #3 seems just too complicated for my taste. It's important to be intuitively understandable and predictable.)
February 20, 2014
I don't think this is a good idea. Say you have a class with range methods and add opApply later. Only the opApply delegate receives a type other than size_t for the first argument. Now the foreach line infers a different type for i and code in the outside world will break.

More importantly, this gets in the way of behaviour which may be desirable later, foreach being able to unpack tuples from ranges. I would like it if it were possible to return Tuple!(A, B) from front() and write foreach(a, b; range) to iterate through those things, unpacking the values with an alias, so this...

foreach(a, b; range) {
}

... could rewrite to roughly this. (There may be a better way.)

foreach(_someInternalName; range) {
    alias a = _someInternalName[0];
    alias b = _someInternalName[1];
}

Then to get a counter with a range, we could follow Python's example and use an enumerate function, which would take an existing range and wrap it with a counter, so T maps to Tuple!(size_t, T).

foreach(index, value; enumerate(range)) {
}

Which is rewritten to roughly this.

foreach(_bla; enumerate(range)) {
    alias index = _bla[0];
    alias value = _bla[1];
}
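
For reference, a wrapper of this kind can be sketched in a few lines. `Counted` and `counted` are hypothetical names chosen for illustration and are not the API proposed in the Phobos pull request. The loops rely on foreach expanding a Tuple returned by front(), and index the inner tuple by hand where nested unpacking would be needed.

```d
import std.range;
import std.stdio : writeln;
import std.typecons : tuple;

// Hypothetical wrapper: pairs each element of `source` with a running count,
// so an element of type T becomes Tuple!(size_t, T).
struct Counted(R) if (isInputRange!R)
{
    R source;
    size_t index;

    @property bool empty() { return source.empty; }
    @property auto front() { return tuple(index, source.front); }
    void popFront() { source.popFront(); ++index; }
}

auto counted(R)(R source)
{
    return Counted!R(source, 0);
}

void main()
{
    foreach (index, value; counted(["x", "y", "z"]))
        writeln(index, ": ", value);

    // No nested unpacking syntax: `inner` is itself a tuple.
    foreach (outer, inner; counted(counted(["x", "y"])))
        writeln(outer, " ", inner[0], " ", inner[1]);
}
```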

If we follow Python's example again, we could also support this nested unpacking.

// Now written with UFCS instead.
foreach(index, (index_again, value); range.enumerate.enumerate) {
}

Which can rewrite to roughly this.

foreach(_bla; range.enumerate.enumerate) {
    alias index = _bla[0];
    alias index_again = _bla[1][0];
    alias value = _bla[1][1];
}

I got off on kind of a tangent there, but there you go.
February 20, 2014
I probably didn't do a good enough job of reading your post, because it looks like you shared some similar ideas. I'm sorry if my post reads a little like that.
February 20, 2014
On Thu, 20 Feb 2014 12:56:27 -0000, Marc Schütz <schuetzm@gmx.net> wrote:

> IMO, any change needs to be both backwards-compatible (i.e., it should not only "just work", as you phrased, but existing code should "just keep working"), and forward-compatible, so as not to obstruct any potential improvements of tuple handling.
>
> Scheme #1 fails backwards-compatibility.

Fair enough.  We can always pack things manually using something like the enumerate() method mentioned in the link.

> Scheme #2 doesn't, but I feel the matching rules if a type is specified are too complicated. Instead, I would suggest just to always assign the variables from the right, i.e. you cannot skip variables, and if you specify a type, it must match the type of the value in this position.
>
> If you really want to skip a tuple member (in order to avoid an expensive copy), a special token "_" or "$" could be introduced, as has also been suggested in one of the tuple unpacking/pattern matching DIPs, IIRC.
>
> As for unpacking a tuple value (or key), an additional pair of parentheses can be used, so such a feature would still be possible in the future:
>
> foreach(i, k, (a,b,c); ...)

Cool.

> (Scheme #3 seems just too complicated for my taste. It's important to be intuitively understandable and predictable.)

Fair enough.

Any comments on the initial solution to my original problem?

R

February 20, 2014
On Thu, 20 Feb 2014 13:04:55 -0000, w0rp <devw0rp@gmail.com> wrote:

> I don't think this is a good idea.

Which part?  The initial solution to my initial problem, or one of the 3 schemes mentioned?

> Say you have a class with range methods and add opApply later. Only the opApply delegate receives a type other than size_t for the first argument. Now the foreach line infers a different type for i and code in the outside world will break.

Only if the compiler prefers opApply to range methods, does it?

And, if it prefers range methods then any existing class with opApply (with more than 1 variable) that gets range methods will break also, because foreach(<more than 1 variable>; range) does not (currently) work.

> More importantly, this gets in the way of behaviour which may be desirable later, foreach being able to unpack tuples from ranges.

<snip> :)

R

February 20, 2014
On Thu, 20 Feb 2014 13:04:55 +0000, w0rp wrote:
> 
> More importantly, this gets in the way of behaviour which may be
> desirable later, foreach being able to unpack tuples from ranges.
> I would like it if it were possible to return Tuple!(A, B) from front() and
> write foreach(a, b; range) to iterate through those things, unpacking
> the values with an alias, so this...
> 
> foreach(a, b; range) {
> }
> 
> ... could rewrite to roughly this. (There may be a better way.)
> 
> foreach(_someInternalName; range) {
>      alias a = _someInternalName[0];
>      alias b = _someInternalName[1];
> }

Tuple unpacking already works in foreach.  This code has compiled since at least 2.063.2:

import std.stdio;
import std.range;
void main(string[] args)
{
    auto tuples = ["a", "b", "c"].zip(iota(0, 3));

    // unpack the string into `s`, the integer into `i`
    foreach (s, i; tuples)
        writeln(s, ", ", i);
}
February 20, 2014
On Thu, 20 Feb 2014 11:07:32 -0500, Regan Heath <regan@netmail.co.nz> wrote:

> Only if the compiler prefers opApply to range methods, does it?

It should. If it doesn't, that is a bug.

The sole purpose of opApply is to interact with foreach. If it is masked out, then there is no point for having opApply.

-Steve
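
A small sketch for checking this preference on a given compiler. `Both` and `collect` are hypothetical names; the type exposes both protocols, and the two paths deliberately produce different values so that whichever one foreach picks is visible in the result.

```d
import std.stdio : writeln;

// Hypothetical type offering both range primitives and opApply.
struct Both
{
    int[] data;

    // Range primitives: would yield the stored values unchanged.
    @property bool empty() const { return data.length == 0; }
    @property int front() const { return data[0]; }
    void popFront() { data = data[1 .. $]; }

    // opApply: yields each stored value negated.
    int opApply(scope int delegate(int) dg)
    {
        foreach (v; data)
            if (auto r = dg(-v))
                return r;
        return 0;
    }
}

// Gathers whatever foreach hands to the loop body.
int[] collect(Both b)
{
    int[] seen;
    foreach (int v; b)
        seen ~= v;
    return seen;
}

void main()
{
    // Negated values mean opApply was chosen; unchanged values mean
    // the range primitives were used instead.
    writeln(collect(Both([1, 2, 3])));
}
```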
February 20, 2014
On Thursday, 20 February 2014 at 16:30:42 UTC, Justin Whear wrote:

> Tuple unpacking already works in foreach.  This code has compiled since
> at least 2.063.2:
>
> import std.stdio;
> import std.range;
> void main(string[] args)
> {
>     auto tuples = ["a", "b", "c"].zip(iota(0, 3));
>
>     // unpack the string into `s`, the integer into `i`
>     foreach (s, i; tuples)
>         writeln(s, ", ", i);
> }

I did not know that. When did that happen? It didn't appear in
any changelogs and it works when I tried it in 2.064 on my
machine too. I suppose the next step after that would be to
support nested unpacking, but that would require a change in
syntax so it would be much more complicated.
February 20, 2014
On Thu, 20 Feb 2014 19:34:17 +0000, w0rp wrote:

> On Thursday, 20 February 2014 at 16:30:42 UTC, Justin Whear wrote:
> 
>> Tuple unpacking already works in foreach.  This code has compiled since at least 2.063.2:
>>
>> import std.stdio;
>> import std.range;
>> void main(string[] args)
>> {
>>     auto tuples = ["a", "b", "c"].zip(iota(0, 3));
>>
>>     // unpack the string into `s`, the integer into `i`
>>     foreach (s, i; tuples)
>>         writeln(s, ", ", i);
>> }
> 
> I did not know that. When did that happen? It didn't appear in any changelogs and it works when I tried it in 2.064 on my machine too. I suppose the next step after that would be to support nested unpacking, but that would require a change in syntax so it would be much more complicated.

January 24th, 2012: http://forum.dlang.org/thread/mailman.756.1327362275.16222.digitalmars-d@puremagic.com#post-mailman.757.1327365651.16222.digitalmars-d:40puremagic.com

That said, it is not documented, see this bug: http://d.puremagic.com/issues/show_bug.cgi?id=7361