Jump to page: 1 2 3
Thread overview
opNext: Simplified ranges
Nov 05, 2022
Tomer@Weka
Nov 05, 2022
rikki cattermole
Nov 05, 2022
rikki cattermole
Nov 05, 2022
Paul Backus
Nov 05, 2022
Dukc
Nov 06, 2022
Walter Bright
Nov 06, 2022
Tomer@Weka
Nov 06, 2022
Imperatorn
Nov 06, 2022
Per Nordlöw
Nov 06, 2022
Per Nordlöw
Nov 06, 2022
Sebastiaan Koppe
Nov 06, 2022
Per Nordlöw
Nov 06, 2022
Per Nordlöw
Nov 07, 2022
Tejas
Nov 07, 2022
Sebastiaan Koppe
Nov 07, 2022
Timon Gehr
Nov 10, 2022
victoroak
November 05, 2022

I find the empty/front/popNext protocol very cumbersome to implement for most cases (at least in the Weka codebase), which causes people to just prefer opApply (even though it has its own set of issues).

Today I even ran into a problem that cannot be solved using the existing protocol: a consuming range (think a file stream) that should not consume the element if the foreach has been broken. Using opApply, this is easy (the delegate returns nonzero), but using popNext it's next to impossible (or at least really really cumbersome). I don't want to go into all the details, so let's just jump to the suggested protocol.

bool opNext(out T elem) {...}

opNext returns true if it "produced" an element, false if it reached the end. That way

foreach (elem; range) {
    ...
}

Simply gets lowered to

T elem;
while (range.opNext(elem)) {
   ...
}

Of course this protocol can live side-by-side with the existing protocol, just take precedence in the lowering rules. A concrete example:

struct LinkedList {
    int value;
    LinkedList* next;

    struct Range {
        LinkedList* cursor;

        bool opNext(out int val) nothrow @nogc {
            if (cursor is null) {
                return false;
            }
            val = cursor.value;
            return true;
        }
    }

    auto members() {return Range(&this);}
}

foreach (val; myList.members) {
    ...
}

Notes:

  • save() can work for the new protocol as well, it just needs to duplicate the iterator's state
  • Maybe it would be wise to complement this with opPrev for bidirectional ranges, but I have to say I've rarely seen any use of them
  • I think chaining with map/filter/etc. is even easier to implement in the new protocol
November 05, 2022

On Saturday, 5 November 2022 at 06:01:54 UTC, Tomer@Weka wrote:

>
bool opNext(out T elem) {...}

opNext returns true if it "produced" an element, false if it reached the end.

Out of curiosity: what do you think are the pros and cons of this versus, say, something like:

Option!(T) opNext() { ... }

Where Option is an implementation of the option type, and hence encapsulates the question of whether an element was produced or not? (IIRC this has been another suggestion for a simplified/redesigned range concept.)

An Option-based design has the advantage of not requiring a persistent front, which may fit better with true input ranges (e.g. reading from a stream). OTOH it may make things more clunky for the case where there is persistent underlying data that a front method could refer to.

November 06, 2022
```d
mixin template NextToRange(ElementType) {
	ElementType _value;
	bool _haveValue;

	@property {
		bool empty() {
			return _haveValue;
		}

		ElementType front() {
			assert(_haveValue);
			return _value;
		}
	}

	void popFront() {
		_haveValue = nextValue(_value);
	}
}
```

You don't need to break a large percentage of the standard library to use this ;)
November 06, 2022
On 06/11/2022 12:34 AM, rikki cattermole wrote:
>              return _haveValue;

return !_haveValue;

Oops, that should be reversed.

Thanks Bruce Carneal!
November 05, 2022

On 11/5/22 2:01 AM, Tomer@Weka wrote:

>

Today I even ran into a problem that cannot be solved using the existing protocol: a consuming range (think a file stream) that should not consume the element if the foreach has been broken.

Not sure what this means. breaking a foreach does not consume the next element.

Now, foreach does make a copy of the range. Which means that the copy is going to consume the current, while the original will not get it (and the copy is gone once the loop is over).

This can be solved by a wrapper:

struct RefOf(T)
{
  private T *_val;
  ref T _get() { return *_val; }
  alias _get this;
}

auto refOf(T)(return ref T val) { return RefOf!T(&val); }

foreach(elem; range.refOf) // now doesn't make a copy.

Could also be solved with syntax changes to foreach. Maybe foreach(elem; ref range)?

-Steve

November 05, 2022

On Saturday, 5 November 2022 at 15:17:42 UTC, Steven Schveighoffer wrote:

>

This can be solved by a wrapper:

struct RefOf(T)
{
  private T *_val;
  ref T _get() { return *_val; }
  alias _get this;
}

auto refOf(T)(return ref T val) { return RefOf!T(&val); }

foreach(elem; range.refOf) // now doesn't make a copy.

You'd think, but if you actually try this, it turns out it doesn't work: the compiler is "smart" enough to see through the alias this, and makes a copy of range.refOf._get (the actual range object) instead of range.refOf (the wrapper).

So, for example, this usage code prints [1, 2, 3], demonstrating that the range is not consumed by the foreach loop:

void main()
{
    import std.range, std.stdio;

    auto r = only(1, 2, 3);
    foreach (e; refOf(r)) {}
    writeln(r);
}

In the -vcg-ast output, you can see the call to _get inserted by the compiler:

void main()
{
	import std.range;
	import std.stdio;
	OnlyResult!(int, int, int) r = only(1, 2, 3);
	{
		OnlyResult!(int, int, int) __r47 = refOf(r)._get().opSlice();
		for (; !__r47.empty(); __r47.popFront())
		{
			int e = __r47.front();
		}
	}
	writeln(r);
	return 0;
}
November 05, 2022

On Saturday, 5 November 2022 at 18:08:39 UTC, Paul Backus wrote:

>

On Saturday, 5 November 2022 at 15:17:42 UTC, Steven Schveighoffer wrote:

>

This can be solved by a wrapper:

You'd think, but if you actually try this, it turns out it doesn't work: the compiler is "smart" enough to see through the alias this, and makes a copy of range.refOf._get (the actual range object) instead of range.refOf (the wrapper).

This works:

void main()
{   import std.range : inputRangeObject, refRange;
    import std.stdio;

    auto testPiece = [1,2,3,4,5];
    foreach(el; (&testPiece).refRange.inputRangeObject)
    {   if(el == 3) break;
    }
    writeln(testPiece); // [3, 4, 5]
}

It's far from satisfying though. Ugly to begin with, and worse, incompatible with any function attributes.

November 05, 2022

On 11/5/22 2:08 PM, Paul Backus wrote:

>

On Saturday, 5 November 2022 at 15:17:42 UTC, Steven Schveighoffer wrote:

>

This can be solved by a wrapper:

struct RefOf(T)
{
  private T *_val;
  ref T _get() { return *_val; }
  alias _get this;
}

auto refOf(T)(return ref T val) { return RefOf!T(&val); }

foreach(elem; range.refOf) // now doesn't make a copy.

You'd think, but if you actually try this, it turns out it doesn't work: the compiler is "smart" enough to see through the alias this, and makes a copy of range.refOf._get (the actual range object) instead of range.refOf (the wrapper).

It's not "smart", just picky.

>

So, for example, this usage code prints [1, 2, 3], demonstrating that the range is not consumed by the foreach loop:

void main()
{
     import std.range, std.stdio;

     auto r = only(1, 2, 3);
     foreach (e; refOf(r)) {}
     writeln(r);
}

In the -vcg-ast output, you can see the call to _get inserted by the compiler:

void main()
{
     import std.range;
     import std.stdio;
     OnlyResult!(int, int, int) r = only(1, 2, 3);
     {
         OnlyResult!(int, int, int) __r47 = refOf(r)._get().opSlice();
         for (; !__r47.empty(); __r47.popFront())
         {
             int e = __r47.front();
         }
     }
     writeln(r);
     return 0;
}

What is happening is the compiler is not using duck typing exactly, and it also has some pretty puzzling logic.

  1. If it has the ability to slice, it will always do this at least once. See the bug reported long ago: https://issues.dlang.org/show_bug.cgi?id=14619
  2. The definition of a range is that it contains a front member. This cannot be via alias this.

So this works:

struct RefOf(T)
{
  private T *_val;
  auto front() { return _get.front; }
  RefOf opSlice() { return this; }
  ref T _get() { return *_val; }
  alias _get this;
}

Take away from that what you will.

(also I forgot there actually is something called std.range.refRange which is supposed to do this, but apparently still broken).

-Steve

November 05, 2022
We didn't use the "next" protocol for ranges because one cannot test for existence of the next element without consuming it.

November 06, 2022
On Sunday, 6 November 2022 at 01:48:04 UTC, Walter Bright wrote:
> We didn't use the "next" protocol for ranges because one cannot test for existence of the next element without consuming it.

But that's exactly the issue - I have to produce the element in the range's ctor, which essentially requires a "non-popping popNext" to determine if it's empty and assign the front

-tomer
« First   ‹ Prev
1 2 3