Treating a slice as an InputRange
November 15, 2017
Hello,

I'm writing a small parser for a specific binary format. The format is parsed by way of a sequence of functions, each deserializing a small portion of the format into a D type such as int, double, string, etc., operating on a single InputRange. The problem is, if I pass a slice to each of these functions, the slice doesn't get mutated by the popFront function used in the functions, so something like this:

ubyte[] slice;
...
int a = readInt(slice);
double b = readDouble(slice);

Ends up failing, because readInt properly reads an int, but the slice is not mutated "outside" of itself. That means readDouble does not read from where readInt stopped; it reads from the start of the slice instead, and fails because there is an int encoded there and not a double. Both functions' signatures are like this:

T readXXX(Range)(auto ref Range range) if (isInputRange!Range)

So my question is, is there a way to treat a slice strictly as an InputRange, so that it is mutated no matter what? Or is there another way to do what I'm trying to do?

I've worked around it using a "wrapper" InputRange struct, but I feel like there must be another way.

Thanks!
November 15, 2017
On Wednesday, November 15, 2017 20:53:39 unleashy via Digitalmars-d-learn wrote:
> Hello,
>
> I'm writing a small parser for a specific binary format. The format is parsed by way of a sequence of functions, each deserializing a small portion of the format into a D type such as int, double, string, etc., operating on a single InputRange. The problem is, if I pass a slice to each of these functions, the slice doesn't get mutated by the popFront function used in the functions, so something like this:
>
> ubyte[] slice;
> ...
> int a = readInt(slice);
> double b = readDouble(slice);
>
> Ends up failing, because readInt properly reads an int, but the slice is not mutated "outside" of itself. That means readDouble does not read from where readInt stopped; it reads from the start of the slice instead, and fails because there is an int encoded there and not a double. Both functions' signatures are like this:
>
> T readXXX(Range)(auto ref Range range) if (isInputRange!Range)
>
> So my question is, is there a way to treat a slice strictly as an InputRange, so that it is mutated no matter what? Or is there another way to do what I'm trying to do?
>
> I've worked around it using a "wrapper" InputRange struct, but I feel like there must be another way.

Typically, functions that operate on ranges either consume them, return a new range that wraps the one passed in, or return the original range with some number of elements popped off. If you specifically want a function to accept a range and mutate it without returning it, then it should take its argument by ref. Having it take auto ref is actually quite odd, since that means that the behavior can depend on whether an lvalue or rvalue is passed in.
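
As a rough sketch of the by-ref approach: `readInt` below is a hypothetical stand-in for one of the thread's readXXX functions, and the 4-byte big-endian layout is an assumption made only for illustration, not something stated in the thread.

```d
import std.range.primitives : ElementType, front, isInputRange, popFront;

// Hypothetical reader: takes the range by ref, so popFront advances
// the caller's slice rather than a copy of it.
int readInt(Range)(ref Range range)
if (isInputRange!Range && is(ElementType!Range : ubyte))
{
    int result = 0;
    // Assumed layout: 4 bytes, big-endian. No bounds checking is done here.
    foreach (_; 0 .. 4)
    {
        result = (result << 8) | range.front;
        range.popFront();
    }
    return result;
}

void main()
{
    ubyte[] slice = [0, 0, 0, 42, 7];
    int a = readInt(slice);
    assert(a == 42);
    assert(slice == [7]); // the caller's slice now starts after the int
}
```

Because the parameter is plain ref, passing an rvalue such as `slice[0 .. 4]` is rejected at compile time instead of silently mutating a temporary.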

If you can't alter the function's signature, then you could always use std.range.RefRange to wrap it, but it's better to change the function signature if you can.
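
A small sketch of the RefRange route, assuming a by-value function whose signature cannot be changed (`readByte` is a made-up name for illustration). `std.range.refRange` takes a pointer to the range and forwards every operation to it:

```d
import std.range : refRange;
import std.range.primitives : front, popFront;

// Hypothetical function we pretend we cannot modify: it takes its
// range by value, so it only advances its own copy.
ubyte readByte(Range)(Range range)
{
    ubyte b = range.front;
    range.popFront();
    return b;
}

void main()
{
    ubyte[] slice = [1, 2, 3];

    // Passing the slice directly leaves the caller's slice untouched.
    assert(readByte(slice) == 1);
    assert(slice == [1, 2, 3]);

    // Wrapping it makes popFront act on the original slice.
    auto wrapped = refRange(&slice);
    assert(readByte(wrapped) == 1);
    assert(slice == [2, 3]);
}
```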

- Jonathan M Davis

November 15, 2017
On 11/15/17 3:53 PM, unleashy wrote:

> So my question is, is there a way to treat a slice strictly as an InputRange, so that it is mutated no matter what? Or is there another way to do what I'm trying to do?

I'd model your functions after the std.conv.parse functions:

https://dlang.org/phobos/std_conv.html#parse

This always takes the range by reference, and takes the type to parse via a template parameter (more idiomatic D to have parse!int than parseInt).
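
For reference, a short sketch of how `parse` consumes its input. The second half uses `std.bitmanip.read`, which is not mentioned above but follows the same shape (type as a template argument, range by ref) for ranges of ubyte; the sample bytes are made up.

```d
import std.bitmanip : read;
import std.conv : parse;

void main()
{
    // parse takes the input by ref and removes what it consumed.
    string input = "42 3.5";
    int a = parse!int(input);
    assert(a == 42);
    assert(input == " 3.5");

    // parse does not skip leading whitespace, so drop the space by hand.
    input = input[1 .. $];
    double b = parse!double(input);
    assert(b == 3.5);
    assert(input.length == 0);

    // std.bitmanip.read does the same for binary data (big-endian by default).
    ubyte[] buffer = [0, 0, 0, 42, 1, 5];
    assert(buffer.read!int() == 42);
    assert(buffer.read!ushort() == 261);
    assert(buffer.length == 0);
}
```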

-Steve
November 15, 2017
On 11/15/2017 01:02 PM, Jonathan M Davis wrote:
> On Wednesday, November 15, 2017 20:53:39 unleashy via Digitalmars-d-learn wrote:

>> ubyte[] slice;
>> ...
>> int a = readInt(slice);
>> double b = readDouble(slice);
>>
>> Ends up failing, because readInt properly reads an int, but the
>> slice is not mutated "outside" of itself

That should work:

import std.range;

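// T is given explicitly (read!int); Range is inferred from the argument.
T read(T, Range)(auto ref Range range) if (isInputRange!Range) {
    range.popFront(); // advances the caller's range when an lvalue is passed
    return 42;        // placeholder; real parsing would go here
}

unittest {
    ubyte[] slice = [ 1, 2 ];
    read!int(slice);
    assert(slice == [2]); // the lvalue slice was advanced by one element
}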

void main() {
}

> If you specifically want a function to
> accept a range and mutate it without returning it, then it should take its
> argument by ref.

Agreed.

Ali

November 15, 2017
Thanks for the insights everyone, it really helped!

I actually discovered that it wasn't working because one of the parsing functions used `std.range.take` and, since I was giving it a slice, `take` worked on its own copy (a slice is a forward range) instead of mutating the original. I realised the `take` call was 100% useless, so I removed it and it works perfectly now. I also refactored the code to be more idiomatic :)
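
A minimal reproduction of that `take` behaviour, with made-up data (this is a sketch, not the actual parser code):

```d
import std.algorithm.comparison : equal;
import std.range : take;

void main()
{
    ubyte[] slice = [1, 2, 3, 4];

    // take receives the slice by value, so the original is not advanced.
    auto firstTwo = slice.take(2);
    assert(firstTwo.equal([1, 2]));
    assert(slice == [1, 2, 3, 4]);

    // To actually consume the elements, the caller has to advance explicitly.
    slice = slice[2 .. $];
    assert(slice == [3, 4]);
}
```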

On Wednesday, 15 November 2017 at 21:02:35 UTC, Jonathan M Davis wrote:
> If you specifically want a function to accept a range and mutate it without returning it, then it should take its argument by ref. Having it take auto ref is actually quite odd, since that means that the behavior can depend on whether an lvalue or rvalue is passed in.

I was under the impression that templated parameters needed `auto ref` to work as `ref` properly. Good to know that's not true.

On Wednesday, 15 November 2017 at 21:11:16 UTC, Steven Schveighoffer wrote:
>I'd model your functions after the std.conv.parse functions:
>
>https://dlang.org/phobos/std_conv.html#parse
>
>This always takes the range by reference, and takes the type to parse via a template parameter (more idiomatic D to have parse!int than parseInt).

Yes, that is much more idiomatic than what I was going for. I've been writing some Java, so I guess it got to my head :)

On Wednesday, 15 November 2017 at 21:14:04 UTC, Ali Çehreli wrote:
>That should work:
>
>import std.range;
>
>T read(T, Range)(auto ref Range range) if (isInputRange!Range) {
>    range.popFront();
>    return 42;
>}
>
>unittest {
>    ubyte[] slice = [ 1, 2 ];
>    read!int(slice);
>    assert(slice == [2]);
>}
>
>void main() {
>}

That is exactly what I was looking for, thanks!
November 16, 2017
On Wednesday, November 15, 2017 22:48:12 unleashy via Digitalmars-d-learn wrote:
> On Wednesday, 15 November 2017 at 21:02:35 UTC, Jonathan M Davis wrote:
> > If you specifically want a function to accept a range and mutate it without returning it, then it should take its argument by ref. Having it take auto ref is actually quite odd, since that means that the behavior can depend on whether an lvalue or rvalue is passed in.
>
> I was under the impression that templated parameters needed `auto ref` to work as `ref` properly. Good to know that's not true.

What auto ref does is make it so that the parameter is inferred as ref if the argument is an lvalue and inferred as non-ref if it's an rvalue. That way, lvalues get passed by reference, and rvalues get moved. It's really not the sort of thing you use when you intend to mutate the parameter. It's either used with the intent of avoiding copying (similar to when you use const T& in C++), or it's used to forward the refness of the argument (e.g. that's important with something like emplace, which forwards the arguments to the constructor of the type being constructed).

Because auto ref generates different template instantiations based on the refness of the type, it only works with templated functions, but ref in general doesn't function any differently with templates than it does with non-templates.
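
A small sketch of that inference, using hypothetical helpers `passedByRef` and `dropOne` (names made up for illustration). `__traits(isRef, ...)` reports which instantiation was chosen:

```d
import std.range.primitives : popFront;

// Reports whether this instantiation received its argument by ref.
bool passedByRef(T)(auto ref T value)
{
    return __traits(isRef, value);
}

// Mutates its parameter; the effect on the caller depends on refness.
void dropOne(R)(auto ref R range)
{
    range.popFront();
}

void main()
{
    ubyte[] data = [1, 2, 3];

    assert(passedByRef(data));            // lvalue: inferred as ref
    assert(!passedByRef(data[0 .. $]));   // rvalue: inferred as non-ref

    dropOne(data);                        // lvalue: caller's slice advances
    assert(data == [2, 3]);

    dropOne(data[0 .. $]);                // rvalue: only the temporary advances
    assert(data == [2, 3]);
}
```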

- Jonathan M Davis