Thread overview
range simple toy problem
Jun 01, 2018
Xiaoxi
Jun 01, 2018
Alex
Jun 01, 2018
ag0aep6g
Jun 01, 2018
Xiaoxi
June 01, 2018
import std.range;
import std.algorithm;
import std.string;
import std.stdio;

void main()
{
  auto s = "1 2 3 4 5 6 7 8 9";
  auto iter = s.split(" ").drop(2);
	
  // How to find the unconsumed/not-split part of s here?
  // i.e. "3 4 5 6 7 8 9" NOT ["3", "4", "5", "6", "7", "8", "9"]	
  // s[??? .. $] <- what goes here?
}

split is just an example; this is a generic question. If you chain multiple lazy functions and then consume part of the data, how do you know how to slice the original buffer to point to the unconsumed data? Imagine the chain could be quite long:

s.one.two.three.four.five.six.seven()

You don't really want to lazily apply the inverse of every function in the chain, and some of them might even be destructive, so it might not be possible in all cases.

Thanks,
The Range n00b

June 01, 2018
On 6/1/18 1:00 PM, Xiaoxi wrote:
> import std.range;
> import std.algorithm;
> import std.string;
> import std.stdio;
> 
> void main()
> {
>    auto s = "1 2 3 4 5 6 7 8 9";
>    auto iter = s.split(" ").drop(2);
> 
>    // How to find the unconsumed/not-split part of s here?
>    // i.e. "3 4 5 6 7 8 9" NOT ["3", "4", "5", "6", "7", "8", "9"]
>    // s[??? .. $] <- what goes here?
> }
> 
> split is just an example; this is a generic question. If you chain multiple lazy functions and then consume part of the data, how do you know how to slice the original buffer to point to the unconsumed data? Imagine the chain could be quite long:
> 
> s.one.two.three.four.five.six.seven()
> 
> You don't really want to lazily apply the inverse of every function in the chain, and some of them might even be destructive, so it might not be possible in all cases.

Yes, this is a problem in range-land that is difficult to solve.

I don't know of a good answer to it. Probably you want to use algorithms instead of range wrappers to do this, but I don't know how to do this one-liner style.

I found ranges/algorithms substandard when trying to get the *other* data instead (i.e. what if you wanted "1 2" instead).

-Steve
June 01, 2018
On Friday, 1 June 2018 at 17:00:45 UTC, Xiaoxi wrote:
> import std.range;
> import std.algorithm;
> import std.string;
> import std.stdio;
>
> void main()
> {
>   auto s = "1 2 3 4 5 6 7 8 9";
>   auto iter = s.split(" ").drop(2);
> 	
>   // How to find the unconsumed/not-split part of s here?
>   // i.e. "3 4 5 6 7 8 9" NOT ["3", "4", "5", "6", "7", "8", "9"]	
>   // s[??? .. $] <- what goes here?
> }
>
> split is just an example; this is a generic question. If you chain multiple lazy functions and then consume part of the data, how do you know how to slice the original buffer to point to the unconsumed data? Imagine the chain could be quite long:
>
> s.one.two.three.four.five.six.seven()
>
> You don't really want to lazily apply the inverse of every function in the chain, and some of them might even be destructive, so it might not be possible in all cases.
>

`split` is already destructive, as you lose the whitespace. In this special case, you could maybe recover the string with

iter.join(" ");

https://dlang.org/library/std/array/join.html

If the separator can't be known in advance, I would store it somehow. Maybe with
https://dlang.org/library/std/algorithm/searching/find_split.html
?
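
A minimal sketch of both suggestions, assuming the separator is a fixed, known string. `skipFields` is a hypothetical helper name used for illustration, not a Phobos function. The `join` variant allocates a new string, while the `findSplit` variant keeps returning slices of the original buffer:

```d
import std.algorithm.searching : findSplit;
import std.array : join, split;
import std.range : drop;
import std.stdio : writeln;

// Hypothetical helper (not part of Phobos): skips `n` separator-delimited
// fields and returns the remainder as a slice of `s`, without allocating.
string skipFields(string s, string sep, size_t n)
{
    foreach (_; 0 .. n)
    {
        // findSplit yields (before, separator, after). If `sep` is not
        // found, `after` is empty, so the result simply becomes empty.
        s = s.findSplit(sep)[2];
    }
    return s;
}

void main()
{
    auto s = "1 2 3 4 5 6 7 8 9";

    // Rejoining only works because the separator is known; it builds a new string.
    writeln(s.split(" ").drop(2).join(" ")); // 3 4 5 6 7 8 9

    // findSplit keeps slicing the original buffer.
    writeln(skipFields(s, " ", 2)); // 3 4 5 6 7 8 9
}
```

If the input has fewer fields than requested, this sketch returns an empty string rather than signalling an error; whether that is acceptable depends on the caller.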

June 01, 2018
On 06/01/2018 07:00 PM, Xiaoxi wrote:
> import std.range;
> import std.algorithm;
> import std.string;
> import std.stdio;
> 
> void main()
> {
>    auto s = "1 2 3 4 5 6 7 8 9";
>    auto iter = s.split(" ").drop(2);
> 
>    // How to find the unconsumed/not-split part of s here?
>    // i.e. "3 4 5 6 7 8 9" NOT ["3", "4", "5", "6", "7", "8", "9"]
>    // s[??? .. $] <- what goes here?
> }

This prints "3 4 5 6 7 8 9":

----
import std.range;
import std.algorithm;
import std.stdio;

void main()
{
   auto s = "1 2 3 4 5 6 7 8 9";
   auto iter = refRange(&s).splitter!(c => c == ' ').drop(2);
   writeln(s); /* "3 4 5 6 7 8 9" */
}
----

Arguably, `.splitter(' ')` should work as well, but it doesn't.

Warning: Large parts of Phobos, including `splitter`, have problems handling a `RefRange`. So this might break down when you take it beyond the toy stage.
https://issues.dlang.org/show_bug.cgi?id=18657

> split is just an example; this is a generic question. If you chain multiple lazy functions and then consume part of the data...

Nitpick: `split` is not lazy. `splitter` is.
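
A small sketch of that difference, checked at compile time. The variable names are only illustrative:

```d
import std.algorithm.iteration : splitter;
import std.array : split;
import std.range.primitives : isInputRange;

void main()
{
    auto s = "1 2 3";

    // `split` does all the work up front and returns an array of slices.
    auto eager = s.split(" ");
    static assert(is(typeof(eager) == string[]));

    // `splitter` returns a range struct that locates each piece on demand.
    auto onDemand = s.splitter(" ");
    static assert(isInputRange!(typeof(onDemand)));
    static assert(!is(typeof(onDemand) == string[]));
}
```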

> how do you know how to slice the original buffer to point to the unconsumed data? 

In my opinion, `refRange` should fit the bill here. But:

1) It's not always clear from the documentation how much a lazy function actually pops. It might pop more than you expect, or less.
2) As mentioned, `refRange` has compatibility issues with other parts of Phobos.
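
For comparison, a sketch that avoids `refRange` entirely. It assumes the chain still yields slices of the original buffer, which holds for `splitter` over a string but not for arbitrary chains (anything that copies or transforms elements, such as `map`, breaks it). `fromSliceToEnd` is a hypothetical helper name, not a Phobos function:

```d
import std.algorithm.iteration : splitter;
import std.range : drop;
import std.stdio : writeln;

// Hypothetical helper: given a buffer and a slice of that same buffer,
// returns everything from the start of the slice to the end of the buffer.
// Assumption: `part` really points into `buffer` (checked by the assert).
string fromSliceToEnd(string buffer, string part)
{
    assert(part.ptr >= buffer.ptr && part.ptr <= buffer.ptr + buffer.length);
    immutable offset = cast(size_t)(part.ptr - buffer.ptr);
    return buffer[offset .. $];
}

void main()
{
    auto s = "1 2 3 4 5 6 7 8 9";
    auto iter = s.splitter(' ').drop(2);

    // The front element is a slice of `s`, so its pointer gives the offset.
    if (!iter.empty)
        writeln(fromSliceToEnd(s, iter.front)); // 3 4 5 6 7 8 9
}
```

This sidesteps the question of how much a lazy function pops, because the offset is taken from the element that is actually at the front.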
June 01, 2018
On Friday, 1 June 2018 at 18:40:45 UTC, ag0aep6g wrote:
> On 06/01/2018 07:00 PM, Xiaoxi wrote:
>
> This prints "3 4 5 6 7 8 9":
>
> ----
> import std.range;
> import std.algorithm;
> import std.stdio;
>
> void main()
> {
>    auto s = "1 2 3 4 5 6 7 8 9";
>    auto iter = refRange(&s).splitter!(c => c == ' ').drop(2);
>    writeln(s); /* "3 4 5 6 7 8 9" */
> }
> ----
>
> Arguably, `.splitter(' ')` should work as well, but it doesn't.
>
> Warning: Large parts of Phobos, including `splitter`, have problems handling a `RefRange`. So this might break down when you take it beyond the toy stage.
> https://issues.dlang.org/show_bug.cgi?id=18657
>
>> split is just an example; this is a generic question. If you chain multiple lazy functions and then consume part of the data...
>
> Nitpick: `split` is not lazy. `splitter` is.
>
>> how do you know how to slice the original buffer to point to the unconsumed data?
>
> In my opinion, `refRange` should fit the bill here. But:
>
> 1) It's not always clear from the documentation how much a lazy function actually pops. It might pop more than you expect, or less.
> 2) As mentioned, `refRange` has compatibility issues with other parts of Phobos.

Many thanks to you all for your helpful comments! It seems std.range is a small minefield for beginners. std.algorithm is much easier and more intuitive to use (my current code uses findSplit). However, when std.range works, it really shines.

refRange is exactly what I needed, thanks! It is really worrisome that a function could pop too much or too little though, especially if that changed between D versions. I guess I need to add a good unit test to catch that.