Thread overview
Variable chunk sizes operations
Aug 29, 2020
Mihaela Chirea
Aug 30, 2020
Paul Backus
Aug 30, 2020
9il
Aug 30, 2020
Mihaela Chirea
August 29, 2020
I started working on this issue and I reached a point where some more opinions would be needed:

https://github.com/dlang/phobos/pull/7600

Since the range is split into uneven parts, some operations can only be done in O(n).
Slicing requires taking into account all the first elements of the range specifying the chunk lengths to find the starting point in the source range.
Back and popBack also need to find where that last chunk is since the source length and the provided lengths wouldn't necessarily match.

As mentioned in the pull request comments, the complexity of the program using these might be higher than expected.

So should they be provided in spite of their complexity? Or would a forward range(at most) be enough?
August 30, 2020
On Saturday, 29 August 2020 at 23:33:49 UTC, Mihaela Chirea wrote:
> I started working on this issue and I reached a point where some more opinions would be needed:
>
> https://github.com/dlang/phobos/pull/7600
>
> Since the range is split into uneven parts, some operations can only be done in O(n).

I think care should be taken to distinguish which `n` we're talking about when we use terms like O(n) in this context.

For example, let's say N = the length of the source range, and S = the length of the chunk-sizes range. Your length method is O(S), but not O(N)--in fact, it runs in constant time w.r.t. the length of the source range.

I'm not sure what Phobos policy is regarding time complexity for algorithms that take multiple ranges (does it have to be O(1) w.r.t. *all* arguments?), but given that S is likely to be smaller than N, there's a case to be made that the linear-time algorithm is reasonable here.

(A similar case is binary search of a range that contains strings runs in logarithmic time w.r.t. the length of the range, but linear time w.r.t. the length of the target string, because checking two strings for equality is a linear-time operation.)
August 30, 2020
On Saturday, 29 August 2020 at 23:33:49 UTC, Mihaela Chirea wrote:
> I started working on this issue and I reached a point where some more opinions would be needed:
>
> https://github.com/dlang/phobos/pull/7600

Mir solution for random-access ranges [1]. It has a slightly different API, instead of the chunk sizes, it uses the index sequence. Everything is O(1).

[1] http://mir-algorithm.libmir.org/mir_ndslice_topology.html#.chopped
August 30, 2020
On Sunday, 30 August 2020 at 04:32:17 UTC, 9il wrote:
> On Saturday, 29 August 2020 at 23:33:49 UTC, Mihaela Chirea wrote:
>> I started working on this issue and I reached a point where some more opinions would be needed:
>>
>> https://github.com/dlang/phobos/pull/7600
>
> Mir solution for random-access ranges [1]. It has a slightly different API, instead of the chunk sizes, it uses the index sequence. Everything is O(1).
>
> [1] http://mir-algorithm.libmir.org/mir_ndslice_topology.html#.chopped

Thanks! This would indeed solve the indexing and slicing problem.

But back and popBack would still have to search for the last valid index. At the moment, when the source range is too short, the last elements of the lengths range are ignored, so the length of the range of chunks isn't always equal to the length of the lengths range.
I was thinking of solving this by just giving empty chunks for those extra elements.

Does this sound like a good approach? Or maybe throwing an error would be a better option?