Requirements for Slicing Ranges

Index » General » Requirements for Slicing Ranges

Requirements for Slicing Ranges
Sep 02 Jonathan M Davis
Sep 02 Richard (Rikki) Andrew Cattermole
Sep 02 Jonathan M Davis
Sep 02 Richard (Rikki) Andrew Cattermole
Sep 02 Jonathan M Davis
Sep 02 Sebastiaan Koppe
Sep 02 Paul Backus
Sep 03 Sebastiaan Koppe
Sep 03 Jonathan M Davis
Sep 03 Jonathan M Davis
Sep 02 Paul Backus
Sep 02 Jonathan M Davis
Sep 02 monkyyy
Sep 03 Jonathan M Davis
Sep 03 monkyyy
Sep 05 Quirin Schroll

September 01

Posted by Jonathan M Davis

Permalink

Jonathan M Davis

Permalink

One thing that I've been thinking about with regards to improving the range API which I forgot to discuss previously is the requirements for hasSlicing. Right now, for whatever reason, slicing is divorced from random-access ranges. In practice, random-access ranges have slicing (unless the programmer just didn't bother to implement it), and other ranges don't, but currently, hasSlicing is a separate trait that can theoretically pass with a forward range, and it's not guaranteed that a random-access range has slicing. So, my question is:

1. Are there any realistic situations where a range could be a forward or bidirectional range _and_ implement slicing in O(1), but it _can't_ actually be implemented as a random-access range? So, it would need to be possible to take slices of elements of such a range by index in O(1) and yet somehow not be possible to access individual elements by index in O(1).

2. Are there any realistic situations where a range could be a random-access
range but _can't_ implement slicing in O(1)? So, it would need to be
possible to access individual elements by index in O(1) and yet somehow not
be possible to take slices of elements by index in O(1).

As far as I can tell, the ability to index a range and the ability to slice
it are inextricably linked. If you can do range[5 .. 12] or range[7 .. $] in
O(1), then you should be able to do range[5] or range[12] in O(1).
Similarly, if you can do range[5] in O(1), it should be possible to do
range[5 .. 12] in O(1).

Does anyone know of a situation where that would not be the case? I would love to just ditch the test for slicing in the updated range API and tie the ability to random-access ranges, since it would be simplify things. If I had to guess without digging through all of the history, I suspect that random-access ranges were added before slicing was and that that's why they weren't tied together, but in practice, they always seem to be, and I can't think of a reason at the moment why they can't be.

Can anyone see something that I'm missing here?

- Jonathan M Davis

September 02

Re: Requirements for Slicing Ranges

Posted by Richard (Rikki) Andrew Cattermole
in reply to Jonathan M Davis

Permalink

Richard (Rikki) Andrew Cattermole

Posted in reply to Jonathan M Davis

Permalink

On 02/09/2024 2:47 PM, Jonathan M Davis wrote:
> Can anyone see something that I'm missing here?

Indexing and range intervals are indeed distinct things.

An interval does not have to evaluate to a specific bit of data, but an index does.

They can also use non-integral keys.

Typically both ``size_t`` and ``ptrdiff_t`` will have defined sequentiality, but they may not in the case of a map. Whereby an index works but not an interval.

So the question is, how many assumptions surrounding sequentiality is being made, along with the key type?

September 01

Re: Requirements for Slicing Ranges

Posted by Jonathan M Davis
in reply to Richard (Rikki) Andrew Cattermole

Permalink

Jonathan M Davis

Posted in reply to Richard (Rikki) Andrew Cattermole

Permalink

On Sunday, September 1, 2024 9:31:23 PM MDT Richard (Rikki) Andrew Cattermole via Digitalmars-d wrote:
> On 02/09/2024 2:47 PM, Jonathan M Davis wrote:
> > Can anyone see something that I'm missing here?
>
> Indexing and range intervals are indeed distinct things.
>
> An interval does not have to evaluate to a specific bit of data, but an index does.
>
> They can also use non-integral keys.
>
> Typically both ``size_t`` and ``ptrdiff_t`` will have defined sequentiality, but they may not in the case of a map. Whereby an index works but not an interval.
>
> So the question is, how many assumptions surrounding sequentiality is being made, along with the key type?

The range API uses size_t for indices, and it is required that the indices be sequential. range[3] gives you the 4th element and range[3 .. 9] gives you a 6 element range starting at the 4th element.

Containers may do other things with keys (particularly containers like hash tables and sorted maps), but the range API treats indices the same way that arrays do. Doing anything else would complicate things considerably, and if anything, the range API is arguably already too complex.

In essence, the range API provides an abstraction based on arrays, with random-access ranges providing the full capabilities of a dynamic array / slice aside from the capabilities related to increasing their size or doing any kind allocations. Each lower category of range then loses capabilities as it moves further away from being an array, with basic input ranges just providing a way to iterate through the elements once and nothing else. Or, if you want to view the situation in reverse, we start with a basic input range being only able to iterate through a sequential list of elements once, and each category of range builds up towards the capabilities of an array.

As such, the situation with keys is simple in that there are no keys in the general sense. We're just dealing with indices, and an index needs to mean the same thing that it would with an array. It's simply a question of where in the hierarchy we put each capability.

Indexing requires random-access ranges. Slicing at present does not, but as far as I can tell, from a technical persective, with regards to a sequential data structure such as a range, it is not possible to implement indexing without also being able to implement slicing, and I don't see how it would be possible to implement slicing but not be possible to implement indexing.

Non-sequential data structures really don't have anything to do with ranges aside from cases where such a data structure provides a sequential view into its data via a range.

- Jonathan M Davis

September 02

Re: Requirements for Slicing Ranges

Posted by Richard (Rikki) Andrew Cattermole
in reply to Jonathan M Davis

Permalink

Richard (Rikki) Andrew Cattermole

Posted in reply to Jonathan M Davis

Permalink

On 02/09/2024 4:35 PM, Jonathan M Davis wrote:
> On Sunday, September 1, 2024 9:31:23 PM MDT Richard (Rikki) Andrew Cattermole
> via Digitalmars-d wrote:
>> On 02/09/2024 2:47 PM, Jonathan M Davis wrote:
>>> Can anyone see something that I'm missing here?
>>
>> Indexing and range intervals are indeed distinct things.
>>
>> An interval does not have to evaluate to a specific bit of data, but an
>> index does.
>>
>> They can also use non-integral keys.
>>
>> Typically both ``size_t`` and ``ptrdiff_t`` will have defined
>> sequentiality, but they may not in the case of a map. Whereby an index
>> works but not an interval.
>>
>> So the question is, how many assumptions surrounding sequentiality is
>> being made, along with the key type?
> 
> The range API uses size_t for indices, and it is required that the indices
> be sequential. range[3] gives you the 4th element and range[3 .. 9] gives
> you a 6 element range starting at the 4th element.
> 
> Containers may do other things with keys (particularly containers like hash
> tables and sorted maps), but the range API treats indices the same way that
> arrays do. Doing anything else would complicate things considerably, and if
> anything, the range API is arguably already too complex.
> 
> In essence, the range API provides an abstraction based on arrays, with
> random-access ranges providing the full capabilities of a dynamic array /
> slice aside from the capabilities related to increasing their size or doing
> any kind allocations. Each lower category of range then loses capabilities
> as it moves further away from being an array, with basic input ranges just
> providing a way to iterate through the elements once and nothing else. Or,
> if you want to view the situation in reverse, we start with a basic input
> range being only able to iterate through a sequential list of elements once,
> and each category of range builds up towards the capabilities of an array.
> 
> As such, the situation with keys is simple in that there are no keys in the
> general sense. We're just dealing with indices, and an index needs to mean
> the same thing that it would with an array. It's simply a question of where
> in the hierarchy we put each capability.
> 
> Indexing requires random-access ranges. Slicing at present does not, but as
> far as I can tell, from a technical persective, with regards to a sequential
> data structure such as a range, it is not possible to implement indexing
> without also being able to implement slicing, and I don't see how it would
> be possible to implement slicing but not be possible to implement indexing.
> 
> Non-sequential data structures really don't have anything to do with ranges
> aside from cases where such a data structure provides a sequential view into
> its data via a range.
> 
> - Jonathan M Davis

Your question appears to be answered :)

September 01

Re: Requirements for Slicing Ranges

Posted by Jonathan M Davis
in reply to Richard (Rikki) Andrew Cattermole

Permalink

Jonathan M Davis

Posted in reply to Richard (Rikki) Andrew Cattermole

Permalink

On Sunday, September 1, 2024 10:54:02 PM MDT Richard (Rikki) Andrew Cattermole via Digitalmars-d wrote:
> Your question appears to be answered :)

Probably. I'm just trying to make sure that there isn't some corner case that I'm missing.

As far as I can tell, if you can index a range in O(1), it should be
possible to implement slicing in O(1), and if you can slice a range in O(1),
it should be possible to index it in O(1) (at least as long as the
requirement to have slicing give you the same type remains in place), but
the current range API does not make that assumption, which is kind of weird.

So, I'm asking if anyone is able to see a hole in that logic or a corner case where it somehow really would make sense to be able to slice a range but not be able to also index that range. I don't _think_ that such a thing is possible, but it's possible that I'm missing something here, and someone else may see something that I don't. I'm 99% certain that I have a proper grasp of the situation, but I'd just like to be doubly sure.

- Jonathan M Davis

September 02

Re: Requirements for Slicing Ranges

Posted by Sebastiaan Koppe
in reply to Jonathan M Davis

Permalink

Sebastiaan Koppe

Posted in reply to Jonathan M Davis

Permalink

On Monday, 2 September 2024 at 02:47:16 UTC, Jonathan M Davis wrote:
> Does anyone know of a situation where that would not be the case? I would love to just ditch the test for slicing in the updated range API and tie the ability to random-access ranges, since it would be simplify things. If I had to guess without digging through all of the history, I suspect that random-access ranges were added before slicing was and that that's why they weren't tied together, but in practice, they always seem to be, and I can't think of a reason at the moment why they can't be.
>
> Can anyone see something that I'm missing here?
>
> - Jonathan M Davis

I briefly wondered how it would work if the range is non-copyable, since taking a slice is effectively taking a copy of the underlying data the range captures (and its lifetime). But I remember you saying in another thread that forward ranges have to be copyable, correct?

So I guess the question for me boils down to whether there exists a need for a non-copyable range that provides random access and doesn't allow copies. (Although I guess with ref and dip1000 you could make that work, but lets not go there :smile).

September 02

Re: Requirements for Slicing Ranges

Posted by Paul Backus
in reply to Sebastiaan Koppe

Permalink

Paul Backus

Posted in reply to Sebastiaan Koppe

Permalink

On Monday, 2 September 2024 at 09:59:04 UTC, Sebastiaan Koppe wrote:
>
> I briefly wondered how it would work if the range is non-copyable, since taking a slice is effectively taking a copy of the underlying data the range captures (and its lifetime).

Taking a slice does not necessarily copy the range, nor does it necessarily copy the range's elements.

For example:

    struct DontCopyMe
    {
        this(this) { assert(0); }
    }

    void main()
    {
        auto r1 = [DontCopyMe(), DontCopyMe(), DontCopyMe()];
        auto r2 = r1[1 .. $];
        // No AssertError => elements not copied
        assert(r1 != r2); // Not equal => r2 is not a copy of r1
    }

September 02

Re: Requirements for Slicing Ranges

Posted by Paul Backus
in reply to Jonathan M Davis

Permalink

Paul Backus

Posted in reply to Jonathan M Davis

Permalink

On Monday, 2 September 2024 at 02:47:16 UTC, Jonathan M Davis wrote:

Are there any realistic situations where a range could be a forward or bidirectional range and implement slicing in O(1), but it can't actually be implemented as a random-access range? So, it would need to be possible to take slices of elements of such a range by index in O(1) and yet somehow not be possible to access individual elements by index in O(1).

Infinite forward ranges can implement slicing, but can't be random-access ranges because they don't support r.back.

September 02

Re: Requirements for Slicing Ranges

Posted by monkyyy
in reply to Jonathan M Davis

Permalink

monkyyy

Posted in reply to Jonathan M Davis

Permalink

On Monday, 2 September 2024 at 02:47:16 UTC, Jonathan M Davis wrote:
> One thing that I've been thinking about with regards to improving the range API which I forgot to discuss previously is the requirements for hasSlicing. Right now, for whatever reason, slicing is divorced from random-access ranges. In practice, random-access ranges have slicing (unless the programmer just didn't bother to implement it), and other ranges don't, but currently, hasSlicing is a separate trait that can theoretically pass with a forward range, and it's not guaranteed that a random-access range has slicing. So, my question is:
>
> 1. Are there any realistic situations where a range could be a forward or bidirectional range _and_ implement slicing in O(1), but it _can't_ actually be implemented as a random-access range? So, it would need to be possible to take slices of elements of such a range by index in O(1) and yet somehow not be possible to access individual elements by index in O(1).
>
> 2. Are there any realistic situations where a range could be a random-access
> range but _can't_ implement slicing in O(1)? So, it would need to be
> possible to access individual elements by index in O(1) and yet somehow not
> be possible to take slices of elements by index in O(1).
>
> As far as I can tell, the ability to index a range and the ability to slice
> it are inextricably linked. If you can do range[5 .. 12] or range[7 .. $] in
> O(1), then you should be able to do range[5] or range[12] in O(1).
> Similarly, if you can do range[5] in O(1), it should be possible to do
> range[5 .. 12] in O(1).
>
> Does anyone know of a situation where that would not be the case? I would love to just ditch the test for slicing in the updated range API and tie the ability to random-access ranges, since it would be simplify things. If I had to guess without digging through all of the history, I suspect that random-access ranges were added before slicing was and that that's why they weren't tied together, but in practice, they always seem to be, and I can't think of a reason at the moment why they can't be.
>
> Can anyone see something that I'm missing here?
>
> - Jonathan M Davis

I think slicing should probably be cut from that main api, Id argue its the hardest to get correct of any of the functions while its the most fragile in meta programming in my experiments

Instead there should be a collection of implict range functions given special treatment(std.range.interface, object.d, somewhere new for ranges)

---

```d
auto tail(R)(R r){
  r.pop;
  return r;
}
```
It would suck to implement tail everywhere, so tail should not be in the main api; but often you need to make a poped dup. If tail doesnt exist, several range algorithms will need 3 extra lines of code repeated several time, if its in the main api, every algorthim will need to impliment a `auto tail()=>r.tail;` busy work as a specail treatment floating function it is 3 lines, users dont need to implement anything and it will reduce complexity and clarity of several algorithms(consider zip.pop being implimented with static map)

likewise `sliceable` should be a floating function and it wraps ranges and adds opSlice, implements the meta programming of detecting if its implementable *once*, instead of the each and every algorthim needing to juggle the complexity of opDollar existing or not, infinite or finite, error behavoir etc.

September 02

Re: Requirements for Slicing Ranges

Posted by Jonathan M Davis
in reply to Paul Backus

Permalink

Jonathan M Davis

Posted in reply to Paul Backus

Permalink

On Monday, September 2, 2024 6:04:08 AM MDT Paul Backus via Digitalmars-d wrote:
> On Monday, 2 September 2024 at 02:47:16 UTC, Jonathan M Davis
>
> wrote:
> > 1. Are there any realistic situations where a range could be a forward or bidirectional range _and_ implement slicing in O(1), but it _can't_ actually be implemented as a random-access range? So, it would need to be possible to take slices of elements of such a range by index in O(1) and yet somehow not be possible to access individual elements by index in O(1).
>
> Infinite forward ranges can implement slicing, but can't be random-access ranges because they don't support `r.back`.

Well, with the current rules, infinite ranges don't require back in order to be random-access ranges. You can have an infinite range which implements indexing - and therefore is a random-access range - and it is of course also a forward range, but it's not a bidirectional range, because it doesn't implement back. That corner case probably does cause occasional problems (since finite random-access ranges do have to be bidirectional ranges), because there's probably code that checks for random-access ranges without checking whether the range is infinite and then decides to use back, but you wouldn't usually try to use an infinite range with something that needed back anyway, because it wouldn't even make sense to try. Either way, I suspect that there are a variety of bugs out there with range-based code that doesn't actually take infinite ranges into account but which hasn't been a problem simply because infinite ranges aren't all that common.

As for slicing an infinite range, if you slice it with two indices, you get a finite range (which probably causes problems for some code, because the type isn't the same, but it's obviously not possible to return the same type in that case), whereas if you slice it with an index and $, you get the same range type, just like you would when slicing a finite range.

So, with regards to infinite ranegs, the question is whether it's possible
to implement range[1 .. 5] (which gives a finite range) or range[5 .. $]
(which gives the same type of range) in O(1) without being able to implement
range[5] in O(1) - or if it's possible to implement range[5] in O(1), but
it's not possible to implement range[1 .. 5] or range[5 .. $] in O(1). I'm
pretty sure that they're inextricably linked and that if you can implement
slicing, you can implement random-access (and vice-versa), but again, I
could be missing something.

Regardless, random-access and slicing infinite ranges gets a bit weird, because the "end" has to be treated differently, and you can't get a finite slice of the same type as the infinite range when slicing, since the lack of end is part of the type itself.

- Jonathan M Davis

Top | Forum index | About this forum

Forums