Questions about the slice operator (page 2) - D Programming Language Discussion Forum

April 04, 2012

Re: Questions about the slice operator

Posted by Simen Kjærås
in reply to Jacob Carlborg

On Wed, 04 Apr 2012 14:21:01 +0200, Jacob Carlborg <doob@me.com> wrote:

> On 2012-04-04 14:16, Simen Kjærås wrote:
>
>> And what do we do with 3..$?
>
> Hmm, that's a good point. The best I can think of for now is to translate that to:
>
> range(3, size_t.max)
>
> Or something like:
>
> struct range
> {
>      size_t start;
>      size_t end;
>      bool dollar; // better name is needed
> }
>
> range(3, 0, true)
>

Not enough:

$-3..$-2

This is a hard and unpleasant one, unless we go with $ being
defined as the length of the array we're slicing, and only valid
inside a slice operation. (and of course some opDollar or the
like for other containers)
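
For reference, a minimal sketch of how $ already behaves with built-in arrays, where it simply stands for the length of the array being sliced. The array contents are made up for illustration.

```d
void main()
{
    int[] arr = [10, 20, 30, 40, 50];

    // Inside the brackets, $ is the length of the array being sliced.
    assert(arr[3 .. $] == [40, 50]);
    assert(arr[$ - 3 .. $ - 2] == [30]);

    // The same slices with the length written out explicitly.
    assert(arr[3 .. arr.length] == [40, 50]);
    assert(arr[arr.length - 3 .. arr.length - 2] == [30]);
}
```

Both endpoints depend on the array being sliced, which is why a standalone range(start, end, dollar) value cannot capture an expression like $-3..$-2.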

April 04, 2012

Re: Questions about the slice operator

Posted by Simen Kjærås
in reply to Simen Kjærås

On Wed, 04 Apr 2012 14:16:54 +0200, Simen Kjærås <simen.kjaras@gmail.com> wrote:

> On Wed, 04 Apr 2012 12:06:33 +0200, Jacob Carlborg <doob@me.com> wrote:
>
>> On 2012-04-04 04:11, Jonathan M Davis wrote:
>>
>>> foreach(i; 0 .. 5)
>>>
>>> is more efficient only because it has _nothing_ to do with arrays. Generalizing
>>> the syntax wouldn't help at all, and if it were generalized, it would arguably
>>> have to be consistent in all of its uses, in which case
>>>
>>> foreach(i; 0 .. 5)
>>>
>>> would become identical to
>>>
>>> foreach(i; [0, 1, 2, 3, 4])
>>>
>>> and therefore less efficient. Generalizing .. just doesn't make sense.
>>
>> Why couldn't the .. syntax be syntactic sugar for some kind of library-implemented range type, just as is done with associative arrays?
>>
>> We could implement a new library type, named "range". Looking something like this:
>>
>> struct range
>> {
>>      size_t start;
>>      size_t end;
>>      // implement the range interface or opApply
>> }
>>
>> range r = 1 .. 5;
>>
>> The above line would be syntax sugar for:
>>
>> range r = range(1, 5);
>>
>> void foo (range r)
>> {
>>      foreach (e ; r) {}
>> }
>>
>> foo(r);
>>
>> This could then be taken advantage of in other parts of the language:
>>
>> class A
>> {
>>      int opSlice (range r); // new syntax
>>      int opSlice (size_t start, size_t end); // old syntax
>> }
>>
>> I think this would be completely backwards compatible as well.
>>
>
> And what do we do with 3..$?

Actually, I've thought a little about this. And apart from the tiny
idiosyncrasy of $, a..b as a more regular type can bring some
interesting enhancements to the language.

Consider a..b as simply a set of indices, defined by a start point and
an end point. A different index set may be [1,2,4,5], or Strided!(3,4).

An index set then works as a filter on a range, returning only those
elements whose indices are in the set.

We can now redefine opIndex to take either a single index or an index
set, as follows:

auto opIndex(S)(S set) if (isIndexSet!S) {
    return set.transform(this);
}

For an AA, there would be another constraint that the type of elements
of the index set match those of the AA keys, of course. Other containers
may have other constraints.

An index set may or may not be iterable, but it should always supply
functionality to check if an index is contained in it.

With this framework laid out, we can define these operations on arrays,
and have any array be sliceable by an array of integral elements:

assert(['a','b','c'][[0,2]] == ['a', 'c']);

The problem of $ is a separate one, and quite complex to handle. No
doubt it is useful for arrays and their ilk, but for the generic array
and index set, it's complex and unpleasant.

Barring the use of expression templates, I see few other solutions than
to introduce the function opDollar(size_t level), where level is 0 for
the first index ([$]), 1 for the second ([_, $]), etc. This means there
is no way to express the concept of next-to-last element outside of the
opSlice call.

A different solution would be to use a specific type for $. Basically,
this would be:

struct Dollar(T) {
    T offset;
    alias offset this;
    // operator overloads here to assure typeof($+n) == typeof($)
}

This complicates things a lot, and still does not really work.
[1,2,3][0..foo($)] works in D today, but would not with the proposed
type. Hence, the use of $ outside slice operations likely should not
(indeed, cannot) be possible.
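
The index-set idea can be sketched in today's D without any language change. Everything named here is hypothetical and only for illustration: the isIndexSet trait (defined as "has a contains method"), the two index-set types Interval and Picks, and the Container wrapper. The sketch also passes the underlying array to transform rather than the container itself.

```d
// Hypothetical trait: an index set can at least answer "is i in the set?".
enum isIndexSet(S) = is(typeof(S.init.contains(size_t.init)) : bool);

// Hypothetical index set for a .. b: the half-open interval [start, end).
struct Interval
{
    size_t start, end;

    bool contains(size_t i) const { return start <= i && i < end; }

    // Applied to an array, an interval is an ordinary slice.
    T[] transform(T)(T[] data) { return data[start .. end]; }
}

// Hypothetical index set holding an explicit list of positions.
struct Picks
{
    size_t[] positions;

    bool contains(size_t i) const
    {
        foreach (p; positions)
            if (p == i)
                return true;
        return false;
    }

    // Gathers the selected elements into a new array.
    T[] transform(T)(T[] data)
    {
        T[] result;
        foreach (i; positions)
            result ~= data[i];
        return result;
    }
}

// Hypothetical container whose opIndex accepts an index or an index set.
struct Container(T)
{
    T[] data;

    // Single index: the usual element access.
    ref T opIndex(size_t i) { return data[i]; }

    // Index set: the set decides how elements are selected.
    auto opIndex(S)(S set) if (isIndexSet!S)
    {
        return set.transform(data);
    }
}

void main()
{
    auto c = Container!char(['a', 'b', 'c']);
    assert(c[1] == 'b');
    assert(c[Interval(1, 3)] == ['b', 'c']);
    assert(c[Picks([0, 2])] == ['a', 'c']);
}
```

Note that neither index set here has any way to express $, which is the separate problem described above.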

April 04, 2012

Re: Questions about the slice operator

Posted by Jacob Carlborg
in reply to Simen Kjærås

On 2012-04-04 15:01, Simen Kjærås wrote:

> Actually, I've thought a little about this. And apart from the tiny
> idiosyncrasy of $, a..b as a more regular type can bring some
> interesting enhancements to the language.
>
> Consider a..b as simply a set of indices, defined by a start point and
> an end point. A different index set may be [1,2,4,5], or Strided!(3,4).
>
> An index set then works as a filter on a range, returning only those
> elements whose indices are in the set.
>
> We can now redefine opIndex to take either a single index or an index
> set, as follows:
>
> auto opIndex(S)(S set) if (isIndexSet!S) {
>     return set.transform(this);
> }
>
> For an AA, there would be another constraint that the type of elements
> of the index set match those of the AA keys, of course. Other containers
> may have other constraints.
>
> An index set may or may not be iterable, but it should always supply
> functionality to check if an index is contained in it.
>
> With this framework laid out, we can define these operations on arrays,
> and have any array be sliceable by an array of integral elements:
>
> assert(['a','b','c'][[0,2]] == ['a', 'c']);

I don't think I really understand this idea of an index set.

-- 
/Jacob Carlborg

April 04, 2012

Re: Questions about the slice operator

Posted by Simen Kjærås
in reply to Jacob Carlborg

On Wed, 04 Apr 2012 15:29:58 +0200, Jacob Carlborg <doob@me.com> wrote:

> On 2012-04-04 15:01, Simen Kjærås wrote:
>
>> Actually, I've thought a little about this. And apart from the tiny
>> idiosyncrasy of $, a..b as a more regular type can bring some
>> interesting enhancements to the language.
>>
>> Consider a..b as simply a set of indices, defined by a start point and
>> an end point. A different index set may be [1,2,4,5], or Strided!(3,4).
>>
>> An index set then works as a filter on a range, returning only those
>> elements whose indices are in the set.
>>
>> We can now redefine opIndex to take either a single index or an index
>> set, as follows:
>>
>> auto opIndex(S)(S set) if (isIndexSet!S) {
>>     return set.transform(this);
>> }
>>
>> For an AA, there would be another constraint that the type of elements
>> of the index set match those of the AA keys, of course. Other containers
>> may have other constraints.
>>
>> An index set may or may not be iterable, but it should always supply
>> functionality to check if an index is contained in it.
>>
>> With this framework laid out, we can define these operations on arrays,
>> and have any array be sliceable by an array of integral elements:
>>
>> assert(['a','b','c'][[0,2]] == ['a', 'c']);
>
> I don't think I really understand this idea of an index set.
>

It's quite simple, really - an index set holds indices. For a regular
array of N elements, the index set is [0..N-1]. For an AA, the index set
is all the keys in the AA. Basically, an index set is the set of all
values that will give meaningful results from container[index].

arr[2..4] thus means 'restrict the indices to those between 2 and 4'.
For arrays though, it also translates the array so that what was 2
before, now is 0.

For a T[string] aa, one could imagine the operation aa["a".."c"] to
produce a new AA with only those elements whose keys satisfy
"a" <= key < "c".

As for the example given:

assert(['a','b','c'][[0,2]] == ['a', 'c']);

This means 'grab the elements at position 0 and 2, and put them in
a new array'. Hence, element 0 ('a') and element 2 ('c') are in
the result.
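
The AA case can be approximated today with a plain function. This is only a sketch: keyRange is a hypothetical helper name, not an existing library function, and aa["a".."c"] itself is not valid D.

```d
// Hypothetical helper: keep only the entries with lo <= key < hi.
V[K] keyRange(K, V)(V[K] aa, K lo, K hi)
{
    V[K] result;
    foreach (key, value; aa)
        if (lo <= key && key < hi)
            result[key] = value;
    return result;
}

void main()
{
    int[string] aa = ["apple": 1, "banana": 2, "c": 3, "cherry": 4];

    // Stands in for the imagined aa["a" .. "c"].
    auto sub = aa.keyRange("a", "c");

    assert(sub.length == 2);
    assert(sub["apple"] == 1 && sub["banana"] == 2);
    assert("c" !in sub && "cherry" !in sub);
}
```

Strings compare lexicographically, so "c" and "cherry" both fall outside the half-open interval.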

April 04, 2012

Re: Questions about the slice operator

Posted by Jacob Carlborg
in reply to Simen Kjærås

On 2012-04-04 16:40, Simen Kjærås wrote:
> It's quite simple, really - an index set holds indices. For a regular
> array of N elements, the index set is [0..N-1]. For an AA, the index set
> is all the keys in the AA. Basically, an index set is the set of all
> values that will give meaningful results from container[index].
>
> arr[2..4] thus means 'restrict the indices to those between 2 and 4'.
> For arrays though, it also translates the array so that what was 2
> before, now is 0.
>
> For a T[string] aa, one could imagine the operation aa["a".."c"] to
> produce a new AA with only those elements whose keys satisfy
> "a" <= key < "c".
>
> As for the example given:
>
> assert(['a','b','c'][[0,2]] == ['a', 'c']);
>
> This means 'grab the elements at position 0 and 2, and put them in
> a new array'. Hence, element 0 ('a') and element 2 ('c') are in
> the result.

Ok, now I think I get it.

-- 
/Jacob Carlborg

April 04, 2012

Re: Questions about the slice operator

Posted by Jonathan M Davis
in reply to Jacob Carlborg

On Wednesday, April 04, 2012 12:06:33 Jacob Carlborg wrote:
> On 2012-04-04 04:11, Jonathan M Davis wrote:
> > foreach(i; 0 .. 5)
> > 
> > is more efficient only because it has _nothing_ to do with arrays. Generalizing the syntax wouldn't help at all, and if it were generalized, it would arguably have to be consistent in all of its uses, in which case
> > 
> > foreach(i; 0 .. 5)
> > 
> > would become identical to
> > 
> > foreach(i; [0, 1, 2, 3, 4])
> > 
> > and therefore less efficient. Generalizing .. just doesn't make sense.
> 
> Why couldn't the .. syntax be syntactic sugar for some kind of library-implemented range type, just as is done with associative arrays?
> 
> We could implement a new library type, named "range". Looking something like this:
> 
> struct range
> {
>     size_t start;
>     size_t end;
>     // implement the range interface or opApply
> }
> 
> range r = 1 .. 5;
> 
> The above line would be syntax sugar for:
> 
> range r = range(1, 5);
> 
> void foo (range r)
> {
>     foreach (e ; r) {}
> }
> 
> foo(r);

That might work, but it does make it so that ".." has very different meanings in different contexts, and I don't know that it really buys us much. iota already does the same thing (and with more functionality), just without the syntactic sugar. Also, we've had enough issues with moving AAs into druntime that I don't know how great an idea this sort of thing would be (though it should be much simpler). It would certainly make some folks (e.g. Bearophile) happy though.

> This could then be taken advantage of in other parts of the language:
> 
> class A
> {
>     int opSlice (range r); // new syntax
>     int opSlice (size_t start, size_t end); // old syntax
> }
> 
> I think this would be completely backwards compatible as well.

Except that opSlice already works with "..". What would this buy you? It doesn't make sense to pass opSlice a range normally. Why treat this proposed "range" type any differently from any other range? This functionality already exists with the second declaration there. If we added a range type like this, I'd be inclined to make it __range or somesuch and not ever have its name used explicitly anywhere. It would basically just be syntactic sugar for iota (though it wouldn't use iota specifically). I don't know what else you would be looking to get out of using its type specifically anywhere. That's not generally done with other range types.

- Jonathan M Davis
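
A short sketch of the iota equivalence mentioned above, using std.range.iota and std.algorithm.equal from Phobos. The function name foo mirrors the quoted example; making it a template is an assumption of this sketch.

```d
import std.algorithm : equal;
import std.range : iota;

// Accepts any range of ints, including the result of iota.
int foo(R)(R r)
{
    int sum = 0;
    foreach (e; r)
        sum += e;
    return sum;
}

void main()
{
    // iota(1, 5) lazily yields 1, 2, 3, 4, like foreach (i; 1 .. 5).
    assert(iota(1, 5).equal([1, 2, 3, 4]));

    // Unlike 1 .. 5, the result is a value that can be stored and passed around.
    auto r = iota(1, 5);
    assert(foo(r) == 10);

    // The extra functionality: an optional step.
    assert(iota(0, 10, 3).equal([0, 3, 6, 9]));
}
```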

April 04, 2012

Re: Questions about the slice operator

Posted by Jonathan M Davis
in reply to Simen Kjærås

On Wednesday, April 04, 2012 14:37:54 Simen Kjærås wrote:
> On Wed, 04 Apr 2012 14:21:01 +0200, Jacob Carlborg <doob@me.com> wrote:
> > On 2012-04-04 14:16, Simen Kjærås wrote:
> >> And what do we do with 3..$?
> > 
> > Hmm, that's a good point. The best I can think of for now is to translate that to:
> > 
> > range(3, size_t.max)
> > 
> > Or something like:
> > 
> > struct range
> > {
> >     size_t start;
> >     size_t end;
> >     bool dollar; // better name is needed
> > }
> > 
> > range(3, 0, true)
> 
> Not enough:
> 
> $-3..$-2
> 
> This is a hard and unpleasant one, unless we go with $ being defined as the length of the array we're slicing, and only valid inside a slice operation. (and of course some opDollar or the like for other containers)

I believe that we have opDollar already but that it's buggy.

http://d.puremagic.com/issues/show_bug.cgi?id=7097
http://d.puremagic.com/issues/show_bug.cgi?id=7520

Several types in Phobos already have opDollar (generally an alias for length, it seems).

- Jonathan M Davis
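
A minimal sketch of the opDollar-as-alias-for-length pattern on a user-defined type. The Stack type and its contents are hypothetical, and the example assumes a compiler in which the opDollar bugs linked above are fixed.

```d
struct Stack
{
    int[] items;

    @property size_t length() { return items.length; }

    // $ inside the brackets is rewritten to a call to opDollar.
    alias opDollar = length;

    int opIndex(size_t i) { return items[i]; }

    int[] opSlice(size_t start, size_t end) { return items[start .. end]; }
}

void main()
{
    auto s = Stack([1, 2, 3, 4, 5]);

    assert(s[$ - 1] == 5);
    assert(s[3 .. $] == [4, 5]);
    assert(s[$ - 3 .. $ - 2] == [3]);
}
```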

April 04, 2012

Re: Questions about the slice operator

Posted by Jacob Carlborg
in reply to Jonathan M Davis

On 2012-04-04 19:09, Jonathan M Davis wrote:

> That might work, but it does make it so that ".." has very different meanings
> in different contexts, and I don't know that it really buys us much. iota
> already does the same thing (and with more functionality), just without the
> syntactic sugar. Also, we've had enough issues with moving AA's into druntime,
> that I don't know how great an idea this sort of thing would be (though it
> should be much simpler). It would certainly make some folks (e.g. Bearophile)
> happy though.

Yeah, we don't have to stop any releases for this. It's one of those features in the language that is not very consistent, and sometimes that's just a bit annoying.

>> This could then be taken advantage of in other parts of the language:
>>
>> class A
>> {
>>     int opSlice (range r); // new syntax
>>     int opSlice (size_t start, size_t end); // old syntax
>> }
>>
>> I think this would be completely backwards compatible as well.
>
> Except that opSlice already works with "..". What would this buy you?

Nothing, but that's what the language could have looked like if a first-class range type had been added to it a long time ago.

> It doesn't make sense to pass opSlice a range normally. Why treat this proposed
> "range" type any differently from any other range? This functionality already
> exists with the second declaration there. If we added a range type like this,
> I'd be inclined to make it __range or somesuch and not ever have its name used
> explicitly anywhere. It would basically just be syntactic sugar for iota
> (though it wouldn't use iota specifically). I don't know what else you would be
> looking to get out of using its type specifically anywhere. That's not generally
> done with other range types.

In this case "range" is just the start and end of a list of numbers. Maybe "range" is not a good name, since it conflicts with the concept of ranges.

No, instead we use templates like mad.

-- 
/Jacob Carlborg

April 05, 2012

Re: Questions about the slice operator

Posted by Christophe
in reply to Jonathan M Davis

"Jonathan M Davis", in message (digitalmars.D.learn:34243), wrote:
> Except that opSlice already works with "..". What would this buy you?

Having a specific range for a .. operator allows you to have them as parameters of any function.

For example, this could be nice for multidimensional slicing:
Matrix!(double, 6, 6) A;
auto partOfA = A[1..3, 4..6];

Operations on several items of a container:
Container B;
B.remove(4..9); // remove 5 contiguous elements.

etc.
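
For comparison, a sketch of multidimensional slicing using the templated opSlice!dim / opDollar!dim overloads described in the current language specification, which postdate this thread. The Matrix type here is a simplified, hypothetical stand-in for Matrix!(double, 6, 6): it stores its size at run time and copies the selected block instead of returning a view.

```d
struct Matrix
{
    size_t rows, cols;
    double[] data; // row-major storage

    // Element access: m[r, c].
    double opIndex(size_t r, size_t c) { return data[r * cols + c]; }

    // a .. b in dimension dim becomes a [start, end) pair.
    size_t[2] opSlice(size_t dim)(size_t start, size_t end)
    {
        return [start, end];
    }

    // $ in dimension dim: number of rows or columns.
    size_t opDollar(size_t dim)() { return dim == 0 ? rows : cols; }

    // m[a .. b, c .. d]: copy the selected block into a new matrix.
    Matrix opIndex(size_t[2] r, size_t[2] c)
    {
        auto result = Matrix(r[1] - r[0], c[1] - c[0]);
        foreach (i; r[0] .. r[1])
            foreach (j; c[0] .. c[1])
                result.data ~= this[i, j];
        return result;
    }
}

void main()
{
    auto a = Matrix(6, 6);
    foreach (i; 0 .. 36)
        a.data ~= cast(double) i; // element (r, c) holds r * 6 + c

    auto partOfA = a[1 .. 3, 4 .. $];
    assert(partOfA.rows == 2 && partOfA.cols == 2);
    assert(partOfA.data == [10.0, 11.0, 16.0, 17.0]);
}
```

Here each a .. b is lowered to an ordinary value (a two-element array) before opIndex sees it, which is close in spirit to passing a range object as a parameter.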

April 05, 2012

Re: Questions about the slice operator

Posted by Timon Gehr
in reply to Jacob Carlborg

On 04/04/2012 12:06 PM, Jacob Carlborg wrote:
> On 2012-04-04 04:11, Jonathan M Davis wrote:
>
>> foreach(i; 0 .. 5)
>>
>> is more efficient only because it has _nothing_ to do with arrays.
>> Generalizing
>> the syntax wouldn't help at all, and if it were generalized, it would
>> arguably
>> have to be consistent in all of its uses, in which case
>>
>> foreach(i; 0 .. 5)
>>
>> would become identical to
>>
>> foreach(i; [0, 1, 2, 3, 4])
>>
>> and therefore less efficient. Generalizing .. just doesn't make sense.
>
> Why couldn't the .. syntax be syntactic sugar for some kind of
> library-implemented range type, just as is done with associative arrays?
>
> ...
>
> I think this would be completely backwards compatible as well.
>

It would be awkward to introduce it in a backwards compatible way, because currently '..' binds weaker than any operator.

auto x = 0..10; // ok
auto y = 0..10, z = 2; // error, z not defined
x = 0..11; // error: expression '11' has no effect

Copyright © 1999-2021 by the D Language Foundation