Smart slicing - D Programming Language Discussion Forum

Forums

New users
- Learn
Community
- General
- Announce
Improvements
- DIP Ideas
- DIP Devel.
Ecosystem
- GDC
- LDC
- Debuggers
- IDEs
- DWT
Development
- Internals
- Issues
- Beta
- DMD
- Phobos
- Druntime
- Study
Turkish
- Genel
- Duyuru

Index » General » Smart slicing

Thread overview

Smart slicing
Mar 20, 2008 bearophile
Mar 21, 2008 Trevor Parscal
Mar 21, 2008 bearophile
Mar 21, 2008 BCS
Mar 21, 2008 renoX
Mar 21, 2008 Bill Baxter
Mar 22, 2008 Neal Becker
Mar 22, 2008 bearophile
Mar 21, 2008 Janice Caron

March 20, 2008

Posted by bearophile

bearophile

Hello, I have discussed this topic a bit on the IRC #D channel too.

Array slicing is a very useful feature of D. In Python the array (list) slicing is even more powerful, because array indexes can be negative (they wrap around, so -1 equals to $-1 in D, but I like the D solution too, it's faster too), and it has a third optional parameter (stride) that can be a negative number too:

>>> "abcdefghilmn"[2]
'c'
>>> "abcdefghilmn"[2:11]
'cdefghilm'
>>> "abcdefghilmn"[2:]
'cdefghilmn'
>>> "abcdefghilmn"[:2]
'ab'
>>> "abcdefghilmn"[-1]
'n'
>>> "abcdefghilmn"[-2]
'm'
>>> "abcdefghilmn"[2:-2]
'cdefghil'
>>> "abcdefghilmn"[11:2]
''
>>> "abcdefghilmn"[11:2:-1]
'nmlihgfed'
>>> "abcdefghilmn"[11:2:-2]
'nlhfd'
>>> "abcdefghilmn"[2:30]
'cdefghilmn'
>>> "abcdefghilmn"[30:50]
''

(I presume D has used .. instead of : because : is used by AAs, and {} can't be used to denote AAs because it's used already in the C syntax).
I don't miss all those features in D (D arrays have .reverse that replaces the common case of stride=-1), but I miss the feature you can see in the last two examples, that is the overflowing index is put back to the actual max-min index bound of the array.

A bare-bone semantic like that can be implemented in D like this (no negative indexes, no optional stride. But it's easy to add a third overloaded function for the stride too):

T[] slice(T)(T[] array, int start) {
    int end = array.length;
    if (end <= start)
        return new T[0];
    if (start < 0)
        start = 0;
    return array[start .. end];
}

T[] slice(T)(T[] array, int start, int end) {
    if (end <= start)
        return new T[0];
    if (start < 0)
        start = 0;
    if (end > array.length)
        end = array.length;
    return array[start .. end];
}

later you can write:
"abcdefghilmn".slice(2,30)


Some questions:
1) What's the purpose of this smart slicing in the core of the language?
It avoids errors, and speeds up programming a little because you need less care in slicing bounds.
In Python and Ruby you use such feature often.
You can use it to take "50 characters or less" from a string (less if the string is shorter than 50), or to safely produce empty slices in some situations. I know you can do the same with D, for example:

somestring[0 .. $ < 50 ? $ : 50]
Or:
somestring[0 .. min(50, $)]

But it's less immediate, expecially when the 50 is a variable, so you need:

somestring[0 .. min(max(x, 0), $))]

Another small example, replace/create heads of sub-arrays with a given value (not in-place):


import d.func: list, putr, map;

T[][] replace_heads1(T)(T[][] seq, T x) { // Error: ArrayBoundsError
    T[] sub;
    return list([x]~sub[1..$], sub, seq);
}

T[][] replace_heads2(T)(T[][] seq, T x) {
    T[] sub;
    return list(sub.length ? [x]~sub[1..$] : [x], sub, seq);
}

T[][] replace_heads3(T)(T[][] seq, T x) { // smart slicing
    T[] sub;
    return list([x]~sub.slice(1), sub, seq);
}

void main() {
    int[][] l = [[1, 2, 3], [4, 5], [6], []];
    //putr(replace_heads1(l, 0));
    putr(replace_heads2(l, 0)); // Prints: [[0, 2, 3], [0, 5], [0], [0]]
    putr(replace_heads2(l, 0)); // Prints: [[0, 2, 3], [0, 5], [0], [0]]
}

(list() is like a Python list comphrension)

2) 'Smart slicing' can't be used on poiter-based arrays (C ones), because their length isn't known at runtime (but it can be done at compile time on static arrays). This just means you can't rely on smart slicing on such lower-level arrays.

3) Such smart slicing is slower, and I think at compile-time the compiler can't optimize it much. D is designed to be quite faster than Python/Ruby, so such slowdown may be too much.
C++ gives at() and [] to index a a vector, in a safe and unsafe way, so for D it may be invented a different syntax for safe slicing. I think those slice() functions may be enough (and you have to import them from the std lib), when the language will allow to use $ in user functions too. In the meantime the $ hask may be enough.

Bye,
bearophile

March 21, 2008

Re: Smart slicing

Posted by Trevor Parscal
in reply to bearophile

Trevor Parscal

Posted in reply to bearophile

bearophile Wrote:

> Hello, I have discussed this topic a bit on the IRC #D channel too.
> 
> Array slicing is a very useful feature of D. In Python the array (list) slicing is even more powerful, because array indexes can be negative (they wrap around, so -1 equals to $-1 in D, but I like the D solution too, it's faster too), and it has a third optional parameter (stride) that can be a negative number too:
> 
> >>> "abcdefghilmn"[2]
> 'c'
> >>> "abcdefghilmn"[2:11]
> 'cdefghilm'
> >>> "abcdefghilmn"[2:]
> 'cdefghilmn'
> >>> "abcdefghilmn"[:2]
> 'ab'
> >>> "abcdefghilmn"[-1]
> 'n'
> >>> "abcdefghilmn"[-2]
> 'm'
> >>> "abcdefghilmn"[2:-2]
> 'cdefghil'
> >>> "abcdefghilmn"[11:2]
> ''
> >>> "abcdefghilmn"[11:2:-1]
> 'nmlihgfed'
> >>> "abcdefghilmn"[11:2:-2]
> 'nlhfd'
> >>> "abcdefghilmn"[2:30]
> 'cdefghilmn'
> >>> "abcdefghilmn"[30:50]
> ''
> 
> (I presume D has used .. instead of : because : is used by AAs, and {} can't be used to denote AAs because it's used already in the C syntax).
> I don't miss all those features in D (D arrays have .reverse that replaces the common case of stride=-1), but I miss the feature you can see in the last two examples, that is the overflowing index is put back to the actual max-min index bound of the array.
> 
> A bare-bone semantic like that can be implemented in D like this (no negative indexes, no optional stride. But it's easy to add a third overloaded function for the stride too):
> 
> T[] slice(T)(T[] array, int start) {
>     int end = array.length;
>     if (end <= start)
>         return new T[0];
>     if (start < 0)
>         start = 0;
>     return array[start .. end];
> }
> 
> T[] slice(T)(T[] array, int start, int end) {
>     if (end <= start)
>         return new T[0];
>     if (start < 0)
>         start = 0;
>     if (end > array.length)
>         end = array.length;
>     return array[start .. end];
> }
> 
> later you can write:
> "abcdefghilmn".slice(2,30)
> 
> 
> Some questions:
> 1) What's the purpose of this smart slicing in the core of the language?
> It avoids errors, and speeds up programming a little because you need less care in slicing bounds.
> In Python and Ruby you use such feature often.
> You can use it to take "50 characters or less" from a string (less if the string is shorter than 50), or to safely produce empty slices in some situations. I know you can do the same with D, for example:
> 
> somestring[0 .. $ < 50 ? $ : 50]
> Or:
> somestring[0 .. min(50, $)]
> 
> But it's less immediate, expecially when the 50 is a variable, so you need:
> 
> somestring[0 .. min(max(x, 0), $))]
> 
> Another small example, replace/create heads of sub-arrays with a given value (not in-place):
> 
> 
> import d.func: list, putr, map;
> 
> T[][] replace_heads1(T)(T[][] seq, T x) { // Error: ArrayBoundsError
>     T[] sub;
>     return list([x]~sub[1..$], sub, seq);
> }
> 
> T[][] replace_heads2(T)(T[][] seq, T x) {
>     T[] sub;
>     return list(sub.length ? [x]~sub[1..$] : [x], sub, seq);
> }
> 
> T[][] replace_heads3(T)(T[][] seq, T x) { // smart slicing
>     T[] sub;
>     return list([x]~sub.slice(1), sub, seq);
> }
> 
> void main() {
>     int[][] l = [[1, 2, 3], [4, 5], [6], []];
>     //putr(replace_heads1(l, 0));
>     putr(replace_heads2(l, 0)); // Prints: [[0, 2, 3], [0, 5], [0], [0]]
>     putr(replace_heads2(l, 0)); // Prints: [[0, 2, 3], [0, 5], [0], [0]]
> }
> 
> (list() is like a Python list comphrension)
> 
> 2) 'Smart slicing' can't be used on poiter-based arrays (C ones), because their length isn't known at runtime (but it can be done at compile time on static arrays). This just means you can't rely on smart slicing on such lower-level arrays.
> 
> 3) Such smart slicing is slower, and I think at compile-time the compiler can't optimize it much. D is designed to be quite faster than Python/Ruby, so such slowdown may be too much.
> C++ gives at() and [] to index a a vector, in a safe and unsafe way, so for D it may be invented a different syntax for safe slicing. I think those slice() functions may be enough (and you have to import them from the std lib), when the language will allow to use $ in user functions too. In the meantime the $ hask may be enough.
> 
> Bye,
> bearophile

Allowing negetive indexes seems like a nice feature - but in fact acts to hide bugs in software, which is something that D makes efforts not to do.

In other words - a negative index should be wrapped by you, not the compiler - and honestly, it's not hard to say (i % arr.length) in those very seldom cases in which wrapping the index of an array is needed.

- Trevor

March 21, 2008

Re: Smart slicing

Posted by bearophile
in reply to Trevor Parscal

bearophile

Posted in reply to Trevor Parscal

Trevor Parscal:
> Allowing negetive indexes seems like a nice feature - but in fact acts to hide bugs in software, which is something that D makes efforts not to do.

In practice it rarely produces bugs, but note that my whole post was not about negative indexes, it was about out-of-bound arrays (positive) indexes.

Bye,
bearophile

March 21, 2008

Re: Smart slicing

Posted by Bill Baxter
in reply to bearophile

Bill Baxter

Posted in reply to bearophile

bearophile wrote:
> Hello, I have discussed this topic a bit on the IRC #D channel too.
> 
> Array slicing is a very useful feature of D. In Python the array (list) slicing is even more powerful, because array indexes can be negative (they wrap around, so -1 equals to $-1 in D, but I like the D solution too, it's faster too), and it has a third optional parameter (stride) that can be a negative number too:
> 

>>>> "abcdefghilmn"[2:30]
> 'cdefghilmn'
>>>> "abcdefghilmn"[30:50]
> ''
> 
> (I presume D has used .. instead of : because : is used by AAs, and {} can't be used to denote AAs because it's used already in the C syntax).
> I don't miss all those features in D (D arrays have .reverse that replaces the common case of stride=-1), but I miss the feature you can see in the last two examples, that is the overflowing index is put back to the actual max-min index bound of the array.

I had no idea you could do that in Python.  I'd much rather get an error there.  My guess is that most cases where you aren't sure how much to slice out are covered by end-relative indexes, like stuff[30:] or stuff[30:-1].  In D syntax stuff[30..$] and stuff[30..$-1].

For safe-to-go-out-of-bounds slicing like you're talking about, I'd rather implement that as a function call, something like your
  stuff.slice(2,30)
or you can probably get this syntax with current D
  stuff.slice[2..30]

It's not something that I'd want to use most of the time, and when I do, I'd want to make it clear that I'm relying on automatic bounds trimming.

--bb

March 21, 2008

Re: Smart slicing

Posted by BCS
in reply to bearophile

BCS

Posted in reply to bearophile

Reply to bearophile,

> Trevor Parscal:
> 
>> Allowing negetive indexes seems like a nice feature - but in fact
>> acts to hide bugs in software, which is something that D makes
>> efforts not to do.
>> 
> In practice it rarely produces bugs, but note that my whole post was
> not about negative indexes, it was about out-of-bound arrays
> (positive) indexes.
> 

BTW: this "array[i%$];" does both.

> Bye,
> bearophile

March 21, 2008

Re: Smart slicing

Posted by renoX
in reply to BCS

renoX

Posted in reply to BCS

BCS a écrit :
> Reply to bearophile,
> 
>> Trevor Parscal:
>>
>>> Allowing negetive indexes seems like a nice feature - but in fact
>>> acts to hide bugs in software, which is something that D makes
>>> efforts not to do.
>>>
>> In practice it rarely produces bugs, but note that my whole post was
>> not about negative indexes, it was about out-of-bound arrays
>> (positive) indexes.
>>
> 
> BTW: this "array[i%$];" does both.

Sure but it isn't the same semantic for the positive overflow as was exposed by bearophile: 'saturation' vs 'modulo' arithmetic.

Bye,
renoX

> 
>> Bye,
>> bearophile
> 
>

March 21, 2008

Re: Smart slicing

Posted by Janice Caron
in reply to bearophile

Janice Caron

Posted in reply to bearophile

On 20/03/2008, bearophile <bearophileHUGS@lycos.com> wrote:
>  A bare-bone semantic like that can be implemented in D like this

or like this:

    struct Slice(T)
    {
        T[] array;
        T opIndex(int i) {...}
        Slice opSlice(int i, int j) {...}
        Slice stride(int i, int j, int k) {...}
        /*etc*/
    }

That way, you can preserve the D syntax and at the same time allow smart slicing.

March 22, 2008

Re: Smart slicing

Posted by Neal Becker
in reply to Bill Baxter

Neal Becker

Posted in reply to Bill Baxter

Bill Baxter wrote:

> bearophile wrote:
>> Hello, I have discussed this topic a bit on the IRC #D channel too.
>> 
>> Array slicing is a very useful feature of D. In Python the array (list) slicing is even more powerful, because array indexes can be negative (they wrap around, so -1 equals to $-1 in D, but I like the D solution too, it's faster too), and it has a third optional parameter (stride) that can be a negative number too:
>> 
> 
>>>>> "abcdefghilmn"[2:30]
>> 'cdefghilmn'
>>>>> "abcdefghilmn"[30:50]
>> ''
>> 
>> (I presume D has used .. instead of : because : is used by AAs, and {} can't be used to denote AAs because it's used already in the C syntax). I don't miss all those features in D (D arrays have .reverse that replaces the common case of stride=-1), but I miss the feature you can see in the last two examples, that is the overflowing index is put back to the actual max-min index bound of the array.
> 
> I had no idea you could do that in Python.  I'd much rather get an error there.  My guess is that most cases where you aren't sure how much to slice out are covered by end-relative indexes, like stuff[30:] or stuff[30:-1].  In D syntax stuff[30..$] and stuff[30..$-1].
> 
> For safe-to-go-out-of-bounds slicing like you're talking about, I'd
> rather implement that as a function call, something like your
>    stuff.slice(2,30)
> or you can probably get this syntax with current D
>    stuff.slice[2..30]
> 
> It's not something that I'd want to use most of the time, and when I do, I'd want to make it clear that I'm relying on automatic bounds trimming.
> 
> --bb
Please no!

Silently ignoring the error is bad and unpythonic.  When I posted this opinion to python-devel BDFL Guido replied, basically agreeing with me, but saying it was too late to change.  But for D, Please no!

March 22, 2008

Re: Smart slicing

Posted by bearophile
in reply to Neal Becker

bearophile

Posted in reply to Neal Becker

Neal Becker:
> Silently ignoring the error is bad and unpythonic.  When I posted this opinion to python-devel BDFL Guido replied, basically agreeing with me, but saying it was too late to change.  But for D, Please no!

I see. But I don't see them changing this feature, even if they now can do it in Python 3.x.
(I think that D disables such error controls when you add the -release compiler flag. While you can't do that in Python. So the situation is quite different.)

Bye,
bearophile

Top | Forum index | About this forum

Copyright © 1999-2021 by the D Language Foundation