Thread overview
Status of multidimensional slicing
Mar 07, 2014
Jared Miller
Mar 07, 2014
bearophile
Mar 07, 2014
H. S. Teoh
Mar 10, 2014
Jared Miller
Mar 12, 2014
Kenji Hara
Mar 14, 2014
Mason McGill
Mar 14, 2014
bearophile
Mar 14, 2014
H. S. Teoh
Mar 14, 2014
Mason McGill
Mar 08, 2014
Brad Roberts
March 07, 2014
I would like to revisit the topic of operator overloads for multidimensional slicing.

Bottom line: opSlice is currently limited to 1 dimension/axis only. The cleanest workaround right now is to pass your own "slice" structs to opIndex. It works but it's not too pretty.

----
// Suppose we have a user-defined type...
auto mat = Matrix(
    [ [0,1,2],
      [3,4,5],
      [6,7,8] ] );

// This type of indexing can be implemented:
auto cell = mat[1, $-1];

// But multidimensional slicing cannot:
// auto submatrix = mat[0..2, 1..$];

// "Cleanest" workaround with a slice struct S taken by opIndex
//  (no $ capability):
auto submatrix = mat[ S(0,2), S(1,3) ];

// With a bit more hacking, something like this could be done:
auto submatrix = mat[ S[0..2], S[1..$] ];
----
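For concreteness, the slice-struct workaround above could be sketched roughly as follows. This is only an illustration under stated assumptions: `Matrix`, `S`, the row-major layout, and the copy-on-slice behaviour are all hypothetical choices, not code from any existing library.

```d
// Hypothetical slice struct: a half-open interval [lo, hi).
struct S
{
    size_t lo, hi;
}

// Hypothetical matrix type with row-major storage.
struct Matrix
{
    int[] data;
    size_t rows, cols;

    this(int[][] rowsIn)
    {
        rows = rowsIn.length;
        cols = rows ? rowsIn[0].length : 0;
        foreach (row; rowsIn)
        {
            assert(row.length == cols, "ragged input");
            data ~= row;
        }
    }

    // `$` per dimension: rows in position 0, cols in position 1.
    size_t opDollar(size_t dim)()
    {
        static if (dim == 0) return rows;
        else return cols;
    }

    // mat[i, j]: single-element access.
    int opIndex(size_t i, size_t j)
    {
        return data[i * cols + j];
    }

    // mat[S(a, b), S(c, d)]: copies the selected block into a new Matrix.
    Matrix opIndex(S r, S c)
    {
        Matrix result;
        result.rows = r.hi - r.lo;
        result.cols = c.hi - c.lo;
        foreach (i; r.lo .. r.hi)
            result.data ~= data[i * cols + c.lo .. i * cols + c.hi];
        return result;
    }
}

unittest
{
    auto mat = Matrix([[0, 1, 2],
                       [3, 4, 5],
                       [6, 7, 8]]);
    assert(mat[1, $ - 1] == 5);
    auto sub = mat[S(0, 2), S(1, 3)];
    assert(sub.rows == 2 && sub.cols == 2);
    assert(sub.data == [1, 2, 4, 5]);
}
```

The two `opIndex` overloads are what make this workable today: the compiler only needs to pick an overload by argument type, so no slicing syntax is involved at all.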

Problems with the current state of affairs, and the rationale for a fix:

* A stated design goal of D is to "Cater to the needs of numerical analysis programmers", and presumably HPC / scientific computing that's heavy on linear algebra and n-dimensional arrays. Well, it seems like the multidimensional slice/stride syntax in Matlab, NumPy, and even Fortran has been pretty popular with these folks. Syntactic sugar here is a clear win. I don't think it's a niche feature.
* The limitation on slicing is inconsistent with the capabilities of opIndex and opDollar, and workarounds are ugly.
* The approved DIP#7 briefly mentions multidimensional slicing but it was never implemented (despite opDollar getting done).

Recap of discussions so far:

* 2009-10-10: DIP7 (http://www.prowiki.org/wiki4d/wiki.cgi?LanguageDevel/DIPs/DIP7)
* 2010-03-08: "Proposal: Multidimensional opSlice solution" (http://forum.dlang.org/thread/hn2q9q$263e$1@digitalmars.com)
* 2011-10-09: "Issue #6798: Integrate overloadings for multidimensional indexing and slicing"
* 2012-06-01: "[Proposal] Additional operator overloadings for multidimentional indexing and slicing" (http://forum.dlang.org/thread/mailman.1202.1338515967.24740.digitalmars-d@puremagic.com)
* 2012-11-19: "Multidimensional array operator overloading" (http://forum.dlang.org/thread/mailman.2065.1353348152.5162.digitalmars-d@puremagic.com)
* 2012-12-19: "Multidimensional slice" (http://forum.dlang.org/thread/lglljlnzoathjxijomrn@forum.dlang.org)
* 2013-04-06: "rationale for opSlice, opSliceAssign, vs a..b being syntax suger for a Slice struct?" (http://forum.dlang.org/thread/mailman.551.1365290408.4724.digitalmars-d-learn@puremagic.com)
* 2013-05-12: Andrei asks for feedback on Kenji's 2011 pull request for #6798
* 2013-10-11: "std.linalg" (http://forum.dlang.org/thread/rmyaglfeimzuggoluxvd@forum.dlang.org)

Steps forward:

So I basically want to resurrect the topic and gauge support for fixing slice overloads. Then, core committers could revisit Kenji's 2011 proposal and pull request for issue #6798 as a very solid near-term solution. Finally, perhaps a DIP for stride syntax/overloads?

Looking forward to discussion.
March 07, 2014
Jared Miller:

> Looking forward to discussion.

D needs to offer a nice syntax for user defined multidimensional slicing.

Bye,
bearophile
March 07, 2014
On Fri, Mar 07, 2014 at 09:23:30PM +0000, bearophile wrote:
> Jared Miller:
> 
> >Looking forward to discussion.
> 
> D needs to offer a nice syntax for user defined multidimensional slicing.
[...]

+1.  I fully support Kenji's pull to extend the language in that direction.

I'm a bit sad that Walter is pushing for a large breaking change to D string handling, while Kenji's pull, which is a non-breaking enhancement that would lead to much better D support for many numerical computation applications, has been stagnating for at least a year (probably more).


T

-- 
Unix is my IDE. -- Justin Whear
March 08, 2014
On 3/7/14, 2:30 PM, H. S. Teoh wrote:
> On Fri, Mar 07, 2014 at 09:23:30PM +0000, bearophile wrote:
>> Jared Miller:
>>
>>> Looking forward to discussion.
>>
>> D needs to offer a nice syntax for user defined multidimensional
>> slicing.
> [...]
>
> +1.  I fully support Kenji's pull to extend the language in that
> direction.
>
> I'm a bit sad that Walter is pushing for a large breaking change to D
> string handling, while Kenji's pull, which is a non-breaking enhancement
> that would lead to much better D support for many numerical computation
> applications, has been stagnating for at least a year (probably more).

I agree and sympathize.

Andrei


March 08, 2014
On 3/7/2014 2:30 PM, H. S. Teoh wrote:
> I'm a bit sad that Walter is pushing for a large breaking change to D
> string handling, while Kenji's pull, which is a non-breaking enhancement
> that would lead to much better D support for many numerical computation
> applications, has been stagnating for at least a year (probably more).

You expressed this as if there's actual correlation or causation between the two when it's highly unlikely any exists.  He's doing exactly what many many others do: express concern about a problem encountered during recent use of some aspect of the D ecosystem.

It's an unfortunate but true aspect of the rate of D development combined with the relatively small community: old pulls get lost in the noise.  For pulls to get attention, the author or proponents of a pull need to keep it alive.  The rate of application of pulls (regardless of age) isn't bad, but when combined with the influx rate of new pull requests it's just not high enough to clear the backlog.
March 10, 2014
So are there any significant objections to Kenji's PR?

I think it's got a lot of things going for it, particularly in finishing the job begun by DIP#7 and opDollar. I realize it's not likely to be a top priority for most people, but it's got a lot of bang for your buck: a great benefit to an important subset of users for relatively little effort.

I'd love to see it on the official agenda for release this year. Is this the right location: http://wiki.dlang.org/Agenda, and is anybody welcome to offer edits?

Jared

On Saturday, 8 March 2014 at 01:24:50 UTC, Andrei Alexandrescu wrote:

>
> I agree and sympathize.
>
> Andrei

March 12, 2014
2014-03-08 10:24 GMT+09:00 Andrei Alexandrescu < SeeWebsiteForEmail@erdani.org>:

>
> I agree and sympathize.


I have finished updating my pull request #443. It is now active.

Kenji Hara


March 14, 2014
Hi all,

I think D has a lot to offer technical computing:
  - the speed and modeling power of C++
  - GC for clean API design
  - reflection for automatic bindings
And technical computing has a lot to offer D:
  - users
  - API writers
  - time in the minds of people who teach

Multidimensional array support is important for this exchange to happen, so as a D user and a computer vision researcher I'm glad to see it's being addressed! However, I'm interested in hearing more about the rationale for the design decisions made concerning pull request #443.  My concern is that this design may be ignoring some of the lessons the SciPy community has learned over the past 10+ years.  A bit of elaboration:

In Python, slicing and indexing were originally separate operations.  Custom containers would have to define both `__getitem__(self, key)` and `__getslice__(self, start, end)`.  This is where D is now.  Python then deprecated `__getslice__` and decided `container[start:end]` should translate to `container[slice(start, end)]`: the slicing syntax just became sugar for creating a lightweight slice object (i.e. a "range literal"), but it only worked inside an index expression.  If I understand correctly, this is similar in spirit to the solution the D community seems to be converging upon.  This solution enables multidimensional slicing, but needlessly prohibits the construction of range literals outside of an index expression.
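A minimal sketch of this "slice syntax as sugar for a slice object" idea in D, assuming the lowering proposed for issue #6798, in which `a .. b` in argument position `dim` is rewritten to `opSlice!dim(a, b)` and the results are then passed to `opIndex`. `Interval`, `Extent`, and `Grid` are hypothetical names, and the code only compiles with a compiler that implements that lowering.

```d
// Hypothetical lightweight slice object: a half-open interval [lo, hi).
struct Interval
{
    size_t lo, hi;
}

// Hypothetical result type: the shape of the selected region.
struct Extent
{
    size_t rows, cols;
}

struct Grid
{
    size_t rows, cols;

    // `$` in argument position `dim`.
    size_t opDollar(size_t dim)()
    {
        return dim == 0 ? rows : cols;
    }

    // `a .. b` in argument position `dim` becomes opSlice!dim(a, b).
    Interval opSlice(size_t dim)(size_t a, size_t b)
    {
        return Interval(a, b);
    }

    // g[a .. b, c .. d]: one Interval per axis.
    Extent opIndex(Interval r, Interval c)
    {
        return Extent(r.hi - r.lo, c.hi - c.lo);
    }

    // g[i, c .. d]: a single row index combined with a column slice.
    Extent opIndex(size_t row, Interval c)
    {
        return Extent(1, c.hi - c.lo);
    }
}

unittest
{
    auto g = Grid(3, 4);
    auto e = g[0 .. 2, 1 .. $];
    assert(e.rows == 2 && e.cols == 3);
    auto f = g[1, 0 .. $];
    assert(f.rows == 1 && f.cols == 4);
}
```

As in Python, the `Interval` values here exist only as arguments produced inside the index expression; nothing in this sketch lets `0 .. 2` be written on its own.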

So, why is this important?  One point of view is that multidimensional slicing is just one of many use cases for a concise representation of a range of numbers.  In more "specialized" scientific languages, like MATLAB/Octave and Julia, range literals are a critical component to readable, idiomatic code.  In order to partially make up for this, SciPy is forced to subvert Python's indexing syntax for calling functions that may operate on numeric ranges, obfuscating code (e.g. http://docs.scipy.org/doc/numpy/reference/generated/numpy.r_.html).

I point this out because it (fortunately) seems like D is in a position to have range literals while maintaining backwards compatibility and reducing language complexity (details are below).  I'd like to hear your thoughts about range literals as a solution for multidimensional indexing: whether it's been proposed, if so, why it was decided against, what its disadvantages might be, whether they're compatible with the work already done on this front, etc.

===================
Range Literals in D
===================

// Right now this works:
foreach (i; 0..10) doScience(i);

// And this works:
auto range = iota(0, 10);
foreach (i; range) doScience(i);

// So why shouldn't this work?
auto range = 0..10;
foreach (i; range) doScience(i);

// Or this?
auto range = 0..10;
myFavoriteArray[range] = fascinatingFindings();

// Or this?
auto range = 0..10;
myFavoriteMatrix[0..$, range] = fascinating2DFindings();

// `opSlice` would no longer be necessary...
myMap["key"];       // calls `opIndex(string);`
myVector[5];        // calls `opIndex(int);`
myMatrix[5, 0..10]; // calls `opIndex(int, NumericRange);`

// But old code that defines `opSlice` could still work (like in Python).
myVector[0..10]; // If `opIndex(NumericRange)` isn't defined,
                 // fall back to `opSlice`.

// `ForeachRangeStatement` would no longer need to exist as an odd special case.
// The following two statements are semantically equivalent, and with range
// literals, they'd be instances of the same looping syntax.
foreach (i; 0..10) doScience(i);
foreach (i; iota(0, 10)) doScience(i);

// Compilers would, of course, be free to special-case `foreach` loops
// over range literals, if it's helpful for performance.

On Wednesday, 12 March 2014 at 13:55:05 UTC, Kenji Hara wrote:
> 2014-03-08 10:24 GMT+09:00 Andrei Alexandrescu <
> SeeWebsiteForEmail@erdani.org>:
>
>>
>> I agree and sympathize.
>
>
> I have finished updating my pull request #443. It is now active.
>
> Kenji Hara

March 14, 2014
Mason McGill:

> My concern is that this design may be ignoring some of the lessons the SciPy community has learned over the past 10+ years.

Thank you for your help. An injection of experience is quite important here. Julia is far newer than D, and yet it already has a better design and a more refined implementation of several things related to numerical computing.


> // So why shouldn't this work?
> auto range = 0..10;
> foreach (i; range) doScience(i);

People suggested this a long time ago, again and again. So I put that question to Walter.

Bye,
bearophile
March 14, 2014
On Fri, Mar 14, 2014 at 12:29:34PM +0000, bearophile wrote:
> Mason McGill:
> 
> >My concern is that this design may be ignoring some of the lessons the SciPy community has learned over the past 10+ years.
> 
> Thank you for your help. An injection of experience is quite important here. Julia is far newer than D, and yet it already has a better design and a more refined implementation of several things related to numerical computing.
> 
> 
> >// So why shouldn't this work?
> >auto range = 0..10;
> >foreach (i; range) doScience(i);
> 
> People suggested this a long time ago, again and again. So I put that question to Walter.
[...]

Replace the first line with:

	auto range = iota(0, 10);

and it will work. It's not *that* hard to learn, is it?
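A short, self-contained illustration of that substitution, using only `std.range.iota` and `std.algorithm`. The helper name `squares` is made up for this example.

```d
import std.algorithm : equal, map;
import std.range : iota;

// iota(lo, hi) yields lo, lo+1, ..., hi-1 as an ordinary range value,
// so it can be returned, stored, and composed with other range functions.
auto squares(int lo, int hi)
{
    return iota(lo, hi).map!(i => i * i);
}

unittest
{
    auto range = iota(0, 10);   // storable, unlike a bare `0..10`
    int sum = 0;
    foreach (i; range) sum += i;
    assert(sum == 45);
    assert(range.length == 10);
    assert(squares(0, 4).equal([0, 1, 4, 9]));
}
```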


T

-- 
Klein bottle for rent ... inquire within. -- Stephen Mulraney