November 22, 2012
On Wednesday, 21 November 2012 at 19:40:25 UTC, Mike Wey wrote:
> If you want to use this syntax with images, DMagick's ImageView might be interesting:
> http://dmagick.mikewey.eu/docs/ImageView.html

I like it :)
From what I can see, it provides exactly what I'm talking about for 2D. I haven't looked at the implementation in detail, but do you think such an approach could be scaled up to arbitrary N-dimensional arrays?
November 22, 2012
On Thursday, 22 November 2012 at 11:25:31 UTC, John Colvin wrote:
>
> Anyway, this is a pretty trivial matter, I'd be more interested in seeing a definitive answer for what the correct behaviour for the statement a[] = b[] + c[] is when the arrays have different lengths.

I'd say the same as for "a[] += b[];": an assertion error.

You have to compile druntime in non-release mode to see it, though.
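
For reference, a minimal repro of what I mean (just a sketch; whether you actually see the error depends on how your druntime was built):

void main()
{
    auto a = new int[3];
    int[] b = [1, 2, 3, 4];
    int[] c = [5, 6, 7, 8];

    // lengths don't match (3 vs 4): with a non-release druntime the
    // runtime check fires here; with a release druntime it is skipped.
    a[] = b[] + c[];
}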
November 22, 2012
On Thu, 22 Nov 2012 06:10:04 -0600, John Colvin <john.loughran.colvin@gmail.com> wrote:

> On Wednesday, 21 November 2012 at 19:40:25 UTC, Mike Wey wrote:
>> If you want to use this syntax with images, DMagick's ImageView might be interesting:
>> http://dmagick.mikewey.eu/docs/ImageView.html
>
> I like it :)
>  From what I can see it provides exactly what i'm talking about for 2D. I haven't looked at the implementation in detail, but do you think that such an approach could be scaled up to arbitrary N-dimensional arrays?

Yes and no. Basically, like an array, an ImageView is a thick pointer, and as the dimensions increase the pointer gets thicker by 1-2 words per dimension. With this framework, each indexing or slicing operation also has to create a temporary, which leads to stack churn as the number of dimensions gets large. Another syntax that can be used until we get true multi-dimensional slicing is opIndex with int[2] arguments, i.e. view[[4,40],[5,50]] = new Color("red");
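
For illustration, roughly how that int[2] overload might look (a hypothetical View2D type, not DMagick's actual ImageView code):

struct View2D(T)
{
    T[] data;
    size_t width;

    // enables: view[[rowLo, rowHi], [colLo, colHi]] = value;
    void opIndexAssign(T value, int[2] rows, int[2] cols)
    {
        foreach (r; rows[0] .. rows[1])
            data[r * width + cols[0] .. r * width + cols[1]][] = value;
    }
}

// usage, analogous to the example above:
// auto view = View2D!int(new int[100 * 100], 100);
// view[[4, 40], [5, 50]] = 1;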
November 22, 2012
On 11/22/2012 3:25 AM, John Colvin wrote:
> Anyway, this is a pretty trivial matter, I'd be more interested in seeing a
> definitive answer for what the correct behaviour for the statement a[] = b[] +
> c[] is when the arrays have different lengths.

An error.
November 22, 2012
On 11/22/2012 3:25 AM, John Colvin wrote:
> c[] = a[] + b[];
> fast, in place array operation, the cost of allocation happens earlier in the code.
>
> but also
> c = a[] + b[];
> a much slower, memory assigning array operation, pretty much just shorthand for
> c = new T[a.length];
> c[] = a[] + b[];
>
> You could argue that hiding an allocation is bad, but I would think it's quite
> obvious to any programmer that if you add 2 arrays together, you're going to
> have to create somewhere to put them... Having the shorthand prevents any
> possible mistakes with the length of the new array and saves a line of code.

I'll be bold and predict what will happen if this proposal is implemented:

    "Array operations in D are cool but are incredibly slow. D sux."

Few will notice that the hidden memory allocation can be easily removed, certainly not people casually looking at D to see if they should use it, and the damage will be done.
November 22, 2012
On 11/22/2012 01:10 PM, John Colvin wrote:
> On Wednesday, 21 November 2012 at 19:40:25 UTC, Mike Wey wrote:
>> If you want to use this syntax with images, DMagick's ImageView might
>> be interesting:
>> http://dmagick.mikewey.eu/docs/ImageView.html
>
> I like it :)
>  From what I can see it provides exactly what i'm talking about for 2D.
> I haven't looked at the implementation in detail, but do you think that
> such an approach could be scaled up to arbitrary N-dimensional arrays?

Every dimension has its own type, so it won't scale well to a large number of dimensions. And when slicing, every dimension would create a temporary.
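
Roughly the shape of it (just a sketch of the pattern, not DMagick's actual code): each extra dimension needs its own view struct, and every indexing step hands back a fresh temporary.

struct View2(T)
{
    T[] data;
    size_t rows, cols;

    // indexing returns a row slice, i.e. a temporary
    T[] opIndex(size_t r)
    {
        return data[r * cols .. (r + 1) * cols];
    }
}

struct View3(T)
{
    T[] data;
    size_t planes, rows, cols;

    // each index peels one dimension off and builds a new temporary view
    View2!T opIndex(size_t p)
    {
        return View2!T(data[p * rows * cols .. (p + 1) * rows * cols], rows, cols);
    }
}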

-- 
Mike Wey
November 22, 2012
11/23/2012 1:02 AM, Walter Bright wrote:

>
> I'll be bold and predict what will happen if this proposal is implemented:
>
>      "Array operations in D are cool but are incredibly slow. D sux."
>
> Few will notice that the hidden memory allocation can be easily removed,
> certainly not people casually looking at D to see if they should use it,
> and the damage will be done.

Expanding on it and adding more serious reasoning.

Array ops are supposed to be overhead-free loops that transparently leverage the SIMD parallelism of modern CPUs. No more and no less. It's like auto-vectorization, but guaranteed and obvious in form.

Now, if array ops checked for matching lengths, that would slow them down. And since they're a built-in, it's something you couldn't turn off even when you know the lengths match. Ditto for checking whether the left side is already allocated and allocating if not (which is even worse).

Basically, you can't build the fastest primitive on top of something wrapped in safeguards. Going the other way around is easy, for example by defining a special wrapper type with custom opSlice, opSliceAssign, etc. that does the checks.
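
Something along these lines, sketched as a free function with a made-up name (the real thing would be a proper wrapper type as described above):

// checks layered on top of the raw built-in op
void checkedAdd(T)(T[] dst, const(T)[] a, const(T)[] b)
{
    assert(dst.length == a.length && a.length == b.length,
           "array length mismatch");
    dst[] = a[] + b[];   // the unchecked built-in op does the real work
}

// usage:
// auto c = new int[3];
// checkedAdd(c, [1, 2, 3], [4, 5, 6]);   // c == [5, 7, 9]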


-- 
Dmitry Olshansky
November 23, 2012
On Thursday, 22 November 2012 at 21:37:19 UTC, Dmitry Olshansky wrote:
> Array ops supposed to be overhead-free loops transparently leveraging SIMD parallelism of modern CPUs. No more and no less. It's like auto-vectorization but it's guaranteed and obvious in the form.

I disagree that array ops are only for speed.
I would argue that their primary significance lies in their ability to make code significantly more readable, and more importantly, writeable. For example, the vector distance between 2 position vectors can be written as:
dv[] = v2[] - v1[]
or
dv = v2[] - v1[]
anyone with an understanding of mathematical vectors instantly understands the general intent of the code.
With documentation something vaguely like this:
"An array is a reference to a chunk of memory that contains a list of data, all of the same type. v[] means the set of elements in the array, while v on it's own refers to just the reference. Operations on sets of elements e.g. dv[] = v2[] - v1[] work element-wise along the arrays {insert mathematical notation and picture of 3 arrays as columns next to each other etc.}.
Array operations can be very fast, as they are sometimes lowered directly to cpu vector instructions. However, be aware of situations where a new array has to be created implicitly, e.g. dv = v2[] - v1[]; Let's look at what this really means: we are asking for dv to be set to refer to the vector difference between v2 and v1. Note we said nothing about the current elements of dv, it might not even have any! This means we need to put the result of v2[] - v1] in a new chunk of memory, which we then set dv to refer to. Allocating new memory takes time, potentially taking a lot longer than the array operation itself, so if you can, avoid it!",
anyone with the most basic programming and mathematical knowledge can write concise code operating on arrays, taking advantage of the potential speedups while being aware of the pitfalls.
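
To make the distinction concrete (dv = v2[] - v1[] is the syntax being proposed, so the sketch below writes out its lowering by hand):

void main()
{
    double[] v1 = [1.0, 2.0, 3.0];
    double[] v2 = [4.0, 6.0, 8.0];

    // in-place: dv already has matching storage, no allocation happens here
    auto dv = new double[v1.length];
    dv[] = v2[] - v1[];

    // the proposed  dv2 = v2[] - v1[];  would lower to roughly this:
    auto dv2 = new double[v1.length];
    dv2[] = v2[] - v1[];

    assert(dv == [3.0, 4.0, 5.0] && dv == dv2);
}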

In short:
Vector syntax/array ops is/are great. Concise code that's easy to read and write. They fulfill one of the guiding principles of D: the most obvious code is fast and safe (or if not 100% safe, at least not too error-prone).
More vector syntax capabilities please!
November 23, 2012
On Thursday, 22 November 2012 at 20:58:25 UTC, Walter Bright wrote:
> On 11/22/2012 3:25 AM, John Colvin wrote:
>> Anyway, this is a pretty trivial matter, I'd be more interested in seeing a
>> definitive answer for what the correct behaviour for the statement a[] = b[] +
>> c[] is when the arrays have different lengths.
>
> An error.

Is monarch_dodra correct in saying that one would have to compile druntime in non-release mode to see this error? That would be a pity; couldn't this be implemented somehow so that it depends on whether the user code, not druntime, is compiled in non-release mode?
November 23, 2012
On Thu, 22 Nov 2012 20:06:44 -0600, John Colvin <john.loughran.colvin@gmail.com> wrote:
> On Thursday, 22 November 2012 at 21:37:19 UTC, Dmitry Olshansky wrote:
>> Array ops supposed to be overhead-free loops transparently leveraging SIMD parallelism of modern CPUs. No more and no less. It's like auto-vectorization but it's guaranteed and obvious in the form.
>
> I disagree that array ops are only for speed.
> I would argue that their primary significance lies in their ability to make code significantly more readable, and more importantly, writeable. For example, the vector distance between 2 position vectors can be written as:
> dv[] = v2[] - v1[]
> or
> dv = v2[] - v1[]
> anyone with an understanding of mathematical vectors instantly understands the general intent of the code.
> With documentation something vaguely like this:
> "An array is a reference to a chunk of memory that contains a list of data, all of the same type. v[] means the set of elements in the array, while v on its own refers to just the reference. Operations on sets of elements e.g. dv[] = v2[] - v1[] work element-wise along the arrays {insert mathematical notation and picture of 3 arrays as columns next to each other etc.}.
> Array operations can be very fast, as they are sometimes lowered directly to CPU vector instructions. However, be aware of situations where a new array has to be created implicitly, e.g. dv = v2[] - v1[]; Let's look at what this really means: we are asking for dv to be set to refer to the vector difference between v2 and v1. Note we said nothing about the current elements of dv; it might not even have any! This means we need to put the result of v2[] - v1[] in a new chunk of memory, which we then set dv to refer to. Allocating new memory takes time, potentially taking a lot longer than the array operation itself, so if you can, avoid it!",
> anyone with the most basic programming and mathematical knowledge can write concise code operating on arrays, taking advantage of the potential speedups while being aware of the pitfalls.
>
> In short:
> Vector syntax/array ops is/are great. Concise code that's easy to read and write. They fulfill one of the guiding principles of D: the most obvious code is fast and safe (or if not 100% safe, at least not too error-prone).
> More vector syntax capabilities please!

I think implicit allocation is a good idea in the case of variable initialization, i.e.:

auto dv = v2[] - v1[];

However, as a general statement, i.e. dv = v2[] - v1[];, it could just as easily be a typo and result in a silent, hard-to-find performance bug.

// An alternative syntax for variable initialization by an array operation expression:
auto dv[] = v2[] - v1[];