November 22, 2012
Re: Array Operations: a[] + b[] etc.
On Wednesday, 21 November 2012 at 19:40:25 UTC, Mike Wey wrote:
> If you want to use this syntax with images, DMagick's ImageView 
> might be interesting:
> http://dmagick.mikewey.eu/docs/ImageView.html

I like it :)
From what I can see it provides exactly what I'm talking about 
for 2D. I haven't looked at the implementation in detail, but do 
you think that such an approach could be scaled up to arbitrary 
N-dimensional arrays?
November 22, 2012
Re: Array Operations: a[] + b[] etc.
On Thursday, 22 November 2012 at 11:25:31 UTC, John Colvin wrote:
>
> Anyway, this is a pretty trivial matter, I'd be more interested 
> in seeing a definitive answer for what the correct behaviour 
> for the statement a[] = b[] + c[] is when the arrays have 
> different lengths.

I'd say the same as for "a[] += b[];": an assertion error.

You have to compile druntime in non-release mode to see it, though.
November 22, 2012
Re: Array Operations: a[] + b[] etc.
On Thu, 22 Nov 2012 06:10:04 -0600, John Colvin  
<john.loughran.colvin@gmail.com> wrote:

> On Wednesday, 21 November 2012 at 19:40:25 UTC, Mike Wey wrote:
>> If you want to use this syntax with images, DMagick's ImageView might  
>> be interesting:
>> http://dmagick.mikewey.eu/docs/ImageView.html
>
> I like it :)
>  From what I can see it provides exactly what I'm talking about for 2D.  
> I haven't looked at the implementation in detail, but do you think that  
> such an approach could be scaled up to arbitrary N-dimensional arrays?

Yes and no. Basically, like an array, an ImageView is a thick pointer, and
as the dimensions increase the pointer gets thicker by 1-2 words per
dimension. Each indexing or slicing operation has to create a
temporary with this framework, which leads to stack churn as the
number of dimensions gets large. Another syntax that can be used until we get
true multi-dimensional slicing is opIndex with int[2] arguments,
e.g.: view[[4,40],[5,50]] = new Color("red");
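
A minimal sketch of the int[2] indexing idea, for illustration only. The `View` struct, its fields, and the reading of each pair as a half-open [lo, hi) range are assumptions; this is not DMagick's ImageView, and the post does not say how the pairs are interpreted.

```d
import std.stdio : writeln;

// Hypothetical 2D view over row-major storage (not DMagick's ImageView).
struct View(T)
{
    T[] pixels;    // row-major backing store
    size_t width;  // number of elements per row

    // Handles `view[[r0, r1], [c0, c1]] = value`.
    // Assumption: each pair is a half-open range [lo, hi).
    void opIndexAssign(T value, int[2] rows, int[2] cols)
    {
        foreach (r; rows[0] .. rows[1])
        {
            auto row = pixels[r * width .. (r + 1) * width];
            row[cols[0] .. cols[1]] = value; // fill one row segment
        }
    }
}

void main()
{
    auto view = View!int(new int[3 * 4], 4); // 3 rows, 4 columns
    view[[0, 2], [1, 3]] = 9;
    assert(view.pixels == [0, 9, 9, 0,
                           0, 9, 9, 0,
                           0, 0, 0, 0]);
    writeln(view.pixels);
}
```

Only one call is made per assignment here, so no intermediate per-dimension temporaries are created.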
November 22, 2012
Re: Array Operations: a[] + b[] etc.
On 11/22/2012 3:25 AM, John Colvin wrote:
> Anyway, this is a pretty trivial matter, I'd be more interested in seeing a
> definitive answer for what the correct behaviour for the statement a[] = b[] +
> c[] is when the arrays have different lengths.

An error.
November 22, 2012
Re: Array Operations: a[] + b[] etc.
On 11/22/2012 3:25 AM, John Colvin wrote:
> c[] = a[] + b[];
> fast, in place array operation, the cost of allocation happens earlier in the code.
>
> but also
> c = a[] + b[];
> a much slower, memory assigning array operation, pretty much just shorthand for
> c = new T[a.length];
> c[] = a[] + b[];
>
> You could argue that hiding an allocation is bad, but I would think it's quite
> obvious to any programmer that if you add 2 arrays together, you're going to
> have to create somewhere to put them... Having the shorthand prevents any
> possible mistakes with the length of the new array and saves a line of code.

I'll be bold and predict what will happen if this proposal is implemented:

    "Array operations in D are cool but are incredibly slow. D sux."

Few will notice that the hidden memory allocation can be easily removed, 
certainly not people casually looking at D to see if they should use it, and the 
damage will be done.
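
To make the cost difference concrete, here is a small sketch. The helper `addAllocating` is hypothetical and simply spells out what the proposed `c = a[] + b[]` shorthand would do; the second loop shows how the allocation can be hoisted out once it is visible.

```d
import std.stdio : writeln;

// Hypothetical helper: the proposed `c = a[] + b[]`, written out by hand.
double[] addAllocating(const(double)[] a, const(double)[] b)
{
    auto c = new double[a.length]; // the allocation the shorthand would hide
    c[] = a[] + b[];
    return c;
}

void main()
{
    double[] a = [1.0, 2.0, 3.0];
    double[] b = [10.0, 20.0, 30.0];

    // Allocating form: a fresh array on every iteration.
    foreach (i; 0 .. 3)
    {
        auto tmp = addAllocating(a, b);
        assert(tmp == [11.0, 22.0, 33.0]);
    }

    // In-place form: allocate once, then reuse the same memory.
    auto buf = new double[a.length];
    foreach (i; 0 .. 3)
        buf[] = a[] + b[];
    assert(buf == [11.0, 22.0, 33.0]);
    writeln(buf);
}
```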
November 22, 2012
Re: Array Operations: a[] + b[] etc.
On 11/22/2012 01:10 PM, John Colvin wrote:
> On Wednesday, 21 November 2012 at 19:40:25 UTC, Mike Wey wrote:
>> If you want to use this syntax with images, DMagick's ImageView might
>> be interesting:
>> http://dmagick.mikewey.eu/docs/ImageView.html
>
> I like it :)
>  From what I can see it provides exactly what I'm talking about for 2D.
> I haven't looked at the implementation in detail, but do you think that
> such an approach could be scaled up to arbitrary N-dimensional arrays?

Every dimension has its own type, so it won't scale well to a lot of 
dimensions. When slicing, every dimension would create a temporary.

-- 
Mike Wey
November 22, 2012
Re: Array Operations: a[] + b[] etc.
On 11/23/2012 1:02 AM, Walter Bright wrote:

>
> I'll be bold and predict what will happen if this proposal is implemented:
>
>      "Array operations in D are cool but are incredibly slow. D sux."
>
> Few will notice that the hidden memory allocation can be easily removed,
> certainly not people casually looking at D to see if they should use it,
> and the damage will be done.

Expanding on it and adding more serious reasoning.

Array ops are supposed to be overhead-free loops transparently leveraging 
SIMD parallelism of modern CPUs. No more and no less. It's like 
auto-vectorization but it's guaranteed and obvious in the form.

Now if array ops did the checking for matching lengths it would slow 
them down. And that's something you can't turn off when you know the 
lengths match as it's a built-in. Ditto for checking if the left side is 
already allocated and allocating if not (but it's even worse).

Basically, you can't build the fastest primitive on top of something wrapped in 
safeguards. Going the other way around is easy, for example by defining a 
special wrapper type with custom opSlice, opSliceAssign, etc. 
that does the checks.
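
A rough sketch of such a checking layer, under stated assumptions: the wrapper name `Checked` is made up, and it overloads `opIndexOpAssign` (the hook for `w[] += rhs`) rather than the opSlice/opSliceAssign pair mentioned above, purely to keep the example short. The built-in array operation underneath stays untouched.

```d
import std.exception : enforce;
import std.stdio : writeln;

// Hypothetical wrapper; the name `Checked` is an assumption.
struct Checked(T)
{
    T[] data;

    // Handles `w[] += rhs` and `w[] -= rhs`: check lengths first,
    // then defer to the built-in array operation.
    void opIndexOpAssign(string op)(const(T)[] rhs)
        if (op == "+" || op == "-")
    {
        enforce(rhs.length == data.length, "array lengths differ");
        mixin("data[] " ~ op ~ "= rhs[];");
    }
}

void main()
{
    auto w = Checked!int([1, 2, 3]);
    w[] += [10, 20, 30]; // checked, then element-wise
    assert(w.data == [11, 22, 33]);

    bool threw = false;
    try
        w[] -= [1, 2];   // wrong length: rejected by the wrapper
    catch (Exception e)
        threw = true;
    assert(threw);
    writeln(w.data);
}
```

Code that already knows the lengths match can keep using the raw slices and pay nothing for the check.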


-- 
Dmitry Olshansky
November 23, 2012
Re: Array Operations: a[] + b[] etc.
On Thursday, 22 November 2012 at 21:37:19 UTC, Dmitry Olshansky 
wrote:
> Array ops are supposed to be overhead-free loops transparently 
> leveraging SIMD parallelism of modern CPUs. No more and no 
> less. It's like auto-vectorization but it's guaranteed and 
> obvious in the form.

I disagree that array ops are only for speed.
I would argue that their primary significance lies in their 
ability to make code significantly more readable, and more 
importantly, writeable. For example, the vector distance between 
2 position vectors can be written as:
dv[] = v2[] - v1[]
or
dv = v2[] - v1[]
anyone with an understanding of mathematical vectors instantly 
understands the general intent of the code.
With documentation something vaguely like this:
"An array is a reference to a chunk of memory that contains a 
list of data, all of the same type. v[] means the set of elements 
in the array, while v on its own refers to just the reference. 
Operations on sets of elements e.g. dv[] = v2[] - v1[] work 
element-wise along the arrays {insert mathematical notation and 
picture of 3 arrays as columns next to each other etc.}.
Array operations can be very fast, as they are sometimes lowered 
directly to cpu vector instructions. However, be aware of 
situations where a new array has to be created implicitly, e.g. 
dv = v2[] - v1[]; Let's look at what this really means: we are 
asking for dv to be set to refer to the vector difference between 
v2 and v1. Note we said nothing about the current elements of dv, 
it might not even have any! This means we need to put the result 
of v2[] - v1[] in a new chunk of memory, which we then set dv to 
refer to. Allocating new memory takes time, potentially taking a 
lot longer than the array operation itself, so if you can, avoid 
it!",
anyone with the most basic programming and mathematical knowledge 
can write concise code operating on arrays, taking advantage of 
the potential speedups while being aware of the pitfalls.

In short:
Vector syntax/array ops is/are great. Concise code that's easy to 
read and write. They fulfill one of the guiding principles of D: 
the most obvious code is fast and safe (or if not 100% safe, at 
least not too error-prone).
More vector syntax capabilities please!
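
The distinction between `v[]` (the elements) and `v` (the reference) can be illustrated with D as it stands today, where only the in-place form is accepted. This is a sketch with made-up variable names; the implicit-allocation form `dv = v2[] - v1[]` is the proposal under discussion, so it is spelled out by hand here.

```d
import std.stdio : writeln;

void main()
{
    double[] v1 = [1.0, 2.0, 3.0];
    double[] v2 = [4.0, 6.0, 8.0];

    auto dv = new double[v1.length];
    auto view = dv;              // second reference to the same memory

    dv[] = v2[] - v1[];          // writes into the existing elements
    assert(view == [3.0, 4.0, 5.0]); // visible through the other reference

    // Rebinding the reference: the allocation is explicit and visible.
    dv = new double[v1.length];
    dv[] = v1[] - v2[];
    assert(view !is dv);         // dv now refers to different memory
    assert(view == [3.0, 4.0, 5.0]); // the old elements are untouched
    writeln(dv);
}
```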
November 23, 2012
Re: Array Operations: a[] + b[] etc.
On Thursday, 22 November 2012 at 20:58:25 UTC, Walter Bright 
wrote:
> On 11/22/2012 3:25 AM, John Colvin wrote:
>> Anyway, this is a pretty trivial matter, I'd be more 
>> interested in seeing a
>> definitive answer for what the correct behaviour for the 
>> statement a[] = b[] +
>> c[] is when the arrays have different lengths.
>
> An error.

Is monarch_dodra correct in saying that one would have to compile 
druntime in non-release to see this error? That would be a pity. 
Couldn't this be implemented somehow so that it would depend on 
the user code being compiled non-release, not druntime?
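
One way to get a check that follows the user's own build flags, sketched under assumptions: the helper name `addInto` is hypothetical, and the idea relies on a template being compiled as part of the code that instantiates it, so its `assert` is kept or dropped according to how the user code is built rather than how druntime was built.

```d
import std.stdio : writeln;

// Hypothetical helper; the name `addInto` is an assumption.
void addInto(T)(T[] dst, const(T)[] a, const(T)[] b)
{
    // This assert lives in user code, so it is removed by -release
    // on the user's build, independently of the druntime build.
    assert(dst.length == a.length && a.length == b.length,
           "array lengths differ");
    dst[] = a[] + b[];
}

void main()
{
    auto dst = new int[3];
    addInto(dst, [1, 2, 3], [4, 5, 6]);
    assert(dst == [5, 7, 9]);
    writeln(dst);
}
```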
November 23, 2012
Re: Array Operations: a[] + b[] etc.
On Thu, 22 Nov 2012 20:06:44 -0600, John Colvin  
<john.loughran.colvin@gmail.com> wrote:
> On Thursday, 22 November 2012 at 21:37:19 UTC, Dmitry Olshansky wrote:
>> Array ops are supposed to be overhead-free loops transparently leveraging  
>> SIMD parallelism of modern CPUs. No more and no less. It's like  
>> auto-vectorization but it's guaranteed and obvious in the form.
>
> I disagree that array ops are only for speed.
> I would argue that their primary significance lies in their ability to  
> make code significantly more readable, and more importantly, writeable.  
> For example, the vector distance between 2 position vectors can be  
> written as:
> dv[] = v2[] - v1[]
> or
> dv = v2[] - v1[]
> anyone with an understanding of mathematical vectors instantly  
> understands the general intent of the code.
> With documentation something vaguely like this:
> "An array is a reference to a chunk of memory that contains a list of  
> data, all of the same type. v[] means the set of elements in the array,  
> while v on its own refers to just the reference. Operations on sets of  
> elements e.g. dv[] = v2[] - v1[] work element-wise along the arrays  
> {insert mathematical notation and picture of 3 arrays as columns next to  
> each other etc.}.
> Array operations can be very fast, as they are sometimes lowered  
> directly to cpu vector instructions. However, be aware of situations  
> where a new array has to be created implicitly, e.g. dv = v2[] - v1[];  
> Let's look at what this really means: we are asking for dv to be set to  
> refer to the vector difference between v2 and v1. Note we said nothing  
> about the current elements of dv, it might not even have any! This means  
> we need to put the result of v2[] - v1[] in a new chunk of memory, which  
> we then set dv to refer to. Allocating new memory takes time,  
> potentially taking a lot longer than the array operation itself, so if  
> you can, avoid it!",
> anyone with the most basic programming and mathematical knowledge can  
> write concise code operating on arrays, taking advantage of the  
> potential speedups while being aware of the pitfalls.
>
> In short:
> Vector syntax/array ops is/are great. Concise code that's easy to read  
> and write. They fulfill one of the guiding principles of D: the most  
> obvious code is fast and safe (or if not 100% safe, at least not too  
> error-prone).
> More vector syntax capabilities please!

I think implicit allocation is a good idea in the case of variable
initialization, i.e.:

auto dv = v2[] - v1[];

However, as a general statement, i.e. dv = v2[] - v1[];, it could just as
easily be a typo and result in a silent and hard-to-find performance bug.

// An alternative syntax for variable initialization by an array operation expression:
auto dv[] = v2[] - v1[];
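
For comparison, a sketch of how initialization from an array expression can be written without any new syntax. The function name `difference` is hypothetical; the allocation is made visible by `.dup`, and the subtraction then happens in place.

```d
import std.stdio : writeln;

// Hypothetical helper: returns a new array holding v2 - v1.
double[] difference(const(double)[] v2, const(double)[] v1)
{
    assert(v1.length == v2.length, "array lengths differ");
    auto dv = v2.dup; // explicit, visible allocation (copy of v2)
    dv[] -= v1[];     // in-place element-wise subtraction
    return dv;
}

void main()
{
    double[] v1 = [1.0, 2.0, 3.0];
    double[] v2 = [4.0, 6.0, 8.0];

    auto dv = difference(v2, v1);
    assert(dv == [3.0, 4.0, 5.0]);
    assert(v2 == [4.0, 6.0, 8.0]); // the inputs are left unchanged
    writeln(dv);
}
```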