Array Operations: a[] + b[] etc. (page 3)

On 11/22/2012 6:11 PM, John Colvin wrote:
>> An error.
>
> Is monarch_dodra correct in saying that one would have to compile druntime in
> non-release to see this error? That would be a pity, couldn't this be
> implemented somehow so that it would depend on the user code being compiled
> non-release, not druntime?

I'd have to look at the specific code to see. In any case, it is an error. It takes a runtime check to do it, which can be turned on or off with the -noboundscheck switch.

On Friday, 23 November 2012 at 06:41:06 UTC, Walter Bright wrote:
> On 11/22/2012 6:11 PM, John Colvin wrote:
>>> An error.
>>
>> Is monarch_dodra correct in saying that one would have to compile druntime in
>> non-release to see this error? That would be a pity, couldn't this be
>> implemented somehow so that it would depend on the user code being compiled
>> non-release, not druntime?
>
> I'd have to look at the specific code to see. In any case, it is an error. It takes a runtime check to do it, which can be turned on or off with the -noboundscheck switch.

I originally opened this some time back, related to opSlice operations not error-ing:
http://d.puremagic.com/issues/show_bug.cgi?id=8650

I've since learned to build druntime as non-release, which "fixes" the problem.

I don't know if you plan to change anything about this, but just wanted to point out there's a Bugzilla entry for it.

On 11/22/2012 10:49 PM, monarch_dodra wrote:
> I originally opened this some time back, related to opSlice operations not
> error-ing:
> http://d.puremagic.com/issues/show_bug.cgi?id=8650
>
> I've since learned to build druntime as non-release, which "fixes" the problem.
>
> I don't know if you plan to change anything about this, but just wanted to point
> out there's a Bugzilla entry for it.

Thank you.
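Until the check no longer depends on how druntime was built, user code can verify the lengths itself before running the array operation. A minimal sketch, assuming a hypothetical helper named addInto (not part of druntime or Phobos); enforce from std.exception is an ordinary function call, so it stays active in -release builds:

```d
import std.exception : enforce;

// Hypothetical helper: element-wise sum of a and b written into dst.
// All three lengths are checked explicitly, independent of druntime's build mode.
int[] addInto(int[] dst, const int[] a, const int[] b)
{
    enforce(a.length == b.length, "operand lengths differ");
    enforce(dst.length == a.length, "destination length differs");
    dst[] = a[] + b[];   // the array operation itself does not allocate
    return dst;
}
```

The price is one explicit comparison per call, which is usually negligible next to the loop over the elements.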

On Wednesday, 21 November 2012 at 18:15:51 UTC, Walter Bright wrote:
> On 11/21/2012 10:02 AM, John Colvin wrote:
>> My vision of how things could work:
>> c = a[] opBinary b[];
>> should be legal. It should create a new array that is then reference assigned to c.
>
> This is not done because it puts excessive pressure on the garbage collector. Array ops do not allocate memory by design.

But if they wanted it anyways, could implement it as a struct... Here's a rough build... Should be fairly obvious what's happening.

struct AllocatingVectorArray(T) {
    T[] data;
    alias data this;
    alias AllocatingVectorArray AVA;

    //forces slice operations for vector format only
    static struct AVASlice {
        T[] data;
        alias data this;

        this(T[] rhs) {
            data = rhs;
        }

        AVA opBinary(string op)(const AVASlice rhs) {
            assert(rhs.length == data.length, "Lengths don't match, cannot use vector operations");

            AVA var;
            var.data = data.dup;
            mixin("var[] " ~ op ~ "= rhs[];");

            return var;
        }
    }

    this(T[] rhs) {
        data = rhs;
    }

    ref AVA opAssign(T[] rhs) {
        data = rhs;
        return this;
    }

    AVASlice opSlice() {
        return AVASlice(this);
    }
}

unittest {
    alias AllocatingVectorArray!int AVAint;
    AVAint a = [1,2,3,4];
    AVAint b = [5,6,7,8];
    AVAint c;

    // c = a + b; //not allowed, 'not implemented error'
    // assert(c == [6,8,10,12]);

    c = a[] + b[]; //known vector syntax
    assert(c == [6,8,10,12]);

    c[] = a[] + b[]; //more obvious what's happening
    assert(c == [6,8,10,12]);
}

November 23, 2012

Re: Array Operations: a[] + b[] etc.

Posted by Dmitry Olshansky
in reply to John Colvin

On 11/23/2012 6:06 AM, John Colvin wrote:
> On Thursday, 22 November 2012 at 21:37:19 UTC, Dmitry Olshansky wrote:
> >> Array ops are supposed to be overhead-free loops transparently leveraging
>> SIMD parallelism of modern CPUs. No more and no less. It's like
>> auto-vectorization but it's guaranteed and obvious in the form.
>

> I disagree that array ops are only for speed.

Well that and intuitive syntax.

> I would argue that their primary significance lies in their ability to
> make code significantly more readable, and more importantly, writeable.
> For example, the vector distance between 2 position vectors can be
> written as:
> dv[] = v2[] - v1[]
> or
> dv = v2[] - v1[]
> anyone with an understanding of mathematical vectors instantly
> understands the general intent of the code.

Mathematical sense doesn't take into account that arrays occupy memory, nor the cost of operations in general.
Also:
dv = v2 - v1
is just as obvious, so structs plus operator overloading cover the usability side of this problem. Operating on raw arrays directly as N-dimensional vectors is fine, but it hardly helps maintainability or readability as the program grows over time.
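The struct-plus-operator-overloading approach could look roughly like the sketch below. Vec3 and its data field are hypothetical names chosen for illustration, not something from Phobos. Storage is a fixed-size array inside the struct, so dv = v2 - v1 involves no heap allocation, and the array operation is still used internally:

```d
// Hypothetical 3-component vector; storage lives inside the struct itself.
struct Vec3
{
    double[3] data;

    // Supports v2 - v1 and v2 + v1 by forwarding to an array operation.
    Vec3 opBinary(string op)(const Vec3 rhs) const
        if (op == "+" || op == "-")
    {
        Vec3 result;
        mixin("result.data[] = data[] " ~ op ~ " rhs.data[];");
        return result;
    }
}
```

With this in place, user code reads as plain vector math: auto dv = v2 - v1;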

> With documentation something vaguely like this:
> "An array is a reference to a chunk of memory that contains a list of
> data, all of the same type. v[] means the set of elements in the array,
> while v on its own refers to just the reference. Operations on sets of
> elements e.g. dv[] = v2[] - v1[] work element-wise along the arrays
> {insert mathematical notation and picture of 3 arrays as columns next to
> each other etc.}.
....
So far so good, but I'd rather not use 'list' to define array nor the 'set' of elements. Semantically v[] means the slice of the whole array - nothing more and nothing less.

> Array operations can be very fast, as they are sometimes lowered
> directly to cpu vector instructions. However, be aware of situations
> where a new array has to be created implicitly, e.g. dv = v2[] - v1[];
> Let's look at what this really means: we are asking for dv to be set to
> refer to the vector difference between v2 and v1. Note we said nothing
> about the current elements of dv, it might not even have any! This means
> we need to put the result of v2[] - v1[] in a new chunk of memory, which
> we then set dv to refer to. Allocating new memory takes time,
> potentially taking a lot longer than the array operation itself, so if
> you can, avoid it!",

IMHO I'd shoot this kind of documentation on sight. "There is a fast tool but here is our peculiar set of rules that makes certain constructs slow as a pig. So, watch out! Isn't that convenient?"

> anyone with the most basic programming and mathematical knowledge can
> write concise code operating on arrays, taking advantage of the
> potential speedups while being aware of the pitfalls.
>
People typically are not aware as long as it seems to work.

> In short:
> Vector syntax/array ops is/are great. Concise code that's easy to read
> and write. They fulfill one of the guiding principles of D: the most
> obvious code is fast and safe (or if not 100% safe, at least not too
> error-prone).

This change suits a scripting language more than a systems language.
For me
a[] = b[] + c[];
implies:
a[0..$] = b[0..$] + c[0..$]
so it's obvious that lengths better match and 'a' must be preallocated.
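The slice reading above can be spelled out as a short sketch. elementwiseSum is a hypothetical helper name; the point is only that the allocation is written explicitly by the caller side of the operation, while the array operation fills memory that already exists:

```d
// Hypothetical helper: the destination is allocated explicitly,
// then filled by a non-allocating array operation.
int[] elementwiseSum(const int[] b, const int[] c)
{
    assert(b.length == c.length, "operand lengths must match");
    auto a = new int[b.length];   // the only allocation, and it is visible
    a[] = b[] + c[];              // same as a[0 .. $] = b[0 .. $] + c[0 .. $]
    return a;
}
```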


> More vector syntax capabilities please!

It would have been nice to write things like:
a[] = min(b[], c[]);
where min is a regular function.

But again I don't see the pressing need:
- if speed is of concern then 'arbitrary function' can't be sped up much by hardware
- if flexibility then range-style operation is far more flexible
-- 
Dmitry Olshansky
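For comparison, the range-style alternative to a[] = min(b[], c[]) might be sketched as follows. elementwiseMin is a hypothetical name; zip, map, min and copy are the Phobos functions from std.range and std.algorithm. The sketch assumes the destination is at least as long as the shorter input:

```d
import std.algorithm : copy, map, min;
import std.range : zip;

// Hypothetical helper: writes min(b[i], c[i]) into a[i] without allocating.
// Assumes a.length >= min(b.length, c.length).
void elementwiseMin(const int[] b, const int[] c, int[] a)
{
    zip(b, c)
        .map!(pair => min(pair[0], pair[1]))
        .copy(a);
}
```

Any regular two-argument function can take the place of min here, which is the flexibility argument made above; whether the compiler vectorizes the resulting loop is not guaranteed.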

On 11/23/2012 7:58 AM, Dmitry Olshansky wrote:
>> anyone with the most basic programming and mathematical knowledge can
>> write concise code operating on arrays, taking advantage of the
>> potential speedups while being aware of the pitfalls.
>>
> People typically are not aware as long as it seems to work.

As an example, bearophile is an experienced programmer. He just posted two loops, one using pointers and another using arrays, and was mystified why the array version was slower. He even posted the assembler output, where it was pretty obvious (to me, anyway) that he had array bounds checking turned on in the array version, which will slow it down.

So yes, it's a problem when subtle changes in code can result in significant slowdowns, and yes, even experienced programmers get caught by that.
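The kind of pair being described might look like the sketch below. This is not bearophile's actual code, which is not reproduced in this thread; sumIndexed and sumPointer are hypothetical names. Both functions compute the same result, but the indexed version may carry a bounds check on every xs[i] unless checks are disabled (with the -noboundscheck switch mentioned earlier in the thread) or the compiler manages to remove them:

```d
// Indexed loop: each xs[i] may be bounds-checked at run time.
int sumIndexed(const int[] xs)
{
    int total = 0;
    foreach (i; 0 .. xs.length)
        total += xs[i];
    return total;
}

// Pointer loop: no bounds checks are ever inserted, and no safety net either.
int sumPointer(const int[] xs)
{
    int total = 0;
    const(int)* p = xs.ptr;
    const(int)* end = p + xs.length;
    for (; p != end; ++p)
        total += *p;
    return total;
}
```

When timing two such loops against each other, both should be built with the same bounds-checking setting, otherwise the comparison mostly measures the checks.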
