[Issue 9582] std.algorithm.map!(T) cause CT error for fixed size arrays (page 2)

February 24, 2013

Posted by monarchdodra@gmail.com
in reply to Marcin Mstowski

http://d.puremagic.com/issues/show_bug.cgi?id=9582



--- Comment #10 from monarchdodra@gmail.com 2013-02-24 15:41:59 PST ---
(In reply to comment #9)
> In the end I can probably add the missing [] to my D2 code in a matter of one or two hours (or less), so this change doesn't require a lot of work on my side. So I leave the decision to you.

I'm not that passionate about the issue. I think treating static arrays as ranges in the first place wasn't the best idea. I mean, if your code hadn't compiled in the first place, you would probably have called it with arr[] and not thought about it much more.

I also agree that supporting types with opApply adds usability, for example for std.array.array. However, I don't think that warrants passing a static array by value and creating one template instance per array size.
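A minimal sketch of the two costs mentioned above, assuming a hypothetical helper named zeroFirst (it is not part of Phobos): a fixed-size array taken by value is copied in full, and each distinct length produces its own template instance.

```d
import std.stdio;

// Hypothetical helper, not part of Phobos: it takes a fixed-size array
// by value, so the callee receives a copy of all N elements.
void zeroFirst(T, size_t N)(T[N] items)
{
    items[0] = 0; // changes the local copy only
    writeln("instance for ", typeof(items).stringof);
}

void main()
{
    int[3] a = [1, 2, 3];
    int[4] b = [1, 2, 3, 4];

    zeroFirst(a); // instantiates zeroFirst!(int, 3)
    zeroFirst(b); // instantiates zeroFirst!(int, 4), a separate function

    // The callers' arrays are untouched because only copies were modified.
    assert(a[0] == 1 && b[0] == 1);
}
```

Calling the helper with a[] instead would not compile here, since a slice has no compile-time length for N to bind to.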

> > Static arrays, while being iterable, shouldn't be passed around by value to functions. I think there'd be gains in the long run to such a scheme.
> 
> Let's say I have this fixed-sized array:
> 
> int[3] items = [1, 2, 3];
> 
> and I want to compute its sum:
> 
> reduce!q{a + b}(items)
> 
> In this case ideally I'd like the D compiler to replace that call to reduce with the same ASM as
> 
> items[0] + items[1] + items[2]
> 
> This means the compiler has some compile-time information (the array length) that in theory (and in practice too, if the code is well written and you are using a GCC back-end that is able to unroll small loops) is usable to produce fully optimized code.
> 
> If I have to write:
> 
> reduce!q{a + b}(items[])
> 
> Then we have lost something: reduce() can't know the length of the array at compile time, so performing loop unrolling becomes harder (the JavaVM is able to partially unroll the loop in this case too, but LLVM was not able to do it until the latest version, and even now it's not perfect).
> 
> Throwing away some compile-time information seems a bad idea to me.
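
To make the quoted point concrete, here is a small sketch (the variable names are illustrative only) of what slicing does to the type: the length is part of int[3], but after items[] it is only a run-time property of an int[].

```d
void main()
{
    int[3] items = [1, 2, 3];

    // The length of a fixed-size array is part of its type, so it is
    // usable as a compile-time constant.
    static assert(items.length == 3);
    static assert(is(typeof(items) == int[3]));

    // Slicing yields a dynamic array: the type no longer carries the
    // length, which is now known only at run time.
    auto slice = items[];
    static assert(is(typeof(slice) == int[]));
    assert(slice.length == 3);

    // The slice refers to the same memory; no elements are copied.
    assert(slice.ptr == items.ptr);
}
```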

But the way I see it, your argument is that loop unrolling justifies copying an entire array. I'm not really sure it does. Can you honestly tell me that when you wrote "reduce(myArray)", you realized you were passing it by value? Furthermore, is the performance gain *also* worth the template bloat, given that you are instantiating one reduce algorithm per array *size*?

C++ could do this, but it doesn't. Ever. And C++ is dead serious about squeezing every last ounce of performance it can, algorithm-wise.

The ideal solution would be one akin to cycle: A specialized overload that takes static arrays by reference. You get your cheap pass by ref, but also keep your compile time info.

But still, that is a *very* specific use case, with a code deployment cost, and it shouldn't be used for many other things.
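
A rough sketch of the overload pattern described above, assuming a hypothetical function named countEven (this is not how any Phobos function is written): the fixed-size overload binds the array by reference, keeps N as a compile-time constant, and forwards to the slice overload.

```d
import std.stdio;

// Hypothetical function names; a sketch of the pattern, not Phobos code.

// General overload: accepts any slice; the length is a run-time value.
size_t countEven(T)(T[] items)
{
    size_t n = 0;
    foreach (x; items)
        if (x % 2 == 0)
            ++n;
    return n;
}

// Fixed-size overload: `ref` avoids copying the N elements, and N stays
// available as a compile-time constant inside the body.
size_t countEven(T, size_t N)(ref T[N] items)
{
    static assert(N > 0, "empty fixed-size arrays are rejected in this sketch");
    return countEven(items[]); // forward to the slice overload, no copy
}

void main()
{
    int[4] a = [1, 2, 3, 4];
    int[] b = [2, 4, 6];

    writeln(countEven(a)); // uses the fixed-size overload
    writeln(countEven(b)); // uses the slice overload
}
```

The fixed-size overload is still instantiated once per distinct N, which is the deployment cost in question; forwarding to the slice overload only keeps each instance small.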

So my conclusion is: yes, for usability reasons, accepting any iterable is good. However, when it comes to which arguments are accepted, I don't think a static array should be considered an iterable, and the caller should be expected to slice it.


--- Comment #11 from bearophile_hugs@eml.cc 2013-02-24 16:11:10 PST ---
(In reply to comment #10)

> But the way I see it, your argument is that loop unrolling justifies copying an entire array.

If your array is a ubyte[6] then I think copying it is OK, otherwise it's better to take the fixed size array by reference.


> Furthermore, is the performance gain *also* worth the template bloat, since you are instantiating 1 reduce algorithm per array *size*.

The compile-time knowledge of the array length gives a performance advantage for small arrays only; that's why I used an int[3] items in my example. A way to keep the template bloat low is to put a limit on the length with a template constraint:


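import std.stdio;

// Chosen only for fixed-size arrays with fewer than 10 elements.
void foo(T, size_t N)(ref T[N] items) if (N < 10) {
    writeln("foo 1");
}

// Fallback for slices and for larger fixed-size arrays (implicitly sliced).
void foo(T)(T[] items) {
    writeln("foo 2");
}

void main() {
    int[3] a1;
    int[20] a2;
    int[] a3;
    foo(a1);
    foo(a2);
    foo(a3);
}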

Prints:

foo 1
foo 2
foo 2
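
As a sketch of how the same constraint could be combined with actual unrolling, assuming a hypothetical function named total: the version below uses static foreach, which was added to D after this thread was written, so it illustrates the idea rather than code that was available at the time.

```d
// Hypothetical `total`: unrolled for small fixed-size arrays only.
T total(T, size_t N)(ref T[N] items) if (N < 10)
{
    T sum = 0;
    static foreach (i; 0 .. N) // expanded at compile time into N additions
        sum += items[i];
    return sum;
}

// Fallback: ordinary run-time loop over a slice.
T total(T)(T[] items)
{
    T sum = 0;
    foreach (x; items)
        sum += x;
    return sum;
}

void main()
{
    int[3] small = [1, 2, 3];
    int[20] large = 1; // all twenty elements are set to 1

    assert(total(small) == 6);  // constrained, unrolled overload
    assert(total(large) == 20); // constraint fails, slice overload is used
}
```

With the N < 10 limit, at most ten instances of the unrolled overload can exist per element type, however many array sizes the program uses.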


(There is another way to solve this problem, but I think it requires a small improvement to the D language: an idea to help reduce template bloat in some situations, such as with fixed-size arrays. While designing reduce() you can't assume such a change, which currently isn't even filed as a Bugzilla enhancement request.)


> The ideal solution would be one akin to cycle: A specialized overload that takes static arrays by reference. You get your cheap pass by ref, but also keep your compile time info.
> 
> But still, that is a *very* specific use case, with a code deployment cost, and shouldn't be used many other things.

OK.
