Thread overview
static array literal syntax using a library solution (no GC and 40x faster!)
Feb 03, 2013
timotheecour
Feb 03, 2013
timotheecour
Feb 03, 2013
Danny Arends
Feb 03, 2013
John Colvin
Feb 03, 2013
Max Klyga
Feb 03, 2013
John Colvin
Feb 03, 2013
Benjamin Thaut
February 03, 2013
Static arrays suffer from:

1) bad implementation that allocates on the heap when we do: "int[3] x=[1,2,3];"

2) lack of syntactic sugar to declare them on the fly, e.g. when we want to pass a static array to a function without declaring an intermediate variable. See my proposal for "auto x=[1,2,3]s" here:
http://permalink.gmane.org/gmane.comp.lang.d.general/90035, which would allow one to pass a static array to a function, e.g. fun([1,2,3]s), without having to do int[3] temp; fun(temp), when it's not passed by ref.

If 90035 won't get implemented, what about a library solution?

Below, we can construct a static array on the fly as:
"auto x=S(1,2,3);"
as opposed to:
"int[3] x=[1,2,3]"

advantages:
a) can directly pass to a function without creating temp variable
b) less verbose
c) no heap allocations
d) 40 times faster in the example below (even 2.3x faster than C, for some reason which eludes me)


----
import std.stdio,std.conv;
import std.traits:CommonType;

auto S(T...)(T a) if(!is(CommonType!T == void )){ //check to prevent illegal stuff like S([],2)
	alias CommonType!T T0;
	T0[T.length]ret;
	foreach(i,ai;a)
		ret[i]=ai;
	return ret;
}

void main(){
	size_t n=1000000,z=0,i=0,j=0;
	for(i=0;i<n;i++){
//		auto a=S(cast(size_t)i,i+1,i+2,i+3,i+4,i+5,i+6,i+7,i+8,i+9); //time: 0.351 with LDC

		size_t[10] a=[i,i+1,i+2,i+3,i+4,i+5,i+6,i+7,i+8,i+9]; //time: 14.049 with LDC, 16s with dmd (-inline -O -release)
		for(j=0;j<9;j++){z+=a[j];}
	}
	// TOC; // timing helper from my setup; not defined in this snippet, so commented out
	writeln(z); //to prevent optimizing away result (?)
}
----
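
To illustrate advantage (a), here is a minimal sketch of passing the result of S straight to a function. The consumer sum3 is a hypothetical name made up for this example, and the alias is written in the newer "alias X = Y" form; everything else follows the definition of S above.

```d
import std.stdio : writeln;
import std.traits : CommonType;

// Same helper as in the post: builds a fixed-size array from its arguments.
auto S(T...)(T a) if (!is(CommonType!T == void))
{
    alias T0 = CommonType!T;
    T0[T.length] ret;
    foreach (i, ai; a)
        ret[i] = ai;
    return ret;
}

// Hypothetical consumer that takes a static array by value.
int sum3(int[3] v)
{
    return v[0] + v[1] + v[2];
}

void main()
{
    // No named temporary is needed at the call site.
    writeln(sum3(S(1, 2, 3))); // prints 6

    // The element type is the common type of the arguments.
    auto x = S(1, 2.5, 3);
    static assert(is(typeof(x) == double[3]));
}
```

Since the length is part of the return type, a call like sum3(S(1, 2)) should be rejected at compile time rather than at run time.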



interestingly, this seems faster than the C version below. Why is that so?

----
//test.c:
#include <stdio.h>
int main(){
	size_t n=100000000,z=0,i=0,j=0;
	for(i=0;i<n;i++){
		size_t a[10]={i,i+1,i+2,i+3,i+4,i+5,i+6,i+7,i+8,i+9};
		for(j=0;j<9;j++){z+=a[j];}
	}
	printf("%lu\n",z);
	return 0;
}
----
gcc -O2 test.c -o test && time ./test 	
real	0m0.803s
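
The D snippet above relies on a timing helper (TOC) that is not shown. As a rough sketch of how the same loop could be timed with only the standard library, assuming a compiler recent enough to ship std.datetime.stopwatch, something like the following should work. The function name run is made up for this example, and no timings are claimed here.

```d
import std.datetime.stopwatch : AutoStart, StopWatch;
import std.stdio : writeln;

// Same inner work as the benchmark loop: build 10 consecutive values
// and add up the first 9 of them, n times.
size_t run(size_t n)
{
    size_t z = 0;
    foreach (i; 0 .. n)
    {
        size_t[10] a = [i, i+1, i+2, i+3, i+4, i+5, i+6, i+7, i+8, i+9];
        foreach (j; 0 .. 9)
            z += a[j];
    }
    return z;
}

void main()
{
    enum size_t n = 1_000_000;
    auto sw = StopWatch(AutoStart.yes);
    immutable z = run(n);
    sw.stop();
    // Printing z keeps the result observable so the loop is less likely to be removed.
    writeln("z = ", z, ", elapsed: ", sw.peek.total!"msecs", " ms");
}
```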
February 03, 2013
actually, we can use
"T0[T.length]ret=void;"
instead of
"T0[T.length]ret;"
inside the definition of S. It doesn't make a difference for ldc which optimizes it away, but it does for dmd (which is also about 10x slower than ldc).
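
For reference, a sketch of S with that change applied. Every element is written in the loop before the array is returned, so nothing uninitialized is read. One caveat I am assuming rather than having measured: for element types with a destructor or a custom opAssign, assigning into void-initialized storage is probably not safe, so this variant is best kept to plain value types.

```d
import std.traits : CommonType;

auto S(T...)(T a) if (!is(CommonType!T == void))
{
    alias T0 = CommonType!T;
    // `= void` skips the default initialization of ret;
    // each slot is then assigned exactly once below.
    T0[T.length] ret = void;
    foreach (i, ai; a)
        ret[i] = ai;
    return ret;
}

void main()
{
    auto a = S(10, 20, 30);
    assert(a == [10, 20, 30]);
}
```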
February 03, 2013
interesting implementation, and very fast indeed...
(My Netbook MSI Wind u100 - Intel Atom)

S() (DMD)    -> 0.189 s
C gcc -O3    -> 0.445 s
SArray (DMD) -> 0.738 s
C gcc -O2    -> 3.205 s

Gr,
Danny Arends
http://www.dannyarends.nl

On Sunday, 3 February 2013 at 09:28:05 UTC, timotheecour wrote:
> actually, we can use
> "T0[T.length]ret=void;"
> instead of
> "T0[T.length]ret;"
> inside the definition of S. It doesn't make a difference for ldc which optimizes it away, but it does for dmd (which is also about 10x slower than ldc).

February 03, 2013
On Sunday, 3 February 2013 at 09:23:01 UTC, timotheecour wrote:
> snip

Very interesting! Anything that beats C performance is a very big plus for D.

Btw, you can replace the loop in S with
ret[] = a[];
which should be even faster.

Also, to check whether the assignment is being optimised away, try using different data in each pass.
February 03, 2013
On 03.02.2013 10:23, timotheecour wrote:

That is a nice idea, but I would really like an in-language solution so that:

int[9] a = [1,2,3,4,5,6,7,8,9]; //does not allocate, just initializes a

void foo1(scope int[]) { ... }

foo1([1,2,3,4,5,6,7,8,9]); //allocates the literal on the stack

void foo2(int[]){ ... }

foo2([1,2,3,4,5,6,7,8,9]); //allocates the literal on the heap

Kind Regards
Benjamin Thaut
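
Until something like that exists in the language, one workaround is to name the static array and pass a slice of it. The sketch below only illustrates that pattern; the function total is a hypothetical name, and whether the initializer of a itself allocates depends on the compiler version, which is the very problem this thread is about. On compilers that support the @nogc attribute, marking the caller @nogc is one way to have the compiler reject hidden GC allocations.

```d
// Hypothetical function that only reads the slice and does not keep it.
int total(scope const(int)[] xs)
{
    int s = 0;
    foreach (x; xs)
        s += x;
    return s;
}

void main()
{
    int[9] a = [1, 2, 3, 4, 5, 6, 7, 8, 9]; // storage for a lives on the stack
    assert(total(a[]) == 45);                // a[] is a slice over that storage
}
```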
February 03, 2013
On 2013-02-03 13:18:03 +0000, John Colvin said:

> On Sunday, 3 February 2013 at 09:23:01 UTC, timotheecour wrote:
>> snip
> 
> Very interesting! Anything that beats c performance is a very big plus for D.
> 
> Btw,  you can replace the loop in S with
> ret[] = a[];
> Which should be even faster.
> 
> Also, to check that the assignment is being optimised away,  try using different data in each pass.

Not only will it not get faster, it will not even compile.

If you look carefully, you will notice that it is in fact a static foreach and a is not an array but a tuple, so the whole loop is unrolled into a series of assignments (ret[0] = a[0], ret[1] = a[1] …)
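
A small sketch of that point, assuming the definition of S from the first post. The static assert inside the loop only compiles because the index is a compile-time constant in each unrolled copy of the body; unrolled is a hypothetical hand-written equivalent for three int arguments.

```d
import std.traits : CommonType;

auto S(T...)(T a) if (!is(CommonType!T == void))
{
    alias T0 = CommonType!T;
    T0[T.length] ret;
    foreach (i, ai; a)
    {
        // `i` is known at compile time in each unrolled copy of the body.
        static assert(i < T.length);
        ret[i] = ai;
    }
    return ret;
}

// Hand-written equivalent of what S(1, 2, 3) expands to (hypothetical name).
int[3] unrolled(int a0, int a1, int a2)
{
    int[3] ret;
    ret[0] = a0;
    ret[1] = a1;
    ret[2] = a2;
    return ret;
}

void main()
{
    assert(S(1, 2, 3) == unrolled(1, 2, 3));
}
```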

February 03, 2013
On Sunday, 3 February 2013 at 15:16:44 UTC, Max Klyga wrote:
> On 2013-02-03 13:18:03 +0000, John Colvin said:
>
>> On Sunday, 3 February 2013 at 09:23:01 UTC, timotheecour wrote:
>>> snip
>> 
>> Very interesting! Anything that beats c performance is a very big plus for D.
>> 
>> Btw,  you can replace the loop in S with
>> ret[] = a[];
>> Which should be even faster.
>> 
>> Also, to check that the assignment is being optimised away,  try using different data in each pass.
>
> Not only will it not get faster, it will not even compile.
>
> If you look carefully, you will notice that it is in fact a static foreach and a is not an array but a tuple, so the whole loop is unrolled into a series of assignments (ret[0] = a[0], ret[1] = a[1] …)

Whoops, sorry, my bad.