November 04, 2008
Erm, so what would be the best way to get a large array which is also fast?

>>> You can have a dynamic array that _points_ to the stack, but there is no general mechanism for allocating a dynamic array directly on the stack.  You might be able to do something with alloca, but stack space is usually limited and you wouldn't be able to have very large arrays.
>>>
>>> Performance of dynamic arrays is the same no matter where their data is.  Fixed-size 2D arrays are not faster _because_ they are on the stack, they just happen to be allocated on the stack.  They are faster (usually) because they don't need two pointer dereferences.  You can allocate a fixed-size 2D array on the heap (well.. inside a struct or class anyway) and it will be just as fast.
>>
>> Ok, so when I want a multimillion-element array with speed I should use a static array on the heap. That would be using new, right :)
>
> You.. can't actually allocate a static array on the heap using new. At least not directly.  It's kind of an embarrassing hole in the syntax.  In fact, I'm not sure if you can even get a static array to point onto the heap, since a static array "reference" is treated like a value type when assigned to, so if you do something like:
>
> int[3][4] b = (new int[3][4][](1))[0];
>
> The weirdness on the right-hand-side is needed to get around the compiler "helpfully" rewriting "new int[3][4]" as "new int[][](3, 4)". But this code will simply allocate a static array on the heap, and then copy its contents onto the stack.  Dumb.
>
> Other than using a struct/class that contains a static array and allocating _that_ on the heap, I don't know of any way of getting it on the heap.  But then you'd be double-dereferencing anyway since you'd have to go through the struct pointer or class reference!  Sigh.


November 04, 2008
On Tue, Nov 4, 2008 at 1:36 PM, Saaa <empty@needmail.com> wrote:
> Erm, so what would be the best way to get a large array which is also fast?

A large 2D array that is fast?  Write it yourself.  It's not terribly hard; it'd just be a templated struct with overloaded opIndex and opIndexAssign.
November 04, 2008
On Tue, 04 Nov 2008 01:34:26 -0500, Jarrett Billingsley wrote:

> On Mon, Nov 3, 2008 at 11:30 PM, Jesse Phillips <jessekphillips@gmail.com> wrote:
>> In C allocating a static 2D array gives a contiguous chunk of memory. Java creates an array that points to more arrays. Just wondering how D handles this.
> 
> If it's a 2D fixed-size array, like:
> 
> int[5][6] x;
> 
> Then it is allocated as a single chunk of memory, and when you index it, it calculates the offset into that block of memory.
> 
> But 2D dynamic arrays:
> 
> int[][] y;
> 
> Are handled like in Java - an array of arrays.  When you index it, it indexes the first array, then indexes the second array.

Thanks, I figured it would work like this, whether for C compatibility or not.
November 05, 2008
Robert Fraser:
> The only reason static arrays seem to exist at all (as well as good explanation for their incongruity with other types) is C compatibility.

They also can give you more performance, if you know how to use them well.


Jarrett Billingsley>Fixed-size 2D arrays are not faster _because_ they are on the
stack, they just happen to be allocated on the stack.  They are faster (usually) because they don't need two pointer dereferences.<

Right. You can allocate a single block of dynamic memory, and then find items using something like col + row*ncols. But fixed-size 2D arrays can also be faster because the compiler knows the sizes of the matrix at compile time. So to find the position of a cell it can often (if you have chosen the right sizes) use shifts and similar tricks, which are faster than the general multiplication row*ncols.

For example here I have written a small C program (exactly the same code can be written in D, I have used C because GCC optimizes more than DMD):
http://www.fantascienza.net/leonardo/js/wirun2.c

As main data structure it uses a large matrix:
#define SIZEX 1024
#define SIZEY 1024
int lists[4][SIZEX * SIZEY];

That program wastes some space on the right of that tensor, to keep the sizes as powers of two. Since the sizes are powers of two, the index positions can be computed with just shifts, and that's fast. If you use the same code in D you get higher performance than with dynamic arrays, even if you carve them from a single chunk of heap memory.
Note that modern caches have quite weird performance problems with arrays with sizes as powers of two, so you have to benchmark your code every time, or you have to know a LOT of details about your CPU caches.

Bye,
bearophile
November 05, 2008
Jarrett Billingsley wrote:
> Performance of dynamic arrays is the same no matter where their data
> is.  Fixed-size 2D arrays are not faster _because_ they are on the
> stack, they just happen to be allocated on the stack.  They are faster
> (usually) because they don't need two pointer dereferences.  You can
> allocate a fixed-size 2D array on the heap (well.. inside a struct or
> class anyway) and it will be just as fast.

Your "usually" interests me. I was under the impression, and it seems quite logical, that static arrays are faster than dynamic ones. However, recently I did some simple experiments with matrix operations (multiplication, etc.), in which performance was actually a little bit better for dynamic arrays.

Is there a general "rule" for when dynamic arrays are faster than static ones, and vice versa?

Also, I tried multiplying large matrices both using arrays in D and using a C BLAS implementation. The BLAS algorithm is the same as the one I've written in D, and additionally includes a lot of runtime checks. Still, multiplication using BLAS was about four times faster. Is this simply because C compilers are a lot better at optimisation than the D compilers?

-Lars
November 05, 2008
Lars Kyllingstad wrote:
> Jarrett Billingsley wrote:
>> Performance of dynamic arrays is the same no matter where their data
>> is.  Fixed-size 2D arrays are not faster _because_ they are on the
>> stack, they just happen to be allocated on the stack.  They are faster
>> (usually) because they don't need two pointer dereferences.  You can
>> allocate a fixed-size 2D array on the heap (well.. inside a struct or
>> class anyway) and it will be just as fast.
> 
> Your "usually" interests me. I was under the impression, and it seems quite logical, that static arrays are faster than dynamic ones. However, recently I did some simple experiments with matrix operations (multiplication, etc.), in which performance was actually a little bit better for dynamic arrays.
> 
> Is there a general "rule" for when dynamic arrays are faster than static ones, and vice versa?
> 
> Also, I tried multiplying large matrices both using arrays in D and using a C BLAS implementation. The BLAS algorithm is the same as the one I've written in D, and additionally includes a lot of runtime checks. Still, multiplication using BLAS was about four times faster. Is this simply because C compilers are a lot better at optimisation than the D compilers?

I just remembered that D also does bounds checking on arrays. To work around this, I tried accessing the array elements through pointers. To my surprise, this actually didn't affect the results very much.

Therefore my question still stands.

-Lars
November 05, 2008
Lars Kyllingstad wrote:
> Jarrett Billingsley wrote:
>> Performance of dynamic arrays is the same no matter where their data
>> is.  Fixed-size 2D arrays are not faster _because_ they are on the
>> stack, they just happen to be allocated on the stack.  They are faster
>> (usually) because they don't need two pointer dereferences.  You can
>> allocate a fixed-size 2D array on the heap (well.. inside a struct or
>> class anyway) and it will be just as fast.
> 
> Your "usually" interests me. I was under the impression, and it seems quite logical, that static arrays are faster than dynamic ones. However, recently I did some simple experiments with matrix operations (multiplication, etc.), in which performance was actually a little bit better for dynamic arrays.

Indexing on a static 2D array is a multiplication and an addition. Indexing on a dynamic one is a dereference and an addition. If the compiler optimizes the dereference out but not the multiplication, you can get cases where dynamic arrays are faster.

There are probably other cases where the efficiency differs from expectations, but I'm not familiar with them.
November 06, 2008
Lars Kyllingstad wrote:
> Lars Kyllingstad wrote:
>> Jarrett Billingsley wrote:
>>> Performance of dynamic arrays is the same no matter where their data
>>> is.  Fixed-size 2D arrays are not faster _because_ they are on the
>>> stack, they just happen to be allocated on the stack.  They are faster
>>> (usually) because they don't need two pointer dereferences.  You can
>>> allocate a fixed-size 2D array on the heap (well.. inside a struct or
>>> class anyway) and it will be just as fast.
>>
>> Your "usually" interests me. I was under the impression, and it seems quite logical, that static arrays are faster than dynamic ones. However, recently I did some simple experiments with matrix operations (multiplication, etc.), in which performance was actually a little bit better for dynamic arrays.
>>
>> Is there a general "rule" for when dynamic arrays are faster than static ones, and vice versa?
>>
>> Also, I tried multiplying large matrices both using arrays in D and using a C BLAS implementation. The BLAS algorithm is the same as the one I've written in D, and additionally includes a lot of runtime checks. Still, multiplication using BLAS was about four times faster. Is this simply because C compilers are a lot better at optimisation than the D compilers?
> 
> I just remembered that D also does bounds checking on arrays. To work around this, I tried accessing the array elements through pointers. To my surprise, this actually didn't affect the results very much.

My experience is that using indexing is faster in DMD. I don't know why.
To compare D and C, compile your C code with DMC. Then you'll get the same code generation capability. DMD and DMC are extremely weak on floating-point optimisation. You shouldn't see any difference between them.