October 02, 2007
On 10/2/07, Vladimir Panteleev <thecybershadow@gmail.com> wrote:
> Janice Caron Wrote:
> > Anyway, hopefully you get the idea. I'm going for more coffee before I mistype anything else!
> Thanks - this indeed makes the code more readable, however now accessing an element involves two dereferences (the class, and the array pointer)

It doesn't have to be a class. It could just as easily be a struct. Then you've only got one dereference.

> - and, possibly, a function call if the compiler won't inline it.

I would hope that the compiler would inline that. Maybe not in debug mode, but certainly in release mode.
October 02, 2007
Vladimir Panteleev wrote:
> On Tue, 02 Oct 2007 09:20:29 +0300, Vladimir Panteleev <thecybershadow@gmail.com> wrote:
> 
>> Why?
>>
>> I posted a bug report yesterday, thinking it was just some accidentally left over code.
>> But I noticed this statement in the documentation today. And, frankly, I don't understand why that limitation is there.
>>
>> The problem with this restriction comes when using multi-dimensional arrays. Branched dynamic arrays are MUCH slower than rectangular/cubic/etc. static arrays - and this is a severe limitation for anyone wanting to work with large amounts of data (e.g. game maps). Of course I could work around this by calculating the positions manually using x*width+y, but I'd like to know the reason why this limitation was put there in the first place, and what's the reasoning behind it. No other compiled language that I used (C/C++/Delphi) has such a limitation.
>>
>> The check is in mtype.c, line 1835 (DMD 1.021).
> 
> I think I found the cause of this limitation - OPTLINK hangs, crashes and/or generates corrupt data if I try to force it such input. It works if I use a pointer to a huge static array, but DMD wouldn't accept that as input anyway. I thought that putting the array in a struct or class would prevent the linker from generating a >16MB data segment, but I forgot that D needs(!) to have an initial value for every type, even if it's completely null. Once again, I'll mention that Borland and Microsoft's linkers don't have such a limitation, and they can generate data segments over 16 MB with no problems.
> 
> So, suggestions for fixing this:
> 
> 1) fix or rewrite OPTLINK
>    ("rewrite" because I keep hearing about how OPTLINK holds D back for one reason or another; I know it's no small task, however D doesn't need a multi-purpose linker, just one that can handle what the DMD compiler generates)
> 2) if a type doesn't contain any non-0 initial values, instead of generating a big block of zeroes for the initializer, initialize it with a memset (rep stosd/stosb).
> 

T[][] SquareArray (T) (int x, int y) {
   T[] block = new T[x * y];
   T[][] ret = [];
   ret.length = x;
   for (int i = 0; i < x; i++) {
      ret[i] = block[i * y .. (i + 1) * y];
   }

   return ret;
}

void main () {
   int[][] array = SquareArray!(int)(9000, 9000);
}

Slightly tedious, and it gives you reference semantics and places the array on the heap (but do you really *want* 16MB stack frames? I imagine the OS might have something to say about that).
October 02, 2007
On Tue, 02 Oct 2007 16:18:26 +0300, Christopher Wright <dhasenan@gmail.com> wrote:

> T[][] SquareArray (T) (int x, int y) {
>     T[] block = new T[x * y];
>     T[][] ret = [];
>     ret.length = x;
>     for (int i = 0; i < x; i++) {
>        ret[i] = block[i * y .. (i + 1) * y];
>     }
>
>     return ret;
> }
>
> void main () {
>     int[][] array = SquareArray!(int)(9000, 9000);
> }
>
> Slightly tedious, and it gives you reference semantics and places the array on the heap (but do you really *want* 16MB stack frames? I imagine the OS might have something to say about that).

Thanks, however dynamic arrays are unfortunately slow. To access a random element, you need a dereference, a multiplication+addition, another dereference, another multiplication+addition. Janice Caron's workaround, when modified to use a struct placed in the data segment, requires only one dereferencing/multiply+adding - while a static array placed in the data segment requires only arithmetics to access.

Also, I was planning to place the array in the uninitialized data segment (I believe it is called BSS in ELF and object files), not the stack - I believe the default stack size for Windows executables to be 256 kB anyway.

P.S. I believe your function can be reduced to this without any significant changes in behavior:

     T[][] ret = new T[][x];
     foreach(ref row;ret)
         row.length = y;

     return ret;

-- 
Best regards,
 Vladimir                          mailto:thecybershadow@gmail.com
October 02, 2007
On 10/2/07, Vladimir Panteleev <thecybershadow@gmail.com> wrote:
> P.S. I believe your function can be reduced to this without any significant changes in behavior:
>
>      T[][] ret = new T[][x];
>      foreach(ref row;ret)
>          row.length = y;
>
>      return ret;

For that matter, I think you could also do:

    template MultiDimArray(T,int N)
    {
        static if (N == 1)
            typedef T[] MultiDimArray;
        else
            typedef MultiDimArray!(T,N-1)[] MultiDimArray;
    }

    MultiDimArray!(T,N) MakeMultiDimArray(T,int N)(int dim, int[] others...)
    {
        static if (N == 1)
            return new T[dim];
        else
        {
            MultiDimArray!(T,N) t;
            t.length = dim;
            foreach(ref a;t) a =
MakeMultiDimArray!(T,N-1)(others[0],others[1..$]);
            return t;
        }
    }
October 02, 2007
Reply to Vladimir,

> On Tue, 02 Oct 2007 10:16:25 +0300, BCS <ao@pathlink.com> wrote:
> 
>> Reply to Vladimir,
>> 
>>> Why?
>>> 
>>> I posted a bug report yesterday, thinking it was just some
>>> accidentally left over code.
>>> 
>>> But I noticed this statement in the documentation today. And,
>>> frankly, I don't understand why that limitation is there.
>>> 
>>> The problem with this restriction comes when using multi-dimensional
>>> arrays. Branched dynamic arrays are MUCH slower than
>>> rectangular/cubic/etc. static arrays - and this is a severe
>>> limitation for anyone wanting to work with large amounts of data
>>> (e.g. game maps). Of course I could work around this by calculating
>>> the positions manually using x*width+y, but I'd like to know the
>>> reason why this limitation was put there in the first place, and
>>> what's the reasoning behind it.
>>> 
>> just new the array.
>> 
> If you mean that I do something like:
> int[4096][4096]* arr = (new int[4096][4096][1]).ptr;
> 1) the compiler still (normally) doesn't let me do it
> 2) it's slower for random access due to having to dereference a
> pointer.
> 
> But, yes, this is the sanest workaround so far.
> 

I was thinking of this:

alias int[4096] block;   // alias to be shure I get dynamic array of static arrays
block[] arr = new block[4096];


October 02, 2007
On Tue, 02 Oct 2007 18:12:41 +0300, BCS <ao@pathlink.com> wrote:

> I was thinking of this:
>
> alias int[4096] block;   // alias to be shure I get dynamic array of static arrays block[] arr = new block[4096];

Why haven't I thought of that... :O

-- 
Best regards,
 Vladimir                          mailto:thecybershadow@gmail.com
October 02, 2007
"Janice Caron" <caron800@googlemail.com> wrote in message news:mailman.353.1191333299.16939.digitalmars-d@puremagic.com...
> On 10/2/07, Vladimir Panteleev <thecybershadow@gmail.com> wrote:
>> P.S. I believe your function can be reduced to this without any significant changes in behavior:
>>
>>      T[][] ret = new T[][x];
>>      foreach(ref row;ret)
>>          row.length = y;
>>
>>      return ret;
>
> For that matter, I think you could also do:
>
>    template MultiDimArray(T,int N)
>    {
>        static if (N == 1)
>            typedef T[] MultiDimArray;
>        else
>            typedef MultiDimArray!(T,N-1)[] MultiDimArray;
>    }
>
>    MultiDimArray!(T,N) MakeMultiDimArray(T,int N)(int dim, int[]
> others...)
>    {
>        static if (N == 1)
>            return new T[dim];
>        else
>        {
>            MultiDimArray!(T,N) t;
>            t.length = dim;
>            foreach(ref a;t) a =
> MakeMultiDimArray!(T,N-1)(others[0],others[1..$]);
>            return t;
>        }
>    }

You've basically implemented, in templates, what you can already do with the language syntax.  MultiDimArray!(int, 3) will turn into int[][][], and the functionality of MakeMultiDimArray is already covered by array new-ing, i.e. new int[][][](10, 20, 30).

And it's still an array-of-arrays.


October 02, 2007
On 10/2/07, Jarrett Billingsley <kb3ctd2@yahoo.com> wrote:
> functionality of MakeMultiDimArray is already covered by array new-ing, i.e. new int[][][](10, 20, 30).

Ooh! I didn't know you could do that!
Nice!
October 02, 2007
"Janice Caron" <caron800@googlemail.com> wrote in message news:mailman.362.1191349640.16939.digitalmars-d@puremagic.com...
> On 10/2/07, Jarrett Billingsley <kb3ctd2@yahoo.com> wrote:
>> functionality of MakeMultiDimArray is already covered by array new-ing,
>> i.e.
>> new int[][][](10, 20, 30).
>
> Ooh! I didn't know you could do that!
> Nice!

:)

Actually 'new int[5]' is exactly equivalent to 'new int[](5)' -- the former is sugar for the latter.  It's just most people are used to writing it like the latter.


October 02, 2007
Vladimir Panteleev wrote:
> Thanks, however dynamic arrays are unfortunately slow. To access a
> random element, you need a dereference, a multiplication+addition,
> another dereference, another multiplication+addition. Janice Caron's
> workaround, when modified to use a struct placed in the data segment,
> requires only one dereferencing/multiply+adding - while a static
> array placed in the data segment requires only arithmetics to access.

If you define an array as:

int[10000][] a = new int[10000][100];

only one dereference is required to access the data.