Keeping references to dynamic arrays
July 23, 2004
Hi guys, I've just started playing with D again.  I'm having a little
trouble getting my head around dynamic arrays, especially when the GC
gets involved.
My basic usage scenario is this: I have multiple classes that want to
reference the same large dynamic array.  From the small test cases I've
written, if the array is resized and needs to be realloc'd to a
different place, then the other references no longer refer to the same array.
So...
int [] a1;
a1.length = 1;
int [] a2 = a1;
a1.length = 300; // or some other value that moves the array
a2[0] = 5;
a1[0] = 1;
// this is the point where I really want a1[0] == a2[0]
Is there a way to do this?  It seems to me that this is quite a subtle
area, and may possibly introduce bugs.
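
A minimal sketch of the situation above, written for a current D compiler (an assumption, since the thread predates it). The helper name `sharesStorage` is hypothetical, not a library function. Whether the resize actually moves the array is up to the runtime, so the program reports what happened rather than assuming it.

```d
import std.stdio;

/// Hypothetical helper: true when both arrays start at the same address.
bool sharesStorage(const(int)[] x, const(int)[] y)
{
    return x.ptr is y.ptr;
}

void main()
{
    int[] a1;
    a1.length = 1;
    int[] a2 = a1;      // copies the (length, pointer) pair, not the elements
    assert(sharesStorage(a1, a2));

    a1.length = 300;    // may reallocate; a2 is not updated either way
    a2[0] = 5;
    a1[0] = 1;

    // If the storage moved, the two writes landed in different memory.
    writeln("still shared: ", sharesStorage(a1, a2));
    writeln("a1[0] = ", a1[0], ", a2[0] = ", a2[0]);
}
```

Comparing `.ptr` values like this is one way to check at run time whether two arrays still alias each other.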
From what I can tell it has the following effects (maybe)
- It is never safe to pass C a dynamic array if the library is going
to keep that array around.  (Note, are there any restrictions on what
the GC can do with arrays, i.e. is it free to move arrays at will for heap
compaction?)
- Even if the C library isn't going to keep the array around,
multi-threaded D apps may have bugs due to a sequence like this:
1. int [] a2 = a1[0..4]
2. -- another thread resizes a1 & causes a move
3. a2[0] = 5 // i.e., expecting a2 to be an alias of a1

- Does this mean that to have arrays that do allow resizing and move correctly, we need to wrap the array in a class?

It just feels to me like you don't really know where you stand with D dynamic arrays.  You can't trust an alias made with a pointer or with an array slice.  The only real way to know you are writing to element x in the array is to dereference the original array at position x.  My intuition was that all slices or dynamic array assignments would essentially act like a reference to the actual array, in much the same way as assigning objects is by reference.  From my point of view this would be much more intuitive and consistent than the way I think it currently works.

Could I please be enlightened on how all this actually works - rather than how I think it may work?  :)

Cheers
Brad


July 23, 2004
You can keep pointers to dynamic arrays just like anything else.  I always get mixed up with the syntax, so I looked it up on http://digitalmars.com/d/arrays.html.  It says that a pointer to a dynamic array looks like this:
  int[]* e;

You have two choices how to allocate the array.  You can declare it once, and then assign pointers to it:
  int[] a;
  int[]* b = &a;
  int[]* c = &a;
  int[]* d = &a;

Or you should be able to allocate it with new.  I'm not 100% sure of the syntax, though.  I would try this and see if it works:
  int[]* e = new int[];
  e.length = 1;
  e.length = 300;
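
For comparison, a minimal sketch of the pointer-to-array idea that should compile with a current D compiler (an assumption; the syntax guessed at above is left as posted). The helper `resizeThrough` and the variable `box` are hypothetical names introduced only for illustration.

```d
import std.stdio;

/// Hypothetical helper: resize whatever array variable `p` points at.
void resizeThrough(int[]* p, size_t n)
{
    (*p).length = n;      // explicit dereference, then set the length
}

void main()
{
    int[] a;
    int[]* b = &a;        // b and c point at the variable `a` itself,
    int[]* c = &a;        // not at its element storage

    resizeThrough(b, 300);
    (*b)[0] = 5;          // parentheses: dereference first, then index
    (*c)[0] = 1;
    assert(a.length == 300 && a[0] == 1 && (*b)[0] == 1);

    // One way to get an array variable that lives on the heap:
    // allocate a one-element array of int[] and point at its only slot.
    int[][] box = new int[][](1);
    int[]* e = &box[0];
    resizeThrough(e, 300);
    writeln(a.length, " ", (*e).length);
}
```

Because every pointer refers to the same array variable, a resize through any of them is visible through all of them. A pointer to a local such as `a` must not outlive that local.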


July 23, 2004
On Fri, 23 Jul 2004 21:40:29 +0000 (UTC), brad wrote:

> Could I please be enlightened on how all this actually works - rather than how I think it may work?  :)
> 

You are thinking correctly. Unfortunately D does subtly change the
semantics of slicing. Here is some code to prove it.
<code>

void pc(char[] x)
{
    foreach(char c;x)
    {
        if (c != '\0')
            printf("%c", c);
    }
}

void main()
{
    char[] a;
    char[] b;

    // Give it something to work with. The .dup makes a writable copy,
    // since the string literal itself should not be modified.
    a = "1234567890".dup;
    // Set 'b' to point into a subset of 'a'
    b = a[2..7];
    pc("a='" ~a~"' b='"~b~"'\n");

    // Now prove it by changing 'a' to see if 'b' also changes.
    a[5] = 'a';
    pc("a='" ~a~"' b='"~b~"'\n");

    // resize 'a' to force it to move.
    a.length=10000;
    pc("a='" ~a~"' b='"~b~"'\n");

    // Change 'a' again to see if 'b' still changes.
    a[4] = 'a';
    pc("a='" ~a~"' b='"~b~"'\n");
    // Ahhh! But 'b' is no longer pointing into a subset of 'a'.
}

</code>

-- 
Derek
Melbourne, Australia
July 23, 2004
I'd seen that, but got tripped up on the nasty syntax.
    int [] a1;
    a1.length = 1;
    a1[0] = 5;
    int []* a2;
    a2 = &a1;
    *a2[0] = 0;
    printf("%i %i\n", a1[0], *a2[0]);

This works as expected, but I think the *a2[0] syntax is a bit of a dog. Also, what about the whole "sometimes a=b is an alias, sometimes (when b is resized), it is a copy?"
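
A note on the syntax, as a hedged sketch for a current D compiler (an assumption; the snippet above is left as posted). Indexing binds tighter than unary `*`, so `*a2[0]` means `*(a2[0])`. As far as I know a current compiler rejects that, because an array no longer converts implicitly to a pointer, and the parenthesised form is needed instead. The helper `setFirst` is a hypothetical name.

```d
/// Hypothetical helper: write to element 0 of the array that `p` points at.
void setFirst(int[]* p, int value)
{
    (*p)[0] = value;            // dereference first, then index
}

void main()
{
    int[] a1 = [5];
    int[]* a2 = &a1;

    setFirst(a2, 0);
    assert(a1[0] == 0);

    // a2[0] indexes the pointer, so it is the array itself (same as *a2).
    assert(a2[0] is a1);
    a2[0][0] = 7;
    assert(a1[0] == 7);
}
```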

Cheers
Brad

Russ Lewis wrote:

> You can keep pointers to dynamic arrays just like anything else.  I
> always get mixed up with the syntax, so I looked it up on
> http://digitalmars.com/d/arrays.html.  It says that a pointer to a
> dynamic array looks like this:
>    int[]* e;
> 
> You have two choices how to allocate the array.  You can declare it
> once, and then assign pointers to it:
>    int[] a;
>    int[]* b = &a;
>    int[]* c = &a;
>    int[]* d = &a;
> 
> Or you should be able to allocate it with new.  I'm not 100% sure of the
> syntax, though.  I would try this and see if it works:
>    int[]* e = new int[];
>    e.length = 1;
>    e.length = 300;
> 

July 23, 2004
Derek wrote:
<snip>
> You are thinking correctly. Unfortunately D does subtly change the
> semantics of slicing. Here is some code to prove it.
<snip>

OK, well at least I "get" the way it works.  However, I'm still not overly
pleased that it works this way :)
I am thinking that for the best consistency I should probably roll my own
template array (or does DTL have one?) that is always by reference
unless .dup is used, and that preserves aliasing when resized.  Does this sound
possible?
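
A minimal sketch of what such a template array could look like with a current D compiler. This is an assumption-laden illustration, not something from DTL or the standard library: the name `SharedArray` and all of its members are hypothetical.

```d
/// Hypothetical reference-semantics array: every holder shares one object,
/// so a resize is seen by all of them.
class SharedArray(T)
{
    private T[] data;

    this(size_t n = 0) { data = new T[](n); }

    @property size_t length() const { return data.length; }
    @property void length(size_t n) { data.length = n; }

    T opIndex(size_t i) { return data[i]; }
    void opIndexAssign(T value, size_t i) { data[i] = value; }

    /// Explicit copy, mirroring the built-in .dup property.
    SharedArray dup()
    {
        auto c = new SharedArray;
        c.data = data.dup;
        return c;
    }
}

void main()
{
    auto a1 = new SharedArray!int(1);
    auto a2 = a1;          // copies the class reference only
    a1.length = 300;       // storage may move; both names still see it
    a2[0] = 5;
    a1[0] = 1;
    assert(a1[0] == a2[0]);
    assert(a2.length == 300);

    auto copy = a1.dup;    // independent storage
    copy[0] = 42;
    assert(a1[0] == 1);
}
```

The cost is one extra indirection on every access, since each holder goes through the object to reach the current storage.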

Cheers
Brad
July 23, 2004
In article <cds4tg$2qrp$1@digitaldaemon.com>, Brad Beveridge says...
>
>This works as expected, but I think the *a2[0] syntax is a bit of a dog. Also, what about the whole "sometimes a=b is an alias, sometimes (when b is resized), it is a copy?"

Pretty standard Copy On Write behavior.  I haven't decided if I really like it yet, but it is a good solution for most cases.

Sean


July 24, 2004
But it isn't actually copy on write behaviour, because you can slice into an
array and use that slice as a window into the main array.
So what it really is, is copy on resize.  The rest of the time it is by
reference.
I've got no problems with copy on write, the thing that really irks me about
this is the inconsistency of it all - sometimes slices and copies can be
used as methods to manipulate the main array, sometimes they can't.  And
the problem is that you can't necessarily be sure what you are getting.
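
The "window until resized" behaviour described above can be sketched as follows for a current D compiler (an assumption; details of the 2004 runtime may have differed). A dynamic array value is a length plus a pointer, so a slice shares memory with its source until an operation has to give it storage of its own.

```d
void main()
{
    int[] dyn = [0, 1, 2, 3];
    int[] slice = dyn[1 .. 3];

    // A dynamic array value is just a length and a pointer.
    static assert(int[].sizeof == 2 * size_t.sizeof);
    assert(slice.length == 2);
    assert(slice.ptr is dyn.ptr + 1);   // a window into the same memory

    slice[0] = 50;
    assert(dyn[1] == 50);               // the write is visible through dyn

    // Growing the slice in place would overwrite dyn[3], so the
    // runtime gives it new storage and the aliasing ends here.
    slice ~= 99;
    assert(slice.ptr !is dyn.ptr + 1);
    assert(dyn[3] == 3);
}
```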

What I think would make everything consistent is
 1 - Slicing and assignment are always windows into the same array, i.e.
   int [] dyn;
   int [] dyn2;
   dyn2 = dyn;
   dyn.length = 45;
   int [] slice = dyn[0..2];
  All of the above point to the same dynamic array (dyn), and do so even if
dyn is resized and needs to be moved and reallocated.  If dyn is resized to
be smaller than one of its slices then accessing that slice causes an out
of bounds exception.
 2 - To get a copy of an array use the explicit .dup property.

I guess the thing that bugs me the most is that class objects in D behave
with reference behaviour all the time, and the GC is free to move them as
it likes - which is essentially what is going on here.  But dynamic array
assignment of the type above behaves in a completely different manner.  And
the manner is a bit random.
Imagine what would happen if you had a thread resizing a dynamic array, and
another thread trying to keep track of it?
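
A rough sketch of proposal 1 as a library type rather than a language change, for a current D compiler. The names `IntBuffer` and `Window` are hypothetical. The window stores the owner and an index range instead of a raw pointer, so it follows the array when it moves and fails loudly when the array shrinks beneath it. It is not thread-safe; the two-thread case would still need a lock around each access.

```d
import std.exception : enforce, assertThrown;

/// Hypothetical owner: the one place where the storage lives.
class IntBuffer
{
    int[] data;
    this(size_t n) { data = new int[](n); }
}

/// Hypothetical window: stores the owner and a range, never a raw pointer.
struct Window
{
    IntBuffer owner;
    size_t lo, hi;

    // Translate a window index, checking against the owner's current length.
    private size_t at(size_t i)
    {
        enforce(lo + i < hi && hi <= owner.data.length, "window out of bounds");
        return lo + i;
    }

    int opIndex(size_t i) { return owner.data[at(i)]; }
    void opIndexAssign(int v, size_t i) { owner.data[at(i)] = v; }
}

void main()
{
    auto dyn = new IntBuffer(4);
    auto slice = Window(dyn, 1, 3);

    dyn.data.length = 10_000;   // storage may move; the window follows
    slice[0] = 50;
    assert(dyn.data[1] == 50);

    dyn.data.length = 2;        // now smaller than the window
    assertThrown(slice[0]);     // access is rejected with an exception
}
```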

Cheers
Brad

<code>

void printa(char[] name, int [] a)
{
    printf("%.*s : ", name);
    foreach (int i; a)
    {
        printf ("%i ", i);
    }
    printf("\n");
}

int main(char [][] a)
{
    int [] dyn;
    dyn.length = 4;
    for (int i = 0; i < dyn.length; i++)
        dyn[i] = i;

    printa("dyn", dyn);

    int [] slice;
    slice = dyn[1..3];
    printa("slice", slice);
    slice[0] = 50;

    printa("dyn", dyn);
    printa("slice", slice);

    return 0;
}
</code>
Sean Kelly wrote:

> In article <cds4tg$2qrp$1@digitaldaemon.com>, Brad Beveridge says...
>>
>>This works as expected, but I think the *a2[0] syntax is a bit of a dog. Also, what about the whole "sometimes a=b is an alias, sometimes (when b is resized), it is a copy?"
> 
> Pretty standard Copy On Write behavior.  I haven't decided if I really like it yet, but it is a good solution for most cases.
> 
> Sean

July 24, 2004
I think D arrays should be as efficient as possible.  Copy on write would make them less so.  If you need such high-level functionality it should be part of the standard lib, not D itself.


-- 
-Anderson: http://badmama.com.au/~anderson/
July 24, 2004
I'm not advocating copy on write, I agree that behaviour like that is
perhaps too inefficient to be part of the language.
All I am saying is that if I do
int [] orig;
int [] other;
other = orig;
orig.length = 20;

Then other and orig should always point to the same array, even if the
array needs to be moved in order to satisfy the realloc.
This already (I think) happens with D object references: the GC is free to
move the location of the object memory, but all references are updated.
Why not the same with arrays?  Who cares where the physical memory is, as
long as the handles to the array that I hold are correct?

Brad

> 
> I think D arrays should be as efficient as possible.  Copy on write would make it not so.   If you need such high-level functionality it should be part of the standard lib, not D itself.

July 24, 2004
Brad Beveridge wrote:

>I'm not advocating copy on write, I agree that behaviour like that is
>perhaps too inefficient to be part of the language.
>All I am saying is that if I do
>int [] orig;
>int [] other;
>other = orig;
>orig.length = 20;
>
>Then other and orig will always point to the same array, no matter if the
>array needs to be moved in order to satisfy the realloc.
>This already (I think) happens with D object references, the GC is free to
>move the location of the object memory, but all references are updated. Why not the same with arrays?  Who cares where the physical memory is, as
>long as the handles to the array that I hold are correct.
>
>Brad
>  
>
Righto then.  I guess even that could be inefficient because then you need to keep track of which arrays are which.  If you wrap the array in a class then you'd have no problems.


-- 
-Anderson: http://badmama.com.au/~anderson/