Jump to page: 1 2
Thread overview
Dynamic arrays, Associative arrays, etc...
Apr 18, 2005
Paul Bonser
Apr 18, 2005
Regan Heath
Apr 19, 2005
Ilya Minkov
Apr 19, 2005
Regan Heath
Apr 19, 2005
Georg Wrede
Apr 19, 2005
Regan Heath
Apr 20, 2005
Paul Bonser
May 16, 2005
Brian White
Apr 18, 2005
Regan Heath
Apr 18, 2005
Manfred Nowak
Apr 18, 2005
Ben Hinkle
April 18, 2005
I know how to use them and whatnot, but I was just wondering, what exactly happens behind the scenes when you change the size of an array in D? Does it just make a new one and copy over the old stuff, then leave the old array for the GC to pick up?

I'm working on a game (well, the design for a game at the moment, as I don't want to just haphazardly start coding before I know what I want to do), and I am thinking of speed as a big factor of my design.

One of the things that I'm planning is to avoid using the GC as much as possible, so I just want to be completely sure when it is used.

When you use delete, I'm assuming that whatever it is you are deleting will simply be freed right then and there? Is this correct?

I was also wondering about how Associative arrays are stored internally, again, since I have speed in mind. I'd like to know so I will know if it would be faster to use a Map class of my own creation (which I might do anyway for better integration into my class design).

I'm sure I'll have lots more questions as I go along here, so expect more in the future...

-- 
-PIB

--
"C++ also supports the notion of *friends*: cooperative classes that
are permitted to see each other's private parts." - Grady Booch
April 18, 2005
On Mon, 18 Apr 2005 12:19:29 -0700, Paul Bonser <misterpib@gmail.com> wrote:
> I know how to use them and whatnot, but I was just wondering, what exactly happens behind the scenes when you change the size of an array in D? Does it just make a new one and copy over the old stuff, then leave the old array for the GC to pick up?

I'd imagine it 'reallocates' the internal data pointer, this does not copy unless the reallocate moves the memory block. So, there is no 'old array' for the GC to collect. Any items now outside the length (i.e. when you make it smaller) will be collected by the GC eventualy.

> I'm working on a game (well, the design for a game at the moment, as I don't want to just haphazardly start coding before I know what I want to do), and I am thinking of speed as a big factor of my design.

Then you'll want to use array slicing as much as possible. Array slicing allows you to wrap existing data in an array without copying. This is very efficient. You can also pre-allocate objects and use object pooling to increase speed and decrease the likelyhood of the GC running during time critical phases.

> One of the things that I'm planning is to avoid using the GC as much as possible, so I just want to be completely sure when it is used.

When in time critical pieces of code avoid allocating memory, the reason for this is that the GC will do it's thing if you ask it to, or if you ask for more memory and the amount of memory available is getting "tight".

> When you use delete, I'm assuming that whatever it is you are deleting will simply be freed right then and there? Is this correct?

I believe so:
"The program can explicitly inform the garbage collector that an object is no longer referred to (with the delete expression), and then the garbage collector calls the destructor immediately, and adds the object's memory to the free storage. The destructor is guaranteed to never be called twice."

> I was also wondering about how Associative arrays are stored internally, again, since I have speed in mind. I'd like to know so I will know if it would be faster to use a Map class of my own creation (which I might do anyway for better integration into my class design).

An AA is essentially a "hash table". I believe a Map is the same sort of thing. I would use an AA and if you're noticing slow parts profile the application to see where. If it turns out that the AA is slow, write your own.

> I'm sure I'll have lots more questions as I go along here, so expect more in the future...

I hope my replies are helpful.

Regan
April 18, 2005
"Paul Bonser" <misterpib@gmail.com> wrote in message news:d41151$1nc3$1@digitaldaemon.com...
>I know how to use them and whatnot, but I was just wondering, what exactly happens behind the scenes when you change the size of an array in D? Does it just make a new one and copy over the old stuff, then leave the old array for the GC to pick up?

Only if the new size doesn't fit.  If you shrink an array, it will never be moved.  If you increase the size of an array, it will attempt to do so in place, but if there's not enough room, it will be copied.  If you're familiar with the behavior of realloc(), it's pretty much the same.

> I'm working on a game (well, the design for a game at the moment, as I
> don't want to just haphazardly start coding before I know what I want to
> do), and I am thinking of speed as a big factor of my design.
> One of the things that I'm planning is to avoid using the GC as much as
> possible, so I just want to be completely sure when it is used.
> When you use delete, I'm assuming that whatever it is you are deleting
> will simply be freed right then and there? Is this correct?

I don't believe so.  The destructor is called immediately, but I don't think the memory is freed.  After that, I'm not sure what happens to the memory. I noticed, when I was looking through the GC source, that it seems to have a free list.  Perhaps the destructed memory chunk is placed on a free list? This would be a question to ask someone more knowledgeable than myself :)

> I was also wondering about how Associative arrays are stored internally, again, since I have speed in mind. I'd like to know so I will know if it would be faster to use a Map class of my own creation (which I might do anyway for better integration into my class design).

It's stored as a binary tree, if my memory serves me.


April 18, 2005
On Mon, 18 Apr 2005 17:32:21 -0400, Jarrett Billingsley <kb3ctd2@yahoo.com> wrote:
> "Paul Bonser" <misterpib@gmail.com> wrote in message
> news:d41151$1nc3$1@digitaldaemon.com...
>> When you use delete, I'm assuming that whatever it is you are deleting
>> will simply be freed right then and there? Is this correct?
>
> I don't believe so.  The destructor is called immediately, but I don't think
> the memory is freed.  After that, I'm not sure what happens to the memory.
> I noticed, when I was looking through the GC source, that it seems to have a
> free list.  Perhaps the destructed memory chunk is placed on a free list?
> This would be a question to ask someone more knowledgeable than myself :)

I think you're right here.. but you're also right that someone more knowledgable than ourselves is required to get an authoritive answer.

>> I was also wondering about how Associative arrays are stored internally,
>> again, since I have speed in mind. I'd like to know so I will know if it
>> would be faster to use a Map class of my own creation (which I might do
>> anyway for better integration into my class design).
>
> It's stored as a binary tree, if my memory serves me.

Really? and here I was thinking it was a hash table...

Regan
April 18, 2005
"Regan Heath" <regan@netwin.co.nz> wrote:

[...]
> Really? and here I was thinking it was a hash table...

For the OP the time requirements are of concern, not a description of the actual implementation. However, because DA's and AA's are not naturally supported by current Computers there is a gap in the docs: the docs are to express guaranteed space and time bounds for these structures.

In general the question of Paul is not answerable, because he also does not give any restrictions. Therefore it is even unknown whether an approach with DA's or AA's might be effective.

-manfred
April 18, 2005
>>> I was also wondering about how Associative arrays are stored internally, again, since I have speed in mind. I'd like to know so I will know if it would be faster to use a Map class of my own creation (which I might do anyway for better integration into my class design).
>>
>> It's stored as a binary tree, if my memory serves me.
>
> Really? and here I was thinking it was a hash table...

dmd implements AAs as a hash table with a binary tree to handle hash collisions.


April 19, 2005
Regan Heath wrote:
> On Mon, 18 Apr 2005 12:19:29 -0700, Paul Bonser <misterpib@gmail.com>  wrote:
> 
>> I know how to use them and whatnot, but I was just wondering, what  exactly happens behind the scenes when you change the size of an array  in D? Does it just make a new one and copy over the old stuff, then  leave the old array for the GC to pick up?
> 
> 
> I'd imagine it 'reallocates' the internal data pointer, this does not copy  unless the reallocate moves the memory block. So, there is no 'old array'  for the GC to collect. Any items now outside the length (i.e. when you  make it smaller) will be collected by the GC eventualy.

No, whenever length increases (also when concatenating and such), the array is duplicated, and the duplicate is given new length. The old array stays floating for GC to collect - or, perhaps, for other parts of software to use, if they need not be aware of what was happening.

In GDC, some optimizations were implemented IIRC - basically a one-bit reference counting for arrays, which keeps track of whether the array has one or multiple owners. If it has one, it is reallocated, if potentially multiple the DMD strategy is applied. Perhaps it wasn't too much of a gain if it wasn't included in DMD.

If you shorten/slice the array, what you have are pointers pointing into the old array's memory block, and as long as there are some the complete array - and from the view of garbage collector everything are just memory blocks, with perhaps an associated destructor, and someday in future perhaps with a custom scanning function - shall be kept alive. Since i believe we would always have some conservative GC roots, it is quite unlikely GC would ever consern itself with the slices of an array used/unused and smartly discard them.

>> I'm working on a game (well, the design for a game at the moment, as I  don't want to just haphazardly start coding before I know what I want to  do), and I am thinking of speed as a big factor of my design.
> 
> 
> Then you'll want to use array slicing as much as possible. Array slicing  allows you to wrap existing data in an array without copying. This is very  efficient. You can also pre-allocate objects and use object pooling to  increase speed and decrease the likelyhood of the GC running during time  critical phases.

Note: GC collections are only run at points where memory is alloctaed with new, so it is guaranteed not to touch high-performance inner loops.

>> One of the things that I'm planning is to avoid using the GC as much as  possible, so I just want to be completely sure when it is used.

This is the question of right amounts. If your whole program only has an active set of at most a few megabytes, you never have to worry about garbage collector performance. How do you keep it small? First thing, i would make "resource managers" for images, textures, and raw geometry data, and allocate them using C facilities - std.c.malloc, stc.c.free. You can slice into memory and use it like always but this memory will never be scanned by GC for other pointers. Variants of resource manages include a simple object which encapsulates the pointer and frees it when it gets destructed by GC, or perhaps when you need to manage a lot and fast, a manager which collects a group of pointers and frees them at once, and gets destroyed manually. The manager is in your active set, the raw memory from GC's view isn't.

-eye
April 19, 2005
On Tue, 19 Apr 2005 11:06:35 +0200, Ilya Minkov <minkov@cs.tum.edu> wrote:
> Regan Heath wrote:
>> On Mon, 18 Apr 2005 12:19:29 -0700, Paul Bonser <misterpib@gmail.com>  wrote:
>>
>>> I know how to use them and whatnot, but I was just wondering, what  exactly happens behind the scenes when you change the size of an array  in D? Does it just make a new one and copy over the old stuff, then  leave the old array for the GC to pick up?
>>   I'd imagine it 'reallocates' the internal data pointer, this does not copy  unless the reallocate moves the memory block. So, there is no 'old array'  for the GC to collect. Any items now outside the length (i.e. when you  make it smaller) will be collected by the GC eventualy.
>
> No, whenever length increases (also when concatenating and such), the array is duplicated, and the duplicate is given new length.

Are you sure? That seems to be a bad way to do it, given that "realloc" can and will extend the block of memory you currently have if there is space. How do you explain these results:

void main()
{
	char[] test;
	
	printf("%08x\n",test.ptr);
	
	test ~= 'a';
	printf("%08x\n",test.ptr);
	
	test ~= 'b';
	printf("%08x\n",test.ptr);
	
	test ~= 'c';
	printf("%08x\n",test.ptr);
	
	test.length = 10;
	printf("%08x\n",test.ptr);
	
	test.length = 4;
	printf("%08x\n",test.ptr);
	
	test.length = 9999999;      //this should be large enough to cause reallocation
	printf("%08x\n",test.ptr);
}

Output:
00000000
00870fe0
00870fe0
00870fe0
00870fe0
00870fe0
00970000

Regan
April 19, 2005
Regan Heath wrote:
> On Tue, 19 Apr 2005 11:06:35 +0200, Ilya Minkov <minkov@cs.tum.edu> wrote:
> 
>> Regan Heath wrote:
>>
>>> On Mon, 18 Apr 2005 12:19:29 -0700, Paul Bonser <misterpib@gmail.com>   wrote:
>>>
>>>> I know how to use them and whatnot, but I was just wondering, what   exactly happens behind the scenes when you change the size of an  array  in D? Does it just make a new one and copy over the old stuff,  then  leave the old array for the GC to pick up?
>>>
>>>   I'd imagine it 'reallocates' the internal data pointer, this does not  copy  unless the reallocate moves the memory block. So, there is no  'old array'  for the GC to collect. Any items now outside the length  (i.e. when you  make it smaller) will be collected by the GC eventualy.
>>
>>
>> No, whenever length increases (also when concatenating and such), the  array is duplicated, and the duplicate is given new length.
> 
> 
> Are you sure? That seems to be a bad way to do it, given that "realloc"  can and will extend the block of memory you currently have if there is  space. How do you explain these results:
> 
> void main()
> {
>     char[] test;
>         printf("%08x\n",test.ptr);
>         test ~= 'a';
>     printf("%08x\n",test.ptr);
>         test ~= 'b';
>     printf("%08x\n",test.ptr);
>         test ~= 'c';
>     printf("%08x\n",test.ptr);
>         test.length = 10;
>     printf("%08x\n",test.ptr);
>         test.length = 4;
>     printf("%08x\n",test.ptr);
>         test.length = 9999999;      //this should be large enough to cause  reallocation
>     printf("%08x\n",test.ptr);
> }
> 
> Output:
> 00000000
> 00870fe0
> 00870fe0
> 00870fe0
> 00870fe0
> 00870fe0
> 00970000
> 
> Regan

Just off-hand, IIRC (correct anybody me if I'm wrong), char[] does allocate some non-zero size by default. This is a speed optimization, at the cost of memory.

The array grows only when full, and then (IIRC again) in proportion to the previous size. So, after the array (at some time) has grown, then the next few additions will definitely not cause a reallocation (unless of course enough items get appended at once).

Setting .length to something smaller just chops the array. Setting it later to something larger will cause a reallocation only if there is not room on the heap to grow in-place to the required size.

The test above might have been more informative if the size of the array was explicitly set to something small (say 2) initially, and thereafter something else was newed (in order to create an obstacle for array growth on the heap). Then we might see a reallocation at appending to the array.

Another interesting thing would be to have a few slices into the array, and then see what actually happens to all of them when the array gets reallocated.

April 19, 2005
On Tue, 19 Apr 2005 14:15:08 +0300, Georg Wrede <georg.wrede@nospam.org> wrote:
> Setting it later to something larger will cause a reallocation only if there is not room on the heap to grow in-place to the required size.

That is what I was trying to proove.

Regan
« First   ‹ Prev
1 2