September 08, 2008
On Mon, Sep 8, 2008 at 6:09 AM, Sergey Gromov <snake.scaly@gmail.com> wrote:
> Jarrett Billingsley <jarrett.billingsley@gmail.com> wrote:
>> On Sun, Sep 7, 2008 at 12:59 PM, bearophile <bearophileHUGS@lycos.com> wrote:
>> > Uhm... I don't understand fully: do you mean roots don't become deleted
>> > when they have nothing that points to them?
>> > The docs say:
>> > "Roots are references to memory allocated by the collector that are
>> > maintained in memory outside the collector pool."
>>
>> I think what you should be using instead is addRange.  addRoot more or less tells the GC "never collect memory referenced by this pointer", but it doesn't actually scan the memory _pointed to_ by the roots for pointers.
>
> This just cannot be.  A root is a place where a tree of live memory objects is grafted.  That's why it is called a "root".  A memory block pointed to by a root is kept alive and, obviously, scanned for pointers to other memory blocks if it hasPointers().
>
> There's a problem with addRoot() that, if the memory block is re- allocated, the old root stops serving its purpose.  In that case you need to remove an old root and add the new one, pointing into the re- allocated block.
>
September 08, 2008
On Mon, Sep 8, 2008 at 6:09 AM, Sergey Gromov <snake.scaly@gmail.com> wrote:
> Jarrett Billingsley <jarrett.billingsley@gmail.com> wrote:
>> On Sun, Sep 7, 2008 at 12:59 PM, bearophile <bearophileHUGS@lycos.com> wrote:
>> > Uhm... I don't understand fully: do you mean roots don't become deleted
>> > when they have nothing that points to them?
>> > The docs say:
>> > "Roots are references to memory allocated by the collector that are
>> > maintained in memory outside the collector pool."
>>
>> I think what you should be using instead is addRange.  addRoot more or less tells the GC "never collect memory referenced by this pointer", but it doesn't actually scan the memory _pointed to_ by the roots for pointers.
>
> This just cannot be.  A root is a place where a tree of live memory objects is grafted.  That's why it is called a "root".  A memory block pointed to by a root is kept alive and, obviously, scanned for pointers to other memory blocks if it hasPointers().
>
> There's a problem with addRoot() that, if the memory block is re- allocated, the old root stops serving its purpose.  In that case you need to remove an old root and add the new one, pointing into the re- allocated block.
>

Let's put some words in this one!

You're right, I'm sorry for spreading misinformation.
September 08, 2008
Sergey Gromov, thank you for your explanations, you are quite clear, I have understood everything you have explained :-) And your code works.

I think this discussion may deserve to become a little tutorial useful for people that have never done such things (probably most people coming from all scripting languages, Pascal, Java and maybe C/C# don't know such things if the haven't seen in the University, so they are lot of people), it may be written here:
http://www.prowiki.org/wiki4d/wiki.cgi?DocComments/Phobos/StdGc


> addRange(&new_block.next, &new_block.data+1);

The trick of adding a range of pointers to pointers is cute :-)
I presume that +1 is necessary because it's an interval open on the right, even if the Phobos docs don't tell it:
void addRange(void* pbot, void* ptop);


>Both require Block.~this() to remove any unneeded roots or ranges<

Okay, I understand now. Jarrett Billingsley has told me that those ranges are inefficient, so I hope the roots are more efficient.

Using your second solution:
addRange(&new_block.next, &new_block.data+1);

I have added this destructor:

~this() {
    while (this.list_head != null) {
        Block* next_ptr = this.list_head.next;
        std.gc.removeRange(&this.list_head.next);
        std.gc.realloc(this.list_head.data, 0);
        std.gc.realloc(this.list_head, 0);
        this.list_head = next_ptr;
    }
}


Now I think I understand the Phobos GC API a bit better. But I'll have to be really careful, because doing mistakes with such things looks very easy still for me.

Bye and thank you,
bearophile
September 08, 2008
bearophile <bearophileHUGS@lycos.com> wrote:
> Sergey Gromov, thank you for your explanations, you are quite clear, I have understood everything you have explained :-) And your code works.

You're welcome. :)

> > addRange(&new_block.next, &new_block.data+1);
> 
> The trick of adding a range of pointers to pointers is cute :-)
> I presume that +1 is necessary because it's an interval open on the
> right, even if the Phobos docs don't tell it:
> void addRange(void* pbot, void* ptop);

Well, I assumed that.  Void doesn't have size after all, so I thought pbot and ptop were specifying byte boundaries, not bytes or words themselves.

> >Both require Block.~this() to remove any unneeded roots or ranges<
> 
> Okay, I understand now. Jarrett Billingsley has told me that those ranges are inefficient, so I hope the roots are more efficient.
> 
> Using your second solution:
> addRange(&new_block.next, &new_block.data+1);
> 
> I have added this destructor:
> 
> ~this() {
>     while (this.list_head != null) {
>         Block* next_ptr = this.list_head.next;
>         std.gc.removeRange(&this.list_head.next);
>         std.gc.realloc(this.list_head.data, 0);
>         std.gc.realloc(this.list_head, 0);
>         this.list_head = next_ptr;
>     }
> }

I think that
>         std.gc.realloc(this.list_head.data, 0);
>         std.gc.realloc(this.list_head, 0);
is dangerous, not to mention redundant.  If there are references to elements in data other than in Blocks, reallocing data would silently invalidate those references eventually causing GPF (segfault).  The correct code is

> ~this() {
>     while (this.list_head != null) {
>         Block* next_ptr = this.list_head.next;
>         std.gc.removeRange(&this.list_head.next);
>         this.list_head = next_ptr;
>     }
> }

What it does is:
1. next_ptr makes a reference to list_head.next on the stack so that it
isn't GC'd immediately.
2. removeRange() prevents GC from scanning list_head.next and
list_head.data for pointers.  The next is still referenced from stack so
it lives yet.  The data is not referenced from list_head anymore, and if
there are no other references to it, it is GC'd eventually.  If there
are references though, it lives while those references are alive, which
is expected and correct.
3. list_head = next_ptr removes the last reference to the former
list_head so it gets GC'd automatically later.

> Now I think I understand the Phobos GC API a bit better. But I'll have to be really careful, because doing mistakes with such things looks very easy still for me.
September 08, 2008
On Mon, Sep 8, 2008 at 11:01 AM, bearophile <bearophileHUGS@lycos.com> wrote:

>>Both require Block.~this() to remove any unneeded roots or ranges<
>
> Okay, I understand now. Jarrett Billingsley has told me that those ranges are inefficient, so I hope the roots are more efficient.

They're both pretty inefficient.  Adding a root or range is pretty fast, it might have to resize its internal array.  But removing a root or range requires the linear traversal of the internal list of roots or ranges as well as copying every root/range after it down a slot, which is all very slow, as downs has found out (and patched, though it's not in the official GC yet).
September 08, 2008
Jarrett Billingsley:

> They're both pretty inefficient.  Adding a root or range is pretty fast, it might have to resize its internal array.  But removing a root or range requires the linear traversal of the internal list of roots or ranges as well as copying every root/range after it down a slot, which is all very slow, as downs has found out

I have just read some of the relevant code:

for (size_t i = nranges; i--;) {
  if (ranges[i].pbot == pbot) {
    nranges--;
    cstring.memmove(ranges+i, ranges+i+1, (nranges-i) * ranges[0].sizeof);
    return;
  }
}

it's written in D, but it looks like C because I presume that you can't rely on the GC (to use a more efficient associative array (of or better a set) of Range structs) inside the code that implements the GC itself :-)
Even with that, if ranges are few, the slowdown is little.

The addRange() function doesn't seem to control if a given Range is already present, so it appends it again and again, I think. This may be okay, because if two sources define a Range as source for roots, then they both independently will take care of removing them, but the GC itself may slow down, because it looks from the  same starting points twice. So an associative Range->counter seems a better data structure, where the removeRange actually removes a range only of its counter goes to zero.

In that code I have also understood why Sergey Gromov has added 1 to the second pointer given to addRange():

/// Search a range of memory values and mark any pointers into the GC pool.
void mark(void *pbot, void *ptop) {
    void **p1 = cast(void **)pbot;
    void **p2 = cast(void **)ptop;
    uint changes = 0;
    for (; p1 < p2; p1++) {
...

It's an interval open to the right, indeed.


>(and patched, though it's not in the official GC yet).

Months ago a very gentle person has created a patch for the GC just for me, to reduce a lot a performance problem I have found, but I think that patch too is waiting still.
Where can I find the patch by downs (mostly to take a look)?
I presume the Tango GC doesn't suffer such performance problem.

Bye,
bearophile
September 08, 2008
Sergey Gromov:

>I think that [...] is dangerous, not to mention redundant.  If there are references to elements in data other than in Blocks, reallocing data would silently invalidate those references eventually causing GPF (segfault).  The correct code is<

- It's a depressing, in such kind of code it seems I'm unable to write 5 lines of code without putting in dangerous bugs :-]
- That particular stack implementation is designed for ints only, so they don't contain references, so that code works (It seems).
- But you are right, if the data block contains references then the realloc(... ,0) are very bad.
- So generally in such programs the right thing to do seems to not dealloc memory, but remove the known references to it, and then let the GC deallocate such memory when it can't find references pointing to that memory.
- Yet, I have seen that for some data structures (probably ones with long chains of references) letting the GC do all the work of looking for unconnected components to deallocate them leads to long destruction times (2-10 seconds of time in one case). In such situations I think it may be better to give the GC a hand, actually deallocating the data structure inside the destructor using an efficient algorithm (if the programmer is sure it doesn't contain references to other allocate data) (this needs just few hundreds of seconds in the same case of before).

Bye,
bearophile
September 08, 2008
bearophile <bearophileHUGS@lycos.com> wrote:
> - That particular stack implementation is designed for ints only, so
> they don't contain references, so that code works (It seems).
> - But you are right, if the data block contains references then the
> realloc(... ,0) are very bad.

Your code will work with arrays of pointers either.  The potential problem is not in array elements pointing somewhere else, but in some external references pointing into your array.  Well, since your code guarantees that such external references are impossible, it always works as expected.

> - So generally in such programs the right thing to do seems to not dealloc memory, but remove the known references to it, and then let the GC deallocate such memory when it can't find references pointing to that memory.

It's the safe thing to do.

> - Yet, I have seen that for some data structures (probably ones with long chains of references) letting the GC do all the work of looking for unconnected components to deallocate them leads to long destruction times (2-10 seconds of time in one case). In such situations I think it may be better to give the GC a hand, actually deallocating the data structure inside the destructor using an efficient algorithm (if the programmer is sure it doesn't contain references to other allocate data) (this needs just few hundreds of seconds in the same case of before).

My apologies.  Your last code was correct.  It was just dangerous, as any other explicit memory management code is.  The danger was in that it was easy to make breaking changes to your code and not notice.  My variant was simply safer but also probably slower under some circumstances.

Hugs.
1 2 3 4
Next ›   Last »