GC doesn't collect where expected
June 19, 2023

I'm doing some experiments with the ldc2 GC, instrumenting it to print basic information (what is allocated and what is freed).

My first tests use this sample:

>> cat test2.d
import core.memory;

class Bar { int bar; }

class Foo {

  this()
  {
    this.bar = new Bar;
  }

  Bar bar;
}


void func()
{
  Foo f2 = new Foo;
}

int main()
{
  Foo f = new Foo;

  func();
  GC.collect();

  return 0;
}

When I run this against the instrumented druntime, I get strange behavior: the first collection (triggered by GC.collect) doesn't sweep anything; in particular, it doesn't sweep the memory allocated in func(). All the sweeping happens when the program finishes, at cleanup. I don't understand why: the memory allocated in func() shouldn't be reachable from any root at the first collection, right?

╰─> /instrumented-ldc2 -g -O0 test2.d --disable-gc2stack --disable-d-passes --of test2  &&  ./test2 "--DRT-gcopt=cleanup:collect fork:0 parallel:0 verbose:2"


[test2.d:26] new 'test2.Foo' (24 bytes) => p = 0x7f3a0454d000
[test2.d:10] new 'test2.Bar' (20 bytes) => p = 0x7f3a0454d020
[test2.d:21] new 'test2.Foo' (24 bytes) => p = 0x7f3a0454d040
[test2.d:10] new 'test2.Bar' (20 bytes) => p = 0x7f3a0454d060

============ COLLECTION  =============
        ============= MARKING ==============
        marking range: [0x7fff22337a60..0x7fff22339000] (0x15a0)
                range: [0x7f3a0454d000..0x7f3a0454d020] (0x20)
                range: [0x7f3a0454d040..0x7f3a0454d060] (0x20)
        marking range: [0x7f3a0464d720..0x7f3a0464d8b9] (0x199)
        marking range: [0x46c610..0x47b3b8] (0xeda8)
        ============= SWEEPING ==============
=====================================================


============ COLLECTION  =============
        ============= MARKING ==============
        marking range: [0x46c610..0x47b3b8] (0xeda8)
        ============= SWEEPING ==============
        Freeing test2.Foo (test2.d:26; 24 bytes) (0x7f3a0454d000). AGE :  1/2
        Freeing test2.Bar (test2.d:10; 20 bytes) (0x7f3a0454d020). AGE :  1/2
        Freeing test2.Foo (test2.d:21; 24 bytes) (0x7f3a0454d040). AGE :  1/2
        Freeing test2.Bar (test2.d:10; 20 bytes) (0x7f3a0454d060). AGE :  1/2
=====================================================
June 19, 2023

On 6/19/23 12:13 PM, axricard wrote:

>

When I run this against the instrumented druntime, I get strange behavior: the first collection (triggered by GC.collect) doesn't sweep anything; in particular, it doesn't sweep the memory allocated in func(). All the sweeping happens when the program finishes, at cleanup. I don't understand why: the memory allocated in func() shouldn't be reachable from any root at the first collection, right?

In general, the language does not guarantee when the GC will collect your item.

In this specific case, most likely it's a stale register or stack reference. One way I usually use to ensure such things is to call a function that destroys the existing stack:

void clobber()
{
   int[2048] x;
}

Calling this function will clear out 2048x4 bytes of data to 0 on the stack.

-Steve
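
Editor's sketch, not part of Steve's post: one way the suggestion above might be applied to the sample from the first post is to call clobber() right before the forced collection. This assumes the stale reference sits in the stack region that clobber's frame happens to cover. Nothing guarantees that, and an optimizing build may drop the array entirely.

```d
import core.memory : GC;

class Bar { int bar; }

class Foo
{
    Bar bar;
    this() { bar = new Bar; }
}

void func()
{
    Foo f2 = new Foo; // garbage once func() returns, in principle
}

// Helper from the post above: x is default-initialized, so the frame
// writes 2048 * int.sizeof = 8192 zero bytes (unless the compiler elides it).
void clobber()
{
    int[2048] x;
}

void main()
{
    Foo f = new Foo;

    func();
    clobber();    // overwrite whatever func() may have left on the stack
    GC.collect(); // may now sweep f2's Foo and Bar; still not guaranteed
}
```

Whether the first collection then frees anything can still differ between compilers, flags, and platforms.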

June 19, 2023

On Monday, 19 June 2023 at 16:43:30 UTC, Steven Schveighoffer wrote:

>

In this specific case, most likely it's a stale register or stack reference. One way I usually use to ensure such things is to call a function that destroys the existing stack:

void clobber()
{
   int[2048] x;
}

Calling this function will clear out 2048x4 bytes of data to 0 on the stack.

-Steve

Could you elaborate on how you use this? When do you call it? Just every so often, or is there some thought behind it?

June 19, 2023

On Monday, 19 June 2023 at 16:43:30 UTC, Steven Schveighoffer wrote:

>

In general, the language does not guarantee when the GC will collect your item.

In this specific case, most likely it's a stale register or stack reference. One way I usually use to ensure such things is to call a function that destroys the existing stack:

void clobber()
{
   int[2048] x;
}

Calling this function will clear out 2048x4 bytes of data to 0 on the stack.

-Steve

All clear, thank you!

June 19, 2023

On 6/19/23 12:51 PM, Anonymouse wrote:

>

On Monday, 19 June 2023 at 16:43:30 UTC, Steven Schveighoffer wrote:

>

In this specific case, most likely it's a stale register or stack reference. One way I usually use to ensure such things is to call a function that destroys the existing stack:

void clobber()
{
   int[2048] x;
}

Calling this function will clear out 2048x4 bytes of data to 0 on the stack.

Could you elaborate on how you use this? When do you call it? Just every so often, or is there some thought behind it?

Just before forcing a collect.

The stack is always scanned conservatively, and even though really the stack data should be blown away by the next function call (probably GC.collect), it doesn't always work out that way. Indeed, even just declaring x might not do it if the compiler decides it doesn't actually have to.

But I've found that seems to help.

-Steve
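
Editor's sketch, an assumption rather than anything Steve posted: if the worry is the compiler eliding the zeroing, the stores can be made volatile with volatileStore from druntime's core.volatile module. The name clobberVolatile is hypothetical. This only addresses dead-store elimination. It still overwrites only the stack region its own frame occupies, and it does nothing about stale registers.

```d
import core.memory : GC;
import core.volatile : volatileStore;

// Hypothetical variant of clobber(): every store is volatile, so the
// optimizer is not supposed to drop it as a dead store.
pragma(inline, false)
void clobberVolatile()
{
    uint[2048] x = void;        // skip default init; each slot is written below
    foreach (ref e; x)
        volatileStore(&e, 0u);  // zero one 4-byte slot of this stack frame
}

void main()
{
    // ... allocate and drop references here ...
    clobberVolatile(); // called just before forcing a collect, as described above
    GC.collect();
}
```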

June 19, 2023

On Monday, 19 June 2023 at 16:43:30 UTC, Steven Schveighoffer wrote:

>

In general, the language does not guarantee when the GC will collect your item.

In this specific case, most likely it's a stale register or stack reference. One way I usually use to ensure such things is to call a function that destroys the existing stack:

void clobber()
{
   int[2048] x;
}

Calling this function will clear out 2048x4 bytes of data to 0 on the stack.

-Steve

Does that mean that if my function func() looks like the following (and I don't use clobber), I could hold on to a lot of memory for a very long time (until the stack is fully overwritten by other function calls)?

void func()
{
   Foo[2048] x;
   foreach(i; 0 .. 2048)
     x[i] = new Foo;
}
June 19, 2023

On 6/19/23 2:01 PM, axricard wrote:

>

Does that mean that if my function func() looks like the following (and I don't use clobber), I could hold on to a lot of memory for a very long time (until the stack is fully overwritten by other function calls)?

void func()
{
    Foo[2048] x;
    foreach(i; 0 .. 2048)
      x[i] = new Foo;
}

When the GC stops all threads, each of them registers their current stack as the target to scan, so most likely not.

However, the compiler/optimizer does not zero out the stack unnecessarily, and in some cases this likely leads to false pointers. Like I said, even the "clobber" function might not actually zero out any stack, because the compiler may decide that writing zeros that will never be read is a "dead store" and simply omit it.

This question comes up somewhat frequently ("why isn't the GC collecting the garbage I gave it?"), and the answer is mostly "don't worry about it". There is no really good way to guarantee a particular interaction between the compiler, the optimizer, and the runtime so that something happens one way or another. The only thing you really should care about is whether an item you still hold a reference to gets collected prematurely. That would be a bug. Other than that, just don't worry about it.

-Steve
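
Editor's sketch: for anyone who wants to observe what a collection did without instrumenting druntime, core.memory exposes GC.stats(). The class Payload and the helper names below are hypothetical, and the numbers printed will vary by compiler, flags, and platform. In line with the advice above, an unchanged figure is not a bug.

```d
import core.memory : GC;
import std.stdio : writefln;

class Payload { ubyte[64] data; }

// Allocate objects that nothing refers to afterwards, at least in principle.
void allocateGarbage()
{
    foreach (i; 0 .. 1000)
    {
        auto p = new Payload;
    }
}

// Returns the GC's used byte count before ([0]) and after ([1]) a forced collect.
size_t[2] usedAroundCollect()
{
    size_t[2] used;
    allocateGarbage();
    used[0] = GC.stats().usedSize;
    GC.collect();
    used[1] = GC.stats().usedSize;
    return used;
}

void main()
{
    auto used = usedAroundCollect();
    writefln("used before collect: %s bytes, after: %s bytes", used[0], used[1]);
}
```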