May 25, 2009
nobody wrote:
>> One thing you could try is disabling the GC (this really just disables automatic
>> running of the collector) and run it manually at points that you know make sense.
>>  For example, you could just insert a GC.collect() statement at the end of every
>> run of your main loop.
>> Another thing to try is avoiding appending to arrays.  If you know the length in
>> advance, you can get pretty good speedups by pre-allocating the array instead of
>> appending using the ~= operator.
>> You can safely delete specific objects manually even when the GC is enabled.  For
>> very large objects with trivial lifetimes, this is probably worth doing.  First of
>> all, the GC will run less frequently.  Secondly, D's GC is partially conservative,
>> meaning that occasionally memory will not be freed when it should be.  The
>> probability of this happening is proportional to the size of the memory block.
> 
> I have tried all of these. With the GC enabled only periodically in the main loop, memory still grows faster than I expected as I feed more data into the program. I then manually deleted some specific objects, but the program started to fail randomly.
> 
> Has anyone experienced similar issues, i.e. with the GC on, you define your own dtor for a certain class and call delete manually on certain objects?
> 
> The program fails at random stages, with some stack trace showing some GC calls like:
> 
>  0x0821977a in _D2gc3gcx3Gcx16fullcollectshellMFZk ()
> 
> I suspected the GC is buggy when mixed with manual deletes.

After enabling the gc, did you force a collection?  Just enabling it won't cause one to occur.

Later,
Brad
May 25, 2009
nobody, on May 24 at 19:05 you wrote to me:
> Hi,
> 
> I'm writing a data processing program in D that deals with large numbers of small objects. One thing I found is that D's GC is horribly slow in this situation. I tried my program with the GC enabled and disabled (with some manual deletes). The GC-disabled version (2 min) is ~100 times faster than the GC-enabled version (4 hours)!
> 
> But of course the GC-disabled version still leaks memory, and it soon exceeds the machine's memory limit when I try to process more data, while the GC-enabled version doesn't have this problem.
> 
> So my plan is to use the GC-disabled version with manual deletes. But it was very hard to find all the memory leaks. I'm wondering: is there any way to use the GC as a leak detector? Can the GC-enabled version give me some helpful information about which objects get collected, so I can manually delete them in my GC-disabled version?  Thanks!

As others asked, are you using D1 Tango/Phobos? D2? In Tango/D2 you can enable logging in the GC (using the LOGGING version identifier).

Is your program source available? I'm gathering programs to make a D GC benchmark suite, and your program seems like a good candidate for measuring GC performance.

Thank you.

-- 
Leandro Lucarella (luca) | Blog colectivo: http://www.mazziblog.com.ar/blog/
----------------------------------------------------------------------------
GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145  104C 949E BFB6 5F5A 8D05)
----------------------------------------------------------------------------
How important it is, then, in these days of globalization, to scrub
our souls and run the mop over our hearts in order to reach
a true state of Peperian daydreaming.
	-- Peperino Pómoro
May 25, 2009
nobody, on May 24 at 20:03 you wrote to me:
> == Quote from Jason House (jason.james.house@gmail.com)'s article
> > Why not use valgrind? With the GC disabled, it should give accurate results.
> 
> Strangely enough, I have indeed tried valgrind with the GC-disabled version.  It didn't report anything useful.
> 
> That's why I'm puzzled: does D's GC do something special?
> 
> The GC-disabled version runs out of 3G of memory, but the GC-enabled version stays at ~800M throughout the run.

I guess that with that much memory in use, your program could benefit greatly from NO_SCAN if your 800M of data is plain old data. Have you tried it? And if you never keep interior pointers into that data, your program could avoid many of the false positives caused by the GC's conservatism by using NO_INTERIOR (this is only available if you patch the GC with David Simcha's patch[1]).

[1] http://d.puremagic.com/issues/show_bug.cgi?id=2927
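
To make the NO_SCAN suggestion concrete, here is a minimal sketch. It assumes a reasonably recent D2 druntime in which core.memory provides GC.malloc, GC.getAttr and GC.BlkAttr.NO_SCAN (names may differ in older releases such as DMD 2.030). The helper allocNoScan is hypothetical, and the technique only applies to data that really contains no pointers:

```d
import core.memory : GC;

// Hypothetical helper: allocate n doubles in a block that the collector
// will not scan for pointers. Only safe for pointer-free data.
double[] allocNoScan(size_t n)
{
    auto p = cast(double*) GC.malloc(n * double.sizeof, GC.BlkAttr.NO_SCAN);
    auto buf = p[0 .. n];
    buf[] = 0.0; // GC.malloc does not initialize the memory
    return buf;
}

void main()
{
    auto data = allocNoScan(1_000);
    data[10] = 3.5;
    // The block carries the NO_SCAN attribute.
    assert((GC.getAttr(data.ptr) & GC.BlkAttr.NO_SCAN) != 0);
}
```

As far as I know, arrays of pointer-free element types allocated with new already get NO_SCAN from the runtime, so an explicit GC.malloc like this mainly matters for hand-managed buffers.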

-- 
Leandro Lucarella (luca) | Blog colectivo: http://www.mazziblog.com.ar/blog/
This is what you get,
when you mess with us.
May 25, 2009
> > I suspected the GC is buggy when mixed with manual deletes.
> I personally have not experienced this.  Please be more specific: D1 or D2?

D2.

> If D1, Phobos or Tango?
> DMD, LDC, or GDC?

DMD v2.030

> Compiler version?
> Also, please file a bug report, especially if you can create a concise,
> reproducible test case.

It's hard to isolate the code, and since the program is non-trivial I'm not 100% sure it isn't my own bug.
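
A reduced test case for this kind of report could start from a skeleton like the one below. It is only a sketch under assumptions: Node and release are hypothetical names, and it uses destroy plus GC.free, which is the replacement for the delete operator in later D2 compilers (with DMD 2.030 you would write delete n instead). It shows the pattern of a user-defined destructor combined with manual frees while the GC stays enabled:

```d
import core.memory : GC;

// Hypothetical small class with a user-defined destructor.
class Node
{
    static size_t destroyed;      // thread-local count of destructor runs
    int[] payload;

    this(size_t n) { payload = new int[](n); }

    // The destructor must not touch GC-managed members such as payload.
    ~this() { ++destroyed; }
}

// Hypothetical helper: finalize and free one object right away.
void release(ref Node n)
{
    destroy(n);                   // runs ~this() once and marks the object finalized
    GC.free(cast(void*) n);       // hands the block back to the GC immediately
    n = null;                     // do not keep a dangling reference
}

void main()
{
    foreach (i; 0 .. 100)
    {
        auto n = new Node(16);
        release(n);
    }
    assert(Node.destroyed == 100);

    GC.collect();                 // freed objects must not be finalized again
    assert(Node.destroyed == 100);
}
```

If a variant of this that mirrors the real program's object graph fails, it would make a much better bug report than a stack trace alone.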
May 25, 2009
== Quote from Brad Roberts (braddr@puremagic.com)'s article
> After enabling the gc, did you force a collection?  Just enabling it won't cause one to occur.

Yes, I called:

  core.memory.GC.enable();
  core.memory.GC.collect();
  core.memory.GC.disable();
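
For reference, the "disable the GC and collect at known points" approach discussed in this thread can be sketched as follows. This is a minimal illustration, not the original program: processBatch and run are hypothetical stand-ins, and it assumes that an explicit GC.collect() still runs while automatic collection is disabled, as described earlier in the thread.

```d
import core.memory : GC;

// Hypothetical stand-in for one iteration of the real workload:
// it creates short-lived garbage and returns a checksum.
size_t processBatch(size_t n)
{
    size_t total = 0;
    foreach (i; 0 .. n)
    {
        auto tmp = new int[](8);  // short-lived allocation
        tmp[0] = cast(int) i;
        total += tmp[0];
    }
    return total;
}

size_t run(size_t batches, size_t perBatch)
{
    GC.disable();                 // stop automatic collections
    scope (exit) GC.enable();     // restore even if an exception is thrown

    size_t total = 0;
    foreach (b; 0 .. batches)
    {
        total += processBatch(perBatch);
        GC.collect();             // explicit collection at a point we chose
    }
    return total;
}

void main()
{
    import std.stdio : writeln;
    writeln(run(4, 1_000));
}
```

Using scope (exit) keeps the enable/disable calls balanced, which matters because the disable calls nest.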
May 25, 2009
> As others asked, are you using D1 Tango/Phobos? D2? In Tango/D2 you can

DMD v2.030  on Linux.

> enable logging in the GC (using the LOGGING version identifier).

How do I do that in D2?

> Is your program source available? I'm gathering programs to make a D GC

Sorry, no.
May 25, 2009
== Quote from Leandro Lucarella (llucax@gmail.com)'s article
> benefit from using NO_SCAN if your 800M of data are plain old data. Have you tried it? And if you never have interior pointers to that data, your program can possibly avoid a lot of false positives due to the conservatism if you use NO_INTERIOR (this is only available if you patch

No, my data are classes (not structs), and they need to be classes because of
other design considerations;
worse, they contain pointers to other data, e.g.

class SmallDataA {  // needs to be a class
}

class SmallDataB {  // needs to be a class
  SmallDataA a;     // in D 'a' is a reference, or 'pointer'
}

I have thought about using POD. The C++ equivalent of the code above is closer to what I want: the 'a' object itself (not a reference to it) is embedded directly in SmallDataB. I guess that with millions of such SmallDataB objects, the GC is kept busy in D because 'a' is a reference.

So my question is: can we have such expanded class objects in D?

May 25, 2009
nobody, on May 25 at 07:37 you wrote to me:
> == Quote from Leandro Lucarella (llucax@gmail.com)'s article
> > benefit from using NO_SCAN if your 800M of data are plain old data. Have you tried it? And if you never have interior pointers to that data, your program can possibly avoid a lot of false positives due to the conservatism if you use NO_INTERIOR (this is only available if you patch
> 
> No, my data are classes (not structs), and they need to be classes because of
> other design considerations;
> worse, they contain pointers to other data, e.g.
> 
> class SmallDataA {  // needs to be a class
> }
> 
> class SmallDataB {  // needs to be a class
>   SmallDataA a;     // in D 'a' is a reference, or 'pointer'
> }
> 
> I have thought about using POD. The C++ equivalent of the code above is closer to what I want: the 'a' object itself (not a reference to it) is embedded directly in SmallDataB. I guess that with millions of such SmallDataB objects, the GC is kept busy in D because 'a' is a reference.
> 
> So my question is: can we have such expanded class objects in D?

There are requests for them, but they are not available for now...
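
In the meantime, where the inner type does not strictly have to be a class, a struct member gives the embedded layout. The sketch below is only an illustration of that layout difference; InnerA, OuterB, RefA and OuterRef are hypothetical names, and it does not help if SmallDataA really must remain a class:

```d
// Hypothetical value-type version of the inner data.
struct InnerA
{
    int id;
    double weight;
}

class OuterB
{
    InnerA a;   // stored inline in every OuterB, no separate allocation
}

// For comparison: the layout from the post, with a class member.
class RefA
{
    int id;
    double weight;
}

class OuterRef
{
    RefA a;     // only a reference; the RefA object lives elsewhere on the heap
}

void main()
{
    auto b = new OuterB;
    b.a.id = 7;                 // usable immediately, no second `new`
    assert(b.a.id == 7);

    auto r = new OuterRef;
    assert(r.a is null);        // nothing allocated until `r.a = new RefA`

    // The embedded struct makes the outer instance larger than one
    // that holds just a reference.
    static assert(__traits(classInstanceSize, OuterB) >
                  __traits(classInstanceSize, OuterRef));
}
```

With the struct version there is one GC allocation per outer object instead of two, and no extra pointer for the collector to follow.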

-- 
Leandro Lucarella (luca) | Blog colectivo: http://www.mazziblog.com.ar/blog/
HUGO CONZI CAUGHT  ---  HE WAS WEARING A WIG
	-- Crónica TV
May 25, 2009
nobody, on May 25 at 03:24 you wrote to me:
> > As others asked, are you using D1 Tango/Phobos? D2? In Tango/D2 you can
> 
> DMD v2.030  on Linux.
> 
> > enable logging in the GC (using the LOGGING version identifier).
> 
> How to do it in D2?

You should recompile Druntime's GC with -version=LOGGING.

> > Is your program source available? I'm gathering programs to make a D GC
> 
> Sorry, no.

=(

-- 
Leandro Lucarella (luca) | Blog colectivo: http://www.mazziblog.com.ar/blog/
It is more likely that the tomato is a little pear than that the pear is a little tomato.
	-- Peperino Pómoro
May 25, 2009
== Quote from Leandro Lucarella (llucax@gmail.com)'s article
> nobody, on May 25 at 03:24 you wrote to me:
> > > As others asked, are you using D1 Tango/Phobos? D2? In Tango/D2 you can
> >
> > DMD v2.030  on Linux.
> >
> > > enable logging in the GC (using the LOGGING version identifier).
> >
> > How to do it in D2?
> You should recompile Druntime's GC with -version=LOGGING.

When linking, does DMD link against libphobos2.a, libdruntime.a, or both?

> > > Is your program source available? I'm gathering programs to make a D GC
> >
> > Sorry, no.
> =(

But I posted a simple example earlier. It's silly, but disabling the GC gives a 5x speedup:

http://www.digitalmars.com/pnews/read.php?server=news.digitalmars.com&group=digitalmars.D&artnum=90983