Thread overview
[D-runtime] D Concurrent GC
Sep 15, 2010
Sean Kelly
Sep 21, 2010
Leandro Lucarella
Sep 23, 2010
Sean Kelly
Sep 27, 2010
Leandro Lucarella
Sep 28, 2010
Sean Kelly
Sep 28, 2010
Sean Kelly
Sep 29, 2010
Leandro Lucarella
September 15, 2010
Here's a quick and dirty port of the GC to druntime.  I've included a new posix.mak (as posix.mak.new) and the code itself in gcnew/.  What I did was rename src/gc to src/gcbase and create a symbolic link from src/gc to src/gcnew.  Then build with posix.mak.new, etc.  I marked most changes with TODO or BUG, but the pertinent bits were that I added gc_qalloc() and commented out the PointerMap stuff.  I haven't done more than verify that the GC works for some unittests so I'd love to see some more thorough testing.  Porting to Windows may be a bit more involved because I believe I saw a posix header imported somewhere that isn't inside a version(Posix) block.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: gcnew.zip
Type: application/zip
Size: 32332 bytes
Desc: not available
URL: <http://lists.puremagic.com/pipermail/d-runtime/attachments/20100915/15bd74c7/attachment-0001.zip>
September 21, 2010
On Wed, Sep 15, 2010 at 4:04 PM, Sean Kelly <sean at invisibleduck.org> wrote:
> Here's a quick and dirty port of the GC to druntime. ?I've included a new posix.mak (as posix.mak.new) and the code itself in gcnew/. ?What I did was rename src/gc to src/gcbase and create a symbolic link from src/gc to src/gcnew. ?Then build with posix.mak.new, etc. ?I marked most changes with TODO or BUG, but the pertinent bits were that I added gc_qalloc() and commented out the PointerMap stuff. ?I haven't done more than verify that the GC works for some unittests so I'd love to see some more thorough testing. ?Porting to Windows may be a bit more involved because I believe I saw a posix header imported somewhere that isn't inside a version(Posix) block.

I'm very glad that you're interested in porting the concurrent GC to Druntime. I'm sorry I just saw this, but I don't have internet at home since about a week and my main e-mail account was hosted there.

It's very likely that the posix.mak (and any other Makefile is outdated because I'm using a Makefile for building the GC as part of Tango, so I don't use the makefiles in the cdgc repository since ages ago.

I was careful to try to keep Windows working (or at least compiling) but didn't *ever* tried (not even to compile it), so yes, you probably should expect some problems, but hopefully just not fundamental, aesthetic problems.
September 23, 2010
On Sep 21, 2010, at 8:02 AM, Leandro Lucarella wrote:

> On Wed, Sep 15, 2010 at 4:04 PM, Sean Kelly <sean at invisibleduck.org> wrote:
>> Here's a quick and dirty port of the GC to druntime.  I've included a new posix.mak (as posix.mak.new) and the code itself in gcnew/.  What I did was rename src/gc to src/gcbase and create a symbolic link from src/gc to src/gcnew.  Then build with posix.mak.new, etc.  I marked most changes with TODO or BUG, but the pertinent bits were that I added gc_qalloc() and commented out the PointerMap stuff.  I haven't done more than verify that the GC works for some unittests so I'd love to see some more thorough testing.  Porting to Windows may be a bit more involved because I believe I saw a posix header imported somewhere that isn't inside a version(Posix) block.
> 
> I'm very glad that you're interested in porting the concurrent GC to Druntime. I'm sorry I just saw this, but I don't have internet at home since about a week and my main e-mail account was hosted there.
> 
> It's very likely that the posix.mak (and any other Makefile is outdated because I'm using a Makefile for building the GC as part of Tango, so I don't use the makefiles in the cdgc repository since ages ago.
> 
> I was careful to try to keep Windows working (or at least compiling) but didn't *ever* tried (not even to compile it), so yes, you probably should expect some problems, but hopefully just not fundamental, aesthetic problems.

No worries... it should be an easy fix if any compile errors occur.

So what's the deal with PointerMap?  I couldn't find a definition in the GC code, nor in the few obvious places I checked in Tango (ie. object.d).  Are you working from a local branch where this patch is applied?  Also, any chance you'll give D2 a try?
September 27, 2010
OK, here I am again, after 13 days (and only 13 because I've just fixed it myself) without phone and Internet connection, thanks to my friends from TELECOM Argentina.

Sorry about the tiny rant, see the comments below for the important stuff...

Sean Kelly, el 24 de septiembre a las 09:47 me escribiste:
> On Sep 23, 2010, at 2:14 PM, Leandro Lucarella wrote:
> > See bug 3463 (http://d.puremagic.com/issues/show_bug.cgi?id=3463), it's in the patched object.d file. It should be easy to avoid the precise code by hardcoding the opts.options.conservative variable to true and defining PointerMap as in the patched object.d, so everything is scanned conservatively (you can also remove all the code for precise scanning, but I really hope it gets integrated into D sooner than later).
> 
> Since you've already done the work of adding precise scanning to your GC, integrating the rest should be easy.  I'll take a look at the patch.

The main problem here is the DMD-side, you've got to convince Walter.

> How do you feel about the state of your GC right now?  Is it
> ready to be integrated with druntime, or are there still things left
> to do?

I think is ready (as in ready enough, of course there is plenty of
room for improvements still) now that I've pushed the last changes
I couldn't push before because of the connectivity problems. The main
fixed problem was the heap becoming too big with eager allocation if no
big objects were used because minimize was only called by bigAllog(),
but is fixed now.

I don't know if the weakref stuff is working either, since I have to test programs that use them, but I don't know either if druntime have weakrefs.

> I know some additional performance tuning may need to be done
> with your parallel collection scheme, but I'd imagine that could
> always be turned off so it behaves pretty much like the current GC?

Right now, for the only real program I tested (Dil generating the entire Tango docs), is way faster than the basic collector (about 2 times faster on single-core and 2.5 times faster on multi-core), other are way faster too but some tests that are very GC centric (like the shootout binary trees test) are a little slower, but I don't think that's a fair test for this kind of collector.

This is talking about total run-time, the maximum pause is reduced even much more, about 10x or more (with exceptions, but again the exceptions are in not-very-realistic test programs).

So I think performance-wise, it deserves to be the default collector (at
least in Linux). I don't know stability-wise though :)

> I'll have to see what I've changed in the GC since creating druntime as well.  Nothing major I think though.

I think the array appending changes are the bigger since the last time I checked.

-- 
Leandro Lucarella (AKA luca)                     http://llucax.com.ar/
----------------------------------------------------------------------
GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145  104C 949E BFB6 5F5A 8D05)
----------------------------------------------------------------------
Every day 21 new born babies will be given to the wrong parents
September 28, 2010
On Sep 27, 2010, at 3:09 PM, Leandro Lucarella wrote:
> 
> Sean Kelly, el 24 de septiembre a las 09:47 me escribiste:
>> On Sep 23, 2010, at 2:14 PM, Leandro Lucarella wrote:
>>> See bug 3463 (http://d.puremagic.com/issues/show_bug.cgi?id=3463), it's in the patched object.d file. It should be easy to avoid the precise code by hardcoding the opts.options.conservative variable to true and defining PointerMap as in the patched object.d, so everything is scanned conservatively (you can also remove all the code for precise scanning, but I really hope it gets integrated into D sooner than later).
>> 
>> Since you've already done the work of adding precise scanning to your GC, integrating the rest should be easy.  I'll take a look at the patch.
> 
> The main problem here is the DMD-side, you've got to convince Walter.

Gotcha.

>> How do you feel about the state of your GC right now?  Is it
>> ready to be integrated with druntime, or are there still things left
>> to do?
> 
> I think is ready (as in ready enough, of course there is plenty of
> room for improvements still) now that I've pushed the last changes
> I couldn't push before because of the connectivity problems. The main
> fixed problem was the heap becoming too big with eager allocation if no
> big objects were used because minimize was only called by bigAllog(),
> but is fixed now.
> 
> I don't know if the weakref stuff is working either, since I have to test programs that use them, but I don't know either if druntime have weakrefs.

Only as the signal/slot support for object termination, there's no general solution.

>> I know some additional performance tuning may need to be done
>> with your parallel collection scheme, but I'd imagine that could
>> always be turned off so it behaves pretty much like the current GC?
> 
> Right now, for the only real program I tested (Dil generating the entire Tango docs), is way faster than the basic collector (about 2 times faster on single-core and 2.5 times faster on multi-core), other are way faster too but some tests that are very GC centric (like the shootout binary trees test) are a little slower, but I don't think that's a fair test for this kind of collector.

Why are these cases slower?  Do collections occur more often overall?

> This is talking about total run-time, the maximum pause is reduced even much more, about 10x or more (with exceptions, but again the exceptions are in not-very-realistic test programs).
> 
> So I think performance-wise, it deserves to be the default collector (at
> least in Linux). I don't know stability-wise though :)

Vetting the GC will be tricky because basically no one will take the trouble to rebuild druntime with it to try it out.  I think we'll simply have to do our best to make sure it works correctly and then just switch over and wait for bug reports.  Do you have any objection to it being distributed under the Boost license?

>> I'll have to see what I've changed in the GC since creating druntime as well.  Nothing major I think though.
> 
> I think the array appending changes are the bigger since the last time I checked.

There's an APPENDABLE flag now.  I'll check the diffs in the GC code to see if there's anything else.  Thanks for the reply!
September 28, 2010



----- Original Message ----
> From: Sean Kelly <sean at invisibleduck.org>
> 
> On Sep 27, 2010, at 3:09 PM, Leandro Lucarella wrote:
> > 
> > Sean  Kelly, el 24 de septiembre a las 09:47 me escribiste:
> >
> >> I'll have to see what I've  changed in the GC since creating druntime as well.  Nothing  major I think though.
> > 
> > I think the array appending changes are  the bigger since the last time I checked.
> 
> There's an APPENDABLE  flag now.  I'll check the diffs in the GC code to see if
>there's anything  else.  Thanks for the  reply!

Basically, APPENDABLE is used to mark a memory block as being an appendable array.  This helps prevent accidental appending when the array runtime code hasn't been used.

It was pretty easy to add, I just copy-pasted some code that creates the bits. The toughest part really was to go through all the modules that define BlkAttr and add the APPENDABLE flag (unsure why this isn't done in one place...).

There are some other changes I made, particularly to extend.  Basically, the old extend would give up after adding one page if it was available.  I changed it to add as many pages as possible up to the upper limit request size.  This helps keep appending amortized because the array grows proportionately instead of one page at a time.

However, if your GC does not implement this, it's not a severe problem, it just won't perform quite as well when appending.

-Steve




September 28, 2010
On Sep 28, 2010, at 8:19 AM, Steve Schveighoffer wrote:
> 
> It was pretty easy to add, I just copy-pasted some code that creates the bits. The toughest part really was to go through all the modules that define BlkAttr and add the APPENDABLE flag (unsure why this isn't done in one place...).

Part of the bid to have each piece of the runtime be separately compilable.  The BlkAttr flag values were defined in the spec, so each portion of the runtime just defined them.  I've since relaxed on this goal though and now think it's okay for everything to import whatever is in core, so the compiler runtime code could import core.memory and use that instead.
September 28, 2010
Sean Kelly, el 28 de septiembre a las 07:49 me escribiste:
> >> I know some additional performance tuning may need to be done
> >> with your parallel collection scheme, but I'd imagine that could
> >> always be turned off so it behaves pretty much like the current GC?
> > 
> > Right now, for the only real program I tested (Dil generating the entire Tango docs), is way faster than the basic collector (about 2 times faster on single-core and 2.5 times faster on multi-core), other are way faster too but some tests that are very GC centric (like the shootout binary trees test) are a little slower, but I don't think that's a fair test for this kind of collector.
> 
> Why are these cases slower?  Do collections occur more often overall?

I'm not 100% sure yet, but I think on one side, the extra overhead imposed by fork()ing and waiting is bigger than the gain the concurrency provides (which is almost zero for programs that doesn't do any actual work except calling GC code). Then, there is the problem of the static data being much bigger in my GC because of TypeInfos.

> > This is talking about total run-time, the maximum pause is reduced even much more, about 10x or more (with exceptions, but again the exceptions are in not-very-realistic test programs).
> > 
> > So I think performance-wise, it deserves to be the default collector (at
> > least in Linux). I don't know stability-wise though :)
> 
> Vetting the GC will be tricky because basically no one will take the trouble to rebuild druntime with it to try it out.  I think we'll simply have to do our best to make sure it works correctly and then just switch over and wait for bug reports.  Do you have any objection to it being distributed under the Boost license?

Not at all. I have no problems with any license, the only care I would like to take is, I hope the GC can be used in both Druntime and Tango. The GC is based on code written by Walter, you and David Friedman according to the sources, so I wanted to eventually ask all of you about the licensing :). From my side, I'm willing to free the code in whatever license is needed to integrate it to whatever anyone wants to integrate it.

> >> I'll have to see what I've changed in the GC since creating druntime as well.  Nothing major I think though.
> > 
> > I think the array appending changes are the bigger since the last time I checked.
> 
> There's an APPENDABLE flag now.  I'll check the diffs in the GC code to see if there's anything else.  Thanks for the reply!

Sure! Thanks for taking interest on this!

-- 
Leandro Lucarella (AKA luca)                     http://llucax.com.ar/
----------------------------------------------------------------------
GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145  104C 949E BFB6 5F5A 8D05)
----------------------------------------------------------------------
DIEZ "PUNGAS" MENOS
	-- Cr?nica TV