Discussion on the Discussion of the Design for a new GC
Orvid King
April 23, 2014
To get the ball rolling on the new GC I intend to implement for D, I want to facilitate a lively discussion of its design, so that it ends up both robust and free of design flaws. To keep the discussion from getting derailed, I want to lay out a few guidelines, but I’d like feedback on them before putting them into effect. My current draft is as follows:

First, a brief overview of the development process:
A PR will be created for DMD, DRuntime, and, although it may stay empty, Phobos. A new commit will be created for each update of the implementation, including bug fixes and continuing work, in as many iterations as are required. This allows progressive review of the code rather than review of the PRs as a whole, which would be impractical because they are likely to include several thousand lines of changes.
No force push should ever be done to the PRs, except to fix a typo in, or clarify a detail of, the commit message of the newest commit. If a commit message contains a typo or is unclear about what was actually done, and another commit has already been pushed, the typo or unclear message shall remain for all eternity. The suggested remedy in that case is to note the typo or clarify the message in a comment on the commit.
PRs to the PRs are welcome; however, you are encouraged to coordinate any work you do with the others actively working on the GC. The primary outlet for this should be IRC, but the mailing list is a valid venue should the need arise.
GitHub should be the primary outlet for discussion of actual code, due to the ease of referencing code and the ability to tell whether a comment refers to code that has since been changed.
The mailing list should be used exclusively for discussion of the design. It should not be used for discussing snippets of code in the actual implementation. It can, and should, be used to discuss snippets of code that demonstrate a flaw, weakness, or strength in the design.
IRC should be used for rapid-fire Q&A, or for bringing someone up to date on the discussion and progress of the GC so far. Discussion of inconsistencies in the coding style of the implementation (whitespace, newlines, etc.) should stay exclusively on IRC, as a future reader of the discussions doesn’t really care about them. If a discussion of the overall code style used in the implementation is required, a thread should be created on the mailing list.
IRC should not be used for design discussion. The reason is twofold: firstly, IRC has a limited audience and thus yields limited feedback; secondly, I want the discussion of the design to stand as documentation of why the GC is designed the way it is.

Now, on to the guidelines for the design discussion.
ARC does not exist. We are implementing a GC. However, if the opportunity arises to interface efficiently with an external ARC platform, such as the one used in Objective-C, discussion of that interfacing mechanism is permitted.
If DMD support is needed, it exists. This means that if the GC needs DMD to be capable of something, such as scope analysis, in order to make a particular optimization, then DMD should be assumed to be capable of it.
While language additions may be proposed, the design must still function if they are not made, as additions should only enable further optimization opportunities. An example would be re-introducing scoped class locals.



After all of that, I intend to include a base draft of the design of the GC, along with opening the PRs and committing the starting API. So, is there something I’m missing? Am I overlooking the obvious? Is there a more practical way to produce the same results?
April 23, 2014
On Wednesday, 23 April 2014 at 15:33:36 UTC, Orvid King wrote:
> After all of that, I intend to include a base draft of the design of the GC, along with opening the PRs and committing the starting API. So, is there something I’m missing? Am I overlooking the obvious? Is there a more practical way to produce the same results?

Wow, this takes the D forums' tradition of "all talk, no code," or as the original saying goes, "all hat, no cattle," to a new peak. ;)

I think most would agree with you on most of these guidelines; better to get on to the actual design. It might help if you put forth a tentative proposal that the D goons can then proceed to destroy... I mean, critically evaluate.
April 23, 2014
On Wednesday, 23 April 2014 at 15:33:36 UTC, Orvid King wrote:
> After all of that, I intend to include a base draft of the design of the GC, along with opening the PRs and committing the starting API. So, is there something I’m missing? Am I overlooking the obvious? Is there a more practical way to produce the same results?

What is the state of Rainer Schütze's precise gc? Duplication of effort and all that.
April 24, 2014
On Wed, 23 Apr 2014 18:35:25 +0000,
"Messenger" <dont@shoot.me> wrote:

> What is the state of Rainer Schütze's precise gc? Duplication of effort and all that.

+1. And I hope you know what you are up to :D. Some people may expect a magic pill to emerge from your efforts that makes the GC approx. as fast as manual memory management for typical uses, or at least as good as the one in Java. We must not forget that Java has just-in-time compilation and no raw pointer access. They might have found clever ways to make use of features/restrictions in Java that are not available to D. Memory compaction is one off the top of my head.

I only know for sure that you are looking into using some
ideas from TCMalloc. Other than that, what are the exact
problems you are trying to solve? That would be good to know,
since different goals might require different implementations.
E.g. a precise GC is generally slowed down by checking
data types, but it doesn't keep memory alive because some
float variable happens to look like a pointer to it.
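
The trade-off described above can be sketched with a toy model. This is only an illustration under stated assumptions, not how druntime scans memory: the heap is a hypothetical dict of addresses, every field carries an invented `kind` tag standing in for type information, and the names `HEAP`, `ROOTS` and `mark` are made up for the example. A plain integer plays the role of the value that "happens to look like a pointer".

```python
# Toy heap: address -> list of (kind, value) fields.
# kind is "ptr" for a real pointer and "data" for anything else.
HEAP = {
    0x1000: [("ptr", 0x2000), ("data", 7)],
    0x2000: [("data", 42)],
    0x3000: [("data", 99)],  # garbage: no real pointer refers to it
}

# Roots: one real pointer, plus a plain number whose bits equal 0x3000.
ROOTS = [("ptr", 0x1000), ("data", 0x3000)]


def mark(roots, precise):
    """Return the set of heap addresses considered live."""
    live = set()
    stack = list(roots)
    while stack:
        kind, value = stack.pop()
        if precise:
            # Precise: consult the type tag (the extra check that costs time).
            is_pointer = kind == "ptr"
        else:
            # Conservative: any word that matches a heap address counts.
            is_pointer = value in HEAP
        if is_pointer and value in HEAP and value not in live:
            live.add(value)
            stack.extend(HEAP[value])
    return live


conservative_live = mark(ROOTS, precise=False)
precise_live = mark(ROOTS, precise=True)
```

In this model the conservative pass keeps the block at `0x3000` alive because of the look-alike number, while the precise pass frees it at the price of a type check on every field.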

What are the limitations of garbage collection? As an example: if someone loads a graph structure with a few million items into memory, can they still make any assumptions about the run time of GC.alloc()? Can generational collection be implemented?

-- 
Marco

April 24, 2014

On 23.04.2014 20:35, Messenger wrote:
> On Wednesday, 23 April 2014 at 15:33:36 UTC, Orvid King wrote:
>> After all of that, I intend to include a base draft of the design of
>> the GC, along with opening the PRs and committing the starting API.
>> So, is there something I’m missing? Am I overlooking the obvious? Is
>> there a more practical way to produce the same results?
>
> What is the state of Rainer Schütze's precise gc? Duplication of effort
> and all that.

The implementation relies on correct RTInfo generation, but that still doesn't work, as this pull request has been sitting there for 8 months now without getting much review: https://github.com/D-Programming-Language/dmd/pull/2480

Coincidentally, I updated the PR just a couple of days ago.

The precise GC changes for druntime are here: https://github.com/rainers/druntime/tree/gcx_precise2
April 24, 2014
On Wednesday, 23 April 2014 at 15:33:36 UTC, Orvid King wrote:
> After all of that, I intend to include a base draft of the design of the GC, along with opening the PRs and committing the starting API. So, is there something I’m missing? Am I overlooking the obvious? Is there a more practical way to produce the same results?

What specific problems do you hope to solve with the new design? Is it only to improve the speed at which the GC runs to completion on a single pass? What about preemptibility, so that the GC could run as a low-priority task and be interrupted at any time to run something at a higher priority? Is that out of scope?

Mike
April 24, 2014
On 4/23/14, Joakim via Digitalmars-d <digitalmars-d@puremagic.com> wrote:
>  It might help if you put
> forth a tentative proposal, that the D goons can then proceed to
> destroy... I mean, critically evaluate.

Btw guys, what's stopping us from simply porting over Leandro's CDGC to D2? I've looked at the codebase and it doesn't seem to be that large (the following link is the most recent version of the GC I could find):

http://www.dsource.org/projects/tango/browser/trunk/tango/core/rt/gc/cdgc

I know there was some talk about a missing Windows implementation, but if we had a POSIX implementation and it turned out to be really good, I don't think it would take long before we stepped up and made a Windows implementation.
April 24, 2014
On Wednesday, 23 April 2014 at 18:35:26 UTC, Messenger wrote:
> On Wednesday, 23 April 2014 at 15:33:36 UTC, Orvid King wrote:
>> After all of that, I intend to include a base draft of the design of the GC, along with opening the PRs and committing the starting API. So, is there something I’m missing? Am I overlooking the obvious? Is there a more practical way to produce the same results?
>
> What is the state of Rainer Schütze's precise gc? Duplication of effort and all that.

Also CDGC http://www.llucax.com.ar/proj/dgc/
April 24, 2014
On Wednesday, 23 April 2014 at 18:35:26 UTC, Messenger wrote:
> On Wednesday, 23 April 2014 at 15:33:36 UTC, Orvid King wrote:
>> After all of that, I intend to include a base draft of the design of the GC, along with opening the PRs and committing the starting API. So, is there something I’m missing? Am I overlooking the obvious? Is there a more practical way to produce the same results?
>
> What is the state of Rainer Schütze's precise gc? Duplication of effort and all that.

While I do admire the work that has gone into it, I believe multiple limitations in his implementation will prevent it from being the best for D. The single biggest is that it was created before a preliminary std.allocators was published, and its design doesn't take them into account. I believe allocators can be made a huge strength of the GC, because with minor extensions to the design they can even be used to implement a generational compacting allocator, along with the allocation speed that comes with it. I believe its design also forgoes the possibility of async collection, and so fails to address the needs of realtime and server applications. I do, however, intend to have a heap-precise allocator as the default. The extensibility of my design would even allow for DMD-style allocation, which simply never deallocates anything. I'll post a full design soon, but I have to type it all out first and make sure that it doesn't end up as incoherent babbling, so that it makes sense to everyone else reading it.
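
To make the idea of a pluggable allocator back end a little more concrete, here is a rough sketch, assuming a GC front end that delegates raw memory to whatever allocator it is handed. The design itself has not been posted yet, so everything here is hypothetical: `BumpAllocator`, `Heap` and their methods are invented for illustration and are not the std.allocators API. The back end shown is the "never deallocates anything" case.

```python
class BumpAllocator:
    """Hands out offsets from one fixed-size region and never reclaims them."""

    def __init__(self, capacity, alignment=16):
        self.capacity = capacity
        self.alignment = alignment
        self.top = 0  # offset of the next free byte

    def allocate(self, size):
        # Round the current top up to the next alignment boundary.
        start = (self.top + self.alignment - 1) // self.alignment * self.alignment
        if start + size > self.capacity:
            return None  # region exhausted
        self.top = start + size
        return start

    def deallocate(self, offset):
        # Intentionally a no-op: nothing is returned until the process exits.
        pass


class Heap:
    """Hypothetical GC front end that delegates to a pluggable allocator."""

    def __init__(self, allocator):
        self.allocator = allocator

    def new(self, size):
        offset = self.allocator.allocate(size)
        if offset is None:
            # A collecting back end could run a collection and retry here.
            raise MemoryError("allocator exhausted")
        return offset


heap = Heap(BumpAllocator(capacity=64))
first = heap.new(10)
second = heap.new(10)
third = heap.new(32)
```

Swapping in a different object with the same `allocate`/`deallocate` pair would change the allocation strategy without touching `Heap`, which is the kind of extensibility the post seems to be aiming for.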
April 24, 2014
On Thursday, 24 April 2014 at 10:14:07 UTC, Kagamin wrote:
> On Wednesday, 23 April 2014 at 18:35:26 UTC, Messenger wrote:
>> On Wednesday, 23 April 2014 at 15:33:36 UTC, Orvid King wrote:
>>> After all of that, I intend to include a base draft of the design of the GC, along with opening the PRs and committing the starting API. So, is there something I’m missing? Am I overlooking the obvious? Is there a more practical way to produce the same results?
>>
>> What is the state of Rainer Schütze's precise gc? Duplication of effort and all that.
>
> Also CDGC http://www.llucax.com.ar/proj/dgc/

Ah, I hadn't realized he had actually implemented the concurrent GC he gave a talk on; I will make sure to look over the code before writing out the design. I was also planning on watching the full talk before posting the design, to make sure that it doesn't prevent any aspects of the design from working.