[RFC]Proposal for better garbage collection (page 3)

February 23, 2012

Re: [RFC]Proposal for better garbage collection

Posted by Manu
in reply to Benjamin Thaut

Permalink

Manu

Posted in reply to Benjamin Thaut

Attachments:

text/html part

Permalink

On 22 February 2012 20:56, Benjamin Thaut <code@benjamin-thaut.de> wrote:

> 2) Tracking references on the stack:
>
> The D compiler always needs to emit a full stack frame so that the GC can walk up the stack at any time in the program.


You say "every function needs a stack frame". Can you comment on this with
respect to leaf functions? No leaf function should ever generate a stack
frame, infact, typical leaf functions will never touch memory at all. This
is VERY IMPORTANT for critical loops. I have never worked on a project
where I did not depend on the performance of leaf functions to do the hard
work in the most critical parts of my application.
Obviously such functions would not be making allocations, and shouldn't be
interacting with the GC any way, so why is having a stack frame important?


> The stack frame of every function generated by the D compiler starts which a bitfield (*usually the size of a machine register*)...
>

Oh really? And how do we define that type? ;) (*cough* reference to
size_t/ptrdiff_t thread)



> For example on x86: 11001...
> 1 = bytes 0 to 4 are a pointer
> 1 = bytes 4 to 8 are a pointer
> 00 = bytes 8 to 16 are not a pointer
> 1 = bytes 16 to 20 are a pointer
>
> The last bit indicates whether the bitfield is continued in the following
> bytes or not. 1 means continued 0 means finished.
> Every scope generated by the D compiler would need additional code at the
> start and end of the scope. When the scope is entered the bitfield would be
> patched to represent the new variables inside the scope and when the scope
> is left the bitfield is patched again to remove the changes that were made
> on entering the scope.
> Every time a function gets called that did not get generated by the D
> compiler ( C / C++ etc functions) the compiler generates a call into the
> runtime and passes the current stack pointer and stack base pointer to it.
>
> void _d_externalCallStart(void* stackptr, void* baseptr);
>
> Every time such a function returns the compiler   generates a call into the the runtime too.
>
> void _d_externalCallEnd(void* stackptr, void* baseptr);
>

Can you comment on what those functions will actually do? It definitely
sounds very worrying to me to be turning every function call into THREE
calls.
Calling into extern code is certainly not rare... almost everything of any
use is an extern C lib. What about interaction with the OS? Trivial libs
like zlib? etc...

I wonder if there are alternative ways to detect a foreign stack. And I'm not sure why it even matters, you can't depend on the extern ABI, how do you unwind the stack reliably in the first place?


> Every time a functions that can get called from other languages (extern(C)
> etc) are executed the end callback is inserted at the start of the
> functions and the start callback is inserted at the end of the function.
> Using these callbacks the GC can mark certain parts of the stack as
> "non-D" and ignore them when scanning for bit fields and
> references/pointers. It can just skip parts of the stack that are "non-D"
> and therefore does not need a full stack frame within these "non-D"
> sections.
> All these features are required so that the GC can precisely scan
> pointers/references on the stack and change them as necessary.
> Remaining issues: The D compiler can freely move around value types on the
> stack. With such move operations it would be necessary to fix up all the
> bit fields. I needs to be investigated if this is doable.
>
> 3) Tracking references on the heap
>
> For every class / struct a mixin template which is defined inside the runtime gets instantiated. This template can then use introspection to generate the necessary information to allow the GC to scan the pointers within that struct / class precisely.
>
> 4) Thread local / global memory
>
> A callback into the runtime needs to happen in the following cases:
> - a __gshared variable is assigned
> - a reference / pointer is casted to immutable
> - a reference / pointer is casted to shared
>
> void _d_castToGlobalMem(void* ptr);
>
> This can be used by the GC to keep thread local pools of memory and move a memory block to a global memory pool as soon as it is needed there.
>
> 5) pointer / reference changed callback
>
> Every time a pointer / reference is changed the D compiler emits a call into the runtime and passes the new value of the reference / pointer with it.
>
> void _d_pointerChanged(void *ptr);
>
> This can be used when writing a generational GC to have separate pools for young and old generations. Every time the young generation needs to be collected it can be avoided to scan the old generations pool because it is sufficient to only check the pointers that have changed since the last time the young generation was collected. With the above mentioned callback it is easily possible to track these references.
>
> Remaining issues:
> -If the GC interrupts a thread right before any of the above mentioned
> callbacks happen it will cause a invalid state for the GC and the GC might
> access invalid pointers. It has to be investigated if this leads to invalid
> behavior. It can be fixed by not interrupting a thread but pause it the
> next time it calls any of callbacks, or other functions that can be
> interrupted by the GC. This in turn could cause a thread to be non pausable
> because it is stuck in a polling loop. The compiler could identify loops
> without any GC interruptible function and manually insert one.
> -When pointers/references are passed inside processor registers the GC
> cannot know if these values are actually pointers/references or represent
> other values. If threads are paused at defined points in the code as
> mentioned before this issues would be fixed because the state at these
> points is known and can be handled accordingly.
>
> 6) Different interface for the GC
>
> The current interface to the GC would have to change because the "this block of memory might contain a pointer" approach wouldn't work anymore. For example a block of memory and a delegate which iterates over all pointers within the memory block could be used for user allocated memory blocks. There should be a separate allocator function provided by the GC that allocates memory that does not get moved around so it can be used to pass it to non garbage collected code.
>
> 7) Compiler Options
>
> Each of the above mentioned groups of features should be exposed as compiler options so that you can turn them on/off depending on which type of GC you use. Default on/off states for these features are set within a config file depending on which type of GC currently ships per default with druntime.
>
> 8) Conclusion
>
> Garbage Collection brings a lot of advantages for the programmer using the language but is not free and shouldn't be treated as free. Full support for garbage collection is required to build a fast and efficient GC. This additional support requires additional features within the compiler but should result in a overall better performing language.
>

On Thu, Feb 23, 2012 at 06:01:48PM +0100, deadalnix wrote: [...] > Additionnaly, the stack is made like a linked list. Each function calling another one register the return address. With this information, we can have data about what is on the stack except for the very last function called with no runtime overhead. This is another alternative. Yep. In one of my replies I considered the possibility of storing a function ID on the stack, but that may not be necessary if the GC has access to compile-time static info about each function, so just by seeing the return address it knows which function it is, and can figure out where the pointers are. (Of course there are other issues that need to be addressed... but we can't decide on that without actual data.) > But you have to consider that, even with a mask, you are not sure that what is marked as a pointer is a pointer. A memory location can represent different thing during a function execution. So thoses values can only be considered as probable pointers, or we disable some compiler optimizations. As we cannot be sure, the point 2/ stay valid. I believe his proposal was for the function to manually update these bits as it runs. It does introduce a lot of overhead. And like you said, without actual hard data to show whether or not this overhead is justified (offset by improved GC performance), how do we know that we should do this at all? How do we know we aren't making it worse? > Granted the overhead of the operation, it ay not worth it. To know that, we need actual data on how much data is the stack is actually pointer, and how much false positive we get. As the future is 64bits, I'm not sure it is interesting for us. Actually, I believe David Simcha *is* considering the possibility of precise scanning. But the proof is in the actual benchmarks. We don't know if it will help or not unless we have real data to back it up. T -- Real Programmers use "cat > a.out".

On Thu, Feb 23, 2012 at 01:51:31PM +0200, Manu wrote: [...] > I wonder if there are alternative ways to detect a foreign stack. And I'm not sure why it even matters, you can't depend on the extern ABI, how do you unwind the stack reliably in the first place? [...] This is a bit off-topic, but what happens in the current implementation if you pass a D callback to a C function, and then throw an exception from the callback? Does it work? Or does it do something really nasty? T -- Elegant or ugly code as well as fine or rude sentences have something in common: they don't depend on the language. -- Luca De Vitis

On 02/23/12 20:58, H. S. Teoh wrote: > On Thu, Feb 23, 2012 at 01:51:31PM +0200, Manu wrote: > [...] >> I wonder if there are alternative ways to detect a foreign stack. And I'm not sure why it even matters, you can't depend on the extern ABI, how do you unwind the stack reliably in the first place? > [...] > > This is a bit off-topic, but what happens in the current implementation if you pass a D callback to a C function, and then throw an exception from the callback? Does it work? Or does it do something really nasty? No, unless you consider a segfault to be really nasty. :) Actually, it mostly works - i just tried it in a gtk app, and it works as long as you catch the exception and only look at the error msg. If you don't catch it (or try writeln(e) etc), then the result is something like: action.Action!(int).Action.registerNS.MissingActionEx@action.d(54): Action "GUI" missing symbol 'int AppWindowClosed()' ---------------- ./gtkapp() [0x8054366] /usr/lib/i686/sse2/libgtk-x11-2.0.so.0(+0x153052) [0xf7375052] /usr/lib/i686/sse2/libgobject-2.0.so.0(g_closure_invoke+0x19b) [0xf71e25fd] /usr/lib/i686/sse2/libgobject-2.0.so.0(+0x1ddc8) [0xf71f2dc8] /usr/lib/i686/sse2/libgobject-2.0.so.0(g_signal_emit_valist+0x59a) [0xf71fa37f] /usr/lib/i686/sse2/libgobject-2.0.so.0(g_signal_emit+0x34) [0xf71fa61f] /usr/lib/i686/sse2/libgtk-x11-2.0.so.0(+0x2a17f3) [0xf74c37f3] /usr/lib/i686/sse2/libgtk-x11-2.0.so.0(gtk_main_do_event+0x8e6) [0xf7373856] Segmentation fault So something appears to get confused while walking the stack; another thing to investigate later, i guess... artur

On Thu, Feb 23, 2012 at 09:18:22PM +0100, Artur Skawina wrote: > On 02/23/12 20:58, H. S. Teoh wrote: [...] > > This is a bit off-topic, but what happens in the current implementation if you pass a D callback to a C function, and then throw an exception from the callback? Does it work? Or does it do something really nasty? > > No, unless you consider a segfault to be really nasty. :) Well, segfaults are nasty, but there are nastier things. :) > Actually, it mostly works - i just tried it in a gtk app, and it works as long as you catch the exception and only look at the error msg. If you don't catch it (or try writeln(e) etc), then the result is something like: > > action.Action!(int).Action.registerNS.MissingActionEx@action.d(54): Action "GUI" missing symbol 'int AppWindowClosed()' > ---------------- > ./gtkapp() [0x8054366] > /usr/lib/i686/sse2/libgtk-x11-2.0.so.0(+0x153052) [0xf7375052] > /usr/lib/i686/sse2/libgobject-2.0.so.0(g_closure_invoke+0x19b) [0xf71e25fd] > /usr/lib/i686/sse2/libgobject-2.0.so.0(+0x1ddc8) [0xf71f2dc8] > /usr/lib/i686/sse2/libgobject-2.0.so.0(g_signal_emit_valist+0x59a) [0xf71fa37f] > /usr/lib/i686/sse2/libgobject-2.0.so.0(g_signal_emit+0x34) [0xf71fa61f] > /usr/lib/i686/sse2/libgtk-x11-2.0.so.0(+0x2a17f3) [0xf74c37f3] > /usr/lib/i686/sse2/libgtk-x11-2.0.so.0(gtk_main_do_event+0x8e6) [0xf7373856] > Segmentation fault > > So something appears to get confused while walking the stack; another thing to investigate later, i guess... [...] Looks like it got confused at the cross-language boundary. T -- Prosperity breeds contempt, and poverty breeds consent. -- Suck.com

Le 23/02/2012 20:58, H. S. Teoh a écrit : > On Thu, Feb 23, 2012 at 01:51:31PM +0200, Manu wrote: > [...] >> I wonder if there are alternative ways to detect a foreign stack. And >> I'm not sure why it even matters, you can't depend on the extern ABI, >> how do you unwind the stack reliably in the first place? > [...] > > This is a bit off-topic, but what happens in the current implementation > if you pass a D callback to a C function, and then throw an exception > from the callback? Does it work? Or does it do something really nasty? > > > T > This will work, but has serious drawbacks. First of all, you are not sure the C function will release all resources (free, fclose, etc . . .). So you cannot be sure of the state of your program. This is something that you don't want to do.

On 02/22/2012 08:40 PM, H. S. Teoh wrote: > This would introduce quite a lot of overhead per scope. It will also > lead to strange things like: > > if (x) y(); // faster > if (x) { y(); } // slower > Those are the same thing. '{ }' is not what introduces a scope.

Forums