GC.calloc(), then what? - D Programming Language Discussion Forum


June 27, 2014

GC.calloc(), then what?

Posted by Ali Çehreli

1) After allocating memory by GC.calloc() to place objects on it, what else should one do? In what situations does one need to call addRoot() or addRange()?

2) Does the answer to the previous question differ for struct objects versus class objects?

3) Is there a difference between core.stdc.stdlib.calloc() and GC.calloc() in that regard? Which one to use in what situation?

4) Are the random bit patterns in a malloc()'ed memory always a concern for false pointers? Does that become a concern after calling addRoot() or addRange()? If so, why would anyone ever malloc() instead of always calloc()'ing?

Ali

June 27, 2014

Re: GC.calloc(), then what?

Posted by safety0ff
in reply to Ali Çehreli

On Friday, 27 June 2014 at 07:03:28 UTC, Ali Çehreli wrote:
> 1) After allocating memory by GC.calloc() to place objects on it, what else should one do?

Use std.conv.emplace.
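A minimal sketch of that step, assuming hypothetical types `Point` and `Greeter` and helper names `makePoint`/`makeGreeter` that are not from this thread. It allocates zeroed GC memory and constructs a struct and a class object in it with `std.conv.emplace`:

```d
import core.memory : GC;
import std.conv : emplace;

// Hypothetical example types, not part of the original discussion.
struct Point
{
    int x, y;
    this(int x, int y) { this.x = x; this.y = y; }
}

class Greeter
{
    string name;
    this(string name) { this.name = name; }
    string greet() { return "hello, " ~ name; }
}

// Construct a struct inside a zero-initialized GC block.
Point* makePoint(int x, int y)
{
    void* mem = GC.calloc(Point.sizeof);
    return emplace!Point(mem[0 .. Point.sizeof], x, y);
}

// Construct a class object. Its size comes from classInstanceSize;
// Greeter.sizeof would only be the size of a reference.
Greeter makeGreeter(string name)
{
    enum size = __traits(classInstanceSize, Greeter);
    void* mem = GC.calloc(size);
    return emplace!Greeter(mem[0 .. size], name);
}
```

Both blocks are allocated with the default attributes, so the GC scans them; that matters for `Greeter`, whose `name` field refers to other memory.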

> In what situations does one need to call addRoot() or addRange()?

addRoot creates an internal reference within the GC to the memory pointed to by the argument (void* p).
This pins the memory so that it won't be collected by the GC. E.g. you're going to pass a string to an extern C function, and the function will store a pointer to the string within its own data structures. Since the GC won't have access to those data structures, you must addRoot it to avoid creating a dangling pointer in the C data structure.
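The pinning pattern could look roughly like the sketch below. The `CRegistry` type and its helper functions are hypothetical stand-ins for a C library's private storage (a malloc'd slot the GC never scans); only `GC.addRoot` and `GC.removeRoot` are actual `core.memory` calls.

```d
import core.memory : GC;
import core.stdc.stdlib : malloc, free;

// Hypothetical stand-in for storage owned by a C library:
// a malloc'd slot that the GC does not scan.
struct CRegistry
{
    void** slot;
}

CRegistry createRegistry()
{
    auto slot = cast(void**) malloc((void*).sizeof);
    assert(slot !is null);
    *slot = null;
    return CRegistry(slot);
}

// Pin the GC block before the "C side" stores the pointer.
void registerBuffer(ref CRegistry reg, void* p)
{
    GC.addRoot(p);
    *reg.slot = p;
}

// Remove the pin once the "C side" no longer holds the pointer.
void* unregisterBuffer(ref CRegistry reg)
{
    void* p = *reg.slot;
    *reg.slot = null;
    GC.removeRoot(p);
    return p;
}

void destroyRegistry(ref CRegistry reg)
{
    free(reg.slot);
    reg.slot = null;
}
```

Pairing every `addRoot` with a `removeRoot` keeps the block from staying alive forever after the C side is done with it.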

addRange is usually for cases when you use stdc.stdlib.malloc/calloc and place pointers to GC-managed memory within that memory. This allows the GC to scan that memory for pointers during collection; otherwise it may reclaim memory which is pointed to by malloc'd memory.
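As an illustration, here is a sketch of a C-heap table holding references to GC-managed objects. `Node`, `NodeTable`, `createTable` and `destroyTable` are hypothetical names made up for this example; `GC.addRange` and `GC.removeRange` are the `core.memory` calls being discussed.

```d
import core.memory : GC;
import core.stdc.stdlib : calloc, free;

// Hypothetical GC-managed class.
class Node
{
    int value;
    this(int value) { this.value = value; }
}

// A C-heap array whose slots hold references to GC-managed Nodes.
struct NodeTable
{
    Node* slots;
    size_t length;
}

NodeTable createTable(size_t length)
{
    // For a class, Node.sizeof is the size of one reference.
    // calloc zeroes the slots, so no stray bit patterns are registered.
    auto slots = cast(Node*) calloc(length, Node.sizeof);
    assert(slots !is null);
    // Ask the GC to scan this range during collections.
    GC.addRange(slots, length * Node.sizeof);
    return NodeTable(slots, length);
}

void destroyTable(ref NodeTable t)
{
    GC.removeRange(t.slots); // unregister before freeing
    free(t.slots);
    t = NodeTable.init;
}
```

Without the `addRange` call, a `Node` referenced only from `slots` would look unreachable to the collector.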

> 2) Does the answer to the previous question differ for struct objects versus class objects?

No.

> 3) Is there a difference between core.stdc.stdlib.calloc() and GC.calloc() in that regard? Which one to use in what situation?

One is GC managed, the other is not. calloc simply means the memory is pre-zeroed; it has nothing to do with "C allocation" / "allocation in the C language".

> 4) Are the random bit patterns in a malloc()'ed memory always a concern for false pointers? Does that become a concern after calling addRoot() or addRange()?

If by malloc you're talking about stdc.stdlib.malloc then:
It only becomes a concern after you call addRange, and the potential for false pointers is only present within the range you gave to addRange.
So if you over-allocate using malloc and give the entire memory range to addRange, then any false pointers in the uninitialized portion become a concern.
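One way to limit that exposure is to register only the part of the block that actually holds pointers. The sketch below assumes a hypothetical `Packet` layout with a small pointer array followed by a raw payload; it is an illustration, not a recommended general design.

```d
import core.memory : GC;
import core.stdc.stdlib : malloc, free;

// Hypothetical layout: a few GC pointers followed by raw bytes.
struct Packet
{
    void*[4] refs;      // may point into GC memory: must be scanned
    ubyte[256] payload; // raw data: scanning it would only risk false pointers
}

Packet* createPacket()
{
    auto p = cast(Packet*) malloc(Packet.sizeof);
    assert(p !is null);
    p.refs[] = null; // initialize the part the GC will look at
    // Register only the pointer array; the payload stays
    // unregistered and may remain uninitialized.
    GC.addRange(p.refs.ptr, p.refs.sizeof);
    return p;
}

void destroyPacket(Packet* p)
{
    GC.removeRange(p.refs.ptr);
    free(p);
}
```

The range passed to `addRange` is fully initialized before registration, so leftover bit patterns from malloc never reach the collector.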

If you're talking about GC.malloc():
Currently the GC zeros the memory unless you allocate NO_SCAN memory, so it only differs in the NO_SCAN case.

> If so, why would anyone ever malloc() instead of always calloc()'ing?

To save on redundant zero'ing.
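For the NO_SCAN case, a rough sketch with hypothetical helper names `makeRawBuffer` and `makeZeroedBuffer` might look like this. It assumes the buffers hold plain bytes only, so the GC has no reason to scan them, and it does not rely on GC.malloc returning zeroed memory.

```d
import core.memory : GC;

// GC-managed byte buffer that the collector will not scan.
// The contents are unspecified until written.
ubyte[] makeRawBuffer(size_t n)
{
    auto p = cast(ubyte*) GC.malloc(n, GC.BlkAttr.NO_SCAN);
    return p[0 .. n];
}

// Same, but the contents are guaranteed to start as zero.
ubyte[] makeZeroedBuffer(size_t n)
{
    auto p = cast(ubyte*) GC.calloc(n, GC.BlkAttr.NO_SCAN);
    return p[0 .. n];
}
```

`makeRawBuffer` fits a buffer that is about to be overwritten completely (for example by a file read), where zeroing first would be wasted work.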

June 27, 2014

Re: GC.calloc(), then what?

Posted by safety0ff
in reply to safety0ff

I realize that my answer isn't completely clear in some cases. If you still have questions, ask away.

June 27, 2014

Re: GC.calloc(), then what?

Posted by Ali Çehreli
in reply to safety0ff

Thank you for your responses. I am partly enlightened. :p

On 06/27/2014 12:34 AM, safety0ff wrote:

> On Friday, 27 June 2014 at 07:03:28 UTC, Ali Çehreli wrote:
>> 1) After allocating memory by GC.calloc() to place objects on it, what
>> else should one do?
>
> Use std.conv.emplace.

That much I know. :) I have actually finished the first draft of translating my memory management chapter (the last one in the book!) and am trying to make sure that the information is correct.

>> In what situations does one need to call addRoot() or addRange()?
>
> Add root creates an internal reference within the GC to the memory
> pointed by the argument (void* p.)
> This pins the memory so that it won't be collected by the GC. E.g.
> you're going to pass a string to an extern C function, and the function
> will store a pointer to the string within its own data structures. Since
> the GC won't have access to the data structures, you must addRoot it to
> avoid creating a dangling pointer in the C data structure.

Additionally, according to the documentation, any other GC blocks reachable through that block will be considered live as well. So, addRoot makes a true root from which the GC starts its scanning.

> Add range is usually for cases when you use stdc.stdlib.malloc/calloc
> and place pointers to GC managed memory within that memory. This allows
> the GC to scan that memory for pointers during collection, otherwise it
> may reclaim memory which is pointed to my malloc'd memory.

One part that I don't understand in the documentation is "if p points into a GC-managed memory block, addRange does not mark this block as live".

  http://dlang.org/phobos/core_memory.html#.GC.addRange

Does that mean that if I have objects in my addRange'd memory that in turn have references to objects in the GC-managed memory, my references in my memory may be stale?

If so, does that mean that if I manage objects in my memory, all their members should be managed by me as well?

This seems to bring two types of GC-managed memory:

1) addRoot'ed memory that gets scanned deep (references are followed)

2) addRange'd memory that gets scanned shallow (references are not followed)

See, that's confusing: What does that mean? I still hold the memory block anyway; what does the GC achieve by scanning my memory if it's not going to follow references anyway?

>> 2) Does the answer to the previous question differ for struct objects
>> versus class objects?
>
> No.
>
>> 3) Is there a difference between core.stdc.stdlib.calloc() and
>> GC.calloc() in that regard? Which one to use in what situation?
>
> One is GC managed, the other is not. calloc simply means the memory is
> pre-zero'd, it has nothing to do with "C allocation" / "allocation in
> the C language"

I know even that much. ;) I find people's malloc+memset code amusing.

>> 4) Are the random bit patterns in a malloc()'ed memory always a
>> concern for false pointers? Does that become a concern after calling
>> addRoot() or addRange()?
>
> If by malloc you're talking about stdc.stdlib.malloc then:
> It only becomes a concern after you call addRange,

But addRange doesn't seem to make sense for stdlib.malloc'ed memory, right? The reason is, that memory is not managed by the GC so there is no danger of losing that memory due to a collection anyway. It will go away only when I call stdlib.free.

> and the false
> pointers potential is only present within the range you gave to addRange.
> So if you over-allocate using malloc and give the entire memory range to
> addRange, then any false pointers in the un-intialized portion become a
> concern.

Repeating myself, that makes sense but I don't see when I would need addRange on a stdlib.malloc'ed memory.

> If you're talking about GC.malloc():
> Currently the GC zeros the memory unless you allocate NO_SCAN memory, so
> it only differs in the NO_SCAN case.

So, the GC's default behavior is to scan the memory, necessitating clearing the contents? That seems to make GC.malloc() behave the same as GC.calloc() by default, doesn't it?

So, is this guideline right?

  "GC.malloc() makes sense only with NO_SCAN."

>> If so, why would anyone ever malloc() instead of always calloc()'ing?
>
> To save on redundant zero'ing.

And again, redundant zero'ing is saved only when used with NO_SCAN.

I think I finally understand the main difference between stdlib.malloc and GC.malloc: The latter gets collected by the GC.

Another question: Is GC.malloc'ed and GC.calloc'ed memory scanned deep?

Ali

June 27, 2014

Re: GC.calloc(), then what?

Posted by Ali Çehreli
in reply to safety0ff

On 06/27/2014 12:53 AM, safety0ff wrote:
> I realize that my answer isn't completely clear in some cases, if you
> still have questions, ask away.

Done! That's why we are here anyway. :p

Ali

June 27, 2014

Re: GC.calloc(), then what?

Posted by safety0ff
in reply to Ali Çehreli

On Friday, 27 June 2014 at 08:17:07 UTC, Ali Çehreli wrote:
> Thank you for your responses. I am partly enlightened. :p

I know you're a knowledgeable person in the D community, so I may have stated many things you already knew, but I tried to answer the questions as-is.

> On 06/27/2014 12:34 AM, safety0ff wrote:
>
> > Add range is usually for cases when you use stdc.stdlib.malloc/calloc
> > and place pointers to GC managed memory within that memory. This allows
> > the GC to scan that memory for pointers during collection, otherwise it
> > may reclaim memory which is pointed to my malloc'd memory.
>
> One part that I don't understand in the documentation is "if p points into a GC-managed memory block, addRange does not mark this block as live".
>
> [SNIP]
>
> See, that's confusing: What does that mean? I still hold the memory block anyway; what does the GC achieve by scanning my memory if it's not going to follow references anyway?

The GC _will_ follow references (i.e. scan deeply,) that's the whole point of addRange.
What that documentation is saying is that:

If you pass a range R to addRange, and R lies in the GC heap, then once there are no pointers (roots) to R, the GC will collect it anyway, even though you called addRange on it.

In other words, prefer using addRoot for GC memory and addRange for non-GC memory.

> >> 4) Are the random bit patterns in a malloc()'ed memory always a
> >> concern for false pointers? Does that become a concern after calling
> >> addRoot() or addRange()?
> >
> > If by malloc you're talking about stdc.stdlib.malloc then:
> > It only becomes a concern after you call addRange,
>
> But addRange doesn't seem to make sense for stdlib.malloc'ed memory, right? The reason is, that memory is not managed by the GC so there is no danger of losing that memory due to a collection anyway. It will go away only when I call stdlib.free.

addRange almost exclusively makes sense with stdlib.malloc'ed memory.
As you've stated: If you pass it GC memory it does not mark the block as live.

I believe the answer above clears things up: the GC does scan the range, and scanning is always "deep" (i.e. when it finds pointers to unmarked GC memory, it marks them.)

Conversely, addRoot exclusively makes sense with GC memory.

> > If you're talking about GC.malloc():
> > Currently the GC zeros the memory unless you allocate NO_SCAN memory, so
> > it only differs in the NO_SCAN case.
>
> So, the GC's default behavior is to scan the memory, necessitating clearing the contents? That seems to make GC.malloc() behave the same as GC.calloc() by default, doesn't it?

I don't believe it's necessary to clear it, it's just a measure against false pointers (AFAIK.)

> So, is this guideline right?
>
>   "GC.malloc() makes sense only with NO_SCAN."

I wouldn't make a guideline like that, just say that: if you want the memory to be guaranteed to be zero'd use GC.calloc.

However, due to GC internals (for preventing false pointers), GC.malloc'd memory will often be zero'd anyway.

> >> If so, why would anyone ever malloc() instead of always calloc()'ing?
> >
> > To save on redundant zero'ing.
>
> And again, redundant zero'ing is saved only when used with NO_SCAN.

Yup.

> I think I finally understand the main difference between stdlib.malloc and GC.malloc: The latter gets collected by the GC.

Yup.

> Another question: Do GC.malloc'ed and GC.calloc'ed memory scanned deep?

Yes, only NO_SCAN memory doesn't get scanned, everything else does.

June 27, 2014

Re: GC.calloc(), then what?

Posted by safety0ff
in reply to Ali Çehreli

On Friday, 27 June 2014 at 08:17:07 UTC, Ali Çehreli wrote:
>
> So, the GC's default behavior is to scan the memory, necessitating clearing the contents? That seems to make GC.malloc() behave the same as GC.calloc() by default, doesn't it?

Yes.
compare:
https://github.com/D-Programming-Language/druntime/blob/master/src/gc/gc.d#L543
to:
https://github.com/D-Programming-Language/druntime/blob/master/src/gc/gc.d#L419

June 27, 2014

Re: GC.calloc(), then what?

Posted by safety0ff
in reply to safety0ff

On Friday, 27 June 2014 at 09:20:53 UTC, safety0ff wrote:
> Yes.
> compare:
> https://github.com/D-Programming-Language/druntime/blob/master/src/gc/gc.d#L543
> to:
> https://github.com/D-Programming-Language/druntime/blob/master/src/gc/gc.d#L419

Actually, I just realized that I was wrong in saying that the memory will likely be cleared by malloc: it's only the overallocation that gets cleared.

June 27, 2014

Re: GC.calloc(), then what?

Posted by eles
in reply to Ali Çehreli

On Friday, 27 June 2014 at 08:17:07 UTC, Ali Çehreli wrote:
> Thank you for your responses. I am partly enlightened. :p
>
> On 06/27/2014 12:34 AM, safety0ff wrote:
>
> > On Friday, 27 June 2014 at 07:03:28 UTC, Ali Çehreli wrote:

> But addRange doesn't seem to make sense for stdlib.malloc'ed memory, right? The reason is, that memory is not managed by the GC so there is no danger of losing that memory due to a collection anyway. It will go away only when I call stdlib.free.

It is not about that, but about the fact that this unmanaged memory *might contain* references towards managed memory.

If you intend to place such references into this particular chunk of memory, then you need to tell GC to scan the memory chunk for references towards managed memory.

Otherwise, the GC might ignore this chunk of memory, find no references to a managed object elsewhere, and delete that object; the pointer you placed in the unmanaged memory then becomes dangling.

June 27, 2014

Re: GC.calloc(), then what?

Posted by Sean Kelly
in reply to safety0ff

On Friday, 27 June 2014 at 07:34:55 UTC, safety0ff wrote:
> On Friday, 27 June 2014 at 07:03:28 UTC, Ali Çehreli wrote:
>> 1) After allocating memory by GC.calloc() to place objects on it, what else should one do?
>
> Use std.conv.emplace.

And possibly set BlkInfo flags to indicate whether the block has
pointers, and the finalize flag to indicate that it's an object.
I'd look at _d_newclass in Druntime/src/rt/lifetime.d for the
specifics.
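As a partial illustration of the "has pointers" flag only, the sketch below picks the block attribute from the type. `allocFor`, `Pair` and `Pixel` are hypothetical names; the FINALIZE flag is deliberately left out, since its exact requirements are runtime-internal and are better taken from `_d_newclass` itself.

```d
import core.memory : GC;
import std.conv : emplace;
import std.traits : hasIndirections;

// Hypothetical example types.
struct Pair  { Pair* next; int value; } // holds a pointer: must be scanned
struct Pixel { ubyte r, g, b; }         // no pointers: scanning is unnecessary

// Allocate and default-construct a T, choosing NO_SCAN
// only when T contains no pointers or references.
T* allocFor(T)()
{
    enum uint attr = hasIndirections!T ? 0u : GC.BlkAttr.NO_SCAN;
    void* mem = GC.calloc(T.sizeof, attr);
    return emplace!T(mem[0 .. T.sizeof]);
}
```

`GC.getAttr` can then be used to inspect the flags of the returned block, and `GC.setAttr`/`GC.clrAttr` to change them after allocation.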

To be honest, I think the GC interface is horribly outdated, but
my proposal for a redesign (first in 2010, then again in 2012 and
once again in 2013) never gained traction.  In short, what I'd
really like to have is a way to tell the GC to allocate an object
of type T.  Perhaps Andrei's allocators will sort this out and
the issue will be moot.  For reference:

http://lists.puremagic.com/pipermail/d-runtime/2010-August/000075.html
http://lists.puremagic.com/pipermail/d-runtime/2012-April/001095.html
http://lists.puremagic.com/pipermail/d-runtime/2013-July/001840.html
