On Wednesday, 3 September 2025 at 23:05:45 UTC, H. S. Teoh wrote:
> On Wed, Sep 03, 2025 at 07:56:03PM +0000, Brother Bill via Digitalmars-d-learn wrote:
[...]
> C, C++ and D can play shenanigans with pointers, such as casting them to size_t, which hides them from the GC.
D's current GC is conservative, meaning that any value it sees that looks like it might be a pointer value, will be regarded as a pointer value.
There is an optional precise GC that has been implemented, that can be turned on with compiled-in options or command line options, which uses a slightly less conservative scheme.
The recommendation is to avoid keeping the only reference to an allocated block in a size_t.
Even without the precise collector, the GC has pointer-containing blocks and no-pointer blocks. This means that it's quite easy to accidentally store the only copy of a pointer in a size_t that will not be scanned, even with the conservative GC.
You should only store pointers as size_t "if you know what you are doing". Otherwise do not do this.
It is fine to make a temporary copy of a pointer as a size_t, for example to examine the bits inside. This should leave the original pointer alone.
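A minimal sketch of that "temporary copy" case, assuming a made-up helper named isAligned (not part of druntime or Phobos). The typed pointer stays in scope the whole time; the size_t only exists while the bits are being inspected:

```d
import std.stdio : writefln;

/// Hypothetical helper: true if the address in `p` is a multiple of
/// `alignment` (assumed to be a power of two). `p` itself is not modified.
bool isAligned(const void* p, size_t alignment)
{
    // Temporary integer copy, used only to look at the low bits.
    const size_t bits = cast(size_t) p;
    return (bits & (alignment - 1)) == 0;
}

void main()
{
    // The real pointer remains on the stack, so the GC can still find the block.
    int* p = new int;
    *p = 42;

    writefln("aligned to int.alignof: %s", isAligned(p, int.alignof));
    writefln("value still reachable: %s", *p);
}
```

The thing to avoid is the reverse: dropping p and keeping only the integer.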
> [...]
> GC.calloc can allocate memory for a slice of MyClass instances. The developer may run GC.free to free the allocated memory. But GC may perform its own garbage collection of GC allocated memory blocks.
GC.free is going to free the memory. It will NOT run finalizers. It will not collect it again later. I want to make that clear.
If you do not explicitly free the memory, and it becomes garbage, then the GC will collect it.
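A rough sketch of the two paths, using GC.calloc, GC.free and GC.collect from core.memory. The helper name makeScratch is made up for this example, and the buffer holds plain bytes so there is nothing to finalize:

```d
import core.memory : GC;

/// Hypothetical helper: a zeroed, pointer-free scratch buffer of `n` bytes.
ubyte[] makeScratch(size_t n)
{
    auto p = cast(ubyte*) GC.calloc(n, GC.BlkAttr.NO_SCAN);
    return p[0 .. n];
}

void main()
{
    // Path 1: explicit release. The block is freed now, no finalizer runs.
    ubyte[] buf = makeScratch(256);
    buf[0] = 1;
    GC.free(buf.ptr);
    buf = null; // do not keep a dangling slice around

    // Path 2: drop the reference and let some later collection reclaim it.
    ubyte[] other = makeScratch(256);
    other = null;
    GC.collect(); // optional here; shown only to make the point explicit
}
```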
As far as a slice of MyClass instances, if you mean a slice of data that contains the fields of an array of classes, you should be very cautious of this. The GC is not equipped to call finalizers on such a structure, and so you likely will run into lifetime issues.
For classes, I'd just stick with new.
For structs, you can quite easily allocate an array of structs, and the GC can support finalization of that. I also recommend just using new.
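For illustration, a small sketch of the "just use new" route. MyClass comes from the question; its fields, MyStruct, and makeObjects are assumptions made up for this example:

```d
class MyClass
{
    string name;
    this(string name) { this.name = name; }
}

struct MyStruct
{
    int id;
    string label;
}

/// Builds an array of class references; each object is allocated with `new`,
/// so the runtime knows its type and how to finalize it.
MyClass[] makeObjects(string[] names)
{
    auto objs = new MyClass[](names.length); // array of null references
    foreach (i, n; names)
        objs[i] = new MyClass(n);
    return objs;
}

void main()
{
    auto objs = makeObjects(["a", "b", "c"]);

    // Struct values are stored inline in the array block.
    auto structs = new MyStruct[](3);
    structs[1] = MyStruct(1, "one");
}
```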
> > Let's look at each attribute: (confirm if my analysis is right, otherwise correct)
> > FINALIZE - just before GC reclaims the memory, such as with GC.free, call destructors, aka finalizers.
> This bit is probably best left untouched by user code, and left to the runtime to figure out when/how to use it.
In the latest compiler (2.111), this has been changed to a bit that requests finalization upon allocation. The GC uses this bit and the typeinfo passed in to determine the correct action. This is different from before, where the bit was an implementation detail and you had to know exactly what you were asking for.
I do agree that you should basically leave this alone. But for sure the new treatment of the bit is more robust than before.
Note: changing bits after allocation does not take this into account; at that point you are modifying implementation details. I really would like to get rid of these bits completely and use a more reliable API (having a set of implementation bits as an option is quite dangerous).
> > NO_SCAN - There may be false positives regarding byte values that look like 'new' allocated pointers. This can result in 'garbage' memory not being collected. If we are CERTAIN that this memory block doesn't contain any pointers to 'new' SomeClass allocated memory, then mark as NO_SCAN.
> Correct. Though if you're writing idiomatic D code, you'll almost never need to worry about this. Whenever you allocate an array whose elements are PODs (without any pointers), the allocator will automatically mark the memory NO_SCAN so that the GC doesn't waste time scanning such blocks. So things like implicit string allocations will be marked NO_SCAN, etc. If you're allocating an array or object that contains indirections, then NO_SCAN will not be set, so the GC will scan the interior of such blocks for pointers to other live objects.
I will add that the concern of scanning non-pointers is pretty much obsolete with 64-bit addressing. It's still important to use NO_SCAN, as it's quite common to allocate large blocks of data that are just bytes (e.g. load a file). You don't want to waste time scanning that, even if there are no false positives to be found in there.
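A quick way to see the automatic marking is to ask the GC about the block behind an array. This is only a sketch: isNoScan is a made-up helper, and it relies on GC.query from core.memory returning the attributes of the block that contains the pointer:

```d
import core.memory : GC;
import std.stdio : writeln;

/// Hypothetical helper: does the GC block containing `p` carry NO_SCAN?
bool isNoScan(const void* p)
{
    return (GC.query(p).attr & GC.BlkAttr.NO_SCAN) != 0;
}

void main()
{
    auto bytes = new ubyte[](64); // element type has no pointers
    auto ptrs  = new int*[](64);  // element type is a pointer

    writeln("ubyte[] block NO_SCAN: ", isNoScan(bytes.ptr));
    writeln("int*[]  block NO_SCAN: ", isNoScan(ptrs.ptr));
}
```

Nothing here sets the bit by hand; the allocation done by new picks it based on the element type.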
> > Question 1: if GC-calloc has allocated MyClass that has a string 'name' member, which may expand in size, would we still properly apply NO_SCAN?
I would say this is not true. A string has a pointer, it should be scanned.
> > Question 2: if GC-calloc has allocated MyClass, which may allocate new MyStudent(...), would that mean 'don't apply NO_SCAN'?
> It's very simple. If a memory block may contain pointers, then it should not be NO_SCAN. If a memory block never contains any pointers, then it can (should) be marked NO_SCAN.
100% correct.
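If you do end up allocating raw blocks yourself, one way to apply that rule mechanically is to let the type decide. The sketch below uses std.traits.hasIndirections; Pixel, Student, scanAttrFor and allocRaw are names made up for the example, not an existing API:

```d
import core.memory : GC;
import std.traits : hasIndirections;

struct Pixel { ubyte r, g, b, a; }       // hypothetical: plain data only
struct Student { int id; string name; }  // hypothetical: a string holds a pointer

static assert(hasIndirections!string);
static assert(!hasIndirections!Pixel);
static assert(hasIndirections!Student);

/// NO_SCAN only when `T` cannot contain pointers.
uint scanAttrFor(T)()
{
    return hasIndirections!T ? GC.BlkAttr.NONE : GC.BlkAttr.NO_SCAN;
}

/// Zeroed raw storage for `n` values of `T`. No constructors or finalizers
/// are involved, so this only suits types where all-zero bits are valid.
T[] allocRaw(T)(size_t n)
{
    auto p = cast(T*) GC.calloc(n * T.sizeof, scanAttrFor!T);
    return p[0 .. n];
}

void main()
{
    auto pixels   = allocRaw!Pixel(1024);  // block marked NO_SCAN
    auto students = allocRaw!Student(16);  // block is scanned
    students[0].name = "Ada";              // the GC can see this reference
}
```

In normal code, new Pixel[](1024) makes the same choice for you.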
> Normal D code does not need to fiddle with GC flags.
Great advice!
> > NO_MOVE - For GC.realloc, if increasing memory allocated, and it's not available, throw 'MEMORY_NOT_AVAILABLE' exception.
> Correct. You might want to use this flag if you have non-D code that might be holding pointers to this memory block, e.g., if you passed a pointer to some D array to C code which retains it in some C-managed pointer, and the C code expects the array to still be there later.
> It's not very often that such situations come up, though. When passing GC-allocated data to C code, it's generally a good idea to keep a reference to it inside D code so that the GC can find the reference anyway. Since D doesn't have a moving GC, this is really all you need to do. Again, unless you're doing something unusual, you probably don't need to touch the NO_MOVE flag.
No, this is not correct. NO_MOVE is supposed to mean that a moving GC cannot move this block (and fix up pointers to it).
Given that we have a conservative GC, which scans the stack conservatively including C stacks, and we will always have one, I would say this bit should just be deprecated.
Indeed, it is completely ignored in the current GC.
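On the "keep a reference on the D side" point from the quoted text: when the C side stores the pointer somewhere the GC does not scan (C heap memory, for example), one option is to register the block with GC.addRoot until the C side is done. A sketch under that assumption; CHandle and c_register are stand-ins written in D so the example runs on its own, not a real C library:

```d
import core.memory : GC;
import core.stdc.stdlib : malloc, free;

/// Stand-in for a C-side structure living in malloc'd memory,
/// which the GC does not scan.
struct CHandle { void* userData; }

/// Stand-in for a C registration function that retains `data`.
CHandle* c_register(void* data)
{
    auto h = cast(CHandle*) malloc(CHandle.sizeof);
    if (h !is null)
        h.userData = data;
    return h;
}

void main()
{
    int[] payload = new int[](16);
    payload[0] = 7;

    GC.addRoot(payload.ptr);          // block stays live while it is a root
    auto h = c_register(payload.ptr);
    assert(h !is null);
    payload = null;                   // no D-side reference left

    GC.collect();
    assert((cast(int*) h.userData)[0] == 7);

    GC.removeRoot(h.userData);        // same pointer value that was added
    free(h);
}
```

Keeping a plain D variable or field pointing at the data works just as well and is usually simpler.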
> > APPENDABLE - For D internal runtime use. Don't mark this yourself.
> Yes.
Also improved with D 2.111. The APPENDABLE bit is now an input to malloc that tells the GC this is an array (including adjusting the size to deal with padding space). The GC now handles array runtime features directly, and so it understands what this means.
So in fact, this is a bit you can set, and there are currently unexposed GC interface functions that can be used to manage the array. They have not yet been exposed in core.memory, because we are not sure if these are the final interfaces we want.
However, allocating an array with this bit will do exactly what you expect (and managing the resulting array with the normal array management functions such as appending or capacity will work).
I do still recommend using new.
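For reference, this is roughly what the "normal array management functions" look like on an array that came from new. The helper appendInPlace is made up for the example; capacity, reserve and ~= are the built-in array operations:

```d
import std.stdio : writeln;

/// Hypothetical helper: appends `n` values and reports whether the array
/// stayed in the same memory block (no reallocation happened).
bool appendInPlace(ref int[] arr, size_t n)
{
    const start = arr.ptr;
    foreach (i; 0 .. n)
        arr ~= cast(int) i;
    return arr.ptr is start;
}

void main()
{
    int[] arr = new int[](4);
    writeln("length ", arr.length, ", capacity ", arr.capacity);

    arr.reserve(64); // ask the runtime for room to grow
    writeln("capacity after reserve: ", arr.capacity);

    writeln("appended in place: ", appendInPlace(arr, 10));
}
```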
> > NO_INTERIOR - This says that only the base address of the block may be a target address of other GC allocated pointers. All other possible pointers are 'false' pointers.
Yes, though I would say it like:
"only pointers found while scanning that point to the exact target address may be considered pointers to the block."
Again, this is really only of great use in 32-bit addressing.
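A hedged sketch of how NO_INTERIOR might be used with GC.malloc. The size and the name allocBig are arbitrary choices for the example; the contract is that the code keeps a pointer to the start of the block alive for as long as the memory is needed:

```d
import core.memory : GC;

enum size_t bigSize = 4 * 1024 * 1024; // arbitrary size for the example

/// Hypothetical helper: a large pointer-free buffer that is only ever
/// referenced through its base address.
ubyte* allocBig()
{
    return cast(ubyte*) GC.malloc(bigSize,
        GC.BlkAttr.NO_SCAN | GC.BlkAttr.NO_INTERIOR);
}

void main()
{
    ubyte* base = allocBig(); // this base pointer must stay reachable
    base[0 .. bigSize][] = 0;

    // Index off `base` rather than storing pointers or slices into the
    // middle: with NO_INTERIOR, an interior pointer on its own is not
    // treated as a reference to the block.
    base[bigSize / 2] = 0xFF;

    GC.free(base);
}
```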
> > Perhaps I am missing the fundamentals of various D garbage collectors.
[...]
> The various GC flags are simply hints that let you influence the scanning process to some extent. The NO_SCAN bit means that upon reaching this block, don't bother scanning its contents to find more pointers (because there are none). The NO_INTERIOR bit means that if the GC finds a pointer-like value that looks like it points to the inside of this block, ignore it as a non-pointer, because pointers to this block only ever point to its head (the supposed pointer is actually not a real pointer, but an integer value that happens to have a pointer-like value).
> The other flags have very specific uses that, if you don't know what they actually do, you probably don't need them and shouldn't touch them.
Flags you should be able to use:
NO_SCAN
FINALIZE
APPENDABLE
NO_INTERIOR (very cautiously)
Do not use any other bits directly. A future version of D likely will migrate these into function parameters instead of providing bits.
-Steve