H. S. Teoh
Posted in reply to thebluepandabear
| On Tue, Dec 13, 2022 at 07:11:34AM +0000, thebluepandabear via Digitalmars-d wrote:
> Hello,
>
> I was speaking to one of my friends on D language and he spoke about how he doesn't like D language due to the fact that its standard library is built on top of GC (garbage collection).
>
> He said that if he doesn't want to implement GC he misses out on the standard library, which for him is a big disadvantage.
>
> Does this claim have merit? I am not far enough into learning D, so I haven't touched GC stuff yet, but I am curious what the D community has to say about this issue.
1) No, this claim has no merit. However, I sympathize with the reaction because that's the reaction I myself had when I first found D online. I came from a strong C/C++ background, got fed up with C++ and was looking for a new language closer to my ideals of what a programming language should be. Stumbled across D, which caught my interest. Then I saw the word "GC" and my knee-jerk reaction was, "what a pity, the rest of the language looks so promising, but GC? No thanks." It took me a while to realize the flaw in my reasoning. Today, I wholeheartedly embrace the GC.
2) Your friend has incomplete/inaccurate information about the standard library being dependent on the GC. A pretty significant chunk of Phobos is actually usable without the GC -- a large part of the range-based stuff (std.range, std.algorithm, etc.), for example. True, some parts are GC-dependent, but you can still get pretty good mileage out of the @nogc-compatible subset of Phobos.
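For instance, here's a minimal sketch (the function name is made up for illustration) that chains std.algorithm primitives over a slice, with the compiler verifying via @nogc that none of it touches the GC:

    @nogc nothrow
    int sumOfEvenSquares(scope const(int)[] data)
    {
        import std.algorithm : filter, map, sum;
        // filter and map are lazy range wrappers; sum just iterates.
        // None of this allocates, so it all passes under @nogc.
        return data.filter!(a => a % 2 == 0)
                   .map!(a => a * a)
                   .sum;
    }

    void main()
    {
        static immutable int[] nums = [1, 2, 3, 4, 5, 6];
        assert(sumOfEvenSquares(nums) == 4 + 16 + 36);
    }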
//
The thing about GC vs. non-GC is that, coming from a C/C++ background, my philosophy was that I must be in control of every detail of my program; I had to know exactly what it does at any given point. Especially when it comes to managing memory allocations. The idea being that if I kept my memory tidy (i.e., free allocated chunks when I'm done with them) then there wouldn't be an accumulation of garbage that would cost a lot of time to clean up later. The idea of a big black box called the GC that I don't understand, randomly taking over management of my memory, scared me. What if it triggered a collection at an inconvenient time when performance is critical?
Not an entirely wrong line of reasoning, but manual memory management comes with costs:
a) The biggest cost is the additional mental load it adds to your programming tasks. Once you go beyond your trivial hello-world and add-two-numbers-together type of functions, you have to start thinking about memory management at every turn, every juncture. "My function needs space to sort this list of stuff, hmm, I need to allocate a buffer. How big of a buffer do I need? When should I allocate it? When should I free it? I also need this other scratchpad buffer for caching this other bit of data that I'll need 2 blocks down the function body. Better allocate it too. Oh no, now I have to free it, so both branches of the if-statement have to check the pointer and free it. Oh, and inside this loop too; I can't just short-circuit it by returning from the function, I need an exit block for cleaning up my allocations. Oh, but this function might be called from a performance-critical part of the code! Better not do allocations here, let the caller pass it in. Oh wait, that changes the signature of this function, so I can't put it in the generic table of function pointers to callbacks anymore, I need a control block to store the necessary information. Oh wait, I have to allocate the control block too. Who's gonna free it? When should it be freed?"
And on and on it goes. Pretty soon, you find yourself spending an inordinate amount of time and effort fiddling with memory management rather than making progress in the problem domain, i.e., actually solving the problem you set out to solve in the first place.
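To make that concrete, here's a hedged sketch of that bookkeeping using C-style allocation from D (all names are made up for illustration):

    import core.stdc.stdlib : free, malloc;

    bool processList(const(int)* items, size_t n)
    {
        int* sortBuf = cast(int*) malloc(n * int.sizeof);
        if (sortBuf is null) return false;

        int* scratch = cast(int*) malloc(n * int.sizeof);
        if (scratch is null)
        {
            free(sortBuf);   // must remember this on *every* exit path
            return false;
        }

        // ... the actual problem-domain work goes here ...

        free(scratch);       // cleanup duplicated at every return point
        free(sortBuf);
        return true;
    }

Nearly every line is memory bookkeeping; the problem-domain work is reduced to a single comment's worth of space.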
And worse yet:
b) Your APIs become cluttered with memory management paraphernalia. Instead of only input parameters that are directly related to the problem domain the function is supposed to do work in, you must also include memory-management related stuff. Like allocators, and wrapped pointers -- because nobody can keep track of raw pointers without eventually tripping up, you'd better wrap them in a managed pointer like unique_ptr<> or some ref-counted handle. But should you use unique_ptr<> or shared_ptr<> or something else? In a large project, some functions will expect unique_ptr<>, others will expect shared_ptr<>, and when you need to put them together, you have to insert additional code for interconverting between your wrapped pointer types. (And you need to take extra care not to screw up the semantics and leak or corrupt memory.)
The net result is, memory management paraphernalia percolates throughout your code, polluting every API and demanding extra code for interconverting / gluing disparate memory management conventions together. Extra code that doesn't help you make any progress in your problem domain, but has to be there because of manual memory management.
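As a sketch of that friction in D terms (the handle types and functions here are purely illustrative, standing in for whatever wrapper conventions a project accumulates):

    struct UniqueHandle(T)  { T* ptr; /* frees ptr in its dtor */ }
    struct CountedHandle(T) { T* ptr; size_t* refs; /* ref-counted */ }

    struct Image { int width, height; }

    // One part of the codebase settled on ref-counted handles...
    void compress(ref CountedHandle!Image img) { /* ... */ }

    // ...another insists on unique ownership:
    void render(ref UniqueHandle!Image img) { /* ... */ }

A caller holding one kind of handle can't call the other API without writing ownership-transfer glue between the two conventions -- code that serves memory management only.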
c) So you went through all of the above troubles because you believed that it would save you from the bogeyman of unwanted GC pauses and keep you in control of the inner workings of your program. But does it really live up to its promises? Not necessarily.
If you have a graph of allocated objects, for example, then when the last reference to some node in that graph goes out of scope, you have to deallocate the entire graph. The dtor must recursively traverse the entire structure and destruct everything, because past that point you no longer have a reference to the graph, and would leak the memory if you didn't clean up now. And here's the thing: in a sufficiently complex program, (1) you cannot predict the size of this graph -- it's potentially unbounded; and (2) you cannot predict where in the code the last reference will go out of scope (when the refcount drops to 0, if you're using refcounting). The net result is: your program will unpredictably get to a point where it must spend an unbounded amount of time deallocating a large graph of allocated objects.
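A sketch of that teardown in D terms (using a binary tree instead of a general graph, for brevity):

    import core.stdc.stdlib : free;

    struct Node
    {
        Node* left, right;
    }

    // The pause is proportional to the size of the whole structure,
    // and it happens at whatever point in the code the last reference
    // happens to be dropped.
    void freeTree(Node* n)
    {
        if (n is null) return;
        freeTree(n.left);
        freeTree(n.right);
        free(n);
    }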
IOW, this is not that much different from the GC having to pause and do a collection at an unpredictable time.
So you put in all this effort just to avoid this bogeyman, and lo and behold you haven't got rid of it at all!
Furthermore, on today's CPU architectures with cache hierarchies and memory access prediction units, one very important factor in performance is locality. I.e., if your program accesses memory in a sequential pattern, or within close proximity, it tends to run faster than if it had to successively access multiple random locations in memory. If you manage memory yourself, then when a large graph of objects goes out of scope you're forced to clean it up right there and then -- even if the nodes happen to be widely scattered across memory (because they were allocated at different times in the program and attached to the graph). If you used a GC, however, the GC could change the order in which it scans for garbage in a way that has better cache utility -- because the GC isn't obligated to clean up immediately, but can wait until there's enough garbage that a single sweep would pick up pieces of diverse object graphs that happen to be close to each other in memory, and clean them up in sequential order so that there are fewer CPU cache misses.
Or, to put it succinctly, the GC can sometimes outperform your manual management of memory!
d) Lastly, memory management is hard. Very hard. So hard that, after how many decades of industry experience with manual memory management in C/C++, well-known, battle-worn large software projects are still riddled with memory management bugs that lead to crashes and security exploits. Just check the CVE database, for example. An inordinately large proportion of security bugs are related to memory management.
Using a GC immediately gets rid of 90% of these issues. (Not 100%, unfortunately, because there are still cases where problems may arise. See: "memory management is hard".) If you don't need to write the code that frees memory, then by definition you cannot introduce bugs while doing so.
This leads us to the advantages of having a GC:
1) It greatly reduces the number of memory-related bugs in your program. Gets rid of an entire class of bugs related to manually managing your allocations.
2) It frees up your mental resources to make progress in your problem domain, instead of endlessly worrying about the nitty-gritty of memory management at every turn. More mental resources available means you can make progress in your problem domain faster, and with lower chances of bugs.
3) Your APIs become cleaner. You no longer need memory management paraphernalia polluting your APIs; your parameters can be restricted to only those that are required for your problem domain and nothing else. Cleaner APIs lead to less boilerplate / glue code for interfacing between APIs that expect different memory management schemes (e.g., converting between unique_ptr<> and shared_ptr<> or whatever). Diverse modules become more compatible with each other, and can call each other with less friction. Less friction means shorter development times, fewer bugs, and better maintainability (code without memory management paraphernalia is much easier to read -- and understand correctly, so that you can make modifications without introducing bugs).
4) In some cases, you may even get better runtime performance than if you manually managed everything.
//
And as a little footnote: D's GC does not run in the background independently of your program's threads; GC collections will NOT trigger unless you're allocating memory and the GC runs out of memory to give you. Meaning that you *do* have some control over GC pauses in your program -- if you want to be sure you have no collections in some piece of code, simply don't do any allocations, and collections won't start.
If you're worried that another thread might trigger a collection, you can always bring out the GC.disable() hammer to stop the GC from running any collections even in the face of continuing allocations. (And then call GC.enable() later when it's safe for collections to run again.)
And if you're like me, and you like more control over how things are run in your program, you can even call GC.disable() and then periodically call GC.collect() on your own schedule, at your own convenience. (In one of my D projects, I managed to eke out a 20-25% performance boost just by reducing the frequency of GC collections, running GC.collect() on my own schedule.)
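A sketch of that pattern (doFrame and the collection interval are illustrative stand-ins for a real main loop and real tuning):

    import core.memory : GC;

    void doFrame()
    {
        auto tmp = new int[](64);   // allocate freely; no pause here
        tmp[0] = 42;
    }

    void main()
    {
        GC.disable();       // suppress automatic collections (the runtime
                            // may still collect in a true out-of-memory pinch)
        scope(exit) GC.enable();

        foreach (frame; 0 .. 10_000)
        {
            doFrame();
            if (frame % 1_000 == 999)
                GC.collect();   // collect on *our* schedule instead
        }
    }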
//
Also, in those few places in your code where the GC really *does* get in your way, there's @nogc at your disposal. The compiler will statically enforce zero GC usage in such functions, so that you can be sure you won't trigger any collections and you won't make any new GC allocations.
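A minimal sketch of that enforcement (names are illustrative): the function below compiles because it only writes into caller-provided memory, and uncommenting the marked line makes the compiler reject it:

    @nogc nothrow
    void frameUpdate(int[] scratch)
    {
        scratch[] = 0;   // fine: no allocation, just writes
        // auto buf = new int[](256);  // Error: `new` allocates from the
        //                             // GC, so @nogc code can't use it
    }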
//
So you see, the GC isn't really *that* bad, as if it were a plague that you have to avoid at all costs. It's actually a good helper if you know how to make use of its advantages.
T
--
Why waste time reinventing the wheel, when you could be reinventing the engine? -- Damian Conway
|