January 14, 2021
On Thursday, 14 January 2021 at 13:16:16 UTC, Ola Fosheim Grøstad wrote:
> 1. Use "shared" to prevent GC allocated memory from entering other threads and switch to thread local GC. Then use ARC for shared.
>
> 2. Redefine language semantics/type system for a different GC model. This will break existing code.

3. Keep the existing GC for existing code and introduce ARC across the board for new code. Add a versioning statement that people can add to their libraries to tell the compiler which models they support.
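Purely as an illustration of what I have in mind, borrowing today's version-block syntax (neither identifier exists, and the real mechanism could look entirely different):

```d
// Hypothetical only: neither version identifier below exists today.
version (D_TracingGC)
{
    // code paths that assume the classic tracing GC
}
else version (D_ARC)
{
    // code paths written against an ARC model
}
```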



January 14, 2021
On Thursday, 14 January 2021 at 13:05:31 UTC, sighoya wrote:
> But this is already the case for C++ and Rust. Remembering the days back when I was developing in C++, there were a huge number of memory-deallocation side effects, because OpenCV's memory management differs from Qt's.

The problem in C++ is that older frameworks have their own ways of doing things for performance reasons or because the C++ standard they started with didn't provide what they needed...

And... most C++ frameworks that are big are old... If you avoid big frameworks then it gets better.

> Personally, I find it better to encapsulate manual memory management and not let it leak outside.

Yes. Most programmers don't need system-level programming. So if D defined itself as not being a system-level programming language, there would be room for a lot of improvement, but then it should move towards more high-level features and prevent the use of some low-level features, like untagged (non-discriminated) unions containing pointers. Rust is more high level than D... I think.
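To make that last kind of low-level feature concrete, this is the sort of thing I mean (an illustration, nothing more):

```d
// Illustrative only: the collector cannot know which member is meaningful,
// so it must either conservatively treat the bits as a pointer or risk
// collecting memory that is still referenced.
union Untagged
{
    void*  ptr;    // might point into the GC heap
    size_t bits;   // or might just be a number that looks like an address
}
```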

January 14, 2021
On Wednesday, 13 January 2021 at 18:58:56 UTC, Marcone wrote:
> I've always heard programmers complain about the Garbage Collector (GC). But I never understood why they complain. What's bad about GC?

I like GC.

How do you write quickly without a GC?

January 14, 2021
On Thursday, 14 January 2021 at 14:28:43 UTC, Виталий Фадеев wrote:
> On Wednesday, 13 January 2021 at 18:58:56 UTC, Marcone wrote:
>> I've always heard programmers complain about the Garbage Collector (GC). But I never understood why they complain. What's bad about GC?
>
> How do you write quickly without a GC?

In DMD style: never release memory!

This is not an option for long-running programs though, nor for anything that otherwise uses significant amounts of memory.

Better to just use the GC if unsure.
January 14, 2021
On Thursday, 14 January 2021 at 10:28:13 UTC, Basile B. wrote:
> On Wednesday, 13 January 2021 at 18:58:56 UTC, Marcone wrote:
>> I've always heard programmers complain about the Garbage Collector (GC). But I never understood why they complain. What's bad about GC?
>
> Semi serious answer:
>
> In the domain of hobbyism and small companies, programmers who work with statically typed languages all believe that they are superheroes in the domain of memory management. When they see "GC" they think that they are being treated as 2nd-grade students ^^
>
> It's basically snobbism.

Hi Basile,

My experience:

In the '90s I worked with Pascal, C and C++ with rudimentary memory management: basically there was no difference between working with memory and working with files in terms of life-cycle management: you had to alloc/free memory and you had to open/close files.  The secret to "stability" was a set of conventions to determine who was responsible for the resource handle or memory pointer.  I developed some ERP/CRMs, some multimedia products and some industrial-environment applications (real-time ones).

 At the end of the '90s I began to work with VB and the COM model (which uses reference counting), and I discovered that the best way to manage memory (avoiding deadlocks) was to treat objects as "external" unmanaged resources: the VB6 "WITH" statement was key to using ARM (automatic resource management) techniques (similar to the future "using" in C#).

And then GC arrived with C#, Java and Scala: I have found GC good enough for all the applications and services I have been developing over the last 20 years, because these languages (and their frameworks and associated libraries) have never crossed certain limits: they always separated managed and unmanaged resources. The developer is responsible for unmanaged resources, and memory is managed by the GC. The language itself offers you good tooling for ARM (like "using" in C#, "try-with-resources" in Java, ...).
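(As far as I know, D's closest counterparts are scope(exit) and RAII structs; a minimal sketch, just to show the same managed/unmanaged split:)

```d
import core.stdc.stdio : FILE, fopen, fclose, fputs;

void writeReport(const(char)* path)
{
    FILE* f = fopen(path, "w");   // unmanaged resource: our responsibility
    if (f is null) return;
    scope(exit) fclose(f);        // deterministic cleanup, like "using" or
                                  // "try-with-resources", while memory itself
                                  // stays under the GC's care
    fputs("report\n", f);
}
```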

Finally, the last actors arrived on the scene: mainly JavaScript and its derivatives (when working in a browser context), where the developer is abstracted away from how memory and resources are really managed (I can remember critical bugs in Chrome, like Image object memory leaks, caused by this "abstraction").

GC has introduced a "productive" way of working, removing old memory problems for large-scale projects (and, eventually, for other kinds of resources in some scenarios), but, as developers/architects, we have the responsibility to recognize the limits of each technique and when it fits our needs.

In the end, my opinion is that if I were to develop something like a real-time app (industrial/medical/aeronautics/...) or a game where a large number of objects must be mutated ~30 times per second, the GC's "unpredictable" or "large" time cost would be enough to stop me from using it. There are other reasons too (like "efficient" memory management when we need to handle large amounts of memory or run in memory-constrained environments).


I understand perfectly the people in the D community who need to work without the GC:  **it is not snobbish**:  it is a real need.  But not only a "need"... sometimes it is simply the way a team wants to work:  explicit memory management vs. GC.

D took the GC route without "cutting" the relationship with C/C++ developers.  I really don't have enough knowledge of the language and libraries to know the level of support that D offers for non-GC-based development, but I find it completely logical to try to maintain this relationship (on the basis that the GC must remain the default way of working).

Sorry for my "extended", maybe unnecessary, explanation (and my "poor" English :-p).
January 14, 2021
On Thursday, 14 January 2021 at 09:26:06 UTC, Ola Fosheim Grøstad wrote:
> On Thursday, 14 January 2021 at 00:37:29 UTC, mw wrote:
>> ok, what I really mean is:
>>
>> ... in other "(more popular) languages (than D, and directly supported by the language & std library only)" ...
>
> Well, even Python supports both

Python's `del` isn't guaranteed to free the memory; that's what we are discussing here: core.memory.GC.free / core.stdc.stdlib.free

https://www.quora.com/Why-doesnt-Python-release-the-memory-when-I-delete-a-large-object

In CPython (the default reference distribution), garbage collection in Python is not guaranteed to run when you delete the object - all del (or the object going out of scope) does is decrement the reference count on the object. The memory used by the object is not guaranteed to be freed and returned to the process's pool at any time before the process exits. Even if the garbage collection does run - all it needs is another object referencing the deleted object and the garbage collection won’t free the object at all.
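In D, the explicit options look like this (a minimal sketch, nothing more):

```d
import core.memory : GC;
import core.stdc.stdlib : malloc, free;

void main()
{
    // GC-allocated block, released deterministically instead of waiting
    // for a collection -- the guarantee that Python's `del` does not give.
    void* a = GC.malloc(1024);
    GC.free(a);

    // C-heap allocation: the collector never sees it, so we must free it.
    void* b = malloc(1024);
    free(b);
}
```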


January 14, 2021
On Thursday, 14 January 2021 at 18:10:43 UTC, mw wrote:
> Python's `del` isn't guaranteed to free the memory; that's what

Fair point, but I was thinking of the C interop interface. You can create your own wrapper (e.g. numpy) and do manual memory management, but it isn't something people want to do! It is mostly pointless to do that within Python because of the existing overhead.  That applies to most high level languages; you can, but it is pointless. You only do it for interop...

One can follow the same kind of reasoning for D. It makes no sense for people who want to stay high level and do batch programming. Which is why this disconnect exists in the community... I think.

January 14, 2021
On Thursday, 14 January 2021 at 15:18:28 UTC, ddcovery wrote:
>
> I understand perfectly the people in the D community who need to work without the GC:  **it is not snobbish**:  it is a real need.  But not only a "need"... sometimes it is simply the way a team wants to work:  explicit memory management vs. GC.

D already supports manual memory management, so that escape hatch was always there. My main criticism of D is the inability to freely exchange GC algorithms, as one type of GC might not be the best fit for everyone. The problem, of course, is that there is no differentiation between raw and fat pointers. With fat pointers, the community would have better opportunities to experiment with different GC designs, which would lead to a larger palette of GC algorithms.
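To illustrate what I mean by a fat pointer (a sketch of a possible layout, not a concrete proposal):

```d
// Sketch only: today a D pointer is a bare address. A "fat" pointer could
// carry extra words for whichever collector is plugged in.
struct FatPtr(T)
{
    T* addr;       // the raw address, as today
    TypeInfo ti;   // precise type/layout info a tracing collector could use
    size_t meta;   // spare word for GC bookkeeping: ref count, generation, ...
}
```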
January 14, 2021
On Thu, Jan 14, 2021 at 12:36:12PM +0000, claptrap via Digitalmars-d-learn wrote: [...]
> I think you also have to consider that the GC you get with D is not state of the art, and if the opinions expressed on the newsgroup are accurate, it's not likely to get any better. So while you can find examples of high-performance applications, AAA games, or whatever that use GC, I doubt any of them would be feasible with D's GC. And given the design choices D has made as a language, a high-performance GC is not really possible.

To be fair, the GC *has* improved over the years.  Just not as quickly as people would like, but it *has* improved.


> So the GC is actually a poor fit for D as a language. It's like a convertible car with a roof that is only safe up to 50 mph: go over that and it's likely to be torn off. So if you want to drive fast you have to put the roof down.

How much D code have you actually written and optimized?  That analogy is inaccurate.  IME, performance issues caused by the GC are generally localized, and easy to fix either by replacing that small part of the code with a bit of manual memory management (you *can* rewrite a function not to use the GC; this isn't the Java straitjacket, y'know!), or by applying standard GC optimization techniques like reducing GC load in hot loops.  There's also GC.disable and GC.collect for those times when you want more control over exactly when collection pauses happen.

I wrote a compute-intensive program once, and after some profiling revealed the GC to be a bottleneck, I:

(1) Refactored one function called from an inner loop to reuse a buffer instead of allocating a new one each time, thus eliminating a large amount of garbage from small allocations;

(2) Used GC.disable and scheduled my own GC.collect calls at a slightly reduced frequency.

The result was about a 40-50% reduction in runtime, which is close to a 2x speedup.
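In sketch form, the two changes looked roughly like this (names are made up, not the original code):

```d
import core.memory : GC;

// Reuse one buffer in the hot loop and schedule collections ourselves.
void processAll(const(int[])[] inputs)
{
    int[] buf;                        // reused across iterations instead of
                                      // allocating a fresh array every time
    GC.disable();                     // no automatic collections mid-loop
    scope(exit) GC.enable();

    foreach (i, input; inputs)
    {
        if (buf.length < input.length)
            buf.length = input.length;    // grows occasionally, not every pass
        buf[0 .. input.length] = input[];

        // ... compute on buf[0 .. input.length] ...

        if (i % 10_000 == 0)
            GC.collect();             // collect on our own, reduced schedule
    }
}
```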

Now, you'll argue that had I written this code without a GC in the first place I wouldn't have needed to do all this.  However:

(a) Because I *had* the GC, I could write this code in about 1/5 of the time it would've taken me to write it in C++;

(b) The optimization involved only changing a couple of lines of code in 2-3 functions -- a couple of days' work at most -- as opposed to blindly optimizing *every* single danged line of code, 95% of which wouldn't even have had any noticeable effect because they *are not the bottleneck*;

(c) The parts of the code that aren't in the hot path can still freely take advantage of the GC, require minimal effort to write, and stay free of the time-consuming bugs that often creep into code that manually manages memory.

As I said, it's an ROI question. I *could* have spent 5x the amount of time and effort to write the perfect, GC-less, macho-hacker-style code, and get maybe about a 1-2% performance improvement. But why would I?  It takes 5x less effort to write GC code, and only a couple more days of effort to fix GC-related performance issues, vs. 5x the development effort to write the entire program GC-less, and who knows how much longer after that to debug obscure pointer bugs.  Life is too short to squander chasing down the 1000th double-free and the 20,000th dangling pointer.

A lot of naysayers keep repeating GC performance issues as if it's a black-and-white, all-or-nothing question.  It's not.  You *can* write high-performance programs even with D's supposedly lousy GC -- just profile the darned thing, and refactor the hotspots to reduce GC load or avoid the GC. *In those parts of the code that actually matter*.  You don't have to do this for the *entire* lousy program.  The non-hot parts of the code can still GC away like there's no tomorrow, and your customers would hardly notice a difference.  This isn't Java where you have no choice but to use the GC everywhere.

Another example: one day I had some spare time, and wrote fastcsv (http://github.com/quickfur/fastcsv).  It's an order of magnitude faster than std.csv, *and it uses the evil GC*.  I just applied the same technique: write it with GC, then profile to find the bottlenecks. The first round of profiling showed that there tend to be a lot of small allocations, which create lots of garbage, which means slow collection cycles.  The solution? Use a linear buffer instead of individual allocations for field/row data, and use slices where possible instead of copying the data.  By reducing GC load and minimizing copying, I got huge boosts in performance -- without throwing out the GC with the bathwater.  (And note: it's *because* I can rely on the GC that I can use slices so freely; if I had to use RC or manage this stuff manually, it'd take 5x longer to write and would involve copying data all over the place, which means it'd probably lose out in overall performance.)
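The core of the trick, in simplified form (not the actual fastcsv code):

```d
// Simplified illustration: parse fields as slices of one shared input buffer
// instead of copying each field into its own freshly allocated string. The
// appends below still go through the GC, but the field data is never copied.
const(char)[][] splitFields(const(char)[] line)
{
    const(char)[][] fields;
    size_t start = 0;
    foreach (i, c; line)
    {
        if (c == ',')
        {
            fields ~= line[start .. i];  // a slice: no copy of the data
            start = i + 1;
        }
    }
    fields ~= line[start .. $];
    return fields;
}
```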


But then again, it's futile to argue with people who have already made up their minds about the GC, so meh. Let the bystanders judge for themselves. I'll shut up now. *shrug*


T

-- 
Study gravitation, it's a field with a lot of potential.
January 15, 2021
On Friday, 15 January 2021 at 07:35:00 UTC, H. S. Teoh wrote:
> On Thu, Jan 14, 2021 at 12:36:12PM +0000, claptrap via Digitalmars-d-learn wrote: [...]
>> [...]
>
> To be fair, the GC *has* improved over the years.  Just not as quickly as people would like, but it *has* improved.
>
> [...]

Nice strategy, using GC and optimizing where you need it.