January 15, 2021
On Friday, 15 January 2021 at 07:35:00 UTC, H. S. Teoh wrote:
> To be fair, the GC *has* improved over the years.  Just not as quickly as people would like, but it *has* improved.

It cannot improve enough as a global collector without write barriers. No language has been able to do this. Therefore, D cannot do it.

Precise collection only helps when you have few pointers to trace.


> improvement. But why would I?  It takes 5x less effort to write GC code, and requires only a couple more days of effort to fix

That's like saying it takes 5x more time to write code in Swift than in D. That is not at all reasonable.

Tracing GC is primarily useful when you have many small long-lived objects with unclear ownership and cyclic references that are difficult to break with weak pointers.

In those cases it is invaluable, but most well-designed programs have more tree-like structures and clear ownership.


> after that to debug obscure pointer bugs.  Life is too short to be squandered chasing down the 1000th double-free and the 20,000th dangling pointer in my life.

That has nothing to do with a tracing GC... Cyclic references are the only significant problem a tracing GC addresses compared to other solutions.


> A lot of naysayers keep repeating GC performance issues as if it's a black-and-white, all-or-nothing question.  It's not.  You *can* write high-performance programs even with D's supposedly lousy GC -- just profile the darned thing, and

There are two main problems, and neither is throughput:

1. LATENCY: stopping the world will never be acceptable in interactive applications beyond a certain size; it is only acceptable in batch programs. In fact, even incremental collectors can cause a sluggish experience!

2. MEMORY CONSUMPTION: doing fewer collection cycles increases the memory footprint. Ideally the collector would run all the time. In the cloud you pay for memory, so you want to keep memory consumption at a fixed level that you never exceed.


Systems-level programming is primarily valuable for interactive applications, OS-level programming, or embedded work. So, no, it is not snobbish to reject a sluggish GC. Most other tasks are better done in high-level languages.


January 15, 2021
On Friday, 15 January 2021 at 08:49:21 UTC, Imperatorn wrote:

>
> Nice strategy, using GC and optimizing where you need it.

That's the whole point of being able to mix and match. Anyone avoiding the GC completely is missing out (unless they really, really must be GC-less).
January 15, 2021
On Friday, 15 January 2021 at 11:11:14 UTC, Mike Parker wrote:
> That's the whole point of being able to mix and match. Anyone avoiding the GC completely is missing out (unless they really, really must be GC-less).

Has DMD switched to using the GC as the default?
January 15, 2021
On Friday, 15 January 2021 at 11:28:55 UTC, Ola Fosheim Grøstad wrote:
> On Friday, 15 January 2021 at 11:11:14 UTC, Mike Parker wrote:
>> That's the whole point of being able to mix and match. Anyone avoiding the GC completely is missing out (unless they really, really must be GC-less).
>
> Has DMD switched to using the GC as the default?

No, and it never will. Currently DMD uses a custom allocator for almost everything. It works as follows: allocate a big chunk (1 MB) of memory using malloc and keep an internal pointer to the beginning of the unallocated part. When someone asks for memory, return that pointer and advance it by the 16-byte-aligned size of the allocation, so the new pointer points to unused memory and everything behind it has been allocated. This simple allocation strategy is called bump the pointer, and it improved DMD's performance by ~70%.
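In code, bump the pointer is roughly the following (a minimal C++ sketch of the idea as described above, not DMD's actual implementation; all names are invented):

```cpp
#include <cstddef>
#include <cstdlib>

// Toy bump-the-pointer allocator: grab a 1 MB chunk from malloc, then
// satisfy each request by handing out the current pointer and advancing
// it by the 16-byte-aligned request size. Nothing is ever freed.
struct BumpAllocator {
    static constexpr std::size_t CHUNK = 1 << 20; // 1 MB
    char* base = nullptr;
    std::size_t used = 0;

    void* allocate(std::size_t size) {
        size = (size + 15) & ~std::size_t(15);  // round up to 16-byte alignment
        if (!base || used + size > CHUNK) {     // current chunk exhausted:
            base = static_cast<char*>(std::malloc(CHUNK)); // grab a fresh one
            used = 0;                           // (old chunks are simply leaked)
        }
        void* p = base + used;  // hand out the bump pointer...
        used += size;           // ...and advance it past the allocation
        return p;
    }
};
```

Allocation is just an add and a compare, which is why it is so much faster than a general-purpose allocator.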

You can make the D compiler use the GC by passing the -lowmem flag. I didn't measure it, but I've heard it can increase compilation time by 3x.

https://github.com/dlang/dmd/blob/master/src/dmd/root/rmem.d#L153
January 15, 2021
On Friday, 15 January 2021 at 14:24:40 UTC, welkam wrote:
> You can make the D compiler use the GC by passing the -lowmem flag. I didn't measure it, but I've heard it can increase compilation time by 3x.

Thanks for the info. 3x is a lot, though; maybe it could be improved with precise collection, but I assume that would require a rewrite.

Making it use automatic garbage collection (of some form) would be an interesting benchmark.

January 15, 2021
On Thursday, 14 January 2021 at 18:51:16 UTC, Ola Fosheim Grøstad wrote:
> One can follow the same kind of reasoning for D. It makes no sense for people who want to stay high level and do batch programming. Which is why this disconnect exists in the community... I think.

The reasoning for not implementing write barriers is that they would hurt low-level programming. But I feel that if we drew a Venn diagram of the people who rely on the GC and those who do a lot of writes through a pointer, we would get almost no overlap. In other words, if the D compiler had a switch that turned on write barriers and a better GC, I think many people would use it and find the trade-offs acceptable.
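For context, a write barrier is just a small hook run on every pointer store. A card-marking variant, the kind generational/incremental collectors typically use, might look like this (a hypothetical C++ sketch of the general technique, not a concrete D proposal):

```cpp
#include <cstddef>
#include <cstdint>

// Card-marking write barrier: the heap is divided into fixed-size "cards",
// with one dirty byte per card. Every pointer store also marks the card
// covering the written slot, so the collector only has to rescan dirty
// cards instead of the whole heap. The cost is the extra work on each
// pointer write -- exactly what low-level code is afraid of.
constexpr std::size_t CARD_SHIFT = 9;             // 512-byte cards
constexpr std::size_t HEAP_SIZE  = 1 << 20;
std::uint8_t card_table[HEAP_SIZE >> CARD_SHIFT]; // one dirty flag per card
char heap[HEAP_SIZE];

inline void write_barrier(void** slot, void* value) {
    *slot = value;                                // the actual pointer store
    std::size_t offset = reinterpret_cast<char*>(slot) - heap;
    card_table[offset >> CARD_SHIFT] = 1;         // mark the card dirty
}
```

A compiler switch would amount to emitting this hook (or not) at every pointer store into the GC heap.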
January 15, 2021
On Friday, 15 January 2021 at 14:50:00 UTC, welkam wrote:
> The reasoning for not implementing write barriers is that they would hurt low-level programming. But I feel that if we drew a Venn diagram of the people who rely on the GC and those who do a lot of writes through a pointer, we would get almost no overlap. In other words, if the D compiler had a switch that turned on write barriers and a better GC, I think many people would use it and find the trade-offs acceptable.


Yes, I think this is what we need: some way of making the compiler know which pointers have to be traced, so that it can avoid redundant ones. For instance, a type for telling the compiler that a pointer is non-owning. Then we wouldn't have to use a write barrier for that non-owning pointer, I think? Or maybe I am missing something?

Then we can also have a switch.

But I also think that we could do this:

1. Make all class objects GC allocated and use write barriers for those.
2. Allow non-owning annotations for class object pointers.
3. Make slices and dynamic arrays RC.
4. Let structs be held unique_ptr-style (the Rust/C++ default).
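The four points above map roughly onto constructs C++ already has; as an analogy only (this is C++ with invented names, not proposed D semantics):

```cpp
#include <cstddef>
#include <memory>
#include <vector>

struct Widget { int id; };

// 1/2: a GC-allocated class object with a non-owning annotation is roughly
//      shared ownership (traced) plus a plain raw pointer (not traced).
std::shared_ptr<Widget> make_owner(int id) {
    return std::make_shared<Widget>(Widget{id});
}

// 3: an RC slice resembles shared ownership of the underlying array
//    plus an offset/length view into it.
struct RcSlice {
    std::shared_ptr<std::vector<int>> data;
    std::size_t offset = 0, length = 0;
};

// 4: unique_ptr-held structs are the Rust/C++ default single-owner model.
std::unique_ptr<Widget> make_unique_widget(int id) {
    return std::make_unique<Widget>(Widget{id});
}
```

The point of the analogy: only category 1 objects would need tracing and write barriers; the rest are handled by counting or unique ownership.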

Then we need a way to improve precise tracing:
1. Make use of LLVM's precise stack/register information.
2. Introduce tagged unions and only allow redundant pointers in untagged unions.
3. Have each compile phase emit information for the GC.
4. Before linking, have the compiler generate code to narrowly trace the correct pointers.

Then we don't have to deal with run-time type information lookup and don't have to do expensive lookups to figure out whether a pointer points to GC memory or not. The compiler can then just assume that the generated collection code is exact.

January 15, 2021
On Friday, 15 January 2021 at 14:59:18 UTC, Ola Fosheim Grøstad wrote:
> On Friday, 15 January 2021 at 14:50:00 UTC, welkam wrote:
> avoid redundant pointers. For instance, a type for telling the compiler that a pointer is non-owning.

I guess "non-owning" is the wrong term. I mean pointers that are redundant. Not all "non-owning" pointers are redundant.

January 15, 2021
On Friday, 15 January 2021 at 14:24:40 UTC, welkam wrote:
>
> No, and it never will. Currently DMD uses a custom allocator for almost everything. It works as follows: allocate a big chunk (1 MB) of memory using malloc and keep an internal pointer to the beginning of the unallocated part. When someone asks for memory, return that pointer and advance it by the 16-byte-aligned size of the allocation, so the new pointer points to unused memory and everything behind it has been allocated. This simple allocation strategy is called bump the pointer, and it improved DMD's performance by ~70%.
>
> You can make the D compiler use the GC by passing the -lowmem flag. I didn't measure it, but I've heard it can increase compilation time by 3x.
>
> https://github.com/dlang/dmd/blob/master/src/dmd/root/rmem.d#L153

Actually, druntime uses mmap (Linux) and VirtualAlloc (Windows) to get more memory from the OS. C-library malloc is an option, but it is not used on most platforms, and it also wastes memory because of alignment requirements.

Bump the pointer is a very fast way to allocate memory, but what is more interesting is what happens when you return the memory. What does the allocator do with chunks of free memory? Does it put them in a free list? Does it merge chunks? I have a feeling that bump the pointer is not the complete algorithm D uses, because if it were the only one, D would waste a lot of memory.

As far as I can see, it is simply very difficult to create a completely lockless allocator. Somewhere down the line there will be a lock, even if you don't add one in druntime (the lock will be in the kernel instead, when requesting more memory). Merging chunks is also difficult without locks.
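The fast path can still be lock-free, though: bumping within an existing chunk is a single atomic fetch_add, and the lock is only needed on the slow path when a new chunk must come from the OS. A small C++ sketch of that split (hypothetical names):

```cpp
#include <atomic>
#include <cstddef>

// Lock-free fast path for bump allocation: threads race on a single
// atomic offset into a fixed chunk. Only when the chunk is exhausted
// does the caller have to take the slow path (new chunk from the OS),
// where a lock -- in user space or inside the kernel -- is unavoidable.
char chunk[1 << 20];
std::atomic<std::size_t> offset{0};

void* try_alloc(std::size_t size) {
    size = (size + 15) & ~std::size_t(15);  // 16-byte alignment
    std::size_t old = offset.fetch_add(size, std::memory_order_relaxed);
    if (old + size > sizeof(chunk))
        return nullptr;   // chunk exhausted: slow path (and lock) needed
    return chunk + old;
}
```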

January 15, 2021
On Friday, 15 January 2021 at 14:50:00 UTC, welkam wrote:
> On Thursday, 14 January 2021 at 18:51:16 UTC, Ola Fosheim Grøstad wrote:
>> One can follow the same kind of reasoning for D. It makes no sense for people who want to stay high level and do batch programming. Which is why this disconnect exists in the community... I think.
>
> The reasoning for not implementing write barriers is that they would hurt low-level programming. But I feel that if we drew a Venn diagram of the people who rely on the GC and those who do a lot of writes through a pointer, we would get almost no overlap. In other words, if the D compiler had a switch that turned on write barriers and a better GC, I think many people would use it and find the trade-offs acceptable.

Hypothetically, would it be possible for users to supply their own garbage collector that uses write barriers?