December 04, 2022

On Sunday, 4 December 2022 at 09:53:41 UTC, vushu wrote:

> Dear dlang community.
>
> I am unsure about what idiomatic D is.
>
> Some of the DConf talks tell people to just use the GC until you can't afford it.
>
> If there are documents that describe what idiomatic D is, then I would appreciate it.
>
> So my questions are:
>
> What are your thoughts about using the GC as a library writer?
>
> If you want to include a library in your project, aren't you more inclined to use a library which is GC free?
>
> If that is true, then idiomatic D doesn't apply to library writers.
>
> Since to get the most exposure as a D library writer you kinda need to make it GC free, right?
>
> Cheers.

D gives you the choice

But the most important thing is your use case: what kind of library are you making?

Once you answer this question, you can then ask what your memory strategy should be, and then it is based on performance concerns

D scales from microcontrollers to servers, drivers, games, desktop apps

Your audience will determine what you should provide

For a desktop app, a GC is an advantage

For a driver or a game, it's not
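
As a rough sketch of what leaving the choice to the user can look like for a library author: the function writes into a caller-supplied sink, so a desktop app can pass a GC-backed `appender` while a game can pass a fixed buffer. The names `joinWords` and `FixedBuffer` are made up for illustration; this is one possible approach, not an established convention.

```d
import std.array : appender;

// Hypothetical library function: the caller's sink decides where memory comes from.
// Attributes are inferred, so it is @nogc whenever the sink's put() is.
void joinWords(Sink)(ref Sink sink, const(char[])[] words, char sep = ' ')
{
    foreach (i, const(char)[] w; words)
    {
        if (i > 0)
            sink.put(sep);
        sink.put(w);
    }
}

// Illustrative fixed-size sink that never touches the GC.
struct FixedBuffer
{
    char[64] data;
    size_t length;

    void put(char c) @nogc nothrow
    {
        assert(length < data.length, "buffer full");
        data[length++] = c;
    }

    void put(scope const(char)[] s) @nogc nothrow
    {
        foreach (c; s)
            put(c);
    }

    const(char)[] text() const @nogc nothrow
    {
        return data[0 .. length];
    }
}

// GC user: convenient, allocates as needed.
unittest
{
    auto app = appender!string();
    joinWords(app, ["hello", "world"]);
    assert(app.data == "hello world");
}

// @nogc user: the compiler checks that nothing here allocates with the GC.
@nogc nothrow unittest
{
    static immutable string[] words = ["hello", "world"];
    FixedBuffer buf;
    joinWords(buf, words);
    assert(buf.text == "hello world");
}
```

The trade-off is a slightly less convenient API (no returned string) in exchange for not forcing a memory strategy on the audience.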

December 04, 2022
On Sunday, 4 December 2022 at 16:02:28 UTC, Ali Çehreli wrote:
> D's GC needed to stop the world, which meant it would have to know what threads were running. You can never be sure whether your D library function is being called from a thread you've known or whether the Java runtime (or other user code) just decided to start another thread.

Interesting... you know, maybe D's GC should formally expose a mutex that you can synchronize on for when it is running. So you can cooperatively do this in the jni bridge or something. Might be worth considering.

I've heard stories about similar things happening with C#.
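
For what it's worth, a manual form of that cooperation already exists: druntime's `thread_attachThis` registers the calling thread with the runtime so that a stop-the-world collection can find it. A minimal sketch, assuming the host process has already initialized druntime (for example through `rt_init`); the exported name `mylib_sum` is made up for illustration:

```d
import core.thread : Thread, thread_attachThis, thread_detachThis;

// Hypothetical C-ABI entry point that a JNI bridge (or any foreign thread) might call.
extern (C) int mylib_sum(const(int)* values, size_t count)
{
    // Thread.getThis() is null when druntime has never seen this thread.
    const attachedHere = Thread.getThis() is null;
    if (attachedHere)
        thread_attachThis();   // make the thread visible to the GC
    scope (exit)
        if (attachedHere)
            thread_detachThis(); // only undo what this call did

    auto copy = values[0 .. count].dup; // GC allocation, safe now that we are registered
    int sum;
    foreach (v; copy)
        sum += v;
    return sum;
}
```

Attaching and detaching on every call is not free, and it only helps for threads that actually enter D code, which is why something the GC exposes itself might still be worth considering.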
December 04, 2022
On Sunday, 4 December 2022 at 17:53:00 UTC, Adam D Ruppe wrote:
> Interesting... you know, maybe D's GC should formally expose a mutex that you can synchronize on for when it is running.

.......... or compile in write barriers. then it doesn't matter if the thread is unregistered, the write barrier will protect it as-needed!
December 04, 2022
On Sunday, 4 December 2022 at 12:37:08 UTC, Adam D Ruppe wrote:
> All of the top 5 most popular libraries on code.dlang.org embrace the GC.

Do you mean the top of the https://code.dlang.org/?sort=score&category=library list?

How do you know that they embrace GC? Is it possible to filter packages in this list by @nogc or @safe compatibility?
December 04, 2022
On Sunday, 4 December 2022 at 21:55:52 UTC, Siarhei Siamashka wrote:
> Do you mean the top of the https://code.dlang.org/?sort=score&category=library list?

Well, I was referring to the five that appear on the homepage, which shows silly instead of emsi containers.

> How do you know that they embrace GC?

I looked at the projects. Except for that arsd-official thing, that's a big mystery to me, the code is completely unreadable.

But vibe and dub use it pretty broadly. Unit-threaded and silly are test runners, which aren't even really libraries (I find it weird that they are consistently at the top of the list), so much of their code doesn't need the GC anyway, but you can still see that they use it without worry when they do want it, like when building the test list with ~=.

emsi-containers is built on the allocators thing so it works with or without gc (it works better without though as you learn if you try to use them.)

> Is it possible to filter packages in this list by @nogc or @safe compatibility?

No. I do have an idea for it, searching for @nogc attributes or attached @nogc unittests, but I haven't gotten around to trying it.
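
To illustrate what such a search would be looking for, here is a small sketch with made-up function names: an `@nogc` attribute on a function, an attributed `@nogc` unittest, and for contrast the kind of `~=` append that relies on the GC.

```d
// Hypothetical library function: it never allocates, so it can promise @nogc.
size_t countSpaces(scope const(char)[] text) @nogc nothrow pure @safe
{
    size_t n;
    foreach (c; text)
        if (c == ' ')
            ++n;
    return n;
}

// An attributed unittest: it fails to compile if anything it calls
// could allocate with the GC.
@nogc nothrow pure @safe unittest
{
    assert(countSpaces("a b c") == 2);
}

// For contrast: building a list with ~= uses the GC,
// so this function cannot be marked @nogc.
string[] collectNames(string[] names)
{
    string[] result;
    foreach (name; names)
        result ~= name; // GC append
    return result;
}

unittest
{
    assert(collectNames(["x", "y"]) == ["x", "y"]);
}
```

A text search for these markers would only be a heuristic, since templates get `@nogc` inferred without the attribute ever being written.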
December 04, 2022
On 12/4/22 12:17, Adam D Ruppe wrote:
> On Sunday, 4 December 2022 at 17:53:00 UTC, Adam D Ruppe wrote:
>> Interesting... you know, maybe D's GC should formally expose a mutex that you can synchronize on for when it is running.
> 
> .......... or compile in write barriers. then it doesn't matter if the thread is unregistered, the write barrier will protect it as-needed!

That's way beyond my pay grade. Explain please. :)

Ali

December 05, 2022
All it means is that certain memory operations (such as pointer writes) will tell the GC that they happened.

It's required for pretty much all advanced GC designs. Without it, we are pretty much maxing out what we can do.

Worth reading: https://www.amazon.com/Garbage-Collection-Handbook-Management-Algorithms/dp/1420082795
December 04, 2022
On Sunday, 4 December 2022 at 22:46:52 UTC, Ali Çehreli wrote:
> That's way beyond my pay grade. Explain please. :)

The reason that the GC stops threads right now is to ensure that something doesn't change in the middle of its analysis.

Consider for example, the GC scans address 0 - 1000 and finds nothing. Then a running thread moves a reference from memory address 2200 down to address 800 while the GC is scanning 1000-2000.

Then the GC scans 2000-3000, where the object used to be, but it isn't there anymore... and the GC has no clue it needs to scan address 800 again. It, never having seen the object, thinks the object is just dead and frees it.

Then the thread tries to use the object, leading to a crash.

The current implementation prevents this by stopping all threads. If nothing is running, nothing can move objects around while the GC is trying to find them.

But actually stopping everything 1) requires that the GC knows which threads exist and has a way to stop them, and 2) is overkill! All it really needs to do is prevent the operations that might change the GC's analysis while it is running, like the one in the example. It isn't important to stop numeric work, since that won't change what the GC sees. It isn't important to stop pointer reads either (well, not in D's GC anyway; there are some designs that do need to stop them).

Since what the GC cares about are pointer locations, it is possible to hook that specifically, which we call write barriers; they either block pointer writes or at least notify the GC about them. (And btw not all pointer writes need to be blocked either, just ones that would point to a different memory block. So things like slice iterations can also be allowed to continue. More on my blog http://dpldocs.info/this-week-in-d/Blog.Posted_2022_10_31.html#thoughts-on-pointer-barriers )

So what happens then:


GC scans address 0 - 1000 and finds nothing.

Then a running thread moves a reference from memory address 2200 down to address 800... which would trigger the write barrier. The thread isn't allowed to complete this operation until the GC is done. Notice that the GC didn't have to know about this thread ahead of time, since the running thread is responsible for communicating its intentions to the GC as it happens. (Essentially, the GC holds a mutex and all pointer writes in generated D code are synchronized on it, but there's various implementations.)

Then the GC scans 2000-3000, and the object is still there since the write is paused! It doesn't free it.

The GC finishes its work and releases the barriers. The thread now resumes and finishes the move, with the object still alive and well. No crash.

This would be a concurrent GC, not stopping threads that are doing self-contained work, but it would also be more compatible with external threads, since no matter what the thread, it'd use that gc mutex barrier.
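
A toy model of that mutex-style barrier, purely to show the ordering. In a real design the compiler would emit the barrier on pointer stores; here `storePointer`, `collect` and `gcBarrier` are invented names and the "scan" is just a delegate:

```d
import core.sync.mutex : Mutex;
import core.thread : Thread;
import core.time : msecs;

__gshared Mutex gcBarrier; // stand-in for the mutex the GC would hold

shared static this()
{
    gcBarrier = new Mutex;
}

// Hypothetical write barrier: every pointer store waits while a scan is running.
void storePointer(T)(ref T* slot, T* value)
{
    gcBarrier.lock();
    scope (exit) gcBarrier.unlock();
    slot = value;
}

// Stand-in for a collection cycle: pointer stores are blocked until scan() returns.
void collect(scope void delegate() scan)
{
    gcBarrier.lock();
    scope (exit) gcBarrier.unlock();
    scan();
}

unittest
{
    auto obj = new int(42);
    int* low;        // plays the role of "address 800"
    int* high = obj; // plays the role of "address 2200"
    bool stillThere;

    Thread mutator;
    collect({
        // The GC never had to know about this thread ahead of time.
        mutator = new Thread({
            storePointer(low, high);             // blocks until the scan is done
            storePointer(high, cast(int*) null);
        });
        mutator.start();
        Thread.sleep(10.msecs);   // give the mutator a chance to run into the barrier
        stillThere = high is obj; // the scan still finds the object in its old place
    });
    mutator.join();

    assert(stillThere);
    assert(low is obj && high is null); // the move completed after the scan
}
```

Only the thread doing pointer stores waits; anything doing self-contained work would keep running through the whole scan.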
December 04, 2022
On 12/4/22 15:25, Adam D Ruppe wrote:

> which would trigger the write barrier. The thread isn't
> allowed to complete this operation until the GC is done.

According to my limited understanding of write barriers, the thread moving to 800 could continue because order of memory operations may have been satisfied. What I don't see is, what would the GC thread be waiting for about the write to 800?

Would the GC be leaving behind writes to every page it scans, which have barriers around so that the other thread can't continue? But then the GC's write would finish and the other thread's write would finish.

Ok, here is the question: Is there a very long standing partial write that the GC can perform like: "I write to 0x42, but I will finish it 2 seconds later. So, all other writes should wait?"

> The GC finishes its work and releases the barriers.

So, it really is explicit acquisition and releasing of these barriers... I think this is provided by the CPU, not the OS. How many explicit write barriers are there?

Ali

December 05, 2022
On Sunday, 4 December 2022 at 23:37:39 UTC, Ali Çehreli wrote:
> On 12/4/22 15:25, Adam D Ruppe wrote:
>
> > which would trigger the write barrier. The thread isn't
> > allowed to complete this operation until the GC is done.
>
> According to my limited understanding of write barriers, the thread moving to 800 could continue because order of memory operations may have been satisfied. What I don't see is, what would the GC thread be waiting for about the write to 800?

I'm not a specialist but I have the impression that GC write barriers and CPU memory ordering write barriers are 2 different things that confusingly use the same term for 2 completely different concepts.
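
A small sketch of how I understand the difference, with invented names (`publish`, `gcStore`, `rememberedSlots`); the second half is a toy and not how any real GC is implemented:

```d
import core.atomic : atomicFence, atomicLoad, atomicStore, MemoryOrder;

// 1) CPU memory barrier: only constrains the ORDER in which memory
//    operations become visible to other cores. Nobody waits for a GC.
shared int payload;
shared bool ready;

void publish(int value)
{
    atomicStore!(MemoryOrder.raw)(payload, value);
    atomicFence(); // the payload store may not be reordered after the flag store
    atomicStore!(MemoryOrder.raw)(ready, true);
}

// 2) GC write barrier: extra CODE that runs on every pointer store,
//    here just logging which slot changed so a collector could rescan it.
//    (Toy version: fixed-size log, no thread safety.)
__gshared void*[16] rememberedSlots;
__gshared size_t rememberedCount;

void gcStore(T)(ref T* slot, T* value)
{
    assert(rememberedCount < rememberedSlots.length, "toy log is full");
    rememberedSlots[rememberedCount++] = cast(void*) &slot; // notify the "GC"
    slot = value;
}

unittest
{
    publish(7);
    assert(atomicLoad(ready) && atomicLoad(payload) == 7);

    int x = 1;
    int* slot;
    gcStore(slot, &x);
    assert(slot is &x && rememberedCount == 1);
}
```

So the first is a hardware ordering guarantee, while the second is bookkeeping inserted by the compiler, which may notify the GC as here or block as in the mutex description quoted above.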

>
> Would the GC be leaving behind writes to every page it scans, which have barriers around so that the other thread can't continue? But then the GC's write would finish and the other thread's write would finish.
>
> Ok, here is the question: Is there a very long standing partial write that the GC can perform like: "I write to 0x42, but I will finish it 2 seconds later. So, all other writes should wait?"
>
> > The GC finishes its work and releases the barriers.
>
> So, it really is explicit acquisition and releasing of these barriers... I think this is provided by the CPU, not the OS. How many explicit write barriers are there?
>
> Ali