August 03, 2021

On 8/3/21 5:15 AM, Gregor Mückl wrote:

> On Sunday, 1 August 2021 at 08:54:05 UTC, Kirill wrote:
>
>> It's interesting to hear: do you use D's GC, or do you use your own custom memory management scheme?
>>
>> How performant is the GC?
>>
>> The reason I'm asking is that I'm planning to dive into 3D game dev with D as a hobby, in an attempt to create a game I've dreamed of since I was a kid. I'd like to know if the GC is worth using at all, or whether I should go with 100% manual memory management.
>>
>> Any opinion is appreciated. Thanks in advance.
>
> The D garbage collector seems reasonable for applications with small (object-sized) allocations. But that changes when you allocate large blocks of memory: due to a bug internal to the tracking of allocated pages, performance gradually degrades over time. So if you have to allocate large(ish) buffers regularly, it'll show over time. I reported that bug here with a repro case, but it hasn't gotten any attention yet:
>
> https://issues.dlang.org/show_bug.cgi?id=20434
>
> I haven't managed to understand that part of the GC well enough to submit a patch myself :(.

Is that repro case doing what you think it is doing? It appears to keep adding larger and larger allocations to the mix.

It seems you are calculating a size variable and never using it.

I think possibly you meant to use size * 1024 * 1024 instead of i * 1024 * 1024.

Regarding the conservative nature of the GC: the larger the blocks get, the greater the chance they get "accidentally" pointed at by the stack (or some other culprit). A 64-bit address space should alleviate a lot of this, but I think there are still cases where it can pin data unintentionally.

-Steve

August 03, 2021

On Tuesday, 3 August 2021 at 12:26:14 UTC, russhy wrote:

> WHY DO YOU LIE?!
>
> there is a strong ecosystem of system libraries and nogc libraries in D
>
> and you can plug in whatever C libraries you want, so you can already consume the WHOLE C ecosystem, out of the box
>
> saying there is no ecosystem is plain and simple dishonest and a PURE LIE

I am sorry if I used the wrong words that hurt your feelings, but that is just my opinion.

August 03, 2021
On Sun, Aug 01, 2021 at 08:54:05AM +0000, Kirill via Digitalmars-d wrote:
> It's interesting to hear: do you use D's GC, or do you use your own custom memory management scheme?
> 
> How performant is the GC?
> 
> The reason I'm asking is that I'm planning to dive into 3D game dev with D as a hobby, in an attempt to create a game I've dreamed of since I was a kid. I'd like to know if the GC is worth using at all, or whether I should go with 100% manual memory management.
[...]

My approach to D's GC is:

(1) Use it by default until it starts showing up as a bottleneck in your
profiler.

(2) When it does show up as a bottleneck, the fix is often simple and yields good benefits. E.g., in one of my projects, after the GC showed up as a bottleneck, I quickly pinpointed the problem to a function in an inner loop that was allocating a new array every iteration.  Fixing that to reuse a previously-allocated array (maybe about a 5-line change) immediately gave me a 20-30% performance boost.  There were a couple of other similar small changes to reduce GC pressure and to replace small bits of GC code in the inner loops where performance matters most.
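The kind of fix described in (2) can be sketched roughly like this (hypothetical code, not the actual project; the function names and the halving operation are made up for illustration):

```d
import std.stdio;

// Before: allocates a fresh array on every call, creating GC pressure
// when called from an inner loop.
double[] halvedAllocating(const double[] input)
{
    auto result = new double[](input.length); // GC allocation each call
    result[] = input[] / 2.0;
    return result;
}

// After: the caller owns one buffer and reuses it on every iteration.
void halveInto(const double[] input, double[] result)
{
    assert(result.length == input.length);
    result[] = input[] / 2.0; // element-wise array op, no allocation
}

void main()
{
    auto data = [2.0, 4.0, 6.0];
    auto buf = new double[](data.length); // allocated once, outside the loop
    foreach (i; 0 .. 1_000)
        halveInto(data, buf); // hot loop now allocates nothing
    writeln(buf);
}
```

The change is mechanical: hoist the allocation out of the loop and pass the buffer in, which is typically only a handful of lines, as described above.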

(3) If performance is still not good enough, there are other ways of controlling the GC. In the aforementioned project, for example, after the array optimization I found that GC collections were taking place too often. So I added a call to GC.disable at the start of my program and scheduled my own calls to GC.collect at a lower frequency, which got me about another 20-30% performance improvement.
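A minimal sketch of the scheme in (3), using core.memory's GC.disable/GC.collect (the frame count and collection frequency here are arbitrary placeholders, not values from the project above):

```d
import core.memory : GC;

void main()
{
    GC.disable(); // suppress automatic collections from here on

    foreach (frame; 0 .. 10_000)
    {
        // ... per-frame work that may still allocate from the GC heap ...

        // Collect on our own schedule, e.g. once every 1000 frames,
        // instead of whenever an allocation happens to trigger it.
        if (frame % 1_000 == 999)
            GC.collect();
    }

    GC.enable(); // restore normal collection behavior
}
```

Note that GC.disable does not forbid allocation; it only stops the runtime from starting collections on its own, which is what makes the pause timing controllable.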

Overall, I think I got about 50-60% performance improvement just from several small code changes to an essentially GC-centric codebase.  By writing GC code I saved countless days of writing code to manually manage memory (and weeks of pulling out my hair debugging said code). Only in the actual hotspots where the GC became a hindrance did I spend focused effort to either optimize GC usage or replace small parts of the code (in inner loops and other bottlenecks) with manually-managed memory.  Much faster development than if I had written *everything* to be @nogc.  Most of that effort would have been wasted on code that doesn't even lie in a bottleneck and therefore doesn't actually matter to performance.

tl;dr: don't fear the GC, just use it freely until your profiler has actually identified the GC as the bottleneck. Then strategically optimize those hotspots, optionally replace them with @nogc code, etc., with much less effort than writing your entire application with @nogc.


T

-- 
"Hi." "'Lo."
August 03, 2021

On Tuesday, 3 August 2021 at 15:28:57 UTC, workman wrote:

> On Tuesday, 3 August 2021 at 12:26:14 UTC, russhy wrote:
>
>> WHY DO YOU LIE?!
>>
>> there is a strong ecosystem of system libraries and nogc libraries in D
>>
>> and you can plug in whatever C libraries you want, so you can already consume the WHOLE C ecosystem, out of the box
>>
>> saying there is no ecosystem is plain and simple dishonest and a PURE LIE
>
> I am sorry if I used the wrong words that hurt your feelings, but that is just my opinion.

Don't make it personal; I'm talking about the lies you spread. It has nothing to do with feelings.

August 03, 2021
On 8/3/21 7:34 AM, Steven Schveighoffer wrote:

> Is that repro case doing what you think it is doing? It appears to keep adding larger and larger allocations to the mix.
You didn't reply in the bug thread. ;)

Ali
August 03, 2021
On 8/3/21 9:31 AM, H. S. Teoh wrote:

> (1) Use it by default until it starts showing up as a bottleneck in your
> profiler.
>
> (2) When it does show up as a bottleneck, [...]

That's engineering! :)

However, dmd's -profile=gc switch (or was it -profile?) seems to have a bug, reported by multiple people, where one may get a segmentation fault. Even though I've seen it as well, it is not clear whether it's my own code causing errors in a destructor.

> Overall, I think I got about 50-60% performance improvement just by
> several small code changes to an essentially GC-centric codebase.

Same here.

One surprising pessimization which I have not mentioned publicly before was with assigning to the .length property of a buffer. Was it supposed to allocate? I had the following function:

void ensureHasRoom(ref ubyte[] buffer, size_t length) {
  buffer.length = length;
}

My idea was I would blindly assign and even if the length was reduced, the *capacity* would not change and memory would *not* be allocated by a later assignment.

Unfortunately, that function was causing GC allocations, at least with 2.084.1, perhaps in the presence of other slices to the same elements in my program. Changing it to the following more sensible approach reduced the allocations a lot:

void ensureHasRoom(ref ubyte[] buffer, size_t length) {
  if (buffer.length < length) {
    buffer.length = length;
  }
}

I can't be sure now whether it was related to the presence of other slices. Possible... Anyway, that quick fix was a huge improvement.

Ali

August 03, 2021

On Sunday, 1 August 2021 at 08:54:05 UTC, Kirill wrote:

> It's interesting to hear: do you use D's GC, or do you use your own custom memory management scheme?
>
> How performant is the GC?
>
> The reason I'm asking is that I'm planning to dive into 3D game dev with D as a hobby, in an attempt to create a game I've dreamed of since I was a kid. I'd like to know if the GC is worth using at all, or whether I should go with 100% manual memory management.
>
> Any opinion is appreciated. Thanks in advance.

I think GC fear is massively overblown, but I would be quite careful about using it heavily in a 3D game. Unpredictable pauses are your enemy. Using it to load or build level assets, for example, would be fine, but allocating anything short-lived? No way. I would likely make the core game loop @nogc, or be very strategic about when the GC is allowed to run.
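One hypothetical shape for that split: let loading allocate freely, and mark the per-frame path @nogc so the compiler rejects any hidden GC allocation (all names below are invented for the sketch):

```d
struct Level
{
    float[] heights;
}

// Loading/building assets: GC allocation is fine here, it runs rarely.
Level loadLevel(size_t size)
{
    return Level(new float[](size));
}

// The hot per-frame path: @nogc turns any GC allocation into a
// compile-time error, so pauses can't sneak in here.
void updateFrame(ref Level level, float dt) @nogc nothrow
{
    foreach (ref h; level.heights)
        h += dt; // in-place work only, no allocation possible
}

void main()
{
    auto level = loadLevel(1024); // allowed to use the GC
    foreach (i; 0 .. 60)
        updateFrame(level, 0.016f); // statically guaranteed GC-free
}
```

The nice property is that the guarantee is checked by the compiler rather than by discipline: if someone later adds an accidental allocation to updateFrame, the build fails instead of the frame rate.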

Then again, I have never built a 3D game before, only 3D scientific animations, so my advice comes from a relatively weak position: I know D very well and have only heard how games work.

August 03, 2021

On 8/3/21 1:53 PM, Ali Çehreli wrote:

> On 8/3/21 9:31 AM, H. S. Teoh wrote:
>
>> (1) Use it by default until it starts showing up as a bottleneck in your
>> profiler.
>>
>> (2) When it does show up as a bottleneck, [...]
>
> That's engineering! :)
>
> However, dmd's -profile=gc switch (or was it -profile?) seems to have a bug, reported by multiple people, where one may get a segmentation fault. Even though I've seen it as well, it is not clear whether it's my own code causing errors in a destructor.
>
>> Overall, I think I got about 50-60% performance improvement just by
>> several small code changes to an essentially GC-centric codebase.
>
> Same here.
>
> One surprising pessimization which I have not mentioned publicly before was with assigning to the .length property of a buffer. Was it supposed to allocate? I had the following function:
>
> void ensureHasRoom(ref ubyte[] buffer, size_t length) {
>   buffer.length = length;
> }
>
> My idea was I would blindly assign and even if the length was reduced, the capacity would not change and memory would not be allocated by a later assignment.

Let's rewrite this into appending to see why this doesn't work:

buffer = new int[100]; // start with some data;
auto buffer2 = buffer; // keep a reference to it (to give us a reason to keep the data)
buffer = buffer[0 .. 50]; // "shrink" the length
buffer ~= 10; // appending, will reallocate because otherwise we stomp on buffer2

It's no different for length setting:

buffer = new int[100];
auto buffer2 = buffer;
buffer.length = 50;
buffer.length = 51; // must reallocate
buffer[$-1] = 10;

In order for a length change to not reallocate, you have to call assumeSafeAppend on that adjusted buffer, to let the runtime know that we don't care about any existing data that might be referenced by others.
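A small sketch of that, comparing pointers to show where the reallocation is avoided (assumeSafeAppend lives in the implicitly imported object module, so no import is needed):

```d
void main()
{
    auto buffer = new int[](100);
    auto p = buffer.ptr;

    buffer.length = 50;        // shrink: same memory, but the runtime can
                               // no longer assume the old tail is unclaimed

    buffer.assumeSafeAppend(); // promise: no other slice needs the old tail
    buffer.length = 51;        // now grows in place instead of reallocating
    assert(buffer.ptr == p);   // still the original block
}
```

Without the assumeSafeAppend call, the second length assignment would have had to reallocate, exactly as in the example above. The call is only safe if the promise is true; if another slice still referenced elements 50..100, extending in place would stomp on its data.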

-Steve

August 03, 2021

On Tuesday, 3 August 2021 at 18:18:48 UTC, Steven Schveighoffer wrote:

> It's no different for length setting:
>
> buffer = new int[100];
> auto buffer2 = buffer;
> buffer.length = 50;
> buffer.length = 51; // must reallocate
> buffer[$-1] = 10;

All of the buffer-stomping "must reallocate" cases are preserved in the rewrite that only assigns buffer.length when the new value is larger. The cases that were omitted, the ones that were the performance drain, were the same-length and smaller-length assignments.

August 03, 2021
On 8/3/21 11:18 AM, Steven Schveighoffer wrote:

> In order for a length change to *not* reallocate, you have to call
> `assumeSafeAppend` on that adjusted buffer, to let the runtime know that
> we don't care about any existing data that might be referenced by others.

Makes sense. Yes, I use assumeSafeAppend mostly on function-local-static buffers. (Similarly, Appender.clear() for appenders.)

Ali