September 13, 2015
On Sunday, 13 September 2015 at 17:00:30 UTC, Ola Fosheim Grøstad wrote:
> On Sunday, 13 September 2015 at 16:53:20 UTC, ponce wrote:
>> GC is basically ok for anything soft-realtime, where you already spend a lot of time to go fast enough. And if you want hard-realtime, well you wouldn't want malloc either.
>>
>> It's a non-problem.
>
> If this was true then Go would not have a concurrent collector.

I was speaking of the D language.
September 13, 2015
On Sunday, 13 September 2015 at 17:16:02 UTC, ponce wrote:
> On Sunday, 13 September 2015 at 17:00:30 UTC, Ola Fosheim Grøstad wrote:
>> On Sunday, 13 September 2015 at 16:53:20 UTC, ponce wrote:
>>> GC is basically ok for anything soft-realtime, where you already spend a lot of time to go fast enough. And if you want hard-realtime, well you wouldn't want malloc either.
>>>
>>> It's a non-problem.
>>
>> If this was true then Go would not have a concurrent collector.
>
> I was speaking of the D language.

Of course, that makes it make sense!
September 13, 2015
On Sunday, 13 September 2015 at 17:16:02 UTC, ponce wrote:
> On Sunday, 13 September 2015 at 17:00:30 UTC, Ola Fosheim
>> If this was true then Go would not have a concurrent collector.
>
> I was speaking of the D language.

Go only added a concurrent GC now, in version 1.5, and keeps improving it with the goal of never blocking for more than 10 ms.

Even an efficient mark-sweep GC may also affect real-time threads by polluting the caches and reducing the available memory bandwidth.

The theoretical limit for a 10 ms mark-sweep collection on current desktop CPUs is about 60 megabytes scanned at peak performance. That means you'd have to stay below roughly 30 MiB of total pointer-containing memory.
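Back-of-envelope, under assumed numbers: if the marker can traverse pointer-containing memory at roughly 6 GB/s, then 0.010 s * 6 GB/s = 60 MB per collection; leaving half of that budget as headroom for sweep work and cache pollution gives the ~30 MiB figure above.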

Not a non-issue.

September 13, 2015
On Sunday, 13 September 2015 at 19:39:20 UTC, Ola Fosheim Grostad wrote:
>
> The theoretical limit for a 10 ms mark-sweep collection on current desktop CPUs is about 60 megabytes scanned at peak performance. That means you'd have to stay below roughly 30 MiB of total pointer-containing memory.
>

30 MiB of scannable heap.
My point is that we now have the tools to reduce that amount of memory, e.g. with -profile=gc.
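For instance (a minimal sketch; the exact log layout may differ between compiler versions):

    dmd -vgc app.d          # compile-time listing of every GC allocation site
    dmd -profile=gc app.d   # instrumented build
    ./app                   # running it writes profilegc.log with per-site byte counts

-vgc shows where allocations *can* happen; -profile=gc shows how many bytes each site actually allocated at runtime, so you know which ones are worth eliminating.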




September 14, 2015
On Sunday, September 13, 2015 16:53:18 ponce via Digitalmars-d-learn wrote:
> On Sunday, 13 September 2015 at 15:35:07 UTC, Jonathan M Davis wrote:
> > But the idea that your average D program is going to run into problems with the GC while using Phobos is just plain wrong. The folks who need to care are the rare folks who need extreme enough performance that they can't afford for the GC to _ever_ stop the world.
>
> Even in that case, not all threads need to be real-time, and you can have threads without GC.
>
> Honestly, I think only people using microcontrollers or really constrained environments that don't have the memory have that problem.
>
> I suspect precious few of the GC haters actually have those requirements, or they misrepresent the ways to avoid GC-related problems.
>
> Same arguments, but there is a solution for everything:
>
> "Don't want memory overhead" => minimize heap usage, use -vgc / -profile=gc
> "Don't want pauses" => unregister thread + @nogc
> "Want shorter pauses" => minimize heap usage, use -vgc / -profile=gc
> "Want determinism" => ways to do that
>
> GC is basically ok for anything soft-realtime, where you already spend a lot of time to go fast enough. And if you want hard-realtime, well you wouldn't want malloc either.
>
> It's a non-problem.

There _are_ some programs that simply cannot afford a stop-the-world GC. For instance, this has come up in discussions on games where a certain framerate needs to be maintained. Even a 100 ms stop would be way too much for them. In fact, when the concurrent GC was presented at dconf 2013, it came up that it would likely have to be guaranteed to stop the world for less than 10 ms (or something in that range anyway) to be acceptable for such environments. So, it _is_ a problem for some folks.

That being said, it is _not_ a problem for most folks, and the folks who have those sorts of performance requirements frequently can't even use malloc after the program has gotten past its startup phase. So, many of them would simply be allocating up front and then only reusing existing memory for the rest of the program's run, whether that memory was GC-allocated or malloced. For instance, as I understand it, Warp used the GC, but it allocated everything up front and didn't allocate once it got going, so the GC wasn't a performance problem for it at all, and it's _very_ fast.
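In D, that pattern might look something like this (a sketch, not Warp's actual code; GC.disable is optional extra insurance):

    import core.memory : GC;

    ubyte[][] buffers;            // preallocated up front

    void main()
    {
        foreach (i; 0 .. 64)
            buffers ~= new ubyte[](64 * 1024);   // all GC allocation happens here
        GC.disable();             // no automatic collections during the hot phase
        // ... main loop: only reuse the preallocated buffers, never allocate ...
        GC.enable();
    }

Since nothing allocates after startup, no collection is ever triggered even without GC.disable; disabling it just guarantees that.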

But there are other solutions such as having the critical threads not use the GC (as you mentioned) which make it so that you can use the GC in parts of your program while still avoiding its performance costs in the critical portions.
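A rough sketch of that approach (the detach call has caveats - a detached thread must never own or allocate GC memory, since the collector will neither scan nor pause it):

    import core.thread : Thread, thread_detachThis;

    void realtimeWork() @nogc nothrow
    {
        // ... hard loop over preallocated, non-GC buffers only ...
    }

    void main()
    {
        auto t = new Thread({
            thread_detachThis();  // the GC will no longer stop this thread on collect
            realtimeWork();
        });
        t.start();
        // the rest of the program can use the GC normally
        t.join();
    }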

Regardless, idiomatic D involves a lot more stack allocations than you often get even in C++, so GC usage tends to be low in programs that use idiomatic D, and there are ways to work around the cases where the GC actually turns out to be a bottleneck. And for the most part, the folks who are freaking out about the GC and insisting that it's simply evil and shouldn't exist are losing out on some great stuff. And even with more lazy ranges in Phobos and more consistent @nogc usage, I suspect that many of them will continue to complain about Phobos using the GC even though it uses it pretty minimally.

- Jonathan M Davis

September 14, 2015
On Sunday, September 13, 2015 16:58:21 Ola Fosheim Grøstad via Digitalmars-d-learn wrote:
> On Sunday, 13 September 2015 at 15:35:07 UTC, Jonathan M Davis wrote:
> > the GC heavily. And the reality of the matter is that the vast majority of programs will have _no_ problems with using the GC so long as they don't use it heavily. Programming like you're in Java and allocating everything on the heap will kill performance, but idiomatic D code doesn't do that, and Phobos doesn't do that. Far too many programmers freak out at the thought of D even having a GC and overreact, thinking that they have to root it out completely, when there really is no need to. Plenty of folks have written highly performant code in D using the GC. You just have to avoid doing a lot of allocating and make sure you track down unwanted allocations when you have a performance problem.
>
> I don't understand this argument. Even if the GC heap only contains a single live object, you still have to scan ALL memory that contains pointers.
>
> So how does programming like you do in Java affect anything related to the GC?
>
> Or are you saying that finalization is taking up most of the time?

Only the stack and the GC heap get scanned unless you tell the GC about memory that was allocated by malloc or some other mechanism. malloced memory won't be scanned by default. So, if you're using the GC minimally and coding in a way that doesn't involve needing to tell the GC about a bunch of malloced memory, then the GC won't have all that much to scan. And while the amount of memory that the GC has to scan does affect the speed of a collection, in general, the less memory that's been allocated by the GC, the faster a collection is.
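For completeness, telling the GC about such memory looks like this (GC.addRange and GC.removeRange are the real druntime API; the malloced block here is just an illustration):

    import core.memory : GC;
    import core.stdc.stdlib : free, malloc;

    void main()
    {
        enum n = 256;
        auto block = malloc(n * (void*).sizeof);
        // This block will store pointers to GC objects, so the GC has to
        // scan it, or those objects could be collected while still in use:
        GC.addRange(block, n * (void*).sizeof);
        // ... use block ...
        GC.removeRange(block);   // stop scanning it before freeing
        free(block);
    }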

Idiomatic D code uses the stack heavily and allocates very little on the GC heap. Classes are used only rarely, and ranges are generally preferred over arrays, so idiomatic D code ends up with structs on the stack rather than classes on the heap, and the number of allocations required for arrays goes down considerably. So, there simply isn't all that much garbage to collect, and memory isn't being constantly allocated, so it's a lot rarer that a collection needs to be run in order to get more memory.
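A small illustration of that style - the whole pipeline below is lazy structs on the stack, with no GC allocation at all:

    import std.algorithm : filter, map;
    import std.range : iota;

    void main()
    {
        long total;
        // Each adapter returned by iota/filter/map is a small struct on the
        // stack; no intermediate array is ever built on the GC heap.
        foreach (v; iota(0, 1_000).filter!(n => n % 2 == 0).map!(n => n * n))
            total += v;
    }

The Java-ish equivalent would materialize at least one intermediate collection per step.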

So, the big win is simply not allocating much on the GC heap, whether it's because it's allocated on the malloced heap or because it's on the stack. The result of that is that even if a collection isn't super fast, collections are actually relatively rare. So, for a _lot_ of idiomatic D code, collections simply won't be happening often, and as long as you don't have realtime constraints, then having an occasional collection occur that's a bit longer than desirable isn't necessarily a problem - though we definitely do want to improve the GC so that collections are faster and thus less likely to cause performance problems (and work has been done recently on that; Martin Nowak was supposed to give a talk on it at dconf this year, but he missed his flight).

So, while the fact that D's GC is less than stellar is certainly a problem, and we would definitely like to improve that, the idioms that D code typically uses seriously reduce the number of performance problems that we get.

- Jonathan M Davis


September 14, 2015
On Sunday, September 13, 2015 17:14:05 Prudence via Digitalmars-d-learn wrote:
> On Sunday, 13 September 2015 at 16:58:22 UTC, Ola Fosheim Grøstad wrote:
> > On Sunday, 13 September 2015 at 15:35:07 UTC, Jonathan M Davis wrote:
> >> the GC heavily. And the reality of the matter is that the vast majority of programs will have _no_ problems with using the GC so long as they don't use it heavily. Programming like you're in Java and allocating everything on the heap will kill performance, but idiomatic D code doesn't do that, and Phobos doesn't do that. Far too many programmers freak out at the thought of D even having a GC and overreact, thinking that they have to root it out completely, when there really is no need to. Plenty of folks have written highly performant code in D using the GC. You just have to avoid doing a lot of allocating and make sure you track down unwanted allocations when you have a performance problem.
> >
> > I don't understand this argument. Even if the GC heap only contains a single live object, you still have to scan ALL memory that contains pointers.
> >
> > So how does programming like you do in Java affect anything related to the GC?
> >
> > Or are you saying that finalization is taking up most of the time?
>
> What if I happen to write an RT app that happens to use a part of Phobos that happens to rely heavily on the GC? Am I supposed to use -vgc all the time to avoid that? Do I avoid Phobos because 3 functions in it use the GC? Am I supposed to memorize a table of all the places Phobos uses the GC and then roll my own to avoid them?

@nogc was added specifically to support folks who want to guarantee that they aren't using the GC. If you want to guarantee that your function isn't using the GC, then mark it with @nogc, and if you try to call a function from it that isn't @nogc (be it because it's not marked with @nogc, or because it's a templated function that wasn't inferred to be @nogc), then you'll get a compilation error, and you'll know that you need to do something different. Whether the function you're calling is in Phobos or a 3rd party library doesn't really matter. If you want to be sure that it's @nogc, you'll just have to mark your code with @nogc, and you'll catch any accidental or otherwise unknown allocations. You can then use -vgc to figure out exactly what is allocating in a function that you thought should be @nogc but can't be. But you don't have to use it simply to find out whether your code is using the GC or not. @nogc does that.
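Concretely (the error text is approximate, but the mechanism is exactly this):

    @nogc void frameUpdate()
    {
        auto tmp = new int[](64);  // Error: cannot use 'new' in @nogc function 'frameUpdate'
    }

    @nogc void frameUpdateFixed()
    {
        int[64] tmp;               // stack allocation is fine under @nogc
        tmp[0] = 42;
    }

The check is transitive: frameUpdate also couldn't call any function that isn't (or isn't inferred to be) @nogc.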

Yes, unlike C++, with D, if you don't want to use a GC at all, then you're going to have to be careful about how you write your code, because some features use the GC (albeit not many), and some 3rd party code that you might want to use (be it the standard library or someone else's code) is likely going to end up using the GC, whereas in C++, that's not a concern. But having the GC makes a lot of programs easier to write, and it does solve some safety concerns with regard to memory and allow us to have a few features that C++ doesn't. If you're using the GC, D can guarantee memory safety in a way that C++ can't. And that can be a big gain. And yes, that can be a bit of a pain for those folks who can't use the GC, but that's not the majority of programs, and the language and compiler do have tools for supporting folks who insist on minimizing GC usage or even avoiding it completely. So, it's not like the anti-GC folks are being left out in the cold here. And work is being done to improve the GC and to make sure that Phobos doesn't allocate using the GC except when it absolutely needs to. And ranges are really helping with that.

Regardless, it's still the case that Phobos has never used the GC heavily. It just hasn't always avoided it everywhere it could or should, and that's being fixed. But it's always going to use the GC in some places, because some things simply require it, and the language does have a GC. Those few pieces of functionality will simply have to be avoided by anyone who insists on never using the GC, and @nogc will help them ensure that they don't use that functionality.

- Jonathan M Davis


September 14, 2015
On Monday, 14 September 2015 at 00:41:28 UTC, Jonathan M Davis wrote:
> stop-the-world GC. For instance, this has come up in discussions on games where a certain framerate needs to be maintained. Even a 100 ms stop would be way too much for them. In fact, when the concurrent GC was presented at dconf 2013, it came up that it would likely have to be guaranteed to stop the world for less than 10 ms (or something in that range anyway) to be acceptable for such environments. So, it _is_ a problem for some folks.

In the render loop you only have about 15 ms per frame, so the acceptable pause would be more like 2 ms (not realistic). The 10 ms target is for high-performance servers and regular interactive applications.

> That being said, it is _not_ a problem for most folks, and the folks who have those sorts of performance requirements frequently can't even use malloc after the program has gotten past its startup phase.

I don't agree with this. You can use your own allocators that don't syscall in critical areas.
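For example, a bump allocator over a preallocated buffer - no syscalls, no GC, reset once per frame (a sketch, ignoring alignment):

    struct Arena
    {
        ubyte[] buf;     // backing memory, allocated once at startup
        size_t used;

        void[] alloc(size_t n) @nogc nothrow
        {
            if (used + n > buf.length) return null;   // arena exhausted
            auto p = buf[used .. used + n];
            used += n;
            return p;
        }

        void reset() @nogc nothrow { used = 0; }      // frees everything at once
    }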

> GC-allocated or malloced. For instance, as I understand it, Warp used the GC, but it allocated everything up front and didn't allocate once it got going, so the GC wasn't a performance problem for it at all, and it's _very_ fast.

A C preprocessor is a simple program that has to allocate for macro defs, but the rest can just use static buffers... so it should not be a problem.

> Regardless, idiomatic D involves a lot more stack allocations than you often get even in C++, so GC usage tends to be low in

Really? I use VLAs in my C++ (a C extension) and use very few mallocs after init. In C++ even exceptions can be put outside the heap. Just avoid STL after init and you're good.


September 14, 2015
On Monday, 14 September 2015 at 00:53:58 UTC, Jonathan M Davis wrote:
> So, while the fact that D's GC is less than stellar is certainly a problem, and we would definitely like to improve that, the idioms that D code typically uses seriously reduce the number of performance problems that we get.

What D needs is some way for a static analyzer to be certain that a pointer does not point to a specific GC heap. And that means language changes... one way or the other.

Without language changes it becomes very difficult to reduce the amount of memory scanned without sacrificing memory safety.

And I don't think a concurrent GC is realistic given the complexity and performance penalties. The same people who complain about GC would not accept performance hits on pointer-writes. That would essentially make D and Go too similar IMO.
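The pointer-write cost comes from the write barrier a concurrent collector needs. Illustrative sketch only, with invented names and stub bodies (this is not how any D runtime implements it) - something like this would run on every pointer store:

    __gshared bool collectionInProgress;

    bool isMarked(void* p) { return true; }   // stand-in for the mark bitmap
    void shade(void* p) {}                    // stand-in: grey the object

    void storePointer(void** slot, void* value)
    {
        if (collectionInProgress && !isMarked(value))
            shade(value);   // keep the concurrent marker sound
        *slot = value;
    }

Cheap individually, but it taxes every mutator in the program, GC-heavy or not.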

September 14, 2015
On Monday, 14 September 2015 at 08:57:07 UTC, Ola Fosheim Grøstad wrote:
> On Monday, 14 September 2015 at 00:53:58 UTC, Jonathan M Davis wrote:
>> So, while the fact that D's GC is less than stellar is certainly a problem, and we would definitely like to improve that, the idioms that D code typically uses seriously reduce the number of performance problems that we get.
>
> What D needs is some way for a static analyzer to be certain that a pointer does not point to a specific GC heap. And that means language changes... one way or the other.
>
> Without language changes it becomes very difficult to reduce the amount of memory scanned without sacrificing memory safety.

Personally, when I make a strong claim about something and find that I am wrong (the claim that D needs to scan every pointer), I take a step back and reconsider my view rather than pressing harder.  It's beautiful to be wrong, because through recognition of error comes growth - provided there is recognition.

> And I don't think a concurrent GC is realistic given the complexity and performance penalties. The same people who complain about GC would not accept performance hits on pointer-writes. That would essentially make D and Go too similar IMO.

Given that one was written by a (very smart) student for his PhD thesis, that (as I understand it) his work formed the basis of Sociomantic's concurrent garbage collector (correct me if I am wrong), that it is being ported to D2, and that, released or not, its success will spur others to follow, it strikes me as a problematic claim that developing one isn't realistic - unless one is deeply embedded in the nitty-gritty of the problem (because theory and practice are more different in practice than they are in theory!).  There is etcimon's work too (at the research stage).

Don't underestimate, either, how future corporate support combined with an organically growing community may change what's possible.  Andy Smith gave his talk based on his experience at one of the largest and best-run hedge funds.  An associate who sold a decent-sized marketing group got in contact to thank me for posting links on D, as it helped him tackle a machine-learning problem better.  And if I look at what's in front of me, I really am not aware of a better solution to the needs I have, which I am pretty sure are needs that are more generally shared - corporate inertia may be a nuisance, but it is also a source of opportunity for others.

As for your earlier message, where you suggested that Sociomantic was an edge case of little relevance for the rest of us: I made that point in response to the claim that D had no place for such purposes.  It's true that being able to do something doesn't mean it is a good idea, but having seen them speak and looked at the people they hire, I really would be surprised if they did not know what they are doing.  (I would say the same if they had never been bought.)  And they say that using D has significantly lowered their costs compared to their competitors.  It's what I have been finding, too, dealing with data sets that are for now by no means 'big' but will be soon enough.

It's also a human group phenomenon that it's very difficult to do something for the first time, and the more people that follow, the easier it is for others.  So the edge case of yesteryear shall be the best practice of the future.  One sees this also with allocators, where Andrei's library is already beginning to be integrated into different projects.  I had never even heard of D two years ago, and had taken close to a twenty-year break from doing much programming, but D wasn't difficult to pick up and use effectively.

Clearly, latency and performance hits are different things, and the category of people who care about performance only partially overlaps with those who care about latency.

Part of what I do involves applying the principle of contrarian thinking, and I can say that it is very useful, and not just in the investment world:
http://www.amazon.com/The-Contrary-Thinking-Humphrey-Neill/dp/087004110X

On the other hand, there is also the phenomenon of just being contrary.  One sometimes has the impression that some people like to argue for the sake of it.  Nothing wrong with that, provided one understands the situation.  Poking holes in things without taking any positive steps to fix them is understandable for people who haven't a choice about their situation, but in my experience it is rarely effective in making the world better.