July 25, 2006
Karen,

Your response seems to indicate a lack of knowledge about garbage collection, but perhaps I'm simply misreading what you said.

First of all, let's get this clear:

1. Not every call to malloc will cause a collect.
2. In fact, because of pooling, it is unlikely that two successive calls will ever trigger two successive collects.
3. Pooling does increase memory use, but it also means fewer collects.

Any program that triggers collections frequently is written badly.  If you must ACTIVELY and continuously allocate chunks of RAM larger than 64k, you either:

  - need to avoid using the GC for those allocations.
  - need to disable the GC while allocating that data (see the sketch after this list).
  - have a serious design flaw.
  - are not a reputable or skilled programmer.
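
For example, something along these lines might work - just an untested sketch against Phobos' std.gc module, so check that the disable()/enable() calls match your Phobos version:

import std.gc;

ubyte[][] loadLargeDataSet(int count)
{
    std.gc.disable();            // no collections while we bulk-allocate
    try
    {
        ubyte[][] buffers;
        buffers.length = count;
        for (int i = 0; i < count; i++)
            buffers[i] = new ubyte[256 * 1024];   // each chunk is well over 64k
        return buffers;          // caller fills and uses them
    }
    finally
    {
        std.gc.enable();         // collections may run again after this point
    }
}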

Assuming you won't agree with the above, though, clearly garbage collection simply does not work for the *uncommon* and *impractical* case of constant and large allocations.  If you do not agree that this is an uncommon thing in computer programming, please say so.  I will not bother responding to you any further.

Furthermore, it is entirely practical to write generational garbage collectors, or garbage collectors using other methods or strategies.  This is not done in the current implementation of D, but if it were, this problem could be avoided.

Regardless, I maintain that such a program would perform poorly.  I don't care if you have 20 gigs of main system memory.  Any program that is constantly allocating and filling large amounts of memory WILL BE SLOW, at least in my experience.

Please understand that the garbage collector, at least in D, works something like this (as far as I understand):

1. A batch of memory is allocated.  I believe this happens in fixed chunks of 64k, but it may scale the size of them.

2. From this memory, parts are doled out.

3. If a "large" allocation happens, there is special code to handle this.

For more information, please see the source code to Phobos' garbage collector, available in src/phobos/internal/gc/gcx.d.

You could, theoretically, tell your garbage collector not to scan the memory range you allocated, so it would never get swapped back in during a collect (unless this range also contains pointers).  In such a case, I again point to the programmer as the one at fault.

Thus you could avoid swapping in those special cases where you need large amounts of memory.  Again, I do not believe such things are common.  If you are unable to program efficiently with respect to memory, I suggest you find a new occupation.
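
As a rough illustration of that - again only an untested sketch, and I'm assuming std.c.stdlib and the std.gc.hasNoPointers() call are available in your Phobos version - a large pointer-free buffer can either be taken from the C heap, where the collector never touches it, or be flagged so a collect does not have to look inside it:

import std.gc;           // hasNoPointers()
import std.c.stdlib;     // malloc(), free()

void bigBufferExamples()
{
    // Option 1: keep the block outside the GC entirely; it is never
    // scanned during a collect and must be released with free().
    void* raw = std.c.stdlib.malloc(100 * 1024 * 1024);
    // ... use raw ...
    std.c.stdlib.free(raw);

    // Option 2: GC-allocated, but flagged as containing no pointers,
    // so a collect never needs to look inside it.
    ubyte[] big = new ubyte[100 * 1024 * 1024];
    std.gc.hasNoPointers(big.ptr);
    // ... use big; the GC still owns it and may free it later ...
}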

I hope you do not take offense to that, but I truly believe too many people these days try to force themselves into things they just aren't any good at.  Some people would make wonderful lawyers but they think being a doctor is cooler, so they make their lives horrible.

Honestly, I feel like I'm debating how dangerous it would be to be hit by a sedan or an SUV.  I really don't care, it's going to hurt either way.  A lot.  The answer is not to get hit, not to say that we should all break our bones with sedans because it's not as bad.

I mean, really.  It's one thing to argue about theoretical problems, but it's quite another to argue about impractical ones and accuse a methodology of being flawed because it could fail in these impractical cases.  That's just not the logic I was taught.  Doesn't jibe.

I really don't care to prove you wrong.  I've said what I'm going to say.  I may respond again if you seem reasonable and bring up something new; but if you bring nothing else new in (as with this post)... you've lost my interest.

Of course, this is only my opinion and understanding.

-[Unknown]


> Unknown W. Brackets wrote:
> 
> I disagree. Assume a non GC'ed program that allocates 1.5 GB to 1.7 GB of memory, of which 0.7 GB to 0.9 GB are vital data. If you run this program on a machine equipped with 1 GB, the OS will swap out the 0.8 GB of data that is accessed infrequently. Therefore this program causes swapping only if it accesses data from the swapped-out part, and the amount of swapped data will be approximately bounded by twice the size of the data that needs to be swapped back.
> 
> This changes dramatically if you GC it, because on every allocation the available main memory is exhausted and the GC requires the OS to swap all 0.8 GB back, doesn't it?
> 
> 
>> I'm afraid I'm not terribly familiar with the dining
>> philosopher's problem, but again I think this is a problem only
>> somewhat aggravated by garbage collection.
>>
>> Most of your post seems to be wholly concerned with applications
>> that use at least the exact figure of Too Much Memory (tm). 
> 
> It is not only somewhat aggravated. Assume the example given above is doubled: two instances of that program, with the main memory not merely doubled to 2GB but increased to 4GB or even more.
> 
> Again, both non-GC'ed versions of the program run without any performance problems, but the GC'ed versions do not---although the memory size is increased by a factor that allows the OS to avoid swapping out any allocated data in the non-GC'ed case.
> 
> This is because both programs at least slowly increase their allocations of main memory.
> 
> This goes without performance problems until the available main memory is exhausted. The first program that hits the limit starts GC'ing its allocated memory---and forces the OS to swap it all in. Hence this first program runs the danger that all memory freed by its GC is immediately eaten up by the other instance, which continues running unaffected because its thirst for main memory is satisfied by the GC of the other instance, if that GC frees memory as soon as it recognizes it.
> 
> At the time this GC run ends, at least two cases can be distinguished:
> a) the main memory at the end of the run is still insufficient, because the other application ate it all up. Then this instance stops with "out of memory".
> b) the main memory at the end of the run by chance is sufficient, because the other application was not that hungry. Then this instance will start being performant again. But only for the short time until the limit is reached again.
> 
> This is a simple example with only one processor and two competing applications---and I believe that case a) can happen.
> 
> So I feel unable to prove that on multi-core machines running several GC'ed applications the case a) will never happen.
> 
> And even if case a) never happens, there might always be at least one application that is running its GC. Hence swapping is always going on. 
> 
>  
>> A sweeping statement that garbage collection causes
>> a dining philosopher's problem just doesn't seem correct to me.
> 
> Then prove me wrong.
July 25, 2006
Unknown W. Brackets wrote:

> I've said what I'm going to say.

Yes. Next to nothing about the brainstormed problem, but plenty handed down from the throne you have built under yourself.
July 25, 2006
Dave wrote:
> You're making the original assertion that it's a problem - I believe the onus is on you to prove that it would apply to efficient design patterns using D <g>

If efficient design patterns using D forbid memory leaks, then no garbage will ever occur in any application.

This would raise the question of why the GC is enabled by default.

If, on the other hand, memory leaks do appear, I have a running example that shows a side effect of enabling the GC similar to what I have brainstormed.

As far as I can see, this side effect is neither documented here nor have I found any mention of it in other resources on the net.

But according to those Unknown guys here, it's of no interest anyway.
July 25, 2006
Karen Lanrap wrote:
> Dave wrote:
>> You're making the original assertion that it's a problem - I
>> believe the onus is on you to prove that it would apply to
>> efficient design patterns using D <g>
> 
> If efficient design patterns using D forbid memory leaks, then no garbage will ever occur in any application.
> 
> This would raise the question of why the GC is enabled by default.
> 
> If, on the other hand, memory leaks do appear, I have a running example that shows a side effect of enabling the GC similar to what I have brainstormed.
> 
> As far as I can see, this side effect is neither documented here nor have I found any mention of it in other resources on the net.
> 
> But according to those Unknown guys here, it's of no interest anyway.

Don't assume that - I don't think any of us are trying to shut the door on anything. If you have some code to post that'd be great. It's just that (as I read it) you made some strong general and sweeping assertions about GC in general that I don't think reflect general use of the GC.

Many of the long-term contributors to this group are aware of some issues with the "first generation" GC, and no one's ever claimed GC is a panacea, especially for a systems language like D. Only that using the GC shouldn't be ruled out for general programming chores unless proved otherwise. Yes, the primary mode of memory mgmt. for D is GC but of course it's not the only one precisely because it will never be perfect for every job.
July 25, 2006
I have no throne, Karen, but if you want to argue effectively with people, repeating yourself just won't work.

If you don't believe I've addressed your comments, I'm sorry.  I believe I have, in any practical and useful application.  I really don't have the spare time to theorize about completely impractical cases.

Obviously, I'm only one person.  I'm just letting you know that you've basically lost my interest.  You really shouldn't care about that, since I'm only one person.

That said, if you've lost my interest, or haven't gotten your point through to me in a way I could address, it may simply be that your argument is unconvincing.  I suggest you strengthen it, since you clearly don't agree with any of my assertions.  There are many people who prefer to deal in the practical and not the impractical.

Again, I'm sorry if you feel I've been condescending.  The only things I said that could make you feel that way are that I don't consider people with poor memory-management skills to be people who should be programmers (I meant that in general, and was not saying you did or did not have that skill), and that I tire of arguing about what I feel to be impractical issues.

If I thought I was better than you, I probably wouldn't have spent the time typing responses to you.  After all, my time is valuable (just as yours is) and I could be doing other productive things with it, just as you could.

In another comment in this thread you said, "according to those Unknown guys here, it's of no interest".  It is simply of no further interest to me.  I've heard your argument, I stand unconvinced, and you have not added anything new or addressed what I feel are flaws in your argument.

But, please understand, that is only me.  I really hope I haven't hurt your feelings, or anyone else's.  It sounds like I have.

-[Unknown]


> Unknown W. Brackets wrote:
> 
>> I've said what I'm going to say.
> 
> Yes. Next to nothing about the brainstormed problem, but plenty handed down from the throne you have built under yourself.
July 25, 2006
Dave wrote:
> you made some strong general and sweeping assertions about GC in general that I don't think reflect general use of the GC.

You may be right, because I introduced at least two faults:
1) I used the phrase "typical behaviour" where I would have been better
off with "behaviour in general" or "behaviour that cannot be excluded".
2) I still have not found an example where D's memory management
allows for steadily growing allocated memory, interrupted only by the
start of GC sweeps. I would be glad if someone could point to an
argument that such behaviour is impossible with this GC---then I could
stop that search.

The side effects I detected are rooted in the fact that the sweeps of the GC break the locality of data accesses.

1. Observation:
There are cases with only one application where a memory leak causes
severe performance degradation, even though the GC is enabled.

2. Observation:
If more than one application is poisoned by memory leaks, then even
with the GC enabled there are cases where, caused by those leaks,

2.a. all but one application become so slow that they seem to be dead.

2.b. the capability of the system to run a number of applications decreases approximately by the factor "online data" / "all data".

If these cases are of no interest, then it is useless to post any code.
July 26, 2006
Karen Lanrap wrote:
> I see three problems:
> 
> 1) The typical behaviour of a GC'ed application is to claim more and more main memory without actually needing it. Hence every GC'ed application forces the OS to diminish the size of the system cache held in main memory until the GC of the application kicks in.
> 
> 2) If the available main memory is insufficient for the true memory requirements of the application, and the OS provides virtual memory by swapping out to secondary storage, then every run of the GC forces the OS to slowly swap all of this application's data back in from secondary storage---and runs of the GC occur frequently, because main memory is tight.
> 
> 3) If there is more than one GC'ed application running, those applications compete for the available main memory.
> 

Just a philosophical thought:

Perhaps we should look at the GC as a RAD tool for initial development,
with the goal of replacing it with manual memory management (M3?).
Then you could do it piece by piece; this might be attractive to two types of coders.

 Corporate coders in need of quick deliveries, but without many
 performance concerns (because they can always tell the customer to
 buy new hardware...)

 Performance coders doing alpha blending a billion times per second.
 Their first priority would be to use the GC as little as possible.

To aid in M3, some way of tracking allocations would be needed. Maybe
a program like coverage or profiling, where you could see which allocations are (most often) freed by the GC and which are manually
deallocated (a crude sketch of the idea follows below). This would probably appeal to a third kind of people:

 Academic coders who want everything to be proven, but do not feel
 they have the time to make writing deallocation calls priority one. But
 they are driven by the urge, or feeling, that beautiful code is code
 that does not depend on the GC and does not leak memory.
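
A very crude, by-hand version of that tracking could look roughly like this (hypothetical helper names, untested, just to show the idea):

import std.stdio;    // writefln()

int allocated;       // buffers handed out
int freedManually;   // buffers released by hand with delete

ubyte[] trackedAlloc(size_t n)
{
    allocated++;
    return new ubyte[n];
}

void trackedFree(ubyte[] buf)
{
    freedManually++;
    delete buf;      // manual release; everything else is left to the GC
}

void main()
{
    ubyte[] a = trackedAlloc(1024);
    ubyte[] b = trackedAlloc(2048);
    trackedFree(a);
    // b is never freed manually; the GC will have to pick it up.

    writefln("allocated: %d  freed manually: %d  left to the GC: %d",
             allocated, freedManually, allocated - freedManually);
}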

</philosophy>
July 27, 2006
Tommie Gannert wrote:
> Perhaps we should look at the GC as a RAD tool for initial development

Yes, but then the GC has to be disabled by default for release versions---and for non-releases there has to be a runtime error message if a sweep starts but the GC has not been enabled explicitly.
July 27, 2006
Karen Lanrap wrote:
> Tommie Gannert wrote:
>> Perhaps we should look at the GC as a RAD tool for initial
>> development
> 
> Yes, but then the GC has to be disabled by default for release versions---and for non-releases there has to be a runtime error message if a sweep starts but the GC has not been enabled explicitly.

That wasn't my intention. More like "should work towards removing the GC'd objects", but not enforcing it. The corporate guys won't mind the GC. They'll mind the extra two days it takes to fix deletes everywhere.

The message should be available (after GC) including what objects were collected (the profiling tool mentioned later in my post).

/T
July 27, 2006
Karen Lanrap wrote:
> I disagree. Assume a non GC'ed program that allocates 1.5 GB to 1.7 GB of memory, of which 0.7 GB to 0.9 GB are vital data. If you run this program on a machine equipped with 1 GB, the OS will swap out the 0.8 GB of data that is accessed infrequently. Therefore this program causes swapping only if it accesses data from the swapped-out part, and the amount of swapped data will be approximately bounded by twice the size of the data that needs to be swapped back.
> 
> This changes dramatically if you GC it, because on every allocation the available main memory is exhausted and the GC requires the OS to swap all 0.8 GB back, doesn't it?

No, it doesn't require it all to be swapped in. In fact, it doesn't require any of it to be swapped in, unless a full collect is done. Full collects are not performed on every allocation - it would be a terrible design if they were.
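
To make that concrete - a hedged sketch, since exactly when a collection fires depends on the Phobos version and on pool pressure - in a loop like the one below nearly every new is satisfied from a pool the collector already owns, and a full sweep happens only when no pool has room or when one is requested explicitly:

import std.gc;      // fullCollect()
import std.stdio;   // writefln()

void main()
{
    for (int i = 0; i < 1_000_000; i++)
    {
        ubyte[] tmp = new ubyte[32];   // almost always served from an existing pool
        // tmp becomes garbage immediately; it waits for a later sweep
    }

    std.gc.fullCollect();   // here a full collect runs because we asked for one
    writefln("done");
}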