July 27, 2006
Walter Bright wrote:
> Karen Lanrap wrote:
>> I disagree. Assume a non-GC'ed program that allocates 1.5 GB to 1.7 GB of memory, of which 0.7 GB to 0.9 GB are vital data. If you run this program on a machine equipped with 1 GB, the OS will swap out the 0.8 GB of data that is accessed infrequently. Therefore this program causes swapping only if it accesses data from the swapped-out part, and the amount of swapping is roughly bounded by twice the size of the data that needs to be swapped back.
>>
>> This changes dramatically if you GC it, because on every allocation the available main memory is exhausted and the GC requires the OS to swap all 0.8 GB back, doesn't it?
> 
> No, it doesn't require it to all be swapped in. In fact, it doesn't require any of it to be swapped in, unless a full collect is done. Full collects are not performed on every allocation - that would be a terrible design if it did.

By the way, there's no reason that even a full collect must swap all 0.8 GB back in.  Some GCs use an alternate approach where pages are scanned and marked stale when the VMM swaps them to disk, so no page faults occur on collection runs.



Sean
July 27, 2006
Sean Kelly wrote:
> By the way, there's no reason that even a full collect must swap all 0.8 GB back in.  Some GCs use an alternate approach where pages are scanned and marked stale when the VMM swaps them to disk, so no page faults occur on collection runs.

I only know of one such collector, and it required a specially patched (Linux) kernel that notifies a process before it swaps its pages to disk and allows it to specify which pages to swap (since the collector touches the pages on receiving the swap warning, which under a normal LRU-like scheme would stop them from being swapped).

So the reason a full collect must swap all 0.8 GB back in might be the absence of such a patched kernel, for one :). (I wouldn't want to require a user to patch his OS just to run my GC'ed program.)

Also, that particular GC would sometimes do an actual full collect of the memory, since otherwise swapped-out garbage might never be collected.
July 27, 2006
Frits van Bommel wrote:
> Sean Kelly wrote:
>> By the way, there's no reason that even a full collect must swap all 0.8 GB back in.  Some GCs use an alternate approach where pages are scanned and marked stale when the VMM swaps them to disk, so no page faults occur on collection runs.
> 
> I only know of one such collector, and it required a specially patched (Linux) kernel that notifies a process before it swaps its pages to disk and allows it to specify which pages to swap (since the collector touches the pages on receiving the swap warning, which under a normal LRU-like scheme would stop them from being swapped).

Yes, the scheme isn't supported everywhere, though I had thought it was possible on Linux without a kernel patch.
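For instance, mincore(2) lets a process ask which of its pages are resident, with no kernel patch at all. A rough sketch of how a collector might use it (the hard-coded 4K page size is an assumption - a real implementation would query getpagesize() - and unlike the patched-kernel scheme this only *detects* residency, it can't choose which pages get evicted):

extern (C) int mincore(void* addr, size_t length, ubyte* vec);

const size_t PAGE = 4096; // assumed page size; query getpagesize() in practice

// Returns true if the page containing p is resident in physical memory.
bool pageResident(void* p)
{
    ubyte vec;
    void* page = cast(void*)(cast(size_t)p & ~(PAGE - 1));
    if (mincore(page, PAGE, &vec) != 0)
        return true;           // on error, be conservative and scan anyway
    return (vec & 1) != 0;     // low bit set => page is in core
}

// During a scan, the collector could skip non-resident pages and
// pessimistically treat every object in them as reachable.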

> Also, that particular GC would sometimes do an actual full collect of the memory, since otherwise swapped-out garbage might never be collected.

True.  It would be somewhat similar to a generational GC, with stale pages representing "mature" data.


Sean
July 27, 2006
Karen Lanrap wrote:
> I see three problems:
> 
> 1) The typical behaviour of a GC'ed application is to acquire more and more main memory without actually needing it.

No, that is not at all how GC works. The whole idea behind GC is to "collect" unneeded memory so it can be reused.

> Hence every GC'ed application forces the OS to diminish the size of the system cache held in main memory until the GC of the application kicks in.

This has nothing to do with the system cache. The system cache is simply the most recently accessed memory; memory that is allocated but not referenced is flushed from it.


> 2) If the available main memory is insufficient for the true memory requirements of the application and the OS provides virtual memory by swapping out to secondary storage, every run of the GC forces the OS to slowly swap back all data for this application from secondary storage

What you're describing is called 'thrashing', and it happens on any system where the running applications together regularly access more memory than exists on the system, regardless of what kind of memory management is used.

> and runs of the GC occur frequently, because main memory is tight.

This is incorrect, as GC keys off of virtual memory available, not main memory available.


> 3) If there is more than one GC'ed application running, those applications compete for the available main memory.

No, they compete for virtual memory. The most recently accessed pages (4k granularity) are swapped into main memory.

> I see four risks:
> 
> a) from 1: The overall reaction of the system gets slower in favor of the GC'ed application.

No - GC uses *virtual* address space, it doesn't use any more *physical* memory than any other app would. Remember, physical memory only gets actually used if the memory is referenced. Unused virtual memory, even if allocated, is not put in physical memory.
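To make the distinction concrete, here's a minimal Win32 sketch (error checking omitted; the constants are the standard Win32 values):

extern (Windows) void* VirtualAlloc(void* addr, size_t size, uint type, uint protect);

const uint MEM_COMMIT     = 0x1000;
const uint MEM_RESERVE    = 0x2000;
const uint PAGE_READWRITE = 0x04;

void demo()
{
    // Reserve 256M of *address space* - this consumes no physical memory.
    void* region = VirtualAlloc(null, 256 * 1024 * 1024,
                                MEM_RESERVE, PAGE_READWRITE);

    // Commit 1M of it - now backed by the pagefile, but still not in RAM.
    void* chunk = VirtualAlloc(region, 1024 * 1024,
                               MEM_COMMIT, PAGE_READWRITE);

    // Only referencing the pages finally brings them into physical memory.
    ubyte* p = cast(ubyte*)chunk;
    for (size_t i = 0; i < 1024 * 1024; i += 4096)
        p[i] = 1;
}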

> 
> b) from 2: Projects decomposed into several subtasks may face severe runtime problems when integrating the independently and successfully tested modules.

I seriously doubt that, unless the subtasks all want all of memory, which seems like an extreme, highly unusual case.

> c) from 2 and b: The man time saved in the development and maintenance phases by not being forced to avoid memory leaks may be more than offset by an increase of machine time by a factor of 50 or more.

GC apps often run faster than the equivalent explicitly managed apps. Why? Because:
1) GC apps need to do far less allocation
2) GC apps don't consume memory needed for <shared_ptr> or equivalent memory management objects
3) GC apps don't need to spend time doing synchronization of reference counts

> Therefore GC'ed applications currently seem to be suitable only if they run as a single instance on a machine well equipped with main memory and no other GC'ed applications are used.

GC is quite mainstream now, and the technology has progressed far beyond such a primitive state. I believe your concerns are unfounded. I suggest the book "Garbage Collection", which you can find via www.digitalmars.com/bibliography.html.
July 27, 2006
Sean Kelly wrote:
> Frits van Bommel wrote:
>> Sean Kelly wrote:
>>> By the way, there's no reason that even a full collect must swap all 0.8 GB back in.  Some GCs use an alternate approach where pages are scanned and marked stale when the VMM swaps them to disk, so no page faults occur on collection runs.
>>
>> I only know of one such collector, and it required a specially patched (Linux) kernel that notifies a process before it swaps its pages to disk and allows it to specify which pages to swap (since the collector touches the pages on receiving the swap warning, which under a normal LRU-like scheme would stop them from being swapped).
>
> Yes, the scheme isn't supported everywhere, though I had thought it was possible on Linux without a kernel patch.

The one I was talking about is described in http://www.cs.umass.edu/~emery/pubs/04-16.pdf (and at least one other paper at http://www.cs.umass.edu/~emery/pubs/). A quote from page 5:

-----
4. Kernel Support
The Hippocratic collector improves garbage collection paging performance primarily by cooperating with the virtual memory manager. In this section, we describe our extensions to the Linux kernel that enable cooperative garbage collection.
[...]
-----

Of course, it is entirely possible that their patch or one with similar effects has been accepted in the kernel since then.


>> Also, that particular GC would sometimes do an actual full collect of the memory, since otherwise swapped-out garbage might never be collected.
>
> True.  It would be somewhat similar to a generational GC, with stale pages representing "mature" data.

Yep, except that instead of the objects in them having existed for a long time, they haven't been touched in a while (which I suppose implies they've existed for all that time :) ).


One of the other cool things this collector does, IIRC: it tries to move all objects out of a page that's scheduled to be swapped out, if there's space in other (memory-resident) pages. If that succeeds, it then tells the kernel to forget about saving the contents to disk, since they won't be needed anymore.
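On Linux, that last step would presumably be an madvise() call, something like this sketch (the live objects are assumed to have been evacuated already, and releasePage is a hypothetical name, not the collector's actual interface):

extern (C) int madvise(void* addr, size_t length, int advice);

const int MADV_DONTNEED = 4;  // Linux value: the contents may be discarded

const size_t PAGE = 4096;

// Called once every live object has been moved off the page: the kernel
// can now drop the page outright instead of writing it to swap.
void releasePage(void* page)
{
    madvise(page, PAGE, MADV_DONTNEED);
}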
July 27, 2006
Frits van Bommel wrote:
> One of the other cool things this collector does, IIRC: it tries to move all objects out of a page that's scheduled to be swapped out, if there's space in other (memory-resident) pages. If that succeeds, it then tells the kernel to forget about saving the contents to disk, since they won't be needed anymore.

The GC can be improved if it can cooperate with/use the VM hardware. I've thought for some time that GC ought to be an OS system service for that reason, rather than an app library.
July 27, 2006
Walter Bright wrote:
> Frits van Bommel wrote:
>> One of the other cool things this collector does, IIRC: it tries to move all objects out of a page that's scheduled to be swapped out, if there's space in other (memory-resident) pages. If that succeeds, it then tells the kernel to forget about saving the contents to disk, since they won't be needed anymore.
> 
> The GC can be improved if it can cooperate with/use the VM hardware. I've thought for some time that GC ought to be an OS system service for that reason, rather than an app library.

Same here.  Between thread scheduling and memory management, having the GC as an OS service seems a natural fit.  I'll admit to being somewhat curious about whether MS tries this to improve .NET performance.


Sean
July 28, 2006
Walter Bright wrote:

> that would be a terrible design if it did.

Then have a look at what the code below, compiled with dmd 0.163 on Windows, does under XP SP2.

1) Start it with, for example, the command
  <exeName> 400 100
   in a shell
2) wait until initializing is done
3) verify in Task Manager that allocated memory is about 400M
4) minimize the shell
5) observe in Task Manager that allocated memory reduces to 100M

6) Start it with, for example, the command
  <exeName> 400 100 -lg
  in a shell
7) wait until initializing is done
8) verify in Task Manager that allocated memory is about 400M
9) minimize the shell
10) observe in Task Manager that allocated memory does not reduce

Note: -lg means leak and use GC

Note: If the machine has 1GB of main memory, the call "<exeName> 1200 400 -lg" will cause thrashing.

Note: If the machine has 1GB of main memory you can have several shells running the command "<exeName> 500 100", but not the command "<exeName> 500 100 -lg".
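A minimal version of such a test program might look like the following (a reconstructed sketch, not the original listing; the argument handling and the exact behavior of -lg are guesses from the notes above):

import std.conv;    // toInt
import std.stdio;   // writefln
import std.c.stdio; // getchar

int main(char[][] args)
{
    if (args.length < 3)
    {
        writefln("usage: %s <totalMB> <liveMB> [-lg]", args[0]);
        return 1;
    }
    int totalMB = toInt(args[1]);
    int liveMB  = toInt(args[2]);
    bool leak   = args.length > 3 && args[3] == "-lg";

    byte[][] blocks;
    for (int i = 0; i < totalMB; i++)
        blocks ~= new byte[1024 * 1024]; // zero-initialization touches every page

    if (!leak)
        blocks.length = liveMB; // drop references so the GC may reclaim the rest

    writefln("initializing done");
    getchar(); // park here so memory use can be watched in Task Manager
    return 0;
}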


July 28, 2006
Karen Lanrap wrote:
> Walter Bright wrote:
> 
>> that would be a terrible design if it did.
> 
> Then have a look at what the code below, compiled with dmd 0.163 on Windows, does under XP SP2.

Assuming this is actually a problem, it's likely just with the allocator strategy, not with GC as a technique.  For example, I have a small program to test multithreading in D, and XP reports its memory use climbing steadily as it runs, even though it should remain roughly constant as far as the code itself is concerned.  I've wondered whether some leapfrogging is going on, but haven't bothered to look into it.


Sean
July 28, 2006
Karen Lanrap wrote:
> Walter Bright wrote:
> 
> 
>> that would be a terrible design if it did.
> 
> 
> Then have a look at what the code below, compiled with dmd 0.163 on Windows, does under XP SP2.
> 
[...]
> 3) verify in Task Manager that allocated memory is about 400M
> 4) minimize the shell
> 5) observe in Task Manager that allocated memory reduces to 100M
> 
[...]

I don't see how this demonstrates that a GC'd app will cause more thrashing than a non-GC'd app. First of all, is that 400/100M figure the actual physical memory usage, or the virtual memory usage?

Even if the program IS using more (virtual) RAM, it won't inherently thrash the system. Thrashing is caused by /access/ to more memory than is available, not by having more memory /allocated/ than is available.

Having lots of "lost" memory around is quite unlikely to cause thrashing because, in all likelihood, the extra RAM is inaccessible due to a lack of pointers to it. As such it won't get accessed and will never get swapped back in. An exception is if the GC keeps some sort of tag for each memory block that is located at the block itself; then some kinds of GC actions will cause the "lost" blocks to be read, and thus swapped in (and probably deallocated a short time later).

The accessible memory, on the other hand, _will_ get scanned by the GC on a full collect. This could cause some thrashing. However, unless the GC frees up only enough memory for each allocation (which would be not only a bad design, but a stupid one as well), most allocations won't result in any collection at all.
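For instance, a typical allocator collects only on the slow path, something like this sketch (the pool primitives are hypothetical stubs, and the growth heuristic is purely illustrative):

// Hypothetical pool primitives, stubbed out for illustration.
void* poolAlloc(size_t size)  { return null; } // carve from existing free space
size_t fullCollect()          { return 0; }    // mark/sweep; returns bytes freed
void growPool(size_t size)    { }              // ask the OS for more memory

void* allocate(size_t size)
{
    void* p = poolAlloc(size);   // fast path: no collection at all
    if (p !is null)
        return p;

    // Slow path: collect, and if that recovers too little,
    // grow the pool instead of collecting again on the next call.
    if (fullCollect() >= size * 2)
    {
        p = poolAlloc(size);
        if (p !is null)
            return p;
    }
    growPool(size);
    return poolAlloc(size);
}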

As to swapping, the same logic applies to the OS's swapping algorithm. Typically OSes try to keep a pool of free memory available by swapping stuff out when they have the time. Most of the time, allocations and page faults don't require anything to be swapped out.