July 24, 2005
How much data can the GC handle ?
Hello,

I've never seen an answer to this question (and I didn't search very much):

How large can the data set become before the GC consumes too much space, or is
it O(1)? I doubt this. Has anybody ever written a program which handles dozens
of millions of objects with hundreds of megabytes?

I thought about writing a test program, but I gave up because it is very
difficult to emulate a typical use pattern. Sure, every program is different,
but I think a real-world program might be better than any synthetic benchmark.
July 25, 2005
Re: How much data can the GC handle ?
"llothar" <llothar_member@pathlink.com> wrote in message 
news:dc0k8e$3df$1@digitaldaemon.com...
> Hello,
>
> I've never seen an answer to this question (and I didn't search very
> much):
>
> How large can the data set become before the GC consumes too much space,
> or is it O(1)? I doubt this. Has anybody ever written a program which
> handles dozens of millions of objects with hundreds of megabytes?
>
> I thought about writing a test program, but I gave up because it is very
> difficult to emulate a typical use pattern. Sure, every program is
> different, but I think a real-world program might be better than any
> synthetic benchmark.

Maybe I don't get the question, but the amount of space available is 
proportional to the amount of memory (and virtual memory) available to the 
application. Are you asking how big the GC overhead is per allocation? Note 
that the overhead depends on the size requested. For small allocations it is 
very efficient, since the GC splits large blocks into small chunks - the 
overhead could be something like 1 bit per 8 bytes. For larger allocations 
the constant-size header is not noticeable.
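
As a rough illustration of that small-allocation behaviour, here is a minimal
sketch (mine, not Ben's, and it assumes today's core.memory API rather than the
2005-era std.gc): it requests a few odd sizes and prints the block size the GC
actually reserves, which shows the rounding to fixed-size chunks.

// Sketch: observe how the GC rounds small requests up to fixed-size chunks.
// Assumes a current DMD/druntime; the 2005 std.gc interface was different.
import core.memory : GC;
import std.stdio : writefln;

void main()
{
    foreach (request; [1, 5, 16, 17, 100, 1000, 100_000])
    {
        void* p = GC.malloc(request);          // ask for "request" bytes
        writefln("requested %6s bytes -> block of %6s bytes",
                 request, GC.sizeOf(p));       // size the GC really reserved
    }
}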
July 25, 2005
Re: How much data can the GC handle ?
Ben Hinkle wrote:
> "llothar" <llothar_member@pathlink.com> wrote in message 
> news:dc0k8e$3df$1@digitaldaemon.com...
> 
>> Hello,
>> 
>> I've never seen an answer to this question (and I didn't search
>> very much):
>> 
>> How large can the data set become before the GC consumes too much
>> space, or is it O(1)? I doubt this. Has anybody ever written a
>> program which handles dozens of millions of objects with hundreds
>> of megabytes?

The much more interesting question is:
How efficient is the GC at detecting unused memory and reclaiming it?

A very basic example:

void test(){
	size_t a;               // plain integer that will hold a pointer value
	size_t* b;
	b = &a;                 // b points at the stack variable a
	a = cast(size_t) b;     // a now holds a pointer-looking integer
}

Will the GC reclaim a and b after exiting test() and calling
std.gc.minimize()?
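
A heap-based variant is easier to check. The following sketch (my own, and it
assumes today's core.memory API; std.gc.minimize() was the 2005 equivalent of
GC.minimize()) allocates a block that is only reachable through a local
reference, lets the reference go out of scope, and compares the GC's used size
before and after a collection. Note that a stale copy of the reference may
still sit on the stack or in a register, in which case a conservative
collector keeps the block alive - which is exactly the kind of behaviour the
question is probing.

// Sketch: is memory actually reclaimed once the last reference is gone?
// Assumes a current DMD/druntime (GC.stats, GC.collect, GC.minimize).
import core.memory : GC;
import std.stdio : writeln;

void allocate()
{
    // Only reachable through this local reference.
    ubyte[] data = new ubyte[](10 * 1024 * 1024);
    data[0] = 1;
}   // the reference goes out of scope here

void main()
{
    auto before = GC.stats().usedSize;
    allocate();
    GC.collect();
    GC.minimize();
    auto after = GC.stats().usedSize;
    writeln("used before: ", before, " bytes, used after: ", after, " bytes");
}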

>> I thought about writing a test program, but I gave up because it is
>> very difficult to emulate a typical use pattern.

Maybe you should start with synthetic tests:
What happens if a program uses a large number of ints?
Does different nesting have any influence on the GC?
...

That way you could not only detect that there are potential problems, but also
identify and locate them.
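
One possible synthetic test along these lines, as a sketch of my own (again
using today's druntime API rather than the 2005 std.gc): build a large linked
structure of small objects, time a collection while it is still live, then
drop it and time the collection that frees it.

// Sketch: how long does a full collection take with millions of small objects?
import core.memory : GC;
import std.datetime.stopwatch : AutoStart, StopWatch;
import std.stdio : writeln;

class Node
{
    Node next;
    int[16] payload;
}

void main()
{
    enum count = 1_000_000;

    // Build a long singly linked list of small objects.
    Node head;
    foreach (i; 0 .. count)
    {
        auto n = new Node;
        n.next = head;
        head = n;
    }

    auto sw = StopWatch(AutoStart.yes);
    GC.collect();                        // everything is still reachable
    writeln("collect with live list:  ", sw.peek);

    head = null;                         // drop the whole list
    sw.reset();
    GC.collect();                        // now the list is garbage
    writeln("collect after dropping:  ", sw.peek);
}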

Thomas
July 25, 2005
Re: How much data can the GC handle ?
In article <dc226e$169l$1@digitaldaemon.com>, Thomas Kühne says...

>The much more interesting question is:
>How efficient is the GC at detecting unused memory and reclaiming it?

That's why I asked in the past whether D emits type hints for allocated
structures. At the moment it does not, which is very bad. I found a message on
the GC mailing list saying that this is a problem, for example, when allocating
large floating-point arrays: the number of false positives was so high that the
application became unusable, but using "GC_malloc_atomic" for the array removed
the problem.

I hope D implements type hinting and correct use of "GC_malloc" soon, because
IMHO this is a serious, mission-critical problem.
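
For reference, today's druntime does expose this kind of hint through block
attributes (this is my illustration of the idea, not something that existed in
2005): an allocation marked NO_SCAN is never scanned for pointers, so a large
floating-point array cannot produce false positives. Modern D also sets this
attribute automatically for "new double[](n)", because the element type
contains no pointers.

// Sketch: allocate a big pointer-free array that the collector never scans.
// Assumes a current DMD/druntime (core.memory.GC.BlkAttr.NO_SCAN).
import core.memory : GC;

void main()
{
    enum n = 50_000_000;

    // The array holds only doubles, so tell the collector not to scan it;
    // random bit patterns in it can then never be mistaken for references.
    auto raw = cast(double*) GC.malloc(n * double.sizeof, GC.BlkAttr.NO_SCAN);
    double[] data = raw[0 .. n];

    foreach (i, ref x; data)
        x = i * 0.5;
}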
July 25, 2005
Re: How much data can the GC handle ?
> How large can the data set become before the GC consumes too much space, or is
> it O(1)? I doubt this. Has anybody ever written a program which handles dozens
> of millions of objects with hundreds of megabytes?


Somewhat related to your question, but not D-specific:
http://www.cs.umass.edu/~emery/pubs/04-17.pdf

marcio
October 15, 2005
Re: How much data can the GC handle ?
On 07/25/2005 07:38 AM, llothar wrote:
> In article <dc226e$169l$1@digitaldaemon.com>, Thomas Kühne says...
> 
> 
>>The much more interesting question is:
>>How efficient is the GC at detecting unused memory and reclaiming it?
> 
> 
> That's why I asked in the past whether D emits type hints for allocated
> structures. At the moment it does not, which is very bad. I found a message
> on the GC mailing list saying that this is a problem, for example, when
> allocating large floating-point arrays: the number of false positives was so
> high that the application became unusable, but using "GC_malloc_atomic" for
> the array removed the problem.
> 
> I hope D implements type hinting and correct use of "GC_malloc" soon, because
> IMHO this is a serious, mission-critical problem.
> 
> 
Using the D equivalent of one of the policy_ptrs at:

http://cvs.sourceforge.net/viewcvs.py/boost-sandbox/boost-sandbox/boost/policy_ptr/

would provide this hint in the form of:

  selected_fields_description_of
          < FieldsVisitor
          , record_type
          >::ptr()

where record_type is the type of the structure, and FieldsVisitor is
some type with a member function template:

  template<class PolicyPtr>
  void visit_field(PolicyPtr& a_ptr);

where PolicyPtr is the type of the fields in record_type which are
"visited" by FieldsVisitor. Such a FieldsVisitor could be a type
of garbage collector which, as part of visit_field, marks the
referent pointed to by a_ptr and then traverses the referent, of
type PolicyPtr::referent_type, using:

  selected_fields_description_of
          < FieldsVisitor
          , PolicyPtr::referent_type
          >::ptr()

thus allowing a precise (as opposed to "conservative") scan of
the heap.
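
Translated into a toy D sketch (my own analogue of the idea above, not the
policy_ptr library and not anything the D runtime actually offers): each
traced type declares its pointer fields to a visitor, and a mark phase built
on that visitor follows only those declared references, so plain data such as
a floating-point payload can never be misread as a pointer.

// Sketch: precise marking via a fields-visitor, in plain D.
import std.stdio : writeln;

// The "collector" side: it is handed each pointer field explicitly,
// so it never has to guess which words in an object are references.
interface FieldsVisitor
{
    void visitField(Record field);
}

class Record
{
    Record left;            // reference fields, declared to the visitor below
    Record right;
    double[64] payload;     // plain data: never shown to the visitor

    // The type itself knows which of its fields are references.
    void visitPointers(FieldsVisitor v)
    {
        v.visitField(left);
        v.visitField(right);
    }
}

// A toy mark phase that traverses only the declared pointer fields.
class Marker : FieldsVisitor
{
    bool[Record] marked;

    override void visitField(Record field)
    {
        if (field is null)
            return;
        if (field in marked)
            return;                     // already visited
        marked[field] = true;
        field.visitPointers(this);      // recurse through declared fields only
    }
}

void main()
{
    auto root = new Record;
    root.left = new Record;
    root.left.right = new Record;

    auto m = new Marker;
    m.visitField(root);
    writeln("precisely marked ", m.marked.length, " objects");
}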