December 15, 2012
Re: Significant GC performance penalty
On 2012-12-14 19:27, Rob T wrote:

> I wonder what can be done to allow a programmer to go fully manual,
> while not losing any of the nice features of D?

Someone has created a GC-free version of druntime and Phobos. 
Unfortunately I can't find the post in the newsgroup right now.

-- 
/Jacob Carlborg
December 15, 2012
Re: Significant GC performance penalty
On Saturday, 15 December 2012 at 11:35:18 UTC, Jacob Carlborg 
wrote:
> On 2012-12-14 19:27, Rob T wrote:
>
>> I wonder what can be done to allow a programmer to go fully 
>> manual,
>> while not losing any of the nice features of D?
>
> Someone has created a GC-free version of druntime and Phobos. 
> Unfortunately I can't find the post in the newsgroup right now.

http://3d.benjamin-thaut.de/?p=20
December 15, 2012
Re: Significant GC performance penalty
On Saturday, 15 December 2012 at 13:04:41 UTC, Mike Parker wrote:
> On Saturday, 15 December 2012 at 11:35:18 UTC, Jacob Carlborg 
> wrote:
>> On 2012-12-14 19:27, Rob T wrote:
>>
>>> I wonder what can be done to allow a programmer to go fully 
>>> manual,
>>> while not losing any of the nice features of D?
>>
>> Someone has created a GC-free version of druntime and Phobos. 
>> Unfortunately I can't find the post in the newsgroup right now.
>
> http://3d.benjamin-thaut.de/?p=20

Thanks for the link. It is Windows only and I'm using Linux, but 
it is still worth a look.

Note the comment below from that page, which reports a 3x 
difference, the same as what I experienced:

Update:
I found a piece of code that did manually slow down the 
simulation in case it got too fast. This code never kicked in with 
the GC version, because it never reached the margin. The manual 
memory managed version however did reach the margin and was 
slowed down. With this piece of code removed the manual memory 
managed version runs at 5 ms which is 200 FPS and thus nearly 3 
times as fast as the GC collected version.
December 16, 2012
Re: Significant GC performance penalty
On Friday, 14 December 2012 at 19:24:39 UTC, Rob T wrote:
> On Friday, 14 December 2012 at 18:46:52 UTC, Peter Alexander 
> wrote:
>> Allocating memory is simply slow. The same is true in C++ 
>> where you will see performance hits if you allocate memory too 
>> often. The GC makes things worse, but if you really care about 
>> performance then you'll avoid allocating memory so often.
>>
>> Try to pre-allocate as much as possible, and use the stack 
>> instead of the heap where possible. Fixed size arrays and 
>> structs are your friend.
>
> In my situation, I can think of some ways to mitigate the 
> memory allocation problem, however it's a bit tricky when 
> SELECT statement results have to be dynamically generated, 
> since the number of rows returned and size and type of the rows 
> are always different depending on the query and the data stored 
> in the database. It's just not at all practical to custom fit 
> for each SELECT to a pre-allocated array or list, it'll just be 
> far too much manual effort.
>

Isn't the memory management completely negligible when compared 
to the database access here ?
December 16, 2012
Re: Significant GC performance penalty
On Sunday, 16 December 2012 at 05:37:57 UTC, SomeDude wrote:
>
> Isn't the memory management completely negligible when compared 
> to the database access here ?

Here are the details ...

My test run selects and returns 206,085 records with 14 fields 
per record.

With all dynamic memory allocations disabled that are used to 
create the data structure containing the returned rows, a run 
takes 5 seconds. This run does not return any data, but it 
iterates over all the records in exactly the same way, storing 
each value in a temporary stack-allocated variable of the 
appropriate type.

If I disable the GC before the run and re-enable it immediately 
after, it takes 7 seconds. I presume a full 2 seconds are used to 
disable and re-enable the GC which seems like a lot of time.

With all dynamic memory allocations enabled that are used to 
create the data structure containing the returned rows, a run 
takes 28 seconds. In this case, all 206K records are returned in 
a dynamically generated list.

If I disable the GC before the run and re-enable it immediately 
after, it takes 11 seconds. Subtracting the 2 seconds apparently 
spent disabling and re-enabling the GC leaves 9 seconds, and 
since a run without memory allocations takes 5 seconds, the 
allocations themselves account for about 4 seconds. I am doing a 
lot of allocations, though.

In my case, the structure is dynamically generated by allocating 
each individual field for each record returned, so there are 
206,085 records x 14 fields = 2,885,190 allocations being 
performed. I could cut that down to roughly 206,000 allocations 
by allocating each full record in one shot, but this is a 
stress test designed to work D as hard as possible and compare it 
with an identically stressed C++ version.
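
To make the two allocation strategies concrete, here is a rough 
sketch in D. The Field type and the two helper functions are 
hypothetical stand-ins, since the actual record structure is not 
shown in this thread; only the record and field counts come from 
the numbers above.

```d
import std.stdio : writefln;

enum recordCount = 206_085;   // records returned by the test query
enum fieldsPerRecord = 14;    // fields per record

// Hypothetical field type; the real field representation is an assumption.
struct Field
{
    long value;
}

// Strategy 1: one GC allocation per field (14 allocations per record).
Field*[fieldsPerRecord] allocatePerField()
{
    Field*[fieldsPerRecord] fields; // fixed-size array, no heap allocation itself
    foreach (ref f; fields)
        f = new Field;              // each field is a separate GC allocation
    return fields;
}

// Strategy 2: one GC allocation for the whole record.
Field[] allocatePerRecord()
{
    return new Field[fieldsPerRecord];
}

void main()
{
    // Total allocation counts for the full result set under each strategy.
    writefln("per-field:  %s allocations", recordCount * fieldsPerRecord);
    writefln("per-record: %s allocations", recordCount);
}
```

The second strategy does 14 times fewer allocations for the same 
data, which is the reduction mentioned above.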

The D and C++ versions perform identically once the GC is 
disabled and the 2 seconds spent disabling and re-enabling the GC 
around the run are subtracted from the D version's time.

I wonder why 2 seconds are used to disable and enable the GC? 
That seems like a very large amount of time. If I select only 
5,000 records, the time to disable and enable the GC drops 
significantly to negligible levels and it takes the same amount 
of time per run with GC disabled & enabled, or with GC left 
enabled all the time.

During all tests, I do not run out of free RAM, and at no point 
does the memory go to swap.

--rt
December 16, 2012
Re: Significant GC performance penalty
Rob T:

> I wonder why 2 seconds are used to disable and enable the GC?

If you want one more test, try putting an "exit(0);" at the end 
of your program (the C exit function is in core.stdc.stdlib).

Bye,
bearophile
December 16, 2012
Re: Significant GC performance penalty
On Sunday, 16 December 2012 at 07:47:48 UTC, Rob T wrote:
> On Sunday, 16 December 2012 at 05:37:57 UTC, SomeDude wrote:
>>
>> Isn't the memory management completely negligible when 
>> compared to the database access here ?
>
> Here are the details ...
>
> My test run selects and returns 206,085 records with 14 fields 
> per record.
>
> With all dynamic memory allocations disabled that are used to 
> create the data structure containing the returned rows, a run 
> takes 5 seconds. This does not return any data, but it runs 
> exactly through all the records in the same way but returns to 
> a temporary stack allocated value of appropriate type.
>
> If I disable the GC before the run and re-enable it immediately 
> after, it takes 7 seconds. I presume a full 2 seconds are used 
> to disable and re-enable the GC which seems like a lot of time.
>
> With all dynamic memory allocations enabled that are used to 
> create the data structure containing the returned rows, a run 
> takes 28 seconds. In this case, all 206K records are returned 
> in a dynamically generated list.
>
> If I disable the GC before the run and re-enable it immediately 
> after, it takes 11 seconds. Since a full 2 seconds are used to 
> disable and re-enable the GC, then 9 seconds are used, and 
> since 5 seconds are used without memory allocations, the 
> allocations are using 4 seconds, but I'm doing a lot of 
> allocations.
>
> In my case, the structure is dynamically generated by 
> allocating each individual field for each record returned, so 
> there's 206,085 records x 14 fields = 2,885,190 allocations 
> being performed. I can cut the individual allocations down to 
> 206,000 by allocating the full record in one shot, however this 
> is a stress test designed to work D as hard as possible and 
> compare it with an identically stressed C++ version.
>
> Both the D and C++ versions perform identically with the GC 
> disabled and subtracting the 2 seconds from the D version to 
> remove the time used up by enabling and disabling the GC during 
> and after the run.
>
> I wonder why 2 seconds are used to disable and enable the GC? 
> That seems like a very large amount of time. If I select only 
> 5,000 records, the time to disable and enable the GC drops 
> significantly to negligible levels and it takes the same amount 
> of time per run with GC disabled & enabled, or with GC left 
> enabled all the time.
>
> During all tests, I do not run out of free RAM, and at no point 
> does the memory go to swap.
>
> --rt

Adding and subtracting times like this doesn't give very reliable 
results. If you want to know how much time is taken by different 
parts of code, I suggest you use a profiler.
December 16, 2012
Re: Significant GC performance penalty
On Sunday, 16 December 2012 at 07:47:48 UTC, Rob T wrote:
> On Sunday, 16 December 2012 at 05:37:57 UTC, SomeDude wrote:
>>
>> Isn't the memory management completely negligible when 
>> compared to the database access here ?
>
> Here are the details ...
>
> My test run selects and returns 206,085 records with 14 fields 
> per record.
>
> With all dynamic memory allocations disabled that are used to 
> create the data structure containing the returned rows, a run 
> takes 5 seconds. This does not return any data, but it runs 
> exactly through all the records in the same way but returns to 
> a temporary stack allocated value of appropriate type.
>
> If I disable the GC before the run and re-enable it immediately 
> after, it takes 7 seconds. I presume a full 2 seconds are used 
> to disable and re-enable the GC which seems like a lot of time.
>
> With all dynamic memory allocations enabled that are used to 
> create the data structure containing the returned rows, a run 
> takes 28 seconds. In this case, all 206K records are returned 
> in a dynamically generated list.
>
> If I disable the GC before the run and re-enable it immediately 
> after, it takes 11 seconds. Since a full 2 seconds are used to 
> disable and re-enable the GC, then 9 seconds are used, and 
> since 5 seconds are used without memory allocations, the 
> allocations are using 4 seconds, but I'm doing a lot of 
> allocations.
>
> In my case, the structure is dynamically generated by 
> allocating each individual field for each record returned, so 
> there's 206,085 records x 14 fields = 2,885,190 allocations 
> being performed. I can cut the individual allocations down to 
> 206,000 by allocating the full record in one shot, however this 
> is a stress test designed to work D as hard as possible and 
> compare it with an identically stressed C++ version.
>
> Both the D and C++ versions perform identically with the GC 
> disabled and subtracting the 2 seconds from the D version to 
> remove the time used up by enabling and disabling the GC during 
> and after the run.
>
> I wonder why 2 seconds are used to disable and enable the GC? 
> That seems like a very large amount of time. If I select only 
> 5,000 records, the time to disable and enable the GC drops 
> significantly to negligible levels and it takes the same amount 
> of time per run with GC disabled & enabled, or with GC left 
> enabled all the time.
>
> During all tests, I do not run out of free RAM, and at no point 
> does the memory go to swap.
>
> --rt

Use StopWatch from std.datetime to get a proper idea of where 
time is being spent. All this subtracting 2 secs business 
stinks.

Or just fire up a profiler.
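
A minimal sketch of what that per-phase timing could look like. 
The buildRows function is a hypothetical stand-in for the real 
workload, which has not been posted. The import path 
std.datetime.stopwatch is where StopWatch lives in current 
Phobos releases; older releases exposed it directly from 
std.datetime.

```d
import core.memory : GC;
import std.datetime.stopwatch : AutoStart, StopWatch;
import std.stdio : writefln;

// Hypothetical workload: one GC allocation per row, plus one for the outer array.
int[][] buildRows(size_t rows, size_t fields)
{
    auto result = new int[][rows];
    foreach (ref row; result)
        row = new int[fields];
    return result;
}

void main()
{
    // Time each phase on its own instead of subtracting totals afterwards.
    auto sw = StopWatch(AutoStart.yes);
    GC.disable();
    immutable disableTime = sw.peek();

    sw.reset(); // resetting a running StopWatch restarts the measurement from zero
    auto rows = buildRows(206_085, 14);
    immutable buildTime = sw.peek();

    sw.reset();
    GC.enable();
    immutable enableTime = sw.peek();

    writefln("GC.disable: %s", disableTime);
    writefln("build rows: %s (%s rows)", buildTime, rows.length);
    writefln("GC.enable:  %s", enableTime);
}
```

If the disable and enable calls themselves turn out to be cheap, 
the extra 2 seconds must be coming from somewhere else in the 
run, and a profiler is the next step.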
December 16, 2012
Re: Significant GC performance penalty
On 2012-12-15 14:04, Mike Parker wrote:

> http://3d.benjamin-thaut.de/?p=20

That's it, thanks.

-- 
/Jacob Carlborg
December 16, 2012
Re: Significant GC performance penalty
On Sunday, 16 December 2012 at 11:43:20 UTC, John Colvin wrote:
> Use StopWatch from std.datetime to get a proper idea of where 
> time is being spent. All this subtracting 2 secs business 
> stinks.
>
> Or just fire up a profiler.

I am using the stopwatch, but had not gotten around to wrapping 
it around the individual parts for the extra detail. The 
subtractions and so forth were roughly calculated on the fly 
while I was posting and noticing new things I hadn't noticed 
before.

The fact is that disabling and enabling the GC added an extra 2 
seconds for some reason, so it is worth finding out why. I'll do 
proper timing later and post the results here.

--rt