The purpose of D (GC rant, long) (page 3) - D Programming Language Discussion Forum

October 27, 2006

Re: The purpose of D (GC rant, long)

Posted by Sean Kelly
in reply to Dave

Dave wrote:
> Sean Kelly wrote:
>> Now that memory is not freed when string.length is set to zero it's quite possible to avoid most reallocations simply by preallocating in buffers before using them (ie. set length to some large number and then 
> 
> I did not know that had been changed.. Is that now part of the language 'spec' somewhere as well?

No.  It was implemented in 170-172 by request from Derek.  I don't know the issue number offhand.

> I'm betting this has been discussed or at least proposed, but here goes again; let's get an array.reserve at least for native arrays (that could be implemented as {arr.length = nnn; arr.length = 0;}). That way it would make for less of a hack than re/setting the length, and also codify it as part of the language.

I agree that this would be useful.  Though it would probably be more like:

size_t tmp = arr.length; arr.length = nnn; arr.length = tmp;
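
A minimal sketch of that idiom wrapped in a helper, assuming a current D compiler rather than DMD 0.17x. The name reserveCapacity is hypothetical. Current D also ships a built-in reserve for dynamic arrays, so the helper only illustrates the save/grow/restore pattern discussed here.

```d
import std.stdio : writeln;

// Hypothetical helper: make sure the backing store can hold n elements
// without changing the array's logical length.
void reserveCapacity(T)(ref T[] arr, size_t n)
{
    immutable saved = arr.length;   // remember the logical length
    if (n <= saved)
        return;                     // already large enough
    arr.length = n;                 // force the allocation
    arr.length = saved;             // restore the logical length
    // Assumption: in current D, appending to a shrunk slice would reallocate
    // unless the runtime is told the tail is unused.
    arr.assumeSafeAppend();
}

void main()
{
    char[] buf;
    reserveCapacity(buf, 1024);
    writeln("length ", buf.length, ", capacity >= 1024: ", buf.capacity >= 1024);

    // The built-in equivalent in current D:
    char[] other;
    other.reserve(1024);
    writeln("built-in reserve, capacity >= 1024: ", other.capacity >= 1024);
}
```

The helper preserves tmp-style semantics: the contents and length seen by the caller are unchanged, only the capacity grows.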


Sean

October 27, 2006

Re: The purpose of D (GC rant, long)

Posted by Sean Kelly
in reply to Walter Bright

Walter Bright wrote:
> Sean Kelly wrote:
>>  but D can
>> really do everything C++ can here, unless you absolutely insist on having user-defined data types that are indistinguishable from built-in types.
> 
> C++ is not capable of having user-defined types indistinguishable from built-in ones.

How so?  It can certainly get pretty close, but I'm unaware of the limitations.


Sean

October 28, 2006

Re: The purpose of D (GC rant, long)

Posted by Dave
in reply to Sean Kelly

Sean Kelly wrote:
> Dave wrote:
>> Sean Kelly wrote:
>>> Now that memory is not freed when string.length is set to zero it's quite possible to avoid most reallocations simply by preallocating in buffers before using them (ie. set length to some large number and then 
>>
>> I did not know that had been changed.. Is that now part of the language 'spec' somewhere as well?
> 
> No.  It was implemented in 170-172 by request from Derek.  I don't know the issue number offhand.
> 
>> I'm betting this has been discussed or at least proposed, but here goes again; let's get an array.reserve at least for native arrays (that could be implemented as {arr.length = nnn; arr.length = 0;}). That way it would make for less of a hack than re/setting the length, and also codify it as part of the language.
> 
> I agree that this would be useful.  Though it would probably be more like:
> 
> size_t tmp = arr.length; arr.length = nnn; arr.length = tmp;
> 

Oops, you're right..

I was curious as to how much initialization cost. I took the code (bottom) and ran it three times: 1) as-is, 2) with the initialization code in gc._d_newarrayi() commented out and 3) with the initialization code in gcx._malloc() also commented out (along with (2))

1 (DMD v0.172 -O -inline -release, libc v2.4; times in seconds):
new(20):    2.316
malloc(20): 0.83
new(40):    3.029
malloc(40): 0.831
new(60):    2.627
malloc(60): 0.872
new(80):    4.138
malloc(80): 1.754
new(100):    3.968
malloc(100): 1.756

2:
new(20):    1.451
malloc(20): 0.835
new(40):    2.137
malloc(40): 0.838
new(60):    1.765
malloc(60): 0.874
new(80):    3.33
malloc(80): 2.114
new(100):    3.108
malloc(100): 2.164

3:
new(20):    1.133
malloc(20): 0.838
new(40):    1.658
malloc(40): 0.834
new(60):    1.657
malloc(60): 0.981
new(80):    1.871
malloc(80): 1.888
new(100):    1.871
malloc(100): 1.899

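The cost of initialization is in several cases actually *higher* than the cost of allocation/GC itself (compare runs 1 and 3 for sizes 20, 80 and 100), and with both initializations removed the performance for the larger arrays (80 and 100 chars) is comparable to malloc/free.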

One thing I noticed is that for most/all of the _d_new* functions, initialization will be done twice, once in the gcx.malloc and again in the _d_new* function. I believe the extra initialization could be removed in most cases (perhaps with an optional parameter to gcx.malloc()?). Maybe also some syntax to support 'void' initializers for heap allocated arrays?

Those may go a long way toward getting rid of the OP's concerns with the performance of the GC.

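//----------------

// Benchmark: GC array allocation (new) vs. C malloc/free for 20..100 chars.
import std.date, std.c.stdlib, std.stdio, std.gc;

void main()
{
  const iters = 10_000_000;
  for(int j = 1; j <= 5; j++)
  {
    {
      // GC allocation: the array is default-initialized and never freed explicitly
      d_time s = getUTCtime();
      for(int i = 0; i < iters; i++)
      {
        char[] str = new char[j * 20];
      }
      d_time e = getUTCtime();
      writefln("new(",j*20,"):    ",(e-s)/cast(float)TicksPerSecond);
    }
    {
      // C heap allocation: uninitialized memory, freed immediately
      d_time s = getUTCtime();
      for(int i = 0; i < iters; i++)
      {
        char* str = cast(char*)malloc(j * 20 + 1);
        free(str);
      }
      d_time e = getUTCtime();
      writefln("malloc(",j*20,"): ",(e-s)/cast(float)TicksPerSecond);
    }
    // Collect the garbage left by the 'new' loop before timing the next size
    fullCollect();
  }
}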
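
The listing above targets the 2006 Phobos (std.date, std.c.stdlib, std.gc). A rough equivalent for a current D compiler might look like the sketch below. It assumes std.datetime.stopwatch, core.memory and core.stdc.stdlib as replacements, uses a hypothetical helper named seconds, and cuts the iteration count to keep the run short, so its numbers are not comparable to the ones posted here.

```d
import core.memory : GC;
import core.stdc.stdlib : malloc, free;
import std.datetime.stopwatch : StopWatch, AutoStart;
import std.stdio : writefln;

// Hypothetical helper: run `work` once and return the elapsed wall time in seconds.
double seconds(scope void delegate() work)
{
    auto sw = StopWatch(AutoStart.yes);
    work();
    return sw.peek.total!"usecs" / 1e6;
}

void main()
{
    enum iters = 1_000_000;   // reduced from the 10_000_000 used in the post
    foreach (j; 1 .. 6)
    {
        immutable n = j * 20;

        // GC allocation: default-initialized, reclaimed by the collector
        immutable tNew = seconds({
            foreach (i; 0 .. iters)
            {
                auto str = new char[n];
            }
        });
        writefln("new(%s):    %.3f", n, tNew);

        // C heap allocation: uninitialized, freed immediately
        immutable tMalloc = seconds({
            foreach (i; 0 .. iters)
            {
                auto p = cast(char*) malloc(n + 1);
                free(p);
            }
        });
        writefln("malloc(%s): %.3f", n, tMalloc);

        GC.collect();   // clear garbage from the 'new' loop before the next size
    }
}
```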

October 28, 2006

Re: The purpose of D (GC rant, long)

Posted by Sean Kelly
in reply to Dave

Dave wrote:
> Sean Kelly wrote:
>> Dave wrote:
>>> Sean Kelly wrote:
>>>> Now that memory is not freed when string.length is set to zero it's quite possible to avoid most reallocations simply by preallocating in buffers before using them (ie. set length to some large number and then 
>>>
>>> I did not know that had been changed.. Is that now part of the language 'spec' somewhere as well?
>>
>> No.  It was implemented in 170-172 by request from Derek.  I don't know the issue number offhand.
>>
>>> I'm betting this has been discussed or at least proposed, but here goes again; let's get an array.reserve at least for native arrays (that could be implemented as {arr.length = nnn; arr.length = 0;}). That way it would make for less of a hack than re/setting the length, and also codify it as part of the language.
>>
>> I agree that this would be useful.  Though it would probably be more like:
>>
>> size_t tmp = arr.length; arr.length = nnn; arr.length = tmp;
>>
> 
> Oops, you're right..
> 
> I was curious as to how much initialization cost. I took the code (bottom) and ran it three times: 1) as-is, 2) with the initialization code in gc._d_newarrayi() commented out and 3) with the initialization code in gcx._malloc() also commented out (along with (2))
...
> One thing I noticed is that for most/all of the _d_new* functions, initialization will be done twice, once in the gcx.malloc and again in the _d_new* function. I believe the extra initialization could be removed in most cases (perhaps with an optional parameter to gcx.malloc()?). Maybe also some syntax to support 'void' initializers for heap allocated arrays?

What initialization in gcx.malloc?  The only call to memset I see has a debug flag.  And I believe void initializers already work for arrays.


Sean

October 28, 2006

Re: The purpose of D (GC rant, long)

Posted by Dave
in reply to Sean Kelly

Sean Kelly wrote:
> Dave wrote:
>> Sean Kelly wrote:
>>> Dave wrote:
>>>> Sean Kelly wrote:
>>>>> Now that memory is not freed when string.length is set to zero it's quite possible to avoid most reallocations simply by preallocating in buffers before using them (ie. set length to some large number and then 
>>>>
>>>> I did not know that had been changed.. Is that now part of the language 'spec' somewhere as well?
>>>
>>> No.  It was implemented in 170-172 by request from Derek.  I don't know the issue number offhand.
>>>
>>>> I'm betting this has been discussed or at least proposed, but here goes again; let's get an array.reserve at least for native arrays (that could be implemented as {arr.length = nnn; arr.length = 0;}). That way it would make for less of a hack than re/setting the length, and also codify it as part of the language.
>>>
>>> I agree that this would be useful.  Though it would probably be more like:
>>>
>>> size_t tmp = arr.length; arr.length = nnn; arr.length = tmp;
>>>
>>
>> Oops, you're right..
>>
>> I was curious as to how much initialization cost. I took the code (bottom) and ran it three times: 1) as-is, 2) with the initialization code in gc._d_newarrayi() commented out and 3) with the initialization code in gcx._malloc() also commented out (along with (2))
> ...
>> One thing I noticed is that for most/all of the _d_new* functions, initialization will be done twice, once in the gcx.malloc and again in the _d_new* function. I believe the extra initialization could be removed in most cases (perhaps with an optional parameter to gcx.malloc()?). Maybe also some syntax to support 'void' initializers for heap allocated arrays?
> 
> What initialization in gcx.malloc?  The only call to memset I see has a debug flag.  And I believe void initializers already work for arrays.
> 

Line 296:  foreach(inout byte b; cast(byte[])(p + size)[0..binsize[bin] - size]) { b = 0; }

Right above the debug (MEMSTOMP)

Not really 'initialization' as it just clears the unused portion of the 'bin', but it's still overhead that std.c.stdlib.malloc() doesn't have.
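
To make the overhead concrete, here is a small sketch of what that line does, written for a current D compiler. The bin table and the names binFor and clearSlack are hypothetical stand-ins, not the actual gcx code: a request is rounded up to a bin size, and the bytes between the requested size and the end of the bin are zeroed on every allocation.

```d
import core.stdc.stdlib : malloc, free;
import std.stdio : writeln;

// Hypothetical bin sizes, standing in for the allocator's size classes.
immutable size_t[] binSizes = [16, 32, 64, 128, 256, 512, 1024, 2048];

// Smallest bin that fits `size` bytes (this sketch only handles size <= 2048).
size_t binFor(size_t size)
{
    foreach (b; binSizes)
        if (size <= b)
            return b;
    assert(0, "request too large for this sketch");
}

// Zero the bytes between the requested size and the end of the bin.
void clearSlack(void* p, size_t size, size_t binSize)
{
    (cast(ubyte*) p)[size .. binSize] = 0;
}

void main()
{
    enum size_t request = 20;
    immutable bin = binFor(request);

    auto p = cast(ubyte*) malloc(bin);
    if (p is null)
        return;
    scope (exit) free(p);

    p[0 .. bin] = 0xFF;            // simulate stale contents
    clearSlack(p, request, bin);   // the extra work done per allocation
    writeln("bin ", bin, ", cleared ", bin - request, " bytes");
}
```

For a 20-byte request in a 32-byte bin this clears 12 bytes per allocation, which is work that plain malloc does not do.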

The important point is that currently the initialization/clearing in itself takes longer than stdlib.malloc, so no matter how fast the allocator is, initialization will be a bottleneck.

Any ideas on how to optimize that?

> 
> Sean

October 28, 2006

Re: The purpose of D (GC rant, long)

Posted by Sean Kelly
in reply to Dave

Dave wrote:
> Sean Kelly wrote:
>> Dave wrote:
>>> Sean Kelly wrote:
>>>> Dave wrote:
>>>>> Sean Kelly wrote:
>>>>>> Now that memory is not freed when string.length is set to zero it's quite possible to avoid most reallocations simply by preallocating in buffers before using them (ie. set length to some large number and then 
>>>>>
>>>>> I did not know that had been changed.. Is that now part of the language 'spec' somewhere as well?
>>>>
>>>> No.  It was implemented in 170-172 by request from Derek.  I don't know the issue number offhand.
>>>>
>>>>> I'm betting this has been discussed or at least proposed, but here goes again; let's get an array.reserve at least for native arrays (that could be implemented as {arr.length = nnn; arr.length = 0;}). That way it would make for less of a hack than re/setting the length, and also codify it as part of the language.
>>>>
>>>> I agree that this would be useful.  Though it would probably be more like:
>>>>
>>>> size_t tmp = arr.length; arr.length = nnn; arr.length = tmp;
>>>>
>>>
>>> Oops, you're right..
>>>
>>> I was curious as to how much initialization cost. I took the code (bottom) and ran it three times: 1) as-is, 2) with the initialization code in gc._d_newarrayi() commented out and 3) with the initialization code in gcx._malloc() also commented out (along with (2))
>> ...
>>> One thing I noticed is that for most/all of the _d_new* functions, initialization will be done twice, once in the gcx.malloc and again in the _d_new* function. I believe the extra initialization could be removed in most cases (perhaps with an optional parameter to gcx.malloc()?). Maybe also some syntax to support 'void' initializers for heap allocated arrays?
>>
>> What initialization in gcx.malloc?  The only call to memset I see has a debug flag.  And I believe void initializers already work for arrays.
>>
> 
> Line 296:  foreach(inout byte b; cast(byte[])(p + size)[0..binsize[bin] - size]) { b = 0; }
> 
> Right above the debug (MEMSTOMP)

Oops!  Dunno how I missed that.

> Not really 'initialization' as it just clears the unused portion of the 'bin', but it's still overhead that std.c.stdlib.malloc() doesn't have.
> 
> The important point is that currently the initialization/clearing in itself takes longer than stdlib.malloc, so no matter how fast the allocator is, initialization will be a bottleneck.
> 
> Any ideas on how to optimize that?

I'm not sure of the ideal approach for Phobos, but in Ares I have separate malloc and calloc methods exposed.  So I'll likely just change calloc to initialize the entire block instead of just the allocated portion, and remove the spare space initializer from mallocNoSync.


Sean

October 28, 2006

Re: The purpose of D (GC rant, long)

Posted by Sean Kelly
in reply to Sean Kelly

Sean Kelly wrote:
> Dave wrote:
>>
>> Not really 'initialization' as it just clears the unused portion of the 'bin', but it's still overhead that std.c.stdlib.malloc() doesn't have.
>>
>> The important point is that currently the initialization/clearing in itself takes longer than stdlib.malloc, so no matter how fast the allocator is, initialization will be a bottleneck.
>>
>> Any ideas on how to optimize that?
> 
> I'm not sure of the ideal approach for Phobos, but in Ares I have separate malloc and calloc methods exposed.  So I'll likely just change calloc to initialize the entire block instead of just the allocated portion, and remove the spare space initializer from mallocNoSync.

You know, one thing to be said for the current approach is that it will result in fewer memory 'leaks' because unused memory is initialized to a value that is guaranteed not to look like a reference to actual memory.  I may leave things as-is.
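
A toy illustration of that trade-off, assuming a conservative scanner that treats any word whose value falls inside the heap range as a possible reference. The heap bounds, the word values and the name countCandidates are all made up for the sketch.

```d
import std.stdio : writeln;

// Count the words in `block` that a conservative scan would treat as
// possible references, i.e. values inside [heapLo, heapHi).
size_t countCandidates(const(size_t)[] block, size_t heapLo, size_t heapHi)
{
    size_t n;
    foreach (w; block)
        if (w >= heapLo && w < heapHi)
            ++n;
    return n;
}

void main()
{
    enum size_t heapLo = 0x1000, heapHi = 0x2000;   // hypothetical heap range

    // A recycled bin that still holds the previous owner's pointers.
    size_t[4] bin = [0x1234, 0x1800, 0x1ABC, 0x1FF0];

    // The new owner only uses the first two words.
    bin[0] = 1;
    bin[1] = 2;
    writeln(countCandidates(bin[], heapLo, heapHi));   // prints 2: stale slack looks live

    // Clearing the unused tail removes the false references.
    bin[2 .. $] = 0;
    writeln(countCandidates(bin[], heapLo, heapHi));   // prints 0
}
```

The stale words in the unused tail would keep two unrelated blocks alive until the bin is overwritten, which is the kind of 'leak' the clearing avoids.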


Sean

October 30, 2006

Re: The purpose of D (GC rant, long)

Posted by Walter Bright
in reply to Sean Kelly

Sean Kelly wrote:
> Walter Bright wrote:
>> Sean Kelly wrote:
>>>  but D can
>>> really do everything C++ can here, unless you absolutely insist on having user-defined data types that are indistinguishable from built-in types.
>>
>> C++ is not capable of having user-defined types indistinguishable from built-in ones.
> 
> How so?  It can certainly get pretty close, but I'm unaware of the limitations.

Consider std::string. It cannot deal with "string1"+"string2". std::vector<> cannot be statically initialized. There's no way to create user-defined literals. Etc.
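
For contrast, a minimal sketch of how D's built-in types handle the first two cases, assuming a current D compiler. The names joined and table are illustrative only. In C++ the expression "string1"+"string2" is rejected because both literals decay to pointers before std::string is ever involved.

```d
import std.stdio : writeln;

// String literals are built-in arrays, so they concatenate directly,
// here at compile time.
enum joined = "string1" ~ "string2";

// Built-in arrays can be statically initialized.
static immutable int[] table = [1, 2, 3];

void main()
{
    static assert(joined == "string1string2");
    writeln(joined, " ", table);   // prints: string1string2 [1, 2, 3]
}
```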

October 30, 2006

Re: The purpose of D (GC rant, long)

Posted by Walter Bright
in reply to Andrey Khropov

Andrey Khropov wrote:
> http://rsdn.ru/forum/Message.aspx?mid=2126270&only=1
> 
> (it's in Russian but you can easily recognize the numbers)

I tried google's translator on it, but unfortunately Russian is not supported!

November 02, 2006

Re: The purpose of D (GC rant, long)

Posted by Roberto Mariottini
in reply to Walter Bright

Walter Bright wrote:
> Andrey Khropov wrote:
>> http://rsdn.ru/forum/Message.aspx?mid=2126270&only=1
>>
>> (it's in Russian but you can easily recognize the numbers)
> 
> I tried google's translator on it, but unfortunately Russian is not supported!

But it's supported by Altavista:

http://babelfish.altavista.com/

Ciao

P.S.: The Java example should use StringBuffer, not String!
