October 27, 2006 Re: The purpose of D (GC rant, long) | ||||
---|---|---|---|---|
| ||||
Posted in reply to Dave | Dave wrote:
> Sean Kelly wrote:
>> Now that memory is not freed when string.length is set to zero it's quite possible to avoid most reallocations simply by preallocating in buffers before using them (ie. set length to some large number and then
>
> I did not know that had been changed.. Is that now part of the language 'spec' somewhere as well?
No. It was implemented in 170-172 by request from Derek. I don't know the issue number offhand.
> I'm betting this has been discussed or at least proposed, but here goes again; let's get an array.reserve at least for native arrays (that could be implemented as {arr.length = nnn; arr.length = 0;}). That way it would make for less of a hack than re/setting the length, and also codify it as part of the language.
I agree that this would be useful. Though it would probably be more like:
size_t tmp = arr.length;
arr.length = nnn;
arr.length = tmp;
Sean
|
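For readers on a current D2 compiler, the `array.reserve` requested above can be sketched with the built-in `reserve` function for dynamic arrays. This is only an illustration under that assumption: the thread itself concerns DMD 0.17x, and the helper name `appendWithoutMoving` is made up for the example.

```d
import std.stdio : writeln;

/// Hypothetical helper: reserves room once, appends `count` ints, and
/// reports whether the buffer stayed at the same address throughout.
/// Expects `count` to be at least 1.
bool appendWithoutMoving(size_t count)
{
    int[] arr;
    arr.reserve(count);        // ask the runtime for capacity up front
    arr ~= 0;                  // first append lands in the reserved block
    const start = arr.ptr;     // remember where the data lives
    foreach (i; 1 .. count)
        arr ~= cast(int) i;    // later appends should fit in place
    return arr.ptr is start && arr.length == count;
}

void main()
{
    writeln(appendWithoutMoving(1000));
}
```

The length of the array stays at zero after `reserve`, so there is no need to save and restore it by hand as in the idiom discussed in the post.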
October 27, 2006 Re: The purpose of D (GC rant, long) | ||||
---|---|---|---|---|
| ||||
Posted in reply to Walter Bright | Walter Bright wrote:
> Sean Kelly wrote:
>> but D can
>> really do everything C++ can here, unless you absolutely insist on having user-defined data types that are indistinguishable from built-in types.
>
> C++ is not capable of having user-defined types indistinguishable from built-in ones.
How so? It can certainly get pretty close, but I'm unaware of the limitations.
Sean
|
October 28, 2006 Re: The purpose of D (GC rant, long) | ||||
---|---|---|---|---|
| ||||
Posted in reply to Sean Kelly | Sean Kelly wrote:
> Dave wrote:
>> Sean Kelly wrote:
>>> Now that memory is not freed when string.length is set to zero it's quite possible to avoid most reallocations simply by preallocating in buffers before using them (ie. set length to some large number and then
>>
>> I did not know that had been changed.. Is that now part of the language 'spec' somewhere as well?
>
> No. It was implemented in 170-172 by request from Derek. I don't know the issue number offhand.
>
>> I'm betting this has been discussed or at least proposed, but here goes again; let's get an array.reserve at least for native arrays (that could be implemented as {arr.length = nnn; arr.length = 0;}). That way it would make for less of a hack than re/setting the length, and also codify it as part of the language.
>
> I agree that this would be useful. Though it would probably be more like:
>
> size_t tmp = arr.length; arr.length = nnn; arr.length = tmp;
>
Oops, you're right..
I was curious as to how much initialization cost. I took the code (bottom) and ran it three times: 1) as-is, 2) with the initialization code in gc._d_newarrayi() commented out and 3) with the initialization code in gcx._malloc() also commented out (along with (2))
1 (DMD v0.172 -O -inline -release, libc v2.4):
new(20): 2.316
malloc(20): 0.83
new(40): 3.029
malloc(40): 0.831
new(60): 2.627
malloc(60): 0.872
new(80): 4.138
malloc(80): 1.754
new(100): 3.968
malloc(100): 1.756
2:
new(20): 1.451
malloc(20): 0.835
new(40): 2.137
malloc(40): 0.838
new(60): 1.765
malloc(60): 0.874
new(80): 3.33
malloc(80): 2.114
new(100): 3.108
malloc(100): 2.164
3:
new(20): 1.133
malloc(20): 0.838
new(40): 1.658
malloc(40): 0.834
new(60): 1.657
malloc(60): 0.981
new(80): 1.871
malloc(80): 1.888
new(100): 1.871
malloc(100): 1.899
The cost of initialization is actually *higher* than the cost of allocation/GC, and for larger arrays the performance is comparable to malloc/free.
One thing I noticed is that for most/all of the _d_new* functions, initialization will be done twice, once in the gcx.malloc and again in the _d_new* function. I believe the extra initialization could be removed in most cases (perhaps with an optional parameter to gcx.malloc()?). Maybe also some syntax to support 'void' initializers for heap allocated arrays?
Those may go a long way toward getting rid of the OP's concerns with the performance of the GC.
//----------------
import std.date, std.c.stdlib, std.stdio, std.gc;
void main()
{
    const iters = 10_000_000;
    // array sizes tested: 20, 40, 60, 80 and 100 chars
    for(int j = 1; j <= 5; j++)
    {
        {
            // time GC allocation (and default initialization) via new
            d_time s = getUTCtime();
            for(int i = 0; i < iters; i++)
            {
                char[] str = new char[j * 20];
            }
            d_time e = getUTCtime();
            writefln("new(",j*20,"): ",(e-s)/cast(float)TicksPerSecond);
        }
        {
            // time the C runtime's malloc/free for the same sizes
            d_time s = getUTCtime();
            for(int i = 0; i < iters; i++)
            {
                char* str = cast(char*)malloc(j * 20 + 1);
                free(str);
            }
            d_time e = getUTCtime();
            writefln("malloc(",j*20,"): ",(e-s)/cast(float)TicksPerSecond);
        }
        fullCollect;
    }
}
|
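The wish for 'void' initializers on heap arrays can be illustrated, on a present-day D2 toolchain, with `std.array.uninitializedArray` from Phobos. This is a sketch under that assumption and is not something available in the DMD 0.172 setup measured above; `makeBuffer` is a hypothetical name.

```d
import std.array : uninitializedArray;
import std.stdio : writeln;

/// Hypothetical helper: allocates `n` chars on the GC heap without
/// writing the default initializer, then fills the buffer itself.
/// Until the fill happens, the contents are unspecified garbage.
char[] makeBuffer(size_t n)
{
    auto buf = uninitializedArray!(char[])(n);
    buf[] = 'x';   // the caller-side fill replaces default initialization
    return buf;
}

void main()
{
    auto buf = makeBuffer(20);
    writeln(buf.length);
}
```

Skipping the initializer only pays off when the caller overwrites every element anyway, which is the case the benchmark loop above is effectively measuring.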
October 28, 2006 Re: The purpose of D (GC rant, long) | ||||
---|---|---|---|---|
| ||||
Posted in reply to Dave | Dave wrote:
> Sean Kelly wrote:
>> Dave wrote:
>>> Sean Kelly wrote:
>>>> Now that memory is not freed when string.length is set to zero it's quite possible to avoid most reallocations simply by preallocating in buffers before using them (ie. set length to some large number and then
>>>
>>> I did not know that had been changed.. Is that now part of the language 'spec' somewhere as well?
>>
>> No. It was implemented in 170-172 by request from Derek. I don't know the issue number offhand.
>>
>>> I'm betting this has been discussed or at least proposed, but here goes again; let's get an array.reserve at least for native arrays (that could be implemented as {arr.length = nnn; arr.length = 0;}). That way it would make for less of a hack than re/setting the length, and also codify it as part of the language.
>>
>> I agree that this would be useful. Though it would probably be more like:
>>
>> size_t tmp = arr.length; arr.length = nnn; arr.length = tmp;
>>
>
> Oops, you're right..
>
> I was curious as to how much initialization cost. I took the code (bottom) and ran it three times: 1) as-is, 2) with the initialization code in gc._d_newarrayi() commented out and 3) with the initialization code in gcx._malloc() also commented out (along with (2))
...
> One thing I noticed is that for most/all of the _d_new* functions, initialization will be done twice, once in the gcx.malloc and again in the _d_new* function. I believe the extra initialization could be removed in most cases (perhaps with an optional parameter to gcx.malloc()?). Maybe also some syntax to support 'void' initializers for heap allocated arrays?
What initialization in gcx.malloc? The only call to memset I see has a debug flag. And I believe void initializers already work for arrays.
Sean
|
October 28, 2006 Re: The purpose of D (GC rant, long) | ||||
---|---|---|---|---|
| ||||
Posted in reply to Sean Kelly | Sean Kelly wrote:
> Dave wrote:
>> Sean Kelly wrote:
>>> Dave wrote:
>>>> Sean Kelly wrote:
>>>>> Now that memory is not freed when string.length is set to zero it's quite possible to avoid most reallocations simply by preallocating in buffers before using them (ie. set length to some large number and then
>>>>
>>>> I did not know that had been changed.. Is that now part of the language 'spec' somewhere as well?
>>>
>>> No. It was implemented in 170-172 by request from Derek. I don't know the issue number offhand.
>>>
>>>> I'm betting this has been discussed or at least proposed, but here goes again; let's get an array.reserve at least for native arrays (that could be implemented as {arr.length = nnn; arr.length = 0;}). That way it would make for less of a hack than re/setting the length, and also codify it as part of the language.
>>>
>>> I agree that this would be useful. Though it would probably be more like:
>>>
>>> size_t tmp = arr.length; arr.length = nnn; arr.length = tmp;
>>>
>>
>> Oops, you're right..
>>
>> I was curious as to how much initialization cost. I took the code (bottom) and ran it three times: 1) as-is, 2) with the initialization code in gc._d_newarrayi() commented out and 3) with the initialization code in gcx._malloc() also commented out (along with (2))
> ...
>> One thing I noticed is that for most/all of the _d_new* functions, initialization will be done twice, once in the gcx.malloc and again in the _d_new* function. I believe the extra initialization could be removed in most cases (perhaps with an optional parameter to gcx.malloc()?). Maybe also some syntax to support 'void' initializers for heap allocated arrays?
>
> What initialization in gcx.malloc? The only call to memset I see has a debug flag. And I believe void initializers already work for arrays.
>
Line 296:
foreach(inout byte b; cast(byte[])(p + size)[0..binsize[bin] - size]) { b = 0; }
Right above the debug (MEMSTOMP)
Not really 'initialization' as it just clears the unused portion of the 'bin', but it's still overhead that std.c.stdlib.malloc() doesn't have.
The important point is that currently the initialization/clearing in itself takes longer than stdlib.malloc, so no matter how fast the allocator is, initialization will be a bottleneck.
Any ideas on how to optimize that?
>
> Sean
|
October 28, 2006 Re: The purpose of D (GC rant, long) | ||||
---|---|---|---|---|
| ||||
Posted in reply to Dave | Dave wrote:
> Sean Kelly wrote:
>> Dave wrote:
>>> Sean Kelly wrote:
>>>> Dave wrote:
>>>>> Sean Kelly wrote:
>>>>>> Now that memory is not freed when string.length is set to zero it's quite possible to avoid most reallocations simply by preallocating in buffers before using them (ie. set length to some large number and then
>>>>>
>>>>> I did not know that had been changed.. Is that now part of the language 'spec' somewhere as well?
>>>>
>>>> No. It was implemented in 170-172 by request from Derek. I don't know the issue number offhand.
>>>>
>>>>> I'm betting this has been discussed or at least proposed, but here goes again; let's get an array.reserve at least for native arrays (that could be implemented as {arr.length = nnn; arr.length = 0;}). That way it would make for less of a hack than re/setting the length, and also codify it as part of the language.
>>>>
>>>> I agree that this would be useful. Though it would probably be more like:
>>>>
>>>> size_t tmp = arr.length; arr.length = nnn; arr.length = tmp;
>>>>
>>>
>>> Oops, you're right..
>>>
>>> I was curious as to how much initialization cost. I took the code (bottom) and ran it three times: 1) as-is, 2) with the initialization code in gc._d_newarrayi() commented out and 3) with the initialization code in gcx._malloc() also commented out (along with (2))
>> ...
>>> One thing I noticed is that for most/all of the _d_new* functions, initialization will be done twice, once in the gcx.malloc and again in the _d_new* function. I believe the extra initialization could be removed in most cases (perhaps with an optional parameter to gcx.malloc()?). Maybe also some syntax to support 'void' initializers for heap allocated arrays?
>>
>> What initialization in gcx.malloc? The only call to memset I see has a debug flag. And I believe void initializers already work for arrays.
>>
>
> Line 296: foreach(inout byte b; cast(byte[])(p + size)[0..binsize[bin] - size]) { b = 0; }
>
> Right above the debug (MEMSTOMP)
Oops! Dunno how I missed that.
> Not really 'initialization' as it just clears the unused portion of the 'bin', but it's still overhead that std.c.stdlib.malloc() doesn't have.
>
> The important point is that currently the initialization/clearing in itself takes longer than stdlib.malloc, so no matter how fast the allocator is, initialization will be a bottleneck.
>
> Any ideas on how to optimize that?
I'm not sure of the ideal approach for Phobos, but in Ares I have separate malloc and calloc methods exposed. So I'll likely just change calloc to initialize the entire block instead of just the allocated portion, and remove the spare space initializer from mallocNoSync.
Sean
|
October 28, 2006 Re: The purpose of D (GC rant, long) | ||||
---|---|---|---|---|
| ||||
Posted in reply to Sean Kelly | Sean Kelly wrote:
> Dave wrote:
>>
>> Not really 'initialization' as it just clears the unused portion of the 'bin', but it's still overhead that std.c.stdlib.malloc() doesn't have.
>>
>> The important point is that currently the initialization/clearing in itself takes longer than stdlib.malloc, so no matter how fast the allocator is, initialization will be a bottleneck.
>>
>> Any ideas on how to optimize that?
>
> I'm not sure of the ideal approach for Phobos, but in Ares I have separate malloc and calloc methods exposed. So I'll likely just change calloc to initialize the entire block instead of just the allocated portion, and remove the spare space initializer from mallocNoSync.
You know, one thing to be said for the current approach is that it will result in fewer memory 'leaks' because unused memory is initialized to a value that is guaranteed not to look like a reference to actual memory. I may leave things as-is.
Sean
|
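The trade-off under discussion, clearing the unused tail of a bin so stale bytes cannot be mistaken for pointers, can be sketched roughly as follows. This is not the gcx code: the bin sizes and the names `binSizes`, `binFor` and `clearTail` are hypothetical, and a D2 compiler is assumed.

```d
import std.stdio : writeln;

/// Hypothetical bin sizes, chosen only for illustration.
immutable size_t[] binSizes = [16, 32, 64, 128, 256];

/// Smallest bin that can hold `size` bytes, or 0 when none fits.
size_t binFor(size_t size)
{
    foreach (b; binSizes)
        if (size <= b)
            return b;
    return 0;
}

/// Zeroes the part of a bin-sized block beyond the `used` bytes that
/// were requested, so leftover data there cannot look like a reference.
void clearTail(ubyte[] block, size_t used)
{
    block[used .. $] = 0;
}

void main()
{
    auto block = new ubyte[binFor(20)];
    block[] = 0xAB;            // pretend the bin still holds stale data
    clearTail(block, 20);      // request was 20 bytes, bin is larger
    writeln(block.length, " ", block[19], " ", block[20]);
}
```

The cost is proportional to the slack in the bin rather than to the request, which is why it shows up even when the caller never asked for initialized memory.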
October 30, 2006 Re: The purpose of D (GC rant, long) | ||||
---|---|---|---|---|
| ||||
Posted in reply to Sean Kelly | Sean Kelly wrote:
> Walter Bright wrote:
>> Sean Kelly wrote:
>>> but D can
>>> really do everything C++ can here, unless you absolutely insist on having user-defined data types that are indistinguishable from built-in types.
>>
>> C++ is not capable of having user-defined types indistinguishable from built-in ones.
>
> How so? It can certainly get pretty close, but I'm unaware of the limitations.
Consider std::string. It cannot deal with "string1"+"string2". std::vector<> cannot be statically initialized. There's no way to create user-defined literals. Etc.
|
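For contrast with the C++ cases listed above, here is a small sketch of the same two operations on D's built-in strings and arrays, assuming a D2 compiler. The names `joined` and `primes` are made up for the example.

```d
import std.stdio : writeln;

// Two string literals concatenate directly with the built-in ~ operator,
// here evaluated at compile time.
enum joined = "string1" ~ "string2";

// A built-in array can be statically initialized at module scope.
immutable int[] primes = [2, 3, 5, 7];

void main()
{
    writeln(joined, " ", primes.length);
}
```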
October 30, 2006 Re: The purpose of D (GC rant, long) | ||||
---|---|---|---|---|
| ||||
Posted in reply to Andrey Khropov | Andrey Khropov wrote:
> http://rsdn.ru/forum/Message.aspx?mid=2126270&only=1
>
> (it's in Russian but you can easily recognize the numbers)
I tried Google's translator on it, but unfortunately Russian is not supported!
|
November 02, 2006 Re: The purpose of D (GC rant, long) | ||||
---|---|---|---|---|
| ||||
Posted in reply to Walter Bright | Walter Bright wrote:
> Andrey Khropov wrote:
>> http://rsdn.ru/forum/Message.aspx?mid=2126270&only=1
>>
>> (it's in Russian but you can easily recognize the numbers)
>
> I tried Google's translator on it, but unfortunately Russian is not supported!
But it's supported by Altavista: http://babelfish.altavista.com/
Ciao
P.S.: The Java example should use StringBuffer, not String!
|
Copyright © 1999-2021 by the D Language Foundation