August 01, 2005 Implementation of char[] std.string.toString(char) | ||||
---|---|---|---|---|
| ||||
I recently noticed that char[] std.string.toString(char) in Phobos (DMD 0.127) is implemented this way:

# char[] toString(char c)
# {
#     char[] result = new char[2];
#     result[0] = c;
#     result[1] = 0;
#     return result[0 .. 1];
# }

Why is it not simply

# char[] toString(char c)
# {
#     char[] result = new char[1];
#     result[0] = c;
#     return result;
# }

Can anyone shed some light on this?

Thanks in advance,
Stefan
|
August 01, 2005 Re: Implementation of char[] std.string.toString(char) | ||||
---|---|---|---|---|
| ||||
Posted in reply to Stefan | In article <dckpo7$23vs$1@digitaldaemon.com>, Stefan says...
>
>I recently noticed that char[] std.string.toString(char) in
>Phobos (DMD 0.127) is implemented this way:
>
># char[] toString(char c)
># {
>#     char[] result = new char[2];
>#     result[0] = c;
>#     result[1] = 0;
>#     return result[0 .. 1];
># }
>
>
>Why is it not simply
>
># char[] toString(char c)
># {
>#     char[] result = new char[1];
>#     result[0] = c;
>#     return result;
># }
>
>
>Can anyone shed some light on this?
>
>Thanks in advance,
>Stefan
>
>

At first I thought it was because 'char' and 'int' (an int is 4 bytes long in D) are implicitly converted to one another as needed. Below is an example of toString(char) converting both a 'char' and an 'int' without a cast().

# //int2char.d
# private import std.stdio;
#
# char[] toString1(char c)
# {
#     char[] result = new char[2];
#     result[0] = c;
#     result[1] = 0;
#     return result[0 .. 1];
# }
#
# char[] toString2(char c)
# {
#     char[] result = new char[1];
#     result[0] = c;
#     return result;
# }
#
# int main()
# {
#     char c;
#     int i = 67;
#
#     c = i; // no cast() needed
#     writefln("toString1(c)=\"%s\" toString1(i)=\"%s\"",
#              .toString1(c), .toString1(i));
#     writefln("toString2(c)=\"%s\" toString2(i)=\"%s\"",
#              .toString2(c), .toString2(i));
#     return 0;
# }

C:\dmd>dmd int2char.d
C:\dmd\bin\..\..\dm\bin\link.exe int2char,,,user32+kernel32/noi;

C:\dmd>int2char
toString1(c)="C" toString1(i)="C"
toString2(c)="C" toString2(i)="C"

C:\dmd>

But that's clearly not the case...umm...not sure at this point. Sorry I wasn't more helpful on the matter.

David L.

-------------------------------------------------------------------
"Dare to reach for the Stars...Dare to Dream, Build, and Achieve!"
-------------------------------------------------------------------

MKoD: http://spottedtiger.tripod.com/D_Language/D_Main_XP.html
|
August 01, 2005 Re: Implementation of char[] std.string.toString(char) | ||||
---|---|---|---|---|
| ||||
Posted in reply to Stefan | On Mon, 1 Aug 2005 09:24:23 +0000 (UTC), Stefan wrote:
> I recently noticed that char[] std.string.toString(char) in
> Phobos (DMD 0.127) is implemented this way:
>
> # char[] toString(char c)
> # {
> #     char[] result = new char[2];
> #     result[0] = c;
> #     result[1] = 0;
> #     return result[0 .. 1];
> # }
>
>
> Why is it not simply
>
> # char[] toString(char c)
> # {
> #     char[] result = new char[1];
> #     result[0] = c;
> #     return result;
> # }
>
>
> Can anyone shed some light on this?

I believe it's because Walter is trying to be 'C' friendly. The returned 'string' must have a length of 1, because it only holds one char, but it must own a 2-byte memory allocation because the byte after the string must be zero for potential C usage.

Your alternate routine certainly returns a 1-byte string, but the byte after the string is undetermined.

--
Derek Parnell
Melbourne, Australia
1/08/2005 9:47:37 PM
|
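A minimal sketch of the point Derek makes, written in present-day D rather than the 2005 dialect used in the thread (that choice is an assumption, and `toStringSketch` is a hypothetical name, not the Phobos function). It checks that the returned slice has length 1 while the byte just past it, still inside the same allocation, is zero:

```d
import core.stdc.stdio : printf;

// Hypothetical stand-in for the Phobos routine quoted in the thread.
char[] toStringSketch(char c)
{
    char[] result = new char[2]; // room for the char plus a terminator
    result[0] = c;
    result[1] = 0;               // the "hidden" zero
    return result[0 .. 1];       // the slice exposes only the first char
}

void main()
{
    char[] s = toStringSketch('C');
    assert(s.length == 1);

    // Reading one element past the slice is only valid here because we
    // know the underlying allocation is two chars long.
    assert(s.ptr[1] == '\0');

    printf("%s\n", s.ptr);       // C code sees a terminated string
}
```

With the one-byte variant from the original question, the `s.ptr[1]` read would touch memory whose contents nothing in the function has set.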
August 01, 2005 Re: Implementation of char[] std.string.toString(char) | ||||
---|---|---|---|---|
| ||||
Posted in reply to Derek Parnell | In article <gu39ywiarmwp.1vayamiha3tm3.dlg@40tude.net>, Derek Parnell says...
>
>On Mon, 1 Aug 2005 09:24:23 +0000 (UTC), Stefan wrote:
>
>> I recently noticed that char[] std.string.toString(char) in
>> Phobos (DMD 0.127) is implemented this way:
>>
>> # char[] toString(char c)
>> # {
>> #     char[] result = new char[2];
>> #     result[0] = c;
>> #     result[1] = 0;
>> #     return result[0 .. 1];
>> # }
>>
>>
>> Why is it not simply
>>
>> # char[] toString(char c)
>> # {
>> #     char[] result = new char[1];
>> #     result[0] = c;
>> #     return result;
>> # }
>>
>>
>> Can anyone shed some light on this?
>
>I believe it's because Walter is trying to be 'C' friendly. The returned 'string' must have a length of 1, because it only holds one char, but it must own a 2-byte memory allocation because the byte after the string must be zero for potential C usage.

Hmm, I initially thought the same. But as I understand it, there are a lot of toString() routines in there that don't zero-terminate (e.g. char[] toString(uint u)). So, I thought I must have missed something?

Thanks for your reply,
Stefan

>
>Your alternate routine certainly returns a 1-byte string, but the byte after the string is undetermined.
>
>--
>Derek Parnell
>Melbourne, Australia
>1/08/2005 9:47:37 PM
|
August 01, 2005 Re: Implementation of char[] std.string.toString(char) | ||||
---|---|---|---|---|
| ||||
Posted in reply to Stefan | "Stefan" <Stefan_member@pathlink.com> wrote in message news:dcl59f$2jhr$1@digitaldaemon.com...
> In article <gu39ywiarmwp.1vayamiha3tm3.dlg@40tude.net>, Derek Parnell says...
>>
>>On Mon, 1 Aug 2005 09:24:23 +0000 (UTC), Stefan wrote:
>>
>>> I recently noticed that char[] std.string.toString(char) in
>>> Phobos (DMD 0.127) is implemented this way:
>>>
>>> # char[] toString(char c)
>>> # {
>>> #     char[] result = new char[2];
>>> #     result[0] = c;
>>> #     result[1] = 0;
>>> #     return result[0 .. 1];
>>> # }
>>>
>>>
>>> Why is it not simply
>>>
>>> # char[] toString(char c)
>>> # {
>>> #     char[] result = new char[1];
>>> #     result[0] = c;
>>> #     return result;
>>> # }
>>>
>>>
>>> Can anyone shed some light on this?
>>
>>I believe it's because Walter is trying to be 'C' friendly. The returned 'string' must have a length of 1, because it only holds one char, but it must own a 2-byte memory allocation because the byte after the string must be zero for potential C usage.
>
>
> Hmm, I initially thought the same. But as I understand it, there are a
> lot of toString() routines in there that don't zero-terminate (e.g. char[]
> toString(uint u)). So, I thought I must have missed something?

Since the GC allocates in blocks of 16 bytes or more, allocating a single byte will actually allocate 16, so it doesn't hurt space-wise to ask for 2. Other functions probably don't know they'll always fit in one block. Note that different GCs might not behave that way.
|
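One way to look at the block-size argument on a present-day D toolchain is to ask the collector how large the backing block is. This is a sketch under assumptions: it relies on `core.memory.GC.sizeOf` from the current runtime, which did not exist in this form in DMD 0.127, and the numbers it prints depend on the GC implementation in use, as Ben notes.

```d
import core.memory : GC;
import std.stdio : writeln;

void main()
{
    char[] one = new char[1];
    char[] two = new char[2];

    // GC.sizeOf reports the size of the memory block that backs the
    // pointer, which may be larger than the length that was requested.
    writeln("requested 1 char, block size: ", GC.sizeOf(one.ptr));
    writeln("requested 2 chars, block size: ", GC.sizeOf(two.ptr));
}
```

If both lines report the same block size, asking for the second byte cost nothing extra in that run; a different collector may round differently.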
August 01, 2005 Re: Implementation of char[] std.string.toString(char) | ||||
---|---|---|---|---|
| ||||
Posted in reply to Ben Hinkle | In article <dclc4k$2qgv$1@digitaldaemon.com>, Ben Hinkle says...
>
>
>"Stefan" <Stefan_member@pathlink.com> wrote in message news:dcl59f$2jhr$1@digitaldaemon.com...
>> In article <gu39ywiarmwp.1vayamiha3tm3.dlg@40tude.net>, Derek Parnell says...
>>>
>>>On Mon, 1 Aug 2005 09:24:23 +0000 (UTC), Stefan wrote:
>>>
>>>> I recently noticed that char[] std.string.toString(char) in
>>>> Phobos (DMD 0.127) is implemented this way:
>>>>
>>>> # char[] toString(char c)
>>>> # {
>>>> #     char[] result = new char[2];
>>>> #     result[0] = c;
>>>> #     result[1] = 0;
>>>> #     return result[0 .. 1];
>>>> # }
>>>>
>>>>
>>>> Why is it not simply
>>>>
>>>> # char[] toString(char c)
>>>> # {
>>>> #     char[] result = new char[1];
>>>> #     result[0] = c;
>>>> #     return result;
>>>> # }
>>>>
>>>>
>>>> Can anyone shed some light on this?
>>>
>>>I believe it's because Walter is trying to be 'C' friendly. The returned 'string' must have a length of 1, because it only holds one char, but it must own a 2-byte memory allocation because the byte after the string must be zero for potential C usage.
>>
>>
>> Hmm, I initially thought the same. But as I understand it, there are a
>> lot of toString() routines in there that don't zero-terminate (e.g. char[]
>> toString(uint u)). So, I thought I must have missed something?
>
>Since the GC allocates in blocks of 16 bytes or more, allocating a single byte will actually allocate 16, so it doesn't hurt space-wise to ask for 2. Other functions probably don't know they'll always fit in one block. Note that different GCs might not behave that way.

Yes, that might explain it. Thanks a lot.

Best regards,
Stefan
|
August 01, 2005 Re: Implementation of char[] std.string.toString(char) | ||||
---|---|---|---|---|
| ||||
Posted in reply to Derek Parnell | Derek Parnell wrote:
> I believe it's because Walter is trying to be 'C' friendly. The returned
> 'string' must have a length of 1, because it only holds one char, but it
> must own a 2-byte memory allocation because the byte after the string must
> be zero for potential C usage.
Nearly correct. toString() is not required to return something that has the "hidden" zero trailing it, but it's useful when it does. Look at the implementation of toStringz() (convert to zero-terminated string). That will look at the trailing character and see if it just happens to be 0; if so, then it can convert the string without any copying.
Of course, that implementation of toStringz() is controversial, and when you're talking about a string of length 1, the cost of copying is very small. But I suppose that even that small a copy might kick off a GC sweep, so it's probably not a bad idea that it works the way it does.
|
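The "peek at the trailing character" idea Russ describes can be sketched roughly as follows. This is a hypothetical illustration in present-day D, not the Phobos source; `toStringzPeek` is an invented name. The read past the end of the slice is the controversial part, since nothing guarantees that byte belongs to the caller.

```d
// Hypothetical sketch of the "peek" approach; not the Phobos source.
const(char)* toStringzPeek(const(char)[] s)
{
    if (s.length == 0)
        return "".ptr;           // string literals are zero-terminated

    // Peek at the byte just past the slice. Nothing guarantees that
    // this byte is valid to read, which is why the trick is debated.
    const(char)* end = s.ptr + s.length;
    if (*end == 0)
        return s.ptr;            // already terminated, no copy needed

    // Otherwise fall back to an explicit copy with a terminator.
    char[] copy = new char[s.length + 1];
    copy[0 .. s.length] = s[];
    copy[s.length] = 0;
    return copy.ptr;
}

void main()
{
    string lit = "hello";                    // a zero follows the literal
    assert(toStringzPeek(lit) == lit.ptr);   // returned in place

    const(char)[] part = lit[0 .. 3];        // "hel", followed by 'l'
    assert(toStringzPeek(part) != part.ptr); // had to be copied
}
```

A one-char string produced with a zero planted after it would take the first path here, which is the benefit being described for the two-byte allocation.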
August 01, 2005 Re: Implementation of char[] std.string.toString(char) | ||||
---|---|---|---|---|
| ||||
Posted in reply to Russ Lewis | In article <dcldo3$2s6r$1@digitaldaemon.com>, Russ Lewis says...
>
>Derek Parnell wrote:
>> I believe it's because Walter is trying to be 'C' friendly. The returned 'string' must have a length of 1, because it only holds one char, but it must own a 2-byte memory allocation because the byte after the string must be zero for potential C usage.
>
>Nearly correct. toString() is not required to return something that has the "hidden" zero trailing it, but it's useful when it does. Look at the implementation of toStringz() (convert to zero-terminated string). That will look at the trailing character and see if it just happens to be 0; if so, then it can convert the string without any copying.

In my Phobos source (DMD 0.127) that code is commented out. The implementation is essentially:

# char* toStringz(char[] string)
# {
#     char[] copy;
#     if (string.length == 0)
#         return "";
#
#     // Need to make a copy
#     copy = new char[string.length + 1];
#     copy[0..string.length] = string;
#     copy[string.length] = 0;
#     return copy;
# }

Or are we talking about different things here?

Best regards,
Stefan

>Of course, that implementation of toStringz() is controversial, and when you're talking about a string of length 1, the cost of copying is very small. But I suppose that even that small a copy might kick off a GC sweep, so it's probably not a bad idea that it works the way it does.
|
August 01, 2005 Re: Implementation of char[] std.string.toString(char) | ||||
---|---|---|---|---|
| ||||
Posted in reply to Stefan | Stefan wrote:
> In article <dcldo3$2s6r$1@digitaldaemon.com>, Russ Lewis says...
>
>>Derek Parnell wrote:
>>
>>>I believe it's because Walter is trying to be 'C' friendly. The returned
>>>'string' must have a length of 1, because it only holds one char, but it
>>>must own a 2-byte memory allocation because the byte after the string must
>>>be zero for potential C usage.
>>
>>Nearly correct. toString() is not required to return something that has the "hidden" zero trailing it, but it's useful when it does. Look at the implementation of toStringz() (convert to zero-terminated string). That will look at the trailing character and see if it just happens to be 0; if so, then it can convert the string without any copying.
>
>
> In my Phobos source (DMD 0.127) that code is commented out.
It appears you are right; I guess I missed the change. Looks to me like it was commented out in version 0.113. My thought is that, then, the implementation of toString(char) can be simplified. At least, I don't perceive any reason not to...
|