January 21, 2005 Re: toStringz and predictability
Posted in reply to parabolis

"parabolis" <parabolis@softhome.net> wrote in message news:csmiqa$edp$1@digitaldaemon.com...
> Ben Hinkle wrote:
>> There's something about toStringz that has me uncomfortable. Consider this code:
>
> There is something else that you should be uncomfortable about - the domains of C strings and D strings are not the same. The toStringz function is so named because C strings are 'Z'ero (or null) terminated. That implies they cannot contain a null character yet D strings have no such silly limitations. So the toStringz function should probably look like this:
>
> ----------------------------------------------------------------
> char* toStringz(char[] dStr) {
>     char[] cStr = new char[dStr.length+1];
>     foreach(int i, char dChar; dStr) {
>         if(!(cStr[i] = dChar)) throw new Exception("Null char");
>     }
>     return &cStr;
> ----------------------------------------------------------------
>
> Now seems like a great time for plugging the unless/until feature of Perl as being nice in this context allowing:
>
>     unless(cStr[i] = dChar) throw new Exception("Null char");

Has there been debate about unless/until? If so, count me on the list of 'wanting'. :-)
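
As posted, the quoted function appears not to compile: the closing brace is missing, `&cStr` is the address of the array variable rather than a `char*`, and no terminator is ever written (in D, `char.init` is 0xFF, not 0). A minimal sketch of the same idea in current D might look like the following. The name `toStringzChecked` is hypothetical and is not part of Phobos.

```d
import core.stdc.string : strlen;
import std.exception : assertThrown;

// Hypothetical name; a sketch of the checked conversion proposed above.
char* toStringzChecked(const(char)[] dStr)
{
    auto cStr = new char[dStr.length + 1];
    foreach (i, c; dStr)
    {
        // Reject input that a C string cannot represent.
        if (c == '\0')
            throw new Exception("Null char");
        cStr[i] = c;
    }
    // char.init is 0xFF in D, so the terminator has to be written explicitly.
    cStr[$ - 1] = '\0';
    return cStr.ptr;
}

void main()
{
    // A plain string converts and keeps its length.
    assert(strlen(toStringzChecked("abc")) == 3);
    // An embedded zero is reported instead of silently truncating.
    assertThrown!Exception(toStringzChecked("a\0b"));
}
```

Throwing an exception keeps the check active in release builds, which is a design choice; an `assert` or an `in` contract would restrict it to non-release builds.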
January 21, 2005 Re: toStringz and predictability
Posted in reply to Matthew

Matthew wrote:
> "parabolis" <parabolis@softhome.net> wrote in message news:csmiqa$edp$1@digitaldaemon.com...
>
>>
>>----------------------------------------------------------------
>>char* toStringz(char[] dStr) {
>>    char[] cStr = new char[dStr.length+1];
>>    foreach(int i, char dChar; dStr) {
>>        if(!(cStr[i] = dChar)) throw new Exception("Null char");
>>    }
>>    return &cStr;
>>----------------------------------------------------------------
>>
>>Now seems like a great time for plugging the unless/until feature of Perl as being nice in this context allowing:
>>
>>    unless(cStr[i] = dChar) throw new Exception("Null char");
>
>
> Has there been debate about unless/until? If so, count me on the list of 'wanting'. :-)
>

Yes, back around the time the digitalmars.d newsgroup started:

http://www.digitalmars.com/d/archives/digitalmars/D/1714.html

Walter wrote:
>
>"Brian Hammond" <d at brianhammond dot comBrian_member xx pathlink.com> wrote in message news:c8lmu2$vdm$1 xx digitaldaemon.com...
>> I really like the unless because it reads so well.
>>
>> "do this unless this is true"
>
> That just seems backwards to me <g>. I like things to execute
> forwards, not backwards.

However, Walter's response was long before "is" replaced "===", so I think it at least deserves another consideration, as Perl's unless construct would give us "unless(A is null)" instead of the awkward and much-maligned "if(!(A is null))".
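
For comparison, current D compilers also accept a `!is` operator, which expresses the negated identity test without the extra parentheses. The snippet below is only a small illustration of the two spellings; D still has no `unless` statement.

```d
void main()
{
    Object a = new Object;
    Object b = null;

    // The spelling criticised above.
    assert(!(a is null));
    // The same test written with the !is operator.
    assert(a !is null);
    // A null reference fails the !is test.
    assert(!(b !is null));
}
```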
January 24, 2005 Re: toStringz and predictability
Posted in reply to Ben Hinkle

(Actually, I refer here to several examples in this thread.)

>>>char* toStringzz(char[] str) {
>>>    str.length = str.length+1;
>>>    str[length-1] = 0;
>>>    return str.ptr;
>>>}

What bothers me is, if a string gets repeatedly passed, say, between a library and the main program, and the library functions pass the string on to the OS or another library, every time using toStringz -- then what keeps the string from growing at each iteration? Finally we end up with a (possibly short) string with a lot of zeros at the end.

It seems harmless at first glance, but what if later this kind of strings are concatenated (in D code) and passed on to a C-written parser? It would see a lot of "empty strings" between real data.

Or am I missing something?

In the same manner, should toStringz guarantee a valid C string? I.e. no internal zeros? At the _very least_ in the non-release build!

----

The name toStringz is misleading. Since the only use for it is to make strings edible for C code, it should be renamed toStringC. Normally, if a programmer _wants_ to slap a zero at the end, he'd use ~, wouldn't he.

Misnomers like this introduce parallax, and in this case so subtle that we don't even notice. And that's where it _really_ counts!
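
On the question of growth: a D array parameter is a (pointer, length) pair passed by value, so resizing `str` inside the function changes only the callee's copy of the slice. The sketch below assumes a current D compiler, where `$` replaces the bare `length` inside an index expression. It suggests that the caller's string keeps its length across repeated calls to the `toStringzz` variant quoted above.

```d
import core.stdc.string : strlen;

// The variant quoted in the thread, with $ in place of the bare `length`.
char* toStringzz(char[] str)
{
    str.length = str.length + 1; // resizes only this function's local slice
    str[$ - 1] = 0;
    return str.ptr;
}

void main()
{
    char[] s = "abc".dup;
    foreach (i; 0 .. 3)
    {
        char* p = toStringzz(s);
        // C code sees a three-character string each time.
        assert(strlen(p) == 3);
        // The D-side length never includes the terminator.
        assert(s.length == 3);
    }
}
```

Whether the resize extends the array in place or reallocates is up to the runtime. In both cases the terminator sits outside the caller's slice, so concatenating such strings in D code does not carry the zeros along.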
January 24, 2005 Re: toStringz and predictability
Posted in reply to Georg Wrede

Georg Wrede wrote:
> It seems harmless at first glance, but what if later this kind of strings are concatenated (in D code) and passed on to a C-written parser? It would see a lot of "empty strings" between real data.
>
> Or am I missing something?

It would probably be easier to remove the hack altogether and just copy?

> body
> {
>     if (string.length == 0)
>         return "";
>
>     // Need to make a copy
>     char[] copy = new char[string.length + 1];
>     copy[0..string.length] = string;
>     copy[string.length] = 0;
>     return copy;
> }

Isn't that just what "string.length = string.length + 1" does, anyway?

It would be neat if it could be optimized for string literals, but not at the expense of making the whole function unstable? (like it is now)

> In the same manner, should toStringz guarantee a valid C string? I.e. no internal zeros? At the _very least_ in the non-release build!

The contract for toStringz specifies that the char[] is *without* '\0':

> in
> {
>     if (string)
>     {
>         // No embedded 0's
>         for (uint i = 0; i < string.length; i++)
>             assert(string[i] != 0);
>     }
> }
> out (result)
> {
>     if (result)
>     {   assert(strlen(result) == string.length);
>         assert(memcmp(result, string, string.length) == 0);
>     }
> }

It also (implicitly) returns a "" string, for an input param of null.

> The name toStringz is misleading. Since the only use for it is to make strings edible for C code, it should be renamed toStringC. Normally, if a programmer _wants_ to slap a zero at the end, he'd use ~, wouldn't he.

It converts a char[] to a zero-terminated char*. No "C" about that??
(I'm not sure why it doesn't just 'return (string ~ "\0");', anyone?)

==> body { return ((string.length == 0) ? "" : string ~ "\0"); }

Besides, most of the C functions do not accept UTF-8 input anyway...
To be usable from regular C, it would need to be converted to byte*?
(and that would most likely involve charset encoding conversion too)

--anders
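
The one-line body suggested above can be combined with the quoted contract. The following is a sketch in current D syntax (`do` in place of `body`, a `const(char)[]` parameter) under the hypothetical name `toStringzCopy`; it is not the Phobos implementation. Contracts are compiled out with `-release`, so the embedded-zero check here runs only in non-release builds.

```d
import core.stdc.string : memcmp, strlen;

// Hypothetical name; always copies and never appends in place.
const(char)* toStringzCopy(const(char)[] s)
in
{
    // No embedded 0's
    foreach (c; s)
        assert(c != 0);
}
out (result)
{
    assert(strlen(result) == s.length);
    if (s.length)
        assert(memcmp(result, s.ptr, s.length) == 0);
}
do
{
    // String literals in D are zero-terminated, so "" is a valid C string.
    if (s.length == 0)
        return "".ptr;
    // Concatenation allocates a fresh array: s followed by the terminator.
    return (s ~ '\0').ptr;
}

void main()
{
    assert(strlen(toStringzCopy("hello")) == 5);
    // A null input yields an empty C string rather than a null pointer.
    assert(*toStringzCopy(null) == '\0');
}
```

Copying on every call costs an allocation, but the result no longer depends on what happens to sit after the string in memory.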
Copyright © 1999-2021 by the D Language Foundation