June 07, 2004 Suggestion: char.init, wchar.init and dchar.init | ||||
---|---|---|---|---|
| ||||
Hi,
The default value of NaN for floating point numbers is an excellent idea. I suggest that we do the same thing for chars, wchars and dchars.
The init value for char should (IMO) be 0xFF. Rationale - char by definition contains a UTF-8 fragment. The byte 0xFF will never occur in a valid UTF-8 sequence. It is a clear indication of an unassigned value.
The init value for wchar and dchar should be 0xFFFF (that is, 0x0000FFFF for dchar). Rationale - wchar and dchar by definition contain UTF-16 and UTF-32 (equivalent to plain Unicode within their defined ranges). The codepoint U+FFFF is not a legitimate Unicode character, and, furthermore, it is guaranteed by the Unicode Consortium that 0xFFFF will NEVER be a legitimate Unicode character. This codepoint will remain forever unassigned, precisely so that it may be used for purposes such as this.
Be it noted that the codepoint 0 is a bad choice for a default value. It might have made sense in C, where '\0' has special meaning as a string terminator, but in D '\0' is just another character. Unicode defines '\0' as a control character whose interpretation is implementation dependent. Better, I feel, to use a value with universal meaning.
Jill |
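The proposal above can be sketched in D roughly as follows. This is only an illustration: the constants `proposedCharInit`, `proposedWcharInit` and `proposedDcharInit` and the helpers `canOccurInUtf8` and `isNoncharacter` are hypothetical names invented for this sketch, not part of the language or of Phobos. The only library calls used are `std.utf.validate` and `std.exception.assertThrown`.

```d
import std.exception : assertThrown;
import std.utf : UTFException, validate;

// Hypothetical names for the sentinel values suggested in the post.
enum char  proposedCharInit  = cast(char)  0xFF;
enum wchar proposedWcharInit = cast(wchar) 0xFFFF;
enum dchar proposedDcharInit = cast(dchar) 0x0000FFFF;

/// True if the code unit can appear somewhere in well-formed UTF-8.
/// The values 0xC0, 0xC1 and 0xF5..0xFF never occur in valid UTF-8.
bool canOccurInUtf8(char b)
{
    return b != 0xC0 && b != 0xC1 && b < 0xF5;
}

/// True for the Unicode noncharacters: U+FDD0..U+FDEF and every
/// code point whose low 16 bits are 0xFFFE or 0xFFFF.
bool isNoncharacter(dchar c)
{
    return (c >= 0xFDD0 && c <= 0xFDEF) || (c & 0xFFFE) == 0xFFFE;
}

void main()
{
    // 0xFF is never part of a valid UTF-8 sequence.
    assert(!canOccurInUtf8(proposedCharInit));

    // U+FFFF is a noncharacter; U+0000 is an ordinary control character.
    assert(isNoncharacter(proposedWcharInit));
    assert(isNoncharacter(proposedDcharInit));
    assert(!isNoncharacter('\0'));

    // A string containing the sentinel byte fails UTF-8 validation.
    char[] s = ['a', proposedCharInit, 'b'];
    assertThrown!UTFException(validate(s));
}
```

The design point is that a stray sentinel is detectable: it cannot be mistaken for text, whereas a stray '\0' is a valid character.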
June 07, 2004 Re: Suggestion: char.init, wchar.init and dchar.init | ||||
---|---|---|---|---|
| ||||
Posted in reply to Arcane Jill | Gets my vote! -eye |
June 07, 2004 Re: Suggestion: char.init, wchar.init and dchar.init | ||||
---|---|---|---|---|
| ||||
Posted in reply to Arcane Jill | That's a good idea. |
June 07, 2004 Re: Suggestion: char.init, wchar.init and dchar.init | ||||
---|---|---|---|---|
| ||||
Posted in reply to Arcane Jill | Arcane Jill wrote:
> Hi,
>
> The default value of NaN for floating point numbers is an excellent idea. I
> suggest that we do the same thing for chars, wchars and dchars.
>
> The init value for char should (IMO) be 0xFF. Rationale - char by definition
> contains a UTF-8 fragment. The byte 0xFF will never occur in a valid UTF-8
> sequence. It is a clear indication of an unassigned value.
>
> The init value for wchar and dchar should be 0xFFFF (that is, 0x0000FFFF for
> dchar). Rationale - wchar and dchar by definition contain UTF-16 and UTF-32
> (equivalent to plain Unicode within their defined ranges). The codepoint U+FFFF
> is not a legitimate Unicode character, and, furthermore, it is guaranteed by the
> Unicode Consortium that 0xFFFF will NEVER be a legitimate Unicode character.
> This codepoint will remain forever unassigned, precisely so that it may be used
> for purposes such as this.
>
> Be it noted that the codepoint 0 is a bad choice for a default value. It
> might have made sense in C, where '\0' has special meaning as a string
> terminator, but in D '\0' is just another character. Unicode defines '\0' as a
> control character whose interpretation is implementation dependent. Better, I
> feel, to use a value with universal meaning.
I like the 0 initialization. It is consistent and easy to understand and remember.
And it has an important function. If anyone ever passes an uninitialized D memory block to functions that expect a 0-terminated string then nothing bad will happen.
But then again, I also don't like that floats are initialized to NaN.
If it HAS to be done then there should definitely be an easy-to-remember property for the char types to test for this. Otherwise many programmers will have a hard time remembering which value means "not a char".
Hauke
|
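The zero-termination argument above can be illustrated with a small bounded scan. This is a sketch under stated assumptions: `terminatorIndex` is a hypothetical helper written for this example, and the two buffers merely simulate memory filled with 0 and with 0xFF. A real C function such as `strlen` would not stop at the end of the buffer.

```d
/// Index of the first '\0' in buf, or -1 if there is none within bounds.
/// In the -1 case a C routine expecting a terminator would read past the end.
ptrdiff_t terminatorIndex(const(char)[] buf)
{
    foreach (i, c; buf)
    {
        if (c == '\0')
            return cast(ptrdiff_t) i;
    }
    return -1;
}

void main()
{
    char[8] zeroFilled = '\0';             // simulates a 0 default
    char[8] ffFilled   = cast(char) 0xFF;  // simulates a 0xFF default

    // A zero-filled block looks like an empty C string.
    assert(terminatorIndex(zeroFilled[]) == 0);

    // A 0xFF-filled block contains no terminator at all.
    assert(terminatorIndex(ffFilled[]) == -1);
}
```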
June 07, 2004 Re: Suggestion: char.init, wchar.init and dchar.init | ||||
---|---|---|---|---|
| ||||
Posted in reply to Hauke Duden | > If it HAS to be done then there should definitely be an easy-to-remember property for the char types to test for this. Otherwise many programmers will have a hard time remembering which value means "not a char".
.init?
|
June 07, 2004 Re: Suggestion: char.init, wchar.init and dchar.init | ||||
---|---|---|---|---|
| ||||
Posted in reply to Hauke Duden | In article <ca2754$h5k$1@digitaldaemon.com>, Hauke Duden says...
> If it HAS to be done then there should definitely be an easy-to-remember property for the char types to test for this. Otherwise many programmers will have a hard time remembering which value means "not a char".
You're not supposed to /test/ for uninitialized variables - you're simply supposed to initialize them! And that error, of course, is exactly what we're trying to catch. Anyway, you could always test for "if (c == char.init)" no matter what char.init was.
By the way, I got to look at your Unichar code today. Excellent stuff. It's on my machine now. Also, you were right about doxygen, judging by the quality of your documentation - it really does rock.
Jill |
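The "if (c == char.init)" test mentioned above can be written once for all three character types. This is a minimal sketch; `isInit` is a hypothetical helper name chosen for this example, and it relies only on the built-in `.init` property, so it works whatever value `.init` happens to have.

```d
/// Hypothetical helper: true if value still equals its type's default initializer.
/// Not suitable for floating point, where NaN never compares equal to itself;
/// std.math.isNaN is the appropriate test there.
bool isInit(T)(T value)
{
    return value == T.init;
}

void main()
{
    char c;   // default-initialized, never assigned
    wchar w;
    dchar d;

    assert(c == char.init);
    assert(isInit(c) && isInit(w) && isInit(d));

    c = 'x';  // once assigned, the test no longer matches
    assert(!isInit(c));
}
```

As the post notes, such a test is a debugging aid at most; the intended fix for an unset variable is to initialize it.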