November 25, 2005
Currently, D accepts the assignment of wchar literals to chars, even in
certain dubious cases. In these examples:

  char ch1 = '\u0100'; // Error: cannot convert ... type wchar to char
  char ch2 = '\u0044'; // No Error, ok.
  char ch3 = '\u00E7'; // No Error, but should it be ok?
  writefln(ch3); // prints: Error: 4invalid UTF-8 sequence

shouldn't the third case also be an error? Although the code point
itself (0xE7) fits within the range of a char, in UTF-8 it is encoded
with two code units (0xC3 0xA7), and thus it cannot be stored in a
single char.
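The claim can be checked outside D; here is a minimal Python sketch of the UTF-8 encodings of the three literals above (Python stands in only because the encoding rule itself is language-independent):

```python
# UTF-8 code units for the three code points in the D examples.
# U+0044 fits in one code unit; U+00E7 and U+0100 need two each,
# so neither can be stored in a single (8-bit) char.
for cp in (0x0044, 0x00E7, 0x0100):
    units = chr(cp).encode("utf-8")
    print(f"U+{cp:04X} -> {units.hex(' ')} ({len(units)} code unit(s))")
```

Running this shows U+00E7 encoding to the two code units 0xC3 0xA7, just like U+0100, even though its code point value alone would fit in a char.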

-- 
Bruno Medeiros - CS/E student
"Certain aspects of D are a pathway to many abilities some consider to
be... unnatural."

November 25, 2005
Bruno Medeiros wrote:
> Currently, D accepts the assignment of wchar literals to chars, even in
> certain dubious cases. In these examples:
> 
>   char ch1 = '\u0100'; // Error: cannot convert ... type wchar to char
>   char ch2 = '\u0044'; // No Error, ok.
>   char ch3 = '\u00E7'; // No Error, but should it be ok?
>   writefln(ch3); // prints: Error: 4invalid UTF-8 sequence
> 
> shouldn't the third case also be an error? Although the code point
> itself (0xE7) fits within the range of a char, in UTF-8 it is encoded
> with two code units (0xC3 0xA7), and thus it cannot be stored in a
> single char.

Yes. Sort of.

Check the String Unified Theory thread for more info. It'll be fixed. (But it's not as easy as just making that an error; the fix has to be, and will be, part of a larger rework around chars, strings, and such.)