Thread overview
Conversion of char* to wchar*
Aug 19, 2001
Russ Lewis
Aug 19, 2001
Walter
Aug 20, 2001
Tobias Weingartner
August 19, 2001
What happens in this cast:

char *myString = "asdf";
wchar *myUnicodeString = (wchar*)myString;

If we do a simple pointer conversion, then we have lost the string meaning of the pointer, which, while technically correct, is most likely not what 95% of the code writers would have wanted.

August 19, 2001
Russ Lewis wrote in message <3B7F3BCB.6C07277D@deming-os.org>...
>What happens in this cast:
>
>char *myString = "asdf";
>wchar *myUnicodeString = (wchar*)myString;
>
>If we do a simple pointer conversion, then we have lost the string meaning of the pointer, which, while technically correct is most likely not what 95% of the code writers would have wanted.


It undergoes a simple pointer conversion, and is pretty obviously a coding bug. Doing a cast on a string literal *will* convert the string itself, as:

    char *astring = "asdf";        // an ASCII version of "asdf"
    wchar *wstring = "asdf";    // makes a unicode version of "asdf"

No need to put the L prefix on the string.
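
For readers who want to see the difference between the two behaviours discussed above, here is a minimal sketch in plain C rather than D. It assumes 16-bit wide characters, modelled with uint16_t, and ASCII-only input. The helper names first_unit_reinterpreted and widen_ascii are made up for illustration and are not part of any D or C library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns what a 16-bit reader would see at the start
   of an 8-bit string if the pointer were merely reinterpreted. memcpy is
   used instead of dereferencing a cast pointer, which avoids alignment and
   aliasing problems. The string must hold at least two characters. */
uint16_t first_unit_reinterpreted(const char *s)
{
    uint16_t unit;
    assert(s != NULL && strlen(s) >= 2);
    memcpy(&unit, s, sizeof unit);   /* packs two chars into one unit */
    return unit;
}

/* Hypothetical helper: converts the string itself by widening each ASCII
   character to its own 16-bit unit. dst must have room for
   strlen(src) + 1 elements. Returns the number of characters written,
   not counting the terminator. */
size_t widen_ascii(uint16_t *dst, const char *src)
{
    size_t i = 0;
    assert(dst != NULL && src != NULL);
    for (; src[i] != '\0'; ++i) {
        dst[i] = (uint16_t)(unsigned char)src[i];
    }
    dst[i] = 0;                      /* wide terminator */
    return i;
}
```

With "asdf" as input, the reinterpreted first unit holds both 'a' and 's' (in an order that depends on the machine's byte order), so it never equals 'a'. The widened copy has four units, one per character.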




August 20, 2001
In article <9lnimq$1ht9$1@digitaldaemon.com>, Walter wrote:
> 
> Doing a cast on a string literal *will* convert the string itself, as:
> 
>     char *astring = "asdf";        // an ASCII version of "asdf"
>     wchar *wstring = "asdf";    // makes a unicode version of "asdf"

That begs the question, "What character set is the language written in?"  Will that be configurable?  Will it be ascii?  Etc, etc...

--Toby.
August 20, 2001
Tobias Weingartner wrote:
> That begs the question, "What character set is the language written in?"  Will that be configurable?  Will it be ascii?  Etc, etc...

If you mean "what character set does the language expect source to appear in," that's addressed in http://www.digitalmars.com/d/lex.html :

# "The source file is checked to see if it is in ascii or unicode, and
# the appropriate scanner is loaded ... D source text consists of Unicode
# characters. If the source text consists of ASCII characters, they are
# treated as the first 128 Unicode characters. Multibyte and UTF8
# character sets are not supported, although nothing precludes them
# from being supported."
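
As a rough illustration of the "first 128 Unicode characters" point in the quoted text, the C sketch below checks whether a buffer contains only 7-bit ASCII bytes. This is an assumption about what such a check could look like, not the D compiler's actual detection logic, and the function name is_seven_bit_ascii is invented for the example.

```c
#include <stddef.h>

/* Hypothetical check, not taken from the D compiler: returns 1 if every
   byte in buf is in the 7-bit ASCII range (0..127), 0 otherwise. Bytes in
   that range have the same numeric value as the corresponding Unicode
   code points, so they can be widened without a lookup table. */
int is_seven_bit_ascii(const unsigned char *buf, size_t len)
{
    for (size_t i = 0; i < len; ++i) {
        if (buf[i] > 127) {
            return 0;   /* outside ASCII: some other encoding is in use */
        }
    }
    return 1;
}
```

A buffer that fails this check would fall under the "Multibyte and UTF8" case that the quoted spec says is not supported.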