February 15, 2007
Todor Totev wrote on 2007-02-15:
[snip]
> My personal experience is that our customers don't even use Windows 2000.
> Everyone is using XP for desktops and x64 for servers.
> So what is your opinion? Do you need to support a 9x version of a program
> for a living?

Yes. 9x is still used because the communication software for engineering hardware is still 16-bit (closed source, undocumented protocols ...).

Thomas


February 15, 2007
kris wrote:
> Lionello Lunesu wrote:
>> Walter Bright wrote:
>>
>>> Lionello Lunesu wrote:
>>>
>>>> Both Phobos and Tango pretend utf8 is valid for calling ANSI methods from the Windows API. Obviously, it's not. The correct way is to convert the utf8 string to the code-page expected by the call, or convert it to unicode.
>>>>
>>>> I'd like to suggest the latter. Let's drop the ANSI support for Win32 altogether. Unicode is supported since Windows 95 OSR-2 (if I'm not mistaken) and converting utf8 to ANSI is more expensive than converting utf8 to utf16 (which is what Windows 2000 and up convert to internally anyway). No more "bool UseWFuncs". And converting utf8 to utf16 using MultiByteToWideChar would also take care of the 0-terminator.
>>>
>>>
>>> The "useWfuncs" only happens for Windows 9x (including Me). All Windows 9x systems are 8 bit internally, and even if you use the W interface, they are internally converted to 8 bits anyway.
>>
>>
>> Yes, they will be converted to "8 bits", but not to utf8. They will be converted to whatever code-page the thread's currently using, which is what's supposed to be done. That's my point: both Phobos and Tango pass utf8 to ANSI (..A) versions of Windows' functions, which is not correct.
> 
> Regarding Tango, it uses the WindowsA functions only if -version=Win32SansUnicode is configured. This switch is for supporting certain older environments, but does /not/ imply that code-pages are supported in Tango. There has never been an intent to do so.

Then it's not actually supporting those older environments at all.

> For code-page support, we currently suggest using a library such as ICU to do the appropriate conversions.

On Windows, just convert to wchar[] (as you would on W2K and up) and then use WideCharToMultiByte.

L.
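
To make the mismatch concrete, here is a minimal D sketch. The helper names utf8Units and utf16Units are made up for illustration; the only Phobos call assumed is std.utf.toUTF16. It shows that "é" occupies two bytes in a utf8 char[] (0xC3 0xA9), which an A function on a Windows-1252 system would read as the two characters "Ã©", while the utf16 form is a single code unit that the W functions take as-is.

```d
import std.utf : toUTF16;

/// Hypothetical helper: the raw UTF-8 code units of a D string,
/// i.e. exactly the bytes an A function would receive unconverted.
const(ubyte)[] utf8Units(string s)
{
    return cast(const(ubyte)[]) s;
}

/// Hypothetical helper: the same text as UTF-16 code units,
/// the form expected by the W functions.
wstring utf16Units(string s)
{
    return toUTF16(s);
}
```

Calling utf8Units("caf\u00E9") gives five bytes for four characters; utf16Units gives four code units. Only text that stays within 7-bit ASCII has the same bytes in utf8 and in the usual Windows code pages.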
February 15, 2007
Lionello Lunesu wrote:
> kris wrote:
>> Regarding Tango, it uses the WindowsA functions only if -version=Win32SansUnicode is configured. This switch is for supporting certain older environments, but does /not/ imply that code-pages are supported in Tango. There has never been an intent to do so.
> 
> Then it's not actually supporting those older environments at all.

Well, that's support, just not *full* support. Needing to stick to ASCII is still better than no support at all...
February 15, 2007
Lionello Lunesu wrote:

> kris wrote:
>> 
>> Regarding Tango, it uses the WindowsA functions only if -version=Win32SansUnicode is configured. This switch is for supporting certain older environments, but does /not/ imply that code-pages are supported in Tango. There has never been an intent to do so.
> 
> Then it's not actually supporting those older environments at all.

Would depend on the nature of said environments, no?

-- 
Lars Ivar Igesund
blog at http://larsivi.net
DSource & #D: larsivi
Dancing the Tango
February 15, 2007
Walter Bright wrote:
> Lionello Lunesu wrote:
>> Walter Bright wrote:
>>> The "useWfuncs" only happens for Windows 9x (including Me). All Windows 9x systems are 8 bit internally, and even if you use the W interface, they are internally converted to 8 bits anyway.
>>
>> Yes, they will be converted to "8 bits", but not to utf8. They will be converted to whatever code-page the thread's currently using, which is what's supposed to be done. That's my point: both Phobos and Tango pass utf8 to ANSI (..A) versions of Windows' functions, which is not correct.
>>
>> You should either convert the utf8 to the correct code-page for passing to WhatEverA(..),
> 
> It does convert to the correct code-page. See std.windows.charset.toMBSz().

The problem is that this function is not always called. And because, by default, the A-functions are the ones that get aliased to the 'normal form', many times the utf8 char[] is passed as if it were 'ansi'.

A quick grep reveals:

std\loader.d [5]
std\windows\registry.d [35]

I know these are easily solvable, but I was just wondering if it was worth the trouble.

>> or convert it to utf16 and pass it to WhatEverW(..). The last one is much easier: a fixed, straightforward conversion (no need to know about code-pages)
> 
> This just does not work under Win9x, because most of the 'W' functions are not supported. (Also, Win9x internally converts the few 'W' functions it does support right back to 'A'.)

Yes, but it would be done by Windows. Instead of:

if (UseWFuncs)
  WhatEverW( str.toUTF16z );
else
  WhatEverA( str.toMBSz );

You'd do only:

  WhatEverW( str.toUTF16z );

and Windows' unicode layer for Win9x would convert the string back to the proper code-page. Hey, which is exactly what's going on in std.windows.charset! But at least I don't have to worry about "UseWFuncs" in my own code anymore...

>> that also happens to be efficient for Windows 2000 and up.
> 
> Under Windows NT, 2000, and up, the 'W' functions *are* called.

Only if you'd bother to check UseWFuncs. You probably would, but many don't.

>> As for UseWFuncs: I don't like it because the check is done at run-time.
> 
> It has to be done at runtime, because that's the only way to make it work between different Windows versions.

You could provide link-time support only, using version blocks?

>> It's all over the place, practically doubles all Win32 code, not to mention the imports / obj-size. More importantly, for the reasons mentioned above, I don't think it's necessary.
> 
> There's no hope for it unless all support for Win9x is dropped.

See previous question.

L.
February 15, 2007
Lars Ivar Igesund wrote:
> Lionello Lunesu wrote:
> 
>> kris wrote:
>>> Regarding Tango, it uses the WindowsA functions only if
>>> -version=Win32SansUnicode is configured. This switch is for supporting
>>> certain older environments, but does /not/ imply that code-pages are
>>> supported in Tango. There has never been an intent to do so.
>> Then it's not actually supporting those older environments at all.
> 
> Would depend on the nature of said environments, no?

??

That would mean that a char[] in Tango is not always utf8 and could in fact be code-page specific encoding. This is quite nasty for somebody writing library functions in Tango.

L.
February 15, 2007
Frits van Bommel wrote:
> Lionello Lunesu wrote:
>> kris wrote:
>>> Regarding Tango, it uses the WindowsA functions only if -version=Win32SansUnicode is configured. This switch is for supporting certain older environments, but does /not/ imply that code-pages are supported in Tango. There has never been an intent to do so.
>>
>> Then it's not actually supporting those older environments at all.
> 
> Well, that's support, just not *full* support. Needing to stick to ASCII is still better than no support at all...

OK, then all it needs is a "ThrowIfContainsUpperAscii(str);" and we're set :)

L.
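
Such a guard could be sketched roughly as follows. This is only an illustration: throwIfContainsUpperAscii is the hypothetical name from the post above, not an existing Phobos or Tango function, and the error message is invented. It assumes std.exception.enforce from Phobos.

```d
import std.exception : enforce;

/// Hypothetical guard (not part of Phobos or Tango): throws if the
/// string contains any code unit outside 7-bit ASCII, since only
/// ASCII has the same bytes in utf8 and in the ANSI code pages.
void throwIfContainsUpperAscii(const(char)[] str)
{
    foreach (char c; str)   // iterate raw utf8 code units, no decoding
        enforce(c < 0x80,
            "non-ASCII character in string destined for an A function");
}
```

A plain path like "C:\\temp\\file.txt" passes; anything with an accented character throws before it reaches the A function.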
February 15, 2007
Lionello Lunesu wrote:

> Lars Ivar Igesund wrote:
>> Lionello Lunesu wrote:
>> 
>>> kris wrote:
>>>> Regarding Tango, it uses the WindowsA functions only if -version=Win32SansUnicode is configured. This switch is for supporting certain older environments, but does /not/ imply that code-pages are supported in Tango. There has never been an intent to do so.
>>> Then it's not actually supporting those older environments at all.
>> 
>> Would depend on the nature of said environments, no?
> 
> ??
> 
> That would mean that a char[] in Tango is not always utf8 and could in fact be code-page specific encoding. This is quite nasty for somebody writing library functions in Tango.
> 
> L.

No. As Kris mentions, code pages are not currently supported in Tango (they can be supported via the ICU bindings in Mango), but environments where ASCII is the only subset in use (like your typical old PC in the US of A) would be supported by the functionality in question. This is not compiled in by default in Tango, so you use it only if you are aware that you are not using standard, Unicode-compliant Tango.

-- 
Lars Ivar Igesund
blog at http://larsivi.net
DSource & #D: larsivi
Dancing the Tango
February 15, 2007
Lionello Lunesu wrote:
> Walter Bright wrote:
>> It does convert to the correct code-page. See std.windows.charset.toMBSz().
> The problem is that this function is not always called. And because, by default, the A-functions are the ones that get aliased to the 'normal form', many times the utf8 char[] is passed as if it were 'ansi'.
> 
> A quick grep reveals:
> 
> std\loader.d [5]
> std\windows\registry.d [35]

Those would be bugs. All the ones using useWfuncs are correctly done (see std.file).

>> This just does not work under Win9x, because most of the 'W' functions are not supported. (Also, Win9x internally converts the few 'W' functions it does support right back to 'A'.)
> 
> Yes, but it would be done by Windows. Instead of:
> 
> if (UseWFuncs)
>   WhatEverW( str.toUTF16z );
> else
>   WhatEverA( str.toMBSz );
> 
> You'd do only:
> 
>   WhatEverW( str.toUTF16z );
> 
> and Windows' unicode layer for Win9x would convert the string back to the proper code-page. Hey, which is exactly what's going on in std.windows.charset! But at least I don't have to worry about "UseWFuncs" in my own code anymore...

The unicode layer for Windows is not part of Win9x; it's a separate add-on. This means that in order to use a D executable, the user would have to find and install MSLU. This is unacceptable - I don't want to deal with the constant "bug reports" about this.

>>> that also happens to be efficient for Windows 2000 and up.
>>
>> Under Windows NT, 2000, and up, the 'W' functions *are* called.
> 
> Only if you'd bother to check UseWFuncs. You probably would, but many don't.
> 
>>> As for UseWFuncs: I don't like it because the check is done at run-time.
>>
>> It has to be done at runtime, because that's the only way to make it work between different Windows versions.
> 
> You could provide link-time support only, using version blocks?

Then there'd be two Phobos libraries, and the D programmer would have to ship two different executables. This is not worth it.
February 15, 2007
Walter Bright wrote:
> 
> The unicode layer for Windows is not part of Win9x; it's a separate add-on. This means that in order to use a D executable, the user would have to find and install MSLU. This is unacceptable - I don't want to deal with the constant "bug reports" about this.

Supporting 9x in general is a huge pain.  There are a lot of important library features that it doesn't provide.


Sean