Thread overview
sysErrorString has problem on multibyte environment
Jul 14, 2005
k2
Jul 14, 2005
Regan Heath
Jul 14, 2005
k2
multibyte, utf, conversions everywhere
Jul 14, 2005
Regan Heath
July 14, 2005
sysErrorString uses the "A" function (FormatMessageA).
If the OS environment is multibyte, the result contains a multibyte string.


July 14, 2005
On Thu, 14 Jul 2005 08:24:36 +0000 (UTC), k2 <k2_member@pathlink.com> wrote:
> sysErrorString uses the "A" function (FormatMessageA).
> If the OS environment is multibyte, the result contains a multibyte string.

This should do the trick, I suspect.

//This code should be considered public domain
wchar[] sysErrorStringW(uint errcode)
{
    wchar[] result;
    wchar* buffer;
    DWORD r;

    r = FormatMessageW(
        FORMAT_MESSAGE_ALLOCATE_BUFFER |
        FORMAT_MESSAGE_FROM_SYSTEM |
        FORMAT_MESSAGE_IGNORE_INSERTS,
        null,
        errcode,
        MAKELANGID(LANG_NEUTRAL, SUBLANG_DEFAULT), // Default language
        cast(LPWSTR)&buffer,
        0,
        null);

    /* Remove \r\n from error string */
    if (r >= 2)
        r -= 2;
    result = buffer[0..r].dup;
    LocalFree(cast(HLOCAL)buffer);
    return result;
}

Regan
July 14, 2005
The following sample code fails with the error "Error: 4invalid UTF-8 sequence" on my PC:

import std.file;
import std.utf;
import std.c.windows.windows;

extern(Windows) export
int MessageBoxW(HWND hWnd, LPCWSTR lpText, LPCWSTR lpCaption, UINT uType);

void main()
{
    try{
        std.file.isdir("notexist");
    }catch(Object o){
        MessageBoxW(null, toUTF16z(o.toString()), null, MB_OK);
    }
}

isdir and similar functions use sysErrorString.
So I think this is better:

char[] sysErrorString(uint errcode)
{
    char[] result;
    wchar* buffer;
    DWORD r;

    r = FormatMessageW(
        FORMAT_MESSAGE_ALLOCATE_BUFFER |
        FORMAT_MESSAGE_FROM_SYSTEM |
        FORMAT_MESSAGE_IGNORE_INSERTS,
        null,
        errcode,
        MAKELANGID(LANG_NEUTRAL, SUBLANG_DEFAULT), // Default language
        cast(LPWSTR)&buffer,
        0,
        null);

    /* Remove \r\n from error string */
    if (r >= 2)
        r -= 2;
    result = std.utf.toUTF8(buffer[0..r]);
    LocalFree(cast(HLOCAL)buffer);
    return result;
}
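
The tail of the function above (drop the trailing line break, then convert UTF-16 to UTF-8) can be tried without any Windows call. This is only a sketch in current D: `cleanMessage` is a hypothetical helper name, not part of Phobos, and the input strings are made-up stand-ins for what FormatMessageW would return. Unlike the unconditional `r -= 2`, it strips CR/LF only when they are actually present.

```d
import std.utf : toUTF8;

/// Hypothetical helper (not in Phobos): strips trailing CR/LF characters
/// from a UTF-16 message and returns the rest as UTF-8.
string cleanMessage(const(wchar)[] msg)
{
    size_t end = msg.length;
    // Walk back over any trailing '\r' or '\n' code units.
    while (end > 0 && (msg[end - 1] == '\r' || msg[end - 1] == '\n'))
        --end;
    return toUTF8(msg[0 .. end]);
}

void main()
{
    // Made-up sample text standing in for a system error message.
    assert(cleanMessage("sample message\r\n"w) == "sample message");
    assert(cleanMessage("no line break"w) == "no line break");
    assert(cleanMessage(""w) == "");
}
```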


July 14, 2005
On Thu, 14 Jul 2005 09:54:46 +0000 (UTC), k2 <k2_member@pathlink.com> wrote:
> The following sample code fails with the error "Error: 4invalid UTF-8 sequence" on my PC:
>
> import std.file;
> import std.utf;
> import std.c.windows.windows;
>
> extern(Windows) export
> int MessageBoxW(HWND hWnd, LPCWSTR lpText, LPCWSTR lpCaption, UINT uType);
>
> void main()
> {
>     try{
>         std.file.isdir("notexist");
>     }catch(Object o){
>         MessageBoxW(null, toUTF16z(o.toString()), null, MB_OK);
>     }
> }
>
> isdir and similar functions use sysErrorString.
> So I think this is better:
>
> char[] sysErrorString(uint errcode)
> {
>     char[] result;
>     wchar* buffer;
>     DWORD r;
>
>     r = FormatMessageW(
>         FORMAT_MESSAGE_ALLOCATE_BUFFER |
>         FORMAT_MESSAGE_FROM_SYSTEM |
>         FORMAT_MESSAGE_IGNORE_INSERTS,
>         null,
>         errcode,
>         MAKELANGID(LANG_NEUTRAL, SUBLANG_DEFAULT), // Default language
>         cast(LPWSTR)&buffer,
>         0,
>         null);
>
>     /* Remove \r\n from error string */
>     if (r >= 2)
>         r -= 2;
>     result = std.utf.toUTF8(buffer[0..r]);
>     LocalFree(cast(HLOCAL)buffer);
>     return result;
> }

Ahh, of course: FormatMessageA returns a multibyte result in the system's ANSI code page, which isn't UTF-8. FormatMessageW returns a Unicode result, i.e. UTF-16.
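
To see why the "A" result trips the UTF-8 check, here is a small sketch in current D. It assumes a Japanese system using Shift-JIS (code page 932) as the example; the byte values are hand-encoded for illustration, not captured from a real FormatMessageA call, and `isValidUtf8` is a hypothetical helper name.

```d
import std.utf : UTFException, validate;

/// Hypothetical helper: true if `bytes` form a well-formed UTF-8 sequence.
bool isValidUtf8(const(ubyte)[] bytes)
{
    try
    {
        // validate() throws UTFException on malformed input.
        validate(cast(const(char)[]) bytes);
        return true;
    }
    catch (UTFException)
    {
        return false;
    }
}

void main()
{
    // The katakana word "エラー" ("error") encoded in Shift-JIS.
    immutable ubyte[] sjis = [0x83, 0x47, 0x83, 0x89, 0x81, 0x5B];
    // The same word encoded in UTF-8.
    immutable ubyte[] utf8 = [0xE3, 0x82, 0xA8, 0xE3, 0x83, 0xA9, 0xE3, 0x83, 0xBC];

    assert(!isValidUtf8(sjis)); // 0x83 cannot start a UTF-8 sequence
    assert(isValidUtf8(utf8));
}
```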

Does it bother anyone else that we do UTF conversions all the time? Here we go from UTF-16 to UTF-8. I noticed in particular doFormat uses a delegate taking a dchar so each and every character formatted gets converted to/from dchar. It just seems inefficient in some way.
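
For a concrete picture of the conversions being discussed, here is a rough sketch in current D using `std.utf`. The sample string is an arbitrary stand-in for text returned by a "W" function, and `codePointCount` is a hypothetical helper; it says nothing about how doFormat itself is implemented.

```d
import std.utf : toUTF8, toUTF16, toUTF32;

/// Hypothetical helper: counts code points by decoding UTF-8 to dchar,
/// one character at a time, as a foreach over dchar does.
size_t codePointCount(const(char)[] s)
{
    size_t n;
    foreach (dchar c; s) // each iteration decodes one code point
        ++n;
    return n;
}

void main()
{
    wstring fromApi = "ファイル"w;      // UTF-16, 4 code units
    string utf8 = toUTF8(fromApi);     // UTF-16 -> UTF-8
    dstring utf32 = toUTF32(utf8);     // UTF-8 -> UTF-32

    assert(fromApi.length == 4);
    assert(utf8.length == 12);         // 3 bytes per character here
    assert(utf32.length == 4);
    assert(codePointCount(utf8) == 4);
    assert(toUTF16(utf8) == fromApi);  // the round trip is lossless
}
```

Each step allocates a new string, which is the cost the paragraph above is worried about.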

Regan