Thread overview
convert ANSI to UTF-8
Jul 07, 2006
gertje
Jul 07, 2006
Oskar Linde
Jul 07, 2006
Lionello Lunesu
Jul 07, 2006
Sean Kelly
July 07, 2006
Hello,

Does anybody have, or know how to write, a function to convert an ANSI string to a UTF-8 string? I am not using Windows, so I cannot rely on the functions in std.windows.charset, since they use the MultiByteToWideChar function from the Windows API...

Geert


July 07, 2006
gertje@gertje.org wrote:
> Hello,
> 
> Does anybody have or know how to write a function to convert an ANSI string to a
> UTF-8 string? I am not using windows, so I cannot rely on the functions in
> std.windows.charset, since they use the MultiByteToWideChar function from the
> windows API...

I don't know what you mean by an ANSI string. ASCII is a subset of UTF-8, so an ASCII string needs no conversion. If your source string is in ISO 8859-1 (aka Latin-1), which is common in western countries, each byte value equals the corresponding Unicode code point, so you can convert directly using std.utf.encode. If the encoding is different, you need to supply the mapping to Unicode code points yourself (which shouldn't be too hard).
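To illustrate the Latin-1 case, here is a minimal sketch in C of what the per-character encoding step amounts to. It is not the std.utf.encode implementation; the function name latin1_to_utf8 and its buffer contract are made up for this example, and it assumes the input really is ISO 8859-1.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: convert `len` bytes of ISO 8859-1 (Latin-1) text
 * to UTF-8. `out` must have room for 2 * len + 1 bytes (worst case plus
 * the terminating NUL). Returns the number of UTF-8 bytes written, not
 * counting the NUL. */
size_t latin1_to_utf8(const unsigned char *in, size_t len, unsigned char *out)
{
    assert(in != NULL && out != NULL);

    size_t n = 0;
    for (size_t i = 0; i < len; i++) {
        unsigned char c = in[i];
        if (c < 0x80) {
            /* ASCII range: identical in UTF-8, copy unchanged. */
            out[n++] = c;
        } else {
            /* U+0080..U+00FF: two-byte sequence 110000xx 10xxxxxx. */
            out[n++] = (unsigned char)(0xC0 | (c >> 6));
            out[n++] = (unsigned char)(0x80 | (c & 0x3F));
        }
    }
    out[n] = '\0';
    return n;
}
```

For example, the four Latin-1 bytes 'c', 'a', 'f', 0xE9 become the five UTF-8 bytes 'c', 'a', 'f', 0xC3, 0xA9. For any encoding other than Latin-1, the byte value would first have to go through a lookup table of code points.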

/Oskar
July 07, 2006
gertje@gertje.org wrote:
> Hello,
> 
> Does anybody have or know how to write a function to convert an ANSI string to a
> UTF-8 string? I am not using windows, so I cannot rely on the functions in
> std.windows.charset, since they use the MultiByteToWideChar function from the
> windows API...

I suppose you need something like libiconv, http://www.gnu.org/software/libiconv/

It has mapping tables for a lot of encodings. I have never had the 'pleasure' of working with libiconv myself, but I see it's used in many projects.
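As a rough sketch of how the iconv interface is typically used from C (callable from D through extern(C) declarations): the wrapper name to_utf8 and its buffer contract are made up for this example, and it assumes the source encoding name is known to the local iconv. On Windows-style "ANSI" data from western locales, that name would often be "CP1252" rather than "ISO-8859-1", but that depends on where the data came from.

```c
#include <assert.h>
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical wrapper: convert `inlen` bytes from the encoding named
 * `from` to UTF-8. Writes at most outcap - 1 bytes plus a NUL to `out`.
 * Returns the number of bytes written, or (size_t)-1 on failure
 * (unknown encoding, invalid input, or output buffer too small). */
size_t to_utf8(const char *from, const char *in, size_t inlen,
               char *out, size_t outcap)
{
    assert(from != NULL && in != NULL && out != NULL && outcap > 0);

    /* Note the argument order: target encoding first, source second. */
    iconv_t cd = iconv_open("UTF-8", from);
    if (cd == (iconv_t)-1)
        return (size_t)-1;

    /* iconv takes char **, but it does not modify the input bytes. */
    char *src = (char *)in;
    char *dst = out;
    size_t srcleft = inlen;
    size_t dstleft = outcap - 1;   /* keep one byte for the NUL */

    size_t rc = iconv(cd, &src, &srcleft, &dst, &dstleft);
    iconv_close(cd);
    if (rc == (size_t)-1)
        return (size_t)-1;

    *dst = '\0';
    return (size_t)(dst - out);
}
```

A call such as to_utf8("ISO-8859-1", "caf\xE9", 4, buf, sizeof buf) should fill buf with the UTF-8 form of the string. On glibc and musl, iconv is part of the C library; on some other systems you need to link against libiconv explicitly.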

L.
July 07, 2006
gertje@gertje.org wrote:
> Hello,
> 
> Does anybody have or know how to write a function to convert an ANSI string to a
> UTF-8 string? I am not using windows, so I cannot rely on the functions in
> std.windows.charset, since they use the MultiByteToWideChar function from the
> windows API...

You might want to look at 'mbsrtowcs', which is a standard C function. It's supposed to be in wchar.h, but as wchar is a keyword in D, I've placed it in string.d instead:

http://svn.dsource.org/projects/ares/trunk/src/ares/std/c/string.d
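For reference, a minimal C sketch of calling mbsrtowcs. The wrapper name mb_to_wide is made up for this example. Note that mbsrtowcs interprets the input according to the current locale's LC_CTYPE setting, so the program would need to call setlocale with a locale whose encoding matches the input, and the result is a wchar_t string, not UTF-8. On many Unix systems wchar_t holds Unicode code points, but the C standard does not guarantee that, and a further encoding step to UTF-8 would still be needed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical wrapper: convert a NUL-terminated multibyte string,
 * interpreted in the current locale's encoding, to wide characters.
 * Writes at most cap - 1 wide characters plus a terminating L'\0'.
 * Returns the number of wide characters written, or (size_t)-1 if the
 * input contains a byte sequence that is invalid in the current locale. */
size_t mb_to_wide(const char *in, wchar_t *out, size_t cap)
{
    assert(in != NULL && out != NULL && cap > 0);

    mbstate_t state;
    memset(&state, 0, sizeof state);   /* start in the initial shift state */

    const char *src = in;
    size_t n = mbsrtowcs(out, &src, cap - 1, &state);
    if (n == (size_t)-1)
        return (size_t)-1;

    out[n] = L'\0';                    /* terminate even if output was truncated */
    return n;
}
```

In the default "C" locale only ASCII input is reliably accepted, so the locale setup matters as much as the call itself.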


Sean