May 16, 2012
MBCS character code support
Hello, everyone!

I would like support for multibyte character sets (MBCS) other than the 
Unicode encodings UTF-8 and UTF-16.

Could the D compiler emit every string literal in Shift_JIS or EUC-JP 
when a specific command-line option is given?

Shift_JIS and EUC-JP are Japanese character encodings.
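
To make the request concrete, here is a small sketch in C (not D) of what differs between the two encodings for one literal. It spells out the two-character text 日本 byte by byte with hex escapes, once as UTF-8 and once as Shift_JIS. The names `nihon_utf8`, `nihon_sjis`, and `hex_dump` are hypothetical and exist only for this illustration.

```c
#include <stdio.h>
#include <string.h>

/* The same two-character text, 日本, written out as raw bytes. */
static const char nihon_utf8[] = "\xe6\x97\xa5\xe6\x9c\xac"; /* UTF-8: 3 bytes per character */
static const char nihon_sjis[] = "\x93\xfa\x96\x7b";         /* Shift_JIS: 2 bytes per character */

/* Hypothetical helper: write the bytes of s into buf as space-separated
   uppercase hex. Stops early if buf cannot hold the next byte. */
static void hex_dump(const char *s, char *buf, size_t buf_size)
{
    size_t used = 0;

    if (buf_size == 0)
        return;
    buf[0] = '\0';

    /* Each step writes at most " XX" plus the terminating NUL: 4 bytes. */
    for (size_t i = 0; s[i] != '\0' && used + 4 <= buf_size; i++) {
        used += (size_t)snprintf(buf + used, buf_size - used, "%s%02X",
                                 i ? " " : "", (unsigned char)s[i]);
    }
}
```

Dumping both arrays with `hex_dump` shows that the text is identical but the bytes are not, which is why a program that hands UTF-8 literals to an API expecting Shift_JIS shows garbled output.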
May 16, 2012
Re: MBCS character code support
You can convert UTF-8 to Shift_JIS with the following code.


/* Linux, FreeBSD or UNIX */
#include <iconv.h>
#include <stdlib.h>   /* malloc, free */
#include <string.h>   /* strlen */
iconv_t g_icUTF8toSJIS;

char *convert_utf8_to_sjis(char *in)
{
    char *out, *p_in, *p_out;
    size_t in_size, out_size;

    in_size = strlen(in);
    out_size = in_size;
    out = (char *)malloc(out_size + 1);
    if (out == NULL)
        return NULL;

    p_in = in;
    p_out = out;
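    if (iconv(g_icUTF8toSJIS, &p_in, &in_size, &p_out, &out_size) == (size_t)-1) {
        free(out);   /* conversion failed, e.g. invalid or unmappable input */
        return NULL;
    }
    *p_out = 0;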

    return out;
}


int main(void)
{
    char *out;
    g_icUTF8toSJIS = iconv_open("SJIS", "UTF-8");   /* iconv_open(tocode, fromcode) */
    if (g_icUTF8toSJIS == (iconv_t)-1) {
        // error
    }
    ...
    out = convert_utf8_to_sjis(...);
    ...
    free(out);
    ...

    iconv_close(g_icUTF8toSJIS);
    return 0;
}

/* Windows */
#include <windows.h>
#include <stdlib.h>   /* malloc, free */

char *UTF8toSJIS(char *utf8)
{
    WCHAR *wide;   /* intermediate UTF-16 buffer */
    char *sjis;
    int size;

    size = MultiByteToWideChar(CP_UTF8, 0, utf8, -1, 0, 0);
    wide = (WCHAR *)malloc((size + 1) * sizeof(WCHAR));
    if (wide == NULL)
        return NULL;
    MultiByteToWideChar(CP_UTF8, 0, utf8, -1, wide, size);

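    /* CP_ACP is the system ANSI code page; it is Shift_JIS (code page 932)
       only on Japanese-locale systems. */
    size = WideCharToMultiByte(CP_ACP, 0, wide, -1, 0, 0, 0, 0);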
    sjis = malloc(size * 2 + 1);
    if (sjis == NULL) {
        free(wide);
        return NULL;
    }

    WideCharToMultiByte(CP_ACP, 0, wide, -1, sjis, size, 0, 0);
    free(wide);

    return sjis;
}

int main(void)
{
    char *out;
    ...
    out = UTF8toSJIS(...);
    ...
    free(out);
    ...
    return 0;
}
May 16, 2012
Re: MBCS character code support
All Japanese and/or other Asian users want native MBCS support.
Please let the D compiler emit string literals in Shift_JIS.
May 16, 2012
Re: MBCS character code support
On 16-05-2012 06:04, Katayama Hirofumi MZ wrote:
> All Japaneses and/or other Asians want native MBCS support.
> Please let the D compiler generate Shift_JIS code for literal strings.

I really do not understand why you want to use Shift-JIS. Unicode has 
long superseded all these magical encodings used all over the world. Why 
oppose a unified encoding?

-- 
- Alex
May 16, 2012
Re: MBCS character code support
On Wednesday, 16 May 2012 at 04:12:04 UTC, Alex Rønne Petersen 
wrote:
> I really do not understand why you want to use Shift-JIS. 
> Unicode has long superseded all these magical encodings used 
> all over the world. Why oppose a unified encoding?

Windows 9x has no Unicode support. It has a native MBCS encoding instead.

So if a D program on Windows can use only UTF-8, the programmer has to 
make the program convert its strings to Shift_JIS.
May 16, 2012
Re: MBCS character code support
On 16-05-2012 06:18, Katayama Hirofumi MZ wrote:
> On Wednesday, 16 May 2012 at 04:12:04 UTC, Alex Rønne Petersen wrote:
>> I really do not understand why you want to use Shift-JIS. Unicode has
>> long superseded all these magical encodings used all over the world.
>> Why oppose a unified encoding?
>
> On Windows 9x, there is no Unicode support. Instead, native MBCS
> encoding exists.
>
> So, if the D Windows program could use UTF-8 only, then the programmer
> should let the D program convert these strings to Shift_JIS.

D does not support Windows versions older than Windows 2000.

-- 
Alex Rønne Petersen
alex@lycus.org
http://lycus.org
May 16, 2012
Re: MBCS character code support
On 5/15/2012 7:12 PM, Katayama Hirofumi MZ wrote:
> Hello, everyone!
> 
> I want multibyte character string (MBCS) support except Unicode, UTF-8 and UTF-16.
> 
> Could you make the D compiler generate Shift_JIS, EUC-JP code for every string 
> literals by a specific command line option?
> 
> Shift_JIS and EUC-JP are Japanese character set.

I'm familiar with Shift-JIS from the C compiler days.

D is designed to internally be all UTF-8, and the runtime code all assumes
UTF-8. I recommend programming in such a way that user input is converted from
Shift-JIS to UTF-8, then all processing is done in terms of UTF-8, then the
output is converted to Shift-JIS.
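
The advice above (decode Shift-JIS at the input boundary, do all work in UTF-8, encode back to Shift-JIS at the output boundary) might be sketched in C with POSIX iconv, the same library used earlier in this thread. This is only an illustration under stated assumptions: `convert_encoding`, `utf8_length`, and `process_sjis` are hypothetical names, the character count stands in for real processing, and the local iconv is assumed to accept the encoding name "SHIFT_JIS".

```c
#include <iconv.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: convert a NUL-terminated string from one encoding
   to another. Returns a malloc'ed string, or NULL on any failure. */
static char *convert_encoding(const char *to, const char *from, const char *in)
{
    iconv_t cd = iconv_open(to, from);    /* target encoding comes first */
    if (cd == (iconv_t)-1)
        return NULL;

    size_t in_left = strlen(in);
    size_t out_left = in_left * 4;        /* generous bound for these encodings */
    char *out = malloc(out_left + 1);
    if (out == NULL) {
        iconv_close(cd);
        return NULL;
    }

    char *p_in = (char *)in;              /* iconv takes char ** for its input */
    char *p_out = out;
    size_t rc = iconv(cd, &p_in, &in_left, &p_out, &out_left);
    if (rc != (size_t)-1)
        rc = iconv(cd, NULL, NULL, &p_out, &out_left);  /* flush any shift state */
    iconv_close(cd);

    if (rc == (size_t)-1) {               /* invalid or unmappable input */
        free(out);
        return NULL;
    }
    *p_out = '\0';
    return out;
}

/* Example of work done purely in UTF-8: count code points by skipping
   continuation bytes (those of the form 10xxxxxx). */
static size_t utf8_length(const char *s)
{
    size_t n = 0;
    for (; *s != '\0'; s++)
        if (((unsigned char)*s & 0xC0) != 0x80)
            n++;
    return n;
}

/* Shift_JIS in, UTF-8 in the middle, Shift_JIS out. */
static char *process_sjis(const char *sjis_in, size_t *chars)
{
    char *utf8 = convert_encoding("UTF-8", "SHIFT_JIS", sjis_in);
    if (utf8 == NULL)
        return NULL;

    *chars = utf8_length(utf8);           /* all processing sees UTF-8 only */

    char *sjis_out = convert_encoding("SHIFT_JIS", "UTF-8", utf8);
    free(utf8);
    return sjis_out;
}
```

The caller owns the returned buffer and must `free` it. Keeping the conversions in one helper means only the two boundary calls know that Shift_JIS exists at all.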
May 16, 2012
Re: MBCS character code support
On Wednesday, 16 May 2012 at 04:19:00 UTC, Katayama Hirofumi MZ 
wrote:
> On Windows 9x, there is no Unicode support.

http://msdn.microsoft.com/en-us/goglobal/bb688166
May 19, 2012
Re: MBCS character code support
16.05.2012 8:26, Alex Rønne Petersen wrote:
> On 16-05-2012 06:18, Katayama Hirofumi MZ wrote:
>> On Wednesday, 16 May 2012 at 04:12:04 UTC, Alex Rønne Petersen wrote:
>>> I really do not understand why you want to use Shift-JIS. Unicode has
>>> long superseded all these magical encodings used all over the world.
>>> Why oppose a unified encoding?
>>
>> On Windows 9x, there is no Unicode support. Instead, native MBCS
>> encoding exists.
>>
>> So, if the D Windows program could use UTF-8 only, then the programmer
>> should let the D program convert these strings to Shift_JIS.
>
> D does not support Windows versions older than Windows 2000.
>

D2 has not supported Windows 2000 for a long time.

http://d.puremagic.com/issues/show_bug.cgi?id=6024
https://github.com/D-Programming-Language/druntime/pull/212

-- 
Денис В. Шеломовский
Denis V. Shelomovskij
May 21, 2012
Re: MBCS character code support
Can D convert strings at compile time?