Weird question related to Chinese characters
March 22, 2020
I am new to D and I like it :)
I am on Windows 10 and use Windows Terminal Preview to test and run programs.

In order to print Chinese characters correctly, I always use

import std.stdio;

void main()
{
    string var1 = "你好"; // or to!string(...) in other situations
    writeln(var1);
}

I tried dstring but it did not work. Need help here.

---------------------------------
The weird thing is that the above approach sometimes does not work: corrupted characters show up.

However, if I first run a small program written in Nim (it just prints hello world) in the same terminal and then run the same D program, the Chinese characters are printed correctly.

Why does that happen?
March 22, 2020
On 3/22/20 11:19 AM, walker wrote:
> I am new to D and I like it :)
> I am on Windows 10 and use Windows Terminal Preview to test and run programs.
> 
> In order to print Chinese characters correctly, I always use
> 
> import std.stdio;
> 
> void main()
> {
>     string var1 = "你好"; // or to!string(...) in other situations
>     writeln(var1);
> }
> 
> I tried dstring but it did not work. Need help here.
> 
> ---------------------------------
> The weird thing is that the above approach sometimes does not work: corrupted characters show up.
> 
> However, if I first run a small program written in Nim (it just prints hello world) in the same terminal and then run the same D program, the Chinese characters are printed correctly.
> 
> Why does that happen?

On Windows, the default code page for the terminal is NOT UTF-8, which means it may not know how to properly deal with your output.

Most likely Nim is making the terminal use the UTF-8 code page.

See more info here: https://superuser.com/questions/269818/change-default-code-page-of-windows-console-to-utf-8
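
To see which code page the console is actually using, a small check along these lines may help. It is only a sketch: `isUtf8CodePage` is a made-up helper name, `GetConsoleOutputCP` is the Win32 call exposed through core.sys.windows.windows, and 65001 is the code page number for UTF-8.

```d
import std.stdio : writeln;

/// Hypothetical helper: true when the given code page number is UTF-8.
bool isUtf8CodePage(uint cp)
{
    return cp == 65001; // 65001 is the Windows code page number for UTF-8
}

void main()
{
    version (Windows)
    {
        import core.sys.windows.windows;

        // Ask the console which code page it uses to interpret output bytes.
        immutable cp = GetConsoleOutputCP();
        writeln("console output code page: ", cp);
        writeln(isUtf8CodePage(cp) ? "UTF-8 output should display correctly"
                                   : "UTF-8 output will likely be garbled");
    }
    else
    {
        writeln("not Windows: no console code page to query");
    }
}
```

If this reports a non-UTF-8 code page in a fresh terminal and 65001 after the Nim program has run, that would fit the explanation above.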

-Steve
March 22, 2020
On Sunday, 22 March 2020 at 15:19:13 UTC, walker wrote:
> writeln(var1);

writeln calls the wrong function for the Windows console.

You can kinda hack it by changing the code page like Steven said (which has other bugs though, but works for many cases), or you can call the correct function - WriteConsoleW - yourself instead.

https://docs.microsoft.com/en-us/windows/console/writeconsole

---
import core.sys.windows.windows;

void main()
{
    // sample text from the question; note it is a wstring, not a string
    wstring s = "你好\n"w;
    WriteConsoleW(GetStdHandle(STD_OUTPUT_HANDLE), s.ptr, cast(DWORD) s.length, null, null);
}
---

My terminal.d library also calls that function, but it isn't likely to work 100% with Chinese characters either, due to other bugs. Still, you can try it if you like.

https://github.com/adamdruppe/arsd/blob/master/terminal.d

---
import arsd.terminal;

void main()
{
Terminal terminal = Terminal(ConsoleOutputType.linear);
string var1 = "你好"; // I might have broken this in copy/paste
terminal.writeln(var1); // could work.
}
---

That should work if you have the fonts set up already, but it might have other bugs, so I kinda suggest just writing your own write function.
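
A "write your own" function could look roughly like the sketch below. `consoleWrite` is a hypothetical name, not something from Phobos or arsd. The assumption is that the text starts out as UTF-8 in a `string`, gets converted with `std.utf.toUTF16`, and goes through `WriteConsoleW` only when stdout really is a console; when output is redirected to a file or pipe (or the program is not on Windows) it falls back to the ordinary `write`.

```d
import std.stdio : stdout, write;
import std.utf : toUTF16;

/// Hypothetical helper: print UTF-8 text, using WriteConsoleW on a real Windows console.
void consoleWrite(string text)
{
    version (Windows)
    {
        import core.sys.windows.windows;

        HANDLE h = GetStdHandle(STD_OUTPUT_HANDLE);
        DWORD mode;
        // GetConsoleMode succeeds only for a console handle, not for a file or pipe.
        if (GetConsoleMode(h, &mode))
        {
            stdout.flush(); // keep ordering with anything writeln buffered earlier
            wstring w = toUTF16(text);
            DWORD written;
            // The length is counted in UTF-16 code units (wchar), not bytes.
            WriteConsoleW(h, w.ptr, cast(DWORD) w.length, &written, null);
            return;
        }
    }
    write(text); // redirected output or non-Windows: plain UTF-8 bytes
}

void main()
{
    consoleWrite("你好\n");
}
```

The console check matters because `WriteConsoleW` fails on redirected handles, so `program > out.txt` would otherwise lose the text.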

readln in the stdlib is similarly broken btw.
March 23, 2020
On Sunday, 22 March 2020 at 15:53:29 UTC, Steven Schveighoffer wrote:
> On Windows, the default code page for the terminal is NOT UTF-8, which means it may not know how to properly deal with your output.
>
> Most likely Nim is making the terminal use the UTF-8 code page.
>
> See more info here: https://superuser.com/questions/269818/change-default-code-page-of-windows-console-to-utf-8
>
> -Steve

Thank you for your explanation :)
March 23, 2020
On Sunday, 22 March 2020 at 16:12:34 UTC, Adam D. Ruppe wrote:

Thank you!

Your arsd.terminal is the first third-party module I have used; I use it to output colored text in Windows Terminal Preview on Windows 10.

I have tried your code above with both string and dstring, and it works! Really good!

> readln in the stdlib is similarly broken btw.

Which function should I use when I read Chinese characters in the terminal?



March 23, 2020
On Monday, 23 March 2020 at 01:18:15 UTC, walker wrote:
> Which function should I use when I read Chinese characters in the terminal?

Terminal.getline *might* work in my lib, but if there's combining codepoints I'm not sure. You can try it though and let me know if you are already using the lib.

There's also ReadConsole in core.sys.windows.windows, which isn't a bad experience (just Windows-only). It works with UTF-16 text (`wchar`, as in `wstring`) just like WriteConsole and is the native function with Unicode support. The operating system also manages some editing for the user, cursor placement, etc., so it is more likely to work well with edge cases than my library's custom implementation of all that.

https://docs.microsoft.com/en-us/windows/console/readconsole
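
As a rough sketch of the ReadConsole route: `consoleReadLine` below is a hypothetical helper name, and the 1024-unit buffer is an arbitrary assumption (a longer line would be cut off at that point). It reads UTF-16 from the console with `ReadConsoleW`, converts to UTF-8 with `std.utf.toUTF8`, and falls back to `readln` when stdin is not a console or the program is not on Windows.

```d
import std.stdio : readln, stdout, write, writeln;
import std.string : chomp;
import std.utf : toUTF8;

/// Hypothetical helper: read one line of input and return it as UTF-8.
string consoleReadLine()
{
    version (Windows)
    {
        import core.sys.windows.windows;

        HANDLE h = GetStdHandle(STD_INPUT_HANDLE);
        DWORD mode;
        // Only a real console handle supports ReadConsoleW.
        if (GetConsoleMode(h, &mode))
        {
            wchar[1024] buf; // assumption: one line fits in 1024 UTF-16 code units
            DWORD got;       // number of wchars actually read
            if (ReadConsoleW(h, buf.ptr, cast(DWORD) buf.length, &got, null))
                return toUTF8(buf[0 .. got]).chomp; // drop the trailing "\r\n"
            return null;
        }
    }
    return readln().chomp; // redirected input or non-Windows
}

void main()
{
    write("> ");
    stdout.flush();
    string line = consoleReadLine();
    writeln("UTF-8 bytes read: ", line.length);
}
```

Printing only the byte count keeps the demo independent of whether output is set up correctly; each of the two characters in 你好 takes three bytes in UTF-8.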
March 23, 2020
On Monday, 23 March 2020 at 01:40:16 UTC, Adam D. Ruppe wrote:

> Terminal.getline *might* work in my lib, but if there's combining codepoints I'm not sure. You can try it though and let me know if you are already using the lib.
>

I have done a small test and it works. Thank you!


March 29, 2020
import std.stdio;

void main()
{
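    version (Windows) {
        // Run as-is, Chinese text shows up garbled, because the Windows console's
        // default code page is 936 while D outputs UTF-8.
        // The console code page can be changed to UTF-8; the command is "CHCP 65001".
        // After the change, Chinese is displayed correctly.
        import core.sys.windows.windows;
        SetConsoleCP(65001);
        SetConsoleOutputCP(65001);
    }

    writeln("Hello World! 你好,中国!");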
}

March 31, 2020
On Sunday, 29 March 2020 at 10:36:53 UTC, lovemini wrote:
> import std.stdio;
>
> void main()
> {
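>     version (Windows) {
>         // Run as-is, Chinese text shows up garbled, because the Windows console's
>         // default code page is 936 while D outputs UTF-8.
>         // The console code page can be changed to UTF-8; the command is "CHCP 65001".
>         // After the change, Chinese is displayed correctly.
>         import core.sys.windows.windows;
>         SetConsoleCP(65001);
>         SetConsoleOutputCP(65001);
>     }
>
>     writeln("Hello World! 你好,中国!");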
> }

Thank you, Chinese output works now, but Chinese input doesn't. How can I change the input behavior?