Thread overview
Writing unicode strings to the console
Dec 18, 2012
Jeremy DeHaan
Dec 18, 2012
Adam D. Ruppe
Dec 18, 2012
bearophile
Dec 18, 2012
H. S. Teoh
Dec 18, 2012
Jeremy DeHaan
Dec 18, 2012
Adam D. Ruppe
Dec 19, 2012
Sam Hu
Dec 19, 2012
monarch_dodra
December 18, 2012
I was playing with unicode strings the other day, and have been searching for a way to correctly write unicode to the console.

If I try something like:

dstring String = "さいごの果実";
		
writeln(String);

All I get is a bunch of nonsense as if it converts the dstring into a regular string. Is it possible to write the unicode string to the console correctly?
December 18, 2012
I suggest you use string instead of dstring, because utf-8 (string) has better output support than utf-32 (dstring), and both support the complete unicode character set.

If string doesn't work, the question is: Windows or Linux?

On Windows, the api call SetConsoleOutputCP will help

http://msdn.microsoft.com/en-us/library/windows/desktop/ms686036%28v=vs.85%29.aspx

The magic number for UTF-8 is 65001. (see here: http://msdn.microsoft.com/en-us/library/dd317756%28v=VS.85%29.aspx )

The link says utf32 is only available to managed applications, so you probably want to use utf-8.



If you are on Linux, you need a terminal that supports UTF-8. Writing "\033%G" to an xterm will switch it to UTF-8, but that is the default most of the time, so you'll probably be OK there.


Again though, writing strings is probably going to give better results than dstring on either OS with any set of options.
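A minimal sketch of the string-instead-of-dstring suggestion, assuming the text already lives in a dstring: std.conv.to re-encodes it as UTF-8 before it goes to writeln. The variable names are only for illustration, and this may not change what a misconfigured console shows; it mainly makes the encoding explicit.

```d
import std.conv : to;
import std.stdio : writeln;

void main()
{
    // UTF-32 text: one code unit per code point.
    dstring wide = "さいごの果実"d;

    // Re-encode as UTF-8; the characters are the same, only the encoding differs.
    string narrow = wide.to!string;

    // Converting back yields the original text, so nothing was lost.
    assert(narrow.to!dstring == wide);

    writeln(narrow);
}
```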
December 18, 2012
Jeremy DeHaan:

> Is it possible to write the unicode string to the console correctly?

What is your operating system?

On oldish Windows you have to set the console to Unicode or nearly Unicode. I don't know about Windows7/8.

Bye,
bearophile
December 18, 2012
On Tue, Dec 18, 2012 at 01:29:55AM +0100, Jeremy DeHaan wrote:
> I was playing with unicode strings the other day, and have been searching for a way to correctly write unicode to the console.
> 
> If I try something like:
> 
> dstring String = "さいごの果実";
> 
> writeln(String);
> 
> All I get is a bunch of nonsense as if it converts the dstring into a regular string. Is it possible to write the unicode string to the console correctly?

It works for me (urxvt on Linux/64bit).

What console are you using? Does your console support Unicode output? Did you set the console's encoding to UTF-8?


T

-- 
One disk to rule them all, One disk to find them. One disk to bring them all and in the darkness grind them. In the Land of Redmond where the shadows lie. -- The Silicon Valley Tarot
December 18, 2012
@Adam D. Ruppe
>I suggest you use string instead of dstring, because utf-8 (string) has better output support than utf-32 (dstring), and both support the complete unicode character set.

Tried string and wstring. Both had the same results as my dstring.

>On Windows, the api call SetConsoleOutputCP will help

>http://msdn.microsoft.com/en-us/library/windows/desktop/ms686036%28v=vs.85%29.aspx

>The magic number for UTF-8 is 65001. (see here: http://msdn.microsoft.com/en-us/library/dd317756%28v=VS.85%29.aspx )


How/Where would I call this?


I am using 64-bit Win7

As for the console, I do my D development with MonoDevelop, so it's the console used in that. I looked, but I didn't see any settings relating to this.


Thanks for your help guys!
December 18, 2012
On Tuesday, 18 December 2012 at 00:59:12 UTC, Jeremy DeHaan wrote:
> How/Where would I call this?

Right at the beginning of your main, but after trying it, I don't think this is going to fix your problem anyway... I think it is fonts. But:

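import std.stdio;

// Win32 console function, declared by hand; returns 0 on failure.
extern(Windows) int SetConsoleOutputCP(uint);

void main() {
    // 65001 is the Windows code page identifier for UTF-8.
    if(SetConsoleOutputCP(65001) == 0)
        throw new Exception("failure");
    string String = "さいごの果“hello”";

    writeln(String);
}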


That is how you'd use it. But I just tried on my Windows 7 computer, and the default was utf8, so the call was unnecessary. You can try though and see what happens.

One problem I did have, though, was that my console font didn't support unicode. If you bring up the properties option on the console window, under the Font tab, you can pick a font.

Mine was set to Raster Fonts, which has no unicode support. It output gibberish.

Lucida Console gave better results - it had the right number of characters and showed the curly quotes, but not the other characters. Could be because I only have the English language pack for Windows installed on my computer.

But anyway I suggest you try playing with the different fonts you have and see what happens.
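If the code page does turn out to matter, one possible variation is to switch it only for the duration of the output and then put the old value back. This is a sketch, not a tested fix for the font problem: withUtf8Console and utf8CodePage are hypothetical names, GetConsoleOutputCP is declared by hand in the same way as SetConsoleOutputCP in the example above, and on non-Windows systems the helper just runs the delegate.

```d
import std.stdio : writeln;

version (Windows)
{
    // Win32 console functions, declared by hand.
    extern (Windows) uint GetConsoleOutputCP();
    extern (Windows) int SetConsoleOutputCP(uint);
}

enum uint utf8CodePage = 65001; // Windows code page identifier for UTF-8

/// Runs dg with the console output code page set to UTF-8,
/// then restores whatever code page was active before.
void withUtf8Console(scope void delegate() dg)
{
    version (Windows)
    {
        immutable uint previous = GetConsoleOutputCP();
        if (SetConsoleOutputCP(utf8CodePage) == 0)
            throw new Exception("SetConsoleOutputCP(65001) failed");
        // Restore the old code page even if dg throws.
        scope (exit) SetConsoleOutputCP(previous);
        dg();
    }
    else
    {
        // Nothing to switch outside Windows.
        dg();
    }
}

void main()
{
    withUtf8Console({ writeln("さいごの果実"); });
}
```

Whether the characters actually render still depends on the console font.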
December 19, 2012
On Tuesday, 18 December 2012 at 00:29:56 UTC, Jeremy DeHaan wrote:
> I was playing with unicode strings the other day, and have been searching for a way to correctly write unicode to the console.
>
> If I try something like:
>
> dstring String = "さいごの果実";
> 		
> writeln(String);
>
> All I get is a bunch of nonsense as if it converts the dstring into a regular string. Is it possible to write the unicode string to the console correctly?

http://forum.dlang.org/thread/suzymdzjeifnfirtbnrc@dfeed.kimsufi.thecybershadow.net#post-suzymdzjeifnfirtbnrc:40dfeed.kimsufi.thecybershadow.net
December 19, 2012
On Tuesday, 18 December 2012 at 00:29:56 UTC, Jeremy DeHaan wrote:
> I was playing with unicode strings the other day, and have been searching for a way to correctly write unicode to the console.
>
> If I try something like:
>
> dstring String = "さいごの果実";
> 		
> writeln(String);
>
> All I get is a bunch of nonsense as if it converts the dstring into a regular string. Is it possible to write the unicode string to the console correctly?

If all else fails, you can always just print to file instead. That's what I do.
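A minimal sketch of the print-to-file fallback, assuming a file in the current directory is acceptable; the file name is made up for illustration. std.file.write stores the string's UTF-8 bytes as they are, so no console code page or font is involved, and the result can be opened in any editor that understands UTF-8.

```d
import std.file : readText, write;

void main()
{
    enum path = "unicode_output.txt"; // hypothetical file name
    string text = "さいごの果実\n";

    // Write the UTF-8 bytes of the string straight to the file.
    write(path, text);

    // readText validates the contents as UTF-8 and returns them as a string.
    assert(readText(path) == text);
}
```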