October 04, 2007 Displaying non UTF-8 8 bit character codes with writefln() | ||||
---|---|---|---|---|
| ||||
Is there an easy way of displaying non-UTF-8 8-bit codes with writefln()?

E.g. code like:

    writefln("elapsed time %.9f \&micro;S", elapsed_time);

on a Windows system displays output like:

    elapsed time 2.598202392 µS

(displayed when running in a cmd.exe window). The µ is character codes 0xC2 0xB5 for the UTF-8 encoding of µ.

Code like:

    writefln("elapsed time %.9f \u00B5S", elapsed_time);

displays the same, and code like:

    writefln("elapsed time %.9f \xB5S", elapsed_time);

understandably displays the run-time error:

    Error: 4invalid UTF-8 sequence

Trying a wysiwyg string like:

    writefln("elapsed time %.9f " r"µ" "S", elapsed_time);

displays a compiler error:

    invalid UTF-8 sequence

Is there any simple way to output a non-UTF-8 string containing the B5 character code without the C2 prefix?
|
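The two failure modes described in the question can be reproduced in isolation. The following is a minimal sketch, assuming a current D2 compiler and Phobos rather than the 2007 toolchain used in the thread; the helper names hexBytes and isValidUtf8 are made up for illustration. It shows that U+00B5 is stored as the two bytes C2 B5, and that a lone 0xB5 byte is rejected as malformed UTF-8.

```d
import std.format : format;
import std.stdio : writeln;
import std.string : representation;
import std.utf : UTFException, validate;

/// Returns the code units of `s` as space-separated hex (hypothetical helper).
string hexBytes(string s)
{
    return format("%(%02X %)", s.representation);
}

/// True if `s` is well-formed UTF-8 (hypothetical helper).
bool isValidUtf8(string s)
{
    try
    {
        validate(s); // throws UTFException on a malformed sequence
        return true;
    }
    catch (UTFException)
    {
        return false;
    }
}

void main()
{
    writeln(hexBytes("\u00B5"));     // the micro sign as stored in a D string: C2 B5
    writeln(isValidUtf8("\u00B5S")); // well-formed two-byte sequence followed by 'S'
    writeln(isValidUtf8("\xB5S"));   // 0xB5 alone is a continuation byte with no lead byte
}
```

The \x escape inserts the raw byte into the literal, which is why the compiler accepts it and the problem only surfaces when a formatting routine decodes the string at run time.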
October 04, 2007 Re: Displaying non UTF-8 8 bit character codes with writefln() | ||||
---|---|---|---|---|
| ||||
Posted in reply to Graham | Try printf and saving the file as a UTF-8 encoded text file...

--[b5.d]--
import std.stdio;

void main()
{
    printf("\&micro;\n");
    printf("\u00B5\n");
    printf("\xB5\n"); //doesn't output anything
    writefln("µ");
}

Using this source saved as b5.d as a UTF-8 encoded text file (IMPORTANT), I can set my command prompt font to "Lucida Console" and execute the following commands:

E:\D\src\tmp>chcp 65001
Active code page: 65001

E:\D\src\tmp>dmd -run b5.d
µ
µ
µ

The 3rd printf doesn't output anything (not sure why); the others all output the same character.

chcp 65001 changes to the UTF-8 code page :)

Regan
|
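The chcp step can also be done from inside the program. This is only a sketch, assuming a current D2 compiler with the druntime Windows bindings (core.sys.windows.windows); switchConsoleCodePage is a hypothetical helper, and the console font still has to contain the glyph, as noted above for Lucida Console.

```d
import std.stdio : stdout, writefln;

version (Windows) import core.sys.windows.windows;

/// Switches the console output code page and returns the previous one.
/// Outside Windows this is a no-op that returns 0 (hypothetical helper).
uint switchConsoleCodePage(uint codePage)
{
    version (Windows)
    {
        immutable previous = GetConsoleOutputCP();
        if (codePage != 0)
            SetConsoleOutputCP(codePage);
        return previous;
    }
    else
        return 0;
}

void main()
{
    immutable previous = switchConsoleCodePage(65001); // same effect as "chcp 65001"
    scope (exit) switchConsoleCodePage(previous);       // put the user's setting back

    writefln("elapsed time %.9f \u00B5S", 2.598202392);
    stdout.flush(); // make sure the text is written before the code page is restored
}
```

Restoring the previous code page on exit keeps the change local to this program, so the user's console is left as it was found.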
October 04, 2007 Re: Displaying non UTF-8 8 bit character codes with writefln() | ||||
---|---|---|---|---|
| ||||
Posted in reply to Graham | After searching back a bit further than before, I see this was discussed in April, and the answer was to use printf for the 8-bit string.

Something like:

    writef("elapsed time %.9f", elapsed_time);
    printf(" \xB5S\n");

does work, but if anybody has a more elegant solution please let me know.
|
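One way to avoid mixing writef and printf is to keep the text as UTF-8 inside the program and transcode it to single bytes only at the point of output. The sketch below assumes a current D2 compiler and Phobos; toLatin1 is a hypothetical helper, not a Phobos function. It emits B5 without the C2 prefix. Whether the console then shows µ depends on its code page: 0xB5 is µ in Latin-1 and Windows-1252, whereas in code pages 437 and 850 the µ sits at 0xE6 (decimal 230).

```d
import std.stdio : stdout;

/// Transcodes UTF-8 text to ISO-8859-1 bytes (hypothetical helper).
/// Code points above U+00FF have no Latin-1 equivalent and become '?'.
ubyte[] toLatin1(string text)
{
    ubyte[] bytes;
    foreach (dchar c; text) // iterating by dchar decodes each UTF-8 sequence
        bytes ~= c <= 0xFF ? cast(ubyte) c : cast(ubyte) '?';
    return bytes;
}

void main()
{
    double elapsedTime = 2.598202392; // sample value taken from the question
    stdout.writef("elapsed time %.9f ", elapsedTime);
    stdout.flush();
    stdout.rawWrite(toLatin1("\u00B5S\n")); // raw bytes, no UTF-8 validation
}
```

rawWrite sends the bytes as they are, so the UTF-8 check that rejects "\xB5" in writefln is never involved.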
October 04, 2007 Re: Displaying non UTF-8 8 bit character codes with writefln() | ||||
---|---|---|---|---|
| ||||
Posted in reply to Regan Heath | Regan Heath Wrote:
> Try printf and saving the file as a UTF-8 encoded text file...
>
> --[b5.d]--
> import std.stdio;
>
> void main()
> {
> printf("\&micro;\n");
> printf("\u00B5\n");
> printf("\xB5\n"); //doesn't output anything
> writefln("µ");
> }
>
> Using this source saved as b5.d as a UTF-8 encoded text file (IMPORTANT)
> I can set my command prompt font to "Lucida Console" and execute the
> following commands:
>
> E:\D\src\tmp>chcp 65001
> Active code page: 65001
>
> E:\D\src\tmp>dmd -run b5.d
> µ
> µ
> µ
>
> The 3rd printf doesn't output anything, not sure why, the others all output the same character.
>
> chcp 65001 changes to UTF-8 code page :)
>
> Regan
Thanks, I was hoping for something more elegant, but if all char variables in Phobos have to be UTF-8, I guess this is the only way.
|
October 04, 2007 Re: Displaying non UTF-8 8 bit character codes with writefln() | ||||
---|---|---|---|---|
| ||||
Posted in reply to Graham | Graham wrote:
> After searching back a bit further than before I see this was discussed
> in April and the answer was to use printf for the 8 bit string.
>
> something like:
>
> writef("elapsed time %.9f", elapsed_time);
> printf(" \xB5S\n");
>
> does work, but if anybody has a more elegant solution please let me know.
>
Hi,
There's a better solution. You could switch to the Tango library, which uses WriteConsoleW() internally to correctly write Unicode characters on the Windows console.
Regards,
Aziz
|
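For readers who want to see what the WriteConsoleW() route looks like without switching libraries, here is a rough sketch. It is not Tango's code; it assumes a current D2 compiler with the druntime Windows bindings, and consoleWrite is a hypothetical name. The text is converted to UTF-16 and handed to the console API directly, so the console code page is not involved.

```d
import std.stdio : stdout, write;
import std.utf : toUTF16;

version (Windows) import core.sys.windows.windows;

/// Writes `text` to the console as UTF-16 where possible (hypothetical helper).
/// Falls back to the ordinary stdout path when output is redirected or when
/// not running on Windows.
void consoleWrite(string text)
{
    version (Windows)
    {
        HANDLE console = GetStdHandle(STD_OUTPUT_HANDLE);
        DWORD mode;
        // GetConsoleMode succeeds only for a real console, not a file or pipe.
        if (GetConsoleMode(console, &mode))
        {
            stdout.flush(); // keep ordering with anything already buffered
            wstring wide = toUTF16(text);
            DWORD written;
            WriteConsoleW(console, wide.ptr, cast(DWORD) wide.length, &written, null);
            return;
        }
    }
    write(text);
}

void main()
{
    consoleWrite("elapsed time 2.598202392 \u00B5S\n");
}
```

The GetConsoleMode check matters because WriteConsoleW only works on console handles; redirected output has to go through the normal byte-oriented path.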
October 05, 2007 Re: Displaying non UTF-8 8 bit character codes with writefln() | ||||
---|---|---|---|---|
| ||||
Posted in reply to Regan Heath | "Regan Heath" <regan@netmail.co.nz> wrote in message news:fe2uf5$2gsa$1@digitalmars.com...
> Try printf and saving the file as a UTF-8 encoded text file...

Why, exactly, are you advocating going back to the printf abomination?

<snip>
> Using this source saved as b5.d as a UTF-8 encoded text file (IMPORTANT) I can set my command prompt font to "Lucida Console" and execute the following commands:
>
> E:\D\src\tmp>chcp 65001
> Active code page: 65001
<snip>

This misses the point slightly. The user shouldn't have to change the codepage just to get someone else's application to work properly.

What you want is my utility library:
http://pr.stewartsplace.org.uk/d/sutil/

Stewart.

--
My e-mail address is valid but not my primary mailbox. Please keep replies on the 'group where everybody may benefit.
|
October 05, 2007 Re: Displaying non UTF-8 8 bit character codes with writefln() | ||||
---|---|---|---|---|
| ||||
Posted in reply to Stewart Gordon | Stewart Gordon Wrote:
>
> What you want is my utility library:
> http://pr.stewartsplace.org.uk/d/sutil/
>
> Stewart.
>
> --

Thanks, that's nice.

By the way, I spotted some minor errors on a couple of your documentation pages:

ConsoleOutput referring to ConsoleInput in second column on
http://pr.stewartsplace.org.uk/d/sutil/ref/annotated.html

and the subtitle on
http://pr.stewartsplace.org.uk/d/sutil/ref/classsmjg_1_1libs_1_1util_1_1console_1_1ConsoleOutput.html
is ConsoleInput instead of ConsoleOutput
|
October 05, 2007 Re: Displaying non UTF-8 8 bit character codes with writefln() | ||||
---|---|---|---|---|
| ||||
Posted in reply to Stewart Gordon | Stewart Gordon wrote:
> "Regan Heath" <regan@netmail.co.nz> wrote in message news:fe2uf5$2gsa$1@digitalmars.com...
>> Try printf and saving the file as a UTF-8 encoded text file...
>
> Why, exactly, are you advocating going back to the printf abomination?

Well.. there were 2 ways to solve his problem:

1. avoid the valid UTF-8 character check.
2. make the console display UTF-8 correctly.

To achieve #1 you've gotta use printf, eg.
printf("%c\n", 230);

To achieve #2 you use chcp and Lucida Console, eg.
writefln("\u00B5");

or save the file as UTF-8 and use
writefln("µ");

> <snip>
>> Using this source saved as b5.d as a UTF-8 encoded text file (IMPORTANT) I can set my command prompt font to "Lucida Console" and execute the following commands:
>>
>> E:\D\src\tmp>chcp 65001
>> Active code page: 65001
> <snip>
>
> This misses the point slightly. The user shouldn't have to change the codepage just to get someone else's application to work properly.

Sadly, if the application is outputting UTF-8 you don't have a choice.

> What you want is my utility library:
> http://pr.stewartsplace.org.uk/d/sutil/

Cool. You're converting UTF-8 to the console code page I assume.

Regan
|
October 05, 2007 Re: Displaying non UTF-8 8 bit character codes with writefln() | ||||
---|---|---|---|---|
| ||||
Posted in reply to Regan Heath | "Regan Heath" <regan@netmail.co.nz> wrote in message news:fe5d88$15l$1@digitalmars.com...
<snip>
> 1. avoid the valid UTF-8 character check.
> 2. make the console display UTF-8 correctly.
>
> To achieve #1 you've gotta use printf, eg.
> printf("%c\n", 230);

No I gottan't. I could use putchar, puts or OutputStream.writeString for example.

<snip>
>> This misses the point slightly. The user shouldn't have to change the codepage just to get someone else's application to work properly.
>
> Sadly, if the application is outputting UTF-8 you don't have a choice.

But how many DOS or Windows console apps in the real world output UTF-8? Presumably not many, considering that no versions of DOS and only a few versions of Windows support it.

There's also a causal loop in that even modern Windows versions don't come with the console code page set to 65001 by default. I don't know what is likely to break this loop, but I doubt that the restrictiveness of one language's standard library is going to do it.

>> What you want is my utility library:
>> http://pr.stewartsplace.org.uk/d/sutil/
>
> Cool. You're converting UTF-8 to the console code page I assume.

Exactly. (Well, as exactly as is possible under the constraints.)

Stewart.

--
My e-mail address is valid but not my primary mailbox. Please keep replies on the 'group where everybody may benefit.
|
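The idea of converting UTF-8 to whatever code page the console is already using can be sketched with the Win32 conversion functions. This is not the code from the utility library linked above, which is not shown in the thread; it assumes a current D2 compiler with the druntime Windows bindings, and toConsoleBytes is a hypothetical name. Characters the code page cannot represent are replaced by a default character chosen by the system, which is presumably one of the constraints referred to above.

```d
import std.stdio : stdout;
import std.string : representation;

version (Windows)
{
    import core.sys.windows.windows;
    import std.utf : toUTF16;
}

/// Converts UTF-8 `text` to bytes in the active console output code page
/// (hypothetical helper). On other platforms the UTF-8 bytes are returned
/// unchanged.
const(ubyte)[] toConsoleBytes(string text)
{
    version (Windows)
    {
        wstring wide = toUTF16(text);
        if (wide.length == 0)
            return null;
        immutable codePage = GetConsoleOutputCP();
        // The first call only computes the required size; the second fills the buffer.
        immutable size = WideCharToMultiByte(codePage, 0, wide.ptr,
            cast(int) wide.length, null, 0, null, null);
        if (size <= 0)
            return text.representation; // conversion failed, pass the bytes through
        auto buffer = new char[size];
        WideCharToMultiByte(codePage, 0, wide.ptr, cast(int) wide.length,
            buffer.ptr, size, null, null);
        return cast(const(ubyte)[]) buffer;
    }
    else
        return text.representation;
}

void main()
{
    stdout.rawWrite(toConsoleBytes("elapsed time 2.598202392 \u00B5S\n"));
}
```

With this approach the user's code page stays untouched, which addresses the objection that nobody should have to run chcp to use someone else's application.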
October 05, 2007 Re: Displaying non UTF-8 8 bit character codes with writefln() | ||||
---|---|---|---|---|
| ||||
Posted in reply to Graham | "Graham >" <GC <grahamc001uk@nospam-yahoo.co.uk> wrote in message news:fe5cp5$bp$1@digitalmars.com...
<snip>
> By the way, I spotted some minor errors on a couple of your documentation pages:
>
> ConsoleOutput referring to ConsoleInput in second column on
> http://pr.stewartsplace.org.uk/d/sutil/ref/annotated.html
>
> and the subtitle on
> http://pr.stewartsplace.org.uk/d/sutil/ref/classsmjg_1_1libs_1_1util_1_1console_1_1ConsoleOutput.html
> is ConsoleInput instead of ConsoleOutput

Good catch. Also noticed quite a few cases where the automatic removal of words like "The ConsoleInput class" in the brief description hasn't worked.

Stewart.

--
My e-mail address is valid but not my primary mailbox. Please keep replies on the 'group where everybody may benefit.
|
Copyright © 1999-2021 by the D Language Foundation