November 19, 2004 Character encoding problem | ||||
---|---|---|---|---|
| ||||
How can I print German characters? I've tried the following simple program:

import std.c.stdio;

int main()
{
    puts("äöüßÄÖÜ"); // German characters

    return 0;
}

As the normal MS-DOS EDIT encoding didn't work (Windows 98 SE, German edition) I tried Mozilla to save the source code file with different character encodings but none worked as expected. Here's what I tried using the current DMD version:

MS-DOS encoding as performed by Microsoft's EDIT editor:
(5) "invalid UTF-sequence"

Western (ISO-8859-1):
(5) "invalid UTF-sequence"

Unicode (UTF-16 and UTF-32, each with Big Endian and Little Endian):
(1) "semicolon expected, not '.'"
(1) no identifier for declarator

Unicode (UTF-16 and UTF-8):
both compile fine but output garbage under MS-DOS
(Windows 98 SE, German edition)
|
November 19, 2004 Re: Character encoding problem | ||||
---|---|---|---|---|
| ||||
Posted in reply to Mathias Bierschenk | Mathias Bierschenk wrote:
> How can I print German characters? I've tried the following simple program:
>
> import std.c.stdio;
>
> int main()
> {
> puts("äöüßÄÖÜ"); // German characters
>
> return 0;
> }
D only supports Unicode, so *both* your editor and
your terminal must be set to this (usually UTF-8).
Does the Windows 98 SE command prompt support Unicode?
If not, you need to convert before outputting...
--anders
|
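The "convert before outputting" step can be sketched roughly as follows. This is a minimal sketch in present-day D, assuming the console uses code page 437 or 850. `toDosBytes` and the two lookup tables are hypothetical names, and only ASCII plus the seven characters from the original post are mapped.

```d
import std.stdio;

// The seven characters from the original post, written as escapes so this
// file stays pure ASCII, and their byte values in code pages 437 and 850.
immutable dstring germanChars = "\u00E4\u00F6\u00FC\u00DF\u00C4\u00D6\u00DC"d;
immutable ubyte[] dosCodes = [0x84, 0x94, 0x81, 0xE1, 0x8E, 0x99, 0x9A];

/// Hypothetical helper: converts UTF-8 text to CP437/CP850 bytes.
/// Unmapped non-ASCII code points are replaced with '?'.
ubyte[] toDosBytes(string text)
{
    ubyte[] result;
    foreach (dchar c; text) // iterating as dchar decodes the UTF-8 sequences
    {
        ubyte mapped = 0x3F; // '?'
        if (c < 0x80)
        {
            mapped = cast(ubyte) c; // ASCII is identical in both encodings
        }
        else
        {
            foreach (i, g; germanChars)
            {
                if (g == c)
                {
                    mapped = dosCodes[i];
                    break;
                }
            }
        }
        result ~= mapped;
    }
    return result;
}

void main()
{
    // rawWrite sends the bytes unchanged, so the console's code page
    // decides which glyphs appear.
    stdout.rawWrite(toDosBytes("\u00E4\u00F6\u00FC\u00DF\u00C4\u00D6\u00DC\n"));
}
```

A full converter would need the complete code page table; the short table here only covers the characters discussed in this thread.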
November 19, 2004 Re: Character encoding problem | ||||
---|---|---|---|---|
| ||||
Posted in reply to Mathias Bierschenk | On Fri, 19 Nov 2004 12:49:01 +0100, Mathias Bierschenk <Mathias.Bierschenk@web.de> wrote:
> How can I print German characters? I've tried the following simple program:
>
> import std.c.stdio;
>
> int main()
> {
> puts("äöüßÄÖÜ"); // German characters
>
> return 0;
> }
>
> As the normal MS-DOS EDIT encoding didn't work (Windows 98 SE, German edition) I tried Mozilla to save the source code file with different character encodings but none worked as expected. Here's what I tried using the current DMD version:
>
> MS-DOS encoding as performed by Microsoft's EDIT editor:
> (5) "invalid UTF-sequence"
>
> Western (ISO-8859-1):
> (5) "invalid UTF-sequence"
>
> Unicode (UTF-16 and UTF-32, each with Big Endian and Little Endian):
> (1) "semicolon expected, not '.'"
> (1) no identifier for declarator
>
> Unicode (UTF-16 and UTF-8):
> both compile fine but output garbage under MS-DOS
> (Windows 98 SE, German edition)
The C functions don't like non-Latin chars very much. I had this problem displaying a file to the console.
Currently, you are best off using either writef or (if you don't want it formatted) std.stream's stdout.writeString and stdout.writeLine. (You could of course use writef("%s", yourstring), but I don't like that very much.)
Be careful: std.stdio and std.stream.stdout aren't synced. (I use std.stream exclusively.)
--
Using Opera's revolutionary e-mail client: http://www.opera.com/m2/
|
November 19, 2004 Re: Character encoding problem | ||||
---|---|---|---|---|
| ||||
Posted in reply to Mathias Bierschenk |
Let's try to track down the real problem.
Change the string to "\u00E4\u00F6\u00FC\u00DF" (ae)(oe)(ue)(ss).
If the output is still garbage, try printf instead of puts.
If the problem still exists, it's an output/shell problem.
Thomas
Mathias Bierschenk wrote on Fri, 19 Nov 2004 12:49:01 +0100:
> How can I print German characters? I've tried the following simple program:
>
> import std.c.stdio;
>
> int main()
> {
> puts("äöüßÄÖÜ"); // German characters
>
> return 0;
> }
>
> As the normal MS-DOS EDIT encoding didn't work (Windows 98 SE, German edition) I tried Mozilla to save the source code file with different character encodings but none worked as expected. Here's what I tried using the current DMD version:
>
> MS-DOS encoding as performed by Microsoft's EDIT editor:
> (5) "invalid UTF-sequence"
>
> Western (ISO-8859-1):
> (5) "invalid UTF-sequence"
>
> Unicode (UTF-16 and UTF-32, each with Big Endian and Little Endian):
> (1) "semicolon expected, not '.'"
> (1) no identifier for declarator
>
> Unicode (UTF-16 and UTF-8):
> both compile fine but output garbage under MS-DOS
> (Windows 98 SE, German edition)
|
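A small sketch of why the escape test is useful: with `\u` escapes the source file is pure ASCII, so the encoding the editor saved it in no longer matters, and any remaining garbage has to come from the output side. This assumes present-day D; `codePoints` is a hypothetical helper name. The characters ä, ö, ü and ß are U+00E4, U+00F6, U+00FC and U+00DF.

```d
import std.stdio;

/// Hypothetical helper: lists the code points of a UTF-8 string.
uint[] codePoints(string s)
{
    uint[] result;
    foreach (dchar c; s) // iterating as dchar decodes the UTF-8 sequences
        result ~= cast(uint) c;
    return result;
}

void main()
{
    string s = "\u00E4\u00F6\u00FC\u00DF";
    // The compiler stores the literal as UTF-8, two bytes per character here.
    writeln(s.length);
    foreach (cp; codePoints(s))
        writefln("U+%04X", cp);
}
```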
November 19, 2004 Re: Character encoding problem | ||||
---|---|---|---|---|
| ||||
Posted in reply to Thomas Kuehne | On Fri, 19 Nov 2004 13:09:06 +0100, Thomas Kuehne <thomas-dloop@kuehne.thisisspam.cn> wrote:
> Let's try to track down the real problem.
>
> Change the string to "\u00E4\u00F6\u00FC\u00DF" (ae)(oe)(ue)(ss).
>
> If the output is still garbage, try printf instead of puts.
I've tested the above string. The result for both puts and printf is that either it doesn't compile or it outputs garbage:

MS-DOS/Western (ISO-8859-1), UTF-16, UTF-8
compile fine but output garbage under MS-DOS
(Windows 98 SE, German edition)

Unicode (UTF-16 and UTF-32, each with Big Endian and Little Endian):
(1) "semicolon expected, not '.'"
(1) no identifier for declarator
> If the problem still exists, it's an output/shell problem.
|
November 19, 2004 Re: Character encoding problem | ||||
---|---|---|---|---|
| ||||
Posted in reply to Simon Buchan | On Sat, 20 Nov 2004 01:03:14 +1300, Simon Buchan <currently@no.where> wrote:
> The C functions don't like non-Latin chars very much. I had this problem
> displaying a file to the console.
> Currently, you are best off using either writef or (if you don't want it
> formatted) std.stream's stdout.writeString and stdout.writeLine. (You
> could of course use writef("%s", yourstring), but I don't like that very
> much.)
> Be careful: std.stdio and std.stream.stdout aren't synced. (I use std.stream
> exclusively.)
Could you provide an example? I can't get it to work here. The following program, saved with several Unicode encodings, still yields garbage:

import std.stream;

int main()
{
    stdout.writeString("äöüßÄÖÜ\n");

    return 0;
}
|
November 19, 2004 Re: Character encoding problem | ||||
---|---|---|---|---|
| ||||
Posted in reply to Mathias Bierschenk | Mathias Bierschenk wrote:
>> Let's try to track down the real problem.
>>
>> Change the string to "\u00E4\u00F6\u00FC\u00DF" (ae)(oe)(ue)(ss).
>>
>> If the output is still garbage, try printf instead of puts.
>
>I've tested the above string. The result for both puts and printf is that either it doesn't compile or it outputs garbage:
>
>MS-DOS/Western (ISO-8859-1), UTF-16, UTF-8
>compile fine but output garbage under MS-DOS
>(Windows 98 SE, German edition)
Clearly seems to be a shell problem.
>Unicode (UTF-16 and UTF-32, each with Big Endian and Little Endian):
>(1) "semicolon expected, not '.'"
>(1) no identifier for declarator
This is a known problem. If you use UTF-16/32 without a BOM (byte order mark), the current dmd assumes UTF-8 and subsequently fails.
http://svn.kuehne.cn/dstress/www/dstress.html#encoding_utf_16be
http://svn.kuehne.cn/dstress/www/dstress.html#encoding_utf_16le
http://svn.kuehne.cn/dstress/www/dstress.html#encoding_utf_32be
http://svn.kuehne.cn/dstress/www/dstress.html#encoding_utf_32le
Thomas
|
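The BOM behaviour described above can be illustrated with a short sketch. This is not DMD's actual detection code, only an assumed approximation of "look for a byte order mark, otherwise treat the file as UTF-8". `guessEncoding`, `hasPrefix` and the `bom...` constants are hypothetical names.

```d
import std.stdio;

// Byte order marks for the Unicode encodings mentioned in the thread.
immutable ubyte[] bomUtf8    = [0xEF, 0xBB, 0xBF];
immutable ubyte[] bomUtf16Be = [0xFE, 0xFF];
immutable ubyte[] bomUtf16Le = [0xFF, 0xFE];
immutable ubyte[] bomUtf32Be = [0x00, 0x00, 0xFE, 0xFF];
immutable ubyte[] bomUtf32Le = [0xFF, 0xFE, 0x00, 0x00];

/// Returns true if data begins with the given prefix.
bool hasPrefix(const(ubyte)[] data, const(ubyte)[] prefix)
{
    return data.length >= prefix.length && data[0 .. prefix.length] == prefix;
}

/// Hypothetical helper: names the encoding implied by a leading BOM and
/// falls back to UTF-8 when no BOM is present.
string guessEncoding(const(ubyte)[] data)
{
    // Test UTF-32 first: its little-endian mark starts with the UTF-16LE mark.
    if (hasPrefix(data, bomUtf32Be)) return "UTF-32BE";
    if (hasPrefix(data, bomUtf32Le)) return "UTF-32LE";
    if (hasPrefix(data, bomUtf8))    return "UTF-8";
    if (hasPrefix(data, bomUtf16Be)) return "UTF-16BE";
    if (hasPrefix(data, bomUtf16Le)) return "UTF-16LE";
    return "UTF-8"; // no BOM found
}

void main()
{
    // The letters "im" saved as UTF-16LE, with and without a BOM.
    immutable ubyte[] withBom    = [0xFF, 0xFE, 0x69, 0x00, 0x6D, 0x00];
    immutable ubyte[] withoutBom = [0x69, 0x00, 0x6D, 0x00];
    writeln(guessEncoding(withBom));    // prints "UTF-16LE"
    writeln(guessEncoding(withoutBom)); // prints "UTF-8"
}
```

In the second case the zero bytes of the UTF-16 text end up being read as UTF-8, which is consistent with a file that fails on line 1 when saved without a BOM.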
November 19, 2004 Re: Character encoding problem | ||||
---|---|---|---|---|
| ||||
Posted in reply to Mathias Bierschenk |
>Could you provide an example? I can't get it to work here. The following program, saved with several unicode encodings, still yields garbage:
>
>import std.stream;
>
>int main()
>{
> stdout.writeString("äöüßÄÖÜ\n");
>
> return 0;
>}
Are you sure your command window is set to use UTF-8? On Windows I think you change it by going to the "Regional Settings" control panel.
|
November 19, 2004 Re: Character encoding problem | ||||
---|---|---|---|---|
| ||||
Posted in reply to Ben Hinkle | Ben Hinkle wrote:
> Are you sure your command window is set to use UTF-8? On Windows I think you
> change it by going to the "Regional Settings" control panel.
That doesn't matter, or rather, I think there is nothing to configure. The problem is that he is using Mozilla for something it isn't meant for. He should instead use a programmer's editor that supports UTF-8, for example SciTE. In SciTE, go to File -> Encoding -> UTF-8.
The output will be another problem: either multi-character garbage (C functions) or text automatically converted to the local code page (D's native Unicode functions).
-eye
|
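To illustrate the "multi-character garbage" case: in UTF-8, ä is the two-byte sequence C3 A4. A C function such as puts passes those bytes through unchanged, and a console using code page 437 or 850 shows each byte as its own glyph (├ and ñ). A minimal sketch in present-day D; `hexBytes` is a hypothetical helper name.

```d
import std.format : format;
import std.stdio;

/// Hypothetical helper: renders the UTF-8 code units of a string as hex.
string hexBytes(string s)
{
    string result;
    foreach (char unit; s) // iterating as char visits the raw UTF-8 bytes
    {
        if (result.length > 0)
            result ~= " ";
        result ~= format("%02X", cast(ubyte) unit);
    }
    return result;
}

void main()
{
    writeln(hexBytes("\u00E4")); // prints "C3 A4"
    writeln(hexBytes("a"));      // prints "61"
}
```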
November 19, 2004 Re: Character encoding problem | ||||
---|---|---|---|---|
| ||||
Posted in reply to Mathias Bierschenk | Mathias Bierschenk wrote:
> How can I print German characters? I've tried the following simple program:
>
> import std.c.stdio;
>
> int main()
> {
> puts("äöüßÄÖÜ"); // German characters
>
> return 0;
> }
>
<snip>
> Unicode (UTF-16 and UTF-8):
> both compile fine but output garbage under MS-DOS
> (Windows 98 SE, German edition)
You can include MS-DOS characters in a string, but only as escape codes. In your case (assuming your code page is 437, 850, 852, 853 or 857):

puts("\x84\x94\x81\xE1\x8E\x99\x9A");

Since the whole point of this is for outputting to MS-DOS, you could argue that this is appropriate use of non-Unicode characters in a string.
Stewart.
|
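As a sanity check on the escape codes above, the bytes can be decoded back to Unicode and compared with the intended characters. This is a minimal sketch in present-day D, assuming code page 437 or 850. `fromDosBytes` and the table names are hypothetical, and only the seven characters from the thread are mapped.

```d
import std.stdio;

// Byte values from the escape-code string, and the characters they are
// meant to represent (written as escapes to keep this file pure ASCII).
immutable ubyte[] dosCodes = [0x84, 0x94, 0x81, 0xE1, 0x8E, 0x99, 0x9A];
immutable dstring germanChars = "\u00E4\u00F6\u00FC\u00DF\u00C4\u00D6\u00DC"d;

/// Hypothetical helper: decodes CP437/CP850 bytes to Unicode.
/// High bytes missing from the small table become U+FFFD.
dstring fromDosBytes(const(ubyte)[] bytes)
{
    dstring result;
    foreach (b; bytes)
    {
        dchar decoded = 0xFFFD; // replacement character
        if (b < 0x80)
        {
            decoded = cast(dchar) b; // ASCII range is shared
        }
        else
        {
            foreach (i, code; dosCodes)
            {
                if (code == b)
                {
                    decoded = germanChars[i];
                    break;
                }
            }
        }
        result ~= decoded;
    }
    return result;
}

void main()
{
    // The seven bytes decode to the seven characters of the original string.
    writeln(fromDosBytes(dosCodes) == germanChars); // prints "true"
}
```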
Copyright © 1999-2021 by the D Language Foundation