Character encoding problem
Nov 19, 2004
Mathias Bierschenk
November 19, 2004
How can I print German characters? I've tried the following simple program:

import std.c.stdio;

int main()
{
  puts("äöüßÄÖÜ"); // German characters

  return 0;
}

As the normal MS-DOS EDIT encoding didn't work (Windows 98 SE, German edition) I tried Mozilla to save the source code file with different character encodings but none worked as expected. Here's what I tried using the current DMD version:

MS-DOS encoding as performed by Microsoft's EDIT editor:
(5) "invalid UTF-sequence"

Western (ISO-8859-1):
(5) "invalid UTF-sequence"

Unicode (UTF-16 and UTF-32, each with Big Endian and Little Endian):
(1) "semicolon expected, not '.'"
(1) no identifier for declarator

Unicode (UTF-16 and UTF-8):
both compile fine but output garbage under MS-DOS
(Windows 98 SE, German edition)
November 19, 2004
Mathias Bierschenk wrote:

> How can I print German characters? I've tried the following simple program:
> 
> import std.c.stdio;
> 
> int main()
> {
>   puts("äöüßÄÖÜ"); // German characters
> 
>   return 0;
> }

D only supports Unicode, so *both* your editor and
your terminal must be set to a Unicode encoding (usually UTF-8).

Does the Windows 98 SE command prompt support Unicode?
If not, you need to convert the text before outputting it...

--anders
November 19, 2004
On Fri, 19 Nov 2004 12:49:01 +0100, Mathias Bierschenk <Mathias.Bierschenk@web.de> wrote:

> How can I print German characters? I've tried the following simple program:
>
> import std.c.stdio;
>
> int main()
> {
>    puts("äöüßÄÖÜ"); // German characters
>
>    return 0;
> }
>
> As the normal MS-DOS EDIT encoding didn't work (Windows 98 SE, German edition) I tried Mozilla to save the source code file with different character encodings but none worked as expected. Here's what I tried using the current DMD version:
>
> MS-DOS encoding as performed by Microsoft's EDIT editor:
> (5) "invalid UTF-sequence"
>
> Western (ISO-8859-1):
> (5) "invalid UTF-sequence"
>
> Unicode (UTF-16 and UTF-32, each with Big Endian and Little Endian):
> (1) "semicolon expected, not '.'"
> (1) no identifier for declarator
>
> Unicode (UTF-16 and UTF-8):
> both compile fine but output garbage under MS-DOS
> (Windows 98 SE, German edition)

The C functions don't like non-ASCII characters very much. I had this problem
when displaying a file on the console.
Currently, you are best off using either writef or (if you don't want the
output formatted) std.stream's stdout.writeString and stdout.writeLine. (You
could of course use writef("%s", yourstring), but I don't like that very
much.)
Be careful: std.stdio and std.stream.stdout aren't synced. (I use std.stream
exclusively.)

November 19, 2004
Let's try to track down the real problem.

Change the string to "\u00E4\u00F6\u00FC\u00DF" (ae)(oe)(ue)(ss).
If the output is still garbage, try printf instead of puts.

If the problem still exists, it's an output/shell problem.

Thomas

Mathias Bierschenk wrote on Fri, 19 Nov 2004 12:49:01 +0100:
> How can I print German characters? I've tried the following simple program:
>
> import std.c.stdio;
>
> int main()
> {
>    puts("äöüßÄÖÜ"); // German characters
>
>    return 0;
> }
>
> As the normal MS-DOS EDIT encoding didn't work (Windows 98 SE, German edition) I tried Mozilla to save the source code file with different character encodings but none worked as expected. Here's what I tried using the current DMD version:
>
> MS-DOS encoding as performed by Microsoft's EDIT editor:
> (5) "invalid UTF-sequence"
>
> Western (ISO-8859-1):
> (5) "invalid UTF-sequence"
>
> Unicode (UTF-16 and UTF-32, each with Big Endian and Little Endian):
> (1) "semicolon expected, not '.'"
> (1) no identifier for declarator
>
> Unicode (UTF-16 and UTF-8):
> both compile fine but output garbage under MS-DOS
> (Windows 98 SE, German edition)
November 19, 2004
On Fri, 19 Nov 2004 13:09:06 +0100, Thomas Kuehne <thomas-dloop@kuehne.thisisspam.cn> wrote:

> Let's try to track down the real problem.
>
> Change the string to "\u00E4\u00F6\u00FC\u00DF" (ae)(oe)(ue)(ss).
>
> If the output is still garbage try printf instead of puts.

I've tested the above string. The result for both puts and printf is that either it doesn't compile or it outputs garbage:

MS-DOS/Western (ISO-8859-1), UTF-16, UTF-8
compile fine but output garbage under MS-DOS
(Windows 98 SE, German edition)

Unicode (UTF-16 and UTF-32, each with Big Endian and Little Endian):
(1) "semicolon expected, not '.'"
(1) no identifier for declarator

> If the problem still exists it's an output/shell problem.
November 19, 2004
On Sat, 20 Nov 2004 01:03:14 +1300, Simon Buchan <currently@no.where> wrote:

> The C functions don't like non-ASCII characters very much. I had this problem
> when displaying a file on the console.
> Currently, you are best off using either writef or (if you don't want the
> output formatted) std.stream's stdout.writeString and stdout.writeLine. (You
> could of course use writef("%s", yourstring), but I don't like that very
> much.)
> Be careful: std.stdio and std.stream.stdout aren't synced. (I use std.stream
> exclusively.)

Could you provide an example? I can't get it to work here. The following program, saved with several Unicode encodings, still yields garbage:

import std.stream;

int main()
{
  stdout.writeString("äöüßÄÖÜ\n");

  return 0;
}
November 19, 2004
Mathias Bierschenk wrote:
>> Let's try to track down the real problem.
>>
>> Change the string to "\u00E4\u00F6\u00FC\u00DF" (ae)(oe)(ue)(ss).
>>
>> If the output is still garbage try printf instead of puts.
>
>I've tested the above string. The result for both puts and printf is that either it doesn't compile or it outputs garbage:
>
>MS-DOS/Western (ISO-8859-1), UTF-16, UTF-8
>compile fine but output garbage under MS-DOS
>(Windows 98 SE, German edition)

This clearly seems to be a shell problem.

>Unicode (UTF-16 and UTF-32, each with Big Endian and Little Endian):
>(1) "semicolon expected, not '.'"
>(1) no identifier for declarator

This is a known problem. If you use UTF-16 or UTF-32 without a BOM (byte order mark), the current dmd assumes UTF-8 and subsequently fails.

http://svn.kuehne.cn/dstress/www/dstress.html#encoding_utf_16be
http://svn.kuehne.cn/dstress/www/dstress.html#encoding_utf_16le
http://svn.kuehne.cn/dstress/www/dstress.html#encoding_utf_32be
http://svn.kuehne.cn/dstress/www/dstress.html#encoding_utf_32le
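
To make the BOM point concrete, here is a minimal sketch of how a tool could pick an encoding from the first bytes of a file and fall back to UTF-8 when no mark is present. It is written in present-day D and is not the 2004 compiler's actual code; the names `startsWithBytes` and `encodingFromBom` are made up for illustration.

```d
import std.stdio : writeln;

/// True if `data` begins with the bytes in `mark`.
bool startsWithBytes(const(ubyte)[] data, const(ubyte)[] mark)
{
    return data.length >= mark.length && data[0 .. mark.length] == mark;
}

/// Hypothetical helper: names the encoding implied by a leading byte
/// order mark, or falls back to UTF-8 when no mark is present.
string encodingFromBom(const(ubyte)[] data)
{
    // Test the 4-byte UTF-32 marks first: the UTF-32LE mark (FF FE 00 00)
    // begins with the UTF-16LE mark (FF FE).
    if (startsWithBytes(data, [0x00, 0x00, 0xFE, 0xFF])) return "UTF-32BE";
    if (startsWithBytes(data, [0xFF, 0xFE, 0x00, 0x00])) return "UTF-32LE";
    if (startsWithBytes(data, [0xEF, 0xBB, 0xBF]))       return "UTF-8";
    if (startsWithBytes(data, [0xFE, 0xFF]))             return "UTF-16BE";
    if (startsWithBytes(data, [0xFF, 0xFE]))             return "UTF-16LE";
    return "UTF-8 (assumed, no BOM)";
}

void main()
{
    // The two letters "im" saved as UTF-16LE, with and without a BOM.
    ubyte[] withBom    = [0xFF, 0xFE, 0x69, 0x00, 0x6D, 0x00];
    ubyte[] withoutBom = [0x69, 0x00, 0x6D, 0x00];

    writeln(encodingFromBom(withBom));    // UTF-16LE
    writeln(encodingFromBom(withoutBom)); // UTF-8 (assumed, no BOM)
}
```

In the second case the UTF-16 bytes are read as UTF-8, so every other byte is a NUL and the parser gives up on the first line. That would match the line 1 errors reported above.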

Thomas


November 19, 2004
>Could you provide an example? I can't get it to work here. The following program, saved with several unicode encodings, still yields garbage:
>
>import std.stream;
>
>int main()
>{
>   stdout.writeString("äöüßÄÖÜ\n");
>
>   return 0;
>}

Are you sure your command window is set to use UTF-8? On Windows I think you change it by going to the "Regional Settings" control panel.


November 19, 2004
Ben Hinkle wrote:

> Are you sure your command window is set to use UTF-8? On Windows I think you
> change it by going to the "Regional Settings" control panel.

That doesn't matter, or rather, I think there is nothing to configure. The problem is that he is using Mozilla for something it wasn't made for. He should instead use a programmer's editor that supports UTF-8, for example SciTE. In SciTE, go to File -> Encoding -> UTF-8.

The output will be another problem: it is either multi-character garbage (C functions) or automatically converted to the local code page (D native Unicode functions).
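
The "multi-character garbage" is easy to see by dumping the bytes that a D string literal actually holds. The sketch below is in present-day D, and the helper name `hexBytes` is made up for illustration.

```d
import std.format : format;
import std.stdio : writeln;
import std.string : representation;

/// Hypothetical helper: formats the UTF-8 bytes of `text` as
/// space-separated hex pairs.
string hexBytes(string text)
{
    return format("%(%02X %)", text.representation);
}

void main()
{
    // "\u00E4" is the letter ä; D stores string literals as UTF-8.
    writeln(hexBytes("\u00E4")); // C3 A4
    writeln(hexBytes("a"));      // 61
}
```

A console using code page 437 or 850 treats each of those bytes as a separate character, so puts shows two symbols (as far as I know, a box-drawing character followed by ñ) instead of one ä.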

-eye
November 19, 2004
Mathias Bierschenk wrote:
> How can I print German characters? I've tried the following simple program:
> 
> import std.c.stdio;
> 
> int main()
> {
>   puts("äöüßÄÖÜ"); // German characters
> 
>   return 0;
> }
> 
<snip>
> Unicode (UTF-16 and UTF-8):
> both compile fine but output garbage under MS-DOS
> (Windows 98 SE, German edition)

You can include MS-DOS characters in a string, but only as escape codes.  In your case (assuming your code page is 437, 850, 852, 853 or 857):

    puts("\x84\x94\x81\xE1\x8E\x99\x9A");

Since the whole point of this is outputting to MS-DOS, you could argue that this is an appropriate use of non-Unicode characters in a string.
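
If you would rather keep readable Unicode in the source and convert at output time, the same table can be applied in code. This is only a sketch in present-day D, covering ASCII plus the seven letters above; the names `toDosByte` and `toDosBytes` are made up, and a real program would need the full code page table.

```d
import std.stdio : writefln;

/// Hypothetical helper: maps one code point to its byte in code page
/// 437/850. Only ASCII and the seven letters from the thread are
/// covered; anything else becomes '?'.
ubyte toDosByte(dchar c)
{
    switch (c)
    {
        case '\u00E4': return 0x84; // ä
        case '\u00F6': return 0x94; // ö
        case '\u00FC': return 0x81; // ü
        case '\u00DF': return 0xE1; // ß
        case '\u00C4': return 0x8E; // Ä
        case '\u00D6': return 0x99; // Ö
        case '\u00DC': return 0x9A; // Ü
        default:       return c < 0x80 ? cast(ubyte) c : cast(ubyte) '?';
    }
}

/// Converts UTF-8 text to code page bytes, one byte per character.
ubyte[] toDosBytes(string text)
{
    ubyte[] result;
    foreach (dchar c; text) // iterating by dchar decodes the UTF-8
        result ~= toDosByte(c);
    return result;
}

void main()
{
    auto bytes = toDosBytes("\u00E4\u00F6\u00FC\u00DF\u00C4\u00D6\u00DC");
    writefln("%(%02X %)", bytes); // 84 94 81 E1 8E 99 9A
}
```

The resulting bytes are the same ones as in the escape-code string, and they could be sent to the console unchanged, for example with `stdout.rawWrite(bytes)`.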

Stewart.