national language support
September 30, 2004
Hi.
Can I "switch off" UTF-8 support in the dmd compiler?

My localized Windows (it's Russian, but IMHO it's the same for many other
European languages) has no UTF-8 support. I (and IMHO other European users) use
an 8-bit code page. The lower 128 symbols are ASCII. The upper 128 symbols are
national symbols.
But dmd wants UTF-8 everywhere. So no comments in Russian, no string constants in
Russian - "invalid UTF-8 sequence" compiler error.
I have never seen a programming language on Windows with such restrictions before D :(
C, Delphi, Perl - none of them needs a UTF-8 or 16-bit Unicode editor.
And such editors are rare on Windows.

Maybe I don't understand something? Some D compiler options?


September 30, 2004
In article <cjgqiq$2ae3$1@digitaldaemon.com>, novice says...
>
>Hi.
>Can I "switch off" UTF-8 support in the dmd compiler?

No. And believe me - you don't want to.


>My localized Windows (it's Russian, but IMHO it's the same for many other European languages) has no UTF-8 support. I (and IMHO other European users) use an 8-bit code page. The lower 128 symbols are ASCII. The upper 128 symbols are national symbols.

Your local codepage is not relevant to D.




>But dmd wants UTF-8 everywhere.

True. Or UTF-16, or UTF-32.


>So no comments in Russian,

Not true. By its very nature, UTF-8 allows comments in Russian. It also allows comments in Greek, Arabic, Hebrew, Chinese, Japanese, and - well - /everything/.


>no string constants in
>Russian

Not true. Same answer as above.


>- "invalid UTF-8 sequence" compiler error.

Your error report is genuine. You must save your D source files in UTF-8, UTF-16 or UTF-32 before compiling them. If you do this, you can insert all international characters directly into your source code. The trick is this - when you save your source files, select "Save As", instead of "Save". Then find the pull-down menu for "Encoding". Select "UTF-8". Your compile-time errors will then go away.
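
If an editor cannot save UTF-8, the same conversion can be done outside the editor. What follows is only a minimal Python sketch, assuming the source was saved in the Russian Windows code page (cp1251). The names `is_valid_utf8` and `to_utf8` and the sample comment are illustrative and are not part of dmd or of any tool mentioned in this thread.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def to_utf8(data: bytes, source_encoding: str = "cp1251") -> bytes:
    """Re-encode bytes from a legacy 8-bit code page to UTF-8.

    source_encoding is an assumption: it has to match the code page the
    file was really saved in, which cannot be detected reliably.
    """
    return data.decode(source_encoding).encode("utf-8")


# A source comment containing the Russian word for "hello", as cp1251 bytes.
legacy = "// привет".encode("cp1251")

print(is_valid_utf8(legacy))           # the legacy bytes are not valid UTF-8
print(is_valid_utf8(to_utf8(legacy)))  # the converted bytes are valid UTF-8
```

For a real file, the bytes would be read and written in binary mode (`open(path, "rb")` and `open(path, "wb")`) so that no implicit locale conversion interferes.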



>I have never seen a programming language on Windows with such restrictions before D :(

It's not a restriction, it's a liberation. The 8-bit code with which you are familiar will run correctly /only/ for users sharing your Windows code page. The equivalent D program will work for everyone, worldwide, regardless of their code page.
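
The code page dependence can be illustrated with a short Python sketch. The six bytes below are an assumed example (the word "Привет" as stored under cp1251); the point is only that the same bytes read differently under another Windows code page, while their UTF-8 form has a single interpretation.

```python
# The same six bytes, as an 8-bit program would store them.
raw = bytes([0xCF, 0xF0, 0xE8, 0xE2, 0xE5, 0xF2])

as_russian = raw.decode("cp1251")  # Russian Windows code page
as_western = raw.decode("cp1252")  # Western European Windows code page

print(as_russian)
print(as_western)

# Re-encoded as UTF-8, the text no longer depends on any code page.
utf8 = as_russian.encode("utf-8")
print(utf8.decode("utf-8") == as_russian)
```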


>C, Delphi, Perl - none of them needs a UTF-8 or 16-bit Unicode editor.
>And such editors are rare on Windows.

Also not true. Virtually every Windows text editor that exists is capable of saving text in UTF-8. Even Microsoft Notepad can do this. Pretty much all programmers' text editors (e.g. TextPad, jEdit, UltraEdit, EmEditor, ...) can do this.



>Maybe I don't understand something? Some D compiler options?

I think the thing you haven't understood is how wonderful Unicode is, and why D supports it in a way that C doesn't. With D, you just insert your international characters directly into the source code, and save as UTF-8. That source file will then read (and compile) the same for everyone, worldwide. Dependency on locale is gone. Although the concepts may take a little getting used to, believe me - this is a good thing.

Arcane Jill


September 30, 2004
Thanks, Arcane Jill

>believe me - this is a good thing.

Hmm.. Yes, you are right.
(But goodbye, my favorite editor)

Sorry for cross-posting in two threads.


September 30, 2004
novice wrote:
> Thanks, Arcane Jill
> 
> 
>>believe me - this is a good thing.
> 
> 
> Hmm.. Yes, you are right.
> (But goodbye, my favorite editor)
> 
> Sorry for cross-posting in two threads.
> 
> 

You could ask the vendor of "my favorite editor" to support UTF!?!

Stephan
October 01, 2004
In article <cjgtf6$2c1o$1@digitaldaemon.com>, novice says...
>
>Thanks, Arcane Jill
>
>>believe me - this is a good thing.
>
>Hmm.. Yes, you are right.
>(But goodbye, my favorite editor)
>
>Sorry for cross-posting in two threads.
>
>

novice: I'm in the same boat...I've got to say farewell to my favorite editor as well! :(

But the good news is...I found a pretty good replacement for it today, that I'd like to share with you. ;)

Crimson Editor (a Free "Professional Source Editor")
http://www.crimsoneditor.com/english/

1) Encodings:
- ASCI
- Unicode Little Endian
- Unicode Big Endian
- UTF-8 with BOM
- UTF-8 without BOM

2) Code Syntax-Highlighting for D

3) a Tabbed Multi-Document Interface

4) Toggleable Side Line-Numbers

5) File Formats for:
- DOS/Windows
- Mac
- UNIX

---------------------

I've been checking it out, and it looks and operates rather cleanly.

David L.

-------------------------------------------------------------------
"Dare to reach for the Stars...Dare to Dream, Build, and Achieve!"
October 01, 2004
In article <cjiiou$2sc1$1@digitaldaemon.com>, David L. Davis says...

>novice: I'm in the same boat...I've got to say farewell to my favorite editor as well! :(
>
>But the good news is...I found a pretty good replacement for it today, that I'd like to share with you. ;)
>
>Crimson Editor (a Free "Professional Source Editor")
>http://www.crimsoneditor.com/english/
>
>1) Encodings:
>- ASCI
>- Unicode Little Endian
>- Unicode Big Endian
>- UTF-8 with BOM
>- UTF-8 without BOM

Just for the sake of sheer pedantry, I'd like to point out that Windows misnames encodings. I'm guessing that "ASCI" was probably a typo for "ANSI" - it means the default local encoding of your PC, and it is /misnamed/, because of course Microsoft's code pages are _not_ ANSI standards. (I believe Microsoft applied, and got rejected). The encodings named "Unicode Little Endian" and "Unicode Big Endian" are also misnamed, and should in fact be "UTF-16LE" and "UTF-16BE". Again, that's Microsoft getting it wrong. (Windows was designed in the days when Unicode was only 16 bits wide).
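
To make the naming concrete, here is a small Python sketch that maps the editor labels to standard codec names and shows the bytes each one produces. The mapping is an assumption based on the names discussed above, not something taken from any editor's documentation, and `encode_as` is a hypothetical helper.

```python
# Editor label -> Python codec name (assumed mapping, see lead-in).
# Note: Python's "utf-16-le" and "utf-16-be" codecs write no BOM,
# whereas an editor may prepend one when saving.
LABEL_TO_CODEC = {
    "Unicode Little Endian": "utf-16-le",
    "Unicode Big Endian": "utf-16-be",
    "UTF-8 without BOM": "utf-8",
    "UTF-8 with BOM": "utf-8-sig",
}


def encode_as(label: str, text: str) -> bytes:
    """Encode text with the codec that corresponds to an editor label."""
    return text.encode(LABEL_TO_CODEC[label])


# Show the bytes produced for the single letter "A" under each label.
for label in LABEL_TO_CODEC:
    print(label, encode_as(label, "A").hex())

# U+1D11E (musical G clef) lies outside the 16-bit range, so UTF-16
# needs two 16-bit code units (a surrogate pair) to represent it.
clef = "\U0001D11E"
clef_utf16 = clef.encode("utf-16-le")
print(len(clef_utf16))  # length in bytes, not characters
```

The last lines show why "Unicode" is a misleading name for a 16-bit encoding: a single character can occupy more than one 16-bit unit.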

Unfortunately, a lot of Windows applications use Microsoft's names.

Arcane Jill


October 01, 2004
In article <cjivrk$2el$1@digitaldaemon.com>, Arcane Jill says...

>>Crimson Editor (a Free "Professional Source Editor")
>>http://www.crimsoneditor.com/english/
>>
>>1) Encodings:
>>- ASCI
>>- Unicode Little Endian
>>- Unicode Big Endian
>>- UTF-8 with BOM
>>- UTF-8 without BOM

I just installed Crimson Editor to check it out. The first named encoding is actually "ASCII" (not "ANSI", which is what I'd suspected). It is still misnamed, however. I just tried saving a text file containing a Euro currency sign as ASCII using Crimson Editor -- and it succeeded! Examination of the saved file with a binary editor revealed that the saved file contained the single byte 0x80 - in other words, the true encoding was WINDOWS-1252, not ASCII. I assume that this misnamed encoding is /actually/ your PC's default encoding, whatever that happens to be - same as "ANSI" on other editors. "Default" would be a much more accurate name in both cases.
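
The same observation can be reproduced without a binary editor. A minimal Python sketch, where `can_encode` is a hypothetical helper written for this illustration:

```python
EURO = "\u20ac"  # the Euro currency sign, U+20AC


def can_encode(text: str, encoding: str) -> bool:
    """Return True if every character of text exists in the encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# True ASCII has only 128 code points and no Euro sign.
print(can_encode(EURO, "ascii"))

# WINDOWS-1252 stores the Euro sign as one byte; UTF-8 uses three.
print(EURO.encode("cp1252").hex())
print(EURO.encode("utf-8").hex())
```

Since genuine ASCII cannot represent the Euro sign at all, a file containing the single byte 0x80 cannot have been saved as ASCII.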

Don't let that put you off though - Crimson seems like a good editor.

Arcane Jill