Thread overview
Unicode symbols in the identifiers
Jan 11, 2013
Andrey
Jan 11, 2013
evilrat
Jan 11, 2013
Andrey
Jan 11, 2013
Andrey
Jan 11, 2013
evilrat
Jan 11, 2013
H. S. Teoh
Jan 11, 2013
Jacob Carlborg
Jan 11, 2013
Andrey
Jan 11, 2013
Regan Heath
January 11, 2013
Should these variants serve as identifiers?

auto x²; //fails to compile: char 0x00b2 not allowed in identifier, unsupported char 0xb2 (why? is it not a digit?)

Same for ⅀, ∫, etc.

The official documentation says:
«
D source text can be in one of the following formats:
ASCII
UTF-8
UTF-16BE
UTF-16LE
UTF-32BE
UTF-32LE
»

Math symbols could be much more useful than plain letters from other languages (who codes in Greek or Chinese?). Still, this function name in Russian causes a compile error: 2.вквадрате (вквадрате(2))
January 11, 2013
On Friday, 11 January 2013 at 02:09:33 UTC, Andrey wrote:
> Should these variants serve as identifiers?
>
> auto x²; //fails to compile: char 0x00b2 not allowed in identifier, unsupported char 0xb2 (why? is it not a digit?)
>
> Same for ⅀, ∫, etc.
>
> The official documentation says:
> «
> D source text can be in one of the following formats:
> ASCII
> UTF-8
> UTF-16BE
> UTF-16LE
> UTF-32BE
> UTF-32LE
> »
>
> Math symbols could be much more useful than plain letters from other languages (who codes in Greek or Chinese?). Still, this function name in Russian causes a compile error: 2.вквадрате (вквадрате(2))

Save the module as a UTF-8 file (or in any other encoding from the list above). From your error description, it looks like it is being read as ASCII.
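One way to sanity-check the encoding question is to look at the bytes directly. The following is only a minimal sketch, assuming the string literal below is saved in a UTF-8 source file; the variable name is made up for illustration. In UTF-8, 'x' takes one byte and '²' (U+00B2) takes two (0xC2 0xB2), so an editor that really saves UTF-8 should produce three bytes for this name.

```d
import std.stdio : writefln;
import std.string : representation;
import std.utf : validate;

void main()
{
    // The identifier from the original post, used here as a string literal.
    string name = "x²";

    // Throws a UTFException if the bytes are not well-formed UTF-8.
    validate(name);

    // Raw UTF-8 code units stored in the string.
    foreach (ubyte b; name.representation)
        writefln("byte 0x%02X", b);

    // Decoded code points, one per character.
    foreach (dchar c; name)
        writefln("code point U+%04X", cast(uint) c);
}
```

If the file is valid UTF-8 and the compiler still rejects x², the encoding is probably not the cause and the character itself is being refused.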
January 11, 2013
On Fri, Jan 11, 2013 at 03:09:29AM +0100, Andrey wrote:
> Should these variants serve as identifiers?
> 
> auto x²; //fails to compile: char 0x00b2 not allowed in identifier, unsupported char 0xb2 (why? is it not a digit?)

Weird, identifiers like "Цвет" and "張" and even "ℝ" all work fine, but "⅀" doesn't work. Maybe it's a bug?


[...]
> Still, this function name in Russian causes a compile error: 2.вквадрате
> (вквадрате(2))

This works for me:

	import std.stdio;
	real плюс(real a, real b) { return a+b; }
	void main() {
		writeln(плюс(1.61803, 3.14159));
		writeln(1.61803.плюс(3.14159));
	}

Both writeln's print 4.75962. Are you sure you saved your source file in UTF-8 format?


T

-- 
"I'm not childish; I'm just in touch with the child within!" - RL
January 11, 2013
> Save the module as a UTF-8 file (or in any other encoding from the list above). From your error description, it looks like it is being read as ASCII.

I'm pretty sure I'm saving it as Unicode. I can easily use all Unicode characters in string literals ("x²") and output them to the console. But using them in identifiers leads to a compiler error.

Apart from that, try this code:

int в_квадрате(int num) { return num*num; }

writeln(2.в_квадрате);

You get: Error: found 'в_квадрате' when expecting ','
January 11, 2013
I forgot to mention: 64-bit Linux, D version 2.060.
January 11, 2013
On Friday, 11 January 2013 at 07:47:03 UTC, Andrey wrote:
>
> Apart from that, try this code:
>
> int в_квадрате(int num) { return num*num; }
>
> writeln(2.в_квадрате);
>
> You get: Error: found 'в_квадрате' when expecting ','

I don't get any errors with this code (dmd 2.061, Win8).

But x² as an identifier really does give an error. I don't know, maybe it is a bug, maybe not.
January 11, 2013
On 2013-01-11 03:09, Andrey wrote:
> Should these variants serve as identifiers?
>
> auto x²; //fails to compile: char 0x00b2 not allowed in identifier,
> unsupported char 0xb2 (why? is it not a digit?)
>
> Same for ⅀, ∫, etc.
>
> The official documentation says:
> «
> D source text can be in one of the following formats:
> ASCII
> UTF-8
> UTF-16BE
> UTF-16LE
> UTF-32BE
> UTF-32LE
> »
>
> Math symbols could be much more useful than plain letters from
> other languages (who codes in Greek or Chinese?). Still, this
> function name in Russian causes a compile error: 2.вквадрате (вквадрате(2))

According to the specification, D doesn't necessarily support Unicode identifiers:

"Identifiers start with a letter, _, or universal alpha, and are followed by any number of letters, _, digits, or universal alphas. Universal alphas are as defined in ISO/IEC 9899:1999(E) Appendix D. (This is the C99 Standard.) Identifiers can be arbitrarily long, and are case sensitive. Identifiers starting with __ (two underscores) are reserved."

http://dlang.org/lex.html#Identifier
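
The quoted rule can be sketched as a small check. This is only an illustration, not the compiler's lexer: `isSampleUniversalAlpha` and `looksLikeIdentifier` are hypothetical names, and the sample set contains just a few characters that posts in this thread report as accepted, not the real C99 Appendix D table. Keywords and the reserved `__` prefix are not handled.

```d
import std.algorithm : canFind;
import std.ascii : isAlpha, isDigit;

// Hypothetical stand-in for the C99 Appendix D table: a handful of
// characters reported as working in this thread, NOT the full table.
bool isSampleUniversalAlpha(dchar c)
{
    return "ℂℝℤЦветплюс張"d.canFind(c);
}

// Applies the quoted rule: first a letter, _ or universal alpha, then
// any number of letters, _, digits or universal alphas.
bool looksLikeIdentifier(string s)
{
    if (s.length == 0)
        return false;

    bool first = true;
    foreach (dchar c; s) // decodes the UTF-8 string into code points
    {
        immutable startOk = isAlpha(c) || c == '_' || isSampleUniversalAlpha(c);
        if (!(startOk || (!first && isDigit(c))))
            return false;
        first = false;
    }
    return true;
}

void main()
{
    assert(looksLikeIdentifier("ℝ"));
    assert(looksLikeIdentifier("плюс"));
    assert(looksLikeIdentifier("_x2"));
    assert(!looksLikeIdentifier("2x")); // may not start with a digit
    assert(!looksLikeIdentifier("x²")); // U+00B2 is neither an ASCII digit nor in the sample set
}
```

Under this reading, x² fails because ² is not an ASCII digit and would have to appear in the universal alpha table to be accepted.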

-- 
/Jacob Carlborg
January 11, 2013
> According to the specification, D doesn't necessarily support Unicode identifiers:
>
> "Identifiers start with a letter, _, or universal alpha, and are followed by any number of letters, _, digits, or universal alphas. Universal alphas are as defined in ISO/IEC 9899:1999(E) Appendix D. (This is the C99 Standard.) Identifiers can be arbitrarily long, and are case sensitive. Identifiers starting with __ (two underscores) are reserved."
>
> http://dlang.org/lex.html#Identifier

http://www.algonet.se/~afb/d/universalalphas/universalalphas.html

I can't understand the logic of why symbols like these are allowed

µ·ʰʱʲʳʴʵʶʷʸʻʽʾʿˀˁːˑˠˡˢˣˤͺՙऽଽι‿⁀ℂℇℊℋℌℍℎℏℐℑℒℓℕ℘ℙℚℛℜℝℤΩℨKÅℬℭ℮ℯℰℱℳℴℵℶℷℸⅠⅡⅢⅣⅤⅥⅦⅧⅨⅩⅪⅫⅬⅭⅮⅯⅰⅱⅲⅳⅴⅵⅶⅷⅸⅹⅺⅻⅼⅽⅾⅿↀↁↂ々〆〇〡〢〣〤〥〦〧〨〩

and very useful math symbols are completely ignored.

The C++11 standard allows these: ²³⁴⁵⁶⁷⁸⁹₀₁₂₃₄₅₆₇₈₉

Annex E (normative)
Universal character names for identifier characters [charname]

E.1 Ranges of characters allowed [charname.allowed]
00A8, 00AA, 00AD, 00AF, 00B2-00B5, 00B7-00BA, 00BC-00BE, 00C0-00D6, 00D8-00F6, 00F8-00FF
0100-167F, 1681-180D, 180F-1FFF
200B-200D, 202A-202E, 203F-2040, 2054, 2060-206F
2070-218F, 2460-24FF, 2776-2793, 2C00-2DFF, 2E80-2FFF
3004-3007, 3021-302F, 3031-303F
3040-D7FF
F900-FD3D, FD40-FDCF, FDF0-FE44, FE47-FFFD
10000-1FFFD, 20000-2FFFD, 30000-3FFFD, 40000-4FFFD, 50000-5FFFD,
60000-6FFFD, 70000-7FFFD, 80000-8FFFD, 90000-9FFFD, A0000-AFFFD,
B0000-BFFFD, C0000-CFFFD, D0000-DFFFD, E0000-EFFFD
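
To see where the symbols from the first post land in that list, here is a rough sketch in D. The struct and function names are made up for illustration, and only the rows up to FFFD are transcribed; the supplementary-plane rows are left out, which does not matter for the three characters tested.

```d
import std.stdio : writefln;

// Inclusive code point range; a hypothetical helper type for this sketch.
struct CodeRange
{
    uint lo, hi;
}

// Rows up to FFFD from the E.1 list quoted above.
immutable CodeRange[] cpp11Ranges = [
    CodeRange(0x00A8, 0x00A8), CodeRange(0x00AA, 0x00AA),
    CodeRange(0x00AD, 0x00AD), CodeRange(0x00AF, 0x00AF),
    CodeRange(0x00B2, 0x00B5), CodeRange(0x00B7, 0x00BA),
    CodeRange(0x00BC, 0x00BE), CodeRange(0x00C0, 0x00D6),
    CodeRange(0x00D8, 0x00F6), CodeRange(0x00F8, 0x00FF),
    CodeRange(0x0100, 0x167F), CodeRange(0x1681, 0x180D),
    CodeRange(0x180F, 0x1FFF),
    CodeRange(0x200B, 0x200D), CodeRange(0x202A, 0x202E),
    CodeRange(0x203F, 0x2040), CodeRange(0x2054, 0x2054),
    CodeRange(0x2060, 0x206F),
    CodeRange(0x2070, 0x218F), CodeRange(0x2460, 0x24FF),
    CodeRange(0x2776, 0x2793), CodeRange(0x2C00, 0x2DFF),
    CodeRange(0x2E80, 0x2FFF),
    CodeRange(0x3004, 0x3007), CodeRange(0x3021, 0x302F),
    CodeRange(0x3031, 0x303F),
    CodeRange(0x3040, 0xD7FF),
    CodeRange(0xF900, 0xFD3D), CodeRange(0xFD40, 0xFDCF),
    CodeRange(0xFDF0, 0xFE44), CodeRange(0xFE47, 0xFFFD),
];

// True if the code point falls inside any transcribed range.
bool inCpp11Ranges(dchar c)
{
    foreach (r; cpp11Ranges)
        if (c >= r.lo && c <= r.hi)
            return true;
    return false;
}

void main()
{
    // ² is U+00B2, ⅀ is U+2140, ∫ is U+222B.
    foreach (dchar c; "²⅀∫"d)
        writefln("U+%04X allowed: %s", cast(uint) c, inCpp11Ranges(c));
}
```

By these ranges ² (in 00B2-00B5) and ⅀ (in 2070-218F) are covered, while ∫ (U+222B) falls in the gap between 218F and 2460, so not every math symbol is allowed in C++11 either.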
January 11, 2013
On Fri, 11 Jan 2013 02:09:29 -0000, Andrey <andr-sar@yandex.ru> wrote:

> Should these variants serve as identifiers?

See:
http://dlang.org/lex.html#Identifier

"Identifiers start with a letter, _, or universal alpha, and are followed by any number of letters, _, digits, or universal alphas. Universal alphas are as defined in ISO/IEC 9899:1999(E) Appendix D. (This is the C99 Standard.) Identifiers can be arbitrarily long, and are case sensitive. Identifiers starting with __ (two underscores) are reserved."

Regan
