June 01, 2014 support for unicode in identifiers
I was pretty happy to find that I could use mu and sigma when writing statistical routines, but I've found that for more obscure non-ascii characters the support is hit or miss. For example, none of the subscripts are valid characters, but I can use superscript n as well as dot-notation for derivatives.

I'm using dmd 2.065. What's the story behind the scenes? Is there a rationale behind the supported/unsupported or is it happenstance? Is there anywhere I can find a list of supported characters?
June 01, 2014 Re: support for unicode in identifiers
Posted in reply to Vlad Levenfeld | On Sunday, 1 June 2014 at 22:26:42 UTC, Vlad Levenfeld wrote:
> I was pretty happy to find that I could use mu and sigma when writing statistical routines, but I've found that for more obscure non-ascii characters the support is hit or miss. For example, none of the subscripts are valid characters, but I can use superscript n as well as dot-notation for derivatives.
> I'm using dmd 2.065. What's the story behind the scenes? Is there a rationale behind the supported/unsupported or is it happenstance? Is there anywhere I can find a list of supported characters?
The allowed characters are those defined as "universal" in ISO/IEC 9899 (the C standard). It's a pretty long list, but almost entirely "alphas"; I'm actually surprised you got superscripts and some other things to work.
As I understand it, the intention was a) be like C99, and b) allow things like using "stærð" rather than "staerdh." I'm not sure usage like yours was even thought about, although I'd concede that it seems reasonable.
June 02, 2014 Re: support for unicode in identifiers
Posted in reply to Chris Nicholson-Sauls | With unicode support (especially with UFCS) I can really code more in the way I think. I never gave it much thought until I worked with D, but now that I have, I feel it is a bit weird to work with epsilons and deltas on paper and "eps" and "del" or something on the screen. And what's a more descriptive variable name than the symbol used for it in the canonical representations?

So, this may be a very naive question, but I wonder: since dmd is open source, is there somewhere that the list of supported symbols can be extended? (Hopefully something trivial to change, like a big array literal tucked away somewhere.) I'm currently looking through the files labeled 'lexer' and 'utf' and things like that on github, but nothing's jumped out at me yet.
June 02, 2014 Re: support for unicode in identifiers
Posted in reply to Vlad Levenfeld | Ah, found it in utf.h as ALPHA_TABLE.
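
For anyone wondering what a table like that typically looks like, here is a minimal sketch in C, assuming a sorted array of inclusive code-point ranges searched by binary search. This is not dmd's actual ALPHA_TABLE: the names `Range`, `alpha_ranges` and `is_universal_alpha` are hypothetical, and the four ranges are only a small excerpt recalled from the C99 list, so check them against the standard before relying on them.

```c
#include <stddef.h>

/* Inclusive range of Unicode code points. */
typedef struct {
    unsigned lo;
    unsigned hi;
} Range;

/* Hypothetical excerpt, sorted by 'lo'. NOT the full C99 list
   and NOT a copy of dmd's table. */
static const Range alpha_ranges[] = {
    {0x00C0, 0x00D6},
    {0x00D8, 0x00F6}, /* includes U+00E6 and U+00F0, as in "stærð" */
    {0x03A3, 0x03CE}, /* includes U+03BC (mu) and U+03C3 (sigma) */
    {0x207F, 0x207F}, /* superscript n */
};

/* Returns 1 if code point c falls inside one of the ranges, else 0.
   Binary search over the sorted, non-overlapping ranges. */
int is_universal_alpha(unsigned c)
{
    size_t lo = 0;
    size_t hi = sizeof alpha_ranges / sizeof alpha_ranges[0];

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (c < alpha_ranges[mid].lo)
            hi = mid;           /* c is below this range */
        else if (c > alpha_ranges[mid].hi)
            lo = mid + 1;       /* c is above this range */
        else
            return 1;           /* c is inside this range */
    }
    return 0;
}
```

With a table of this shape, supporting more symbols would amount to adding ranges while keeping the array sorted. In this excerpt, `is_universal_alpha(0x03BC)` (mu) is accepted while `is_universal_alpha(0x2082)` (subscript two) is rejected, which matches the behaviour described at the top of the thread.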