September 26, 2018
On Wednesday, 26 September 2018 at 06:50:47 UTC, Shachar Shemesh wrote:
> The properties that cause city names to be poor candidates for enum values are the same as those that make them Unicode candidates.

How so?

> City names (data, changes over time) as enums (compile time set) seem like a horrible idea.

In most cases yes. But not always. You might be doing some sort of game where certain cities are a central concept, not just data with properties. Another possibility is that you're using code as data, AKA scripting.

And who says anyway you can't make a program that's designed specifically for certain cities?
September 26, 2018
On 26/09/18 10:26, Dukc wrote:
> On Wednesday, 26 September 2018 at 06:50:47 UTC, Shachar Shemesh wrote:
>> The properties that cause city names to be poor candidates for enum values are the same as those that make them Unicode candidates.
> 
> How so?
> 
>> City names (data, changes over time) as enums (compile time set) seem like a horrible idea.
> 
> In most cases yes. But not always. You might be doing some sort of game where certain cities are a central concept, not just data with properties. Another possibility is that you're using code as data, AKA scripting.
> 
> And who says anyway you can't make a program that's designed specifically for certain cities?

Sure you can. It's just very poor design.

I think, when asking such questions, two types of answers are relevant. One is hypotheticals where you say "this design requires this". For such answers, the design needs to be a good one. It makes no sense to design a language to support a hypothetical design which is not a good one.

The other type of answer is "it's being done in the real world". If it's in active use in the real world, it might make sense to support it, even if we can agree that the design is not optimal.

Since your answer is hypothetical, I think arguing that this is not a good way to code is a valid response.

Shachar
September 26, 2018
On Wednesday, 26 September 2018 at 07:37:28 UTC, Shachar Shemesh wrote:
> The other type of answer is "it's being done in the real world". If it's in active use in the real world, it might make sense to support it, even if we can agree that the design is not optimal.
>
> Shachar

Two years ago, I took part in implementing a commercial game. It was made in C# (Unity) but I don't think that matters, since D would have faced the same thing, were it used.

Anyway, the game has three characters with completely different abilities. The abilities were unique enough that it made sense to name some functions after the characters. One of the characters really has a non-ASCII character in his name, and that meant naming him differently in the code.
September 26, 2018
On Fri, 21 Sep 2018 16:27:46 +0000, Neia Neutuladh wrote:

> I've got this coded up and can submit a PR, but I thought I'd get feedback here first.
> 
> Does anyone see any horrible potential problems here?
> 
> Or is there an interestingly better option?
> 
> Does this need a DIP?

I just want to point out since this thread is still living that there have been very few answers to the actual question ("should I submit my PR?").

Walter did answer the question, with the reasons that Unicode identifier support is not useful/helpful and could cause issues with tooling. Which is likely correct; and if we really want to follow this logic, Unicode identifier support should be removed from D entirely.

I don't recall seeing anyone in favor providing technical reasons, save the OP.

Especially since the work is done, it makes sense to me to ask for the PR for review. Worst case scenario, it sits there until we need it.
September 26, 2018
On 9/26/18 2:50 AM, Shachar Shemesh wrote:
> On 25/09/18 15:35, Dukc wrote:
>> Another reason is that something may not have a good translation to English. If there is an enum type listing city names, it is IMO better to write them as normal, using Unicode. CityName.seinäjoki, not CityName.seinaejoki.
> 
> This sounded like a very compelling example, until I gave it a second thought. I now fail to see how this example translates to a real-life scenario.
> 
> City names (data, changes over time) as enums (compile time set) seem like a horrible idea.
> 
> That may sound like a very technical objection to an otherwise valid point, but I really think that's not the case. The properties that cause city names to be poor candidates for enum values are the same as those that make them Unicode candidates.

Hm... I could actually see some "clever" use of opDispatch being used to define cities or other such names.
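
To make that concrete, here is a minimal sketch of both approaches discussed above. Everything in it is an assumption for illustration: the `CityName` members and the `Cities` struct are hypothetical and not taken from any real codebase. It relies on `ä` (U+00E4) being among the non-ASCII letters D accepts in identifiers, and the file must be saved as UTF-8.

```d
import std.stdio : writeln;

// Hypothetical enum: a fixed, compile-time set of city names.
enum CityName
{
    helsinki,
    seinäjoki, // non-ASCII letter in an identifier
}

// Hypothetical "clever" alternative: any member name is accepted at
// compile time and handed to opDispatch as a string.
struct Cities
{
    static string opDispatch(string name)()
    {
        return name;
    }
}

void main()
{
    writeln(CityName.seinäjoki);   // enum member, checked against the declared set
    writeln(Cities.seinäjoki());   // rewritten to Cities.opDispatch!"seinäjoki"()
}
```

The enum rejects any name that was not declared, while the opDispatch version accepts every identifier the lexer allows, so which non-ASCII characters are legal in identifiers matters for both.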

In any case, I think the biggest pro for supporting Unicode symbol names is -- we already support Unicode symbol names. It doesn't make a whole lot of sense to only support some of them.

-Steve
September 26, 2018
On 9/26/18 5:54 AM, rjframe wrote:
> On Fri, 21 Sep 2018 16:27:46 +0000, Neia Neutuladh wrote:
> 
>> I've got this coded up and can submit a PR, but I thought I'd get
>> feedback here first.
>>
>> Does anyone see any horrible potential problems here?
>>
>> Or is there an interestingly better option?
>>
>> Does this need a DIP?
> 
> I just want to point out since this thread is still living that there have
> been very few answers to the actual question ("should I submit my PR?").
> 
> Walter did answer the question, with the reasons that Unicode identifier
> support is not useful/helpful and could cause issues with tooling. Which
> is likely correct; and if we really want to follow this logic, Unicode
> identifier support should be removed from D entirely.

This is a non-starter. We can't break people's code, especially for trivial reasons like 'you shouldn't code that way because others don't like it'. I'm pretty sure Walter would be against removing Unicode support for identifiers.

> 
> I don't recall seeing anyone in favor providing technical reasons, save
> the OP.

There doesn't necessarily need to be a technical reason. In fact, there really isn't one -- people can get by with using ASCII identifiers just fine (and many/most people do). Supporting Unicode would be purely for social or inclusive reasons (it may make D more approachable to non-English speaking schoolchildren for instance).

As an only-English speaking person, it doesn't bother me either way to have Unicode identifiers. But the fact that we *already* support Unicode identifiers leads me to expect that we support *all* Unicode identifiers. It doesn't make a whole lot of sense to only support some of them.

> 
> Especially since the work is done, it makes sense to me to ask for the PR
> for review. Worst case scenario, it sits there until we need it.

I suggested this as well.

https://forum.dlang.org/post/poaq1q$its$1@digitalmars.com

I think it stands a good chance of getting incorporated, just for the simple fact that it's enabling and not disruptive.

-Steve
September 26, 2018
On Wednesday, 26 September 2018 at 02:12:07 UTC, Ali Çehreli wrote:
> On 09/24/2018 08:17 AM, 0xEAB wrote:
>
> > - Non-idiomatic translations of tech terms [2]
>
[snip]
> English message was something like "No memory left" and the German translation was "No memory on the left hand side" :)
>
> Ali

Not sure if this was not just some urban legend, but there was a delightful story back in the late 80s/early 90s about the early translation programs. They were in particular not very good at idiomatic translations, so people would play with idiomatic expressions from language X (say English) to language Y, and then back from Y to X, and then see what was returned.

Apparently the expression "the spirit is willing but the flesh is weak" translated to Russian and back was returned by one such program as:

"The vodka is good but the meat is rotten!"
September 26, 2018
On Wednesday, 26 September 2018 at 12:57:21 UTC, ShadoLight wrote:
> On Wednesday, 26 September 2018 at 02:12:07 UTC, Ali Çehreli wrote:
>> On 09/24/2018 08:17 AM, 0xEAB wrote:
>>
>> > - Non-idiomatic translations of tech terms [2]
>>
> [snip]
>> English message was something like "No memory left" and the German translation was "No memory on the left hand side" :)
>>
>> Ali
>
> Not sure if this was not just some urban legend, but there was a delightful story back in the late 80s/early 90s about the early translation programs. They were in particular not very good at idiomatic translations, so people would play with idiomatic expressions from language X (say English) to language Y, and then back from Y to X, and then see what was returned.
>
> Apparently the expression "the spirit is willing but the flesh is weak" translated to Russian and back was returned by one such program as:
>
> "The vodka is good but the meat is rotten!"

In case you missed it, this was widely reported in the tech news a month or so ago:

https://translate.google.fr/?hl=fr#so/en/ngoo%20m%20goon%20goob%20goo%20goo%20goo%20mgoo%20goo%20goo%20goo%20goo%20goo%20m%20goo

There is still progress to be made.
September 26, 2018
On Sunday, 23 September 2018 at 20:49:39 UTC, Walter Bright wrote:
> On 9/23/2018 9:52 AM, aliak wrote:
>
> There's a reason why dmd doesn't have international error messages. My experience with it is that international users don't want it. They prefer the English messages.

Yes please. Keep them in English.
But please also add an error code in front of them.

> I'm sure if you look hard enough you'll find someone using non-ASCII characters in identifiers.

It depends on what I'm developing.
If I'm writing a public library I'm planning to release on GitHub, I use English identifiers.

But of course if it is a piece of software for my company or for myself, I use Italian identifiers.

Andrea
September 26, 2018
On 9/25/2018 11:50 PM, Shachar Shemesh wrote:
> This sounded like a very compelling example, until I gave it a second thought. I now fail to see how this example translates to a real-life scenario.

Also, there are usually common ASCII versions of city names, such as Cologne for Köln.