March 01, 2009
std.locale
Sooner or later that will need to be defined. I know next to nothing 
about locales. (I know I dislike the design C++ uses.)

I was thinking of a design along the following lines. There are RFCs 
dedicated to locale nomenclature:

http://tools.ietf.org/html/rfc4646 for language names
http://www.unicode.org/cldr/ for various locale names

So we know the basic names we want to follow, which is one less burden. 
Then what I want to do is to define a hierarchical string table that 
fills the appropriate names.

This is in opposition to defining an actual class hierarchy that mimics 
the localization table. I think a hierarchical string table is better 
because it allows simple extensibility.

The type stored by each slot of a locale is:

Algebraic!(
    int,
    string,
    Variant delegate(Variant),
    This[string]);

meaning that a locale could store one of these types. (What else should 
go in there?)

The access pattern goes like:

// Get the date display pattern
auto pat = myLocale.get("calendars", "calendar=default",
    "dateFormats", "dateFormatLength=medium", "pattern");

This will return an Algebraic with a string in it. The string looks like 
e.g. "yyyy-MM-dd".

The access is rather verbose because the corresponding locale names tree 
is equally (actually more) verbose, see 
http://unicode.org/Public/cldr/1.6.1/core.zip. But the flexibility and 
the standards-compliance are there. Later we may add convenience 
functions for frequently used things such as dates, times, and numbers.

Extension is obvious:

myLocale.put("my-category", "my-slot", "whatever");

Later, getting the value at "my-category", "my-slot" will return an 
Algebraic holding the string "whatever".
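
To make the get/put access pattern concrete, here is a minimal sketch of a hierarchical string table. It is a simplification, not the proposed design: the `Node` class is hypothetical, leaves hold only strings rather than an Algebraic, and the last argument to `put` is assumed to be the value.

```d
import std.stdio : writeln;

/// Hypothetical node: a leaf value, a table of children, or both.
final class Node
{
    string value;          // set when something was stored at this path
    Node[string] children; // sub-tables keyed by name
}

/// Simplified stand-in for the proposed Locale (string leaves only).
final class Locale
{
    private Node root;

    this() { root = new Node; }

    /// Stores a value; the last argument is the value, the rest is the path.
    void put(string[] args...)
    {
        assert(args.length >= 2, "need at least one key and a value");
        Node node = root;
        foreach (key; args[0 .. $ - 1])
        {
            if (auto p = key in node.children)
            {
                node = *p;
            }
            else
            {
                // Create intermediate tables on demand.
                auto child = new Node;
                node.children[key] = child;
                node = child;
            }
        }
        node.value = args[$ - 1];
    }

    /// Returns the string stored at the path, or null if a key is missing.
    string get(string[] path...)
    {
        Node node = root;
        foreach (key; path)
        {
            auto p = key in node.children;
            if (p is null)
                return null;
            node = *p;
        }
        return node.value;
    }
}

void main()
{
    auto myLocale = new Locale;
    myLocale.put("calendars", "calendar=default", "dateFormats",
        "dateFormatLength=medium", "pattern", "yyyy-MM-dd");
    myLocale.put("my-category", "my-slot", "whatever");

    writeln(myLocale.get("calendars", "calendar=default", "dateFormats",
        "dateFormatLength=medium", "pattern")); // prints yyyy-MM-dd
    writeln(myLocale.get("my-category", "my-slot")); // prints whatever
}
```

Extension needs no new types in this sketch: an unknown category is just another path in the same table.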

There will be a global reference to a Locale class, e.g. defaultLocale. 
By default the reference will be null, implying the C locale should be 
in effect. Applications can assign to it as they find fit, and also pass 
around multiple locale variables.

So I wanted to gather some good ideas about locale design. Is a 
string-and-Algebraic design good for all uses? What kind of locale 
functionality does it not capture? I must have missed a ton of details, 
so if you don't understand what I mean by the above, it must be me.



Andrei
March 02, 2009
Re: std.locale
Andrei Alexandrescu wrote:
> There will be a global reference to a Locale class, e.g. defaultLocale. 
> By default the reference will be null, implying the C locale should be 
> in effect. Applications can assign to it as they find fit, and also pass 
> around multiple locale variables.

I disagree with being able to assign to the global defaultLocale. This 
is going to cause endless problems. Just one is that any function that 
uses locale can no longer be pure. defaultLocale should be immutable.

Any function that is locale aware should be parameterized with a locale 
parameter. (Not only is that better design, it self-documents the 
dependency.)
March 02, 2009
Re: std.locale
Andrei Alexandrescu wrote:
> Sooner or later that will need to be defined. I know next to nothing 
> about locales. (I know I dislike the design C++ uses.)


D uses UTF-8, and that is *good enough*!

This lets my programs "understand" Finnish, and doesn't give me undue 
headaches.


Seriously tending to locale issues would be an *endless swamp*. Just for 
this, I looked up something suitable to read:

http://www.manpagez.com/man/1/perllocale/

It may even be that you would find the time, but think about Walter and 
us, please. There *really are* other things to do.


An excellent string hierarchy without the entire rest of i18n, is only 
going to look like a Ferrari with a Trabant engine. Which is worse than 
nothing at all.

Besides, there's more to this than just designing the perfect, or even a 
good locale system in a language. *Somebody should actually use it*.

Now, the non-English programmer, what does he really want? He wants to 
be able to type stuff into his program in his native character set. D 
already does that, by way of UTF-8.

What else? Well, it is conceivable that he wants his program to print 
dates and times the way it's done over there. He simply writes the 
program "by hand" so it does dates and times like he wants. Even if 
there was a locale thing in the language, he wouldn't bother with the 
hassle. And he couldn't care less about Urdu.

The hypothetical Ambitious Programmer might want to use locale. He could 
then have the dates and times (and currencies, etc.) follow the country. 
Now, that might sound commendable, but in practice it *crumbles*.
He can't possibly know how to deal with languages that are written 
backwards, languages where several characters make one letter, exotic 
ways of writing dates, etc.

So, his fancy i18n project is doomed to be, at most, as usable as the 
"normal" D program. Probably less, since his decisions will actually 
worsen the user experience -- for users in another culture.


And, any project big enough to tackle this, will implement its own 
locale handling anyway. I'm sorry to say.

----

Yes, locales are nice and all.
For D 3.5 that is.
Honestly.
March 02, 2009
Re: std.locale
Walter Bright wrote:
> Andrei Alexandrescu wrote:
>> There will be a global reference to a Locale class, e.g. 
>> defaultLocale. By default the reference will be null, implying the C 
>> locale should be in effect. Applications can assign to it as they find 
>> fit, and also pass around multiple locale variables.
> 
> I disagree with being able to assign to the global defaultLocale. This 
> is going to cause endless problems. Just one is that any function that 
> uses locale can no longer be pure. defaultLocale should be immutable.
> 
> Any function that is locale aware should be parameterized with a locale 
> parameter. (Not only is that better design, it self-documents the 
> dependency.)

I don't understand this. That means there's no more default locale. 
Here's what I had in mind:

class Locale { ... }

// function parameterized with an optional locale
void foo(Data d, Locale loc = null);

So there's no more default locale. If you pass in null, that's the 
default locale.
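
A minimal sketch of that convention, with a locale-aware function taking an optional parameter where null selects C-locale behaviour. The `decimalSeparator` field and the `formatAmount` function are hypothetical names used only for illustration.

```d
import std.format : format;
import std.stdio : writeln;

/// Hypothetical, heavily reduced Locale carrying a single setting.
final class Locale
{
    string decimalSeparator;

    this(string decimalSeparator)
    {
        this.decimalSeparator = decimalSeparator;
    }
}

/// Formats an amount given as whole units and hundredths.
/// Passing no locale (null) falls back to the C locale's ".".
string formatAmount(int whole, int hundredths, Locale loc = null)
{
    string sep = loc is null ? "." : loc.decimalSeparator;
    return format("%d%s%02d", whole, sep, hundredths);
}

void main()
{
    writeln(formatAmount(12, 5));          // prints 12.05
    auto fi = new Locale(",");
    writeln(formatAmount(12, 5, fi));      // prints 12,05
}
```

The dependency on a locale is visible in the signature, and callers that do not care simply omit the argument.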


Andrei
March 02, 2009
Re: std.locale
Walter Bright wrote:
> Andrei Alexandrescu wrote:
>> There will be a global reference to a Locale class, e.g. 
>> defaultLocale. By default the reference will be null, implying the C 
>> locale should be in effect. Applications can assign to it as they find 
>> fit, and also pass around multiple locale variables.
> 
> I disagree with being able to assign to the global defaultLocale. This 
> is going to cause endless problems. Just one is that any function that 
> uses locale can no longer be pure. defaultLocale should be immutable.

The two programs that are most "locale aware" are usually spreadsheets 
and word processors.

It is usual that the user needs to write, say, in Swedish or in Russian, 
while in a Finnish setting. Or that one wants to use a decimal separator 
other than what is "proper" for the country.

For example, a lot of people use "." instead of the official "," in 
Finland, and many use time as "18:23" instead of "18.23".


For this purpose, these programs let the users define these any way they 
want.

I think the notion of locales is, slowly but steadily, going away.

It was a nice idea at the time, but with two problems: users don't use 
it, and programmers don't use it.


Of course, eventually we will want to "do something" about this. But 
that should be left to the day when real issues are all sorted out in D. 
This is a non-urgent, low-priority thing.
March 02, 2009
Re: std.locale
Georg Wrede wrote:
> Andrei Alexandrescu wrote:
>> Sooner or later that will need to be defined. I know next to nothing 
>> about locales. (I know I dislike the design C++ uses.)
> 
> 
> D uses UTF-8, and that is *good enough*!
> 
> This lets my programs "understand" Finnish, and doesn't give me undue 
> headaches.
> 
> 
> Seriously tending to locale issues would be an *endless swamp*. Just for 
> this, I looked up something suitable to read:
> 
> http://www.manpagez.com/man/1/perllocale/
> 
> It may even be that you would find the time, but think about Walter and 
> us, please. There *really are* other things to do.

I don't find that scary at all. It's quite what I expected. We should 
phase it in, after we do a good design. Also I don't plan to sit down 
and write locale definition files, I want to parse the XML in that 
locale repository I referred to.

> An excellent string hierarchy without the entire rest of i18n, is only 
> going to look like a Ferrari with a Trabant engine. Which is worse than 
> nothing at all.

I don't understand this. What is the rest of i18n?

> Besides, there's more to this than just designing the perfect, or even a 
> good locale system in a language. *Somebody should actually use it*.
> 
> Now, the non-English programmer, what does he really want? He wants to 
> be able to type stuff into his program in his native character set. D 
> already does that, by way of UTF-8.
> 
> What else? Well, it is conceivable that he wants his program to print 
> dates and times the way it's done over there. He simply writes the 
> program "by hand" so it does dates and times like he wants. Even if 
> there was a locale thing in the language, he wouldn't bother with the 
> hassle. And he couldn't care less about Urdu.

If we come up with a good design, then they will be compelled to use it. 
Applications meant to be used across multiple countries have fumbled 
with locale support because there's no good support in most languages. 
So then why not offer compelling support in D?

> The hypothetical Ambitious Programmer might want to use locale. He could 
> then have the dates and times (and currencies, etc.) follow the country. 
> Now, that might sound commendable, but in practice it *crumbles*.
> He can't possibly know how to deal with languages that are written 
> backwards, languages where several characters make one letter, exotic 
> ways of writing dates, etc.

Well my understanding is that the guys who wrote those RFCs and whatnot 
spent time figuring out the right abstractions. Why not use them?

> So, his fancy i18n project is doomed to be, at most, as usable as the 
> "normal" D program. Probably less, since his decisions will actually 
> worsen the user experience -- for users in another culture.
> 
> 
> And, any project big enough to tackle this, will implement its own 
> locale handling anyway. I'm sorry to say.

They will implement their own because the language doesn't offer an 
extensible framework that they can build on.

> Yes, locales are nice and all.
> For D 3.5 that is.
> Honestly.

I just don't see where the big problem is. I'm talking about a blessed 
hierarchical hashtable to begin with. My initial desire is to be able to 
customize the array separators in writeln.


Andrei
March 02, 2009
Re: std.locale
Georg Wrede wrote:
> Walter Bright wrote:
>> Andrei Alexandrescu wrote:
>>> There will be a global reference to a Locale class, e.g. 
>>> defaultLocale. By default the reference will be null, implying the C 
>>> locale should be in effect. Applications can assign to it as they 
>>> find fit, and also pass around multiple locale variables.
>>
>> I disagree with being able to assign to the global defaultLocale. This 
>> is going to cause endless problems. Just one is that any function that 
>> uses locale can no longer be pure. defaultLocale should be immutable.
> 
> The two programs that are most "locale aware" are usually spreadsheets 
> and word processors.
> 
> It is usual that the user needs to write, say, in Swedish or in Russian, 
> while in a Finnish setting. Or that one wants to use a decimal separator 
> other than what is "proper" for the country.
> 
> For example, a lot of people use "." instead of the official "," in 
> Finland, and many use time as "18:23" instead of "18.23".
> 
> 
> For this purpose, these programs let the users define these any way they 
> want.

That's exactly what my proposal is doing. People can start with the 
defaults of the Finnish locale and then overwrite whichever parts they want.
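
As a rough sketch of "start from defaults, then overwrite", assuming a flat table in place of the hierarchical one. The key names and the `withOverrides` helper are hypothetical; the Finnish values are the ones quoted above.

```d
import std.stdio : writeln;

/// Returns a copy of `base` with every entry of `overrides` applied.
string[string] withOverrides(string[string] base, string[string] overrides)
{
    auto result = base.dup; // leave the defaults untouched
    foreach (key, value; overrides)
        result[key] = value;
    return result;
}

void main()
{
    // Hypothetical stand-in for a loaded Finnish locale.
    string[string] finnish = [
        "decimalSeparator": ",",
        "timeSeparator": "."
    ];

    // The user keeps the Finnish defaults except for these two slots.
    auto mine = withOverrides(finnish, [
        "decimalSeparator": ".",
        "timeSeparator": ":"
    ]);

    writeln("18", mine["timeSeparator"], "23");    // prints 18:23
    writeln("18", finnish["timeSeparator"], "23"); // prints 18.23
}
```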

> I think the notion of locales is, slowly but steadily, going away.

Do you have any data backing this up?

> It was a nice idea at the time, but with two problems: users don't use 
> it, and programmers don't use it.

Is it because it hasn't been properly packaged?

> Of course, eventually we will want to "do something" about this. But 
> that should be left to the day when real issues are all sorted out in D. 
> This is a non-urgent, low-priority thing.

I guess. Now please tell me how I print arrays in D.


Andrei
March 02, 2009
Re: std.locale
Andrei Alexandrescu wrote:
> Walter Bright wrote:
>> Andrei Alexandrescu wrote:
>>> There will be a global reference to a Locale class, e.g. 
>>> defaultLocale. By default the reference will be null, implying the C 
>>> locale should be in effect. Applications can assign to it as they 
>>> find fit, and also pass around multiple locale variables.
>>
>> I disagree with being able to assign to the global defaultLocale. This 
>> is going to cause endless problems. Just one is that any function that 
>> uses locale can no longer be pure. defaultLocale should be immutable.
>>
>> Any function that is locale aware should be parameterized with a 
>> locale parameter. (Not only is that better design, it self-documents 
>> the dependency.)
> 
> I don't understand this. That means there's no more default locale. 
> Here's what I had in mind:
> 
> class Locale { ... }
> 
> // function parameterized with an optional locale
> void foo(Data d, Locale loc = null);
> 
> So there's no more default locale. If you pass in null, that's the 
> default locale.

That's fine, I was thrown off by your reference to a "global reference".
March 02, 2009
Re: std.locale
Georg Wrede wrote:
> What else? Well, it is conceivable that he wants his program to print 
> dates and times the way it's done over there. He simply writes the 
> program "by hand" so it does dates and times like he wants. Even if 
> there was a locale thing in the language, he wouldn't bother with the 
> hassle. And he couldn't care less about Urdu.

I've attempted to use locales, but the reason I'd always wind up doing 
it by hand is because the existing libraries to do it are obtuse, 
impenetrable, execrable, and pretty much unusable.

So it may be that it's an insoluble problem, or maybe nobody has come up 
with the right abstraction yet. I don't have nearly enough experience 
with it to know the answer.
March 02, 2009
Re: std.locale
Walter Bright wrote:
> Andrei Alexandrescu wrote:
>> Walter Bright wrote:
>>> Andrei Alexandrescu wrote:
>>>> There will be a global reference to a Locale class, e.g. 
>>>> defaultLocale. By default the reference will be null, implying the C 
>>>> locale should be in effect. Applications can assign to it as they 
>>>> find fit, and also pass around multiple locale variables.
>>>
>>> I disagree with being able to assign to the global defaultLocale. 
>>> This is going to cause endless problems. Just one is that any 
>>> function that uses locale can no longer be pure. defaultLocale should 
>>> be immutable.
>>>
>>> Any function that is locale aware should be parameterized with a 
>>> locale parameter. (Not only is that better design, it self-documents 
>>> the dependency.)
>>
>> I don't understand this. That means there's no more default locale. 
>> Here's what I had in mind:
>>
>> class Locale { ... }
>>
>> // function parameterized with an optional locale
>> void foo(Data d, Locale loc = null);
>>
>> So there's no more default locale. If you pass in null, that's the 
>> default locale.
> 
> That's fine, I was thrown off by your reference to a "global reference".

Well I was thinking a global reference might be handy for people who 
e.g. want to set the locale once and then be done with it. I think only 
a few apps actually manipulate multiple locales simultaneously. Most 
would just want to load the locale present on the user's computer and 
then use it.

Andrei