March 03, 2009
Re: std.locale
On Mon, 02 Mar 2009 06:28:12 -0800, Andrei Alexandrescu wrote:


> You're right, we won't engage in the business of maintaining locale 
> databases. We provide mechanism, not policy.

OK, for a while there I thought you were attempting to duplicate the work
that operating systems already do.

I see locale support in D as being a platform-independent method of
invoking existing operating system functionality.

-- 
Derek Parnell
Melbourne, Australia
skype: derek.j.parnell
March 03, 2009
Re: std.locale
On Mon, 02 Mar 2009 07:02:10 -0800, Andrei Alexandrescu wrote:

> Consider some code in phobos that must throw an exception:
> 
> throw Exception("File `%s' not found, system error is %s.",
>      filename, errnomsg);
> 
> The localized version will look like this:
> 
> auto format = "File `%s' not found, system error is %s.";
> auto localFormat = currentLocale ? currentLocale.peek(format) : null;
> if (!localFormat) localFormat = format;
> throw Exception(localFormat, filename, errnomsg);

One problem with this approach is that we meet the limitation of the
formatting string's micro-syntax. Currently, there is no way to reorder the
tokens in a message string, and that is required for /some/ messages in
/some/ languages.

I have used my own text formatting routine rather than Phobos' because it
allows the implementer to develop messages whose word order is correct for
their target language.

-- 
Derek Parnell
Melbourne, Australia
skype: derek.j.parnell
March 03, 2009
Re: std.locale
Derek Parnell wrote:
> On Mon, 02 Mar 2009 07:02:10 -0800, Andrei Alexandrescu wrote:
> 
>> Consider some code in phobos that must throw an exception:
>>
>> throw Exception("File `%s' not found, system error is %s.",
>>      filename, errnomsg);
>>
>> The localized version will look like this:
>>
>> auto format = "File `%s' not found, system error is %s.";
>> auto localFormat = currentLocale ? currentLocale.peek(format) : null;
>> if (!localFormat) localFormat = format;
>> throw Exception(localFormat, filename, errnomsg);
> 
> One problem with this approach is that we meet the limitation of the
> formatting string's micro-syntax. Currently, there is no way to reorder the
> tokens in a message string, and that is required for /some/ messages in
> /some/ languages.
> 
> I have used my own text formatting routine rather than Phobos' because it
> allows the implementer to develop messages whose word order is correct for
> their target language.
> 

Phobos has supported Posix positional syntax since 2.006.

http://digitalmars.com/d/2.0/phobos/std_stdio.html
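
For illustration, here is a minimal sketch of how positional specifiers let a translated message reorder its arguments. It assumes std.format's `format` accepts the POSIX `%N$s` form mentioned above; the helper name `fileNotFound` and the reordered pattern are made up for this example and are not part of Phobos.

```d
import std.format : format;

/// Hypothetical helper: builds the "file not found" message from a
/// caller-supplied pattern. Positional specifiers (%1$s, %2$s) refer to
/// the arguments by index, so a translation can put them in any order.
string fileNotFound(string pattern, string filename, string errnomsg)
{
    return format(pattern, filename, errnomsg);
}
```

Passing "System error %2$s: file `%1$s' not found." as the pattern would place errnomsg before filename without any change to the calling code.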


Andrei
March 03, 2009
Re: std.locale
On 2009-03-02 14:58:26 -0500, Walter Bright <newshound1@digitalmars.com> said:

> It's a silly thing, but I love the little google widget you can add to 
> a web page to automatically translate the pages. All the D site pages 
> have it in the left column.

It's not a silly thing, it's hilarious. Look, Google has invented the 
D-French language:

-	import std.stdio;
+	std.stdio importation;
----------
-	delete cl;
+	supprimer cl;
----------
-	s.allocated += argv.length * typeof (argv[0]).sizeof;
+	s.allocated + = * argv.length typeof (argv [0]). sizeof;
----------
-	writefln( "argc = %d, "  ~ "allocated = %d" ,
-	  argspecs().count, argspecs().allocated);
+	Writefln ( "argc =% d," ~ "attribués =% d",
+	  argspecs (). count, argspecs (). alloué);
----------
-	this ( int  argc, string argv) // constructor
+	ce (int argc, string argv) / / constructeur

Funny French that is. Perhaps DMD should make its identifiers and 
keywords localizable, the result would be much better. :-)

-- 
Michel Fortin
michel.fortin@michelf.com
http://michelf.com/
March 03, 2009
Re: std.locale
Michel Fortin wrote:
> On 2009-03-02 14:58:26 -0500, Walter Bright <newshound1@digitalmars.com> 
> said:
> 
>> It's a silly thing, but I love the little google widget you can add to 
>> a web page to automatically translate the pages. All the D site pages 
>> have it in the left column.
> 
> It's not a silly thing, it's hilarious. Look, Google has invented the 
> D-French language:

One bug in Google's translator is that there's no way to tell it to ignore a 
section, such as a code section.
March 03, 2009
Re: std.locale
On Mon, 02 Mar 2009 18:36:09 -0800, Andrei Alexandrescu wrote:

> Phobos has supported Posix positional syntax since 2.006.
> 
> http://digitalmars.com/d/2.0/phobos/std_stdio.html

Thank you. I was behind the times (again). 

-- 
Derek Parnell
Melbourne, Australia
skype: derek.j.parnell
March 03, 2009
Re: std.locale
On 2009-03-02 10:02:10 -0500, Andrei Alexandrescu 
<SeeWebsiteForEmail@erdani.org> said:

> Michel Fortin wrote:
>> I think there are three aspects to localization. One is date and number 
>> formating. Another is offering a facility for translating all the 
>> messages an application can give. And the last one is the configuration 
>> part, where you know which format to use.
> 
> Sounds like a good start.
> 
>> The only problem I've seen addressed by you right now is the 
>> configuration part; I believe it's the wrong end to start with.
>> 
>> We should start by defining how to perform the tasks I enumerated 
>> above: translating date and number formats, selecting strings for a 
>> given language. After that we can figure out how to pass the proper 
>> default configuration around. And then you're done.
>> 
>> For date and number formatting, I like very much the NSDateFormatter 
>> and NSNumberFormatter approach in Cocoa for instance: you have a base 
>> class to format dates, another for numbers; you can easily create your 
>> own subclass if you want, and there's a way to get the default 
>> formatter instance.
> 
> Well I was thinking of passing the buck around. Instead of std.locale 
> defining a hierarchy for formatting numbers and dates, it provides a 
> means for user code to plant a routine in the locale object that knows 
> how to format numbers and dates. Of course, with time default localized 
> routine implementations will show up (hopefully contributed to by 
> people), but the basic mechanism is simple - there exists a locale 
> table that allows you to store a delegate in it.

Looks somewhat like what I proposed. But the point I was trying to make 
is that you don't need to group all of these into one big object called a 
"locale".

Instead of seeing a locale as a central object for localizing every 
kind of data, I'm suggesting that we have different kinds of formatters 
capable of localizing different kinds of data. Each formatter would 
have its own definition of a locale that suits its needs. All you need 
is a standardized naming scheme for locales compatible between 
formatters, but that we have.

Note that while I've proposed that formatters be classes, I have no 
problem in them being structs which could be accepted in template 
functions.

What's good about a class, or a struct, is that it can group a bunch 
of related functions. For instance, you could have a number formatter 
help you display the right string, read a formatted string, and 
validate a formatted string. And you could configure the formatter for 
a fixed number of decimals, specific rounding behaviour, negative 
format, etc.
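
A rough sketch of such a formatter as a struct, under several assumptions: the name `NumberFormatter`, its fields, and its method names are hypothetical, and only the decimal count and decimal separator are configurable here. Rounding modes, grouping, and negative formats are left out.

```d
import std.array : replace;
import std.conv : to, ConvException;
import std.format : format;

/// Hypothetical formatter grouping display, parsing, and validation.
struct NumberFormatter
{
    int decimals = 2;              // fixed number of decimals to display
    string decimalSeparator = "."; // e.g. "," for French

    /// Renders a value with the configured decimals and separator.
    string toText(double value)
    {
        return format("%.*f", decimals, value).replace(".", decimalSeparator);
    }

    /// Reads a string written with the configured separator.
    /// Throws ConvException if the text is not a number.
    double parse(string text)
    {
        return text.replace(decimalSeparator, ".").to!double;
    }

    /// Reports whether parse would accept the text.
    bool isValid(string text)
    {
        try { parse(text); return true; }
        catch (ConvException) { return false; }
    }
}
```

Because it is a plain struct, an instance such as `NumberFormatter(2, ",")` could be passed to a template function without any class hierarchy.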


>> This is extensible, because if you wanted to go further, you could add 
>> formatter classes for various units (length, mass...), or anything else.
> 
> This I want to avoid, at least for the time being. I want to define a 
> table that can contain strings, integers, delegates, and other 
> sub-tables. This is it. The path to extensibility will not be Phobos 
> defining new classes to format various things. This could go on 
> forever. Phobos will use the table consistently, and users who do want 
> to format various things will simply plant their delegates in the table.

Well, when I said "you", I really meant anyone, and not necessarily 
inside Phobos. That was just to point out that the design is 
extensible. Sorry, it was confusing.


>> Translating strings is a little harder because 1) strings are 
>> application-defined, 2) strings are often not available in the user's 
>> preferred language, adding the need for a fallback mechanism, and 3) 
>> different applications will want to store those strings in different 
>> ways. Perhaps we could define a base class for getting translated 
>> strings, then allow the program to use whatever subclass it wants.
> 
> There's no need for classes and subclasses. It's all data. Why should 
> we replace data with code? Data is easier.
> 
> Consider some code in phobos that must throw an exception:
> 
> throw Exception("File `%s' not found, system error is %s.",
>      filename, errnomsg);
> 
> The localized version will look like this:
> 
> auto format = "File `%s' not found, system error is %s.";
> auto localFormat = currentLocale ? currentLocale.peek(format) : null;
> if (!localFormat) localFormat = format;
> throw Exception(localFormat, filename, errnomsg);
> 
> What happens is that the default format string _is_ the key for looking 
> up the localized strings. If there's no value for that string, the 
> default format string is used. Note that on the default path, 
> currentLocale is null so there is hardly any inefficiency.

Firstly, while you and I both agree that it's good that the key for 
searching a localized string be a readable message, not everyone does. 
It often doesn't work well when you want to translate small words 
having an overloaded meaning in English for instance.

Secondly, always falling back to English (or the developer's locale) 
when the currentLocale is not available isn't flexible enough. On Mac 
OS X for instance, you can select a number of languages for 
applications to use in order of preference. When the first isn't 
available, it looks for the second (skipping some details).
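
A minimal sketch of that preference-ordered lookup, assuming translations are held in memory as one table per language tag. The function name `pickTranslation` and the table layout are assumptions for illustration; this is not how Mac OS X or Phobos stores them.

```d
/// Hypothetical lookup. `tables` maps a language tag (e.g. "fr") to that
/// language's translations, keyed by the default (developer) string.
/// `preferred` lists the user's languages in order of preference.
string pickTranslation(string[string][string] tables,
                       string[] preferred, string key)
{
    foreach (lang; preferred)
    {
        // Skip languages with no table, or with no entry for this key.
        if (auto table = lang in tables)
            if (auto hit = key in *table)
                return *hit;
    }
    return key; // nothing matched: fall back to the default string
}
```

With preferences ["fr", "de"], a key missing from the French table is looked up in the German one before the default string is returned.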

Thirdly, I hope you don't expect everyone to write the above each time. 
We should provide a nice function to do the localization, say 
"localize"? This function should really be an overridable delegate.

	auto format = "File `%s' not found, system error is %s.";
	throw Exception(localize(format), filename, errnomsg);

Fourthly, various libraries are likely to provide their own translation 
tables (perhaps even in various formats). Unless you merge them all 
(risking some clashes), you may want a second argument for specifying 
the translation table to use.

	auto format = "File `%s' not found, system error is %s.";
	throw Exception(localize(format, PHOBOS), filename, errnomsg);
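
One way the `localize` idea could be sketched, with the translation step as a replaceable delegate and the table named by a second argument. Everything here is an assumption made for illustration: the `Localizer` alias, the module-level `localizer` hook, the use of a string such as "phobos" to name a table, and the convention that an empty result means "no translation".

```d
/// Hypothetical hook type: maps (message, table name) to a translation,
/// or returns an empty string when it has none.
alias Localizer = string delegate(string message, string domain);

/// The installed hook. It stays null until the application sets it,
/// so the default path costs one comparison.
Localizer localizer;

/// Returns the translation if a hook is installed and knows the message,
/// otherwise the original message unchanged.
string localize(string message, string domain = "phobos")
{
    if (localizer is null)
        return message;
    auto translated = localizer(message, domain);
    return translated.length ? translated : message;
}
```

An application would assign its own delegate to `localizer` at startup; libraries would pass their own table name as the second argument to avoid clashing keys.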

Finally, no current library addresses this, but it'd be great if there was 
a way to correctly manage plurals in all languages. Perhaps making a 
word parametrizable depending on a number...


>> Notice how I'm not using the word "locale" to talk about these things. 
>> "Locale" is a concept too abstract to be able to do something good with 
>> it. Since you could only define it using Algebraic type and a loosely 
>> defined tree of strings, that seems to confirm my view. Call the module 
>> std.locale if you want, but keep in mind that the most important task 
>> at hand is facilitating localization, not defining what constitutes a 
>> locale, that can wait.
> 
> How should I call it?

My point was that there shouldn't be a class/struct/thing representing 
a locale. Having a collection of formatters, each knowing where to get 
their locale information (when given a locale name) would work better 
in my opinion.


-- 
Michel Fortin
michel.fortin@michelf.com
http://michelf.com/
March 03, 2009
Re: std.locale
On 2009-03-02 16:42:37 -0500, Andrei Alexandrescu 
<SeeWebsiteForEmail@erdani.org> said:

> I want to put together a string-based hierarchical string table that 
> allows depositing ALL OF THE ABOVE in it, without initially putting 
> ANYTHING in it. What's nice is that others have already defined the 
> keys and the possible values used by that table.
> 
> Possibly you are missing one or more of the following points:
> 
> 1) The existence of a hierarchical nomenclature for localization;
> 
> 2) The existence of a large database containing localized values for 
> said nomenclature;
> 
> 3) The power of Algebraic, which allows depositing data, functions, and 
> subtables alike in a uniform format.

What I'm missing is a justification as to why you need all this data in 
a common deposit in the first place. How do you justify the need for 
that? Which function needs this data, and why does using an Algebraic 
make it better than other approaches?

As for the large database, I have nothing against using an existing large 
database, but I'd rather see my app use whatever is part of the 
underlying OS first, then rely on an external database if that is 
insufficient.

Your approach seems to be this: Unicode defines a huge database 
containing all kinds of locale information, let's expose that, allow 
other people to plug their own data inside, and use that as the 
standard format for passing locale data to various functions.

I only oppose the last part -- the "use that as the standard format for 
passing locale data to various functions" part. That you're using 
Algebraic does not change the fact that various functions will search for 
data at certain places in the structure. If the data isn't there, because 
you want to use some other formatting system, you'll get wrong results.

Perhaps you should explain more how you see this used in the context 
where we want to localize some data, how we can use it to define our 
own data, etc. Because this discussion is lost in generalities and vague 
ideas right now.


-- 
Michel Fortin
michel.fortin@michelf.com
http://michelf.com/
March 03, 2009
Re: std.locale
Andrei Alexandrescu wrote:
> Jarrett Billingsley wrote:
>> On Mon, Mar 2, 2009 at 1:52 PM, Georg Wrede <georg.wrede@iki.fi> wrote:
>>> My take:
>>>
>>>  * This is still a moving target
>>>  * Using this is a major hassle for the programmer
>>>  * With D2 itself a moving target, nobody is going to invest enough time in
>>> this to actually use it for something worthwhile in the next 6 to 12 months
>>> anyway
>>>  * This is more application level stuff than language level stuff
>>>  * Doing this now will steal time from you, Walter, and many of us, both
>>> directly, and indirectly by leeching bandwidth in the newsgroup -- time that
>>> should be spent on more urgent or more important things, or even
>>> documentation
>>>  * If it's so easy to do, then why not do it a week before the release of
>>> final D2
>>
>> I agree entirely.  Localization and internationalization seem like
>> things that should be at a much higher level than a standard library.
>> Everyone's going to want to do it differently.  Providing a thin,
>> cross-platform wrapper over what the OS exposes is fine, but creating
>> a proper i18n/l10n framework is a huge project in and of itself (I
>> think the 140MB Java package makes that abundantly clear).
> 
> I must be missing something huge because I keep on misunderestimating 
> (sic :o)) the scope of this project.

I agree. :-)

> Let me try to state my point again: I don't want to provide 
> locale-specific strings, collation orders, date, time, and number 
> formatters, or class hierarchies that do all of the above. Zip. Nada. 
> Zilch.
> 
> I want to put together a string-based hierarchical string table that 
> allows depositing ALL OF THE ABOVE in it, without initially putting 
> ANYTHING in it. What's nice is that others have already defined the keys 
> and the possible values used by that table.

One of the problems is that people start expecting something if they find 
this string repository. They'd expect some of the work you said you 
won't provide to be done already. And if the table isn't even 
*prepopulated*, then people really feel stranded. It doesn't help much to 
state in the docs "if you need to fill it, go to http://whatever, and hope 
the format hasn't changed".

Besides, on that site, what exactly should be downloaded is unobvious 
enough that the new user will probably not bother. Nor will the normal app 
programmer.

> Possibly you are missing one or more of the following points:
> 
> 1) The existence of a hierarchical nomenclature for localization;

With a hammer in hand, everything looks like a nail. With a swiss army 
knife in your hand, nothing in the house is safe.

> 2) The existence of a large database containing localized values for 
> said nomenclature;

So where will this be stored? In a .dmdrc directory in the user's home? 
One per system? Or does every app store it in a .ini file? Is this per app 
or common to all of the user's apps?

And when it's updated (by whom?), will all his own settings vanish? Or is 
there a mechanism (or does he have to invent one?) for reattaching his 
own settings after the update?

> 3) The power of Algebraic, which allows depositing data, functions, and 
> subtables alike in a uniform format.

Seriously however, Algebraic does sound cool! No question.
March 03, 2009
Re: std.locale
Georg Wrede wrote:
[snip]

Well I guess what I'll do is take the path of least resistance - 
nothing. Looks like locales are rather unpopular...

Actually I will do something. I'll start removing some of the silly 
Exception derivees from std.


Andrei