Thread overview
Some Thoughts On String Interpolation [l10n, restricting access, AA]
Oct 26, 2023  kdevel
Oct 26, 2023  Imperatorn
Oct 26, 2023  Adam D Ruppe
Oct 27, 2023  kdevel
Oct 27, 2023  Adam D Ruppe
Oct 27, 2023  kdevel
Oct 27, 2023  Adam D Ruppe
Oct 27, 2023  Nick Treleaven
Oct 27, 2023  kdevel
Oct 28, 2023  Paul Backus
Oct 28, 2023  monkyyy
Nov 01, 2023  kdevel
Nov 02, 2023  Paul Backus
Nov 02, 2023  JN
October 26, 2023

Localization

A few days ago this example was posted in the "Learn" group [1]:

writeln(i"You drink $coffees cups a day and it gives you $(coffees + iq) IQ");

A German version would read

writeln(i"Sie trinken $coffees Tassen Kaffee am Tag. Dadurch erhöht sich Ihr IQ auf $(coffees + iq).");

How could the language version be selected (at runtime)? BTW: Is there a "D way of localization"?

Restricting Access

What about

writeln(i"You drink $coffees cups a day and it gives you $(password) IQ");

How is it prevented that the person doing the localization puts arbitrary variable or code references into the localized strings?

Is this attack vector already known? And if so: Has it been named?

Accessing Fields Of A Struct

Having data in a struct variable s

struct S {
   string value;
}

S s;

would this work out-of-the-box?:

with (s) writeln(i"The name in s is $(value)");

Accessing Elements Of An AA

string[string] aa;
aa["name"] = "value";
writeln (i"The name in s is $(aa[\"name\"])"

That typing is laborious. Isn't there a way to bind the expression to the keys of the AA?

Nesting

Should it nest? How deep? What is the syntax?

writeln(i"does this work $(a + \"$(b)\" + c)");

to be continued

[1] https://forum.dlang.org/post/rkevblyfgvmvmsuhtqmr@forum.dlang.org

October 26, 2023

On Thursday, 26 October 2023 at 11:18:46 UTC, kdevel wrote:

>

Localization

A few days ago this example was posted in the "Learn" group [1]:

[...]

The questions anyone has, or will have, are more or less already answered if we copy what some other languages do.

C#, for example. Just do what they do.

October 26, 2023

On Thursday, 26 October 2023 at 11:18:46 UTC, kdevel wrote:

>

Localization

A few days ago this example was posted in the "Learn" group [1]:

writeln(i"You drink $coffees cups a day and it gives you $(coffees + iq) IQ");

A German version would read

writeln(i"Sie trinken $coffees Tassen Kaffee am Tag. Dadurch erhöht sich Ihr IQ auf $(coffees + iq).");

How could the language version be selected (at runtime)? BTW: Is there a "D way of localization"?

With GNU gettext, you'd first pass the string through a tr() function, which lets it swap out at runtime. (You'd have to remember to do this though, since writeln will accept a generic string without this step... unless writeln itself started wrapping through a standard translator function... but that's another story.)

I wrote a sample for the interpolated version here, but the translations might not be obvious. Let me add your example as a concrete thing.

https://github.com/adamdruppe/interpolation-examples/blob/master/04-internationalization.d

It is now added there; running that program (at the time of this writing) gives:

I, Adam, have a singular apple.
I, Adam, have a singular apple.
I, Adam, have 5 apples.
I, Adam, have 5 apples.
GG Adam
GG Adam
ggs 5, Adam
ggs 5, Adam
You drink 5 cups a day and it gives you -25 IQ
Sie trinken 5 Tassen Kaffee am Tag. Dadurch erhöht sich Ihr IQ auf -25.

Note that the translator could also change the word order, since it uses positional params here. (I'm not entirely happy with this specific syntax, but it is just a demo to show that you can do all these things.)

>

Restricting Access

What about

writeln(i"You drink $coffees cups a day and it gives you $(password) IQ");

How is it prevented that the person doing the localization puts arbitrary variable or code references into the localized strings?

The localization thing is done at runtime and only has access to the variables passed to it.

If a programmer wrote "it gives you $password" then yes, password would be available to the translator, same as any other argument, but just... don't do that?

Notice how in the example linked above, the translator uses $1 and $2 rather than the variable name, since that string is handled by the library code, not the D language.
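To make the point concrete, here is a minimal sketch of how a runtime substitution step might treat a translated string. It is not the code from the linked repo; fillPlaceholders is a hypothetical helper written for this illustration. It only resolves $1, $2, ... against the arguments it was given, so a name such as $password in a translation is just text.

```d
import std.conv : to;
import std.stdio : writeln;

/// Hypothetical helper: replaces $1, $2, ... in `tmpl` with the matching
/// entry of `args`. Everything else, including "$password", stays as text.
string fillPlaceholders(string tmpl, string[] args)
{
    string result;
    size_t i = 0;
    while (i < tmpl.length)
    {
        if (tmpl[i] == '$')
        {
            // Collect the digits that follow the '$'.
            size_t j = i + 1;
            while (j < tmpl.length && tmpl[j] >= '0' && tmpl[j] <= '9')
                j++;
            if (j > i + 1)
            {
                immutable n = tmpl[i + 1 .. j].to!size_t;
                if (n >= 1 && n <= args.length)
                {
                    result ~= args[n - 1]; // positions are 1-based
                    i = j;
                    continue;
                }
            }
        }
        // Not a usable placeholder: copy the byte unchanged.
        result ~= tmpl[i];
        i++;
    }
    return result;
}

void main()
{
    auto args = ["5", "-25"];
    writeln(fillPlaceholders(
        "Sie trinken $1 Tassen Kaffee am Tag. Dadurch erhöht sich Ihr IQ auf $2.", args));
    // A translator cannot reach variables that were never passed in:
    writeln(fillPlaceholders(
        "You drink $1 cups a day and it gives you $password IQ", args));
}
```

The second call leaves $password untouched, because the helper never sees anything beyond the args array.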

>

Accessing Fields Of A Struct
would this work out-of-the-box?:

Yes, of course, exactly the same as if you passed "name", value to the function. (That's literally what the compiler's rewrite does.)
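Going by that description, the struct case needs nothing special: inside with (s), the name value resolves to s.value in an ordinary argument list too. A minimal sketch of the hand-written equivalent, without any interpolation syntax (describe is a made-up name, and std.conv.text is used instead of writeln only so the result can be inspected):

```d
import std.conv : text;
import std.stdio : writeln;

struct S
{
    string value;
}

/// Hypothetical helper: builds the message from a literal plus a field
/// that is looked up through the with statement.
string describe(S s)
{
    with (s)
        return text("The name in s is ", value); // value means s.value here
}

void main()
{
    writeln(describe(S("example")));
}
```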

>

Accessing Elements Of An AA
writeln (i"The name in s is $(aa[\"name\"])"

This is wrong though, it should be:

writeln (i"The name in s is $(aa["name"])"

Once you're inside the $(..) region, it is read as D code, not as part of a string. (This is pretty standard for language support of interpolation.) So you don't want extra \ in there.

I'll add these two to basics.

>

That typing is laborious. Isn't there a way to bind the expression to the keys of the AA?

I don't know what this means.

>

Nesting
Should it nest? How deep? What is the syntax?

It does, and as deep as you want. Remember, what's inside the $() is D code, not a string, so you'd do:

i"thing $(i"thing $(i"thing")")"

etc. The processing function has the info it needs to support this but may have to do extra work with it.

BTW you don't have to ask me, you can ask the compiler, this is all fully implemented already. https://github.com/dlang/dmd/pull/15715

But let me add a few of these to the examples repo.... and done

https://github.com/adamdruppe/interpolation-examples/blob/master/01-basics.d

shows all these. If you compile the dmd from the PR you can build and run all these examples yourself.

October 27, 2023

On Thursday, 26 October 2023 at 15:34:52 UTC, Adam D Ruppe wrote:

> >

[...]
How could the language version be selected (at runtime)? BTW: Is there a "D way of localization"?

With gnu gettext, you'd first pass the string through a tr() function, which lets it swap out at runtime.

Okay. Usually there is a source string in English language which is subject to translation:

   int n = 3;
   writefln (_("n is %d"), n);

The source string in this example is "n is %d". Now for every target language one creates separate .po-files [1]. These files essentially contain pairs of source/target strings:

msgid "n is %d"
msgstr "n ist %d"

These .po-files are compiled into .mo-files, from which the strings are read at runtime.

If interpolation now comes into play
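The lookup step can be pictured with a small stand-in. This is a sketch under assumptions, not the gettext API: catalog is a hypothetical in-memory table playing the role of a loaded .mo-file, and tr is a made-up function that falls back to the source string when no translation exists.

```d
import std.format : format;
import std.stdio : writeln;

/// Hypothetical in-memory catalog: msgid -> msgstr.
string[string] catalog;

/// Hypothetical lookup: returns the translation, or the msgid itself
/// if the catalog has no entry for it.
string tr(string msgid)
{
    if (auto p = msgid in catalog)
        return *p;
    return msgid;
}

void main()
{
    catalog["n is %d"] = "n ist %d";
    int n = 3;
    writeln(format(tr("n is %d"), n));  // translated format string
    writeln(format(tr("n was %d"), n)); // no entry: source string is used
}
```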

   int n = 3;
   writefln (_(i"n is $(n)"));

How is the translation workflow organized now? What is put in the po-Files?

[1] https://www.gnu.org/software/gettext/manual/html_node/PO-Files.html

October 27, 2023

On Friday, 27 October 2023 at 11:03:55 UTC, kdevel wrote:

>

How is the translation workflow organized now? What is put in the po-Files?

I literally have an example of this:
https://github.com/adamdruppe/interpolation-examples/

There are a few different ways we could do it. This one works with the existing D gettext lib, which married itself to std.format (much to my chagrin), but it still wasn't hard to make it work.

Remember, the library can tell the difference between the string literal segments and the interpolated segments, and it can work with the string literal segments at compile time. So it can CTFE msgids out of it in whatever format it wants and list all the possible things for the .pot file through compile time reflection (aggregated at runtime).

These are all solved problems - my thing is a small wrapper around the existing gettext D lib which is a small wrapper around GNU gettext. All you have to do is arrange the data available to you.

In my example, I used $1, $2, etc. as placeholders for the parameters in the msgid/msgstrs, except for the plural param, which had to be %d because the D gettext lib made that assumption and I'm trying to be compatible with it. This allows easy reordering etc. by the translator, without conflicting with std.format's %s stuff.
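As a rough illustration of deriving a msgid at compile time, the sketch below joins the literal segments and numbers the gaps. It is not taken from the repo: makeMsgid is a hypothetical helper, and the segments are written out by hand here rather than taken from an interpolated string.

```d
import std.conv : to;
import std.stdio : writeln;

/// Hypothetical helper: joins the literal segments of a message and puts
/// $1, $2, ... where the interpolated expressions would have been.
string makeMsgid(const string[] literals)
{
    string id;
    foreach (i, lit; literals)
    {
        if (i > 0)
            id ~= "$" ~ i.to!string; // gap before segment i gets number i
        id ~= lit;
    }
    return id;
}

// Initializing an enum forces the call to run in the compiler (CTFE).
enum msgid = makeMsgid(["You drink ", " cups a day and it gives you ", " IQ"]);
static assert(msgid == "You drink $1 cups a day and it gives you $2 IQ");

void main()
{
    writeln(msgid);
}
```

Because the result is an ordinary compile-time string, it could be collected into a list for a .pot file or used as a lookup key at runtime.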

Look at that repo to see for yourself.

October 27, 2023

On Thursday, 26 October 2023 at 11:18:46 UTC, kdevel wrote:

>

Accessing Elements Of An AA

string[string] aa;
aa["name"] = "value";
writeln (i"The name in s is $(aa[\"name\"])"

That typing is laborious. Isn't there a way to bind the expression to the keys of the AA?

I think that is orthogonal to the DIP (though having to escape the " does make it worse). You can have a local reference in D:

string[string] aa;
ref elem() => aa["name"];
aa["name"] = "value"; // insert the first key
writeln("The name in s is ", elem);

Note it's not possible to use elem to insert the key because aa["name"] without = is a lookup (not an insertion), even though it is being returned by ref. So elem = "blah"; only works when the key already exists in aa.

October 27, 2023

On Friday, 27 October 2023 at 11:20:04 UTC, Adam D Ruppe wrote:

>

On Friday, 27 October 2023 at 11:03:55 UTC, kdevel wrote:

>

How is the translation workflow organized now? What is put in the po-Files?

I literally have an example of this:
https://github.com/adamdruppe/interpolation-examples/

Sorry for not having looked that up in the first place. german.po reads as follows:

msgid "You drink $1 cups a day and it gives you $2 IQ"
msgstr "Sie trinken $1 Tassen Kaffee am Tag. Dadurch erhöht sich Ihr IQ auf $2."

while the source code 04-internationalization.d reads

writeln(tr(i"You drink $coffees cups a day and it gives you $(coffees + iq) IQ"));

Does this allow the use of the msgmerge program [1] for changes in the source code?

[1] https://www.gnu.org/software/gettext/manual/gettext.html#msgmerge-Invocation

October 27, 2023

On Friday, 27 October 2023 at 12:22:00 UTC, Nick Treleaven wrote:

>

On Thursday, 26 October 2023 at 11:18:46 UTC, kdevel wrote:

>

Accessing Elements Of An AA

string[string] aa;
aa["name"] = "value";
writeln (i"The name in s is $(aa[\"name\"])"

That typing is laborious. Isn't there a way to bind the expression to the keys of the AA?

I think that is orthogonal to the DIP

Maybe. I am under the impression that there are no valid real use cases for string interpolation. Currently I have on my notepad "HTML", "SQL" and "composition of filesystem paths", none of which I would like to do using string interpolation.

>

(though having to escape the " does make it worse).

If I understood Adam correctly, my code must read

writeln (i"The name in s is $(aa["name"])");
//                            ^^^^^^^^^^

The closing parenthesis and the final semicolon were missing, too.
According to Adam in [1], the marked part (^^^) is plain D code, so the quotation marks must not be escaped.

[1] https://forum.dlang.org/post/wbcvuejmwircauzgxmdh@forum.dlang.org

October 27, 2023

On Friday, 27 October 2023 at 12:48:07 UTC, kdevel wrote:

>

Does this allow the use of the msgmerge program [1] for changes in the source code?

Yes, in fact, I used that when adding that example to the existing file.

Now, if the string itself changed, that would probably be a different msgid, but even that depends. (That means the literal text, NOT what is interpolated in between. This impl ignores the interpolated part, though it could be processed if we choose; I was thinking about using it as a comment to translators.)

October 28, 2023

On Thursday, 26 October 2023 at 11:18:46 UTC, kdevel wrote:

>

Accessing Elements Of An AA

string[string] aa;
aa["name"] = "value";
writeln (i"The name in s is $(aa[\"name\"])"

That typing is laborious. Isn't there a way to bind the expression to the keys of the AA?

You can do this with existing language features: https://run.dlang.io/is/QRFNpg

Although I wouldn't really recommend it, since it forces you to write fully-qualified names to access anything that isn't an associative-array key.
