December 06, 2018
On 12/6/18 1:28 PM, Neia Neutuladh wrote:
> On Thu, 06 Dec 2018 18:06:51 +0000, Adam D. Ruppe wrote:
>> I would take it one step further and put the other stuff in a wrapped
>> type from the compiler, so the function receiving it can static if and
>> tell what it is, so
>>
>> i"foo $(foo)"
>> would be
>>
>> tuple("foo ", FromInterpolation("foo", foo))
> 
> I was about to suggest wrapping the non-parameters in a Literal{} struct,
> but FromInterpolation makes more sense.
> 
> I was thinking about protecting against errors produced when you have to
> use an even/odd rule to figure out what's part of the literal and what's
> part of the interpolation:

Hm... this is a good point. Two symbols back to back would be confusing if one was a string.

One possibility is to pass an empty string to separate them, but this is hackish.

Or you could check the TBA trait that says which parameters were interpolations and which ones were literals.

> With FromInterpolation, you'd be able to reliably come up with the correct
> SQL: "SELECT * FROM foo WHERE id IN (?, ??)". Which is invalid and would
> be rejected.

Right, but it also requires that the callee deal with the FromInterpolation type, no? Whereas just passing the tuple doesn't require current code changes.

I suppose FromInterpolation could alias itself to the actual value. That way, it lowers to the original proposal.

One thing I'd say is that FromInterpolation should take the string interpolation as a compile-time item.

In other words, I'd say it should be FromInterpolation!"foo"(foo). But I still like the idea of getting the alias from the original source, you can do so much more that way (thinking of UDAs).
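A minimal sketch of that shape, assuming a hypothetical `FromInterpolation` whose source text is a template argument and whose value is forwarded with `alias this`. The helper `fromInterpolation` and the function `len` are made up for illustration; the helper exists only because D does not infer struct template arguments from a constructor call.

```d
// Hypothetical wrapper: the source text is a template argument,
// the runtime value is the only field.
struct FromInterpolation(string source, T)
{
    enum originalCode = source; // available at compile time
    T value;
    alias value this;           // decays to the wrapped value
}

// Helper so the value type is inferred: fromInterpolation!"foo"(foo)
auto fromInterpolation(string source, T)(T value)
{
    return FromInterpolation!(source, T)(value);
}

// An existing function that knows nothing about the wrapper.
size_t len(string s) { return s.length; }

void main()
{
    string foo = "bar";
    auto wrapped = fromInterpolation!"foo"(foo);

    // The original source text can be inspected at compile time...
    static assert(typeof(wrapped).originalCode == "foo");

    // ...while unaware callees still receive a plain string.
    assert(len(wrapped) == 3);
}
```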

Actually, considering the generative capabilities of D, you could have a function that accepts the DB query as a compile-time interpolated string, and simply generate the query string and call to the underlying library.

So many possibilities...

-Steve
December 06, 2018
Marler's original proposal is simple, orthogonal, and elegant. It makes use of existing D features in natural ways, and is both easy to understand, and easy to use in simple cases without *needing* to understand it. I think adding additional machinery like FromInterpolation on top of it would be a mistake. If users want to opt in to that extra complexity, it can always be made available as a library.

On Thursday, 6 December 2018 at 18:28:09 UTC, Neia Neutuladh wrote:
> I was thinking about protecting against errors produced when you have to use an even/odd rule to figure out what's part of the literal and what's part of the interpolation:
>
>     auto c = ");drop table foo;--";
>     // whoops, forgot a comma
>     db.exec("SELECT * FROM foo WHERE id IN ($a,$b$c)");
>       ->
>     db.prepare("SELECT * FROM foo WHERE id IN(?, ?);drop table foo;--?")
>       .inject(a, b, ")");
>
> With FromInterpolation, you'd be able to reliably come up with the correct SQL: "SELECT * FROM foo WHERE id IN (?, ??)". Which is invalid and would be rejected.

The actual solution here is to use the type system to distinguish between trusted and untrusted strings. E.g.,

    UnsafeString c = getUserInput(); // e.g., "); drop table foo;--"
    db.exec("SELECT * FROM foo WHERE id IN ($a,$b$c)");

...and `db.exec` knows to escape its arguments iff they're UnsafeStrings.
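A rough sketch of what such a `db.exec` could do internally, written against a hand-expanded argument list since interpolated strings don't exist yet. `UnsafeString`, `Prepared`, and `prepare` are hypothetical names, and binding through a `?` placeholder stands in for whatever escaping the real database layer would perform.

```d
// Hypothetical marker type for data that came from outside the program.
struct UnsafeString { string payload; }

// Hypothetical result: SQL text with placeholders plus the values to bind.
struct Prepared { string sql; string[] params; }

Prepared prepare(Args...)(Args args)
{
    Prepared p;
    foreach (arg; args) // unrolled at compile time, one step per argument
    {
        static if (is(typeof(arg) == UnsafeString))
        {
            // Untrusted data never reaches the SQL text; it is bound instead.
            p.sql ~= "?";
            p.params ~= arg.payload;
        }
        else
        {
            static assert(is(typeof(arg) : string),
                          "only strings and UnsafeStrings are handled in this sketch");
            p.sql ~= arg;
        }
    }
    return p;
}

void main()
{
    auto c = UnsafeString("); drop table foo;--");
    auto p = prepare("SELECT * FROM foo WHERE id IN (", c, ")");
    assert(p.sql == "SELECT * FROM foo WHERE id IN (?)");
    assert(p.params == ["); drop table foo;--"]);
}
```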
December 06, 2018
On 12/6/18 2:12 PM, Paul Backus wrote:
> Marler's original proposal is simple, orthogonal, and elegant. It makes use of existing D features in natural ways, and is both easy to understand, and easy to use in simple cases without *needing* to understand it. I think adding additional machinery like FromInterpolation on top of it would be a mistake. If users want to opt in to that extra complexity, it can always be made available as a library.

I agree with this. But the points brought up are good ones.

> 
> On Thursday, 6 December 2018 at 18:28:09 UTC, Neia Neutuladh wrote:
>> I was thinking about protecting against errors produced when you have to use an even/odd rule to figure out what's part of the literal and what's part of the interpolation:
>>
>>     auto c = ");drop table foo;--";
>>     // whoops, forgot a comma
>>     db.exec("SELECT * FROM foo WHERE id IN ($a,$b$c)");
>>       ->
>>     db.prepare("SELECT * FROM foo WHERE id IN(?, ?);drop table foo;--?")
>>       .inject(a, b, ")");
>>
>> With FromInterpolation, you'd be able to reliably come up with the correct SQL: "SELECT * FROM foo WHERE id IN (?, ??)". Which is invalid and would be rejected.
> 
> The actual solution here is to use the type system to distinguish between trusted and untrusted strings. E.g.,
> 
>      UnsafeString c = getUserInput(); // e.g., "); drop table foo;--"
>      db.exec("SELECT * FROM foo WHERE id IN ($a,$b$c)");
> 
> ...and `db.exec` knows to escape its arguments iff they're UnsafeStrings.

The more I think about it, the better it is to use the original proposal, and just pass the parameters at compile time in order to make it work. The non-string portions will be aliases to the expressions or variables, making them easily distinguishable from the string portions.

So instead of my original db.exec(...), you'd do db.exec!(...), exactly the same as before, just it's passed at compile time vs. runtime. The runtime possibility is there too, but you wouldn't use it in a database setting.

This can be used to forward to whatever you want. If you want to change the parameters to FromInterpolation("foo", foo), it will be possible.

This is very D-ish, too, where we supply the compile-time mechanisms, and you come up with the cool way to deal with them.
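Something close to this can be approximated today by writing the template argument list by hand. The sketch below is only an illustration: `exec` and `Query` are hypothetical, the parameters are assumed to be strings, and "can be read at compile time" is used as a stand-in for "was a literal part", which would misclassify `enum` or `immutable` variables.

```d
// Hypothetical result type: SQL with placeholders plus the bound values.
struct Query { string sql; string[] params; }

// Parts mixes compile-time string values and aliases to runtime variables.
Query exec(Parts...)()
{
    Query q;
    static foreach (i; 0 .. Parts.length)
    {
        // Literal parts are readable at compile time; runtime variables are not.
        static if (__traits(compiles, { enum e = Parts[i]; }))
            q.sql ~= Parts[i];
        else
        {
            q.sql ~= "?";
            q.params ~= Parts[i]; // read the aliased variable at run time
        }
    }
    return q;
}

void main()
{
    string id = "42";
    // Stand-in for db.exec!(i"SELECT * FROM foo WHERE id = $(id)")
    auto q = exec!("SELECT * FROM foo WHERE id = ", id)();
    assert(q.sql == "SELECT * FROM foo WHERE id = ?");
    assert(q.params == ["42"]);
}
```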

-Steve
December 06, 2018
On Thursday, 6 December 2018 at 19:12:52 UTC, Paul Backus wrote:
> If users want to opt in to that extra complexity, it can always be made available as a library.

Well, maybe, but libraries can't get the info if it isn't there.

We might be able to use alias this on the FromInterpolation thing so it just works in most places where you would use something else... but indeed, perhaps it would be nice to do it in __traits.

(Or just abandon it and settle for the basic thing, I am OK with that too.)

> The actual solution here is to use the type system to distinguish between trusted and untrusted strings. E.g.,

If you do that, then you really must forbid any plain string to force the programmer to declare it...
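To make that concrete, here is a small sketch where a plain `string` is rejected outright and every argument has to be declared as one kind or the other. `TrustedSql`, `UnsafeString`, `isDeclared`, and `render` are hypothetical names for illustration only.

```d
import std.meta : allSatisfy;

// Hypothetical wrappers: the programmer must pick one for every piece.
struct TrustedSql   { string text; }
struct UnsafeString { string payload; }

enum isDeclared(T) = is(T == TrustedSql) || is(T == UnsafeString);

// The constraint rejects plain strings, so nothing is trusted by accident.
string render(Args...)(Args args)
if (allSatisfy!(isDeclared, Args))
{
    string sql;
    foreach (arg; args)
    {
        static if (is(typeof(arg) == TrustedSql))
            sql ~= arg.text;
        else
            sql ~= "?"; // untrusted data becomes a placeholder
    }
    return sql;
}

void main()
{
    auto sql = render(TrustedSql("SELECT * FROM foo WHERE id = "),
                      UnsafeString("42"));
    assert(sql == "SELECT * FROM foo WHERE id = ?");

    // A bare string does not satisfy the constraint.
    static assert(!__traits(compiles, render("SELECT 1")));
}
```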
December 06, 2018
On Thursday, 6 December 2018 at 18:41:34 UTC, Steven Schveighoffer wrote:
> Right, but it also requires that the callee deal with the FromInterpolation type, no? Whereas just passing the tuple doesn't require current code changes.
>
> I suppose FromInterpolation could alias itself to the actual value. That way, it lowers to the original proposal.

Yes, with alias this it would work in most cases.

struct FromInterpolation(T) {
   string originalCode;
   T value;
   alias value this;
}

void main() {
  import std.stdio;
  string name = "adam";
  writeln("hi ", FromInterpolation!string("name", mixin("name")));
}


We can prove that works today by just writing it manually and running it. It also works for other types:

void foo(string s, string b);
foo("hi ", FromInterpolation!string("name", mixin("name")));


So I dare say in the majority of cases where you might use this, an alias this solution covers it. And if you DO want to specialize on the special data, it is there for you to use. (do a deconstructing is() expression to check if it is FromInterpolation and extract the type, then go crazy with it).
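For reference, the deconstructing is() check could look roughly like this. `describe` is a hypothetical callee, and the struct is repeated so the snippet stands on its own.

```d
import std.conv : to;

struct FromInterpolation(T) {
   string originalCode;
   T value;
   alias value this;
}

// Hypothetical callee that treats interpolated parts specially.
string describe(Args...)(Args args) {
   string result;
   foreach (arg; args) {
      // Matches any FromInterpolation!U and binds U to the wrapped type.
      static if (is(typeof(arg) == FromInterpolation!U, U))
         result ~= "[" ~ arg.originalCode ~ "=" ~ arg.value.to!string ~ "]";
      else
         result ~= arg.to!string;
   }
   return result;
}

void main() {
   string name = "adam";
   int a = 1, b = 2;
   auto s = describe("hi ", FromInterpolation!string("name", name),
                     " ", FromInterpolation!int("a+b", a + b));
   assert(s == "hi [name=adam] [a+b=3]");
}
```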


Passing the actual variable as a CT arg is doable.. but only if there IS an actual variable.

i"hi $(name)"; // could pass T!("hi ", name)
i"hi $(a+b)"; // error: cannot read variables at compile time


Now, maybe we don't want to support $(a+b)... but I think people would miss it. How many times in PHP or Ruby or whatever do we say:

"hi #{person["name"]}"

I think if we don't support that, people will see it as a crippled interpolation. So in my view, we get the most bang-for-buck by making it a runtime argument rather than a compile time alias.
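The difference is easy to check by hand with today's compiler. In this sketch `Accept` is a hypothetical template that just takes whatever it is given, and `text` from std.conv plays the part of a function receiving the lowered argument list.

```d
import std.conv : text;

// Hypothetical template that accepts any compile-time argument list.
template Accept(Parts...) {
   enum Accept = Parts.length;
}

void main() {
   int a = 1, b = 2;

   // Runtime lowering: i"sum: $(a+b)" would become ("sum: ", a + b).
   string s = text("sum: ", a + b);
   assert(s == "sum: 3");

   // Compile-time lowering: a plain variable can be aliased...
   static assert(__traits(compiles, Accept!("sum: ", a)));
   // ...but an expression over runtime variables cannot be passed.
   static assert(!__traits(compiles, Accept!("sum: ", a + b)));
}
```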

(BTW it is interesting passing an int to a function with an interpolated string, but since it doesn't automatically convert and just does a tuple.. you can totally do that.)
December 06, 2018
On 12/6/18 2:27 PM, Steven Schveighoffer wrote:
> The more I think about it, the better it is to use the original proposal, and just pass the parameters at compile time in order to make it work. The non-string portions will be aliases to the expressions or variables, making them easily distinguishable from the string portions.
> 
> So instead of my original db.exec(...), you'd do db.exec!(...), exactly the same as before, just it's passed at compile time vs. runtime. The runtime possibility is there too, but you wouldn't use it in a database setting.
> 
> This can be used to forward to whatever you want. If you want to change the parameters to FromInterpolation("foo", foo), it will be possible.
e.g. (need some support here, from new __traits):

auto foo(StringInterp...)()
{
   auto makeInterpolation(alias x)()
   {
      static if(isCompileTimeString!x) return x;
      else return FromInterpolation!(__traits(interpolationOf, x))(x);
   }
   return fooImpl(staticMap!(makeInterpolation, StringInterp));
}

-Steve
December 06, 2018
On Thu, Dec 06, 2018 at 02:27:05PM -0500, Steven Schveighoffer via Digitalmars-d wrote:
> On 12/6/18 2:12 PM, Paul Backus wrote:
> > Marler's original proposal is simple, orthogonal, and elegant. It makes use of existing D features in natural ways, and is both easy to understand, and easy to use in simple cases without *needing* to understand it. I think adding additional machinery like FromInterpolation on top of it would be a mistake. If users want to opt in to that extra complexity, it can always be made available as a library.
[...]
> The more I think about it, the better it is to use the original proposal, and just pass the parameters at compile time in order to make it work. The non-string portions will be aliases to the expressions or variables, making them easily distinguishable from the string portions.
> 
> So instead of my original db.exec(...), you'd do db.exec!(...), exactly the same as before, just it's passed at compile time vs. runtime. The runtime possibility is there too, but you wouldn't use it in a database setting.
> 
> This can be used to forward to whatever you want. If you want to change the parameters to FromInterpolation("foo", foo), it will be possible.
> 
> This is very D-ish, too, where we supply the compile-time mechanisms, and you come up with the cool way to deal with them.
[...]

Yes, this.  I support this.

Now, I'm also thinking about how to minimize the "surface area" of language change in order to make this proposal more likely to be accepted by W&A.  Because as it stands, the exact syntax of interpolated strings is probably going to ignite a bikeshedding war over nitty-gritty details that will drown the core proposal in less important issues.

While it's clear that *some* language support will be needed, since otherwise we have no way of accessing the surrounding scope of the interpolated string, I wonder how much of the implementation can be pushed to library code (which would make this more attractive to W&A), and what are the bare essentials required, in terms of language change, to make a library implementation possible.

Essentially, what you want is for a template to be able to reference symbols in the caller's scope, rather than in the template's scope. We want to be able to do this in a clean way that doesn't break encapsulation (too badly).  If we could somehow pass some kind of reference or symbol that captures the caller's scope to the template, then we can implement interpolated strings as library code instead of adding yet another form of string literal to the language.  For example, perhaps something like this:

	// Note: very crude tentative syntax, bikeshed over this later.
	template myTpl(string s, import context = __CALLER_CONTEXT)
		if (is(typeof(context.name) : string))
	{
		enum myTpl = s ~ context.name;
	}

	void main() {
		string name = "abc";
		enum z = myTpl!"blah";
		pragma(msg, z);	// prints "blahabc"
	}

Basically, `context` serves as a proxy object for looking up symbols in the caller's context. It behaves like a module import in pulling in symbols from the caller's context into the template, as if the template had "imported" the caller's context (like you would import a module).

Armed with such a construct, the template can actually reject instantiation unless the caller's context contains a symbol called "name" (see the sig constraint).  This allows a string interpolation template to produce a helpful error message when instantiation fails, e.g., by moving the sig constraint into a static if with a static assert that says "symbol 'blah' in interpolated string not found in caller's context" or something to that effect.

This solves the problem of the template referencing symbols outside its own scope that may not be guaranteed to exist.

Passing the context as a template parameter also lets the user pass in a different scope, e.g., a module, as parameter instead:

	template findSymbols(import context)
	{
		alias findSymbols = __traits(allSymbols, context);
	}
	alias stdStdioSymbols = findSymbols!(std.stdio);

A lot of other possibilities open up here as applications, not just interpolated strings.  Though the primary motivation for this feature would be interpolated strings.
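For comparison, the closest thing available without any language change is a string mixin, since mixed-in code is compiled in the caller's scope. The sketch below is only an illustration: `interpCode` is a hypothetical CTFE helper, it handles only the `$(expr)` form, and it assumes the literal parts contain no backticks and that parentheses are balanced.

```d
import std.conv : text;

// Hypothetical CTFE helper: turns "blah$(name)" into the source code
// text(`blah`, name, ``), to be mixed in at the call site.
string interpCode(string fmt)
{
    string code = "text(";
    string literal;
    size_t i = 0;
    while (i < fmt.length)
    {
        if (fmt[i] == '$' && i + 1 < fmt.length && fmt[i + 1] == '(')
        {
            // Scan to the matching ')', allowing nested parentheses.
            size_t depth = 1;
            size_t j = i + 2;
            while (j < fmt.length && depth > 0)
            {
                if (fmt[j] == '(') ++depth;
                else if (fmt[j] == ')') --depth;
                ++j;
            }
            // j is now one past the closing parenthesis.
            code ~= "`" ~ literal ~ "`, " ~ fmt[i + 2 .. j - 1] ~ ", ";
            literal = null;
            i = j;
        }
        else
        {
            literal ~= fmt[i];
            ++i;
        }
    }
    return code ~ "`" ~ literal ~ "`)";
}

void main()
{
    string name = "abc";
    int a = 1, b = 2;
    // The mixin sees `name`, `a` and `b` because it expands right here.
    string s = mixin(interpCode("blah$(name) $(a+b)"));
    assert(s == "blahabc 3");
}
```

The obvious cost is that the caller has to write `mixin(...)` at every use, which is exactly the noise a caller-context parameter would remove.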

Just throwing out some ideas to see if there's a way to do this without making a big change to the language that will likely be rejected.


T

-- 
To err is human; to forgive is not our policy. -- Samuel Adler
December 06, 2018
On Thursday, 6 December 2018 at 19:52:27 UTC, Adam D. Ruppe wrote:
> On Thursday, 6 December 2018 at 18:41:34 UTC, Steven Schveighoffer wrote:
>> [...]
>
> Yes, with alias this it would work in most cases.
>
> [...]

Just to clarify, I was already assuming this is what FromInterpolation would do.
December 06, 2018
On 12/6/18 2:52 PM, Adam D. Ruppe wrote:

> Passing the actual variable as a CT arg is doable.. but only if there IS an actual variable.
> 
> i"hi $(name)"; // could pass T!("hi ", name)
> i"hi $(a+b)"; // error: cannot read variables at compile time

Yeah, I know. I was hoping we could bend the rules a bit in this case. Since an interpolation would be a different kind of alias (with special traits attached), it could have its own allowances that normally don't happen. Make it like a lazy alias ;)

> Now, maybe we don't want to support $(a+b)... but I think people would miss it. How many times in PHP or Ruby or whatever do we say:

We have to support it. Of course, it would be supported for the runtime mechanism (where you lose the compile-time introspection). But it would be cool to have the compile-time traits allowed as well.

> (BTW it is interesting passing an int to a function with an interpolated string, but since it doesn't automatically convert and just does a tuple.. you can totally do that.)

This is why I think the idea is different and vastly superior to most other language string interpolation mechanisms.

-Steve
December 06, 2018
On Thursday, 6 December 2018 at 19:39:17 UTC, Adam D. Ruppe wrote:
> On Thursday, 6 December 2018 at 19:12:52 UTC, Paul Backus wrote:
>> If users want to opt in to that extra complexity, it can always be made available as a library.
>
> Well, maybe, but libraries can't get the info if it isn't there.

I'm not convinced that libraries *should* have access to that info. Can you imagine having to debug a function that works when you call it as `fun(i"foo$(bar)")`, but fails when you call it as `fun("foo", bar)`? Sure, you can tell people not to write code like that, but if you're going to take that position, why go out of your way to make it possible in the first place?

>> The actual solution here is to use the type system to distinguish between trusted and untrusted strings. E.g.,
>
> If you do that, then you really must forbid any plain string to force the programmer to declare it...

Sure, there are ways to make it even safer. My point is just that this isn't a string interpolation problem, it's a problem of conflating two different kinds of data just because they happen to have the same representation.