January 29, 2021
In reply to Steve in the Feedback thread:

> in DIP1027, formatting was a central component of interpolation. What became clear as the prior version was reviewed was that the complexity of specifying format while transforming into a parameter sequence was not worth adding to the language.

Not clear to me that ${?} is that complex, and the language itself did not need to know what the format was, it just had to fit inside the ${ }.

> In particular, many applications of string interpolation did not fit well with format specifications (the sql example above being one of them),

I showed how it would work, see example at end.

> and the end result was something that seemed focused solely on writef and printf functions.

Much more accurately, it was optimized for all functions that use either printf-style formatting strings or writef-style formatting functions. This is because those style functions are overwhelmingly used in D programs and are everywhere in C code.

In another post I compared #DIP1027 and #DIP1036 side-by-side for printf and writeln. (DIP1036 also requires an overload of writefln to work with anything other than %s formatting.)

Let's compare mysql:

DIP1036:

mysql_query(i"select * from foo where id = ${obj.id} and val > ${minval}");

DIP1027:

mysql_query(i"select * from foo where id = ${?}(obj.id) and val > ${?}minval");

The DIP1027 could be shorter or longer, depending on if the ( ) were needed or not. DIP1036 also requires a user-written overload for mysql_query(), DIP1027 does not. DIP1036 is not a clear winner for mysql.
January 29, 2021
On Friday, 29 January 2021 at 21:21:54 UTC, Walter Bright wrote:
> Much more accurately, it was optimized for all functions that use either printf-style formatting strings or writef-style formatting functions. This is because those style functions are overwhelmingly used in D programs and are everywhere in C code.

My impression is that `writeln` is more common than `writefln` in D code, and a quick grep of Phobos supports that impression:

$ grep -R 'writeln' std/  | wc -l
321
$ grep -R 'writefln' std/ | wc -l
225

Obviously not scientific, but at the very least it shows that formatted output is not the clear and unambiguous winner.
January 29, 2021
On Fri, Jan 29, 2021 at 09:33:08PM +0000, Paul Backus via Digitalmars-d wrote:
> On Friday, 29 January 2021 at 21:21:54 UTC, Walter Bright wrote:
> > Much more accurately, it was optimized for all functions that use either printf-style formatting strings or writef-style formatting functions.  This is because those style functions are overwhelmingly used in D programs and are everywhere in C code.
> 
> My impression is that `writeln` is more common than `writefln` in D code, and a quick grep of Phobos supports that impression:
> 
> $ grep -R 'writeln' std/  | wc -l
> 321
> $ grep -R 'writefln' std/ | wc -l
> 225
> 
> Obviously not scientific, but at the very least it shows that formatted output is not the clear and unambiguous winner.

My experience is on the contrary: I use writefln in my own code a lot more than writeln.  Of course, that's just another data point, and may or may not represent actual usage.


T

-- 
Answer: Because it breaks the logical sequence of discussion. / Question: Why is top posting bad?
January 29, 2021
On 1/29/21 4:21 PM, Walter Bright wrote:
> In reply to Steve in the Feedback thread:
> 
[snip]

Forgive me, but isn't this a duplicate of https://forum.dlang.org/post/rv0grf$g1j$1@digitalmars.com ?

I may be missing something new here, but I did reply to that in the feedback thread.

-Steve
January 29, 2021
On Friday, 29 January 2021 at 19:10:55 UTC, Steven Schveighoffer wrote:
> On 1/29/21 7:58 AM, Dukc wrote:
>> A string literal is a string that is implicitly assignable to the other alternatives via value range propagation mechanics, or that's how I understand it at least.
>
> No, this isn't range-value propagation. There is no way to recreate or save the type that is a string literal.

It may be that VRP is not the correct term, but I meant that a string literal (just like an array literal, or VRPed integers) has one unambiguous primary type, that is used if it's not immediately assigned to something else.

> D has, however, added things like typeof(null), which still work as polysemous values (assignable to multiple types).

But even there: it has a primary type that is tried first, before any conversion rules kick in. Unlike what you're proposing.
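
For reference, the "primary type" behaviour described here can be checked with ordinary literals in current D. This is only a sketch of the existing rules for string literals, integer value range propagation and typeof(null); the function name `primaryTypeDemo` is made up for illustration, and nothing in it uses the proposed i"..." syntax.

```d
// Literals convert implicitly where they are written, but a variable
// declared with `auto` only keeps the primary type.
void primaryTypeDemo()
{
    auto s = "hello";                         // no target type: primary type is used
    static assert(is(typeof(s) == string));

    const(char)* p = "hello";                 // the literal itself converts to a C string pointer
    wstring w = "hello";                      // ... or to a UTF-16 string
    // Once stored in a variable, the conversion is no longer available:
    static assert(!__traits(compiles, { const(char)* q = s; }));

    ubyte small = 200;                        // value range propagation: 200 fits in a ubyte
    int big = 200;
    static assert(!__traits(compiles, { ubyte b = big; })); // an int variable does not narrow

    auto n = null;                            // typeof(null) is a nameable type
    static assert(is(typeof(n) == typeof(null)));
    int* ip = n;                              // and a value of that type still converts to a pointer

    assert(p !is null && w.length == 5 && small == 200 && ip is null);
}
```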

>
>> I mean that this must be guaranteed to pass IMO:
>> 
>> [snip]
>
> No, that will not pass, and is guaranteed not to pass. typeof(interpolation) is string.
>
> Just like this wouldn't pass:
>
> auto x = 1, 2;
>
> int foo(int, int);
>
> foo(x);
>
> The compiler wouldn't allow it, and would rewrite with idup, yielding a string.

You need to specify the rules about the behaviour of the expression unambiguously, and that is going to be a lot harder if you don't allow yourself the luxury of using a primary type.

If I understood what you're proposing, the expanded form is attempted only when the interpolated string is an argument to a function or a template. But this still leaves a lot of questions:

1: Is the expanded form attempted inside constructors?
2: Is the expanded form attempted if the interpolated string is passed as first argument in UFCS style?
3: What if a variable with `enum` storage class stores an interpolated string and that gets passed to a function?
4: What does this template do when called with interpolated string? `auto foo(T...)(T arg){bar(arg);}`
5: If an interpolated string gets called by an operator, what happens?
6: Probably many more issues like these.

>> I don't think it's that bad, we tend to do stuff like `writeln("hello, " ~ name)` anyway. Relatively small inefficiencies like this don't matter in non-critical places, and critical places need to be benchmarked and optimized anyway.
>
> I don't consider it a small inefficiency to involve all of the machinery of generating a string just to throw it away after printing.
>
> But in any case, it's unnecessary without good reason.

The reason would be to have an unambiguous type for the string literal. I'd much rather have inefficiencies like the one mentioned, that I can manually optimise away if needed, than complex language rules that are likely to result in implementation bugs.


January 29, 2021
On 1/29/21 5:39 PM, Dukc wrote:
> On Friday, 29 January 2021 at 19:10:55 UTC, Steven Schveighoffer wrote:
>> On 1/29/21 7:58 AM, Dukc wrote:
>>> A string literal is a string that is implicitly assignable to the other alternatives via value range propagation mechanics, or that's how I understand it at least.
>>
>> No, this isn't range-value propagation. There is no way to recreate or save the type that is a string literal.
> 
> It may be that VRP is not the correct term, but I meant that a string literal (just like an array literal, or VRPed integers) has one unambiguous primary type, that is used if it's not immediately assigned to something else.

This is mostly the same as that. The difference is in when you try things outside of the world of function calls, it uses the conversion unconditionally. This includes mixin, typeof, auto variables.

It's slightly different from normal rules, and I'm very much interested in hearing ways this will break, but I also am still optimistic it is valid.

> 
>> D has, however, added things like typeof(null), which still work as polysemous values (assignable to multiple types).
> 
> But even there: it has a primary type that is tried first, before any conversion rules kick in. Unlike what you're proposing.

Right, the preferred type isn't really a type. Which is what makes it strange in most cases where you would expect to see the results from the preferred type (i.e. typeof, auto variables). But I still think it works, because of the narrow scope of the expanded form usage. For almost all intents and purposes, the thing is a string, unless you accept it as a sequence.

If this DIP goes down, a possibility is to use a different mechanism to ask for the expanded form other than providing a parameter list that matches. I just like the parameter list matching because it's simple and already understood.

> 
>>
>>> I mean that this must be guaranteed to pass IMO:
>>>
>>> [snip]
>>
>> No, that will not pass, and is guaranteed not to pass. typeof(interpolation) is string.
>>
>> Just like this wouldn't pass:
>>
>> auto x = 1, 2;
>>
>> int foo(int, int);
>>
>> foo(x);
>>
>> The compiler wouldn't allow it, and would rewrite with idup, yielding a string.
> 
> You need to specify the rules about the behaviour of the expression unambiguously, and that is going to be a lot harder if you don't allow yourself the luxury of using a primary type.
> 
> If I understood what you're proposing, the expanded form is attempted only when the interpolated string is an argument to a function or a template. But this still leaves a lot of questions:
> 
> 1: Is the expanded form attempted inside constructors?

The expanded form is attempted when calling a constructor, it's just a function call.

So new Foo(i"...") attempts the expanded form first, and if it does not match, uses the idup rewrite.
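
A rough library-level model of that rule, for illustration only: in the proposal the compiler does this selection itself, and the expanded form starts with an interp header rather than the hypothetical `Lit` marker used below. The names `Lit`, `render`, `dispatch`, `takesString` and `takesParts` are all invented for this sketch.

```d
import std.conv : to;

// Hypothetical stand-in for a literal fragment of an interpolated string.
struct Lit { string text; }

// Models the "idup rewrite": join every part into one string.
string render(Args...)(Args parts)
{
    string s;
    foreach (p; parts)                 // unrolled over the argument sequence
    {
        static if (is(typeof(p) == Lit))
            s ~= p.text;
        else
            s ~= p.to!string;
    }
    return s;
}

// Try the expanded form first; fall back to a plain string if it does not match.
auto dispatch(alias fn, Args...)(Args parts)
{
    static if (__traits(compiles, fn(parts)))
        return fn(parts);
    else
        return fn(render(parts));
}

size_t takesString(string s) { return s.length; }
string takesParts(Lit prefix, string name) { return "expanded"; }

unittest
{
    // takesString cannot accept (Lit, string), so it receives "hello, Bob".
    assert(dispatch!takesString(Lit("hello, "), "Bob") == 10);
    // takesParts matches the expanded form directly.
    assert(dispatch!takesParts(Lit("hello, "), "Bob") == "expanded");
}
```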

> 2: Is the expanded form attempted if the interpolated string is passed as first argument in UFCS style?

UFCS works by putting arguments on the left side of the dot first, so we are counting on it working with the expanded form. Otherwise, something like i"hello, ${name}".idup will not work correctly.

> 3: What if a variable with `enum` storage class stores an interpolated string and that gets passed to a function?

It would be a string type, just like the auto storage class.

> 4: What does this template do when called with interpolated string? `auto foo(T...)(T arg){bar(arg);}`

foo would receive the expanded form, bar would only work if it accepted the expanded form (no rewrite is done there). Once the compiler decides not to rewrite into a string, it's a tuple for the duration. Again, this is not a type that implicitly converts, it's a rewrite by the compiler.

This is similar to:

void bar(const char *);

void foo(T)(T arg) { bar(arg); }

bar("hello"); // ok
foo("hello"); // error
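
The same analogy can be written so the compiler checks both claims. This is a sketch in current D using `__traits(compiles, ...)`; the empty body given to `bar` is only there so the snippet links.

```d
void bar(const(char)* s) {}

void foo(T)(T arg) { bar(arg); }

// The literal converts implicitly to const(char)* at the call site.
static assert( __traits(compiles, bar("hello")));

// Through the template, T is deduced as string, and a string value
// does not convert to const(char)*, so the instantiation fails.
static assert(!__traits(compiles, foo("hello")));
```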

> 5: If an interpolated string gets called by an operator, what happens?

Do you mean something like opAssign(i"...")? It should work the same as other function calls.

> 6: Probably many more issues like these.

The rules are pretty straightforward, so I'm happy to tell you the answers.

> 
>>> I don't think it's that bad, we tend to do stuff like `writeln("hello, " ~ name)` anyway. Relatively small inefficiencies like this don't matter in non-critical places, and critical places need to be benchmarked and optimized anyway.
>>
>> I don't consider it a small inefficiency to involve all of the machinery of generating a string just to throw it away after printing.
>>
>> But in any case, it's unnecessary without good reason.
> 
> The reason would be to have an unambiguous type for the string literal. I'd much rather have inefficiencies like the one mentioned, that I can manually optimise away if needed, than complex language rules that are likely to result in implementation bugs.

If there are implementation bugs, we can deal with them. If there are design problems, I want to figure them out now. So please, continue to try and figure out what could be wrong with this scheme!

To put it another way, in the case where both the "dual-mode" interpolation literals and my rewrite scheme work, I'd prefer the rewrite scheme. If there is a killer problem which makes the rewrite scheme non-viable, then we have to consider other options, including dropping the rewrite.

-Steve
January 29, 2021
FYI, I was just made aware that this is because the message in the feedback thread is about to be deleted. I'm posting my reply the same as it was in there.

On 1/29/21 3:26 AM, Walter Bright wrote:
>  > in DIP1027, formatting was a central component of interpolation. What became clear as the prior version was reviewed was that the complexity of specifying format while transforming into a parameter sequence was not worth adding to the language.
>
> Not clear to me that ${?} is that complex, and the language itself did not need to know what the format was, it just had to fit inside the ${ }.

${?}(expr) is more complex than ${expr}. If that is not clear, I'm not sure how else to describe it.

There are two opportunities for complexity here. The complexity of implementation and the complexity of usage.

In fact, DIP1027 has a higher level of both usage complexity and implementation complexity. And it loads at least one chamber of the footgun at all times.

For DIP1027, a SQL library author can possibly accept $(expr) for parameters. What this means is that a `%s` will be added to the SQL string. It is possible that the library author can ADD the complexity of parsing and detecting %s inside the SQL string, replacing them with ? tokens. This is a huge added complexity that is not necessary for just a 10 minute implementation of concatenating literal strings with ? tokens (which can be done at compile-time by the way).
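
To make the "concatenating literal strings with ? tokens" idea concrete, here is a minimal sketch. It assumes the library receives the literal fragments and the values as separate arguments; the `Lit` marker type and the `toPlaceholderSql` name are hypothetical and are not what DIP1036 actually passes.

```d
// Hypothetical stand-in for a literal fragment of an interpolated string.
struct Lit { string text; }

// Copies literal fragments verbatim and emits one "?" per value argument.
string toPlaceholderSql(Args...)(Args parts)
{
    string sql;
    foreach (part; parts)              // unrolled over the argument sequence
    {
        static if (is(typeof(part) == Lit))
            sql ~= part.text;
        else
            sql ~= "?";                // the value itself would be bound separately
    }
    return sql;
}

// Works at compile time as well, since nothing has to be parsed at runtime.
static assert(
    toPlaceholderSql(Lit("select * from foo where id = "), 7, Lit(" and val > "), 1.5)
    == "select * from foo where id = ? and val > ?");
```

No format string is scanned here: whether something is a placeholder is decided purely by the static type of each argument.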

Not only that, but the user is allowed to use whatever they want in the {}. This means that a user can write ${bbbb}(param) in the string and it will compile. The library can't *possibly* figure this out. It can complain at runtime of an error in format, reimplementing the entire SQL language parsing client side, or potentially just use the SQL server to do it. But it also may result in incorrect *successful* calls. There is just no way to form an API that guards against user error at compile time or runtime with DIP1027.

And on the user side, there is the ever-present question of whether the SQL library is going to have special handling for the %s. They can add defensive format specifiers for everything, but again, this results in added complexity that is not necessary in DIP1036. In fact, DIP1036 provides the *opportunity* for malformed SQL statements to be rejected at *compile time*, something that is not available in DIP1027.

Finally, DIP1036 provides a straightforward path for formatting that is simply not possible with DIP1027. This is because no parsing of specifiers is necessary at runtime. The mechanism of "parse this string to see what to do" is a relic of languages that do not provide compile-time metaprogramming capabilities, and should be avoided in a feature designed for one.

>  > and the end result was something that seemed focused solely on writef and printf functions.
>
> Much more accurately, it was optimized for all functions that use either printf-style formatting strings or writef-style formatting functions. This is because those style functions are overwhelmingly used in D programs and are everywhere in C code.

It was not optimized for anything but writef and format. Yes, I purposely omitted printf, since it's not optimized for that. You must provide format specifiers for everything but the rare-in-D const char * parameter type. For example, if you have `string name;`, how would you use printf to format that? With an actual printf call, it's:

printf("hello, %.*s\n", cast(int)name.length, name.ptr);

With DIP1027 it's possible to do, but you wouldn't EVER do this:

printf(i"hello, ${%.*}(cast(int)name.length)${s}(name.ptr)\n");

With DIP1036 and a 15-minute wrapper that I wrote it's:

printf(i"hello, ${name}\n");

For all other use cases, one must inject format specifiers for all parameters, or have a function that compiles but always does the wrong thing. One must know the format specifiers for that specific domain, and therefore places the burden of the API writing on the user at all times.

Furthermore, there is NO POSSIBLE WAY for DIP1027 to allow a library author to help. Even if a library wants to provide a mechanism to allow unformatted string interpolation usage, he cannot do so.

>
> In another post I compared #DIP1027 and #DIP1036 side-by-side for printf and writeln. (DIP1036 also requires an overload of writefln to work with anything other than %s formatting.)

DIP1027 does not work with writef unless you call it in a very specific way. DIP1027 cannot work with writeln, even with an overload. DIP1036 will work with writeln out of the box, and with an overload can provide an actual universally usable mechanism for formatted strings with interpolation strings.

DIP1027 is basically a user rewrite of parameters that happens to work with a select few functions when called in a select few ways.

>
> Let's compare mysql:
>
> DIP1036:
>
> mysql_query(i"select * from foo where id = ${obj.id} and val > ${minval}");
>
> DIP1027:
>
> mysql_query(i"select * from foo where id = ${?}(obj.id) and val > ${?}minval");
>
> The DIP1027 could be shorter or longer, depending on if the ( ) were needed or not. DIP1036 also requires a user-written overload for mysql_query(), DIP1027 does not. DIP1036 is not a clear winner for mysql.

DIP1027 will always be longer: ${?} is 4 characters. ${} is 3.

DIP1036 ALLOWS an overload that works exactly as the user expects.

DIP1027 PREVENTS an overload that works as the user expects.

DIP1036 can reject malformed parameters at compile time. DIP1027 is basically a rewrite of user parameters, and puts an immense burden on them that is not necessary with DIP1036. It takes all the power away from a library writer in helping the user use their library properly.

DIP1036 is a clear winner for mysql.

-Steve
January 30, 2021
On 2021-01-29 21:45, Steven Schveighoffer wrote:

> Yes, that would return false.
> 
> But, this seems still pretty far fetched for a real use case. Not only that, but there is still a way to fix it, just use .idup if what you really meant was a string. And it's not something that's needed to be done by the author of someTemplate, just the user in the (probably one) case that he uses it.
> 
> Consider that there are still places where one must use AliasSeq!(things) to work with them properly in templates (I get bit by this occasionally). It's quite similar actually.
> 
> I remain unconvinced that this is a problem. But I will concede there are a few more cases where an explicit idup might be required than just tuple.

You could tweak the DIP slightly and say: in all places `idup` is automatically inserted, except in a parameter list where the first parameter is `interp`.

template someTemplate(Args...) {
    static if (anySatisfy!(isSomeString, typeof(Args))) {
        // ...
    }
}

someTemplate!(i"I have ${count} apples");

The above would evaluate to true.
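
For comparison, the same check can be tried today with an ordinary string argument, which is what the automatic `idup` would produce. This is a sketch in current D; `hasStringArg` is a name made up here to expose the condition as a compile-time boolean.

```d
import std.meta : anySatisfy;
import std.traits : isSomeString;

// Same condition as in someTemplate, exposed as a compile-time boolean.
enum hasStringArg(Args...) = anySatisfy!(isSomeString, typeof(Args));

static assert( hasStringArg!("I have 3 apples"));   // a plain string argument
static assert(!hasStringArg!(3, 2.5));              // no string among the arguments
```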

template someTemplate2(interp!string i, Args...) {
    static if (anySatisfy!(isSomeString, typeof(Args))) {
        // ...
    }
}

someTemplate2!(i"I have ${count} apples");

The above would evaluate to false.

-- 
/Jacob Carlborg
January 30, 2021
On Friday, 29 January 2021 at 18:49:18 UTC, Paul Backus wrote:
> Continuing from the feedback thread...
>>
>> I want to say something about this idea of doing only one or the other.
>
> To be clear: you can still have both, just not with the same syntax.
>
> If you want both, my recommendation would be to use the i"..." syntax for the
> convenient string version, and qq"..." ("quasiquote") for the flexible
> tuple version. At that point, it would probably make sense to split them into
> two separate DIPs, too.
>
> [1] https://forum.dlang.org/post/q7u6g1$94p$1@digitalmars.com

https://forum.dlang.org/post/rjih3p$vj2$1@digitalmars.com


January 30, 2021
On 1/29/2021 4:09 AM, Dukc wrote:
> You need to remember that perhaps the most important use case

If it is the most important use case, the DIP should address that in the rationale and examples.