January 13

On Saturday, 13 January 2024 at 06:27:51 UTC, Walter Bright wrote:

> Escaping % is not hard to do. It's ordinary.

I don't see people arguing that escaping is difficult to do. It's not. What is difficult is remembering to do it perfectly, every time, and accidentally building a silent injection attack when you (inevitably) fail. Especially since the attack vector is not detectable to linting tools. All systems with a special format-specifier are unsafe for use with SQL. Period.
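To make the failure mode concrete, here is a minimal sketch. It assumes a hypothetical helper, `toPlaceholders`, that turns every `%s` of a format string into a SQL `?` placeholder, and a hand-written format string standing in for what a `%s`-based lowering might produce. Neither is taken from a real library or from the DIP text.

```d
import std.algorithm.searching : count;
import std.array : replace;

// Hypothetical helper (not from any real library):
// maps every "%s" in the format string to a "?" placeholder.
string toPlaceholders(string fmt)
{
    return fmt.replace("%s", "?");
}

void main()
{
    // Hand-written stand-in for the lowering of
    //   i"SELECT * FROM users WHERE name LIKE '%sam%' AND id = $(id)"
    // under a "%s"-based scheme: one format string plus one argument.
    string fmt = "SELECT * FROM users WHERE name LIKE '%sam%' AND id = %s";
    string sql = toPlaceholders(fmt);

    // The "%s" inside the LIKE pattern was never escaped, so it is
    // indistinguishable from the specifier generated for $(id).
    assert(sql == "SELECT * FROM users WHERE name LIKE '?am%' AND id = ?");
    assert(sql.count('?') == 2); // two placeholders, but only one argument
}
```

Nothing in the string itself tells the library which `%s` came from the interpolation and which the user typed.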

Think of it this way: you have the option to install a fail-safe critical system in your jet, and everybody is telling you to do it, but you're saying, "Nah fam, it'll be fine. The odds are so low that a human will screw up this one manual step, and this design will burn 0.1% less fuel and cost 10% less." *cough* MCAS *cough*.

Actually, MCAS is a pretty fair analogy here. The system mostly works as designed, except for the one button that if you don't push it when things go bad, brings down two airframes. That's how big a deal SQL injections are.

Don't be 2010s Boeing, be 1970s Boeing. Please build an indestructible 757.

Note that Java considered and rejected your premise in their version of this feature, with their reasoning laid out in the spec-document.

January 13

On Saturday, 13 January 2024 at 07:03:14 UTC, Adam Wilson wrote:

> Note that Java considered and rejected your premise in their version of this feature, with their reasoning laid out in the spec-document.

Looks like it's already a feature: https://www.baeldung.com/java-21-string-templates

Thanks, I wasn't aware of this.

January 13
On 1/13/24 07:27, Walter Bright wrote:
> On 1/12/2024 8:13 PM, Steven Schveighoffer wrote:
>> On Saturday, 13 January 2024 at 03:59:03 UTC, Timon Gehr wrote:
>>> On 1/13/24 04:36, Walter Bright wrote:
>>>>
>>>> I don't know what "interpolate an expression sequence" means. As for things getting out of sync, execi() with CTFE can reject a mismatch between format specifiers and arguments.
>>>
>>> Oh, not at all.
>>>
>>> ```d
>>> import std.stdio;
>>> alias Seq(T...)=T;
>>> void main(){
>>>     writefln(i"$(Seq!(1,2)) %s");
>>> }
>>> ```
> 
> 1027 can write a format string for a tuple as: "%s%s %%s" because the number of elements in the tuple is known at compile time.
> ...

I am testing all of my DIP1027 snippets against your implementation, and this is not what it does (it prints "1 2\n", in accordance with the DIP1027 specification). And if it were, it still suffers from the drawback that the library cannot detect that the user passed arguments in this fashion.

> 
>> Yes, and there is more:
>>
>> ```d
>> writefln(i"is it here? ${}(1) Or here? %s");
>> ```
> 
> An empty format ${} would be a compile time error. The %s would be rewritten as %%s.
> ...

It's not rewritten like that with DIP1027.

> 
>> Bottom line is that if we make `%s` special, then all functions must deal with the consequences. There is not a format specifier you can come up with that is not easily reproduced in the string literal directly -- you have to escape it and *know* that you must escape it. The easier path is just not to deal with format specifiers at all -- tell the library exactly where the pieces are.
> 
> Escaping % is not hard to do. It's ordinary.
> ...

It's completely unnecessary to require this of a user who wants to use an istring. Why doesn't DIP1027 escape '%' automatically? (Not that that would solve all issues with the format string.)
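A sketch of what automatic escaping could look like if the literal parts were escaped before the specifiers are inserted. `escapePercent` is a hypothetical helper and is not part of DIP1027; only `std.array.replace` and `std.format.format` are real Phobos calls.

```d
import std.array : replace;
import std.format : format;

// Hypothetical helper: escape '%' in a literal piece of an istring
// so that std.format treats it as plain text.
string escapePercent(string literal)
{
    return literal.replace("%", "%%");
}

void main()
{
    // Stand-in for the literal part of i"100% of $(what)".
    enum literalPart = "100% of ";

    // Escape the literal first, then append the specifier for $(what).
    string fmt = escapePercent(literalPart) ~ "%s";

    assert(fmt == "100%% of %s");
    assert(format(fmt, "tests") == "100% of tests");
}
```

This only covers `std.format`-style consumers. A library with a different placeholder convention would still have to know which parts of the string were literal.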

> 
>> And by the way, your example brings up another point not recently brought up where 1036e handles and DIP1027 does not: tuples as interpolated expressions. Because each expression is enclosed by `InterpolatedLiteral` pieces, you can tell which ones were actually tuples.
> 
> And with 1027 the format string will be of type `FormatString` (not `string`), and you can tell which ones were actually tuples.

How can you tell which ones were actually tuples?
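For comparison, a minimal sketch of how a marker-type scheme lets the callee see tuple boundaries. `Lit` is a hypothetical stand-in for the literal-piece type of DIP1036e (the real lowering carries more markers), and the call in `main` is a hand-written approximation of what the compiler would pass.

```d
import std.traits : isInstanceOf;

alias Seq(T...) = T;

// Hypothetical stand-in for the literal-piece marker type.
struct Lit(string s) {}

// Returns, for each literal piece, how many values follow it
// before the next literal piece.
size_t[] valuesPerSlot(Args...)(Args args)
{
    size_t[] sizes;
    foreach (arg; args) // unrolled at compile time over the argument tuple
    {
        static if (isInstanceOf!(Lit, typeof(arg)))
            sizes ~= 0;      // a new literal piece opens a new slot
        else
            sizes[$ - 1]++;  // a value belongs to the most recent slot
    }
    return sizes;
}

void main()
{
    // Hand-written stand-in for i"a: $(Seq!(1, 2)) b: $(3)".
    auto sizes = valuesPerSlot(Lit!"a: "(), Seq!(1, 2), Lit!" b: "(), 3);

    assert(sizes.length == 2);
    assert(sizes[0] == 2); // the first expression was a two-element tuple
    assert(sizes[1] == 1);
}
```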

> So I take that back, nested formats are easily supported.
> ...

You are moving the goalposts. I am glad you agree that DIP1027 is insufficient.

> These are all straightforward solutions.
> 

DIP1036e is even more straightforward than the collection of fixes that would be required to make something that is closer to DIP1027 viable.
January 13
On Friday, 12 January 2024 at 22:35:54 UTC, Walter Bright wrote:
> Given the interest in CTFE of istrings, I have been thinking about making a general use case out of it instead of one specific to istrings.

I'm sorry, but this is ridiculous. In order to try and bring 1027 up to par with 1036, you are willing to add a special case to template args?


January 13
On 1/13/24 07:27, Walter Bright wrote:
>> Yes, and there is more:
>>
>> ```d
>> writefln(i"is it here? ${}(1) Or here? %s");
>> ```
> 
> An empty format ${} would be a compile time error. The %s would be rewritten as %%s.

./dmd -run test.d
is it here?  Or here? 1

January 13
On Friday, 12 January 2024 at 22:35:54 UTC, Walter Bright wrote:
> So, instead of issuing a compilation error, the compiler can "slide" the arguments to the left, so the first argument is moved into the compile time parameter list. Then, the call will compile.

FeepingCreature proposed this instead, which seems to be more flexible and clearer:
https://forum.dlang.org/post/arzmecuotonnomsehrmk@forum.dlang.org
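As a rough illustration of what "sliding" would buy, here is the split written by hand as it compiles today. The body of `pluto` is an assumption made for the example; the quoted post does not define it.

```d
import std.format : format;

// Hypothetical handler: the format string is a compile-time template argument.
string pluto(string fmt, Args...)(Args args)
{
    // format!fmt checks the specifiers against Args at compile time.
    return format!fmt(args);
}

void main()
{
    // What compiles today: the caller splits the arguments by hand.
    assert(pluto!"%s-%s"(1, 2) == "1-2");

    // Today this call is rejected because `fmt` cannot be deduced.
    // Under the quoted proposal it would be rewritten to the call above.
    static assert(!__traits(compiles, pluto("%s-%s", 1, 2)));
}
```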
January 13

On Friday, 12 January 2024 at 22:35:54 UTC, Walter Bright wrote:

> Given the interest in CTFE of istrings, I have been thinking about making a general use case out of it instead of one specific to istrings.

I agree, good idea, but it can be generalized even further. Please check Rikki's idea:
https://gist.github.com/rikkimax/812de12e600a070fe267f3bdc1bb3928

Or distilled to a few lines by me...

```d
alias I(T...) = T;

void pluto(int a)
{
    pragma(msg, __traits(getArgumentAttributes, a));
}

void main()
{
    pluto(@"%s" I!1);
}
```
January 13

On Saturday, 13 January 2024 at 06:46:54 UTC, Walter Bright wrote:

> On 1/12/2024 8:35 PM, Steven Schveighoffer wrote:
>> On Saturday, 13 January 2024 at 02:16:06 UTC, Walter Bright wrote:
>>> On 1/12/2024 4:15 PM, Steven Schveighoffer wrote:
>>>> I don't view this as simpler than DIP1036e or DIP1027 -- a simple transformation is a simple transformation.
>>>
>>> Adding extra hidden templates isn't that simple. If a user is not using a canned version, he'd have to be pretty familiar with D to be able to write his own handler.
>>
>> Yes, that is intentional.
>
> So you agree it is not simpler :-)

No, it is just as simple. I agree that the user should have to understand the feature before hooking it. I meant it is intentional that you can't "accidentally" hook istring calls without understanding what you are doing.

And I don't understand this line of argument to begin with. You have to be pretty familiar with D to hook anything:

  • operator overloads
  • foreach
  • toString
  • put
  • DIP1027
  • DIP1036e
  • this thing you are proposing

And I'm sure there's more.

>> You should not be able to call functions with new syntax because the parameters happen to match. We have a type system for a reason.
>
> I proposed in the other topic to type the format string as Format (or FormatString), which resolves that issue, as a string is not implicitly convertible to a FormatString.

This doesn't help, as an enum implicitly converts to its base type.
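A small sketch of the conversion in question, assuming `FormatString` were declared as a named enum with `string` as its base type. The declaration and the two functions are illustrative only.

```d
// Assumption for illustration: FormatString as a named enum over string.
enum FormatString : string { none = "" }

size_t takesString(string s) { return s.length; }
void takesFormat(FormatString f) {}

void main()
{
    auto fmt = cast(FormatString) "%s and %s";

    // string -> FormatString is rejected, as intended...
    static assert(!is(string : FormatString));
    static assert(!__traits(compiles, takesFormat("plain")));

    // ...but FormatString -> string converts implicitly, so a function
    // that takes (string, ...) still accepts the format string silently.
    static assert(is(FormatString : string));
    assert(takesString(fmt) == 9);
}
```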

>>> 1027 is simpler in that if the generated tuple is examined, it matches just what one would have written using a format. Nothing much to learn there.
>>
>> In other words: "it matches just what one wouldn't have written, unless one is calling writef".
>
> Yes, it is meant for writef, not writeln.

In none of the proposals I have written or supported, has it been meant for writef. I don't understand the desire to hook writef and format. The feature to make 1036e hook writeln is just an easy added thing (just add a toString and it works), but is not fundamentally necessary. We could just as easily change writeln to handle whatever template types we create.

Hooking writef involves adding semantic requirements on the library author that are specialized for writef, for what benefit, I can't really say. You can always create a writef overload that handles these things, but I don't see the point of it. String interpolation isn't aimed at formatting, though it can be used for it (as demonstrated).

>>> The other reasons:
>>>
>>>   1. preventing calls to functions passing an ordinary string as opposed to an istring tuple
>>
>> I don't see how this proposal fixes that. I'm assuming a function like `void foo(string s, int x)` will match `foo(i"something: $(1)")`
>
> Yes, we've seen that example. It's a bit contrived. I've sent a format string to a function unexpectedly now and then. The result is the format string gets printed. I see it, I fix it.

Me too. But shouldn't we prefer compiler errors? Shouldn't we use the type system for what it is intended?

I've literally left bugs like this in code for years without noticing until the actual thing (an exception) was printed, and then it was hours to figure out what was happening.

> I can't see how it would be some disastrous problem. If it is indeed a super problem, Format can be made to be a type that is not implicitly convertible to a string, but can have a string extracted from it with CTFE.

This would be a step up, but still doesn't fix the format specifier problem.

> What it does fix is your other concern about sending a string to a function (like execi()) that expects a Format as its first argument.

Right, but this doesn't fix the format specifier problem. You seem to be solving all the problems but that one.

>>>   2. preventing nested istrings
>>
>> Why do we want to prevent nested istrings? That's not a goal.
>
> I mentioned in another reply to you a simple solution.

Without a trailer, this isn't solvable technically. But I'm not really concerned about nested istrings. I just meant that it isn't a requirement to disallow them somehow.

>>> have already been addressed. The compile time thing was the only one left.
>>
>> A compile time format string still needs parsing. Why would we want to throw away all the work the compiler already did?
>
> For the same reason writefln() exists in std.stdio, and people use it instead of writeln(). Also, the SQL example generates a format string.

The compiler is required to parse out the parameters. It has, sitting in its memory, the list of literals. Why would it reconstruct a string, with an arbitrarily decided placeholder, that you then have to deal with at runtime or CTFE? You are adding unnecessary work for the user, for the benefit of hooking writef -- a function we control and can change to do whatever we want.

The SQL example DOES NOT generate a format string, I've told you this multiple times. It generates a string with placeholders. There is no formatting. In fact, the C function doesn't even accept the parameters, those happen later after you generate the prepared statement.

But also, SQL requires you to do it this way. And the C libraries being used require construction of a string (because that's the API C has). An SQL library such as mysql-native, which is fully written in D, would not require building a string (and I intend to do this if string interpolation ever happens).

Things other than SQL do not require building a string.
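A minimal sketch of the "string with placeholders" idea, built from separate literal pieces and values. `Lit`, `Prepared`, and `prepare` are hypothetical names for illustration, not the DIP1036e types or any real SQL library API.

```d
import std.conv : to;
import std.traits : isInstanceOf;

// Hypothetical stand-in for a literal-piece marker type.
struct Lit(string s) { enum text = s; }

// Hypothetical result type: SQL text plus the values to bind later.
struct Prepared
{
    string sql;
    string[] params;
}

Prepared prepare(Args...)(Args args)
{
    Prepared p;
    foreach (arg; args) // unrolled at compile time over the argument tuple
    {
        static if (isInstanceOf!(Lit, typeof(arg)))
            p.sql ~= arg.text;          // literal text is copied verbatim
        else
        {
            p.sql ~= "?";               // every value becomes a placeholder
            p.params ~= arg.to!string;  // and is kept for later binding
        }
    }
    return p;
}

void main()
{
    // Hand-written stand-in for
    //   i"SELECT * FROM users WHERE name LIKE '%sam%' AND id = $(id)"
    int id = 42;
    auto p = prepare(
        Lit!"SELECT * FROM users WHERE name LIKE '%sam%' AND id = "(), id);

    // The '%' characters in the literal need no escaping here.
    assert(p.sql == "SELECT * FROM users WHERE name LIKE '%sam%' AND id = ?");
    assert(p.params == ["42"]);
}
```

No format specifier is involved at any point, so there is nothing for the user to escape.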

>> If you want to call writef, you can construct a format string easily at compile time. Or just... call writef the normal way.
>
> ??

Yeah, I've never cared about hooking writef, it's fine as-is (well, it's fine if that's what you like). The fact that you have to put in %s everywhere, it's a klunky mechanism for "output this thing to a character stream".

Can't tell you how many times I've written a toString hook that calls outputRange.formattedWrite("%s", thing);. That "%s" is so ugly and useless. But this is all D gives me to use, so I use it.

>> Compile-time string parsing is way more costly than compile-time string concatenation.
>
> I suspect you routinely use CTFE for far, far more complex tasks. This is a rounding error.

Wait, so generating an extra template is a bridge too far, but parsing a DSL at compile time is a rounding error?

In my testing, CTFE concatenation is twice as fast as parsing, and uses 1/3 less memory.

Not to mention that concatenation is easy. I can do it in one line (if I don't really care about performance). The same cannot be said for parsing.
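For illustration, a sketch of the one-line concatenation, assuming the literal pieces are available as a compile-time array and contain no `%` characters. The `pieces` array is hand-written here and does not come from a compiler lowering.

```d
import std.array : join;
import std.format : format;

// Hand-written literal pieces of a hypothetical
//   i"Hello, $(name)! You are $(age) years old."
// Assumption: none of the pieces contains '%'.
enum pieces = ["Hello, ", "! You are ", " years old."];

// The format string is built by plain concatenation, evaluated via CTFE.
enum fmt = pieces.join("%s");
static assert(fmt == "Hello, %s! You are %s years old.");

void main()
{
    assert(format(fmt, "Ann", 30) == "Hello, Ann! You are 30 years old.");
}
```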

So I'd say, the user must understand that he's receiving a template, but also does not have to learn how to properly parse a specialized unrelated DSL. Format strings are weird, confusing, klunky, and less efficient.

>> Ok. This does mean, for intentional overloading of a function to accept a compile-time first parameter, you will have to rename the function.
>
> You can overload it with existing functions, or give it a new name. Your choice, I don't see a problem.

If the template-parameter version is less preferred, it will only be used with an explicit template parameter.

It's not a problem, it just is one more quirk that is surprising.

-Steve

January 13

On Saturday, 13 January 2024 at 13:14:28 UTC, Nick Treleaven wrote:

> On Friday, 12 January 2024 at 22:35:54 UTC, Walter Bright wrote:
>> So, instead of issuing a compilation error, the compiler can "slide" the arguments to the left, so the first argument is moved into the compile time parameter list. Then, the call will compile.
>
> FeepingCreature proposed this instead, which seems to be more flexible and clearer:
> https://forum.dlang.org/post/arzmecuotonnomsehrmk@forum.dlang.org

I like this one, the only thing to think about is what to do with existing calls like:

```d
format!"%s"("hello");
```

What goes where? Does "hello" get pushed to the args, or is it still the format string? Does it just not match that overload, and now you need separate overloads for when you explicitly instantiate with a parameter?

-Steve

January 13

On Friday, 12 January 2024 at 22:35:54 UTC, Walter Bright wrote:

> Given the interest in CTFE of istrings, I have been thinking about making a general use case out of it instead of one specific to istrings.
>
> But notice that `args` are runtime arguments. It turns out there is no way to use tuples to split an argument tuple into compile time and runtime tuples:
>
> ```d
> void pluto(Args...)(Args args)
> {
>     exec!(args[0])(args[1 .. args.length]);
> }
> ```

What if you split the type-tuple instead, but store a value at [0]?


```d
import std.stdio;

// Works today!
void pluto_v1(Args...)(string, Args[1] a1, Args[2] a2)
{
    pragma(msg, Args[0]); // Compile-time value!
}

// Almost compiles... add 'void' to explicitly slide instead of implicit magic.
void pluto_v2(Args...)(void, Args[1..$] args)
{
    pragma(msg, Args[0]); // Compile-time value!
    args.writeln;
}

void main()
{
    pluto_v1!("x", int, int)("x", 2, 3);
//  pluto_v2("x", 2, 3);
}
```