On Tuesday, 9 January 2024 at 23:21:34 UTC, Walter Bright wrote:
>A compile time way is DIP1027 can be modified to reject any arguments that consist of tuples with other than one element. This would eliminate nested istring tuples at compile time.
To sum up, it works with nested istrings poorly; it may even be sensible to forbid them entirely for DIP1027. Glad we’ve reached a consensus on this point. This case doesn’t seem crucial at the moment though; now we can focus on more relevant questions.
>DIP1036 cannot detect other problems with the string literals. It seems like a lot of complexity to deal with only one issue with malformed strings at compile time rather than runtime.
DIP1036 provides full CTFE capabilities at your disposal. You can validate anything about a format string; any compile-time-executable hypothetical validateSql(query)
will fit. I guess none of the examples presented so far featured such validation because it usually tends to be long and not illustrative.
However, another Adam’s example does perform non-trivial compile-time validation. Here is how it is implemented.
> >Constructing it at compile time is essential so that we can validate the generated SQL and abort compilation, as Paolo demonstrated.
That only checks one aspect of correctness - nested string interpolations.
They check a lot more. I agree it is hard to spot the error messages in the linked post so I’ll copy them here:
relation "snapshotsssss" does not exist. SQL: select size_mm, size_px from snapshotsssss where snapshot_id = $1
role "dummyuser" can't select on table "snapshots". SQL: select size_mm, size_px from snapshots where snapshot_id = $1
As you can see, they check sophisticated business logic expressed in terms of relational databases. And all of that happens at compile time. Isn’t that a miracle?
> >I explained here why these two arguments are valuable. Aren’t free of cost—correct unless you enable inlining. execi
may require some changes (like filterOutEmpty
I showed above) to make them free of cost, but it is doable.
You'd have to also make every formatted writer a template,
Err… every formatted writer has to be a template anyway, doesn’t it? It needs to accept argument lists that may contain values of arbitrary types.
>…and add the filter to them.
Yeah. I admit this is a problem. As a rule of thumb, the most obvious code should yield the best results. With DIP1036, this is not the case at the moment: when you pass an interpolation sequence to a function not specifically designed for it, it wastes more stack space than necessary and passes useless junk in registers.
Others have mentioned that DIP1027 performs much worse in terms of speed (due to runtime parsing). While that is undoubtable, I think DIP1036 should be tweaked to behave as good as possible.
There was an idea in this thread to improve the ABI so that it ignores empty structs, but I’m rather sceptical about it.
Instead, let us note there are basically two patterns of usage for istrings:
- Passing to a function that processes an istring and does something non-trivial.
execi
is a good example. - Passing to a function that simply stringifies every fragment, one after another.
writeln
is a good example.
Something counterintuitive, case 1 is easier to address: the function already traverses the received sequence and transforms it. So it is only necessary to write it in such way that it is inline-friendly.
By the way, what functions do we have in Phobos that fall into the case-2 category? write
/writeln
, std.conv.text
, std.logger.core.log
, and… is that all? Must be something else!..
Turns out there are only a handful of relevant functions in the entire stdlib. It shouldn’t be hard to put a filter in each of them. It also hints they are probably not that common in the wild.
However, when one encounters a third-party write
-like function that is unaware of InterpolationHeader
/etc., they should have a means to fix it from outside, i.e., without touching its source and ideally without writing a wrapper by hand. Unfortunately, I could not come up with a satisfactory solution for this. Will keep thinking. Perhaps someone else manages to find it faster.
An idea in a different direction. Currently, InterpolationHeader
/etc. structs interoperate with write
-like functions seamlessly (at the expense of passing zero-sized arguments) due to the fact they all have an appropriate toString
method. If we remove those methods (and do nothing else), then write(i"a$(x)b")
would produce something like:
InterpolationHeader()InterpolatedLiteral!"a"()InterpolatedExpression!"x"()42InterpolatedLiteral!"b"()InterpolationFooter()
The program, rather than introducing a silent inefficiency, immediately tells the user they need to account for these types.
And one more idea. Current implementation of DIP1036 can emit empty chunks—i.e., InterpolatedLiteral!""
—see for example i"$(x)"
. If I was making a guess why it does so, I would say it strives to produce consistent, regular sequences. On the one hand, it might ease the job of interpolation-sequence handlers: they can count on the fact that expressions and literals always alternate inside a sequence. On the other, they have to check if a literal is empty and drop it if it is so it actually makes their job harder.
I do not know whether not producing empty literals in the first place would be a positive or negative change. But it is something worth to consider.
Slightly off-topic: when I was thinking about this, I was astonished by the fact istrings can work with readf
/formattedRead
/scanf
. Just wanted to share this observation.
readf(i" $(&x) $(&y)");
> The compiler will indeed reject it (The error message would be a bit baffling to those who don't know what Interpolation types are)
This is true. I suppose the docs should mention InterpolationHeader
and friends when talking about istrings, explain what an istring is lowered to, and show examples. Then a programmer who’ve read the docs will have a mental association between “istring” and “InterpolationHeader/Footer/etc.” Those who don’t read the docs—well, they won’t have. Only googling will save them.
To be honest, I’m not concerned about this point too much.
>along with any attempt to call execi() with a pre-constructed string.
The end result is that to do manipulation with istring tuples, the programmer is alternately faced with adding Interpolation elements or filtering them out. Is that really what we want?
I’d argue it is wonderful that execi
cannot be called with a pre-constructed string. The API should provide another function instead—say, execDynamicStatement(Sqlite, string, Args...)
. execi
should be used for statically known SQL with interpolated arguments, and execDynamicStatement
—for arbitrary SQL constructed at runtime. A verbose name is intentional to discourage its usage in favour of execi
.
P.S. most keyboarding bugs result from neglecting to add needed syntax, not typing extra stuff.
That makes sense. Though you’ll never guess what beast can be spawned by uncareful refactoring. Extra protection won’t harm, especially if it’s zero-cost.
P.S. Zero-initialization of variables is one of D’s cool features, indeed.