On Wednesday, 10 January 2024 at 19:53:48 UTC, Walter Bright wrote:
> >And you can get rid of the runtime overhead by adding a `pragma(inline, true)` `writeln` overload. (I guess with DMD that will still bloat the executable,
>I didn't mention the other kind of bloat - the rather massive number and size of template names being generated that go into the object file, as well as all the uncalled functions generated only to be removed by the linker.
Yes, DIP1036e has a lot of extra templates generated, and the mangled name is going to be large.
Let's skip for a moment the template that writeln will generate (which I agree isn't ideal, but also is somewhat par for the course).
This shouldn't be a huge problem for the interpolation types because the type doesn't get included in the binary. It is a big problem for the `toString` function, because that is included.
However, we can mitigate the ones that return `null`:
    string __interpNull() => null;
    struct InterpolatedExpression(string expr)
    {
        alias toString = __interpNull;
    }
    ... // and so on
I tested this and it does work. So this reduces all the `toString` member functions from `InterpolatedExpression` (and `InterpolationPrologue` and `InterpolationEpilog`, but those are not templated structs anyway) to one function in the binary.
But we can't do this for `InterpolatedLiteral` (which, by the way, is improperly described in Atila's DIP: the associated `toString` member function should return the literal).
We can possibly do a couple of things here to mitigate:
- We can modify how `std.format` works so it will accept the following as a `toString` hook:
    struct S
    {
        enum toString = "I am an S";
    }
This means no function calls, no extra-long symbols in the binary (since it's an enum, it should not go in), and I think even the compilation will be faster.
- We modify `std.format` to be aware of `InterpolatedLiteral` types, and avoid depending on the `toString` API. After all, we own both Phobos and druntime, so we can coordinate the release.
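To make the first option concrete, here is a minimal sketch of how a formatting routine might accept an `enum toString` alongside the usual member function. The helper name `stringify` and the struct `T` are hypothetical and only for illustration; this is not how `std.format` currently works, just one way the check could be written.

```d
// Hypothetical sketch: `stringify` is NOT part of Phobos.
struct S
{
    enum toString = "I am an S";          // compile-time constant, no function emitted
}

struct T
{
    string toString() const { return "I am a T"; }   // conventional hook
}

string stringify(V)(V value)
{
    // If V.toString can be read as a compile-time string without an
    // instance, treat it as a constant (an enum, or a CTFE-able static function).
    static if (__traits(compiles, { enum string s = V.toString; }))
        return V.toString;
    else
        return value.toString();           // fall back to the member function call
}

unittest
{
    assert(stringify(S()) == "I am an S");
    assert(stringify(T()) == "I am a T");
}
```

With a check like this, the `enum` case never generates a per-type `toString` symbol; the constant is simply substituted at the use site.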
And as a further suggestion, though this is kind of off-topic, we may look into ways to have templates that don't make it into the binary explicitly. Basically, they are marked as shims or forwarders by the library author, and just serve as a way to write nicer syntax. This could help in more than just the interpolation DIP.
>As far as I can tell, the only advantage of DIP1036 is the use of inserted templates to "key" the tuples to specific functions. Isn't that what the type system is supposed to do? Maybe the real issue is that a format string should be a different type than a conventional string.
No. While I agree that having a different type makes it more useful and easier to hook, there is a fundamental problem being solved with the compile-time literals being passed to the function. Namely, tremendous power is available to validate, parse, prepare, etc. string data at compile time, for use during runtime. This simply is not possible with 1027.
The runtime benefits are huge:
- No need to allocate anything (`@nogc`, `-betterC`, etc. all available)
- You get compiler errors instead of runtime errors (if you put in the work)
- It's possible to generate "perfect forwarding" to another function that does use another form. For example, `printf`.
- If you inline the call, it can be as if you called the forwarded function directly with the exactly correct parameters.
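As a rough illustration of the compile-time side of this, the sketch below builds a `printf` format string from the types in the sequence. The two structs are local stand-ins that only borrow the names used in this thread (their real definitions may differ), and `buildPrintfFormat` is a hypothetical helper. It only produces the format string; actually forwarding the filtered arguments to `printf` is left out.

```d
import std.traits : isInstanceOf, TemplateArgsOf;

// Stand-ins for the interpolation types (assumed shape, for illustration only).
struct InterpolatedLiteral(string text) {}
struct InterpolatedExpression(string expr) {}

// Hypothetical helper: computes a printf format string from the argument types.
string buildPrintfFormat(Args...)()
{
    string fmt;
    foreach (A; Args)
    {
        static if (isInstanceOf!(InterpolatedLiteral, A))
        {
            // Literal text is known at compile time; escape '%' for printf.
            foreach (char c; TemplateArgsOf!A[0])
            {
                if (c == '%') fmt ~= "%%";
                else fmt ~= c;
            }
        }
        else static if (isInstanceOf!(InterpolatedExpression, A))
        {
            // Marker only: the value itself is the next argument.
        }
        else static if (is(A == int)) fmt ~= "%d";
        else static if (is(A == double)) fmt ~= "%f";
        else static if (is(A : const(char)*)) fmt ~= "%s";
        else static assert(false, "no printf specifier known for " ~ A.stringof);
    }
    return fmt;
}

unittest
{
    // Evaluated entirely at compile time: no allocation at run time.
    enum fmt = buildPrintfFormat!(InterpolatedLiteral!"load: ",
                                  InterpolatedExpression!"pct", int,
                                  InterpolatedLiteral!"%\n")();
    static assert(fmt == "load: %d%%\n");
}
```

An unsupported argument type fails in the `static assert` branch, which is the "compiler errors instead of runtime errors" point from the list above.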
And I want to keep pointing out that a constructed "format string" mechanism is simply inferior, regardless of whether it is another type, as long as you don't need formatting specifiers (and arguably, it's just a difference in taste otherwise). The compiler parsed it out, so it knows the separate pieces. Giving those pieces directly to the library is both the most efficient way and the most obvious way. The "format string" mechanism, while making sense for `writef`, must add an element of complexity to the receiving function, since it now has to know what "language" the translated string is in. E.g., with DIP1027, one must know that `%s` is special and what it represents, and the user must know to escape `%s` to avoid miscommunication. With 1036e, there is no format string, so there is no complication or confusion there. The value being passed is right where you would expect it, and you don't have to parse a separate thing to know.
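To illustrate the "no format string to parse" point, here is a minimal sketch of a receiver that just walks the pieces in order. Again the structs are local stand-ins using the names from this thread, and `joinPieces` is a hypothetical function, not the actual `text` overload.

```d
import std.conv : to;
import std.traits : isInstanceOf, TemplateArgsOf;

// Stand-ins for the interpolation types (assumed shape, for illustration only).
struct InterpolatedLiteral(string text) {}
struct InterpolatedExpression(string expr) {}

// Hypothetical receiver: concatenates literals and values in the order given.
string joinPieces(Args...)(Args args)
{
    string result;
    foreach (arg; args)
    {
        alias A = typeof(arg);
        static if (isInstanceOf!(InterpolatedLiteral, A))
            result ~= TemplateArgsOf!A[0];   // literal text, taken as-is
        else static if (isInstanceOf!(InterpolatedExpression, A))
        {
            // Marker for the expression source; nothing to output.
        }
        else
            result ~= arg.to!string;         // the interpolated value itself
    }
    return result;
}

unittest
{
    // A "%s" inside the literal needs no escaping: nothing ever parses it.
    auto s = joinPieces(InterpolatedLiteral!"100%s sure: "(),
                        InterpolatedExpression!"n"(), 5);
    assert(s == "100%s sure: 5");
}
```

The receiver never has to know a format "language"; each piece arrives already separated and typed.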
Note that in YAIDIP, this was done partly through an interpolation header, which had all the compile-time information, and then strings and interpolated data were interspersed. I find this also a workable solution, and could even do without the strings being passed interspersed (as I said, we have control over `writeln` and `text`), but I think the ordering of the tuple to match what the actual string literal looks like is so intuitive, and we would be losing that if we did some kind of "format header" mechanism.
-Steve