DIP 1027--String Interpolation--Final Review Discussion Thread (page 3) - D Programming Language Discussion Forum


February 02, 2020

Re: DIP 1027--String Interpolation--Final Review Discussion Thread

Posted by Tove
in reply to MoonlightSentinel

On Sunday, 2 February 2020 at 16:10:49 UTC, MoonlightSentinel wrote:
> On Sunday, 2 February 2020 at 15:31:46 UTC, Tove wrote:
>> Implicit conversion to string would trigger allocations! This is not an option, allocations have to be explicit. However alias this to tuple is fine by me.
>
> The GC is usually opt-in. Having this feature in GC code would be better than not having it at all.

It's perfectly fine to have an explicit conversion to string, but it needs to be clearly visible from the code where all expensive operations are.

It's the same design philosophy as with the 'in' operator, you shouldn't use 'in' with O(n) algos, because it looks like a deceptively cheap operation.

February 02, 2020

Posted by Adam D. Ruppe
in reply to Tove

On Sun, Feb 02, 2020 at 03:31:46PM +0000, Tove via Digitalmars-d wrote:
> Implicit conversion to string would trigger allocations!

 indeed but you can also just not do that; you do need to specify string
 to trigger alias this.


 but i personally would also be ok with explicit toString call, my point
 is really with a struct we can do it all, whereas with the dip as-is we
 privilege one niche case at the expense of almost all others.

February 02, 2020

Posted by Tove
in reply to Adam D. Ruppe

On Sunday, 2 February 2020 at 16:21:18 UTC, Adam D. Ruppe wrote:
> On Sun, Feb 02, 2020 at 03:31:46PM +0000, Tove via Digitalmars-d wrote:
>> Implicit conversion to string would trigger allocations!
>
>  indeed but you can also just not do that; you do need to specify string
>  to trigger alias this.
>
>
>  but i personally would also be ok with explicit toString call, my point
>  is really with a struct we can do it all, whereas with the dip as-is we
>  privilege one niche case at the expense of almost all others.

Yes, but my primary concern is calling functions with string parameters.

Otherwise I fully agree with you, I prefer your struct solution with added explicit toString and implicit tuple expansion.

February 02, 2020

Posted by Dennis
in reply to Tove

On Sunday, 2 February 2020 at 15:31:46 UTC, Tove wrote:
> Implicit conversion to string would trigger allocations! This is not an option, allocations have to be explicit.

Memory allocation is an essential tool to make many language features like array literals and nested functions just work. Because of the garbage collector, this can be `@safe` without leaks too, which is a good quality of D.

> It's the same design philosophy as with the 'in' operator, you shouldn't use 'in' with O(n) algos, because it looks like a deceptively cheap operation.

`in` has worst-case time complexity O(n) for associative arrays, and in many practical cases an O(n) linear scan is faster than a complex O(log n) search algorithm. Also, the concatenation operator ~ is O(n) and triggers allocations. D is not C where every operation is supposed to compile to a handful of machine instructions at most.
If you write:

string c = i"name: $firstname $lastname";

I don't think anyone would expect this to be O(1) without allocations and dislike the language for making this simply work in the best possible way.

If you want it to not work unless no allocations are made, you can use @nogc.
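A minimal sketch of that point, assuming a hypothetical helper `fullName` (not from this thread) in place of the interpolated string, which current compilers do not accept. Code marked `@nogc` that would allocate is rejected at compile time, while the same code without the attribute simply works:

```d
// Hypothetical helper: builds a new string at run time, which needs the GC.
string fullName(string first, string last)
{
    return first ~ " " ~ last;
}

void main()
{
    // A @nogc function literal that calls the allocating helper does not compile...
    static assert(!__traits(compiles, () @nogc { auto s = fullName("a", "b"); }));
    // ...and neither does one that concatenates directly.
    static assert(!__traits(compiles, (string a, string b) @nogc { auto s = a ~ b; }));
    // Without @nogc the same call compiles and allocates as usual.
    assert(fullName("Ada", "Lovelace") == "Ada Lovelace");
}
```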

February 02, 2020

Posted by Tove
in reply to Dennis

On Sunday, 2 February 2020 at 18:07:00 UTC, Dennis wrote:
> On Sunday, 2 February 2020 at 15:31:46 UTC, Tove wrote:
>> Implicit conversion to string would trigger allocations! This is not an option, allocations have to be explicit.
>
> Memory allocation is an essential tool to make many language features like array literals and nested functions just work. Because of the garbage collector, this can be `@safe` without leaks too, which is a good quality of D.
>
>> It's the same design philosophy as with the 'in' operator, you shouldn't use 'in' with O(n) algos, because it looks like a deceptively cheap operation.
>
> `in` has worst-case time complexity O(n) for associative arrays, and in many practical cases an O(n) linear scan is faster than a complex O(log n) search algorithm. Also, the concatenation operator ~ is O(n) and triggers allocations.

The point is, it should be a fast operation, my conclusion still stands.

> D is not C where every operation is supposed to compile to a handful of machine instructions at most.
> If you write:
>
> string c = i"name: $firstname $lastname";
>
> I don't think anyone would expect this to be O(1) without allocations and dislike the language for making this simply work in the best possible way.
>
> If you want it to not work unless no allocations are made, you can use @nogc.

Sure @nogc works fine, but sometimes you want to use gc and when you do, the language should help you to do the right thing.

while(...)
  my_fun(i"name: $firstname $lastname");

If my_fun takes a string this would be terrible code, it should not compile until you make a conscious choice:
a) Performance is not important for my app. -> Let's add toString()
b) Add a new overload to my_fun that can handle interpolated strings directly.
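To make the two choices concrete, here is a rough sketch. `Interp` is a hypothetical stand-in for whatever the interpolated string would lower to, with no implicit conversion to `string`; it is not part of the DIP or of any library, and the pieces are written out by hand.

```d
import std.array : appender;

// Hypothetical stand-in for the result of i"name: $firstname $lastname".
struct Interp
{
    string[] parts;

    // Choice (a): an explicit conversion, so the allocation is visible at the call site.
    string toString()
    {
        auto buf = appender!string();
        foreach (p; parts)
            buf.put(p);
        return buf.data;
    }
}

// The existing function that takes a plain string.
size_t my_fun(string s)
{
    return s.length;
}

// Choice (b): an overload that consumes the pieces without joining them.
size_t my_fun(Interp s)
{
    size_t total;
    foreach (p; s.parts)
        total += p.length;
    return total;
}

void main()
{
    auto s = Interp(["name: ", "Ada", " ", "Lovelace"]);
    // Both routes agree; only the first one builds a new string.
    assert(my_fun(s.toString()) == my_fun(s));
    // There is no implicit conversion, so this must be spelled out by the caller.
    static assert(!__traits(compiles, { string t = s; }));
}
```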

February 02, 2020

Posted by Steven Schveighoffer
in reply to Adam D. Ruppe

On 2/2/20 9:29 AM, Adam D. Ruppe wrote:
> On Sunday, 2 February 2020 at 14:20:16 UTC, Timon Gehr wrote:
>> This is a use case for `alias this`, the language just has to support it.
> 
> Indeed.
> 
> So is
> 
> string s = i"";
> 
> so idk which one we'd use (unless the language gets multiple alias this lol).
> 
> But regardless the struct lets us have it all. With alias this it is every minor detail. Without it it is a simple method call.
> 
> I used to lukewarm support this DIP, but since the struct is so much better, superior in almost every objective measure, I now think we should vote it down unless it changes.

The problem I have with the struct mechanism is that it enforces the parameters are all non-reffable data.

e.g., this should work with the DIP as proposed:

int apples;
int bananas;
readf(i"I have $apples apples and $bananas bananas");

I don't know how you do that with a struct as the result of the interpolated string without knowing how the parameters will be used.

Or this:

struct S {}
foo!(i"I'm passing $S as an alias/type, with some formatting data around it");

Other than that, I like the idea of the struct for the purposes of overloading. I still would like the format string to be a custom type that devolves to string. I also liked the straight conversion to a tuple where every other parameter was a string (i.e. Marler's implementation).

The biggest thing that the DIP has going for it is that there are lots of functions which have a format string + args, due to the way D/C does varargs, and this will be a drop-in call without having to change any code or add special overloads. The whole printf thing is meh to me, I don't use it and don't care to use it with this mechanism anyway (writef is a different story).

And Walter:
C functions aren't overloadable?

This works just fine:

---------
import core.stdc.stdio : printf;

void printf(int x) { printf("%d", x); }

void main()
{
    printf("hello %d\n", 5);
    printf(5);
    printf("\n");
}

---------

output:

hello 5
5

-Steve

February 02, 2020

Posted by Dennis
in reply to Tove

On Sunday, 2 February 2020 at 18:43:20 UTC, Tove wrote:
> Sure @nogc works fine, but sometimes you want to use gc and when you do, the language should help you to do the right thing.
>
> while(...)
>   my_fun(i"name: $firstname $lastname");
>
> If my_fun takes a string this would be terrible code, it should not compile until you make a conscious choice:
> a) Performance is not important for my app. -> Let's add toString()
> b) Add a new overload to my_fun that can handle interpolated strings directly.

You know, maybe instead of trying to give D the reputation of a mechanically checked memory safe language, it should be marketed with something better:

Mechanically enforced premature optimization! ;)

In all seriousness, I understand that eagerly constructing new strings is a bad practice for performant code and that you want to discourage it. I just think that since D is also pretty good for quick scripts or fast prototyping, being able to quickly type an interpolated string to evaluate to `string` would be very convenient.

Personally I especially dislike always importing std.conv: text or std.string: format just to get an informative message for `throw new Exception("...")` or `assert(x, "...")`, so I hope interpolated strings can at least solve that.

February 02, 2020

Posted by Adam D. Ruppe
in reply to Steven Schveighoffer

On Sunday, 2 February 2020 at 18:46:36 UTC, Steven Schveighoffer wrote:
> The problem I have with the struct mechanism is that it enforces the parameters are all non-reffable data.

This is why I at first wanted a pure language solution instead of the hybrid solution now (language provides syntax sugar, library provides implementation), but I'm not actually sure this is worth fighting over.

> int apples;
> int bananas;
> readf(i"I have $apples apples and $bananas bananas");

It is kinda cool that this can work, but really, do you think it is legitimately useful? readf is so weird in how it works that you'd very rarely find a case where the interpolated string even does the right thing.

Moreover, this would also require ref; it wouldn't work with scanf. So that limits it even more.

But consider if you did want to do something like this, you could i"$(&apples)"... which actually would work with scanf as well as with the struct; a pointer can go in there easily enough.

> struct S {}
> foo!(i"I'm passing $S as an alias/type, with some formatting data around it");

Again, I think that is cool but not useful enough to justify compromising other cases. I'll take it if and only if it comes for free, and the naked tuple does not come for free.

A pure language struct btw can do this - the compiler, instead of calling a library function, just creates the type internally. Then it can declare them as aliases as needed. But that opens up other complications.... and really, what's the value? Why would you want an alias in the middle of a string? There might be some... but I can't think of one right now and I suspect there'd be a better way anyway.

But let's just be careful not to damage real world use cases in the name of "that might be cool in theory someday". The struct has definite real world use cases... the alias/ref tuples not so sure.

> The biggest thing that the DIP has going for it is that there are lots of functions which have a format string + args

Yeah, I do like being able to specify a format thingy (`${%3d}bananas`) in there. That's something I'd definitely see being legitimately used.

Though I'd probably prefer the format string to be built in a library, I like that bit enough that I want the magic lowering to do something with it somehow.

But otherwise it is easy for a library solution to provide format strings as needed, transparently to the user. And a library solution can handle the DIP's limitation:

string tool = "hammer";
writefln(i"hammering %s with $tool", "nails");

How? Well, since it is an independent object, it knows what arguments it had! It could translate itself before going to outside uses. And especially with a method to produce a format string, it can even be smart enough to escape % to %% in the interpolated things.

So that becomes

writefln(_interp!("hammering %s with ", "")(tool), "nails");

and then _interp returns its helper struct. Well, if we do implicit toString and tool == "%s" well, lol we just poisoned our format string.

But at the same time, we could provide a method to escape the interpolated thing - and writefln could overload based on this to just call that. writefln knows what it needs.

So it'd be like

void writefln(Args...)(__Interp fmt, Args args) {
    // escape % in each interpolated string so it can't act as a format specifier
    string f = fmt.toEscapedString!(a => a.replace("%", "%%"));
    // forward back to the normal one
    writefln(f, args);
}

Then that toEscapedString method on the struct calls the given delegate on each user string as it is appended to the final result. Allowing us to properly encode it in a particular context.
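A rough sketch of how that could fit together. Everything named here (`Interp`, `interp`, `toEscapedString`) is hypothetical and only illustrates the idea; the literal pieces and the arguments are passed by hand where the compiler would generate them, and the escaping step is a free function called via UFCS rather than a real member.

```d
import std.array : appender, replace;
import std.conv : to;
import std.format : format;

// Hypothetical helper struct: the literal text and the user-supplied values stay separate.
struct Interp
{
    string[] literals; // text around the values; literals.length == values.length + 1
    string[] values;   // interpolated values, already converted to text
}

// Stand-in for the compiler-generated lowering of an interpolated string.
Interp interp(Args...)(string[] literals, Args args)
{
    string[] values;
    foreach (arg; args)
        values ~= arg.to!string;
    return Interp(literals, values);
}

// Applies `escape` to each user value only, never to the literal text.
string toEscapedString(alias escape)(Interp s)
{
    auto buf = appender!string();
    foreach (i, v; s.values)
    {
        buf.put(s.literals[i]);
        buf.put(escape(v));
    }
    buf.put(s.literals[$ - 1]);
    return buf.data;
}

void main()
{
    string tool = "100%s"; // a value that would poison a naive format string
    auto s = interp(["hammering %s with ", ""], tool);
    string fmt = s.toEscapedString!(a => a.replace("%", "%%"));
    assert(fmt == "hammering %s with 100%%s");
    // The literal %s is still a placeholder; the one inside tool is now inert.
    assert(format(fmt, "nails") == "hammering nails with 100%s");
}
```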

You could use the same thing with html:

string user = "<try_injection>";

string myHtml = i"<b>$user</b>".toEscapedString!htmlEntitiesEncode;
// myHtml == "<b>&lt;try_injection&gt;</b>"

and ditto for any format you can imagine. The struct is flexible in so many ways that a tuple isn't! And it is easy to use.

Javascript's string interpolation even lets us do this. Do we want D to be defeated by JAVASCRIPT?!!?!?!?! lol

(this kind of thing btw is another argument against an implicit toString on alias this. Like I propose that to be friendly to people who insist `string s = i"$foo";` must work, and it is cool that we can, though I personally think it should be a decision. I don't care about GC, but I do care about proper encoding of output.)

Again, back to your main counter point, do we want to sacrifice all this *definite* value for possible uses in theory for ref arguments? I'm not against refness per se (I do find it a lil weird, but still)... just I don't think it is worth sacrificing anything for since the to-string potential is far more clear than the from-string readf potential.

February 02, 2020

Posted by Adam D. Ruppe
in reply to Dennis

On Sunday, 2 February 2020 at 19:22:23 UTC, Dennis wrote:
> In all seriousness, I understand that eagerly constructing new strings is a bad practice for performant code and that you want to discourage it. I just think that since D is also pretty good for quick scripts or fast prototyping, being able to quickly type an interpolated string to evaluate to `string` would be very convenient.

Yeah. Though see my last email for another point on this re encoding for different contexts.

And take note that with the struct proposal, it will not require an import. I'd implement it like:

string toString()() {
   import std.conv;
   return text(args);
}

so the import is internal to it (so you don't have to write it yourself), and being templated, the import is only triggered if the method is actually used (so you don't pay for it in cases like -betterC where you don't want it).

note we can also offer the toString(scope void delegate(const(char)[]) sink) version for working with those parts of the library that are allocation averse too.
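For illustration, a sink-based version might look like this. `Interp` and its `parts` field are hypothetical; the `toString` signature is the delegate form that std.format already recognizes, so the struct itself never builds the joined string.

```d
import std.format : format;

// Hypothetical stand-in for the interpolation struct discussed here.
struct Interp
{
    string[] parts;

    // Hands each piece to the caller's sink; the caller decides where the text goes.
    void toString(scope void delegate(const(char)[]) sink) const
    {
        foreach (p; parts)
            sink(p);
    }
}

void main()
{
    auto s = Interp(["I have ", "3", " apples"]);

    // std.format picks up the sink overload when formatting with %s.
    assert(format("%s", s) == "I have 3 apples");

    // The sink can also be driven directly, here just counting characters.
    size_t count;
    s.toString((const(char)[] chunk) { count += chunk.length; });
    assert(count == "I have 3 apples".length);
}
```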


So many of these problems are already solved in D library techniques!

February 02, 2020

Posted by Steven Schveighoffer
in reply to Adam D. Ruppe

On 2/2/20 2:36 PM, Adam D. Ruppe wrote:
> On Sunday, 2 February 2020 at 18:46:36 UTC, Steven Schveighoffer wrote:
>> The problem I have with the struct mechanism is that it enforces the parameters are all non-reffable data.
> 
> This is why I at first wanted a pure language solution instead of the hybrid solution now (language provides syntax sugar, library provides implementation), but I'm not actually sure this is worth fighting over.
> 
>> int apples;
>> int bananas;
>> readf(i"I have $apples apples and $bananas bananas");
> 
> It is kinda cool that this can work, but really, do you think it is legitimately useful? readf is so weird in how it works that you'd very rarely find a case where the interpolated string even does the right thing.

I'm sure I would rather use that than readf with format string and trailing parameters. And why would it "rarely" do the right thing?

> Moreover, this would also require ref; it wouldn't work with scanf. So that limits it even more.
> 
> But consider if you did want to do something like this, you could i"$(&apples)"... which actually would work with scanf as well as with the struct; a pointer can go in there easily enough.

Well, yeah. It's not limiting scanf, because scanf uses pointers. Either solution works there. But not for something like readf.

> 
>> struct S {}
>> foo!(i"I'm passing $S as an alias/type, with some formatting data around it");
> 
> Again, I think that is cool but not useful enough to justify compromising other cases. I'll take it if and only if it comes for free, and the naked tuple does not come for free.

I don't know for sure what possibilities are unlocked by this, but I know that the struct implementation has limitations that the naked tuple does not. Avoiding limitations means we could potentially find something genuinely useful that would have been blocked with something that has limitations.

I think we could get the best of both worlds if the interpolated string itself was not just a string, but rather a library-defined type (well something slightly more special -- it should implicitly cast to a null-terminated immutable(char)* if needed, just like string literals). You'd get overloading capabilities to do something custom with interpolated strings, and you would get all the niceties you would get from the struct solution, but without the limitations.

> A pure language struct btw can do this - the compiler, instead of calling a library function, just creates the type internally. Then it can declare them as aliases as needed. But that opens up other complications.... and really, what's the value? Why would you want an alias in the middle of a string? There might be some... but I can't think of one right now and I suspect there'd be a better way anyway.

I don't think this is the right route either. Putting "special" types into the language is something we're trying to move away from (it irks me still that the AA is not yet a library type, and needs TypeInfo to work).

> But let's just be careful not to damage real world use cases in the name of "that might be cool in theory someday". The struct has definite real world use cases... the alias/ref tuples not so sure.

The readf example is a real world use case.

> But otherwise it is easy for a library solution to provide format strings as needed, transparently to the user. And a library solution can handle the DIP's limitation:
> 
> string tool = "hammer";
> writefln(i"hammering %s with $tool", "nails");

This can be handled if the interpolated string has a different type than a normal string, allowing overloading.

> You could use the same thing with html:
> 
> string user = "<try_injection>";
> 
> string myHtml = i"<b>$user</b>".toEscapedString!htmlEntitiesEncode;
> // myHtml == "<b>&lt;try_injection&gt;</b>"

This can work with the DIP as-is.

EscapedString toEscapedString(EncodingStyle, T...)(string escapedFormat, T items)
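A sketch of what such a function might look like under the DIP's lowering, assuming every placeholder arrives as a plain `%s` and the literal text contains no other `%`. `htmlFormat` and `escapeHtml` are made-up names, and the call passes the format string and argument by hand. Note that the function has to re-split the format string to find where the arguments go.

```d
import std.array : appender, replace, split;
import std.conv : to;

// Hypothetical escaping helper; a real one would cover more characters.
string escapeHtml(string s)
{
    return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
}

// Receives the tuple form: a format string followed by the interpolated arguments.
string htmlFormat(Args...)(string fmt, Args args)
{
    // Re-parse the format string to recover the literal pieces.
    auto pieces = fmt.split("%s");
    assert(pieces.length == Args.length + 1, "placeholder/argument count mismatch");

    auto buf = appender!string();
    foreach (i, arg; args)
    {
        buf.put(pieces[i]);
        buf.put(escapeHtml(arg.to!string)); // only the arguments are escaped
    }
    buf.put(pieces[$ - 1]);
    return buf.data;
}

void main()
{
    string user = "<try_injection>";
    // Stands in for htmlFormat(i"<b>$user</b>") under the proposed lowering.
    assert(htmlFormat("<b>%s</b>", user) == "<b>&lt;try_injection&gt;</b>");
}
```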

> and ditto for any format you can imagine. The struct is flexible in so many ways that a tuple isn't! And it is easy to use.

Both have benefits, and both have limitations. I feel the limitations of the struct are that it doesn't allow whole categories of usage. The limitations of the tuple are that you'll have to parse the generated string again after the compiler already has done it (and again, this can be solved with a new type for the string).

> Javascript's string interpolation even lets us do this. Do we want D to be defeated by JAVASCRIPT?!!?!?!?! lol

I think anything javascript does with interpolated strings, we can do. We just have to write the function to do it.

> Again, back to your main counter point, do we want to sacrifice all this *definite* value for possible uses in theory for ref arguments? I'm not against refness per se (I do find it a lil weird, but still)... just I don't think it is worth sacrificing anything for since the to-string potential is far more clear than the from-string readf potential.

I see no difference in saying "you can access printf, you just need to put in a little .c at the end" vs. "you can access assignment to string, you just have to put a little .format at the end".

The to-string potential is just fine with the DIP. That's mainly what it's for. On the to-string side, both ideas have merits and drawbacks, and I view them quite ambivalently. The from-string (as you call it) or compile-time potential is non-existent in the "interpolated-string-to-struct" idea, so it's a clear win for something that gives you a tuple of what you passed in.

I would vote a resounding yes to this DIP + making the interpolated string a new type, but still a solid yes on the DIP. I would not vote no on the struct idea, but I would not be as positive about it.

And we can actually add later the idea of making the interpolated string a new type after this DIP is implemented, mitigating a lot of the concern here. I don't want to let perfect be the enemy of good.

-Steve
