Continuation passing style vs. wrapper objects in dmd
May 25, 2021
dmd has a few string functions with names having "Then" as a prefix that take a lambda and call it with a temporary string converted for OS purposes (zero-terminated, encoded a specific way etc). The use goes like this:

int i = module_name.toCStringThen!(name => stat(name.ptr, &statbuf));

The way it goes, `module_name` gets converted from `char[]` to a null-terminated `char*`, the lambda gets invoked, then whatever temporary memory was allocated is freed just after the lambda returns.
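
For readers who have not seen the pattern, here is a minimal sketch of how such a helper might look. `withCString` and `cLength` are hypothetical names used for illustration only; this is not dmd's actual `toCStringThen`, and it always allocates with `malloc` for simplicity.

```d
import core.stdc.stdlib : malloc, free;
import core.stdc.string : memcpy, strlen;

// Hypothetical helper in the spirit of toCStringThen; not dmd's actual code.
auto withCString(alias dg)(const(char)[] s)
{
    // Copy the slice into a zero-terminated temporary buffer.
    char* buf = cast(char*) malloc(s.length + 1);
    assert(buf !is null, "out of memory");
    scope (exit) free(buf); // runs right after dg returns
    if (s.length)
        memcpy(buf, s.ptr, s.length);
    buf[s.length] = '\0';

    const(char)[] z = buf[0 .. s.length];
    return dg(z); // dg's result is passed through to the caller
}

// Usage: the C function only sees the pointer while the buffer is alive.
size_t cLength(const(char)[] name)
{
    return name.withCString!(z => strlen(z.ptr));
}
```

The buffer's lifetime is tied to the call itself, which is what makes the pattern hard to misuse as long as the lambda does not leak the pointer.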

I was thinking there's an easier way that's also more composable:

int i = stat(stringz(name).ptr, &statbuf);

where `stringz` returns a temporary struct offering primitives such as `ptr` and `opSlice`. In the destructor, the struct frees temporary memory if allocated. Better yet, it can return them as `scope` variables, that way ensuring correctness in safe code.

Destruction of temporary objects has been sketchy in the past but I assume things have been ironed out by now.
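
A minimal sketch of what such a wrapper might look like, under stated assumptions: `StringZ`, its fields, and `cLengthWrapped` are hypothetical names, the buffer always comes from `malloc`, and the `scope`/`return` annotations that DIP1000 would need are deliberately left out.

```d
import core.stdc.stdlib : malloc, free;
import core.stdc.string : memcpy, strlen;

// Hypothetical wrapper: owns a zero-terminated copy of a slice.
struct StringZ
{
    private char* buf;
    private size_t len;

    @disable this(this); // non-copyable, so the buffer is freed exactly once

    this(const(char)[] s)
    {
        buf = cast(char*) malloc(s.length + 1);
        assert(buf !is null, "out of memory");
        if (s.length)
            memcpy(buf, s.ptr, s.length);
        buf[s.length] = '\0';
        len = s.length;
    }

    ~this()
    {
        free(buf); // free(null) is a no-op
        buf = null;
    }

    // Both accessors are only valid while the struct is alive.
    const(char)* ptr() const { return buf; }
    const(char)[] opSlice() const { return buf[0 .. len]; }
}

StringZ stringz(const(char)[] s) { return StringZ(s); }

// Usage: the temporary lives until the end of the full expression.
size_t cLengthWrapped(const(char)[] name)
{
    return strlen(stringz(name).ptr);
}
```

Nothing here stops a caller from storing `ptr` past the temporary's lifetime; that is the part the `scope` annotations would have to enforce.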

May 25, 2021
On 5/25/21 1:21 PM, Andrei Alexandrescu wrote:
> dmd has a few string functions with names having "Then" as a prefix

s/prefix/suffix/
May 25, 2021

On Tuesday, 25 May 2021 at 17:21:16 UTC, Andrei Alexandrescu wrote:
> int i = stat(stringz(name).ptr, &statbuf);

I much prefer that form too, but unfortunately it only works with immediate use. This is a big pitfall:

{
    const(char)* s = stringz(name).ptr;
    // destructor runs here
    printf(s); // use after free
}

You have to write:

{
    auto temp = stringz(name);
    const(char)* s = temp.ptr;
    printf(s); // good!
    // destructor runs here
}

The first form should be preventable with -dip1000, but it isn't currently:
https://issues.dlang.org/show_bug.cgi?id=20880
https://issues.dlang.org/show_bug.cgi?id=21868

May 25, 2021
On 5/25/21 1:56 PM, Dennis wrote:
> On Tuesday, 25 May 2021 at 17:21:16 UTC, Andrei Alexandrescu wrote:
>> int i = stat(stringz(name).ptr, &statbuf);
> 
> I much prefer that form too, but unfortunately it only works with immediate use. This is a big pitfall:
> ```D
> {
> const(char)* s = stringz(name).ptr;
> // destructor runs here
> printf(s); // use after free
> }
> ```
> 
> You have to write:
> ```D
> {
> auto temp = stringz(name);
> const(char)* s = temp.ptr;
> printf(s); // good!
> // destructor runs here
> }
> 
> ```

The CPS is exposed to the problem as well:

const(char)* p;
module_name.toCStringThen!(name => p = name.ptr);

> The first form should be preventable with -dip1000, but it isn't currently:
> https://issues.dlang.org/show_bug.cgi?id=20880
> https://issues.dlang.org/show_bug.cgi?id=21868

DIP1000 should help both, but the wrapper object is (to me at least) vastly easier on the eyes. Not to mention composability (what if you have two of those...)
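
To make the composability point concrete, here is a hedged sketch that converts two strings for a single C call, once in each style. `dupZ`, `withCString`, `StringZ`, `compareCps`, and `compareWrapped` are all hypothetical stand-ins written for this illustration, not dmd's actual code.

```d
import core.stdc.stdlib : malloc, free;
import core.stdc.string : memcpy, strcmp;

// Allocates a zero-terminated copy; shared by both hypothetical helpers.
char* dupZ(const(char)[] s)
{
    char* buf = cast(char*) malloc(s.length + 1);
    assert(buf !is null, "out of memory");
    if (s.length)
        memcpy(buf, s.ptr, s.length);
    buf[s.length] = '\0';
    return buf;
}

// CPS flavor (stand-in for toCStringThen).
auto withCString(alias dg)(const(char)[] s)
{
    char* buf = dupZ(s);
    scope (exit) free(buf);
    const(char)[] z = buf[0 .. s.length];
    return dg(z);
}

// Wrapper flavor (stand-in for the proposed stringz).
struct StringZ
{
    private char* buf;
    @disable this(this);
    this(const(char)[] s) { buf = dupZ(s); }
    ~this() { free(buf); }
    const(char)* ptr() const { return buf; }
}

// Two conversions with CPS: one nested lambda per string.
int compareCps(const(char)[] a, const(char)[] b)
{
    return a.withCString!(x => b.withCString!(y => strcmp(x.ptr, y.ptr)));
}

// Two conversions with wrappers: both temporaries live until the
// end of the full expression, so the call stays flat.
int compareWrapped(const(char)[] a, const(char)[] b)
{
    return strcmp(StringZ(a).ptr, StringZ(b).ptr);
}
```

Each additional string adds a nesting level in the CPS version, while the wrapper version only grows by one argument.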
May 26, 2021
On Tuesday, 25 May 2021 at 17:21:16 UTC, Andrei Alexandrescu wrote:
> int i = module_name.toCStringThen!(name => stat(name.ptr, &statbuf));

This can use alloca, if it does not inline.

> int i = stat(stringz(name).ptr, &statbuf);

This cannot...

May 26, 2021
On Wednesday, 26 May 2021 at 04:59:36 UTC, Ola Fosheim Grostad wrote:
>> int i = stat(stringz(name).ptr, &statbuf);
>
> This cannot...

Well, actually it could use alloca if you are 100% sure stringz is inlined, but it would be bad in a loop.

May 26, 2021

On Tuesday, 25 May 2021 at 17:21:16 UTC, Andrei Alexandrescu wrote:
> dmd has a few string functions with names having "Then" as a prefix that take a lambda and call it with a temporary string converted for OS purposes (zero-terminated, encoded a specific way etc). The use goes like this:
>
> [...]
>
> I was thinking there's an easier way that's also more composable:
>
> int i = stat(stringz(name).ptr, &statbuf);
>
> where `stringz` returns a temporary struct offering primitives such as `ptr` and `opSlice`. In the destructor, the struct frees temporary memory if allocated. Better yet, it can return them as `scope` variables, that way ensuring correctness in safe code.

Well, the usage of CPS is limited to one level here. As your example shows, we can return whatever we want from `toCStringThen`, and if needed, chain the return value with something else.

The `struct` approach works to a certain degree: DIP1000 would *not* provide the tool to make this pattern work in a `@safe` context. I have yet to see a container that is `@safe` to use with DIP1000 (e.g. https://github.com/dlang/phobos/pull/8101 ), but making the CPS work with DIP1000 is possible (provided DIP1000 works as intended). It's from this observation that this approach became my preferred one, and that's what led to `toCStringThen` (origin: https://github.com/dlang/dmd/pull/8585 ).

But this function is only used a handful of times (~10?) in DMD, and only for C functions or in trampoline functions (turning a slice into a pointer). Is there a large ROI in finding the best possible pattern for it? There are many large architectural problems in DMD that need to be addressed, such as the absolute lack of abstraction despite the OOP hierarchy. The semantic routines will cast a base type (`Expression`, `Dsymbol`, etc.) to a more specialized type literally everywhere, instead of relying on virtual functions / properties available in the base classes. Just grep for `cast(TypeFunction)` to get an idea of what I mean.

May 26, 2021
On Wednesday, 26 May 2021 at 05:18:57 UTC, Ola Fosheim Grostad wrote:
> On Wednesday, 26 May 2021 at 04:59:36 UTC, Ola Fosheim Grostad wrote:
>>> int i = stat(stringz(name).ptr, &statbuf);
>>
>> This cannot...
>
> Well, actually it could use alloca if you are 100% sure stringz is inlined, but it would be bad in a loop.

Thread necromancy yields:

ref E stalloc(E)(ref E mem = *(cast(E*)alloca(E.sizeof))) { return mem; }

Guaranteed to work, as default parameter initialisation occurs in the caller's context, but indeed loops are no fun.



May 26, 2021
On 2021-05-26 3:48, Mathias LANG wrote:
> On Tuesday, 25 May 2021 at 17:21:16 UTC, Andrei Alexandrescu wrote:
>> dmd has a few string functions with names having "Then" as a prefix that take a lambda and call it with a temporary string converted for OS purposes (zero-terminated, encoded a specific way etc). The use goes like this:
>>
>> [...]
>>
>> I was thinking there's an easier way that's also more composable:
>>
>> int i = stat(stringz(name).ptr, &statbuf);
>>
>> where `stringz` returns a temporary struct offering primitives such as `ptr` and `opSlice`. In the destructor, the struct frees temporary memory if allocated. Better yet, it can return them as `scope` variables, that way ensuring correctness in safe code.
> 
> Well, the usage of CPS is limited to one level here. As your example shows, we can return whatever we want from `toCStringThen`, and if needed, chain the return value with something else.
> 
> The `struct` approach works to a certain degree: DIP1000 would *not* provide the tool to make this pattern work in a `@safe` context. I have yet to see a container that is `@safe` to use with DIP1000 (e.g. https://github.com/dlang/phobos/pull/8101 ), but making the CPS work with DIP1000 is possible (provided DIP1000 works as intended). It's from this observation that this approach became my preferred one, and that's what led to `toCStringThen` (origin: https://github.com/dlang/dmd/pull/8585 ).

It's great that DIP1000 works with CPS. Given the familiarity and ubiquity of wrapper structs, the more important conclusion here is we must make DIP1000 work with them.

A struct should be able to expose its innards with `scope` and have the compiler make sure their use doesn't outlive the struct. It's pretty much the primary use case of DIP1000.

> But this function is only used a handful of times (~10?) in DMD, and only for C functions or in trampoline functions (turning a slice into a pointer). Is there a large ROI in finding the best possible pattern for it? There are many large architectural problems in DMD that need to be addressed, such as the absolute lack of abstraction despite the OOP hierarchy. The semantic routines will cast a base type (`Expression`, `Dsymbol`, etc.) to a more specialized type literally everywhere, instead of relying on virtual functions / properties available in the base classes. Just grep for `cast(TypeFunction)` to get an idea of what I mean.

Sure. It's a new idiom in D and out of character for dmd so it's worth exploring its pros and cons with an eye for further adoption.