Jump to page: 1 2
Thread overview
Idea for avoiding GC for lambdas
Jun 21, 2021
Elronnd
Jun 22, 2021
H. S. Teoh
Jun 22, 2021
H. S. Teoh
Jun 22, 2021
IGotD-
Jun 22, 2021
rm
Jun 23, 2021
12345swordy
June 21, 2021

I have some code in my codebase that does stuff like this:

struct S
{
   int id;
   int field1;
}

auto foo(S[] arr, int someval) // @nogc not allowed :(
{
   // remove all the elements of arr that have someval for field1
   import std.algorithm;
   return arr.filter!(v => v.field1 == someval);
}

While this works, and is nice, it's using a lambda with local references to filter data. This means a GC closure allocation is involved.

However, the value that is referenced (someval) is the only thing needed, and it seems a huge waste to allocate a stack frame just for that.

Let's look at another way to do this. filter does not have an overload which accepts a value to compare to, but find does. So let's build a new filter that uses it:

auto nogcfilter(alias pred, T, V)(T[] rng, V val)
{
    import std.range;
    import std.algorithm : find;
    static struct Filter {
        T[] src;
        V val;
        auto front() { return src.front; }
        void popFront() {src.popFront; src = src.find!pred(val); }
        auto empty() { return src.empty; }
    }
    return Filter(rng, val);
}

auto foo(S[] arr, int someval) @nogc // :)
{
   // note: the explicit lambda types are needed for some reason
   return arr.nogcfilter!((S v, int a) => v.field1 == a)(someval);
}

Now we get filter capability, but without the GC.

This works great because the context items used do NOT require any mutability. It also works great because the filter function is building a custom type ANYWAY.

Now, we could add this kind of functionality to filter, but being the lazy programmer that I am, what if the compiler could do this for me? In most cases, I don't want to sleuth out what context is needed (which might change as the code evolves). To force the user to both invent the context type (here, it's just an int, but it may require other values) and explicitly pass it is a bit verbose (I already don't like having to pass the int, and declaring the parameter).

What if, a function that accepts a lambda, could also accept a context which the compiler made up and passed in? Here's how it would possibly look (with a new magic type __context):

auto nogcfilter(alias pred, T, __context...)(T[] rng, __context ctxt)
{
    import std.range;
    import std.algorithm : find;
    static struct Filter {
        T[] src;
        __context val;
        auto front() { return src.front; }
        void popFront() {src.popFront; src = src.find!pred(val); }
        auto empty() { return src.empty; }
    }
    return Filter(rng, ctxt);
}

auto foo(S[] arr, int someval) @nogc
{
   return arr.nogcfilter!((v) => v.field1 == someval);
}

The compiler would see there's a lambda involved, automatically pass the context parameter someval into the function itself, and everything magically works. It would only be used when a closure would otherwise be allocated, and the function being called declares the context parameter.

Disclaimer: I am not a language developer.

Does this make sense? Is there anything feasible that might be similar? Would it be more feasible with different syntax?

I realize there are implications if the context parameter is mutable, so perhaps we would have to require only const contexts (though the compiler might be able to figure out that something isn't modified anywhere, and allow it to go in).

-Steve

June 21, 2021

On 6/21/21 4:01 PM, Steven Schveighoffer wrote:

>

   // remove all the elements of arr that have someval for field1

Obviously, that should have said include all the elements that have someval.

-Steve

June 21, 2021
The reason delegates involve GC is because the closure is shared.  E.G.:

int x;
auto f = () => x++;
auto g = () => x++;
f(); //0
g(); //1
f(); //2

f and g share a reference to the same x, so we need GC to know when to clean it up.

Your proposal is essentially a value delegate; the __context stuff makes that explicit, but it's not actually fundamentally different from the current scheme, it's just that the context becomes a value type rather than a reference type.

I think value closures are a good idea--for semantics reasons too, not just avoiding gc--but problematic for existing APIs to support, because every value closure will have a different type and a different size.  And we have enough problems already with compile time template explosion :)

WRT your proposal, I think it would be better to introduce an alternate syntax for function literals: we have  function(parm) => ret  and  delegate(parm) => ret; add  valuedelegate(parm) => ret.  And make it so that valuedelegate.ptr is a struct (the context struct) so you can still introspect it if needed.

> I realize there are implications if the context parameter is mutable, so perhaps we would have to require only const contexts

Why?  What you've made is basically a functor where the compiler infers the context; structs are received by reference (by their methods).  If the context were shared, then const wouldn't be enough; it would have to be immutable to retain value semantics.  But the whole point of this was to avoid sharing (and hence GC), so I don't see why that would be a problem.
June 22, 2021

On Monday, 21 June 2021 at 20:01:27 UTC, Steven Schveighoffer wrote:

>

The compiler would see there's a lambda involved, automatically pass the context parameter someval into the function itself, and everything magically works. It would only be used when a closure would otherwise be allocated, and the function being called declares the context parameter.

Disclaimer: I am not a language developer.

Does this make sense? Is there anything feasible that might be similar? Would it be more feasible with different syntax?

You are asking for C++ lambdas? No need for a new type of template parameter.

June 22, 2021
On 21/06/2021 23:01, Steven Schveighoffer wrote:
> I have some code in my codebase that does stuff like this:
> 
> ```d
> struct S
> {
>     int id;
>     int field1;
> }
> 
> auto foo(S[] arr, int someval) // @nogc not allowed :(
> {
>     // remove all the elements of arr that have someval for field1
>     import std.algorithm;
>     return arr.filter!(v => v.field1 == someval);
> }
> ```
> 
> While this works, and is nice, it's using a lambda with local references to filter data. This means a GC closure allocation is involved.
> 
> *snip*

It does come up every so often. You can also follow the links here.
https://forum.dlang.org/post/qnigarkuxxnqwdernhzv@forum.dlang.org


Maybe the zip+repeat trick should be streamlined a bit.
June 22, 2021

On 6/22/21 1:57 AM, Ola Fosheim Grostad wrote:

>

On Monday, 21 June 2021 at 20:01:27 UTC, Steven Schveighoffer wrote:

>

The compiler would see there's a lambda involved, automatically pass the context parameter someval into the function itself, and everything magically works. It would only be used when a closure would otherwise be allocated, and the function being called declares the context parameter.

Disclaimer: I am not a language developer.

Does this make sense? Is there anything feasible that might be similar? Would it be more feasible with different syntax?

You are asking for C++ lambdas? No need for a new type of template parameter.

I was kinda hoping for the compiler to assist in the generation and management of the context data since it's already doing it. The receiving function just needs a variable to receive and pass the context data to the lambda.

-Steve

June 22, 2021

On 6/21/21 6:19 PM, Elronnd wrote:

>

The reason delegates involve GC is because the closure is shared.  E.G.:

int x;
auto f = () => x++;
auto g = () => x++;
f(); //0
g(); //1
f(); //2

Yes, but only necessary if the data is mutated.

>

Your proposal is essentially a value delegate; the __context stuff makes that explicit, but it's not actually fundamentally different from the current scheme, it's just that the context becomes a value type rather than a reference type.

Right, I honestly don't know how the context is passed for an alias parameter. The compiler must pass some sort of context pointer to the calling function, which then gets stored somehow. But it's all hidden. If it's hidden, it might as well be as big as needed.

>

I think value closures are a good idea--for semantics reasons too, not just avoiding gc--but problematic for existing APIs to support, because every value closure will have a different type and a different size.

Remember, these aren't delegates in the strict sense, they are alias parameters. They can be implemented however D wants. Take a look at the filter implementation, there's not a delegate in sight.

>

And we have enough problems already with compile time template explosion :)

This shouldn't have a significant effect on templates -- aliases are already unique parameters, causing a new template for every call.

>

WRT your proposal, I think it would be better to introduce an alternate syntax for function literals: we have  function(parm) => ret  and delegate(parm) => ret; add  valuedelegate(parm) => ret.  And make it so that valuedelegate.ptr is a struct (the context struct) so you can still introspect it if needed.

I'm unsure if this is how it works, the alias parameters aren't exactly delegates, though that might be how the compiler implements them.

My thought was that the compiler could pick to send the function alias in as a straight alias (without context) and pass the context in as a separate parameter. The function itself would have a type that accepts the context data.

> >

I realize there are implications if the context parameter is mutable, so perhaps we would have to require only const contexts

Why?  What you've made is basically a functor where the compiler infers the context; structs are received by reference (by their methods).  If the context were shared, then const wouldn't be enough; it would have to be immutable to retain value semantics. But the whole point of this was to avoid sharing (and hence GC), so I don't see why that would be a problem.

I mean non-changeable contexts. These can be const data, as long as they aren't const references, they should be unchangeable.

I.e. the compiler should be able to pass an x like:

const x = 5;

-Steve

June 22, 2021

On 6/22/21 2:20 AM, rm wrote:

>

It does come up every so often. You can also follow the links here.
https://forum.dlang.org/post/qnigarkuxxnqwdernhzv@forum.dlang.org

Thanks for the link, I didn't remember this (it just happened a few months ago too!)

>

Maybe the zip+repeat trick should be streamlined a bit.

Yeah, I'm suggesting to avoid having to specify explicitly on the call side how to pass the data. The compiler could figure it out for me, and I get @nogc filter for free.

-Steve

June 22, 2021
On Tue, Jun 22, 2021 at 11:05:18AM -0400, Steven Schveighoffer via Digitalmars-d wrote:
> On 6/21/21 6:19 PM, Elronnd wrote:
[...]
> > Your proposal is essentially a value delegate; the __context stuff makes that explicit, but it's not actually fundamentally different from the current scheme, it's just that the context becomes a value type rather than a reference type.
> 
> Right, I honestly don't know how the context is passed for an alias parameter. The compiler must pass some sort of context pointer to the calling function, which then gets stored somehow. But it's all hidden. If it's hidden, it might as well be as big as needed.

AFAIK, delegates are essentially just:

	struct __dg {
		void* context;
		RetType function(void* context, Args...) func;
	}

An alias parameter, being a template parameter, simply causes the aliased function to get template-expanded in the body of the delegate. Any context it carries must be shared with the delegate's context, which is the source of the recent issue with multi-context template functions that resulted in the compiler change getting reverted.

The context must be stored *somewhere*, usually on the GC heap, which is why delegates depend on the GC. Except for scope delegates where it's guaranteed that the parent scope won't go out of scope before the delegate does, so the context pointer can just point to the stack where the parent's scope is stored.

Unless I'm missing something obvious, I don't think value delegates would work well with D, because it's pretty important for delegates with the same parameter and return types to be interchangeable. I.e., two delegates with external type `int delegate(string)` must have the same fixed size and memory layout, regardless of the size of the context of one delegate vs. the other, otherwise you have a pretty serious problem at the language level.


T

-- 
Everybody talks about it, but nobody does anything about it!  -- Mark Twain
June 22, 2021

On 6/22/21 12:13 PM, H. S. Teoh wrote:

>

Unless I'm missing something obvious, I don't think value delegates
would work well with D, because it's pretty important for delegates with
the same parameter and return types to be interchangeable. I.e., two
delegates with external type int delegate(string) must have the same
fixed size and memory layout, regardless of the size of the context of
one delegate vs. the other, otherwise you have a pretty serious problem
at the language level.

This again, is a misunderstanding of my proposal. I'm not talking about altering delegates in any way.

There are no delegates involved in filter. It is, instead, an inner struct with a hidden context pointer. Could just as well be a struct with context carried with it instead of allocated on the heap.

Note that this generates 2 different FilterResult instances, because they are 2 different lambdas:

int[] arr = [1, 2, 3];

auto f1 = arr.filter!(v => v % 2);
atuo f2 = arr.filter!(v => v % 2);
static assert(!is(typeof(f1) == typeof(f2)));

I'm unclear how the compiler generates and passes the context, but I know that it uses the GC. However, it would be reasonable to allow functions to accept the context by value (if it makes sense). One way to do this is to work out whether you can do it by value, and use that if possible. Another way is to explicitly have different syntax to pass the context by value.

I was hoping for a possible way forward where the compiler figures it all out for me.

-Steve

« First   ‹ Prev
1 2