August 16, 2006
Sean Kelly wrote:
> The alternative would be to use a separate keyword for these delegates, 'closure' or 'lambda' or some such, but that's potentially confusing, and leaves anonymous delegates in an odd position.

I *really* want to avoid having to do this. It's almost guaranteed that it'll be a rich source of bugs.
August 16, 2006
Walter Bright wrote:
> Sean Kelly wrote:
>> The alternative would be to use a separate keyword for these delegates, 'closure' or 'lambda' or some such, but that's potentially confusing, and leaves anonymous delegates in an odd position.
> 
> I *really* want to avoid having to do this. It's almost guaranteed that it'll be a rich source of bugs.

As an alternative, since it's really how the delegate is used that's at issue, perhaps the programmer could simply be given a way to manually "archive" the stack frame used by a delegate if he knows it will need to be called asynchronously?  From the original example:

    Button createButton(char[] click_msg)
    {
        Button b = new Button();
        b.mouseClickCallback = { MsgBox(click_msg); };
        return b;
    }

Let's assume mouseClickCallback is written like so:

    void mouseClickCallback( void delegate() dg )
    {
        clickHandler = dg;
    }

Following my suggestion, it would be changed to this:

    void mouseClickCallback( void delegate() dg )
    {
        dg.archive;
        clickHandler = dg;
    }

The archive routine would check dg's stack frame to see if a heap copy of the frame exists (assume it's stored as a pointer at this[0]).  If not then memory is allocated, the pointer is set, the frame is copied, and dg's 'this' pointer is updated to refer to the dynamic frame. Returning a delegate from a function would just implicitly call this 'archive' routine.  This could still cause errors, as a programmer may forget to call "dg.archive" before storing the delegate, but I think this is an acceptable risk and is far better than having the compiler try to "figure out" whether such a dynamic allocation is needed.  It also seems fairly easy to implement compared to the alternatives, and offering the feature through a property method would eliminate the need for a new keyword.
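
To make the idea concrete, here is roughly what an "archived" frame amounts to when written out by hand. This is only a sketch under my own assumptions, not how a compiler-provided dg.archive would be implemented: ArchivedFrame and makeHandler are made-up names, and it uses the string alias from current D rather than char[].

```d
// Hypothetical hand-written equivalent of an archived frame: the captured
// local is copied into a heap object, and the delegate's context pointer
// refers to that object rather than to the creating function's stack.
class ArchivedFrame
{
    string click_msg;                 // heap copy of the captured local

    this(string msg) { click_msg = msg; }

    // Stand-in for the body of the original delegate literal.
    string run() { return "clicked: " ~ click_msg; }
}

string delegate() makeHandler(string click_msg)
{
    auto frame = new ArchivedFrame(click_msg); // allocate and copy
    return &frame.run;                         // context is the heap object
}
```

A delegate obtained from makeHandler stays valid after makeHandler returns, because nothing it touches lives on the stack any more.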


Sean
August 16, 2006
"Walter Bright" <newshound@digitalmars.com> wrote in message news:ebvl5s$2k03$1@digitaldaemon.com...
> An ideal solution would be if the compiler could statically detect if a nested class reference can 'escape' the stack frame, and only then allocate on the heap. Otherwise, the current (very efficient) method of just passing a frame pointer would be employed.
>

Well, in general it is not decidable if a given nested function escapes. Here is a trivial example:

void delegate() func(void delegate() F)
{
    F();
    return { F(); };
}

This function will only return a delegate when F halts. One conservative strategy is to simply look for any nested function declarations in the function's lexical extent.  If such a declaration exists, then that function must be heap allocated.  I suspect that this is how C# handles the problem.

Another possibility is to add a special attribute modifier to the declaration of the initial function.  This places the burden of determining 'escapism' on the programmer instead of the compiler, and is probably the simplest to implement.

Perhaps the most flexible solution would be a combination of both approaches.  By default, any function containing a nested function declaration gets marked as heap allocated - unless it is declared by the programmer to be stack-based.  For this purpose, we could recycle the deprecated 'volatile' keyword.  Here is an example:

volatile void listBATFiles(char[][] files)
{
    foreach(char[] filename; filter(files, (char[] fname) { return fname[$-3..$] == "BAT"; }))
        writefln("%s", filename);
}

In this case, we know that the anonymous delegate will never leave the function's scope, so it could be safely stack allocated.  Philosophically, this fits with D's design.  It makes the default behavior safe, while still allowing the more dangerous behavior when appropriate.


August 16, 2006
Walter Bright wrote:
> BCS wrote:
> 
>> BTW how do you construct a delegate literal inside of a class method that uses "this" as its context, rather than the frame pointer?
>>
>> class Foo
>> {
>>     int i;
>>     int delegate() fig(){ return {return i;}; }
> 
> 
> It uses the frame pointer of fig() to access fig()'s this. Of course, though, this will currently fail because fig() is exited before the delegate is called.
> 

a.k.a. It can't be done?

>> //    int delegate() fig(){ return this.{return i;}; }  // this maybe?
>> }
>>
>> void main()
>> {
>>     auto foo = new Foo;
>>     auto farm = foo.fig;
>>
>>     foo.i = 5;
>>     auto i = farm(); // valid if context is "foo"
>>     assert(i == 5);
>> }
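
One workaround that seems to get the intended behaviour without a literal: take the address of a named method, which gives a delegate whose context is the object rather than fig()'s frame. A minimal sketch; get() is a helper I added for illustration and is not part of the original example.

```d
class Foo
{
    int i;

    // Named method; when turned into a delegate, its context is the object.
    int get() { return i; }

    // &this.get builds a delegate from `this` and the method, so nothing
    // here refers to fig()'s stack frame after fig() returns.
    int delegate() fig() { return &this.get; }
}
```

With this version the main() above behaves as hoped: farm() reads foo.i through the object, so setting foo.i = 5 after creating the delegate is visible.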
August 16, 2006
Sean Kelly wrote:
> Walter Bright wrote:
>> Sean Kelly wrote:
>>> The alternative would be to use a separate keyword for these delegates, 'closure' or 'lambda' or some such, but that's potentially confusing, and leaves anonymous delegates in an odd position.
>>
>> I *really* want to avoid having to do this. It's almost guaranteed that it'll be a rich source of bugs.
> 
> As an alternative, since it's really how the delegate is used that's at issue, perhaps the programmer could simply be given a way to manually "archive" the stack frame used by a delegate if he knows it will need to be called asynchronously?

Upon further reflection (and some helpful criticism), this seems like it may not work so well with delegates from structs and classes. But I do like the general idea better than that of flagging the delegates upon declaration or something like that.  I don't suppose the idea could be somehow refined to eliminate these problems?


Sean
August 16, 2006
kris wrote:

> Oskar Linde wrote:
>> kris wrote:
>> 
>> 
>>>Yeah, this is a serious trap for the unwary and, unfortunately, prohibits the use of such delegates in the one place where the elegance would be most notable: as gui callbacks, per your example. As you say, the way around it (currently) is to create a class to house the 'scope' content, or otherwise refer to it. That's unwieldy, even with anonymous-class syntax.
>>>
>>>If, as you suggest, D had some means to indicate that the scope should be placed on the heap instead of the stack, it would resolve the concern nicely. Perhaps that indication might be lambda syntax itself? For example, in C# the lambda indicator is the symbol '=>'
>> 
>> 
>> I don't think any special syntax is needed. I believe the compiler could be able to automatically identify whether a function may have escaping delegates referring to local variables.
>> 

I felt kind of stupid after posting this and finding out that everything (and much more) had already been posted two hours earlier.

> Yes, that's certainly one way. But the suggestion was to "take advantage" of an even simpler form of lambda delegates; one that C# 3.0 is moving toward. It's worth taking a look at, just for comparative purposes? Two random links:
> 
> http://www.interact-sw.co.uk/iangblog/2005/09/30/expressiontrees
> http://www.developer.com/net/csharp/article.php/3598381

Interesting. I always found the original D delegate syntax a bit too wordy, and with the new delegate syntax, I think many of us noticed how a little difference in typing overhead and clarity made the language feature much more compelling. The reason is probably purely psychological, but the result is there. Anyway, if you give a mouse a cookie... :)

If you want named and typed delegate arguments and a void return type, the current syntax is probably close to optimal, but most of the delegates I write are single expression functions. I think it is for those the C#3.0 lambda expression syntax was conceived.

What in D is: (int a, int b) { return a + b; }

is in C#:
(a, b) => a + b;

You could theoretically go further if you were willing to accept anonymous
arguments. Something like:
$1 + $2

And some languages would even accept:
'+

While I really like how the short, concise C# lambda expression syntax looks, I can't help but feel it is out of style with the rest of the language. (It is probably just a very temporary feeling though).

The => brings two features D's delegates don't have.
1. The short form for the single expression case.
2. Implicitly typed arguments.

#1 is the easy part. #2 is not.

> On the other hand, one of the great things about the D compiler is its voracious speed. If it turns out that automagically trying to figure out where the 'escapees' are will noticeably slow the compiler down, then a bit of syntactic sugar might make all the difference :)

From what little I know of compiler construction, escape analysis is something that is done to all variables anyways. My guess is that most delegate literals will end up escaping in some way or another, with the most common case being passed as arguments to an external function. The problem here is that the compiler can never know if that function intends to keep the delegate reference till after the instantiating function has returned.

Finding out which local or enclosing variables are referred to by escaping delegates can't be that much harder, so I doubt compilation speed is an issue. What could be an issue though is the fact that the compiler defensively would have to heap allocate all variables any escaping delegates refer to even though they in many cases never would be referenced after the enclosing function returns.

What could improve things is if there was a no-immigrants (only visitors) declaration for function arguments that guaranteed that any escaping delegate passed to one such function argument would never be stored after that function returns or be passed as argument to any other immigration friendly function:

void update(int[] arr, visitor void delegate(inout int a) updater) {
        foreach(inout i;arr)
                updater(i);
}

...
int b = ...;
myarr.update((inout int a) { a += b; });
// would not count as an escape and would not need b to be heap allocated
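
For comparison, current D has something close to this in the `scope` parameter storage class. A minimal sketch, assuming a D2 compiler (hence `ref` instead of `inout`); addToAll is a made-up helper used only to show a call site.

```d
// `scope` promises that `updater` is not stored anywhere that outlives
// the call to update().
void update(int[] arr, scope void delegate(ref int) updater)
{
    foreach (ref x; arr)
        updater(x);
}

int[] addToAll(int[] arr, int b)
{
    // The literal reads the local `b`. Because the parameter is `scope`,
    // the compiler is free to leave `b` on the stack for this call.
    update(arr, (ref int a) { a += b; });
    return arr;
}
```

The burden sits on the function taking the delegate, not on the caller, which is the same division of responsibility as the visitor idea.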

/Oskar

August 16, 2006
Oskar Linde wrote:
> 
> What could improve things is if there was a no-immigrants (only visitors)
> declaration for function arguments that guaranteed that any escaping
> delegate passed to one such function argument would never be stored after
> that function returns or be passed as argument to any other immigration
> friendly function:
> 
> void update(int[] arr, visitor void delegate(inout int a) updater) {
>         foreach(inout i;arr)
>                 updater(i);
> }
> 
> ....
> int b = ...;
> myarr.update((inout int a) { a += b; });
> // would not count as an escape and would not need b to be heap allocated
> 
> /Oskar
> 

Whoa... if I'm not mistaken that is kinda like the converse of const (read-only): it is a write-only (or use-only) variable. Its value cannot be read (at least directly):

  mydg = updater;  // cannot be read!
  updater(i);      // but can use;
  updater = xpto;  // and can assign to?

:D

-- 
Bruno Medeiros - MSc in CS/E student
http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D
August 16, 2006
Sean Kelly wrote:
>> An ideal solution would be if the compiler could statically detect if a nested class reference can 'escape' the stack frame, and only then allocate on the heap. Otherwise, the current (very efficient) method of just passing a frame pointer would be employed.
> 
> Agreed.  As well as detect whether the delegates reference any local stack data in the first place.  The alternative would be to use a separate keyword for these delegates, 'closure' or 'lambda' or some such, but that's potentially confusing, and leaves anonymous delegates in an odd position.
> 

Uh, shouldn't such a keyword be applied to the function whose frame *is used*, rather than the function (delegate) that uses the outer frame?

-- 
Bruno Medeiros - MSc in CS/E student
http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D
August 16, 2006
Mikola Lysenko wrote:
> "Walter Bright" <newshound@digitalmars.com> wrote in message news:ebvl5s$2k03$1@digitaldaemon.com...
> 
>>An ideal solution would be if the compiler could statically detect if a nested class reference can 'escape' the stack frame, and only then allocate on the heap. Otherwise, the current (very efficient) method of just passing a frame pointer would be employed.
>>
> 
> 
> Well, in general it is not decidable if a given nested function escapes. Here is a trivial example:
> 
> void delegate() func(void delegate() F)
> {
>     F();
>     return { F(); };
> }
> 
> This function will only return a delegate when F halts. One conservative strategy is to simply look for any nested function declarations in the function's lexical extent.  If such a declaration exists, then that function must be heap allocated.  I suspect that this is how C# handles the problem.
> 
> Another possibility is to add a special attribute modifier to the declaration of the initial function.  This places the burden of determining 'escapism' on the programmer instead of the compiler, and is probably the simplest to implement.
> 
> Perhaps the most flexible solution would be a combination of both approaches.  By default, any function containing a nested function declaration gets marked as heap allocated - unless it is declared by the programmer to be stack-based.  For this purpose, we could recycle the deprecated 'volatile' keyword.  Here is an example:
> 
> volatile void listBATFiles(char[][] files)
> {
>     foreach(char[] filename; filter(files,  (char[] fname)  { return fname[$-3..$] == "BAT"; }))
>         writefln("%s", filename);
> }
> 
> In this case, we know that the anonymous delegate will never leave the function's scope, so it could be safely stack allocated.  Philosophically, this fits with D's design.  It makes the default behavior safe, while still allowing the more dangerous behavior when appropriate.
> 
> 


A definition
---------------

It's worth making a distinction between the two types of delegate. There's what I'll call the synchronous and asynchronous versions, where the scope of the former is live only for the duration of its "host" function (frame needs no preservation; how D works today). Asynchronous delegates can be invoked beyond the lifespan of their original host, and thus their frame may need to be preserved. I say 'may' because it needs to be preserved only if referenced.


A couple of observations
------------------------

1) seems like the use of "volatile" (above) is more attuned to an asynchronous invocation rather than a synchronous one? The above listBATFiles() example is of a synchronous nature, yes?

2) the 'qualifier' in the above example is placed upon the host, where in fact it is the *usage* of the delegate that is at stake. Not the fact that it simply exists. For example:

# void foo(char[] s)
# {
#    bool isNumeric(char c) {return c >= '0' && c <= '9';}
#
#    foreach (c; s)
#             if (isNumeric(c))
#                 // do something
#                 ;
# }

In the above case, the nested isNumeric() function is clearly of the synchronous variety. It should use the stack, as it does today. Whereas this variation

# void foo(Gui gui, char[] s)
# {
#    bool isNumeric(char c) {return c >= '0' && c <= '9';}
#
#    char[] somethingElse() {return s ~ ": something else";}
#
#    foreach (c; s)
#             if (isNumeric(c))
#                 // do something
#                 ;
#             else
#                {
#                funcWrittenBySomeoneElse (&somethingElse);
#                break;
#                }
# }

How do you know what the else clause will do with the provided delegate? Is the usage-context synchronous, or will it wind up asynchronous? You just don't know what the function will do with the delegate, because we don't have any indication of the intended usage.

(please refrain from comments about indentation in these examples <g>)


Yet another strategy
--------------------

Consider attaching the qualifier to the usage point. For example, the decl of setButtonHandler() could be as follows:

# void setButtonHandler (volatile delegate() handler);

... indicating that the scope of an argument would need to be preserved. For the case of returned delegates, the decl would need to be applied to the return value and to the assigned lvalue:

# alias volatile delegate() Handler;
#
# Handler getButtonHandler();
#
# auto handler = getButtonHandler();

This is clearly adding to the type system (and would be type-checked), but it does seem to catch all cases appropriately. The other example (above) would be declared in a similar manner, if it were to use its argument in an asynchronous manner:

# void funcWrittenBySomeoneElse (delegate() other);
#
# or
#
# void funcWrittenBySomeoneElse (volatile delegate() other);

C# gets around all this by (it's claimed) *always* using a heap-based frame for delegates. That is certainly safe, but would be inappropriate for purely synchronous usage of nested functions, due to the overhead of frame allocation. It depends very much on the context involved ~ how the delegates are actually used.







August 16, 2006
Oskar Linde wrote:
> kris wrote:
> 
> 
>>Oskar Linde wrote:
>>
>>>kris wrote:
>>>
>>>
>>>
>>>>Yeah, this is a serious trap for the unwary and, unfortunately,
>>>>prohibits the use of such delegates in the one place where the elegance
>>>>would be most notable: as gui callbacks, per your example. As you say,
>>>>the way around it (currently) is to create a class to house the 'scope'
>>>>content, or otherwise refer to it. That's unwieldy, even with
>>>>anonymous-class syntax.
>>>>
>>>>If, as you suggest, D had some means to indicate that the scope should
>>>>be placed on the heap instead of the stack, it would resolve the concern
>>>>nicely. Perhaps that indication might be lambda syntax itself? For
>>>>example, in C# the lambda indicator is the symbol '=>'
>>>
>>>
>>>I don't think any special syntax is needed. I believe the compiler could
>>>be able to automatically identify whether a function may have escaping
>>>delegates referring to local variables.
>>>
> 
> 
> I felt kind of stupid after posting this and finding out that everything
> (and much more) had already been posted two hours earlier. 
> 
> 
>>Yes, that's certainly one way. But the suggestion was to "take
>>advantage" of an even simpler form of lambda delegates; one that C# 3.0
>>is moving toward. It's worth taking a look at, just for comparative
>>purposes? Two random links:
>>
>>http://www.interact-sw.co.uk/iangblog/2005/09/30/expressiontrees
>>http://www.developer.com/net/csharp/article.php/3598381
> 
> 
> Interesting. I always found the original D delegate syntax a bit too wordy,
> and with the new delegate syntax, I think many of us noticed how a little
> difference in typing overhead and clarity made the language feature much
> more compelling. The reason is probably purely psychological, but the
> result is there. Anyway, if you give a mouse a cookie... :)
> 
> If you want named and typed delegate arguments and a void return type, the
> current syntax is probably close to optimal, but most of the delegates I
> write are single expression functions. I think it is for those the C#3.0
> lambda expression syntax was conceived. 
> 
> What in D is: (int a, int b) { return a + b; }
> 
> is in C#:
> (a, b) => a + b;
> 
> You could theoretically go further if you were willing to accept anonymous
> arguments. Something like:
> $1 + $2
> 
> And some languages would even accept:
> '+
> 
> While I really like how the short, concise C# lambda expression syntax
> looks, I can't help but feel it is out of style with the rest of the
> language. (It is probably just a very temporary feeling though). 
> 
> The => brings two features D's delegates don't have.
> 1. The short form for the single expression case.
> 2. Implicitly typed arguments.
> 
> #1 is the easy part. #2 is not.
> 
> 
>>On the other hand, one of the great things about the D compiler is its
>>voracious speed. If it turns out that automagically trying to figure out
>>where the 'escapees' are will noticeably slow the compiler down, then a
>>bit of syntactic sugar might make all the difference :)
> 
> 
> From what little I know of compiler construction, escape analysis is
> something that is done to all variables anyways. My guess is that most
> delegate literals will end up escaping in some way or another, with the
> most common case being passed as arguments to an external function. The
> problem here is that the compiler can never know if that function intends
> to keep the delegate reference till after the instantiating function has
> returned.
> 
> Finding out which local or enclosing variables are referred to by escaping
> delegates can't be that much harder, so I doubt compilation speed is an
> issue. What could be an issue though is the fact that the compiler
> defensively would have to heap allocate all variables any escaping
> delegates refer to even though they in many cases never would be referenced
> after the enclosing function returns.
> 
> What could improve things is if there was a no-immigrants (only visitors)
> declaration for function arguments that guaranteed that any escaping
> delegate passed to one such function argument would never be stored after
> that function returns or be passed as argument to any other immigration
> friendly function:
> 
> void update(int[] arr, visitor void delegate(inout int a) updater) {
>         foreach(inout i;arr)
>                 updater(i);
> }
> 
> ...
> int b = ...;
> myarr.update((inout int a) { a += b; });
> // would not count as an escape and would not need b to be heap allocated
> 
> /Oskar
> 


aye ... we posted similar concerns and solutions :)