February 28, 2003
They're both compiler bugs and I'll fix them. -Walter


March 01, 2003
In article <b3ko0o$ueh$1@digitaldaemon.com>, Walter says...
>
>void startLogging(string logFilename)
>{
>    class Foo
>    {    file_handle logfile;
>         void handleInterrupt(int interrupt) { writeMessage(interrupt, logfile); }
>    }
>    Foo f = new Foo;
>    f.logfile = openFile(logFilename, "a");
>    registerInterruptHandler(12, f.handleInterrupt);
>}
>
>It's a little more work, but not so bad.
>

Here's an idea for some syntactic sugar to make that pill easier to swallow:

In C++, classes can implement cast methods, for example:

class IntObj {
    int data;
public:
    operator float() { return (float)data; }
};

How about an overloadable operator such that you can cast a class instance to a delegate?

The declaration syntax may be difficult to get right. However, with it you could do this (the example assumes you add the keyword 'closure' to denote the technique):

void startLogging(string logFilename)
{
    class Foo
    {
        file_handle logfile;
        this (file_handle log) { logfile = log; }
        void handleInterrupt(int interrupt) { writeMessage(interrupt, logfile); }
        delegate void (int) closure { return handleInterrupt; }
    }
    Foo f = new Foo(openFile(logFilename, "a"));
    registerInterruptHandler(12, f);    // f is converted through its 'closure' operator
}

So in this example registerInterruptHandler received a normal "delegate void (int)", just like it expected.   "handleInterrupt" would be restricted by the compiler to only referencing globals or symbols within the Foo class (it would have no reference to anything in the startLogging function).  Anything the delegate needs, pass it to Foo's constructor.

One step further:

void startLogging(string logFilename)
{
    class Foo
    {
        file_handle logfile;
        this (char[] name) { logfile = openFile(name, "a"); }
        void handleInterrupt(int interrupt) { writeMessage(interrupt, logfile); }
        delegate void (int) closure { return handleInterrupt; }
    }
    registerInterruptHandler(12, new Foo(logFilename));
}

Voila... lexical closures, without the implementational complexity, and without any lurking stack traps for the unwary.

Dan


March 01, 2003
> Here's an idea for some syntactic sugar to make that pill easier to swallow:
>
> In C++, classes can implement cast methods, for example:
>
> class IntObj {
> int data;
> public:
> operator float() { return (float)data; }
> };
>
> How about an overloadable operator such that you can cast a class instance to a delegate?

Talking about hidden and obscure implicit behavior!
(You should interpret the above as "no, I don't think that's a good idea")

Why not?

a) "f.handleInterrupt" is a lot shorter to type than declaring &
implementing a 'delegator cast operator'
b) since it's a local function/class, it's sure to have limited size and
functionality. In fact, it should be as short as possible to keep
readability acceptable
c) overloadable operators are a questionable syntactic-sugar feature of C++.
Overloadable type-cast operators are a nightmare, since the implicit type
casts they cause go against all type safety principles
d) "casting a class instance to a delegate" - just reading that should raise
questions about the semantics...


March 01, 2003
What information would the compiler need to have in order to be able to detect such usage 100%?

Sean

"Walter" <walter@digitalmars.com> wrote in message news:b3lnk1$1jv2$1@digitaldaemon.com...
>
> "Jeroen van Bemmel" <anonymous@somewhere.com> wrote in message news:b3ligi$1gno$1@digitaldaemon.com...
> > The point is not that you wouldn't be able to make it work, the point is that you create a bug that is hard to find. This kind of use of nested functions should be detected and refused. It should be illegal to store a reference to a nested local function. In fact, the type of a nested function should not be 'delegate' but 'non_copyable_delegate' - can you make this?
>
> I agree that as much as possible the compiler should detect and reject such usage as illegal. But it cannot do it 100%. Another possibility would be to add a runtime check.


March 01, 2003
"Sean L. Palmer" <seanpalmer@directvinternet.com> wrote in message news:b3r11a$2aq1$1@digitaldaemon.com...
> What information would the compiler need to have in order to be able to detect such usage 100%?

It would have to know that any function you passed the delegate to would not store it in some global data structure. I think that's not possible.

But, I think a runtime check can be done.


March 01, 2003
"Jeroen van Bemmel" <anonymous@somewhere.com> wrote in message news:b3mab5$1vde$1@digitaldaemon.com...
> > I'm thinking it may be possible to add a runtime check for that.
>
> It may even be possible to do static checking:
>
> the 'registerInterruptHandler(12, handleInterrupt)' in the example would have to be declared as registerInterruptHandler( int, global delegate()) to be allowed to store the reference to the delegate somewhere. Passing 'handleInterrupt' would then be disallowed, since nested functions are of type "local delegate" and thus not assignment compatible
>
> You could discuss about the wording (local vs global or something else) or
> perhaps define a default ( all 'delegate' parameter values are assumed local
> (not storable) unless explicitly declared "global" (or "storable"), or vice
> versa )
>
I think it can be done even without the "global" keyword, by analyzing the
program and annotating every delegate argument of every function with the
global flag. This way you get an initial set of functions and a set of rules.
To get the result you would find the transitive closure.
This is something like the forward inference of expert systems. I believe that
you can also solve this using "backward inference" (as Prolog does) to
resolve only one specific case.
On the other hand, having the flag explicit can be a good hint for the programmer.
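
The forward inference described above can be sketched as a fixed-point computation over the call graph. This is only a minimal sketch in D under simplifying assumptions: each function is taken to have a single delegate parameter, and the per-function summaries (`FuncInfo`, `storesDirectly`, `passesTo`) are hypothetical and written by hand here, whereas a compiler would have to derive them from the source. Functions whose source is not available are pessimistically treated as storing.

```d
import std.stdio;

/// Hypothetical per-function summary (an assumption for illustration).
struct FuncInfo
{
    bool storesDirectly;   // assigns its delegate parameter to a global or member
    string[] passesTo;     // functions it forwards its delegate parameter to
}

/// Forward inference: start from the functions that store directly and
/// propagate the "global" flag to their callers until nothing changes.
bool[string] inferStoring(FuncInfo[string] funcs)
{
    bool[string] stores;
    foreach (name, info; funcs)
        stores[name] = info.storesDirectly;

    bool changed = true;
    while (changed)
    {
        changed = false;
        foreach (name, info; funcs)
        {
            if (stores[name])
                continue;
            foreach (callee; info.passesTo)
            {
                // Unknown callees (no summary available) count as storing.
                if (stores.get(callee, true))
                {
                    stores[name] = true;
                    changed = true;
                    break;
                }
            }
        }
    }
    return stores;
}

void main()
{
    FuncInfo[string] funcs = [
        "registerInterruptHandler": FuncInfo(true),
        "installLogger": FuncInfo(false, ["registerInterruptHandler"]),
        "forEachItem": FuncInfo(false),
        "setup": FuncInfo(false, ["installLogger", "forEachItem"])
    ];
    foreach (name, flag; inferStoring(funcs))
        writeln(name, ": ", flag ? "global (may store)" : "local (never stores)");
}
```

A nested function could then be passed only to functions whose flag comes out as "local". The backward variant would start from one call site and follow `passesTo` edges on demand instead of annotating the whole program.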


March 01, 2003
"Jeroen van Bemmel" <anonymous@somewhere.com> writes:
>> I'm thinking it may be possible to add a runtime check for that.
>
> It may even be possible to do static checking:
>
> the 'registerInterruptHandler(12, handleInterrupt)' in the example would have to be declared as registerInterruptHandler( int, global delegate()) to be allowed to store the reference to the delegate somewhere. Passing 'handleInterrupt' would then be disallowed, since nested functions are of type "local delegate" and thus not assignment compatible
>
> You could discuss about the wording (local vs global or something else) or
> perhaps define a default ( all 'delegate' parameter values are assumed local
> (not storable) unless explicitly declared "global" (or "storable"), or vice
> versa )

Okay, let's discuss.  First, let's make the vocabulary clear.  There are three kinds of callable entities:

"function" - just the normal function.
      Contains the address of the function.

"local delegate" - created by nested function or function literal
      Contains the address of the function and a pointer to
      the activation record of the function it is defined in.
      That activation record will be demolished (if it's allocated
      from the stack which is what we usually do) at some point.

"global delegate" - created from a class member function
      Contains the address of the function and a pointer to
      its environment, which is an object allocated on the
      heap.  The environment won't be destroyed until the delegate
      itself disappears.
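
To make the vocabulary concrete, here is a small sketch in present-day D syntax showing one value of each kind. The names (`twice`, `Counter`, `apply`, `demo`) are invented for the illustration, and "local" and "global" are this thread's terms, not keywords of the language. All three callables are used only while `demo` is still running, so nothing here depends on the lifetime question discussed below.

```d
import std.stdio;

int twice(int x) { return 2 * x; }   // an ordinary function

class Counter
{
    int total;
    int add(int x) { total += x; return total; }
}

// Calls whatever it is given; it does not store the delegate.
int apply(int delegate(int) dg, int x) { return dg(x); }

int demo()
{
    // "function": just the address of the code.
    int function(int) fp = &twice;

    // "local delegate": code address plus a pointer to demo's activation record.
    int offset = 10;
    int addOffset(int x) { return x + offset; }
    int delegate(int) localDg = &addOffset;

    // "global delegate": code address plus a pointer to a heap object.
    auto c = new Counter;
    int delegate(int) globalDg = &c.add;

    return fp(1) + apply(localDg, 1) + apply(globalDg, 5);
}

void main()
{
    writeln(demo());
}
```

Note that `localDg` and `globalDg` have exactly the same type, which is the problem described next: `apply` cannot tell from the type which kind it received.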

[I intentionally avoid the term "closure" because of my superstitiously traditional definition of closure: that it captures its environment. Global delegates are kind-of closures (although there environment == object), and local delegates are pseudo-closures which act like "real" closures until the environment goes out of scope, after which they cause terror and destruction.]

Now the situation is that local delegates cannot be safely stored, while global delegates can.  This is the fundamental difference. However, at the moment, there is only a single "delegate" concept in D.  The user of a delegate doesn't know whether it can be stored or not.  If this situation stays, it's likely that "Don't store local delegates" will be the first item in the book "D traps and pitfalls".  That would be frighteningly close to those "Remember to write a virtual destructor" hints that C++ books are full of.

The best situation would be if there would be no need for such advice. Delegates can be made safe in different ways, some of which are:


1. Relying on global error checking

After compilation, check every store of every delegate and ensure that they never originate from a nested function or function literal.

Pros:
- doesn't need changes in the language spec
- will probably prevent the majority of problems

Cons:
- the source must be available (or just the information about where
delegates are stored)
- complexity
- the resulting error message might still be intimidating to someone
who's about to shoot himself in the foot: "Illegal local delegate
store in function void foobar(delegate int() x) in flabbergast.d:234,
called in function void xyz() in application.d:95".  Still, it would
be better than letting the programmer actually shoot himself in the
foot.


2. Relying on lint-like tools

The same as above, except implemented as a separate tool.
Pros:
- doesn't complicate the compiler nor the language
Cons:
- makes the programmer's life more complex, though
- and it goes without saying that the language would instantly have a
C-like kludgy feeling


3. Making the two types of delegates different types.

Like this: (assume that "global delegate" is made the default.  This can be questioned, but at least that decision won't break the existing code)

- objects of type "delegate" can be stored
- objects of type "local delegate" can be used and passed around, but
not stored
- there should be an implicit conversion "delegate -> local delegate"
- optionally, the programmer should be able to convert local delegates
to "real" delegates, if he's 100% certain that the local delegate
won't outlive its activation record.  The cast should be as ugly and
eye-catching as possible, by similar reasoning as with C++'s
reinterpret_cast<>.  For instance, it could be useful for storing a
temporary copy of the delegate (for some reason).

Pros:
- safety: you cannot store a local delegate unknowingly,
which catches the vast majority of possible delegate bugs
- speed (the performance is the same as it is now)
- simplicity (from the implementor's viewpoint) - just add a new type,
a conversion, and there you go

Cons:
- complexity in the interface: you'll have to state "local
delegate(...)" everywhere where you know that you won't be storing the
delegate.  This is precisely the same problem as with C interfaces and
const: if you have, for example, the function void print(char *c);,
you can't give it a string literal, because that's a const char *.
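
The conversion rules of alternative 3 can be approximated in library code. The following is only a sketch, assuming two hypothetical wrapper structs (`LocalDelegate`, `StorableDelegate`) and invented helper names; real support would need the compiler to forbid stores of the local kind, which wrappers cannot do, since nothing stops code from copying the inner `dg` field. What the sketch does show is the asymmetry: storable converts to local implicitly, local converts to storable only through an eye-catching call.

```d
import std.stdio;

alias Dg = void delegate(int);

/// May be called and passed on, but no storing function accepts it.
struct LocalDelegate
{
    Dg dg;
    void call(int x) { dg(x); }
}

/// Safe to store: its environment is assumed to be a heap object.
struct StorableDelegate
{
    Dg dg;

    // Implicit conversion "delegate -> local delegate".
    @property LocalDelegate asLocal() { return LocalDelegate(dg); }
    alias asLocal this;
}

Dg[] registry;   // hypothetical global handler table

/// Stores its argument, so it demands the storable type.
void registerHandler(StorableDelegate h) { registry ~= h.dg; }

/// Only calls its argument, so it accepts either kind.
void callOnce(LocalDelegate h, int x) { h.call(x); }

/// Deliberately ugly escape hatch, in the spirit of reinterpret_cast<>.
StorableDelegate unsafeAssumeOutlivesFrame(LocalDelegate h)
{
    return StorableDelegate(h.dg);
}

class Logger
{
    int[] seen;
    void handle(int x) { seen ~= x; }
}

void main()
{
    auto logger = new Logger;
    auto storable = StorableDelegate(&logger.handle);

    int sum = 0;
    void accumulate(int x) { sum += x; }
    auto local = LocalDelegate(&accumulate);

    callOnce(local, 3);          // local delegates may be called
    callOnce(storable, 4);       // implicit StorableDelegate -> LocalDelegate
    registerHandler(storable);   // storable delegates may be stored

    // Rejected at compile time: no implicit conversion from local to storable.
    static assert(!__traits(compiles, registerHandler(local)));

    writeln(sum, " ", logger.seen, " ", registry.length);
}
```

The interface cost mentioned under "Cons" is visible here too: every function that merely calls its argument has to say `LocalDelegate` in its signature.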


4. Getting rid of local delegates altogether by allocating some of the activation records on the heap

This one would be something pretty cool if only possible.  The idea is to allocate from the heap those activation records that have escaping delegates. It's just a hot-headed idea and I'm not sure if it's at all feasible.

Usually activation records are either all allocated on the stack or all on the heap.  I found one description of a hybrid solution at http://www.cs.indiana.edu/~dyb/papers/stack-abstract.html, though. (Mainly targeted for languages with first-class continuations.)

Pros:
- safety: since all delegates are true closures, no need to worry
about pointers to long-gone activation records.
- simplicity: the programmer need not worry about local or global
delegates - he'll only know about delegates.

Cons:
- Speed.  Calling functions with escaping delegates is slower because
of heap-allocation and potential cache misses due to lost memory
locality. However, if the compiler could do global analysis and notice
that the delegate won't be stored anywhere, it could put the
activation record on the stack as usual and refer to that.
- Complexity in the implementation side, especially if the above
optimization is required
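
What alternative 4 would have the compiler do automatically is, in effect, what the class-based workaround from earlier in the thread does by hand: the variables the handler needs are moved out of the stack frame into a heap object. A minimal sketch of that manual version follows, assuming a hypothetical handler table (`handlers`) and an in-memory `lines` array standing in for the real log file; `LoggingFrame` is an invented name.

```d
import std.conv : to;
import std.stdio;

alias Handler = void delegate(int);

Handler[int] handlers;   // hypothetical interrupt table

void registerInterruptHandler(int number, Handler h) { handlers[number] = h; }

/// Stands in for the part of startLogging's activation record that the
/// handler needs; being a class instance, it lives on the heap.
class LoggingFrame
{
    string logFilename;
    string[] lines;      // stands in for the open log file

    this(string logFilename) { this.logFilename = logFilename; }

    void handleInterrupt(int interrupt)
    {
        lines ~= logFilename ~ ": interrupt " ~ to!string(interrupt);
    }
}

LoggingFrame startLogging(string logFilename)
{
    auto frame = new LoggingFrame(logFilename);
    registerInterruptHandler(12, &frame.handleInterrupt);
    return frame;   // returned only so the demo can inspect it
}

void main()
{
    auto frame = startLogging("app.log");
    // startLogging has returned, yet the handler's environment is still valid.
    handlers[12](12);
    writeln(frame.lines);
}
```

The speed cost listed above corresponds to the `new LoggingFrame` allocation, which is paid on every call to `startLogging` whether or not the handler is ever invoked.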


5. Combining the approaches 3 and 4

Include both local delegates and delegates as distinct types.

Whenever a local delegate has to be converted into a global delegate, the activation record of the delegate's originating function is marked to be heap-allocated.  This ensures that there can be no dangling references to the activation record.

Pros:
- speed: when a local delegate is required, we can use one without
allocating anything on the heap
- flexibility: when a global delegate is required, we can still use a
local one at the cost that the activation record is allocated from
the heap

Cons:
- complexity of both the interface (need two different types) and the
implementation (must be able to heap-allocate the activation records)


In summary: solutions (3)-(5) are all safe.  Solution (3) is
straightforward and emphasizes ease of implementation.  Solution (4)
emphasizes ease of use.  Solution (5) provides both speed and
flexibility at the cost of added complexity on both the usage and
implementation side.

In addition, there probably are feasible solutions that my imagination isn't capable of coming up with. Oh yeah, and then we have yet another alternative:

6. Letting things be as they are now - possibly unsafe, but simple, efficient and mostly useful.

Delegates are definitely good, and experienced developers will know to be careful when they encounter a delegate.  If they're even a bit safety-conscious, they'll never store the delegate, because it might contain a pointer to the stack frame.  Especially in a context where you can never know where the delegate comes from, and you have no way to inform that you must be able to store the delegate.

However, when programming projects get larger, someone will eventually, in a moment of dim-wittedness, hurry, carelessness, or fatigue, store a pointer to a local delegate, and then someone else will happily call it in an unexpected situation (say, an error handler which is called very rarely and tested only in a couple of specific situations, or perhaps not tested at all).

And there we have a catastrophe.

(To be dramatic, I'll now have to include some examples of software catastrophes to gain appreciation for my view :-)

http://courses.cs.vt.edu/~cs3604/lib/Therac_25/Therac_1.html
http://www.ima.umn.edu/~arnold/disasters/ariane.html
http://www.cs.tau.ac.il/~nachumd/verify/horror.html


I'd live happily with alternatives 3, 4 or 5 - and even 6, but I'd be much more comfortable with some kind of guarantee that a nonexistent stack frame will never be accessed.

-Antti


March 01, 2003
> It would have to know that any function you passed the delegate to would not store it in some global data structure. I think that's not possible.

No, but I believe it could be implemented as a compile-time check the way I indicated before: no function is allowed to store a delegate (assign it to a member variable / static variable, local variables could be allowed) unless it declares so as a flag to the delegate parameter. So

class X
{

void delegate(int) memberDelegate;

void f( void delegate( int ) ptr )
{
    memberDelegate = ptr;    // <= Compiler error: "Cannot store a reference to a delegate parameter that is not declared 'global'"
}

}

If the programmer needs f to store the delegate anyway, he would then have
to change it to, e.g.:
"void f( void delegate( int ) global ptr )"

Passing a nested function as delegate to such a function would then generate another compiler error: "Cannot pass local delegate as global delegate parameter" or something


March 01, 2003
"Jeroen van Bemmel" <anonymous@somewhere.com> wrote in message news:b3qvu1$2a5o$1@digitaldaemon.com...
> a) "f.handleInterrupt" is a lot shorter to type than declaring & implementing a 'delegator cast operator'

I do agree.

> b) since it's a local function/class, it's sure to have limited size and functionality. In fact, it should be as short as possible to keep readability acceptable

When you need a lexical closure, you currently need to create an object for
it. This adds lines for the object plus a line for the explicit
delegate creation.

> c) overloadable operators are a questionable syntactic-sugar feature of C++. Overloadable type-cast operators are a nightmare, since the implicit type casts they cause go against all type safety principles

I do agree.

> d) "casting a class instance to a delegate" - just reading that should raise questions about the semantics...

I'm not sure about this one. You are probably right, but on the other hand
it looks very similar to C++ functors, which are not so bad. Still, it could be cast
to more than one delegate. Yes, it may be better to create the delegate explicitly
(even though it is more to type).
Still, there is a difference compared with point c: it cannot cause a
chain of implicit casts (a delegate cannot be cast to anything else).
So it may be worth a try. What do others think?


March 01, 2003
Sean L. Palmer wrote:
> What information would the compiler need to have in order to be able to
> detect such usage 100%?

It would need to allocate an object at the start of the function that contains the frame pointer - this is what's now passed as the nested function's parent frame pointer.  When the function exits, it sets the pointer's contents to null; nested functions that are called check that the pointer is currently non-null before continuing.  Then it unpacks the value and continues.  The compiler will have to check that a delegate does use the parent frame pointer before making this check.
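
The mechanism can be imitated by hand to see how it would behave. This is only a sketch under stated assumptions: `Frame` stands in for the parent function's locals, `Guard` is the heap object holding the frame pointer, and the nested function is written as a method of `Guard` so that the delegate's context is the guard and not the stack. All names are hypothetical; in the real scheme the compiler would generate the guard and the null check itself.

```d
import std.stdio;

/// Stands in for outer's local variables.
struct Frame
{
    int counter;
}

/// Heap object that holds the parent frame pointer.
class Guard
{
    Frame* frame;   // set to null when the parent function exits

    /// The "nested function", reached through the guard.
    int nested()
    {
        // The check inserted before the frame pointer is unpacked.
        if (frame is null)
            throw new Exception("nested function called after its parent frame was destroyed");
        return ++frame.counter;
    }
}

int delegate() stored;   // hypothetical global that outlives the call

int outer()
{
    Frame frame = Frame(41);           // lives on outer's stack
    auto guard = new Guard;
    guard.frame = &frame;
    scope (exit) guard.frame = null;   // runs on every exit path

    stored = &guard.nested;            // delegate context is the guard
    return stored();                   // valid: the frame is still alive
}

void main()
{
    writeln(outer());
    try
    {
        stored();                      // the frame is gone by now
    }
    catch (Exception e)
    {
        writeln("caught: ", e.msg);
    }
}
```

The cost is one small heap allocation per call of the parent function plus one null test per call of the nested function, which is why the check would be skipped for delegates that never touch the parent frame.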