August 17, 2006
"Lionello Lunesu" <lio@lunesu.remove.com> wrote in message news:ec1a8l$19ab$1@digitaldaemon.com...
> kris wrote:
>> I'm hoping there's another option, better than both the above (or the above combination) ~ it's seems apparent that neither present an ideal resolution
>
> Delegates are a pretty advanced programming technique. I don't think it would be bad solution to just let the programmer mark the exceptional cases. At least until a fail-safe general solution becomes available.
>

Well, I think everyone can agree that the best possible solution would be an omniscient static escape analyzer.  It would guarantee that programs use the stack as much as possible, and only resort to the heap when absolutely necessary.

Sadly, such a program is an abomination of logic - its very existence a paradox.  So, we will have to settle for whatever escape analysis logic is available.  Real world analyzers have that unfortunate "maybe" case.  If we want to automate everything, then we need to assume heap allocation for all of the places where the implementation could fail.  As Kris pointed out, this is not good enough.

Here is what I propose:
1. The default frame-type is determined through static escape analysis.
2. Functions can be declared to explicitly contain a stack based frame.
3. If the escape analyzer determines that a function declared with a stack
frame contains an escaping delegate, then it may raise an error.

With these rules, we use all the information an escape analyzer can tell us. As Walter posted,

>Static escape analysis can yield 3 results:
>
>1) guaranteed to not escape
>2) might escape
>3) does escape

Simply marking everything except the first case as heap allocated does not distinguish between cases 2 and 3.  Nor does adding an "ultimate" override, since it could overpower the analyzer even in case 3.

This weaker override can only be applied to cases 1 and 2.  In case 1, it is merely redundant, but should be legal.  In case 3, it is blatantly wrong, so the compiler should say so.  In case 2, it is impossible for the compiler to determine whether the human was right or wrong, so we can allow the usage. This allows the programmer to effectively use each piece of information.

Here is a visual representation of this argument using ascii-tables:
[monospace]
Omniscient analyzer:
+-------------------------+----------------+
| Escape Analyzer Result  | Interpretation |
+-------------------------+----------------+
| No Escape               | Stack          |
| Does Escape             | Heap           |
+-------------------------+----------------+
2 total cases: {Stack, Heap}

Without an override (non-omniscient):
+-------------------------+----------------+
| Escape Analyzer Result  | Interpretation |
+-------------------------+----------------+
| No Escape               | Stack          |
| Might Escape            | Heap           |
| Does Escape             | Heap           |
+-------------------------+----------------+
2 total cases: {Stack, Heap}

With an ultimate override:
+-------------------------+-------------+----------+
| Escape Analyzer Result  | No Override | Override |
+-------------------------+-------------+----------+
| No Escape               | Stack       | Stack    |
| Might Escape            | Heap        | Stack    |
| Does Escape             | Heap        | Stack    |
+-------------------------+-------------+----------+
2 total cases: {(Stack, Stack), (Heap, Stack)}

With an weak override:
+-------------------------+-------------+----------+
| Escape Analyzer Result  | No Override | Override |
+-------------------------+-------------+----------+
| No Escape               | Stack       | Stack    |
| Might Escape            | Heap        | Stack    |
| Does Escape             | Heap        | Error    |
+-------------------------+-------------+----------+
3 total cases: {(Stack, Stack), (Heap, Stack), (Heap, Error)}
[/monospace]

Moreover, this strategy is independent of the type of escape analyzer used. The quality of the static checking could become a quality of implementation issue, much like the optimizer.  Picking one specific strategy would require integrating it into the core specification, which will create headaches for future generations of D compilers. This method is flexible enough to allow a short-term simple implementation, which can later be extended into a more powerful and accurate system.


August 17, 2006
1) What happens when a heap based function calls a stack based function? Will the code try to extend the heap frame? How would this be prevented? Use another register for the heap-frame-pointer? (stack-frame, heap-frame, stack, how many register are left?) Have a stub stack-frame to put it in? (makes fn calls more costly.)

2) heap frames only tackles the case where the "lost" scope is the function it's self. What about where the scope is a if block?

void fn()
{
	int i;
	auto dg = {return i;};

	if(i);
	{
		int j;
		dg = {return i+j;};
	}

	if(i)
	{
		int k = 3;
		k = dg();	// j overlaps k
	}
}
August 17, 2006
Lionello Lunesu wrote:
> kris wrote:
> 
>> I'm hoping there's another option, better than both the above (or the above combination) ~ it's seems apparent that neither present an ideal resolution
> 
> 
> Delegates are a pretty advanced programming technique. I don't think it would be bad solution to just let the programmer mark the exceptional cases. At least until a fail-safe general solution becomes available.
> 
> L.

At some point you have to just let the man shoot himself in the foot. You can't stop him. So the best we can do is stop the easy cases and put big warning labels on the hard ones. That is, make them official undefined behavior cases.

If I wanted a really robust systems, I would use a language that only uses heap frames. And has runtime type checking. And a load of other time hogging safety features. All of which I'm glad D *doesn't* have.


Alternative solution:

Requirer the programmer to write "correct" programs. Then, in debug builds, check for errors at run time.

One options for doing this check is this:

In debug builds, put a guard word at the start of each stack frame for every function with delegates. When the function returns, clear the guard.

All delegates are now replaced with something that looks like this (in sudo code):


&(
class {
	int guard;		// storage for the guard
	R delegate(A...) dg;	// storage for the actual delegate

		// whatever args are needed
	R test(A... args)
	{			// test the guard
		assert(guard == dg.this.__guard);
				// chain to the delegate
		return dg(args);
	}
      }
).test


The guard word is pulled from a counter so it will be unique.
August 17, 2006
"BCS" <BCS@pathlink.com> wrote in message news:ec2581$74o$2@digitaldaemon.com...
> 1) What happens when a heap based function calls a stack based function? Will the code try to extend the heap frame? How would this be prevented? Use another register for the heap-frame-pointer? (stack-frame, heap-frame, stack, how many register are left?) Have a stub stack-frame to put it in? (makes fn calls more costly.)
>

This would not present any sort of problem. Here is what a stack based prolog looks like:

push EBP;                    //Save old frame pointer on the stack
mov EBP, ESP;            //Set the new frame pointer & save stack pointer
sub ESP, frame_size;    //Allocate the new frame from the stack

And when we exit, we use the following epilog:

mov ESP, EBP;            //Restore old stack pointer
pop EBP;                     //Restore old frame pointer
ret;                               //Resume execution from the previous
frame

In this case, it does not matter what EBP was before we entered the function.  It could even be an invalid memory location for all we care.  So, calling a stack based function from a heap based function does not create any sort of problem.



> 2) heap frames only tackles the case where the "lost" scope is the function it's self. What about where the scope is a if block?
>
> void fn()
> {
> int i;
> auto dg = {return i;};
>
> if(i);
> {
> int j;
> dg = {return i+j;};
> }
>
> if(i)
> {
> int k = 3;
> k = dg(); // j overlaps k
> }
> }

Typically all the variables used in a given function would get allocated in the same frame.  In this case, the frame would look something like:

EBP[4] = return address
EBP[0] = previous frame pointer
EBP[-4] = i
EBP[-8] = j
EBP[-12] = k

If this is the situation, then i, j and k are all allocated at once when the function first gets called.  They are only initialized once they enter scope.

When considering the semantics of heap allocated frames, I would try to think of it like the class example given in the original post.  In that case, all of the variables and arguments are copied into a container class, which also has methods for every nested function.

Hopefully this makes things clearer.


August 17, 2006
Mikola Lysenko wrote:
> "BCS" <BCS@pathlink.com> wrote in message news:ec2581$74o$2@digitaldaemon.com...
> 
>>1) What happens when a heap based function calls a stack based function? Will the code try to extend the heap frame? [...]
> 
> 
> This would not present any sort of problem. Here is what a stack based prolog looks like:
> 
> push EBP;                    //Save old frame pointer on the stack
> mov EBP, ESP;            //Set the new frame pointer & save stack pointer
> sub ESP, frame_size;    //Allocate the new frame from the stack
> 
> And when we exit, we use the following epilog:
> 
> mov ESP, EBP;            //Restore old stack pointer
> pop EBP;                     //Restore old frame pointer
> ret;                               //Resume execution from the previous frame
> 
> In this case, it does not matter what EBP was before we entered the function. [...]
> 

That may be so but IIRC EBP is generally modified by the function as it executes. This is done to make room for temporaries and such. This will present a problem because it will create a "phantom" stack frame. This could be dealt with by removing the EBP adjustment code from the function, but that would require changes to the code generator.

> 
>>2) heap frames only tackles the case where the "lost" scope is the function it's self. What about where the scope is a if block?
>>
>>void fn()
>>{
>>int i;
>>auto dg = {return i;};
>>if(i);
>>{
>>int j;
>>dg = {return i+j;};
>>}
>>if(i)
>>{
>>int k = 3;
>>k = dg(); // j overlaps k
>>}
>>}
> Typically all the variables used in a given function would get allocated in the same frame.  In this case, the frame would look something like:
> 
> EBP[4] = return address
> EBP[0] = previous frame pointer
> EBP[-4] = i
> EBP[-8] = j
> EBP[-12] = k

This presents a problem, consider:

void fn()
{
	{
		int[BIG] big;
		use big
	}
	{
		float[HUGE] huge;
		use huge;
	}
}

In the general case (stack frames) these /should/ overlap. Otherwise the stack frame will become overly large. Heap frames could us a different layout, but again that will require changes the to code generator.

(BTW, I am now working on a machine that won't allow an int[1000] on the stack. So mandating that all stack variables be non overlapping in not an option.)


Both problems can be overcome, but I really don't like the idea of having the code generator produce different code for heap-frames and stack-frames.
August 17, 2006
Mikola Lysenko wrote:
> [snip]
> Any thoughts or comments?

Well, to me it seems that anything the compiler will try to do automatically will be wrong (or at least needlessly slow) in many cases. And a lot of the problem seems to be simply that one can't attach storage to a delegate without creating a whole class/struct, and doing that is too verbose to be used easily/often.

So, why not simply have some syntax sugar for that?

int delegate() fibs()
{
    int a=0, b=1;
    return delegate with(a,b) { // it takes a and b with it
        ...
    }
}

Which would become exactly what you proposed.


xs0
August 17, 2006
Bruno Medeiros wrote:
> nobody wrote:
>> Mikola Lysenko wrote:
>>
>>> Recent D releases (notably 0.161) have updated the syntax and improved the overall usability of lambda delegates. All the syntactic sugar is nice, but it also lays a few traps for the unwary.  Here is a simple example:
>>>
>>> // The Fibonacci numbers are an integer sequence such that
>>> //        F(n) = F(n - 1) + F(n - 2)
>>> //        And F(0) = 0, F(1) = 1
>>> int delegate() fibs()
>>> {
>>>     int a = 0;            // Initialize the last two terms of the Fibonacci     int b = 1;
>>>
>>>     return
>>>     {
>>>         int c = a + b;     // Calculate the next term in the sequence
>>>         a = b;               // Shift the previous terms back
>>>         b = c;
>>>         return c;            // Return the result
>>>     };
>>> }
>>>
>>> This function returns a function which will sequentially evaluate all of the Fibonacci numbers.  Notice that the inner delegate modifies the variables a and b in fibs() scope.  Because of this, it is not guaranteed to work after fibs returns.  This is most irritating, and it greatly restricts the use of this technique. Another potential use for lambda delegates is to create one-line adapter methods.  Consider the following attempt to create a button which will display an arbitrary message when clicked:
>>
>> It took me awhile to get what you were doing but it is elegant -- or would be if it worked. I am sure I am missing something however so I hope you might explain why using static storage is not sufficient:
>>
>> int delegate() fibs()
>> {
>>   return
>>   {
>>     static int a = 0;
>>     static int b = 1;
>>     int c = a + b;
>>     a = b;
>>     b = c ;
>>     return c;
>>   };
>> }
>>
> 
> For starters, a delegate can access the outer function arguments, but those arguments cannot be made static.
> 
> And then there will only be one "instance" of the returned delegate (think of the delegate as an object). Any call to it will update only one single sequence:
> 
>   auto dg1 = fibs();
>   auto dg2 = fibs();
>   dg1(); // 1
>   dg1(); // 2
>   dg2(); // 3
>   dg2(); // 5
> 
> However, with proper "heaped" frames, there are many delegate (outer frame) "instances":
> 
>   auto dg1 = fibs();
>   auto dg2 = fibs();
>   dg1(); // 1
>   dg1(); // 2
>   dg1(); // 3
>   dg2(); // 1
>   dg2(); // 2
>   dg2(); // 3
> 

Thanks for the reply. Your reply certainly helped me see the issues more clearly. I took some time and played with static storage just a bit more to see where its limits are. It is possible to get scoped vars into static fields. The real problem is somehow getting individual delegates to recognize which index is theirs. You should notice that in addition to making tables for multiple values there is also a table which stores all the delegates made. If there were a way to scan the table for the delegate within the delegate when everytime it is run then the problem would be solved.


  const int MAX_CONSECUTIVE_FIBS = 8;

  int delegate() fibs()
  {
    int a = 0;
    int b = 1;


    int delegate() fibs_dg_wrapper()
    {
      static int[MAX_CONSECUTIVE_FIBS] a_table;
      static int[MAX_CONSECUTIVE_FIBS] b_table;
      static int delegate()[MAX_CONSECUTIVE_FIBS] d_table;
      static int last_index = -1;

      last_index += 1;
      a_table[last_index] = a; // a valid in this wrapper func
      b_table[last_index] = b; // b valid in this wrapper func
      d_table[last_index] =    // keep reference to each dg
      {
        // would work if I could scan table for this dg
        // no idea how to do that and hence the printfs
        int my_index = last_index;

        int c = a_table[my_index] + b_table[my_index];
        a_table[my_index] = b_table[my_index];
        b_table[my_index] = c;
        dout.printf("fibs_dg @ %X\n",d_table[my_index]);
        dout.printf("fibs_dg.c @ %X\n\n",&c);
        return c;
      };
      return d_table[last_index];
    };
    return fibs_dg_wrapper();
  }



  int main(char[][] args)
    int delegate() fibsdg1 = fibs_multi();
    dout.printf("%d\n",fibsdg1());
    dout.printf("%d\n",fibsdg1());
    dout.printf("%d\n",fibsdg1());
    dout.printf("%d\n",fibsdg1());
    dout.printf("%d\n",fibsdg1());
    dout.printf("\n");
    int delegate() fibsdg2 = fibs_multi();
    dout.printf("%d\n",fibsdg2());
    dout.printf("%d\n",fibsdg2());
    dout.printf("%d\n",fibsdg2());
    dout.printf("%d\n",fibsdg2());
    dout.printf("%d\n",fibsdg2());
    dout.printf("\n");

  }
August 17, 2006
Regan Heath wrote:
> On Thu, 17 Aug 2006 01:15:15 +0100, Bruno Medeiros <brunodomedeirosATgmail@SPAM.com> wrote:
>>
>> For starters, a delegate can access the outer function arguments, but those arguments cannot be made static.
>>
>> And then there will only be one "instance" of the returned delegate (think of the delegate as an object). Any call to it will update only one single sequence:
>>
>>    auto dg1 = fibs();
>>    auto dg2 = fibs();
>>    dg1(); // 1
>>    dg1(); // 2
>>    dg2(); // 3
>>    dg2(); // 5
>>
>> However, with proper "heaped" frames, there are many delegate (outer frame) "instances":
>>
>>    auto dg1 = fibs();
>>    auto dg2 = fibs();
>>    dg1(); // 1
>>    dg1(); // 2
>>    dg1(); // 3
>>    dg2(); // 1
>>    dg2(); // 2
>>    dg2(); // 3
> 
> So, the other option is:
> 
> import std.stdio;
> 
> // The Fibonacci numbers are an integer sequence such that
> //        F(n) = F(n - 1) + F(n - 2)
> //        And F(0) = 0, F(1) = 1
> int delegate(inout int,inout int) fibs()
> {
>     return (inout int a, inout int b)
>     {
>         int c = a + b;     // Calculate the next term in the sequence
>         a = b;               // Shift the previous terms back
>         b = c;
>         return c;            // Return the result
>     };
> }
>
> ...
>
> Right? Passing the storage location to the calls.
> 
> Regan

Not sure if you figured it out on your own but I thought maybe someone else might have the same question. The short answer is that the delegate Mikola Lysenko was attempting to define evaluates the first time it is called to the first number in a sequence, S(1), and thereafter the Nth evaluation yields S(N).

The delegate you defined starts with two arbitrary params -- which can but needn't be the start of the sequence. This implementation requires somebody using your function to know what magic numbers start the whole thing off and requires you to test for strange values.

The entire challenge is really that there is no reason to pass arguments to Mikola Lysenko's function because it should know where it starts, how to progress from any N to N+1 and finally remember the last N for which it was evaluated.
August 17, 2006
xs0 wrote:
> Mikola Lysenko wrote:
> 
>> [snip]
>> Any thoughts or comments?
> 
> 
> Well, to me it seems that anything the compiler will try to do automatically will be wrong (or at least needlessly slow) in many cases. And a lot of the problem seems to be simply that one can't attach storage to a delegate without creating a whole class/struct, and doing that is too verbose to be used easily/often.
> 
> So, why not simply have some syntax sugar for that?
> 
> int delegate() fibs()
> {
>     int a=0, b=1;
>     return delegate with(a,b) { // it takes a and b with it
>         ...
>     }
> }
> 
> Which would become exactly what you proposed.
> 
> 
> xs0


Nice, but the problem with that is as follows:

When passing such a delegate as an argument, you don't know whether it'll be used in a synchronous or asynchronous manner. So, perhaps you choose the safe route and assume worst-case. That's cool, but if you choose best case instead (because you 'know' usage is synchronous) then you leave yourself open to nasty issues if that callee is ever changed. Think about passing a delegate to a third-party lib? The contract is weak, and therefore fragile. That's why we'd suggested an extension to the type system instead (which also has some problems).

The other thing that Mik noted is that the entire frame should be on the heap; not just part of it (for performance reasons), and taking a snapshot copy of the scope (or part thereof) does not work at all, since there could be self-references, pointers, whatever, in there. It seems you have to flip the entire frame into the heap.

Other than that, it is a pretty clean syntax :)




August 17, 2006
Bruno Medeiros wrote:
> Also, such analysis would also be useful for object (and data) allocation, as the concept (and the compiler checking method too, I think) is pretty much the same. (A new'ed object
> can be allocated on the stack, if it is determined that it does not escape).

That's correct, escape analysis can open the door for lots of cool improvements.