Thread overview
Proposal: Hide the int in opApply from the user
Jan 07, 2008
Bill Baxter
Jan 07, 2008
Jason House
Jan 07, 2008
Sean Kelly
Jan 07, 2008
Bill Baxter
Jan 07, 2008
Bruce Adams
Jan 08, 2008
Bill Baxter
January 07, 2008
I proposed this initially over in D.learn, but I'm cleaning it up and reposting here in hopes of getting some response from Walter, who was probably too busy finishing const and eating holiday turkey at the time to notice.  And rightly so.

I don't believe it's appropriate in a high-level supposedly clean language like D that one of the main facilities for iterating over user types (foreach) requires writing code that passes around magic values generated by the compiler (opApply).

It seems wrong to me that these magic values
- come from code generated by the compiler,
- must be handled exactly the proper way by the user's opApply
    (or else you get undefined behavior, but no compiler errors)
- and then are handed back to code also generated by the compiler.

Furthermore, the compiler-generated code in the first and last steps shares the same scope!  So the solution seems obvious -- the two pieces should pass the information back and forth using a local variable in their shared local scope.
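
To make the "no compiler errors" point concrete, here is a minimal sketch in present-day D of the protocol as it stands.  The types Good and Sloppy and the helper bodyRuns are made up for illustration.  Both opApply versions compile, but the one that drops the delegate's result silently ignores a break in the caller:

```d
import std.stdio;

// Hypothetical container whose opApply follows the protocol:
// it stops and propagates any nonzero result from the loop body.
struct Good
{
    int[] items;
    int opApply(scope int delegate(ref int) dg)
    {
        foreach (ref x; items)
        {
            if (auto r = dg(x)) return r;
        }
        return 0;
    }
}

// Same container, but the result of dg is discarded.
struct Sloppy
{
    int[] items;
    int opApply(scope int delegate(ref int) dg)
    {
        foreach (ref x; items)
            dg(x);   // compiles fine, but the caller's break is lost
        return 0;
    }
}

// Counts how often the loop body runs when it breaks on the first element.
int bodyRuns(C)(C c)
{
    int runs = 0;
    foreach (x; c)
    {
        ++runs;
        break;
    }
    return runs;
}

void main()
{
    writeln(bodyRuns(Good([1, 2, 3])));     // prints 1
    writeln(bodyRuns(Sloppy([1, 2, 3])));   // prints 3
}
```

The loop body and the code after the loop live in the same function, yet the only channel between them runs through the user's opApply.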

In this proposal we need to add two things:

1) a new template struct and
2) a new macro [yes, this proposal relies on macros which don't exist yet!]


The template just bundles an int* (pointer to _ret) together with the loop body delegate:

   struct Apply(Args...)
   {
     alias void delegate(Args) LoopBody;
     LoopBody _loop_body;
     int* _ret = null;
   }

The macro is this (just guessing what the syntax will be, and hoping macros will support tuple-like varargs):

   macro yield(dg, args...) {
     dg._loop_body(args);
     if (dg._ret && *dg._ret) { return; }
   }

With these two library additions, opApply functions can become this:

void opApply( Apply!(ref T) dg ) {
    for( /*T x in elements*/ ) {
        yield(dg,x);
    }
}

Now the trickiness is *all* shifted to how you call such a beast properly, which is all handled by the compiler.  For a foreach in a void function, the compiler will have to generate code like so:

     int _ret = 0;
     void _loop_body(/*ref*/ T x)
     {
         writefln("x is ", x);
         if (x=="two") { _ret = BREAK; return; }
         if (x=="three") { _ret = RETURN; return; }
         do_something;
     }
     obj.opApply( Apply!(T)(&_loop_body, &_ret) );
     if (_ret==RETURN) return;


The language can ALMOST do this today except for three small things:
1) No macros - but they're on the way!
2) Inability to preserve ref-ness of template arguments -- but I think
this really needs to be solved one way or another regardless.
3) The necessary changes to the foreach code gen -- this is
straightforward.


Attached is a proof of concept demo.  I've manually inlined the yield() code to work around 1), and made the loop body use a non-ref type to work around 2).  I manually generated the foreach code too to deal with 3).
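
A rough sketch of what such a hand-lowered demo could look like in present-day D follows.  This is not the attached file.  The Words type, the helper visitUntilTwo, and the value chosen for BREAK are assumptions made for illustration, and the element type is passed by value rather than by ref, as described above:

```d
import std.stdio;

enum BREAK = 1;   // hypothetical value; the post names BREAK but gives no number

// Bundles the loop body with a pointer to the caller's _ret.
struct Apply(Args...)
{
    alias LoopBody = void delegate(Args);
    LoopBody _loop_body;
    int* _ret = null;
}

struct Words
{
    string[] elements;

    // New-style opApply: no int is returned or inspected by name.
    void opApply(Apply!(string) dg)
    {
        foreach (x; elements)
        {
            // yield(dg, x), inlined by hand
            dg._loop_body(x);
            if (dg._ret && *dg._ret) return;
        }
    }
}

// Hand-written stand-in for what the compiler would generate for:
//   foreach (x; obj) { visited ~= x; if (x == "two") break; }
string[] visitUntilTwo(Words obj)
{
    string[] visited;
    int _ret = 0;
    void _loop_body(string x)
    {
        visited ~= x;
        if (x == "two") { _ret = BREAK; return; }
    }
    obj.opApply(Apply!(string)(&_loop_body, &_ret));
    return visited;
}

void main()
{
    writeln(visitUntilTwo(Words(["one", "two", "three"])));   // prints ["one", "two"]
}
```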

The great thing about this proposal is that it is backwards compatible.
foreach already generates different code depending on what the
argument is, so this can just be another case detected by the use of the
Apply argument.  Code using old-style opApplys can continue to work.

The main thing fuzzy in my mind is the vague status of yield and Apply.
They don't need to be keywords per se, but the compiler at least needs
to know about Apply so that it can recognize the signature of this
"new-style" opApply.   I think it can maybe satisfy all that by going
into object.d?  If there were anonymous struct literals it wouldn't even
need to be a real struct, just an alias like we have for 'string' now.


--bb


January 07, 2008
I think this proposal has one fatal flaw...

There is no way for the opApply function to do something after iteration stops prematurely.  Some data structures could change internal state as they iterate.  Those state changes may require clean-up.  I have no good examples at the moment, but know they exist.
January 07, 2008
Jason House wrote:
> I think this proposal has one fatal flaw...
> 
> There is no way for the opApply function to do something after iteration stops prematurely.  Some data structures could change internal state as they iterate.  Those state changes may require clean-up.  I have no good examples at the moment, but know they exist.

scope(exit) would work, but it's not ideal.
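
A minimal sketch of the scope(exit) approach in present-day D, with the yield written out by hand.  The Cursor type, its openCount field, and the helper countsAroundBreak are hypothetical and exist only to show that the clean-up runs when iteration stops early:

```d
import std.stdio;

struct Apply(Args...)
{
    void delegate(Args) _loop_body;
    int* _ret = null;
}

struct Cursor
{
    int[] elements;
    int openCount;   // hypothetical state that must be restored after iteration

    void opApply(Apply!(int) dg)
    {
        ++openCount;
        scope(exit) --openCount;   // runs on normal exit, early return, or exception

        foreach (x; elements)
        {
            dg._loop_body(x);
            if (dg._ret && *dg._ret) return;   // yield's early exit
        }
    }
}

// Returns [openCount seen inside the loop, openCount after a loop that
// stops on its first element].
int[] countsAroundBreak()
{
    auto c = Cursor([1, 2, 3]);
    int ret = 0;
    int during = 0;
    void loopBody(int x) { during = c.openCount; ret = 1; }   // nonzero means "stop"
    c.opApply(Apply!(int)(&loopBody, &ret));
    return [during, c.openCount];
}

void main()
{
    writeln(countsAroundBreak());   // prints [1, 0]
}
```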


Sean
January 07, 2008
Sean Kelly wrote:
> Jason House wrote:
>> I think this proposal has one fatal flaw...
>>
>> There is no way for the opApply function to do something after iteration stops prematurely.  Some data structures could change internal state as they iterate.  Those state changes may require clean-up.  I have no good examples at the moment, but know they exist.

Hmm, you're right.  That's an issue that had not occurred to me because I've never seen it in code.  But I don't think it's a fatal flaw.

The yield thing is a pretty trivial macro.  I see several possible ways to handle such rare cases.

1) Add a public opCall and a public 'finished' method to Apply to allow users to do yield's work on their own:

struct Apply(Args...) {
    // .. same as before plus:
    void opCall(Args args) { _loop_body(args); }
    bool finished() { return _ret && *_ret != 0; }  // _ret may be null, as in yield
}

Then this is possible:

void opApply( Apply!(ref T) dg ) {
   for( /*T x in elements*/ ) {
       dg(x);
       if (dg.finished) {
           // do clean up;
           return;
       }
   }
}
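
For what it's worth, option 1 can be tried out in present-day D roughly as follows.  The Pool type, its log field, and the helper logAfterStopAt are made up for this sketch, and the Apply value is filled in field by field to keep the example independent of how struct literals interact with opCall:

```d
import std.stdio;

struct Apply(Args...)
{
    void delegate(Args) _loop_body;
    int* _ret = null;

    void opCall(Args args) { _loop_body(args); }
    bool finished() { return _ret && *_ret != 0; }
}

struct Pool
{
    int[] elements;
    string[] log;   // hypothetical record of what opApply did

    void opApply(Apply!(int) dg)
    {
        foreach (x; elements)
        {
            dg(x);
            if (dg.finished)
            {
                log ~= "cleanup";   // stand-in for real clean-up work
                return;
            }
        }
        log ~= "done";
    }
}

// Runs a hand-lowered loop that asks to stop when it sees stopValue.
string[] logAfterStopAt(int stopValue)
{
    auto p = Pool([1, 2, 3]);
    int ret = 0;
    void loopBody(int x) { if (x == stopValue) ret = 1; }

    Apply!(int) dg;
    dg._loop_body = &loopBody;
    dg._ret = &ret;
    p.opApply(dg);
    return p.log;
}

void main()
{
    writeln(logAfterStopAt(2));   // prints ["cleanup"]
    writeln(logAfterStopAt(9));   // prints ["done"]
}
```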


2) Provide an alternate macro that takes a cleanup parameter:

void opApply( Apply!(ref T) dg ) {
   for( /*T x in elements*/ ) {
       yield_with_cleanup(dg, { /*do cleanup*/ }, x);
   }
}


3)
> scope(exit) would work, but it's not ideal.

But it's probably the solution that I would use, and one of the other solutions can be used for what I expect are the very rare situations in which you both have to do clean up in your opApply and for some reason can't use scope(exit).


--bb
January 07, 2008
On Mon, 07 Jan 2008 09:06:52 -0000, Bill Baxter <dnewsgroup@billbaxter.com> wrote:

> I proposed this initially over in D.learn, but I'm cleaning it up and
> reposting here in hopes of getting some response from Walter, who was
> probably too busy finishing const and eating holiday turkey at the time
> to notice.  And rightly so.
>
> I don't believe it's appropriate in a high-level supposedly clean
> language like D that one of the main facilities for iterating over user
> types (foreach) requires writing code that passes around magic values
> generated by the compiler (opApply).
>
> It seems wrong to me that these magic values
> - come from code generated by the compiler,
> - must be handled exactly the proper way by the user's opApply
>     (or else you get undefined behavior, but no compiler errors)
> - and then are handed back to code also generated by the compiler.
>
When you say magic value what do you mean? From the context it sounds like you
are describing an iterator without iterators being properly part of the D world (yet).
January 08, 2008
Bruce Adams wrote:
> On Mon, 07 Jan 2008 09:06:52 -0000, Bill Baxter <dnewsgroup@billbaxter.com> wrote:
> 
>> I proposed this initially over in D.learn, but I'm cleaning it up and
>> reposting here in hopes of getting some response from Walter, who was
>> probably too busy finishing const and eating holiday turkey at the time
>> to notice.  And rightly so.
>>
>> I don't believe it's appropriate in a high-level supposedly clean
>> language like D that one of the main facilities for iterating over user
>> types (foreach) requires writing code that passes around magic values
>> generated by the compiler (opApply).
>>
>> It seems wrong to me that these magic values
>> - come from code generated by the compiler,
>> - must be handled exactly the proper way by the user's opApply
>>     (or else you get undefined behavior, but no compiler errors)
>> - and then are handed back to code also generated by the compiler.
>>
> When you say magic value what do you mean? From the context it sounds like you
> are describing an iterator without iterators being properly part of the D world (yet).

I mean this:

alias int MagicValue;

MagicValue opApply(MagicValue delegate(ref T) dg)
{
   MagicValue magic_value=0;
   foreach(x; stuff) {
       magic_value = dg(x);
       if (magic_value != 0) return magic_value;
   }
   return magic_value;
}

"Magic value" is that int that we're forced to have scattered all over our opApply functions.

--bb