opApply Magic Function Body Transformation

5 days ago

Posted by Mike Shah

I'm preparing a video to teach opApply, and I'm still unsure how opApply works with regard to the compiler transformation. It's one of those features I understand how to use, but I'd like a deeper understanding before I teach it, so that I admittedly really know what I am doing. Below is something I quickly wrote using opApply as an example, followed by a few concrete questions (bolded below) if you want to skip ahead.

import std.stdio;

struct Array(T){
  T[] array;
  int opApply(int delegate(ref T) dg){
    int result;
    for(int i=0; i < array.length; i++){
        result = dg(array[i]);
        if(result){
          break;
        }
    }
    return result;
  }
}

void main(){
  Array!int ints;
  ints.array = [4,6,8,10,12];

  foreach(item; ints){
    writeln(item);
  }

}

The purpose of opApply I am clear on -- it's a member function for 'foreach/foreach_reverse' loops for use in iteration. It takes priority over range member functions if both are defined, and otherwise maybe has some performance trade-offs (or at the least, it's slightly easier to template one member function versus 3 for an inputRange to avoid virtual calls -- but that's an aside that needs testing). Okay -- but now onto the part where I need some more understanding -- the delegate and the transformation.
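A quick way to check the priority claim is to give one type both protocols and record which one 'foreach' picks. This is only a sketch: 'Both', 'lastProtocol' and 'whichProtocol' are made-up names for illustration.

```d
import std.stdio;

// Records which iteration protocol was used last (illustrative global).
string lastProtocol;

// Hypothetical type that offers range primitives and opApply at once.
struct Both {
    int[] data;

    // Input range primitives
    bool empty() { return data.length == 0; }
    int front() { lastProtocol = "range"; return data[0]; }
    void popFront() { data = data[1 .. $]; }

    // opApply
    int opApply(scope int delegate(ref int) dg) {
        lastProtocol = "opApply";
        foreach (ref e; data)
            if (auto r = dg(e)) return r;
        return 0;
    }
}

string whichProtocol() {
    auto b = Both([1, 2, 3]);
    foreach (e; b) {}   // the compiler chooses the protocol here
    return lastProtocol;
}

void main() {
    writeln(whichProtocol()); // prints "opApply"
}
```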

In my understanding/teaching of opApply, I would break opApply into two main concepts:
1.) The operator overload itself: 'opApply' must always return an int in its member function signature.
2.) The single parameter to opApply must be a delegate. The parameters to the delegate match the 'foreach' parameters. This portion also has more to do with the magic transform I am not 100% clear on.
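To make point 2 concrete, here is a small sketch with two opApply overloads, one per 'foreach' shape. The index overload and the 'describe' helper are my own additions for illustration and are not part of the example above.

```d
import std.stdio;
import std.format : format;

struct Array(T) {
    T[] array;

    // Matches: foreach (item; arr)
    int opApply(scope int delegate(ref T) dg) {
        foreach (ref e; array)
            if (auto r = dg(e)) return r;
        return 0;
    }

    // Matches: foreach (i, item; arr)
    int opApply(scope int delegate(size_t, ref T) dg) {
        foreach (i, ref e; array)
            if (auto r = dg(i, e)) return r;
        return 0;
    }
}

// Illustrative helper: runs both loop shapes and records what each one saw.
string describe() {
    auto arr = Array!int([4, 6, 8]);
    string s;
    foreach (item; arr)        // one loop variable -> one delegate parameter
        s ~= format("%d ", item);
    foreach (i, item; arr)     // two loop variables -> two delegate parameters
        s ~= format("%d:%d ", i, item);
    return s;
}

void main() {
    writeln(describe()); // prints "4 6 8 0:4 1:6 2:8 "
}
```

The overload is chosen by matching the number of 'foreach' variables against the number of delegate parameters.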

The second part (the delegate parameter) is what I'm interested in being able to visualize. In the above code, there are two steps going on:

First, the transformation of a foreach loop being lowered to a regular 'for' loop.

  foreach(item; ints){
    writeln(item);
  }

// is re-written to something-like what is below.
// But we don't really know if it's 'ints.array.length' or some other field.
// Thus we rely on our overload (which we can have multiple) of opApply
// to sort of figure this out..
  {
    int i=0;
    for(; i != ints.array.length; i++){
      writeln( /* item */ ); // 'item' represents whatever the delegate parameter
                             // is i.e. 'ref T' above.
                             // The 'T' is also the item I am iterating on, and
                             // performing some computation on.
    }
  }

The next transformation I can kind of see if I compile with 'dmd -vcg-ast main.d'. I can see the delegate and some magic '__applyArg0'.

  23 void main()
  24 {
  25   Array!int ints = 0;
  26   ints.array = [4, 6, 8, 10, 12];
  27   cast(void)ints.opApply(delegate int(ref int __applyArg0) @safe => 0);
  28   return 0;
  29 }

So really my one concrete question is -- can I see main.main()__foreachbody_L21_C3(ref int) anywhere? This appears to be either a function or label that is the body of my loop. As you'll notice, on line 27, there is no longer a 'foreach' loop anymore. But I don't seem to be able to see the transformation, or otherwise find the symbols anywhere. Perhaps there is another magic compiler flag I am missing?

My second concrete question is: when learning opApply, is it useful to think of the 'work' done in the body of the 'foreach' loop as being copied and pasted in? Or is it better to just look at cast(void)ints.opApply(delegate int(ref int __applyArg0) @safe => 0); and understand that your code has been magically transformed? Some of my intuition is in the comments below.

// Somewhere in main
27   cast(void)ints.opApply(delegate int(ref int __applyArg0) @safe => 0);


// "Sort of" what is going on.
// At least to provide some mental model
// for the Array.opApply implementation.
    int opApply(int delegate(ref int __applyArg0) dg)
    {
      int result;
      for(int i=0; i < this.array.length; i++){
         result = dg(this.array[i]);
                       // {
                       // 'dg' represents the original 'foreach' loop
                       //  effectively copied here.
                       // -- but it's not a 'copy and paste of code here',
                       // instead we have a call to a magic delegate
                       // that exists somewhere from compiler.

                       // I can *think* of the delegate like pasting
                       // in the body of original foreach loop, but only one
                       // item at a time -- considering the single 'index' (elem)
                       // from our loop.
                       // writeln(elem); // same work, but this work comes from
                       // body of original 'foreach' found in 'main()' now wrapped
                       // in a delegate function.
                       //};

          if(result){  // Returns '0' from magic delegate at some point?
            break;
          }
      }
      return result;
    }

A third question: will this call to a delegate introduce more hidden allocations, I wonder?

Some more investigation

The disassembly (I built with gdc-14 and disassembled with 'objdump -d main_binary') seems to correspond to something like the code below. In 'C' parlance, we have a function pointer representing where the 'foreach_body' from main is stored. It also appears (both from the disassembly and from -vcg-ast) that the return value of '1' or '0' does not seem important?

void opApply(int* array, int size, void (*func)(int)) {
    int index = 0;
    int result = 0;

    while (index < size) {
        if (index < size) {
            func(array[index]);
            result = 1;
        }
        index++;
    }
}
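
One way to probe whether the return value matters is to compare an opApply that forwards the delegate's result with one that drops it. This is a sketch only; 'Careful', 'Careless' and 'countVisits' are made-up names for illustration.

```d
import std.stdio;

// Forwards a non-zero result from the delegate, as opApply is expected to.
struct Careful {
    int[] data;
    int opApply(scope int delegate(ref int) dg) {
        foreach (ref e; data)
            if (auto r = dg(e)) return r;
        return 0;
    }
}

// Hypothetical broken variant: ignores what the delegate returns.
struct Careless {
    int[] data;
    int opApply(scope int delegate(ref int) dg) {
        foreach (ref e; data)
            dg(e);   // result dropped, so the loop body cannot stop iteration
        return 0;
    }
}

// Counts how many times the loop body runs when it tries to break at 6.
int countVisits(R)(R r) {
    int visits;
    foreach (e; r) {
        visits++;
        if (e == 6)
            break;
    }
    return visits;
}

void main() {
    writeln(countVisits(Careful([4, 6, 8, 10, 12])));  // prints 2
    writeln(countVisits(Careless([4, 6, 8, 10, 12]))); // prints 5
}
```

With the careless version, the 'break' in the body is silently ignored, which suggests the result is only unimportant when the body never leaves the loop early.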

Sorry if my questions are not clear, any guidance or pointers to examples are helpful!

5 days ago

Re: opApply Magic Function Body Transformation

Posted by kinke
in reply to Mike Shah

On Monday, 28 July 2025 at 04:39:00 UTC, Mike Shah wrote:

So really my one concrete question is -- can I see main.main()__foreachbody_L21_C3(ref int) anywhere?

I think that's where the confusion comes from, that misleading -vcg-ast output for the loop-body-lambda, apparently printed as … => 0 regardless of the actual loop body.

Based on your example:

import core.stdc.stdio;

struct Array(T) {
  T[] array;
  int opApply(scope int delegate(ref T) dg){
    foreach (ref i; array){
      const result = dg(i);
      if (result)
        return result;
    }
    return 0;
  }
}

void main() {
  Array!int ints;
  ints.array = [4,6,8,10,12];

  foreach (item; ints) {
    if (item == 1)
      continue;
    if (item == 2)
      break;
    printf("%d\n", item);
  }
}

The loop is actually rewritten by the compiler to:

ints.opApply((ref int item) {
  if (item == 1)
    return 0;  // continue => abort this iteration
  if (item == 2)
    return 1;  // break => abort this and all future iterations
  printf("%d\n", item);
  return 0;    // continue with next iteration
});

So the main thing here is that the body is promoted to a lambda, and the control-flow statements inside the body (break and continue in the example above) are transformed to specific return codes for the opApply delegate protocol.
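
The protocol can also be exercised by hand, by calling opApply directly with a lambda that returns the codes itself. A minimal sketch, where 'viaForeach' and 'viaDelegate' are illustrative names and the element values are chosen so that both branches are hit:

```d
import std.stdio;

struct Array(T) {
    T[] array;
    int opApply(scope int delegate(ref T) dg) {
        foreach (ref i; array) {
            const result = dg(i);
            if (result)
                return result;
        }
        return 0;
    }
}

// The loop as normally written.
int[] viaForeach(Array!int ints) {
    int[] seen;
    foreach (item; ints) {
        if (item == 6)
            continue;
        if (item == 10)
            break;
        seen ~= item;
    }
    return seen;
}

// Hand-written approximation of what the compiler generates.
int[] viaDelegate(Array!int ints) {
    int[] seen;
    ints.opApply((ref int item) {
        if (item == 6)
            return 0;  // continue
        if (item == 10)
            return 1;  // break
        seen ~= item;
        return 0;      // next iteration
    });
    return seen;
}

void main() {
    auto ints = Array!int([4, 6, 8, 10, 12]);
    writeln(viaForeach(ints));  // prints [4, 8]
    writeln(viaDelegate(ints)); // prints [4, 8]
}
```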

If we add a return statement to the body:

int main() {
  Array!int ints;
  ints.array = [4,6,8,10,12];

  foreach (item; ints) {
    if (item == 0)
      return item;
    if (item == 1)
      continue;
    if (item == 2)
      break;
    printf("%d\n", item);
  }
  return 0;
}

then the rewrite becomes a bit more complex:

int __result;  // magic variable inserted by the compiler, for the main() return value
const __opApplyResult = ints.opApply((ref int item) {
  if (item == 0) {
    __result = item;  // set return value for parent function
    return 2;  // return => abort the loop and exit from parent function
  }
  if (item == 1)
    return 0;  // continue => abort this iteration
  if (item == 2)
    return 1;  // break => abort this and all future iterations
  printf("%d\n", item);
  return 0;    // continue with next iteration
});
switch (__opApplyResult) {
  default:
    break;
  case 2:
    return __result;
}
return __result = 0;
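
The same shape can be written out by hand as ordinary D to experiment with it. This is only an approximation of the lowering: 'firstOver', 'firstOverByHand' and the local 'result' (standing in for the hidden __result) are illustrative names.

```d
import std.stdio;

struct Array(T) {
    T[] array;
    int opApply(scope int delegate(ref T) dg) {
        foreach (ref i; array) {
            const result = dg(i);
            if (result)
                return result;
        }
        return 0;
    }
}

// The loop as normally written, with a return inside the body.
int firstOver(Array!int ints, int limit) {
    foreach (item; ints) {
        if (item > limit)
            return item;
    }
    return -1;
}

// Hand-written approximation of the lowering.
int firstOverByHand(Array!int ints, int limit) {
    int result;  // stands in for the compiler's hidden return-value variable
    const code = ints.opApply((ref int item) {
        if (item > limit) {
            result = item;  // stash the value to return from the parent
            return 2;       // ask the parent function to return
        }
        return 0;           // next iteration
    });
    switch (code) {
        case 2:
            return result;
        default:
            break;
    }
    return -1;
}

void main() {
    auto ints = Array!int([4, 6, 8, 10, 12]);
    writeln(firstOver(ints, 7));         // prints 8
    writeln(firstOverByHand(ints, 7));   // prints 8
    writeln(firstOverByHand(ints, 100)); // prints -1
}
```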

A third question: will this call to a delegate introduce more hidden allocations, I wonder?

As with any regular lambda, captured outer variables (like the __result in the 2nd example) will require a closure, but as long as opApply takes the delegate as scope, the closure lives on the stack, so no harm.
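
One way to check this is to put the loop in a @nogc function and let the compiler reject any GC closure. A sketch under that assumption; 'Numbers' and 'sum' are made-up names, and the explicit attributes are there only so the @nogc check applies.

```d
import std.stdio;

struct Numbers {
    int[] data;

    // 'scope' promises the delegate is not stored beyond this call,
    // so the caller's captured variables can stay on the stack.
    int opApply(scope int delegate(ref int) @nogc nothrow dg) @nogc nothrow {
        foreach (ref e; data)
            if (auto r = dg(e)) return r;
        return 0;
    }
}

// The loop body captures 'total', yet this function is @nogc.
int sum(Numbers n) @nogc nothrow {
    int total;
    foreach (item; n)
        total += item;
    return total;
}

void main() {
    writeln(sum(Numbers([4, 6, 8, 10, 12]))); // prints 40
}
```

Dropping 'scope' from the parameter should make the compiler complain that 'sum' would need a GC-allocated closure.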

4 days ago

Re: opApply Magic Function Body Transformation

Posted by Mike Shah
in reply to kinke

On Monday, 28 July 2025 at 12:25:39 UTC, kinke wrote:

On Monday, 28 July 2025 at 04:39:00 UTC, Mike Shah wrote:

[...]

I think that's where the confusion comes from, that misleading -vcg-ast output for the loop-body-lambda, apparently printed as … => 0 regardless of the actual loop body.

[...]

Brilliant -- this I can work with. Is there somewhere already in LDC2 where I can dump out the generated transformation (Otherwise I can probably read the IR well enough)?

Thank you very much for helping break this down!

4 days ago

Re: opApply Magic Function Body Transformation

Posted by kinke
in reply to Mike Shah

On Monday, 28 July 2025 at 14:28:57 UTC, Mike Shah wrote:

Is there somewhere already in LDC2 where I can dump out the generated transformation (Otherwise I can probably read the IR well enough)?

Yeah, I'm afraid the IR is probably the best source. LDC's -vv verbose codegen output would show the actual AST, but as a tree with a lot of information that isn't needed here.

4 days ago

Re: opApply Magic Function Body Transformation

Posted by Mike Shah
in reply to kinke

On Monday, 28 July 2025 at 18:57:27 UTC, kinke wrote:

On Monday, 28 July 2025 at 14:28:57 UTC, Mike Shah wrote:

Is there somewhere already in LDC2 where I can dump out the generated transformation (Otherwise I can probably read the IR well enough)?

Yeah, I'm afraid the IR is probably the best source. LDC's -vv verbose codegen output would show the actual AST, but as a tree with a lot of information that isn't needed here.

Ah, I hadn't used -vv. Between that and the IR that should work well enough. Thank you again!

1 day ago

Re: opApply Magic Function Body Transformation

Posted by Mike Shah
in reply to Mike Shah

On Monday, 28 July 2025 at 19:28:54 UTC, Mike Shah wrote:

Ah, I hadn't used -vv. Between that and the IR that should work well enough. Thank you again!

For anyone who searches about opApply in the future, here's a video of the result of the discussion (will be released Aug. 5, 2025): https://youtu.be/6d6C-NdDWGk

1 day ago

Re: opApply Magic Function Body Transformation

Posted by Nick Treleaven
in reply to kinke

On Monday, 28 July 2025 at 12:25:39 UTC, kinke wrote:

then the rewrite becomes a bit more complex:

int __result;  // magic variable inserted by the compiler, for the main() return value
const __opApplyResult = ints.opApply((ref int item) {
  if (item == 0) {
    __result = item;  // set return value for parent function
    return 2;  // return => abort the loop and exit from parent function
  }
  if (item == 1)
    return 0;  // continue => abort this iteration
  if (item == 2)
    return 1;  // break => abort this and all future iterations
  printf("%d\n", item);
  return 0;    // continue with next iteration
});
switch (__opApplyResult) {
  default:
    break;
  case 2:
    return __result;
}
return __result = 0;

Interesting, thanks. How does it work if there's a goto in the foreach body with a label outside the foreach?
