opApply Magic Function Body Transformation
Mike Shah
5 days ago

I'm preparing a video to teach opApply, and I'm still unsure how opApply works with regard to the compiler transformation. It's one of those features I understand how to use, but I'd like a deeper understanding before I teach it -- admittedly, to really know what I am doing. Below is something I quickly wrote using opApply as an example, followed by a few concrete questions if you want to skip ahead.

import std.stdio;

struct Array(T){
  T[] array;
  int opApply(int delegate(ref T) dg){
    int result;
    for(int i=0; i < array.length; i++){
        result = dg(array[i]);
        if(result){
          break;
        }
    }
    return result;
  }
}

void main(){
  Array!int ints;
  ints.array = [4,6,8,10,12];

  foreach(item; ints){
    writeln(item);
  }

}

The purpose of opApply I am clear on -- it's the member function that 'foreach' loops call for iteration (with opApplyReverse as the counterpart for 'foreach_reverse'). It takes priority over range member functions if both are defined, and it may have some performance trade-offs (or at the least, it's slightly easier to template one member function versus 3 for an inputRange to avoid virtual calls -- but that's an aside that needs testing). Okay -- but now onto the part where I need some more understanding -- the delegate and the transformation.
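As a small check of the priority claim above, here is a minimal sketch. The type `Both` and the function `collect` are hypothetical names made up for this illustration. `Both` defines the range primitives and opApply at the same time, and opApply deliberately yields negated values so that the result shows which path the loop took.

```d
import std.stdio;

// Hypothetical type that defines BOTH range primitives and opApply.
struct Both {
    int[] data;

    // Input-range primitives (would drive foreach if opApply were absent).
    bool empty() const { return data.length == 0; }
    int front() const { return data[0]; }
    void popFront() { data = data[1 .. $]; }

    // opApply yields each element negated, so its use is observable.
    int opApply(scope int delegate(int) dg) {
        foreach (x; data) {
            if (auto r = dg(-x))
                return r;
        }
        return 0;
    }
}

int[] collect() {
    int[] seen;
    foreach (x; Both([1, 2, 3]))
        seen ~= x;
    return seen;
}

void main() {
    // Negated values mean the loop went through opApply, not the range API.
    assert(collect() == [-1, -2, -3]);
    writeln(collect());
}
```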

In my understanding/teaching of opApply, I would break opApply into two main concepts:
1.) The operator overload 'opApply' itself -- and the requirement that opApply's signature always returns an int.
2.) The single parameter to opApply must be a delegate. The parameters of that delegate match the 'foreach' loop variables. This part also has more to do with the magic transform that I am not 100% clear on.
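To make point 2 concrete, here is a sketch (an illustration, not taken from the original post) of an `Array` with two opApply overloads. The assumption being shown is that the number of 'foreach' variables selects the overload whose delegate has the same number of parameters.

```d
struct Array(T) {
    T[] array;

    // Matches: foreach (item; a) and foreach (ref item; a)
    int opApply(scope int delegate(ref T) dg) {
        foreach (ref e; array)
            if (auto r = dg(e)) return r;
        return 0;
    }

    // Matches: foreach (i, item; a)
    int opApply(scope int delegate(size_t, ref T) dg) {
        foreach (i, ref e; array)
            if (auto r = dg(i, e)) return r;
        return 0;
    }
}

void main() {
    auto a = Array!int([4, 6, 8]);

    // One loop variable -> first overload; 'ref' writes through to the array.
    foreach (ref item; a)
        item *= 2;
    assert(a.array == [8, 12, 16]);

    // Two loop variables -> second overload.
    size_t indexSum;
    foreach (i, item; a)
        indexSum += i;
    assert(indexSum == 3); // 0 + 1 + 2
}
```

The 'int' return type is the same in both overloads; only the delegate's parameter list changes.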

The second part (the delegate parameter) is what I'm interested in being able to visualize. So in the above code, there are roughly two steps going on:

First, the transformation of a foreach loop being lowered to a regular 'for' loop.

  foreach(item; ints){
    writeln(item);
  }

// is re-written to something-like what is below.
// But we don't really know if it's 'ints.array.length' or some other field.
// Thus we rely on our overload (which we can have multiple) of opApply
// to sort of figure this out..
  {
    int i=0;
    for(; i != ints.array.length; i++){
      writeln( /* item */ ); // 'item' represents whatever the delegate parameter
                             // is i.e. 'ref T' above.
                             // The 'T' is also the item I am iterating on, and
                             // performing some computation on.
    }
  }

The next transformation I can kind of see if I compile with 'dmd -vcg-ast main.d'. I can see the delegate and some magic '__applyArg0'.

  23 void main()
  24 {
  25   Array!int ints = 0;
  26   ints.array = [4, 6, 8, 10, 12];
  27   cast(void)ints.opApply(delegate int(ref int __applyArg0) @safe => 0);
  28   return 0;
  29 }

So really my one concrete question is -- can I see main.main()__foreachbody_L21_C3(ref int) anywhere? This appears to be either a function or label that is the body of my loop. As you'll notice, on line 27, there is no longer a 'foreach' loop anymore. But I don't seem to be able to see the transformation, or otherwise find the symbols anywhere. Perhaps there is another magic compiler flag I am missing?

My second concrete question is: when learning opApply, is it useful to think of the body of the 'foreach' loop as being copied and pasted into opApply? Or is it better to just look at cast(void)ints.opApply(delegate int(ref int __applyArg0) @safe => 0); and understand that your code has been magically transformed? Some of my intuition is in the comments below.

// Somewhere in main
27   cast(void)ints.opApply(delegate int(ref int __applyArg0) @safe => 0);


// "Sort of" what is going on.
// At least to provide some mental model
// for the Array.opApply implementation.
    int opApply(int delegate(ref int) dg)
    {
      int result;
      for(size_t i=0; i < this.array.length; i++){
         result = dg(this.array[i]);  // assign, don't redeclare 'result'
                       // {
                       // 'dg' represents the original 'foreach' loop
                       //  effectively copied here.
                       // -- but it's not a 'copy and paste of code here',
                       // instead we have a call to a magic delegate
                       // that exists somewhere from compiler.

                       // I can *think* of the delegate like pasting
                       // in the body of original foreach loop, but only one
                       // item at a time -- considering the single 'index' (elem)
                       // from our loop.
                       // writeln(elem); // same work, but this work comes from
                       // body of original 'foreach' found in 'main()' now wrapped
                       // in a delegate function.
                       //};

          if(result){  // Returns '0' from magic delegate at some point?
            break;
          }
      }
      return result;
    }

A third question: will this call to a delegate cause more hidden allocations, I wonder?


Some more investigation

The disassembly (I built with gdc-14 and then disassembled with 'objdump -d main_binary') seems to correspond to something like the code below. In C parlance, we have a function pointer that refers to wherever the 'foreach body' from main ends up being stored. It also appears (both from the disassembly and from -vcg-ast) that the return value of '1' or '0' is not important?

void opApply(int* array, int size, void (*func)(int)) {
    int index = 0;
    int result = 0;

    while (index < size) {
        if (index < size) {
            func(array[index]);
            result = 1;
        }
        index++;
    }
}

Sorry if my questions are not clear, any guidance or pointers to examples are helpful!

5 days ago

On Monday, 28 July 2025 at 04:39:00 UTC, Mike Shah wrote:

> So really my one concrete question is -- can I see main.main()__foreachbody_L21_C3(ref int) anywhere?

I think that's where the confusion comes from, that misleading -vcg-ast output for the loop-body-lambda, apparently printed as … => 0 regardless of the actual loop body.

Based on your example:

import core.stdc.stdio;

struct Array(T) {
  T[] array;
  int opApply(scope int delegate(ref T) dg){
    foreach (ref i; array){
      const result = dg(i);
      if (result)
        return result;
    }
    return 0;
  }
}

void main() {
  Array!int ints;
  ints.array = [4,6,8,10,12];

  foreach (item; ints) {
    if (item == 1)
      continue;
    if (item == 2)
      break;
    printf("%d\n", item);
  }
}

The loop is actually rewritten by the compiler to:

ints.opApply((ref int item) {
  if (item == 1)
    return 0;  // continue => abort this iteration
  if (item == 2)
    return 1;  // break => abort this and all future iterations
  printf("%d\n", item);
  return 0;    // continue with next iteration
});

So the main thing here is that the body is promoted to a lambda, and the control-flow statements inside the body (break and continue in the example above) are transformed to specific return codes for the opApply delegate protocol.
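The return-code protocol described above can be checked by writing the same loop twice, once with 'foreach' and once as an explicit delegate passed to opApply. This is a sketch of the idea only, not the compiler's literal output. The helper names `viaForeach` and `viaDelegate` are made up for the illustration.

```d
import std.stdio;

struct Array(T) {
    T[] array;
    int opApply(scope int delegate(ref T) dg) {
        foreach (ref e; array) {
            const result = dg(e);
            if (result)
                return result; // non-zero: stop and hand the code back
        }
        return 0;
    }
}

// The loop as the user writes it.
int[] viaForeach(Array!int a) {
    int[] seen;
    foreach (item; a) {
        if (item == 6) continue;
        if (item == 10) break;
        seen ~= item;
    }
    return seen;
}

// The same loop with the body spelled out as a delegate by hand.
int[] viaDelegate(Array!int a) {
    int[] seen;
    const code = a.opApply((ref int item) {
        if (item == 6) return 0;   // continue: end this iteration only
        if (item == 10) return 1;  // break: non-zero stops opApply's loop
        seen ~= item;
        return 0;                  // proceed to the next element
    });
    assert(code == 1);             // the loop ended through the 'break' path
    return seen;
}

void main() {
    auto a = Array!int([4, 6, 8, 10, 12]);
    assert(viaForeach(a) == [4, 8]);
    assert(viaDelegate(a) == [4, 8]);
    writeln(viaDelegate(a));
}
```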

If we add a return statement to the body:

int main() {
  Array!int ints;
  ints.array = [4,6,8,10,12];

  foreach (item; ints) {
    if (item == 0)
      return item;
    if (item == 1)
      continue;
    if (item == 2)
      break;
    printf("%d\n", item);
  }
  return 0;
}

then the rewrite becomes a bit more complex:

int __result;  // magic variable inserted by the compiler, for the main() return value
const __opApplyResult = ints.opApply((ref int item) {
  if (item == 0) {
    __result = item;  // set return value for parent function
    return 2;  // return => abort the loop and exit from parent function
  }
  if (item == 1)
    return 0;  // continue => abort this iteration
  if (item == 2)
    return 1;  // break => abort this and all future iterations
  printf("%d\n", item);
  return 0;    // continue with next iteration
});
switch (__opApplyResult) {
  default:
    break;
  case 2:
    return __result;
}
return __result = 0;
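
For experimenting with the 'return' case, a hand-written version of the same shape can be compared against the plain 'foreach'. This is only a sketch: the function names are hypothetical, `result` stands in for the compiler's `__result`, and the code value 2 simply follows the rewrite shown above.

```d
struct Array(T) {
    T[] array;
    int opApply(scope int delegate(ref T) dg) {
        foreach (ref e; array) {
            const result = dg(e);
            if (result)
                return result;
        }
        return 0;
    }
}

// 'return' inside the foreach body exits the enclosing function.
int firstAbove(Array!int a, int limit) {
    foreach (item; a)
        if (item > limit)
            return item;
    return -1;
}

// Hand-written equivalent: stash the value, return a code, dispatch on it.
int firstAboveByHand(Array!int a, int limit) {
    int result; // plays the role of the compiler's __result
    const code = a.opApply((ref int item) {
        if (item > limit) {
            result = item; // value for the enclosing function to return
            return 2;      // "return from the parent function"
        }
        return 0;          // keep iterating
    });
    switch (code) {
        case 2:
            return result;
        default:
            break;
    }
    return -1;
}

void main() {
    auto a = Array!int([4, 6, 8, 10, 12]);
    assert(firstAbove(a, 7) == 8);
    assert(firstAboveByHand(a, 7) == 8);
    assert(firstAbove(a, 100) == -1);
    assert(firstAboveByHand(a, 100) == -1);
}
```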

> A third question Will this call to a delegate provide more hidden allocations I wonder?

As with any regular lambda, captured outer variables (like the __result in the 2nd example) will cause a closure, but as long as the opApply takes the delegate as scope, the closure will be on the stack, so no harm.
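
One way to let the compiler confirm the "no GC closure" point is to mark the calling function @nogc. The sketch below assumes the delegate parameter is additionally annotated '@nogc nothrow' so that opApply itself can carry those attributes. The names `Numbers` and `sum` are hypothetical.

```d
struct Numbers {
    int[] data;

    // 'scope' promises the delegate is not kept beyond this call, so the
    // caller's captured variables can stay on the stack.
    int opApply(scope int delegate(ref int) @nogc nothrow dg) @nogc nothrow {
        foreach (ref e; data)
            if (auto r = dg(e)) return r;
        return 0;
    }
}

// @nogc makes the compiler reject any GC allocation in this function,
// which would include a heap-allocated closure for the loop body.
int sum(Numbers n) @nogc nothrow {
    int total;
    foreach (item; n)
        total += item; // the loop body captures 'total' from this frame
    return total;
}

void main() {
    assert(sum(Numbers([4, 6, 8, 10, 12])) == 40);
}
```

Removing 'scope' from the parameter should make the @nogc function fail to compile because a closure would have to be allocated, which is a quick way to see the difference.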

4 days ago

On Monday, 28 July 2025 at 12:25:39 UTC, kinke wrote:

> On Monday, 28 July 2025 at 04:39:00 UTC, Mike Shah wrote:
>
>> [...]
>
> I think that's where the confusion comes from, that misleading -vcg-ast output for the loop-body-lambda, apparently printed as … => 0 regardless of the actual loop body.
>
> [...]

Brilliant -- this I can work with. Is there somewhere already in LDC2 where I can dump out the generated transformation (Otherwise I can probably read the IR well enough)?

Thank you very much for helping break this down!

4 days ago

On Monday, 28 July 2025 at 14:28:57 UTC, Mike Shah wrote:

> Is there somewhere already in LDC2 where I can dump out the generated transformation (Otherwise I can probably read the IR well enough)?

Yeah, I'm afraid the IR is probably the best source. LDC's -vv verbose codegen output would show the actual AST, but as a tree with a lot of information that isn't needed here.

4 days ago

On Monday, 28 July 2025 at 18:57:27 UTC, kinke wrote:

> On Monday, 28 July 2025 at 14:28:57 UTC, Mike Shah wrote:
>
>> Is there somewhere already in LDC2 where I can dump out the generated transformation (Otherwise I can probably read the IR well enough)?
>
> Yeah, I'm afraid the IR is probably the best source. LDC's -vv verbose codegen output would show the actual AST, but as a tree with a lot of information that isn't needed here.

Ah, I hadn't used -vv. Between that and the IR that should work well enough. Thank you again!

1 day ago

On Monday, 28 July 2025 at 19:28:54 UTC, Mike Shah wrote:

> On Monday, 28 July 2025 at 18:57:27 UTC, kinke wrote:
>
>> On Monday, 28 July 2025 at 14:28:57 UTC, Mike Shah wrote:
>>
>>> Is there somewhere already in LDC2 where I can dump out the generated transformation (Otherwise I can probably read the IR well enough)?
>>
>> Yeah, I'm afraid the IR is probably the best source. LDC's -vv verbose codegen output would show the actual AST, but as a tree with a lot of information that isn't needed here.
>
> Ah, I hadn't used -vv. Between that and the IR that should work well enough. Thank you again!

For anyone who searches about opApply in the future, here's a video of the result of the discussion (will be released Aug. 5, 2025): https://youtu.be/6d6C-NdDWGk

1 day ago

On Monday, 28 July 2025 at 12:25:39 UTC, kinke wrote:

> then the rewrite becomes a bit more complex:
>
> int __result;  // magic variable inserted by the compiler, for the main() return value
> const __opApplyResult = ints.opApply((ref int item) {
>   if (item == 0) {
>     __result = item;  // set return value for parent function
>     return 2;  // return => abort the loop and exit from parent function
>   }
>   if (item == 1)
>     return 0;  // continue => abort this iteration
>   if (item == 2)
>     return 1;  // break => abort this and all future iterations
>   printf("%d\n", item);
>   return 0;    // continue with next iteration
> });
> switch (__opApplyResult) {
>   default:
>     break;
>   case 2:
>     return __result;
> }
> return __result = 0;

Interesting, thanks. How does it work if there's a goto in the foreach body with a label outside the foreach?