May 25, 2016
I don't know if this is the correct forum to report this. It didn't seem to fit in Learn or General, and it's clearly dmd specific.

While evaluating a flyweight pattern built on const objects, I looked at the assembly code dmd generates for the 'is' operator, both with plain object references and with Rebindable references, which wrap the object reference in a struct.

The test code is as trivial as it can be while still preventing the optimizer from eliminating it:
import std.stdio;
import std.typecons;

class Test
{
    this(string s) { s_ = s; }
    private string s_;
    string toString() { return s_; }
}

void main(string[] args)
{
    const Test t1 = new Test(args[0]), t2 = new Test(args[1]);
    bool res1 = t1 is t2;    // (1)

    Rebindable!(const Test) rt1 = t1, rt2 = t2;  // (2)
    bool res2 = rt1 is rt2;

    Rebindable!(const Test) rt3 = Rebindable!(const Test)(t1),
        rt4 = Rebindable!(const Test)(t2);  // (3)
    bool res3 = rt3 is rt4;

    writeln(res1, res2, res3);
}

I compiled with dub build -b release. Looking at the assembly with objdump on Unix (AT&T/gas syntax, not Intel syntax), I get the following:

(1)  object reference comparison
  4394c6:       49 89 c4                mov    %rax,%r12
  4394c9:       4c 3b e3                cmp    %rbx,%r12
  4394cc:       0f 94 c0                sete   %al

It can't be made more efficient than that. Perfect.

(2) Rebindable (struct) comparison
  4394da:       48 8d 75 c8             lea    -0x38(%rbp),%rsi
  4394de:       48 8d 7d d0             lea    -0x30(%rbp),%rdi
  4394e2:       b9 08 00 00 00          mov    $0x8,%ecx
  4394e7:       33 c0                   xor    %eax,%eax
  4394e9:       f3 a6                   repz cmpsb %es:(%rdi),%ds:(%rsi)
  4394eb:       74 05                   je     4394f2 <_Dmain+0x82>
  4394ed:       1b c0                   sbb    %eax,%eax
  4394ef:       83 d8 ff                sbb    $0xffffffff,%eax
  4394f2:       f7 d8                   neg    %eax
  4394f4:       19 c0                   sbb    %eax,%eax
  4394f6:       ff c0                   inc    %eax

It compares the struct as an 8-byte value (the size of the object reference) with a byte-by-byte compare. After that it does some juggling with the result which I don't understand and which doesn't seem necessary.

(3) Rebindable (struct) comparison
  43950b:       48 8d 75 d8             lea    -0x28(%rbp),%rsi
  43950f:       48 8d 7d e0             lea    -0x20(%rbp),%rdi
  439513:       b9 08 00 00 00          mov    $0x8,%ecx
  439518:       33 c0                   xor    %eax,%eax
  43951a:       f3 a6                   repz cmpsb %es:(%rdi),%ds:(%rsi)
  43951c:       40 0f 94 c7             sete   %dil

Same as (2) but without any juggling with the boolean result.

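It looks like struct comparison is not optimized even when the struct's size is a power of two (8 bytes here, the same size as the object reference compared with a single cmp in (1)). Since structs are often used as wrappers for "smart values", there is room for improvement here.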

My conclusion is that when performance is important, avoid the use of structs for now.
The second conclusion is that Rebindable is currently not equivalent to a true mutable object reference in terms of performance. But this is only a compiler issue for the moment, which is why I am posting it in this forum.
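
A possible workaround, as a minimal sketch only: compare the wrapped references instead of the Rebindable structs themselves. This uses Rebindable's get property to obtain the underlying const reference, so that 'is' is applied to two object references as in case (1). The helper name sameObject is made up for this illustration, and I have not checked that dmd actually emits the single cmp for it; that part is an assumption.

```d
import std.stdio : writeln;
import std.typecons : Rebindable;

class Test
{
    this(string s) { s_ = s; }
    private string s_;
}

// Hypothetical helper (not part of Phobos): identity test on the
// wrapped references rather than on the Rebindable structs.
bool sameObject(Rebindable!(const Test) a, Rebindable!(const Test) b)
{
    // get returns the underlying const(Test) reference, so this is a
    // reference comparison and not a struct comparison.
    return a.get is b.get;
}

void main()
{
    const Test t1 = new Test("a"), t2 = new Test("b");
    Rebindable!(const Test) rt1 = t1, rt2 = t2, rt1b = t1;

    writeln(sameObject(rt1, rt2));   // different objects: false
    writeln(sameObject(rt1, rt1b));  // same object: true
}
```

If this does compile down to the same code as (1), the cost of Rebindable would be limited to the places where the struct itself is compared.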

As a side note, I must admit that the optimizer works well. Most of my early test programs were optimized away. ;)

dmd-internals mailing list