January 10, 2012
> I have porting the runtime/phobos asms to gcc asm on my to-do list, and will try to get to that within two weeks. What would be the preferred way: version() guards? If yes, which version? Or would you prefer replacing the asms, if the changes are not going to be merged upstream anyway?

So I decided to start on this today. I have a case where turning on logging increases a program's run time from seconds to hours, with most of that time spent in the GC, so I thought gcbits would be a good place to start.

But after adding gdc asm support to GCBits.testClear(), the only thing that
changed was this:

XXXXXXXX <uint gc.gcbits.GCBits.testClear(uint)>:
                push   %eRX
                mov    %eRX,%eRX
                mov    XX(%eRX),%eRX
-               push   %eRX
-               mov    %eRX,%eRX
-               shr    $0x5,%eRX
-               lea    XX(,%eRX,4),%eRX
-               mov    XX(%eRX),%eRX
-               add    (%eRX),%eRX
-               mov    $0x1,%eRX
-               shl    %Rl,%eRX
-               mov    %eRX,%eRX
                mov    (%eRX),%eRX
-               not    %eRX
-               and    %eRX,%eRX
-               and    %eRX,%eRX
-               mov    %eRX,(%eRX)
-               pop    %eRX
+               mov    XX(%eRX),%eRX
+               btr    %eRX,XX(%eRX)
+               sbb    %eRX,%eRX
                pop    %eRX
                ret
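
For readers who don't have gcbits.d open, here is a rough sketch of the two shapes of testClear() that the listing above seems to correspond to: the removed lines look like the shift/mask read-modify-write, the added lines like the btr/sbb pair. It is written in C with GCC extended asm rather than D, because the exact gdc asm syntax isn't shown in this post; the names test_clear_portable and test_clear_asm are made up for illustration, and the bit index is masked to 0..31 here (an assumption of this sketch, so that the "+m" operand covers exactly the word being modified).

```c
#include <stdint.h>
#include <stddef.h>

/* Portable form: locate the 32-bit word, build a mask, test and clear.
   Returns the mask (non-zero) if the bit was set, 0 otherwise. */
static uint32_t test_clear_portable(uint32_t *bits, size_t i)
{
    uint32_t *p = &bits[i >> 5];              /* word holding bit i   */
    uint32_t mask = (uint32_t)1 << (i & 31);  /* bit inside that word */
    uint32_t result = *p & mask;
    *p &= ~mask;
    return result;
}

/* x86 form: btr copies the bit into the carry flag and clears it,
   sbb r,r then turns the carry into 0 or 0xFFFFFFFF.
   Falls back to the portable form on other targets. */
static uint32_t test_clear_asm(uint32_t *bits, size_t i)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    uint32_t result;
    uint32_t *p = &bits[i >> 5];
    __asm__("btrl %2, %1\n\t"
            "sbbl %0, %0"
            : "=r"(result), "+m"(*p)
            : "Ir"((uint32_t)(i & 31))
            : "cc");
    return result;
#else
    return test_clear_portable(bits, i);
#endif
}
```

The two forms return different non-zero values for a set bit (the mask versus all ones), so callers should only test the result against zero.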

OK, the function shrank to roughly three instructions, which is good, but why didn't it then get inlined into any of its callers? Trying to force inlining with an attribute turned up this:

../../../libphobos/gc/gcx.d: In member function 'gc.gcx.Gcx.fullcollect':
BUILD32/gdc/dev/gcc-4.6.1/libphobos/gc/gcbits.d:119:0: sorry, unimplemented: inlining failed in call to 'testClear': function body not available
../../../libphobos/gc/gcx.d:2647:0: sorry, unimplemented: called from here
BUILD32/gdc/dev/gcc-4.6.1/libphobos/gc/gcbits.d:119:0: sorry, unimplemented: inlining failed in call to 'testClear': function body not available
../../../libphobos/gc/gcx.d:2729:0: sorry, unimplemented: called from here
make[3]: *** [gc/gcx.o] Error 1

Is there any way to make this work? Much of the gain from the asm will be lost if the code isn't inlined.
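
For comparison, plain C with GCC gives the same "function body not available" diagnostic when an always_inline function is only declared in the translation unit being compiled and its definition lives in another one. Whether gdc fails here for the same reason is only a guess on my part. The sketch below shows the arrangement that does inline in C: the body is defined before its caller in the same translation unit. test_clear and clear_range are hypothetical stand-ins for GCBits.testClear and a caller such as the loops in fullcollect, not the real code.

```c
#include <stdint.h>
#include <stddef.h>

/* Defined (not just declared) ahead of its callers, so the compiler
   has the body available when it is asked to inline it. */
static inline __attribute__((always_inline))
uint32_t test_clear(uint32_t *bits, size_t i)
{
    uint32_t *p = &bits[i >> 5];
    uint32_t mask = (uint32_t)1 << (i & 31);
    uint32_t result = *p & mask;
    *p &= ~mask;
    return result;
}

/* Hypothetical caller: clears the first nbits bits and counts how
   many of them were set. test_clear is expanded inside the loop. */
static size_t clear_range(uint32_t *bits, size_t nbits)
{
    size_t hits = 0;
    for (size_t i = 0; i < nbits; i++) {
        if (test_clear(bits, i))
            hits++;
    }
    return hits;
}
```

If the definition of test_clear were moved to a separate source file and only a prototype left here, GCC would have nothing to expand at the call site, which is the situation the error text describes.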

artur