GDC optimizations bug ?
November 09, 2014
Hi,

I'm trying to compile a program that uses inline asm with optimizations enabled, and the optimizer throws out my inline asm functions even though I declared them as having side effects (by specifying a memory clobber).
I wrote the following toy program to demonstrate the problem:

----------------------------------------------

import std.stdio;
import gcc.attribute;

@attribute("forceinline") ulong exchangeAndAdd(shared ulong *counter, ulong addition) {
      ulong retVal = void; // we don't want it initialized when dmd is used
      asm {
        "
          mov  %2, %0             ;
          lock                    ;
          xadd %0, (%1)           ;
        ":
        "=&r"(retVal) :
        "r"(counter), "r"(addition) :
        "memory";
      }
      return retVal;
}

ulong func1() {
  shared ulong a = 9;
  exchangeAndAdd(&a, 3);
  return a;
}

void main()
{
  ulong b;
  b = func1();
  writeln(b);
}

----------------------------------------------

Compiling it with and without optimizations gives different output:
> /opt/gdc/bin/gdc ./test.d -o/tmp/a.out
> /tmp/a.out
12
> /opt/gdc/bin/gdc -O3 -g ./test.d -o/tmp/a.out
> /tmp/a.out
9

Using gdb, we can see that the call to exchangeAndAdd() is thrown out in the optimized version:

(gdb) disas _D4test5func1FZm
Dump of assembler code for function _D4test5func1FZm:
   0x0000000000406460 <+0>:	movq   $0x9,-0x8(%rsp)
   0x0000000000406469 <+9>:	mov    -0x8(%rsp),%rax
   0x000000000040646e <+14>:	retq
End of assembler dump.


Is this a compiler bug, or am I doing something wrong?
By the way, I'm using gdc 4.9.0.

Thanks in advance,
Maor
November 09, 2014
On 9 Nov 2014 08:40, "Maor via D.gnu" <d.gnu@puremagic.com> wrote:
>
> Hi,
>
> I'm trying to compile a program using inline asm with optimizations and I
> got my inline asm functions thrown out by the optimizer although I declared
> them as having side effects (by stating a memory clobber).
> I wrote the following toy program to demonstrate the problem:
>
> ----------------------------------------------
>
> import std.stdio;
> import gcc.attribute;
>
> @attribute("forceinline") ulong exchangeAndAdd(shared ulong *counter, ulong addition) {
>       ulong retVal = void; // we don't want it initialized when dmd is used
>       asm {
>         "
>           mov  %2, %0             ;
>           lock                    ;
>           xadd %0, (%1)           ;
>         ":
>         "=&r"(retVal) :
>         "r"(counter), "r"(addition) :

Maybe try:  "=m"(*counter)

The bug is likely in your input/output operands and clobbers; gcc will always optimise against you unless you get the inputs, outputs, and clobbers precisely correct.

Iain.


November 09, 2014
On 9 November 2014 08:54, Iain Buclaw <ibuclaw@gdcproject.org> wrote:
>
> Maybe try:  "=m"(*counter)
>
> The bug is likely in your input/output operands and clobbers; gcc will always optimise against you unless you get the inputs, outputs, and clobbers precisely correct.

Yep, it looks like using (%1) -> "r"(counter) only passes the pointer value in a
register, so gcc never learns that the memory it points to is modified.  The
optimiser sees that memory is clobbered, but none of the operands have memory
side effects.

Telling gcc that this is a memory operand fixes it - %1 -> "m"(*counter).  But I think that having counter as an input operand is wrong, as it *does* in fact have a new value written to it.  You can also omit the 'mov' instruction by telling gcc the "0" register should be loaded with the input "addition".

Your fixed (and simplified) assembler statement now becomes:


@attribute("forceinline") ulong exchangeAndAdd(shared ulong *counter,
ulong addition) {
      ulong retVal = void; // we don't want it initialized when dmd is used
      asm {
          "
          lock                    ;
          xadd %0, %1             ;
          " :
          "=r"(retVal), "=m"(*counter) :
          "0"(addition) :
          "memory";
      }
      return retVal;
}
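
A possible refinement of the statement above, offered as a sketch rather than as the fix from this thread: since `lock xadd` both reads and writes its memory operand, the counter can be declared as a read-write operand with GCC's "+m" constraint instead of the write-only "=m". The wrapper and func1 are taken from the thread; the "+m" constraint and the one-line template are assumptions of this sketch, and it relies on the same GDC-specific gcc.attribute module and x86-64 target as the original program.

```d
import gcc.attribute;

// Sketch only: same xadd wrapper as in the thread, but *counter is declared
// read-write ("+m") because `lock xadd` reads and then updates that memory.
@attribute("forceinline") ulong exchangeAndAdd(shared ulong* counter, ulong addition) {
    ulong retVal = void; // filled in by the asm statement below
    asm {
        "lock; xadd %0, %1"
        : "=r"(retVal), "+m"(*counter) // %0: old value out, %1: counter in/out
        : "0"(addition)                // addition is loaded into the register used for %0
        : "memory";
    }
    return retVal; // value of *counter before the addition
}

ulong func1() {
    shared ulong a = 9;
    exchangeAndAdd(&a, 3); // a becomes 12; the returned old value (9) is ignored
    return a;
}
```

With this variant, func1() is expected to return 12 both with and without -O3, because gcc now sees that the asm statement reads and updates a.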
November 09, 2014
On Sunday, 9 November 2014 at 11:50:24 UTC, Iain Buclaw via D.gnu wrote:
> Yep, it looks like using (%1) -> "r"(counter) only passes the pointer value
> in a register, so gcc never learns that the memory it points to is modified.
> The optimiser sees that memory is clobbered, but none of the operands have
> memory side effects.
>
> Telling gcc that this is a memory operand fixes it - %1 ->
> "m"(*counter).  But I think that having counter as an input operand is
> wrong, as it *does* in fact have a new value written to it.  You can also
> omit the 'mov' instruction by telling gcc the "0" register should be
> loaded with the input "addition".
>
> Your fixed (and simplified) assembler statement now becomes:
>
>
> @attribute("forceinline") ulong exchangeAndAdd(shared ulong *counter,
> ulong addition) {
>       ulong retVal = void; // we don't want it initialized when dmd is used
>       asm {
>           "
>           lock                    ;
>           xadd %0, %1             ;
>           " :
>           "=r"(retVal), "=m"(*counter) :
>           "0"(addition) :
>           "memory";
>       }
>       return retVal;
> }

Hi,

Thanks for the tip!
Indeed, it solves the problem (which uncovered another one, but one which deserves a different subject :)

Cheers,
Maor