June 15, 2011
SSE asm with functions
In the attached file xmm.d I have a function xnormal that takes a vector (alias float[4]) and computes the
unit vector. The SSE code seems to work fine, but the function keeps returning [nan, nan, nan, nan] and the
writeln prints the same. But if I change the return from r (output vector) to v (input vector), it prints
the correct normal vector and returns the input vector. Is this my bug or a compiler bug?
DMD32 v2.053  OS X

const(vector) xnormal( ref const(vector) v )
{
   vector r;
   asm
   {
       mov EAX, v;
       movups XMM0, [EAX]; //load vector
       movaps XMM2, XMM0; // copy original data

       // find x^2 + y^2 + z^2 + w^2
       mulps XMM0, XMM0; // xx, yy, zz, ww
       movaps XMM1, XMM0; // copy, cause we will write into X0
       shufps XMM0, XMM1, 0x4e; // 0100 1110 zwxy
       addps XMM0, XMM1; // xyzw + zwxy

       movaps XMM1, XMM0; // copy, cause we will write into X0
       shufps XMM0, XMM1, 0x11; // 0001 0001 (y+w)(x+z)(y+w)(x+z)
       addps XMM0, XMM1; // (x+z)(y+w)(z+x)(w+y) + (y+w)(x+z)(y+w)(x+z)
                         // (x+z+y+w)(y+w+x+z)(z+x+y+w)(w+y+x+z)

       rsqrtps XMM0, XMM0; // 1/sqrt(XMM0)
       mulps XMM2, XMM0; // x/sqrt(x^2+y^2+z^2+w^2) , ...
       movups r, XMM2;
   }
   writeln( "Result: ", r, "\t", v );
   return r;
}

I would like to use D for a thesis project, but won't be able to if it's still this buggy.

-Byron
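
To check the shuffle comments in the asm above independently of the compiler issue, here is a minimal sketch in plain D with no inline asm. The helpers shufps, hsum and normalRef are hypothetical names made up for this illustration; shufps only emulates the lane selection of the SSE instruction (two low lanes taken from the destination, two high lanes from the source). Note that RSQRTPS is only an approximation, so the asm result is expected to match this scalar reference to roughly three or four decimal digits, not bit for bit.

```d
import std.math : sqrt, fabs;
import std.stdio : writeln;

alias float[4] vector;

// Emulates the lane selection of SHUFPS dst, src, imm8 (illustrative helper):
// result lanes 0-1 are picked from dst, lanes 2-3 from src, 2 selector bits each.
vector shufps(const vector dst, const vector src, ubyte imm)
{
    vector r;
    r[0] = dst[imm & 3];
    r[1] = dst[(imm >> 2) & 3];
    r[2] = src[(imm >> 4) & 3];
    r[3] = src[(imm >> 6) & 3];
    return r;
}

// Horizontal sum using the same two shuffle/add steps as the asm in the post.
vector hsum(const vector a)
{
    vector t = shufps(a, a, 0x4e); // lanes reordered to z, w, x, y
    vector s;
    s[] = a[] + t[];               // (x+z, y+w, z+x, w+y)
    vector u = shufps(s, s, 0x11); // (y+w, x+z, y+w, x+z)
    vector r;
    r[] = s[] + u[];               // x+y+z+w in every lane
    return r;
}

// Scalar reference: v / sqrt(x^2 + y^2 + z^2 + w^2), using an exact sqrt.
vector normalRef(const vector v)
{
    float len = sqrt(v[0] * v[0] + v[1] * v[1] + v[2] * v[2] + v[3] * v[3]);
    vector r;
    r[] = v[] / len;
    return r;
}

void main()
{
    vector v = [1, 2, 3, 4];
    vector sq;
    sq[] = v[] * v[];              // xx, yy, zz, ww

    vector sum = hsum(sq);
    foreach (lane; sum)
        assert(lane == 30);        // 1 + 4 + 9 + 16, exact in float

    vector n = normalRef(v);
    float lenSq = n[0] * n[0] + n[1] * n[1] + n[2] * n[2] + n[3] * n[3];
    assert(fabs(lenSq - 1.0f) < 1e-5f); // unit length up to rounding
    writeln("reference normal: ", n);
}
```

The asserts confirm that the 0x4e and 0x11 immediates do produce the full sum in every lane, so the arithmetic in the asm is not the source of the NaNs.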
June 15, 2011
Re: SSE asm with functions
Byron:

> Is this my bug or a compiler bug?

DMD doesn't optimize or rewrite inline asm code; it emits it as written. My suggestion is to keep reducing your code until you understand what's going on. A disassembler also helps here.

Bye,
bearophile
June 15, 2011
Re: SSE asm with functions
Byron wrote:
> In the attached file xmm.d I have a function xnormal that takes a vector (alias float[4]) and computes the
> unit vector. The SSE code seems to work fine, but the function keeps returning [nan, nan, nan, nan] and the
> writeln prints the same. But if I change the return from r (output vector) to v (input vector), it prints
> the correct normal vector and returns the input vector. Is this my bug or a compiler bug?
> DMD32 v2.053  OS X
>
> const(vector) xnormal( ref const(vector) v )
> {
>     vector r;
>     asm
>     {
>         mov EAX, v;
>         movups XMM0, [EAX]; //load vector
>         movaps XMM2, XMM0; // copy original data
>
>         // find x^2 + y^2 + z^2 + w^2
>         mulps XMM0, XMM0; // xx, yy, zz, ww
>         movaps XMM1, XMM0; // copy, cause we will write into X0
>         shufps XMM0, XMM1, 0x4e; // 0100 1110 zwxy
>         addps XMM0, XMM1; // xyzw + zwxy
>
>         movaps XMM1, XMM0; // copy, cause we will write into X0
>         shufps XMM0, XMM1, 0x11; // 0001 0001 (y+w)(x+z)(y+w)(x+z)
>         addps XMM0, XMM1; // (x+z)(y+w)(z+x)(w+y) + (y+w)(x+z)(y+w)(x+z)
>                           // (x+z+y+w)(y+w+x+z)(z+x+y+w)(w+y+x+z)
>
>         rsqrtps XMM0, XMM0; // 1/sqrt(XMM0)
>         mulps XMM2, XMM0; // x/sqrt(x^2+y^2+z^2+w^2) , ...
>         movups r, XMM2;
>     }
>     writeln( "Result: ", r, "\t", v );
>     return r;
> }
>
> I would like to use D for a thesis project, but won't be able to if it's still this buggy.
>
> -Byron
> << xmm.d >>

It seems to be a backend bug in DMD, as the same code works fine with GDC
(front-end version 2.052 though, so this might need some further investigation).

Timon
June 16, 2011
Re: SSE asm with functions
I reduced the complexity of the problem; it seems to be SSE and returning local copies.

$ dmd -run db.d
v: [1, 2, 3, 4]
test1 r: [nan, nan, nan, nan]
test1: [nan, nan, nan, nan]
test2 r: [1, 2, 3, 4]
test2: [1, 2, 3, 4]

//db.d
import std.stdio;

alias float[4] vector;

const(vector) test1( ref const(vector) v )
{
   vector r;
   asm
   {
       mov EAX, v;
       movups XMM0, [EAX];
       movups r, XMM0;
   }
   writeln( "test1 r: ", r );
   return r;
}

const(vector) test2( ref const(vector) v )
{
   vector r, s;
   asm
   {
       mov EAX, v;
       movups XMM0, [EAX];
       movups r, XMM0;
   }
   writeln( "test2 r: ", r );
   s = r;
   return s;
}

void main()
{
   vector v = [1,2,3,4];
   writeln( "v: ", v );
   writeln( "test1: ", test1(v));
   writeln( "test2: ", test2(v));
}


-Byron
June 16, 2011
Re: SSE asm with functions
Same problem with 64-bit DMD on Ubuntu (after changing EAX to RAX).
June 16, 2011
Re: SSE asm with functions
On 6/16/2011 10:20 AM, Byron wrote:
> I reduced the complexity of the problem; it seems to be SSE and returning local copies.

What you've run into is the "named return value" optimization. 'r' is rewritten
by the compiler as a reference to a vector in the caller's stack frame, which
avoids unnecessary copying when returning r.

The trouble is, the inline assembler does no such rewrites, so a store such as
'movups r, XMM0' does not end up in the vector that is actually returned.

I'll file this in bugzilla. In the meantime, you've already discovered the
workaround in test2().
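
Besides copying into a second local as test2() does, another possible workaround is to avoid naming the returned variable inside the asm block at all and to store through an explicit pointer taken in ordinary D code. The sketch below is an assumption, not something confirmed in this thread or tested against DMD 2.053: copyViaPointer and dst are hypothetical names, and the approach relies on the compiler resolving r.ptr correctly outside the asm block, so that the asm only ever sees a plain local pointer.

```d
import std.stdio : writeln;

alias float[4] vector;

// Copies v into the result with SSE, storing through a plain local pointer
// instead of referring to the returned variable r inside the asm block.
const(vector) copyViaPointer(ref const(vector) v)
{
    vector r;
    float* dst = r.ptr; // taken in normal D code, outside the asm block

    version (D_InlineAsm_X86)
    {
        asm
        {
            mov EAX, v;         // v is a ref parameter: loads its address
            mov ECX, dst;       // dst is an ordinary local pointer
            movups XMM0, [EAX]; // unaligned load of the input
            movups [ECX], XMM0; // unaligned store through the pointer
        }
    }
    else version (D_InlineAsm_X86_64)
    {
        asm
        {
            mov RAX, v;
            mov RCX, dst;
            movups XMM0, [RAX];
            movups [RCX], XMM0;
        }
    }
    else
    {
        r[] = v[]; // no DMD-style inline asm available: plain copy
    }
    return r;
}

void main()
{
    vector v = [1, 2, 3, 4];
    vector got = copyViaPointer(v);
    assert(got == v);
    writeln("copyViaPointer: ", got);
}
```

The design choice here is to keep every access to r in compiler-generated code, where the return value rewrite is applied, and to give the asm only register and pointer operands.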
June 16, 2011
Re: SSE asm with functions
http://d.puremagic.com/issues/show_bug.cgi?id=6166