Thread overview
SSE, Inline assembler, Structs, ...
Apr 01, 2008
Audun Wilhelmsen
Apr 01, 2008
Sascha Katzner
Apr 03, 2008
Audun Wilhelmsen
Apr 04, 2008
Sascha Katzner
April 01, 2008
I want to use SSE to create a fast vector/matrix library (if anyone has done this already I'd like to know).

It seems that there's quite a bit of overhead with operator overloading, so I'd probably want to write some of the algorithms in my final app in assembly, but I'd still like to have optimized operators too. But I'm having some problems. I can't get this to work for instance:
align struct Vec4 {
  float x,y,z,w;
  ....
Vec4 opAdd(Vec4 v) {
  Vec4 res;
  asm {
    movaps XMM0, [this];
    addps XMM0, v[EBP];
    movaps res[EBP], XMM0;
  }
  return res;
}
}

If I add "Vec4* me = this;" and replace "this" with "me", it compiles, but it crashes.

Also, this confuses me:
	Vec4 v1 = Vec4(1,2,3,4);
//	Vec4* p = &v1;
	asm {
		movaps XMM1, v1[EBP];
	}

If I uncomment the second line, the program crashes.

April 01, 2008
Audun Wilhelmsen wrote:
> Also, this confuses me:
> 	Vec4 v1 = Vec4(1,2,3,4);
> //	Vec4* p = &v1;
> 	asm {
> 		movaps XMM1, v1[EBP];
> 	}
> 
> if I remove the comment, the program crashes.

It seems that this is a data alignment problem. IIRC the "a" in movAps stands for "aligned", which means the instruction expects its memory operand to be 16-byte aligned. You can use movups (unaligned) instead, but that is not as fast. Or you can align your data.
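
To make the alignment requirement concrete, here is a minimal sketch of a check that can be run before touching a value with movaps. It assumes a current D2 compiler with Phobos, and isAligned16 is a hypothetical helper name, not something from the original code. Whether the stack variable turns out to be aligned depends on the compiler and the frame layout, which would explain why adding an unrelated local changes the behaviour.

```d
import std.stdio;

struct Vec4
{
    float x, y, z, w;
}

/// Hypothetical helper: true when p lies on a 16-byte boundary,
/// which is what MOVAPS requires of a memory operand.
bool isAligned16(const(void)* p)
{
    return (cast(size_t) p & 15) == 0;
}

void main()
{
    Vec4 v1 = Vec4(1, 2, 3, 4);

    // The result can differ between builds: adding or removing other
    // locals may shift v1 to a different offset in the stack frame.
    writeln("v1 is 16-byte aligned: ", isAligned16(&v1));
}
```

If the check reports false, movups is the safe choice for that address.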

LLAP,
Sascha
April 03, 2008
Sascha Katzner Wrote:

> Audun Wilhelmsen wrote:
> > Also, this confuses me:
> > 	Vec4 v1 = Vec4(1,2,3,4);
> > //	Vec4* p = &v1;
> > 	asm {
> > 		movaps XMM1, v1[EBP];
> > 	}
> > 
> > if I remove the comment, the program crashes.
> 
> It seems that this is an data alignment problem, IIRC the "a" in movAps stands for "align" that means the command expects aligned data. You can use movups (unaligned), but that is not as fast. Or you can align your data.
> 
> LLAP,
> Sascha

Well, I've tried align, align(4) and align(16) in front of struct Vec4. Isn't that supposed to align the data?
April 03, 2008
"Audun Wilhelmsen" <seronor@gmail.com> wrote in message news:ft3bdl$1enj$1@digitalmars.com...
> Sascha Katzner Wrote:
>
>> Audun Wilhelmsen wrote:
>> > Also, this confuses me:
>> > Vec4 v1 = Vec4(1,2,3,4);
>> > // Vec4* p = &v1;
>> > asm {
>> > movaps XMM1, v1[EBP];
>> > }
>> >
>> > if I remove the comment, the program crashes.
>>
>> It seems that this is an data alignment problem, IIRC the "a" in movAps stands for "align" that means the command expects aligned data. You can use movups (unaligned), but that is not as fast. Or you can align your data.
>>
>> LLAP,
>> Sascha
>
> Well I've tried align, align(4) and align(16) in front of struct Vec4.. Isn't that supposed to align the data?

I think align(n) only controls the alignment of the fields within the struct, and not the alignment of the struct itself in memory.

I _think_.


April 04, 2008
Audun Wilhelmsen wrote:
> Well I've tried align, align(4) and align(16) in front of struct
> Vec4.. Isn't that supposed to align the data?

Since 1.023...
> Data items in static data segment >= 16 bytes in size are now
> paragraph aligned.

So you have to put your structs in the static data segment. Structs on the stack are not guaranteed to be 16-byte aligned, as far as I know.
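
A rough way to compare the two placements is sketched below. It assumes a current D2 compiler, where __gshared is needed to get a variable into the static data segment (plain module-level variables are thread-local there); in D1 a module-level variable would play that role. The names staticVec and misalignment are made up for this illustration, and the printed values depend on the compiler and platform.

```d
import std.stdio;

struct Vec4
{
    float x, y, z, w;
}

// Assumption: __gshared places this in the static data segment.
// align(16) is an explicit request on top of the default placement.
align(16) __gshared Vec4 staticVec = Vec4(1, 2, 3, 4);

/// Hypothetical helper: distance of p from the previous 16-byte
/// boundary. 0 means the address is safe for MOVAPS.
size_t misalignment(const(void)* p)
{
    return cast(size_t) p & 15;
}

void main()
{
    Vec4 stackVec = Vec4(5, 6, 7, 8);

    writeln("static: ", misalignment(&staticVec));
    writeln("stack:  ", misalignment(&stackVec));
}
```

A non-zero result for either variable means movaps on that address will fault, and movups or a different placement is needed.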

LLAP,
Sascha