April 01, 2008 SSE, Inline assembler, Structs, ... | ||||
---|---|---|---|---|
| ||||
I want to use SSE to create a fast vector/matrix library (if anyone has done this already I'd like to know). It seems that there's quite a bit of overhead with operator overloading, so I'd probably want to write some of the algorithms in my final app in assembly, but I'd still like to have optimized operators too. But I'm having some problems. I can't get this to work, for instance:

align struct Vec4 {
    float x,y,z,w;
    ....
    Vec4 opAdd(Vec4 v) {
        Vec4 res;
        asm {
            movaps XMM0, [this];
            addps XMM0, v[EBP];
            movaps res[EBP], XMM0;
        }
        return res;
    }
}

If I add Vec4 *me = this and replace this with me, it compiles, but it crashes.

Also, this confuses me:

Vec4 v1 = Vec4(1,2,3,4);
// Vec4* p = &v1;
asm {
    movaps XMM1, v1[EBP];
}

If I remove the comment, the program crashes.
|
April 01, 2008 Re: SSE, Inline assembler, Structs, ... | ||||
---|---|---|---|---|
| ||||
Posted in reply to Audun Wilhelmsen | Audun Wilhelmsen wrote:
> Also, this confuses me:
> Vec4 v1 = Vec4(1,2,3,4);
> // Vec4* p = &v1;
> asm {
> movaps XMM1, v1[EBP]; }
>
> if I remove the comment, the program crashes.
It seems that this is a data alignment problem. IIRC the "a" in movAps stands for "aligned", which means the instruction expects 16-byte aligned data. You can use movups (unaligned), but that is not as fast. Or you can align your data.
LLAP,
Sascha
|
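A minimal sketch of the movups route suggested above, written for a current D2 compiler rather than the D1 compiler this thread was using. The free function name `addUnaligned` is hypothetical, and the pointers are taken in ordinary D code so the asm block only has to deal with registers. Keep in mind that `addps` with a memory operand has the same 16-byte requirement as `movaps`, so both inputs are loaded into registers with `movups` first.

```d
import std.stdio;

struct Vec4
{
    float x, y, z, w;
}

// Hypothetical helper: adds two Vec4 values without assuming any alignment.
Vec4 addUnaligned(Vec4 a, Vec4 b)
{
    Vec4 res;
    // Plain pointers, taken in D code, so the asm never addresses
    // the structs through EBP/RBP offsets directly.
    Vec4* pa = &a;
    Vec4* pb = &b;
    Vec4* pr = &res;

    version (D_InlineAsm_X86_64)
    {
        asm
        {
            mov RAX, pa;
            mov RCX, pb;
            mov RDX, pr;
            movups XMM0, [RAX];  // unaligned load of a
            movups XMM1, [RCX];  // unaligned load of b
            addps XMM0, XMM1;    // register operands, no alignment needed
            movups [RDX], XMM0;  // unaligned store into res
        }
    }
    else version (D_InlineAsm_X86)
    {
        asm
        {
            mov EAX, pa;
            mov ECX, pb;
            mov EDX, pr;
            movups XMM0, [EAX];
            movups XMM1, [ECX];
            addps XMM0, XMM1;
            movups [EDX], XMM0;
        }
    }
    else
    {
        // Scalar fallback for compilers or targets without DMD-style asm.
        res = Vec4(a.x + b.x, a.y + b.y, a.z + b.z, a.w + b.w);
    }
    return res;
}

void main()
{
    auto r = addUnaligned(Vec4(1, 2, 3, 4), Vec4(10, 20, 30, 40));
    writeln(r.x, " ", r.y, " ", r.z, " ", r.w);
}
```

This gives up whatever speed advantage the aligned instructions have, but it does not depend on where the compiler happens to place the structs.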
April 03, 2008 Re: SSE, Inline assembler, Structs, ... | ||||
---|---|---|---|---|
| ||||
Posted in reply to Sascha Katzner | Sascha Katzner Wrote:
> Audun Wilhelmsen wrote:
> > Also, this confuses me:
> > Vec4 v1 = Vec4(1,2,3,4);
> > // Vec4* p = &v1;
> > asm {
> > movaps XMM1, v1[EBP];
> > }
> >
> > if I remove the comment, the program crashes.
>
> It seems that this is a data alignment problem. IIRC the "a" in movAps stands for "aligned", which means the instruction expects 16-byte aligned data. You can use movups (unaligned), but that is not as fast. Or you can align your data.
>
> LLAP,
> Sascha
Well I've tried align, align(4) and align(16) in front of struct Vec4.. Isn't that supposed to align the data?
|
April 03, 2008 Re: SSE, Inline assembler, Structs, ... | ||||
---|---|---|---|---|
| ||||
Posted in reply to Audun Wilhelmsen | "Audun Wilhelmsen" <seronor@gmail.com> wrote in message news:ft3bdl$1enj$1@digitalmars.com...
> Sascha Katzner Wrote:
>
>> Audun Wilhelmsen wrote:
>> > Also, this confuses me:
>> > Vec4 v1 = Vec4(1,2,3,4);
>> > // Vec4* p = &v1;
>> > asm {
>> > movaps XMM1, v1[EBP];
>> > }
>> >
>> > if I remove the comment, the program crashes.
>>
>> It seems that this is a data alignment problem. IIRC the "a" in movAps stands for "aligned", which means the instruction expects 16-byte aligned data. You can use movups (unaligned), but that is not as fast. Or you can align your data.
>>
>> LLAP,
>> Sascha
>
> Well I've tried align, align(4) and align(16) in front of struct Vec4.. Isn't that supposed to align the data?
I think align(n) only works on data alignment within the struct, and not the alignment of the struct itself in memory. I _think_.
|
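One way to settle this is to measure rather than guess. The sketch below, for a current D2 compiler, prints the alignment the compiler reports for the type and checks where actual instances end up. `isAligned16` is a hypothetical helper, not a library function. As far as I can tell from the current D2 specification, an `align` attribute placed in front of a struct sets the alignment of the struct itself, which may well differ from the D1 behaviour being discussed here. The result depends on compiler, version and target, so no output is shown.

```d
import std.stdio;

align(16) struct Vec4
{
    float x, y, z, w;
}

// Hypothetical helper: true when p sits on a 16-byte boundary,
// which is what movaps requires of its memory operand.
bool isAligned16(const(void)* p)
{
    return (cast(size_t) p & 15) == 0;
}

void main()
{
    Vec4 onStack = Vec4(1, 2, 3, 4);
    auto onHeap = new Vec4(1, 2, 3, 4);

    writeln("Vec4.alignof      = ", Vec4.alignof);
    writeln("stack instance ok = ", isAligned16(&onStack));
    writeln("heap instance ok  = ", isAligned16(onHeap));
}
```

If the stack line reports false on your setup, that would match the crash with `movaps` on a local variable.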
April 04, 2008 Re: SSE, Inline assembler, Structs, ... | ||||
---|---|---|---|---|
| ||||
Posted in reply to Audun Wilhelmsen | Audun Wilhelmsen wrote:
> Well I've tried align, align(4) and align(16) in front of struct
> Vec4.. Isn't that supposed to align the data?
Since 1.023...
> Data items in static data segment >= 16 bytes in size are now
> paragraph aligned.
So, you have to put your structs in the static data segment, structs on the stack are not properly aligned as far as I know.
LLAP,
Sascha
|
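A rough sketch of the static data approach, again assuming a current D2 compiler rather than the 1.023 release mentioned above. In D2, module-level variables are thread-local by default, so `__gshared` is used here to place them in the static data segment, and `align(16)` is requested explicitly instead of relying on the size rule. The names `lhs`, `rhs`, `sum`, `addAligned` and `isAligned16` are all hypothetical. The alignment is checked before `movaps` is used, so a misplaced operand fails with an assertion instead of a crash.

```d
import std.stdio;

// Operands in the static data segment, with 16-byte alignment requested.
align(16) __gshared float[4] lhs = [1, 2, 3, 4];
align(16) __gshared float[4] rhs = [10, 20, 30, 40];
align(16) __gshared float[4] sum;

bool isAligned16(const(void)* p)
{
    return (cast(size_t) p & 15) == 0;
}

// Adds four floats. All three pointers must be 16-byte aligned.
void addAligned(const(float)* a, const(float)* b, float* dst)
{
    // Local copies, so the asm refers to ordinary stack variables.
    auto pa = a;
    auto pb = b;
    auto pd = dst;

    version (D_InlineAsm_X86_64)
    {
        asm
        {
            mov RAX, pa;
            mov RCX, pb;
            mov RDX, pd;
            movaps XMM0, [RAX];  // aligned load
            addps XMM0, [RCX];   // memory operand must be aligned too
            movaps [RDX], XMM0;  // aligned store
        }
    }
    else version (D_InlineAsm_X86)
    {
        asm
        {
            mov EAX, pa;
            mov ECX, pb;
            mov EDX, pd;
            movaps XMM0, [EAX];
            addps XMM0, [ECX];
            movaps [EDX], XMM0;
        }
    }
    else
    {
        // Scalar fallback for compilers or targets without DMD-style asm.
        foreach (i; 0 .. 4)
            dst[i] = a[i] + b[i];
    }
}

void main()
{
    assert(isAligned16(lhs.ptr) && isAligned16(rhs.ptr) && isAligned16(sum.ptr),
           "static operands are not 16-byte aligned on this compiler");
    addAligned(lhs.ptr, rhs.ptr, sum.ptr);
    writeln(sum);
}
```

The obvious drawback is that every operand has to live in global storage, which is awkward for a general vector library with temporaries.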