February 08, 2005 How to set struct alignment on the stack?

Was going to optimize my vector functions for SSE capable CPUs but I ran into a problem. How does one set the alignment for a struct? Not the byte packing alignment for the member data, but how the struct gets aligned on the stack? This is very important for SIMD operations.

In the code that follows, there are two main functions. The first one will crash. The second one works, but is not optimal and sucks. Am I doing something lame?

Thanks for any time you take to reply!

- Brian

    version = ia32simd; // this is the version being tested.

    /********************************************************/

    align (16) struct vector
    {
        float x,y,z,w;
        void set (float a, float b, float c) {x=a;y=b;z=c;w=1;}
        void print () {printf ("[ %g, %g, %g, %g ]\n",x,y,z,w);}
    }

    void add (inout vector result, inout vector a, inout vector b)
    {
        version (ia32simd) asm
        {
            mov ESI,a;
            mov EDI,b;
            movaps XMM0,[ESI];   // movaps faults unless the address is 16-byte aligned
            addps XMM0,[EDI];
            mov ESI,result;
            movaps [ESI],XMM0;
        }
        else
        {
            result.x = a.x + b.x;
            result.y = a.y + b.y;
            result.z = a.z + b.z;
            result.w = a.w + b.w; // addps adds all four lanes, so the fallback does too
        }
    }

    /********************************************************/
    /* This Main Doesn't Work */

    static assert (vector.sizeof == 16);
    //static assert (vector.alignof == 16); // FAILS! ???

    void main1 ()
    {
        vector a,b,c;
        //assert ((cast(int)(&a) & 0b1111) == 0); // FAILS!
        //assert ((cast(int)(&b) & 0b1111) == 0); // FAILS!
        //assert ((cast(int)(&c) & 0b1111) == 0); // FAILS!
        a.set (1,2,3);
        b.set (4,5,6);
        add (c,a,b); // Error: Win32 Exception !!!
        c.print();
    }

    /********************************************************/
    /* This Main Works, but SUCKS! */

    vector *alloc16aligned ()
    {
        /* allocate a vector off the heap 16 bytes aligned */
        byte *p = new byte [vector.sizeof+0b1111];
        return cast(vector*)(((cast(int)(p))+0b1111)&~0b1111);
    }

    void main2 ()
    {
        vector *a = alloc16aligned();
        vector *b = alloc16aligned();
        vector *c = alloc16aligned();
        a.set (1,2,3);
        b.set (4,5,6);
        add (*c,*a,*b);
        c.print();
        assert (c.x == 5);
        assert (c.y == 7);
        assert (c.z == 9);
        assert (c.w == 2);
    }
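One possible stopgap that stays on the stack, offered as a sketch rather than an answer to the question: over-allocate a local byte buffer by 15 bytes and round the pointer up, the same trick as alloc16aligned but without touching the heap. The sketch assumes a current D compiler (hence size_t instead of int in the casts), and the names Vector and alignUp16 are made up for the example.

```d
import core.stdc.stdio : printf;

struct Vector
{
    float x, y, z, w;
}

/// Rounds p up to the next 16-byte boundary (returns p unchanged if it is
/// already aligned). Hypothetical helper, not part of any library.
Vector* alignUp16(void* p)
{
    auto addr = cast(size_t) p; // size_t so the cast is also safe on 64-bit
    return cast(Vector*) ((addr + 15) & ~cast(size_t) 15);
}

void main()
{
    // 15 spare bytes guarantee that one aligned 16-byte slot fits inside.
    ubyte[Vector.sizeof + 15] storage;
    Vector* a = alignUp16(storage.ptr);

    *a = Vector(1, 2, 3, 1);
    assert((cast(size_t) a & 15) == 0);
    printf("[ %g, %g, %g, %g ]\n", a.x, a.y, a.z, a.w);
}
```

It wastes 15 bytes per vector and still goes through a pointer, so it is a workaround at best; the real fix would be the compiler honoring align (16) for locals.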
February 08, 2005 Re: How to set struct alignment on the stack?

Posted in reply to Brian Chapman

Brian Chapman wrote:

> Was going to optimize my vector functions for SSE capable CPUs but I ran into a problem. How does one set the alignment for a struct? Not the byte packing alignment for the member data, but how the struct gets aligned on the stack? This is very important for SIMD operations.

It's equally important for AltiVec, as well as it is for SSE.

It would be nice to avoid having to use assembler*, but then D would have to have the same vector extensions that C has...
http://developer.apple.com/hardware/ve/model.html

And since the PowerPC G4+ has 32 vector registers, in addition to the 32 integer and the 32 floating-point registers, passing vector data on the stack does suck in comparison with registers.

But a first small step is aligning the thing to 16-byte boundaries. Otherwise one would have to permute all loads, and that sucks worse.
http://developer.apple.com/hardware/ve/alignment.html

--anders

* not that GDC supports any inline assembler yet anyway, but...
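Until the 16-byte alignment can be relied on, a caller could at least test for it at run time and only take the aligned-load path when it is safe. The following is a minimal sketch of that idea, assuming a current D compiler; isAligned16, canUseAlignedPath and addScalar are hypothetical names, and the scalar routine stands in for whatever fallback is actually used.

```d
import core.stdc.stdio : printf;

struct Vector
{
    float x, y, z, w;
}

/// True when p sits on a 16-byte boundary.
bool isAligned16(const(void)* p)
{
    return (cast(size_t) p & 15) == 0;
}

/// True when all three operands could be used with aligned vector loads.
bool canUseAlignedPath(ref Vector result, ref const Vector a, ref const Vector b)
{
    return isAligned16(&result) && isAligned16(&a) && isAligned16(&b);
}

/// Portable fallback; adds all four lanes, like a four-float SIMD add would.
void addScalar(ref Vector result, ref const Vector a, ref const Vector b)
{
    result.x = a.x + b.x;
    result.y = a.y + b.y;
    result.z = a.z + b.z;
    result.w = a.w + b.w;
}

void main()
{
    auto a = Vector(1, 2, 3, 1);
    auto b = Vector(4, 5, 6, 1);
    Vector c;

    if (canUseAlignedPath(c, a, b))
        printf("aligned: an aligned-load version would be safe here\n");
    else
        printf("not aligned: staying on the scalar path\n");

    addScalar(c, a, b); // correct either way
    assert(c.x == 5 && c.y == 7 && c.z == 9 && c.w == 2);
}
```

The check costs a few instructions per call, so it only pays off when the vector routine does a fair amount of work.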
February 09, 2005 Re: How to set struct alignment on the stack?

Posted in reply to Anders F Björklund

SIMD extensions for D would be really cool.

"Anders F Björklund" <afb@algonet.se> wrote in message news:cubf02$s48$1@digitaldaemon.com...

> Brian Chapman wrote:
>
>> Was going to optimize my vector functions for SSE capable CPUs but I ran into a problem. How does one set the alignment for a struct? Not the byte packing alignment for the member data, but how the struct gets aligned on the stack? This is very important for SIMD operations.
>
> It's equally important for AltiVec, as well as it is for SSE.
>
> It would be nice to avoid having to use assembler*, but then D would have to have the same vector extensions that C has...
> http://developer.apple.com/hardware/ve/model.html
>
> And since the PowerPC G4+ has 32 vector registers, in addition to the 32 integer and the 32 floating-point registers, passing vector data on the stack does suck in comparison with registers.
>
> But a first small step is aligning the thing to 16-byte boundaries. Otherwise one would have to permute all loads, and that sucks worse.
> http://developer.apple.com/hardware/ve/alignment.html
>
> --anders
>
> * not that GDC supports any inline assembler yet anyway, but...
February 09, 2005 Re: How to set struct alignment on the stack?

Posted in reply to Anders F Björklund

On 2005-02-08 16:37:54 -0600, Anders F Björklund <afb@algonet.se> said:

> It's equally important for AltiVec, as well as it is for SSE.

Yeah, I was wanting to do some AltiVec too, but that's going to require an external asm file since, as you mentioned, GDC doesn't support inline asm. Which means that it's only worthwhile on longer operations with more data (like matrices). But since it's external, I may as well do it in C and use the compiler intrinsics, as you also mentioned. But then suddenly I'm back to using C again, which I was wanting to get away from. *sigh*

> It would be nice to avoid having to use assembler*, but then
> D would have to have the same vector extensions that C has...
> http://developer.apple.com/hardware/ve/model.html
>
> And since the PowerPC G4+ has 32 vector registers, in addition
> to the 32 integer and the 32 floating-point registers, passing
> vector data on the stack does suck in comparison with registers.
>
> But a first small step is aligning the thing to 16-byte boundaries.
> Otherwise one would have to permute all loads, and that sucks worse.
> http://developer.apple.com/hardware/ve/alignment.html
>
> --anders
>
> * not that GDC supports any inline assembler yet anyway, but...

It would be nice if at the very least there was a way, perhaps via the command line, to globally set the data alignment to an arbitrary value (in this case 16 bytes).
February 09, 2005 Re: How to set struct alignment on the stack?

Posted in reply to Brian Chapman

Brian Chapman wrote:

> Yeah, I was wanting to do some AltiVec too, but that's going to require an external asm file since, as you mentioned, GDC doesn't support inline asm. Which means that it's only worthwhile on longer operations with more data (like matrices). But since it's external, I may as well do it in C and use the compiler intrinsics, as you also mentioned. But then suddenly I'm back to using C again, which I was wanting to get away from. *sigh*

D doesn't let you get away from C. It lets you get away from *C++* :-)

AltiVec works fine if you compile it with /usr/bin/gcc and then link the objects into the D program? (It'll require a PPC G4/G5, of course.)

It might be possible (with a few months or something of work) to get the AltiVec patches and the D patches to co-exist in the GCC 3.3 base... See this changelog for all the patches that are being applied to it:
http://www.opensource.apple.com/darwinsource/DevToolsAug2004/gcc-1762/CHANGES.Apple

(some examples)

> Owner   Status   Name of change
> -----   ------   --------------
> zlaski  local    -Wno-altivec-long-deprecated
> shebs   mixed    AltiVec
> shebs   unknown  Altivec related
> shebs   unknown  darwin native, AltiVec
> shebs   local    disable generic AltiVec patterns

And a ton of other patches, mostly related to:
1) Objective-C
2) Objective-C++
3) Macintosh legacy
4) Fat i386/ppc builds
(the sources are modified, so you need to use "diff" a lot)

To my local GCC/GDC copy, I have applied the Apple framework patches (so that "#include <Carbon/Carbon.h>" and -framework Carbon work) as well as the -mcpu patches so that G3, G4 and G5 are recognized.
http://dstress.kuehne.cn/raw_results/mac-OS-X-10.3.7_gdc-0.10-patch/

But perhaps a worthier effort would be to port GDC to GCC 4.0?

--anders
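For the compile-in-C-and-link route, the D side could look roughly like the sketch below: a function with C linkage that takes raw float pointers. The name vec_add4 and its signature are hypothetical. In the setup described above its body would live in a C file built with gcc and only an extern (C) declaration would remain in D; a plain D body stands in here so that the sketch links on its own.

```d
import core.stdc.stdio : printf;

// Stand-in for a routine that would really be written in C (for example
// with AltiVec intrinsics) and linked in as an object file. With a real C
// object, this would shrink to a declaration ending in a semicolon.
extern (C) void vec_add4(float* result, const(float)* a, const(float)* b)
{
    foreach (i; 0 .. 4)
        result[i] = a[i] + b[i];
}

void main()
{
    float[4] a = [1, 2, 3, 1];
    float[4] b = [4, 5, 6, 1];
    float[4] c;

    // Only pointers cross the language boundary, so the call looks the same
    // whether the callee is D or C.
    vec_add4(c.ptr, a.ptr, b.ptr);

    assert(c == [5f, 7, 9, 2]);
    printf("[ %g, %g, %g, %g ]\n", c[0], c[1], c[2], c[3]);
}
```

The alignment question does not go away with this approach: the C routine would still need 16-byte-aligned pointers handed to it from the D side.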
Copyright © 1999-2021 by the D Language Foundation