April 07, 2003 align doesn't work | ||||
---|---|---|---|---|
| ||||
Locals should be able to be aligned to the specified requirements. This is vital once we start dealing with types that have hard alignment requirements (such as structs that contain 128-bit xmmwords that must be 16-byte aligned so that inline asm that references them won't get alignment faults).

That cent/ucent type would sure be handy too. ;)

Sean

    align(16) struct foo { uint x,y; };

    void main ()
    {
        foo f;
        uint x;
        foo f2;
        printf("foo.y.offset = %d, foo.size = %d\n", foo.y.offset, foo.size); // this is good
        printf("f = %p, f2 = %p\n", &f, &f2); // these should both be aligned to 16 bytes
        // align(16) foo f3; // syntax error, I don't understand the reasoning why.
    }
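As long as the compiler does not align locals itself, one common workaround is to reserve a slightly larger buffer on the stack and round its address up to the required boundary. The following is only a minimal sketch of that idea, assuming a present-day D compiler; `Foo`, `alignUp` and `storage` are hypothetical names chosen for illustration, not anything from the post above.

```d
import core.stdc.stdio : printf;

// Hypothetical stand-in for the struct in the post.
struct Foo { uint x, y; }

/// Round `p` up to the next multiple of `alignment`.
/// Assumption: `alignment` is a power of two.
void* alignUp(void* p, size_t alignment)
{
    immutable size_t mask = alignment - 1;
    return cast(void*)((cast(size_t) p + mask) & ~mask);
}

void main()
{
    enum size_t alignment = 16;

    // Reserve alignment - 1 spare bytes so that an aligned Foo
    // always fits, wherever the buffer itself happens to start.
    ubyte[Foo.sizeof + alignment - 1] storage;

    Foo* f = cast(Foo*) alignUp(storage.ptr, alignment);
    f.x = 1;
    f.y = 2;

    printf("storage = %p, f = %p, f %% 16 = %d\n",
           cast(void*) storage.ptr, cast(void*) f,
           cast(int)(cast(size_t) f % alignment));
}
```

The cost is up to 15 wasted bytes per object and an extra pointer indirection, which is why compiler support for aligned locals is the nicer solution.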
April 07, 2003 Re: align doesn't work | ||||
---|---|---|---|---|
| ||||
Posted in reply to Sean L. Palmer

"Sean L. Palmer" wrote:
> Locals should be able to be aligned to the specified requirements. This is vital once we start dealing with types that have hard alignment requirements (such as structs that contain 128-bit xmmwords that must be 16-byte aligned so that inline asm that references them won't get alignment faults).
>
> That cent/ucent type would sure be handy too. ;)
>
> Sean
>
> align(16) struct foo { uint x,y; };
>
> void main ()
> {
>     foo f;
>     uint x;
>     foo f2;
>     printf("foo.y.offset = %d, foo.size = %d\n", foo.y.offset, foo.size); // this is good
>     printf("f = %p, f2 = %p\n", &f, &f2); // these should both be aligned to 16 bytes
>     // align(16) foo f3; // syntax error, I don't understand the reasoning why.
> }

I'll add some weird facts to the topic of alignment.

While doing precision benchmarks I found that delegates and functions are extremely sensitive to alignment. The same functions

    void TestLoop1000A ()
    {
        for(int i=0; i<1000; i++) {
            // empty
        }
    }

    void TestLoop1000B ()
    {
        for(int i=0; i<1000; i++) {
            // empty
        }
    }

will perform quite differently (by about 20%) depending on their starting offset within a 16-byte frame (at least that is what the benchmarks seem to prove). The measurement error is below 1% (reproducibility).

I don't understand it. I'm not a hardware man. It may be CPU-dependent (I used an Athlon 750 for this).

Exactly the same effect can be seen when benchmarking the same code by using closures.

--
Helmut Leitner leitner@hls.via.at
Graz, Austria www.hls-software.com
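For anyone who wants to repeat a measurement of this kind, here is a rough sketch of how the two loops could be timed and their starting offsets printed. It is not the benchmark used in the post: `timeCalls` and `repetitions` are hypothetical names, and `core.time.MonoTime` is assumed from a present-day druntime. Note also that an optimizing build may remove the empty loops entirely, so the numbers only mean something when the loops survive compilation.

```d
import core.stdc.stdio : printf;
import core.time : Duration, MonoTime;

void testLoop1000A()
{
    for (int i = 0; i < 1000; i++) {
        // empty
    }
}

void testLoop1000B()
{
    for (int i = 0; i < 1000; i++) {
        // empty
    }
}

/// Call `fn` `repetitions` times and return the elapsed wall-clock time.
Duration timeCalls(void function() fn, int repetitions)
{
    immutable start = MonoTime.currTime;
    foreach (_; 0 .. repetitions)
        fn();
    return MonoTime.currTime - start;
}

void main()
{
    enum repetitions = 100_000;

    immutable a = timeCalls(&testLoop1000A, repetitions);
    immutable b = timeCalls(&testLoop1000B, repetitions);

    // Starting offset of each function within a 16-byte frame.
    immutable offA = cast(size_t) cast(void*) &testLoop1000A & 15;
    immutable offB = cast(size_t) cast(void*) &testLoop1000B & 15;

    printf("A: offset %d, %lld usecs\n", cast(int) offA, a.total!"usecs");
    printf("B: offset %d, %lld usecs\n", cast(int) offB, b.total!"usecs");
}
```

Running each timing several times and comparing the spread gives a feel for the reproducibility before reading anything into the difference between A and B.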
April 07, 2003 Re: align doesn't work | ||||
---|---|---|---|---|
| ||||
Posted in reply to Helmut Leitner

On the x86 architecture, branch targets do considerably better when aligned to at least 4 bytes (8 is better for modern CPUs, I think). It's even better to have your entire inner loop fit into as few cache lines as possible.

This is something the compiler should deal with internally when you specify -O; the programmer should not have to concern themselves with such petty implementation details. It's part of the standard size vs. speed tradeoff.

Or were you driving at the need for some directive to control code alignment manually?

Sean

"Helmut Leitner" <leitner@hls.via.at> wrote in message news:3E913C6A.CBF417D1@hls.via.at...
> I'll add some weird facts to the topic of alignment.
>
> While doing precision benchmarks I found that delegates and functions are extremely sensitive to alignment. The same functions
>
> void TestLoop1000A ()
> {
>     for(int i=0; i<1000; i++) {
>         // empty
>     }
> }
>
> void TestLoop1000B ()
> {
>     for(int i=0; i<1000; i++) {
>         // empty
>     }
> }
>
> will perform quite differently (by about 20%) depending on their starting offset within a 16-byte frame (at least that is what the benchmarks seem to prove).
> The measurement error is below 1% (reproducibility).
>
> I don't understand it. I'm not a hardware man.
> It may be CPU-dependent (I used an Athlon 750 for this).
>
> Exactly the same effect can be seen when benchmarking the same code by using closures.
>
> --
> Helmut Leitner leitner@hls.via.at
> Graz, Austria www.hls-software.com
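To make the cache-line remark concrete, the arithmetic can be sketched as follows. This is only an illustration: `cacheLinesSpanned` is a hypothetical helper, and the 64-byte line size and the 40-byte loop are assumptions chosen for the example, not measurements from this thread.

```d
import core.stdc.stdio : printf;

/// Number of cache lines touched by `length` bytes of code that
/// start at byte offset `start`, for lines of `lineSize` bytes.
size_t cacheLinesSpanned(size_t start, size_t length, size_t lineSize)
{
    if (length == 0)
        return 0;
    immutable size_t first = start / lineSize;
    immutable size_t last = (start + length - 1) / lineSize;
    return last - first + 1;
}

void main()
{
    enum size_t lineSize = 64; // assumed line size
    enum size_t loopSize = 40; // assumed size of the inner loop in bytes

    // The same loop placed at two different starting offsets.
    printf("start  0: %zu line(s)\n", cacheLinesSpanned(0, loopSize, lineSize));
    printf("start 48: %zu line(s)\n", cacheLinesSpanned(48, loopSize, lineSize));
}
```

With these assumed sizes the loop fits in one line when it starts at offset 0 but straddles two when it starts at offset 48, which is the kind of placement a compiler could adjust by padding before the loop.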
Copyright © 1999-2021 by the D Language Foundation