Thread overview
Re: DMD 1.034 and 2.018 releases - Align
Aug 11, 2008
bearophile
Aug 11, 2008
Pete
Aug 11, 2008
Don
Aug 11, 2008
Wayne Anderson
Aug 11, 2008
Koroskin Denis
August 11, 2008
Aligned memory for array ops:

The MMX (etc.), SSE, SSE2, SSE3, SSSE3, SSE4, SSE4.1, SSE4.2, SSE4a, SSE5 and AVX instructions require data aligned to 8 or 16 bytes (and maybe more in the future; the D language has to be flexible enough for whatever CPUs offer three years from now). The current array ops of D contain code that handles misaligned data, but I presume properly aligned data may lead to better performance.

If the static/dynamic pointers in D aren't guaranteed to be aligned to 8/16 (and in the future maybe 32 bytes) then a syntax may be added to ensure it.

I presume that something like this may solve the problem with dynamically allocated pointers, but it's not nice looking, it's error-prone, and you have to keep both pointers around in your program:

memPtr = malloc(sizeInBytes + alignmentInBytes - 1);
alignedPtr = (T*)( ((size_t)memPtr + alignmentInBytes - 1) & ~(alignmentInBytes - 1) );
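
For reference, the over-allocate-and-round-up idea above can be sketched in D roughly as follows. This is only an illustration under stated assumptions: AlignedBlock, allocAligned and freeAligned are hypothetical names (nothing from Phobos or the runtime), and the alignment is assumed to be a power of two. The struct keeps both pointers together, since the one returned by malloc is still needed for free.

```d
import core.stdc.stdlib : malloc, free;

/// Hypothetical pair: the pointer malloc returned plus the aligned view.
struct AlignedBlock(T)
{
    void* raw;     // what malloc returned; this is what free() needs
    T*    aligned; // first address >= raw that is a multiple of the alignment
}

/// Hypothetical helper: room for `count` items of T, aligned to `alignment`
/// bytes. Assumes `alignment` is a power of two.
AlignedBlock!T allocAligned(T)(size_t count, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0,
           "alignment must be a power of two");

    // Over-allocate so an aligned address always fits inside the block.
    void* raw = malloc(count * T.sizeof + alignment - 1);
    if (raw is null)
        return AlignedBlock!T(null, null);

    // Round the address up using size_t, which is as wide as a pointer.
    size_t address = (cast(size_t) raw + alignment - 1) & ~(alignment - 1);
    return AlignedBlock!T(raw, cast(T*) address);
}

/// Releases the block through the original malloc pointer.
void freeAligned(T)(ref AlignedBlock!T block)
{
    free(block.raw);
    block.raw = null;
    block.aligned = null;
}
```

A caller would write something like `auto block = allocAligned!float(100, 16);`, work through `block.aligned[0 .. 100]`, and finish with `freeAligned(block);`.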

A first possible syntax (mostly by LeoD):

align(16) int[10] a; // static
auto a = align(32) new int[100]; // dynamic pointer

Bye,
bearophile
August 11, 2008
bearophile Wrote:

> Aligned memory for array ops:
> 
> The MMX (etc.), SSE, SSE2, SSE3, SSSE3, SSE4, SSE4.1, SSE4.2, SSE4a, SSE5 and AVX instructions require data aligned to 8 or 16 bytes (and maybe more in the future; the D language has to be flexible enough for whatever CPUs offer three years from now). The current array ops of D contain code that handles misaligned data, but I presume properly aligned data may lead to better performance.
> 
> If the static/dynamic pointers in D aren't guaranteed to be aligned to 8/16 (and in the future maybe 32 bytes) then a syntax may be added to ensure it.
> 
> I presume that something like this may solve the problem with dynamically allocated pointers, but it's not nice looking, it's error-prone, and you have to keep both pointers around in your program:
> 
> memPtr = malloc(sizeInBytes + alignmentInBytes - 1);
> alignedPtr = (T*)( ((size_t)memPtr + alignmentInBytes - 1) & ~(alignmentInBytes - 1) );
> 
> A first possible syntax (mostly by LeoD):
> 
> align(16) int[10] a; // static
> auto a = align(32) new int[100]; // dynamic pointer
> 
> Bye,
> bearophile

Ah. You got there before me :-)
August 11, 2008
bearophile wrote:
> Aligned memory for array ops:
> 
> The MMX (etc.), SSE, SSE2, SSE3, SSSE3, SSE4, SSE4.1, SSE4.2, SSE4a, SSE5 and AVX instructions require data aligned to 8 or 16 bytes (and maybe more in the future; the D language has to be flexible enough for whatever CPUs offer three years from now). The current array ops of D contain code that handles misaligned data, but I presume properly aligned data may lead to better performance.
> 
> If the static/dynamic pointers in D aren't guaranteed to be aligned to 8/16 (and in the future maybe 32 bytes) then a syntax may be added to ensure it. 

Dynamic and static memory allocation of arrays has been guaranteed to be 16-byte aligned since D1.023.
Stack-allocated arrays aren't yet aligned. I've added a possible solution in bugzilla #2278.
There'll always be a problem with slicing, though -- if you start a slice from an odd index, it's going to be misaligned.
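
To make the slicing point concrete, here is a small sketch that measures how far the start of each slice sits past a 16-byte boundary. The helper names misalignment and sliceMisalignments are made up for this illustration, and the boundary is assumed to be a power of two.

```d
/// Bytes by which `p` lies past the previous `boundary`-byte boundary
/// (0 means aligned). Assumes `boundary` is a power of two.
size_t misalignment(const(void)* p, size_t boundary = 16)
{
    return cast(size_t) p & (boundary - 1);
}

/// Misalignment of a[i .. $] for every start index i of `a`.
size_t[] sliceMisalignments(T)(T[] a, size_t boundary = 16)
{
    auto result = new size_t[a.length];
    foreach (i; 0 .. a.length)
        result[i] = misalignment(a[i .. $].ptr, boundary);
    return result;
}
```

If `a.ptr` is itself 16-byte aligned, `sliceMisalignments(new int[8])` gives 0, 4, 8, 12, 0, 4, 8, 12: only slices starting at a multiple of four ints land back on a 16-byte boundary.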



August 11, 2008
Don Wrote:

> bearophile wrote:
> > Aligned memory for array ops:
> > 
> > The MMX (etc.), SSE, SSE2, SSE3, SSSE3, SSE4, SSE4.1, SSE4.2, SSE4a, SSE5 and AVX instructions require data aligned to 8 or 16 bytes (and maybe more in the future; the D language has to be flexible enough for whatever CPUs offer three years from now). The current array ops of D contain code that handles misaligned data, but I presume properly aligned data may lead to better performance.
> > 
> > If the static/dynamic pointers in D aren't guaranteed to be aligned to 8/16 (and in the future maybe 32 bytes) then a syntax may be added to ensure it.
> 
> Dynamic and static memory allocation of arrays is guaranteed to be
> aligned to 16 since D1.023.
> Stack-allocated arrays aren't yet aligned. I've added a possible
> solution in bugzilla #2278.
> There'll always be a problem with slicing, though -- if you start a
> slice from an odd index, it's going to be misaligned.
> 
> 
> 
I don't understand that. Doesn't it depend on the size of the data type being stored in the array? If the size of the type being stored is a multiple of 16, shouldn't any slice still be aligned?
August 11, 2008
On Mon, 11 Aug 2008 22:13:03 +0400, Wayne Anderson <wanderon@comcast.net> wrote:

> Don Wrote:
>
>> bearophile wrote:
>> > Aligned memory for array ops:
>> >
>> > The MMX (etc.), SSE, SSE2, SSE3, SSSE3, SSE4, SSE4.1, SSE4.2, SSE4a, SSE5 and AVX instructions require data aligned to 8 or 16 bytes (and maybe more in the future; the D language has to be flexible enough for whatever CPUs offer three years from now). The current array ops of D contain code that handles misaligned data, but I presume properly aligned data may lead to better performance.
>> >
>> > If the static/dynamic pointers in D aren't guaranteed to be aligned to 8/16 (and in the future maybe 32 bytes) then a syntax may be added to ensure it.
>>
>> Dynamic and static memory allocation of arrays is guaranteed to be
>> aligned to 16 since D1.023.
>> Stack-allocated arrays aren't yet aligned. I've added a possible
>> solution in bugzilla #2278.
>> There'll always be a problem with slicing, though -- if you start a
>> slice from an odd index, it's going to be misaligned.
>>
>>
>>
> I don't understand that. Doesn't it depend on the size of the data type being stored in the array? If the size of the type being stored is a multiple of 16, shouldn't any slice still be aligned?

I think he was talking about ints, floats and doubles, since the SSE instructions operate on those types.
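
The arithmetic behind both the question and the answer can be written down as a short sketch. It assumes the array itself starts on the boundary, and alignedStride and sliceStaysAligned are hypothetical names chosen for this example. A slice starting at index i begins i * T.sizeof bytes into the array, so it stays aligned exactly when that offset is a multiple of the boundary.

```d
/// Smallest step in the start index that keeps a slice on the same
/// `boundary`-byte alignment as the array it was taken from.
size_t alignedStride(T)(size_t boundary = 16)
{
    // Euclid's algorithm for gcd(boundary, T.sizeof).
    size_t a = boundary, b = T.sizeof;
    while (b != 0)
    {
        immutable r = a % b;
        a = b;
        b = r;
    }
    return boundary / a;
}

/// True if a[index .. $] is aligned whenever `a` itself is aligned.
bool sliceStaysAligned(T)(size_t index, size_t boundary = 16)
{
    return (index * T.sizeof) % boundary == 0;
}
```

For a 16-byte boundary this gives a stride of 4 for int and float and 2 for double, so an odd start index always breaks alignment for those types. For an element type whose size is a multiple of 16, such as float[4], the stride is 1 and every slice stays aligned, which matches Wayne's reasoning.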