Using inline assembler
October 09, 2014
I'm fairly new to the inline assembler. I'm trying to use the `movdqu` instruction to move a 128-bit double quadword from one pointer location to another, like this:

align(16) union __m128i { ubyte[16] data };

void store(__m128i* src, __m128i* dst) {
	asm { movdqu [dst], src; }
}


The compiler complains about a "bad type/size of operands 'movdqu'", but both data segments are 16-byte aligned, so shouldn't they fit in an XMM register? Is there something I'm missing here?
October 09, 2014
On Thursday, 9 October 2014 at 12:37:20 UTC, Etienne wrote:
> I'm fairly new to the inline assembler. I'm trying to use the `movdqu` instruction to move a 128-bit double quadword from one pointer location to another, like this:
>
> align(16) union __m128i { ubyte[16] data };
>
> void store(__m128i* src, __m128i* dst) {
> 	asm { movdqu [dst], src; }
> }
>
>
> The compiler complains about a "bad type/size of operands 'movdqu'", but both data segments are 16-byte aligned, so shouldn't they fit in an XMM register? Is there something I'm missing here?

I know virtually nothing about SSE, but you can't move directly
from memory to memory, can you? You need to go through a register,
no?

This compiles:

align(16) union __m128i { ubyte[16] data; } /* note the position
of the semicolon */

void store(__m128i* src, __m128i* dst) {
     asm
     {
         movdqu XMM0, [src]; /* note: [src] */
         movdqu [dst], XMM0;
     }
}
October 09, 2014
On 2014-10-09 8:54 AM, anonymous wrote:
> This compiles:
>
> align(16) union __m128i { ubyte[16] data; } /* note the position
> of the semicolon */
>
> void store(__m128i* src, __m128i* dst) {
>       asm
>       {
>           movdqu XMM0, [src]; /* note: [src] */
>           movdqu [dst], XMM0;
>       }
> }

Yes, this does compile, but the value from src never ends up stored in dst.

import std.stdio; // for writeln

void main() {
	__m128i src;
	src.data[0] = 255;
	__m128i dst;
	writeln(src.data); // shows 255 at offset 0
	store(&src, &dst);
	writeln(dst.data); // remains set as the initial array
}

http://x86.renejeschke.de/html/file_module_x86_id_184.html

Is this how it's meant to be used?
October 09, 2014
Maybe someone can help with the more specific problem. I'm translating a crypto engine here:

https://github.com/etcimon/botan/blob/master/source/botan/block/aes_ni/aes_ni.d

But I need this to work on DMD, LDC and GDC. I decided to write the assembler code directly for the functions in this module:

https://github.com/etcimon/botan/blob/master/source/botan/utils/simd/xmmintrin.d

If there's anything someone can tell me about this, I'd be thankful. I'm very experienced in every aspect of programming, but still at my first baby steps in assembler.
October 09, 2014
On Thursday, 9 October 2014 at 13:29:27 UTC, Etienne wrote:
> On 2014-10-09 8:54 AM, anonymous wrote:
>> This compiles:
>>
>> align(16) union __m128i { ubyte[16] data; } /* note the position
>> of the semicolon */
>>
>> void store(__m128i* src, __m128i* dst) {
>>      asm
>>      {
>>          movdqu XMM0, [src]; /* note: [src] */
>>          movdqu [dst], XMM0;
>>      }
>> }
>
> Yes, this does compile, but the value from src never ends up stored in dst.
>
> import std.stdio; // for writeln
>
> void main() {
> 	__m128i src;
> 	src.data[0] = 255;
> 	__m128i dst;
> 	writeln(src.data); // shows 255 at offset 0
> 	store(&src, &dst);
> 	writeln(dst.data); // remains set as the initial array
> }
>
> http://x86.renejeschke.de/html/file_module_x86_id_184.html
>
> Is this how it's meant to be used?

I'm out of my knowledge zone here, but it seems to work when you
move the pointers to registers first:

void store(__m128i* src, __m128i* dst) {
     asm
     {
         mov RAX, src;
         mov RBX, dst;
         movdqu XMM0, [RAX];
         movdqu [RBX], XMM0;
     }
}
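
A likely explanation for why the earlier version compiled but left dst untouched: in DMD-style inline assembler, a parameter name such as `src` appears to stand for the stack slot that holds the pointer, so `movdqu XMM0, [src]` reads 16 bytes starting at the pointer variable itself, not at the memory it points to. Loading the pointer into a general-purpose register first adds the missing dereference. The sketch below puts that into a small runnable program. It assumes an x86-64 target and a compiler with DMD-style inline asm (`D_InlineAsm_X86_64`). The plain-copy fallback is my own addition for compilers without it, and RCX replaces RBX only as a conservative choice, since RBX is callee-saved in the common x86-64 calling conventions.

```d
import std.stdio : writeln;

align(16) union __m128i { ubyte[16] data; }

// Copies 16 bytes from *src to *dst.
void store(__m128i* src, __m128i* dst)
{
    version (D_InlineAsm_X86_64)
    {
        asm
        {
            mov RAX, src;        // RAX = the pointer value held in src
            mov RCX, dst;        // RCX = the pointer value held in dst
            movdqu XMM0, [RAX];  // load 16 bytes from the memory src points to
            movdqu [RCX], XMM0;  // store them to the memory dst points to
        }
    }
    else
    {
        // Assumed fallback where DMD-style inline asm is unavailable.
        *dst = *src;
    }
}

void main()
{
    __m128i src, dst;
    src.data[0] = 255;
    store(&src, &dst);
    writeln(dst.data);
    assert(dst.data == src.data); // dst now mirrors src
}
```

Because `movdqu` is the unaligned variant, the copy should not depend on the `align(16)` attribute; the alignment would only become a requirement with `movdqa`.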
October 09, 2014
On 2014-10-09 9:46 AM, anonymous wrote:
> I'm out of my knowledge zone here, but it seems to work when you
> move the pointers to registers first:
>
> void store(__m128i* src, __m128i* dst) {
>       asm
>       {
>           mov RAX, src;
>           mov RBX, dst;
>           movdqu XMM0, [RAX];
>           movdqu [RBX], XMM0;
>       }
> }

Absolutely incredible! My first useful working assembler code. You saved the day. Now I can probably write a whole SIMD library ;)