Thread overview
asm woes...
May 27, 2016
Era Scarecrow
May 27, 2016
Era Scarecrow
May 27, 2016
Guillaume Piolat
May 27, 2016
Era Scarecrow
May 27, 2016
Era Scarecrow
May 27, 2016
Guillaume Piolat
May 27, 2016
Era Scarecrow
May 28, 2016
Era Scarecrow
May 28, 2016
ZombineDev
May 28, 2016
Era Scarecrow
May 27, 2016
rikki cattermole
May 27, 2016
Era Scarecrow
May 27, 2016
Guillaume Piolat
May 31, 2016
Marco Leise
May 27, 2016
Era Scarecrow
May 27, 2016
Era Scarecrow
May 31, 2016
Marco Leise
May 31, 2016
Era Scarecrow
May 27, 2016
 Well, I decided I should try my hand at assembly just to see if it would work. Using wideint.d as a starting point, I thought I would do the simplest operation I could: an increment.

  https://github.com/d-gamedev-team/gfm/blob/master/integers/gfm/integers/wideint.d
  https://dlang.org/spec/iasm.html

 Most of my code was failing outright until I looked at the integrated assembler page, which TDPL doesn't go into at all. To access a variable, for example, I have to write var[ESP] or var[RSP] to reach it through the stack frame. Unintuitive, but sure, I can work with it.

 So the code for incrementing is pretty simple...

  @nogc void increment() pure nothrow
  {
    ++lo;
    if (lo == 0) ++hi;
  }

 That's pretty simple to work with. I know the assembly instructions can be done one of two ways.

   add lo, 1
   adc hi, 0

 OR

   inc lo
   jnz L1 //jump if not zero (inc does not touch the carry flag)
   inc hi
   L1:
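
 In plain D terms, the add/adc pair amounts to something like the sketch below. It is only an illustration: incrementPair is a hypothetical helper that is not part of wideint.d, and the carry flag is modelled with the overflow bool of core.checkedint.addu.

```d
import core.checkedint : addu;

// Hypothetical helper (not part of wideint.d): increments the 128-bit
// value formed by hi:lo, modelling the CPU carry flag with a bool.
void incrementPair(ref ulong lo, ref ulong hi) pure nothrow @nogc
{
    bool carry = false;          // addu only ever sets this, so start cleared
    lo = addu(lo, 1UL, carry);   // add lo, 1  (carry set on wrap-around)
    hi += carry ? 1 : 0;         // adc hi, 0  (adds the carry to the high half)
}

void main()
{
    ulong lo = ulong.max, hi = 41;
    incrementPair(lo, hi);
    assert(lo == 0 && hi == 42); // low half wrapped, carry went into hi

    lo = 5;
    hi = 7;
    incrementPair(lo, hi);
    assert(lo == 6 && hi == 7);  // no wrap, high half untouched
}
```

 The bool plays the role of the carry flag: it is set only when the low half wraps around to zero, which is the same condition the original `if (lo == 0) ++hi;` checks.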


 So I've tried. Since wideint is basically defined in terms of itself when you want a type larger than 128 bits, I need to leave the original code alone for types that are too large and only inject assembly at the right size. Thankfully, bits is there to tell us.

So, add version
  @nogc void increment() pure nothrow
  {
    static if (bits > 128) {
      ++lo;
      if (lo == 0) ++hi;
    } else {
      version(X86) {
        asm pure @nogc nothrow {
          add lo[ESP], 1;
          adc hi[ESP], 0;
        }
      } else {
        ++lo;
        if (lo == 0) ++hi;
      }
    }
  }

 I compile and get: Error: asm statements cannot be interpreted at compile time

 The whole thing now fails, rather than compiling to do the unittests... Doing the inc version gives the same error..

        asm pure @nogc nothrow {
          inc lo[ESP];
          jnz L1; // inc does not touch the carry flag, so test for zero
          inc hi[ESP];
          L1:;
        }

 Naturally it wasn't very specific about whether I should rely on RSP or ESP or what, but since it's X86 rather than X86_64 I guess that answers it... would be easy to write the x64 version, if it would let me.

 So I figured I'd put in a check for __ctfe, which will avoid the assembly when that's the case. So...

    version(X86) {
      @nogc void increment() pure nothrow
      {
        if (!__ctfe && bits == 128) {
          asm pure @nogc nothrow {
            add lo[ESP], 1;
            adc hi[ESP], 0;
          }
        } else {
          ++lo;
          if (lo == 0) ++hi;
        }
      }
    } else {
      //original declaration
    }

 Now it compiles, but the program hangs when running the unittest. Why does it hang? I have no clue. I tried changing ESP to EBP in case that was what it actually wanted, but that doesn't seem to be it. I can tell how I will be refactoring the code, assuming I can figure out what's wrong in the first place...

 Anyone with inline assembly experience who can help me out a little? 2 add instructions shouldn't cause it to hang...
May 27, 2016
On Friday, 27 May 2016 at 08:20:02 UTC, Era Scarecrow wrote:
>  Anyone with inline assembly experience who can help me out a little? 2 add instructions shouldn't cause it to hang...

 Hmmm it just occurs to me I made a big assumption. I assumed that if the CPU supports 64bit operations, that it would be compiled to use 64bit registers when possible. I'm assuming this is not the case. As such the tests I was doing will probably be of little help _unless_ it was X86_64 code, or a check that verifies it's 64bit hardware?

 Does this mean the 64bit types are emulated rather than using hardware?
May 27, 2016
On Friday, 27 May 2016 at 09:11:01 UTC, Era Scarecrow wrote:
>  Hmmm it just occurs to me I made a big assumption. I assumed that if the CPU supports 64bit operations, that it would be compiled to use 64bit registers when possible. I'm assuming this is not the case. As such the tests I was doing will probably be of little help _unless_ it was X86_64 code, or a check that verifies it's 64bit hardware?

You have to write your code three times, one for

version(D_InlineAsm_X86)
version (D_InlineAsm_X86_64)
and a version without assembly.

In rare cases you can merge the D_InlineAsm_X86 and D_InlineAsm_X86_64 versions. Unfortunately, D provides less support than C++ for writing code that is valid in both! This causes lots of duplication </rant>
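
A minimal sketch of that three-way split, under stated assumptions: `increment` here is a hypothetical free function working on a plain uint through a pointer, not the wideint member, so that each branch stays a single instruction.

```d
// Hypothetical free function, for illustration only.
void increment(uint* p) nothrow @nogc
{
    version (D_InlineAsm_X86)
    {
        asm nothrow @nogc
        {
            mov EAX, p;             // load the pointer argument
            inc dword ptr [EAX];    // increment the uint it points to
        }
    }
    else version (D_InlineAsm_X86_64)
    {
        asm nothrow @nogc
        {
            mov RAX, p;             // same code, but with a 64-bit pointer
            inc dword ptr [RAX];
        }
    }
    else
    {
        ++*p;                       // portable version without assembly
    }
}

void main()
{
    uint x = 8;
    increment(&x);
    assert(x == 9);
}
```

The two asm branches differ only in the register that holds the pointer, which is the kind of duplication complained about above.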


TBH I don't know how to access members in assembly, and I think you shouldn't ever do that. It will depend heavily on the particular calling convention used.
Just put these fields in local variables.

void increment()
{
    auto lo_local = lo;
    auto hi_local = hi;
    asm
    {
        add dword ptr lo_local, 1;
        adc dword ptr hi_local, 0;
    }
    lo = lo_local;
    hi = hi_local;
}

The compiler will replace the variable names with the right register-indexed addressing.
But honestly I doubt it will be any faster, because on the other hand you mess with the optimizer.
May 27, 2016
On Friday, 27 May 2016 at 09:22:49 UTC, Guillaume Piolat wrote:
> On Friday, 27 May 2016 at 09:11:01 UTC, Era Scarecrow wrote:
>>  Hmmm it just occurs to me I made a big assumption. I assumed that if the CPU supports 64bit operations, that it would be compiled to use 64bit registers when possible. I'm assuming this is not the case. As such the tests I was doing will  probably be of little help _unless_ it was X86_64 code, or a check that verifies it's 64bit hardware?
>
> You have to write your code three times, one for
>
> version(D_InlineAsm_X86)
> version (D_InlineAsm_X86_64)
> and a version without assembly.

 If longs are emulated, then only the X86_64 and non-assembly versions need to be considered, as there would be no benefit to doing the X86 version. If I can do it, the two will be identical except for which stack register is used. (A lot of wasted space for so little to add.)

> TBH I don't know how to access members in assembly, I think you shouldn't ever do that. It will depend heavily on the particular calling convention called.
> Just put these fields in local variables.
>
> <snip>
>
> The compiler will replace with the right register-indexed stuff.
> But honestly I doubt it will be any faster because on the other hand you mess with the optimizer.

 Hmmm tried it as you have it listed. Still hangs. Tried it directly with qword with and without [ESP], still hangs.

 The inline assembler page on dlang.org says to use 'variableName[ESP]', which makes it obvious it's a variable and probably even inserts the type-size information as appropriate. I also did it manually as you listed, but it still hangs. I suppose there's a requirement to have a register pointing to this, which would mean mov EAX, this, and then add lo[EAX], 1...
May 27, 2016
On Friday, 27 May 2016 at 09:39:36 UTC, Era Scarecrow wrote:
> I suppose there's the requirement to have a register pointing to this, which then would be mov EAX, this, and then add lo[EAX], 1...

 Nope, still hangs...
May 27, 2016
On Friday, 27 May 2016 at 09:44:47 UTC, Era Scarecrow wrote:
> On Friday, 27 May 2016 at 09:39:36 UTC, Era Scarecrow wrote:
>> I suppose there's the requirement to have a register pointing to this, which then would be mov EAX, this, and then add lo[EAX], 1...
>
>  Nope, still hangs...

We can't know why your code hangs if you don't post any code.

https://dpaste.dzfl.pl/4026d9e6d3c0
May 27, 2016
On 27/05/2016 8:20 PM, Era Scarecrow wrote:
>  Well decided I should dig my hand in assembly just to see if it would
> work. Using wideint.d as a starting point I thought I would do the
> simplest operation I could do, an increment.
>
>
> https://github.com/d-gamedev-team/gfm/blob/master/integers/gfm/integers/wideint.d
>
>   https://dlang.org/spec/iasm.html
>
>  Most of my code was failing outright until I looked at the integrated
> assembler page, which TDPL doesn't go into at all. To access variables
> for example I have to do var[ESP] or var[RSP] to access it from the
> stack frame. Unintuitive, but sure I can work with it.
>
>  So the code for incrementing is pretty simple...
>
>   @nogc void increment() pure nothrow
>   {
>     ++lo;
>     if (lo == 0) ++hi;
>   }
>
>  That's pretty simple to work with. I know the assembly instructions can
> be done 1 of 2 ways.
>
>    add lo, 1
>    adc hi, 0
>
>  OR
>
>    inc lo
>    jnz L1 //jump if not zero (inc does not touch the carry flag)
>    inc hi
>    L1:
>
>
>  So I've tried. Considering the wideint basically is self calling if you
> want to make a larger type than 128bit, then that means I need to leave
> the original code alone if it's a type that's too large, but only inject
> assembly if it's the right time and size. Thankfully bits is there to
> tell us.
>
> So, add version
>   @nogc void increment() pure nothrow
>   {
>     static if (bits > 128) {
>       ++lo;
>       if (lo == 0) ++hi;
>     } else {
>       version(X86) {
>         asm pure @nogc nothrow {
>           add lo[ESP], 1;
>           adc hi[ESP], 0;
>         }
>       } else {
>         ++lo;
>         if (lo == 0) ++hi;
>       }
>     }
>   }
>
>  I compile and get: Error: asm statements cannot be interpreted at
> compile time
>
>  The whole thing now fails, rather than compiling to do the unittests...
> Doing the inc version gives the same error..
>
>         asm pure @nogc nothrow {
>           inc lo[ESP];
>           jnz L1; // inc does not touch the carry flag, so test for zero
>           inc hi[ESP];
>           L1:;
>         }
>
>  Naturally it wasn't very specific about if I should rely on RSP or ESP
> or what, but since it's X86 rather than X86_64 I guess that answers
> it... would be easy to write the x64 version, if it would let me.
>
>  So i figure i put a check for __ctfe and that will avoid the assembly
> calls if that's the case. So...
>
>     version(X86) {
>       @nogc void increment() pure nothrow
>       {
>         if (!__ctfe && bits == 128) {
>           asm pure @nogc nothrow {
>             add lo[ESP], 1;
>             adc hi[ESP], 0;
>           }
>         } else {
>           ++lo;
>           if (lo == 0) ++hi;
>         }
>       }
>     } else {
>       //original declaration
>     }
>
>  Now it compiles, however it hangs the program when doing the unittest.
> Why does it hang the program? I have no clue. Tried changing the ESP to
> EBP just in case that was actually what it wanted, but doesn't seem to
> be the case. I can tell how I will be refactoring the code, assuming i
> can figure out what's wrong in the first place...
>
>  Anyone with inline assembly experience who can help me out a little? 2
> add instructions shouldn't cause it to hang...

Me and p0nce solved this on IRC.

struct Foo {
        int x;

        void foobar() {
                asm {
                        mov EAX, this;
                        inc [EAX+Foo.x.offsetof];
                }
        }
}

void main() {
        import std.stdio;

        Foo foo = Foo(8);
        foo.foobar;

        writeln(foo.x);
}

You have to reference the field via a register.
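
Applied to a lo/hi pair, the same pattern might look like the sketch below. It assumes a hypothetical struct U128 with two ulong fields (the real wideint layout may differ), only covers D_InlineAsm_X86_64, and uses plain D everywhere else, including at compile time.

```d
// Hypothetical 128-bit value; U128, lo and hi are assumed names for
// illustration, not the actual gfm wideint layout.
struct U128
{
    ulong lo;
    ulong hi;

    void increment() pure nothrow @nogc
    {
        version (D_InlineAsm_X86_64)
        {
            if (!__ctfe)    // asm cannot be interpreted at compile time
            {
                asm pure nothrow @nogc
                {
                    mov RAX, this;                              // pointer to the struct
                    add qword ptr [RAX + U128.lo.offsetof], 1;  // sets carry on wrap-around
                    adc qword ptr [RAX + U128.hi.offsetof], 0;  // adds the carry to hi
                }
                return;
            }
        }
        // Plain D fallback: other targets and CTFE.
        ++lo;
        if (lo == 0) ++hi;
    }
}

void main()
{
    auto v = U128(ulong.max, 41);
    v.increment();
    assert(v.lo == 0 && v.hi == 42);

    // The same function still works at compile time via the fallback.
    enum U128 c = () { auto t = U128(ulong.max, 0); t.increment(); return t; }();
    static assert(c.lo == 0 && c.hi == 1);
}
```

The fields are only ever addressed through RAX, never through ESP or EBP, so the code does not depend on how the compiler lays out the stack frame.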
May 27, 2016
On Friday, 27 May 2016 at 09:51:56 UTC, rikki cattermole wrote:
> Me and p0nce solved this on IRC.
>
> struct Foo {
>   int x;
>
>   void foobar() {
>     asm {
>       mov EAX, this;
>       inc [EAX+Foo.x.offsetof];
>     }
>   }
> }
>
> void main() {
>   import std.stdio;
>
>   Foo foo = Foo(8);
>   foo.foobar;
>
>   writeln(foo.x);
> }
>
> You have to reference the field via a register.

 This is good progress. The documentation doesn't have many examples of how to do things with the assembler; I guess the x[ESP] example on the iasm page was totally useless.
May 27, 2016
On Friday, 27 May 2016 at 09:51:36 UTC, Guillaume Piolat wrote:
> On Friday, 27 May 2016 at 09:44:47 UTC, Era Scarecrow wrote:
>>  Nope, still hangs...
>
> We can't know why your code hangs if you don't post any code.

 Considering I'd have to include the whole of wideint.d, that is highly impractical. But I already got a possible answer, which I'm about to test.
May 27, 2016
On Friday, 27 May 2016 at 10:00:40 UTC, Era Scarecrow wrote:
> On Friday, 27 May 2016 at 09:51:56 UTC, rikki cattermole wrote:
>
>  This is good progress. Using the assembler doesn't have many documentation examples of how to do things, guess the x[ESP] example was totally useless on the iasm page.

Referencing EBP or ESP yourself is indeed dangerous. Not sure why the documentation would advise that. Using "this", the names of parameters/locals, or field offsets is much safer.

