Thread overview
assembler copy char[]
Jun 07, 2007
nobody
Jun 07, 2007
Daniel Keep
Jun 07, 2007
Frits van Bommel
Jun 07, 2007
nobody
Jun 07, 2007
Don Clugston
Jun 07, 2007
Daniel Keep
Jun 07, 2007
nobody
Jun 08, 2007
nobody
Jun 07, 2007
Thomas Kuehne
June 07, 2007
Hello,

I'm trying to copy mystring to st in assembler. Can someone give me some advice on how to do this?

import std.stdio;
void main()
{
        char[] mystring = "Hey Assembler\n";
        char[] st;

        asm
        {
                mov EAX, dword ptr [mystring+4];
                mov st,EAX;
        }
        writefln("st: ", st);
}

I get a segmentation fault.

Thanks
June 07, 2007

nobody wrote:
> Hello,
> 
> I'm trying to copy mystring to st in assembler. Can someone give me some advice on how to do this?
> 
> import std.stdio;
> void main()
> {
>         char[] mystring = "Hey Assembler\n";
>         char[] st;
> 
>         asm
>         {
>                 mov EAX, dword ptr [mystring+4];
>                 mov st,EAX;
>         }
>         writefln("st: ", st);
> }
> 
> I get a segmentation fault.
> 
> Thanks

I have to wonder why on earth you're using assembly, and how you arrived at the above code.  From what I can tell, you're copying mystring's pointer into st's *length* field, and then trying to print it.

You say you want to "copy" the string, but you could do it just as easily like this:

  st = mystring;

Why exactly are you doing this?  I ask because I'm somewhat hesitant to hand a bazooka to someone who seems to be having trouble working out which end the rocket comes out of...

	-- Daniel
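
For readers following along, here is a minimal sketch of what "st = mystring" actually does, assuming a current D2 compiler (the thread's code is D1-era, so the `.dup` on the literal and `writeln` are adaptations). The helper names `shareRef` and `copyData` are made up for illustration. Plain assignment copies only the array reference, so both variables see the same characters, while `.dup` allocates an independent copy.

```d
import std.stdio;

// Hypothetical helper: plain assignment copies only the (length, ptr) pair.
char[] shareRef(char[] src)
{
    char[] dst = src;
    return dst;
}

// Hypothetical helper: .dup allocates new memory and copies the characters.
char[] copyData(char[] src)
{
    return src.dup;
}

void main()
{
    char[] mystring = "Hey Assembler".dup; // .dup yields a mutable char[] in D2
    char[] sameData = shareRef(mystring);
    char[] ownData = copyData(mystring);

    mystring[0] = 'h'; // only visible through the shared reference

    writeln("sameData: ", sameData); // prints "sameData: hey Assembler"
    writeln("ownData:  ", ownData);  // prints "ownData:  Hey Assembler"
}
```

If the goal is an independent string rather than a second reference to the same one, `.dup` is probably what is wanted, and no assembler is needed for either.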
June 07, 2007
nobody wrote:
> I'm trying to copy mystring to st in assembler. Can someone give me some advice on how to do this?
> 
> import std.stdio;
> void main()
> {
>         char[] mystring = "Hey Assembler\n";
>         char[] st;
> 
>         asm
>         {
>                 mov EAX, dword ptr [mystring+4];
>                 mov st,EAX;
>         }
>         writefln("st: ", st);
> }
> 
> I get a segmentation fault.

Unsurprising, since you're copying mystring.ptr to st.length, creating an array reference to a (likely huge) array with (.ptr == null) ;).

Like Daniel, who posted while I was composing this post, I wonder why you're trying to do this. There's a much easier way that can likely be better optimized by the compiler.
But if you're determined to do this (or are just trying to expand your knowledge of asm coding and/or D array implementation), read on.

Assuming you're trying to code "st = mystring" in assembler, try this:
---
import std.stdio;
void main()
{
        char[] mystring = "Hey Assembler\n";
        char[] st;

        asm
        {
		// Copy length
                mov EAX, dword ptr [mystring];
                mov [st],EAX;
		
		// Copy ptr
                mov EAX, dword ptr [mystring+4];
                mov [st+4],EAX;
        }
        writefln("st: ", st);
}
---
Remember, dynamic arrays have two parts: a length and a pointer to the data. You need to copy both of them (and to the corresponding part of the destination, obviously) to copy a dynamic array reference.



P.S. It'll likely be a bit more efficient to do this:
---
        asm
        {
                mov ECX, dword ptr [mystring];
                mov EDX, dword ptr [mystring+4];
                mov [st],ECX;
                mov [st+4],EDX;
		
        }
---
because it leaves more time between reading and writing, allowing the CPU to perform better pipelining.
About register usage: I used ECX & EDX here because (like EAX) those don't need to be preserved between function calls, so the compiler doesn't necessarily need to insert extra code to preserve them. I didn't use EAX because that's more likely to contain a useful value due to its special uses in calling conventions, and is thus more likely to require extra code to be emitted to preserve it.

But better optimization would likely result from just changing it to "st = mystring" and adding '-O' (or '-O3' for GDC) to the command line options passed to the compiler :). (Plus it'll be platform-independent)
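
The two-part layout described above can be checked from plain D. This is a minimal sketch, assuming a current D2 compiler; `ArrayRef` and `rawView` are hypothetical names introduced here, not part of any library. The struct mirrors the length-then-pointer order that the `[mystring]` and `[mystring+4]` offsets rely on (the pointer sits at offset 4 on 32-bit x86, and at `size_t.sizeof` in general).

```d
import std.stdio;

// Hypothetical mirror of a dynamic array reference: length first, then pointer.
struct ArrayRef
{
    size_t length;
    void* ptr;
}

// Reinterpret the array reference itself (not the characters) as an ArrayRef.
ArrayRef rawView(ref char[] a)
{
    return *cast(ArrayRef*) &a;
}

void main()
{
    char[] mystring = "Hey Assembler\n".dup;
    char[] st = mystring; // what the asm block does by hand

    // The reference is exactly two machine words wide.
    static assert((char[]).sizeof == ArrayRef.sizeof);

    // Both fields of st match those of mystring after the assignment.
    assert(rawView(st).length == mystring.length);
    assert(rawView(st).ptr is cast(void*) mystring.ptr);

    writeln("ptr field offset: ", ArrayRef.ptr.offsetof);
}
```

Copying only one of the two fields, as in the original post, leaves the other at its default value, which is why the result was a huge array with a null pointer.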
June 07, 2007
Frits van Bommel Wrote:

> nobody wrote:
> > I'm trying to copy mystring to st in assembler. Can someone give me some advice on how to do this?
> > 
> > import std.stdio;
> > void main()
> > {
> >         char[] mystring = "Hey Assembler\n";
> >         char[] st;
> > 
> >         asm
> >         {
> >                 mov EAX, dword ptr [mystring+4];
> >                 mov st,EAX;
> >         }
> >         writefln("st: ", st);
> > }
> > 
> > I get a segmentation fault.
> 
> Unsurprising, since you're copying mystring.ptr to st.length, creating an array reference to a (likely huge) array with (.ptr == null) ;).
> 
> Like Daniel, who posted while I was composing this post, I wonder why
> you're trying to do this. There's a much easier way that can likely be
> better optimized by the compiler.
> But if you're determined to do this (or are just trying to expand your
> knowledge of asm coding and/or D array implementation), read on.

Yes, I'm trying to learn inline assembler, but it is more difficult than I thought ;-)

> 
> Assuming you're trying to code "st = mystring" in assembler, try this:
> ---
> import std.stdio;
> void main()
> {
>          char[] mystring = "Hey Assembler\n";
>          char[] st;
> 
>          asm
>          {
> 		// Copy length
>                  mov EAX, dword ptr [mystring];
>                  mov [st],EAX;
> 
> 		// Copy ptr
>                  mov EAX, dword ptr [mystring+4];
>                  mov [st+4],EAX;
>          }
>          writefln("st: ", st);
> }
> ---
> Remember, dynamic arrays have two parts: a length and a pointer to the data.

Yes, I figured this out.

> You need to copy both of them (and to the corresponding part of
> the destination, obviously) to copy a dynamic array reference.

Ah, that's the trick.
> 
> 
> 
> P.S. It'll likely be a bit more efficient to do this:
> ---
>          asm
>          {
>                  mov ECX, dword ptr [mystring];
>                  mov EDX, dword ptr [mystring+4];
>                  mov [st],ECX;
>                  mov [st+4],EDX;
> 
>          }
> ---
> because it leaves more time between reading and writing, allowing the
> CPU to perform better pipelining.
> About register usage: I used ECX & EDX here because (like EAX) those
> don't need to be preserved between function calls, so the compiler
> doesn't necessarily need to insert extra code to preserve them. I didn't
> use EAX because that's more likely to contain a useful value due to its
> special uses in calling conventions, and is thus more likely to require
> extra code to be emitted to preserve it.
> 
> But better optimization would likely result from just changing it to "st = mystring" and adding '-O' (or '-O3' for GDC) to the command line options passed to the compiler :). (Plus it'll be platform-independent)

Wow, that's exactly what I wanted.
With dmd everything works, but gdc didn't like it:

mov EDX, dword ptr [mystring+4];
mov [st+4],EDX;


gdmd  string.d
/tmp/cc4h5AKo.s: Assembler messages:
/tmp/cc4h5AKo.s:30: Error: junk `(%ebp)+4' after expression
/tmp/cc4h5AKo.s:31: Error: junk `(%ebp)+4' after expression



June 07, 2007
nobody wrote:
> Wow, that's exactly what I wanted.
> With dmd everything works, but gdc didn't like it:
> 
> mov EDX, dword ptr [mystring+4];
> mov [st+4],EDX;
> 
> 
> gdmd  string.d
> /tmp/cc4h5AKo.s: Assembler messages:
> /tmp/cc4h5AKo.s:30: Error: junk `(%ebp)+4' after expression
> /tmp/cc4h5AKo.s:31: Error: junk `(%ebp)+4' after expression
> 
> 
> 
ST is a register (top of stack in x87 FPU). In DMD, all registers must be uppercase; maybe not true for GDC.
June 07, 2007

nobody wrote:
> Wow, that's exactly what I wanted.
> With dmd everything works, but gdc didn't like it:
> 
> mov EDX, dword ptr [mystring+4];
> mov [st+4],EDX;
> 
> 
> gdmd  string.d
> /tmp/cc4h5AKo.s: Assembler messages:
> /tmp/cc4h5AKo.s:30: Error: junk `(%ebp)+4' after expression
> /tmp/cc4h5AKo.s:31: Error: junk `(%ebp)+4' after expression

IIRC, that's because local variables are actually EBP plus some offset. I think DMD is folding the two offsets into a single displacement, but GDC isn't.

If you want to learn assembler, the best way is to just write code in D, compile it, and then disassemble it.  I assume you're running under Linux; gdb should have an option to disassemble the current function. That way, you can read the original line of source code, and what the compiler actually produces.  Here's what ddbg gives me for the program

void main()
{
    auto mystring = "Hello, World!";
    auto st = mystring;
}

Disassembly:

copy_string.d:2 void main()
00402010: c8200000                enter 0x20, 0x0
00402014: 53                      push ebx

copy_string.d:4     auto mystring = "Hello, World!";
00402015: 8d45e0                  lea eax, [ebp-0x20]
00402018: 50                      push eax
00402019: 6a0d                    push 0xd
0040201b: ff3594f04000            push dword [0x40f094]
00402021: ff3590f04000            push dword [0x40f090]
00402027: 6a01                    push 0x1
00402029: e87e010000              call 0x4021ac __d_arraycopy

copy_string.d:5     auto st = mystring;
0040202e: 8d4df0                  lea ecx, [ebp-0x10]
00402031: 51                      push ecx
00402032: 6a0d                    push 0xd
00402034: 8d55e0                  lea edx, [ebp-0x20]
00402037: bb0d000000              mov ebx, 0xd
0040203c: 52                      push edx
0040203d: 53                      push ebx
0040203e: 6a01                    push 0x1
00402040: e867010000              call 0x4021ac __d_arraycopy
00402045: 31c0                    xor eax, eax
00402047: 83c428                  add esp, 0x28
copy_string.obj
0040204a: 5b                      pop ebx
0040204b: c9                      leave
0040204c: c3                      ret

	-- Daniel
June 07, 2007
Daniel Keep Wrote:

> 
> 
> nobody wrote:
> > Wow, that's exactly what I wanted.
> > With dmd everything works, but gdc didn't like it:
> > 
> > mov EDX, dword ptr [mystring+4];
> > mov [st+4],EDX;
> > 
> > 
> > gdmd  string.d
> > /tmp/cc4h5AKo.s: Assembler messages:
> > /tmp/cc4h5AKo.s:30: Error: junk `(%ebp)+4' after expression
> > /tmp/cc4h5AKo.s:31: Error: junk `(%ebp)+4' after expression
> 
> IIRC, that's because local variables are actually EBP plus some offset.
> I think DMD is folding the two offsets into a single displacement, but GDC isn't.
> 
> If you want to learn assembler, the best way is to just write code in D, compile it, and then disassemble it.  I assume you're running under Linux; gdb should have an option to disassemble the current function. That way, you can read the original line of source code, and what the compiler actually produces.  Here's what ddbg gives me for the program
> 
> void main()
> {
>     auto mystring = "Hello, World!";
>     auto st = mystring;
> }
> 
> Disassembly:
> 
> copy_string.d:2 void main()
> 00402010: c8200000                enter 0x20, 0x0
> 00402014: 53                      push ebx
> 
> copy_string.d:4     auto mystring = "Hello, World!";
> 00402015: 8d45e0                  lea eax, [ebp-0x20]
> 00402018: 50                      push eax
> 00402019: 6a0d                    push 0xd
> 0040201b: ff3594f04000            push dword [0x40f094]
> 00402021: ff3590f04000            push dword [0x40f090]
> 00402027: 6a01                    push 0x1
> 00402029: e87e010000              call 0x4021ac __d_arraycopy
> 
> copy_string.d:5     auto st = mystring;
> 0040202e: 8d4df0                  lea ecx, [ebp-0x10]
> 00402031: 51                      push ecx
> 00402032: 6a0d                    push 0xd
> 00402034: 8d55e0                  lea edx, [ebp-0x20]
> 00402037: bb0d000000              mov ebx, 0xd
> 0040203c: 52                      push edx
> 0040203d: 53                      push ebx
> 0040203e: 6a01                    push 0x1
> 00402040: e867010000              call 0x4021ac __d_arraycopy
> 00402045: 31c0                    xor eax, eax
> 00402047: 83c428                  add esp, 0x28
> copy_string.obj
> 0040204a: 5b                      pop ebx
> 0040204b: c9                      leave
> 0040204c: c3                      ret
> 
> 	-- Daniel


With gdb

Dump of assembler code for function main:
0x08049a70 <main+0>:    lea    0x4(%esp),%ecx
0x08049a74 <main+4>:    and    $0xfffffff0,%esp
0x08049a77 <main+7>:    pushl  0xfffffffc(%ecx)
0x08049a7a <main+10>:   push   %ebp
0x08049a7b <main+11>:   mov    %esp,%ebp
0x08049a7d <main+13>:   push   %ecx
0x08049a7e <main+14>:   sub    $0x14,%esp
0x08049a81 <main+17>:   mov    (%ecx),%edx
0x08049a83 <main+19>:   mov    0x4(%ecx),%eax
0x08049a86 <main+22>:   mov    $0x8049154,%ecx
0x08049a8b <main+27>:   mov    %ecx,0x8(%esp)
0x08049a8f <main+31>:   mov    %edx,(%esp)
0x08049a92 <main+34>:   mov    %eax,0x4(%esp)
0x08049a96 <main+38>:   call   0x8049af0 <_d_run_main>
0x08049a9b <main+43>:   add    $0x14,%esp
0x08049a9e <main+46>:   pop    %ecx
0x08049a9f <main+47>:   pop    %ebp

OK, I'll try it again tomorrow, with ESP.
I thought that was only needed in naked asm blocks?


June 07, 2007
nobody wrote on 2007-06-07:
> Yes, I'm trying to learn inline assembler, but it is more difficult than I thought ;-)

If you'd like to learn inline assembler the hard way, have a look at the asm_*d files at http://dstress.kuehne.cn/run/a . Some of those require the helper http://dstress.kuehne.cn/addon/cpuinfo.d . Please don't read cpuinfo.d, because it uses various hacks to get around some GDC bugs.

Thomas


June 08, 2007
nobody Wrote:

> OK, I'll try it again tomorrow, with ESP.
> I thought that was only needed in naked asm blocks?


Here are my new versions.
With ESP:

import tango.io.Stdout;
void main()
{
        char[] mystring = "Hey Assembler";
        char[] str;

        asm
        {
                // Load address of str
                lea ECX,dword ptr [str]        ;

                // Copy length (assumes mystring sits at [ESP], which depends on the compiler's stack layout)
                mov EDX, dword ptr [ESP]       ;
                mov [ECX],EDX                  ;

                // Copy ptr
                mov EDX, dword ptr [ESP+4]     ;
                mov [ECX+4],EDX                ;
        }
        Stdout("str ")(str).newline;
}

With lea:
import tango.io.Stdout;
void main()
{
        char[] mystring = "Hey Assembler";
        char[] str;

        asm
        {
                // Load addresses of str and mystring
                lea ECX,dword ptr [str]        ;
                lea EBX,dword ptr [mystring]   ;

                // Copy length
                mov EDX, dword ptr [EBX]       ;
                mov [ECX],EDX                  ;

                // Copy ptr
                mov EDX, dword ptr [EBX+4]     ;
                mov [ECX+4],EDX                ;
        }
        Stdout("str ")(str).newline;
}
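
Putting the thread's advice together, here is a minimal sketch, assuming a current D2 compiler. The asm block is the reference copy from Frits's reply and is only compiled where DMD-style 32-bit inline assembler is available (the predefined `D_InlineAsm_X86` version identifier); everywhere else it falls back to plain assignment. The function name `copyRef` is hypothetical, and the destination is named `dst` rather than `st` to stay clear of the x87 register name mentioned earlier in the thread.

```d
import std.stdio;

// Hypothetical helper: copies the array reference (length and ptr), not the characters.
char[] copyRef(char[] src)
{
    char[] from = src; // locals, so the asm below addresses plain stack slots
    char[] dst;

    version (D_InlineAsm_X86)
    {
        asm
        {
            mov ECX, dword ptr [from];   // length, at offset 0
            mov EDX, dword ptr [from+4]; // ptr, at offset 4 on 32-bit x86
            mov [dst], ECX;
            mov [dst+4], EDX;
        }
    }
    else
    {
        dst = from; // portable fallback; the compiler emits the equivalent moves
    }
    return dst;
}

void main()
{
    char[] mystring = "Hey Assembler".dup;
    char[] result = copyRef(mystring);
    writeln("st: ", result);
}
```

Naming the variables inside the asm block, instead of hard-coding ESP offsets, leaves the stack layout to the compiler, so the code does not silently break when the frame changes.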