Thread overview
dmd asm output
Apr 01, 2013
John Colvin
Apr 01, 2013
bearophile
Apr 01, 2013
bearophile
Apr 01, 2013
John Colvin
Apr 01, 2013
nazriel
Apr 01, 2013
js.mdnq
Apr 01, 2013
Artur Skawina
Apr 01, 2013
John Colvin
April 01, 2013
I've been learning assembler a bit and I decided to have a look at what dmd spits out. I tried a simple function with arrays to see what vectorization gets done:

void addto(int[] a, int[] b) {
    a[] += b[];
}

dmd -O -release -inline -noboundscheck -gc -c test.d

disassembled with gdb:
_D3sse5addtoFAiAiZv:
0x0000000000000040 <+0>:   push   rbp
0x0000000000000041 <+1>:   mov    rbp,rsp
0x0000000000000044 <+4>:   sub    rsp,0x30
0x0000000000000048 <+8>:   mov    QWORD PTR [rbp-0x20],rdi
0x000000000000004c <+12>:  mov    QWORD PTR [rbp-0x18],rsi
0x0000000000000050 <+16>:  mov    QWORD PTR [rbp-0x10],rdx
0x0000000000000054 <+20>:  mov    QWORD PTR [rbp-0x8],rcx
0x0000000000000058 <+24>:  mov    rcx,QWORD PTR [rbp-0x18]
0x000000000000005c <+28>:  mov    rax,QWORD PTR [rbp-0x20]
0x0000000000000060 <+32>:  mov    rdx,rax
0x0000000000000063 <+35>:  mov    QWORD PTR [rbp-0x28],rdx
0x0000000000000067 <+39>:  mov    rdx,QWORD PTR [rbp-0x8]
0x000000000000006b <+43>:  mov    rdi,QWORD PTR [rbp-0x10]
0x000000000000006f <+47>:  mov    rsi,rdx
0x0000000000000072 <+50>:  mov    rdx,QWORD PTR [rbp-0x28]
0x0000000000000076 <+54>:  call   0x7b <_D3sse5addtoFAiAiZv+59>
0x000000000000007b <+59>:  mov    rsp,rbp
0x000000000000007e <+62>:  pop    rbp
0x000000000000007f <+63>:  ret

This looks nothing like what I expected. At first I thought maybe it was due to a crazy calling convention, but adding extern(C) changed nothing.

Can anyone explain what on earth is going on here? All that moving things on and off the stack, a call to the next line (strange), and then we're done bar the cleanup? I feel I must be missing something.
April 01, 2013
John Colvin:

> Can anyone explain what on earth is going on here?

In the dmd sources there are the sources for those array operations too.

In what you are seeing I think something is not recognizing the SSE+ instructions.

Bye,
bearophile
April 01, 2013
> In what you are seeing I think something is not recognizing the SSE+ instructions.

Sorry, I was wrong. The SSE ops are done elsewhere. You see that "call   0x7b <_D3sse5addtoFAiAiZv+59>".

Bye,
bearophile
April 01, 2013
On Monday, 1 April 2013 at 02:03:12 UTC, bearophile wrote:
>> In what you are seeing I think something is not recognizing the SSE+ instructions.
>
> Sorry, I was wrong. The SSE ops are done elsewhere. You see that "call   0x7b <_D3sse5addtoFAiAiZv+59>".
>
> Bye,
> bearophile

Whoops, sorry, the actual filename I used was sse.d.

You can see that in the function name at the top, which is the same name as in the call.
April 01, 2013
On Monday, 1 April 2013 at 01:54:10 UTC, John Colvin wrote:
> Can anyone explain what on earth is going on here? All that moving things on and off the stack, a call to the next line (strange) and then we're done bar the cleanup?  I feel i must be missing something.

It just looks like the wrong snippet. GDB probably isn't the best assembly-level debugger.

.text._D4test5addtoFAiAiZAi:08000044                 public _D4test5addtoFAiAiZAi
.text._D4test5addtoFAiAiZAi:08000044 _D4test5addtoFAiAiZAi proc near
.text._D4test5addtoFAiAiZAi:08000044
.text._D4test5addtoFAiAiZAi:08000044 arg_0           = dword ptr  8
.text._D4test5addtoFAiAiZAi:08000044 arg_8           = dword ptr  10h
.text._D4test5addtoFAiAiZAi:08000044 arg_C           = dword ptr  14h
.text._D4test5addtoFAiAiZAi:08000044
.text._D4test5addtoFAiAiZAi:08000044                 push    ebp
.text._D4test5addtoFAiAiZAi:08000045                 mov     ebp, esp
.text._D4test5addtoFAiAiZAi:08000047                 push    dword ptr [esp+0Ch]
.text._D4test5addtoFAiAiZAi:0800004B                 push    [ebp+arg_0]
.text._D4test5addtoFAiAiZAi:0800004E                 push    [ebp+arg_C]
.text._D4test5addtoFAiAiZAi:08000051                 push    [ebp+arg_8]
.text._D4test5addtoFAiAiZAi:08000054                 call    _arraySliceSliceAddass_i
.text._D4test5addtoFAiAiZAi:08000059                 add     esp, 10h
.text._D4test5addtoFAiAiZAi:0800005C                 pop     ebp
.text._D4test5addtoFAiAiZAi:0800005D                 retn    10h
.text._D4test5addtoFAiAiZAi:0800005D _D4test5addtoFAiAiZAi endp

Pardon the 32-bit output; my IDA Free doesn't handle 64-bit too well.
The only difference is that the arguments here are passed on the stack rather than in rdi, rsi, etc., as happens under the System V AMD64 calling convention.
April 01, 2013
On Monday, 1 April 2013 at 01:54:10 UTC, John Colvin wrote:
> Can anyone explain what on earth is going on here? All that moving things on and off the stack, a call to the next line (strange) and then we're done bar the cleanup?  I feel i must be missing something.

What's after the code?

The call at 0x76 is an inline function call; the ret returns from it. The code before it sets up the registers for the call and for what comes after:

> 0x0000000000000076 <+54>:    call   0x7b <_D3sse5addtoFAiAiZv+59>
> 0x000000000000007b <+59>:    mov    rsp,rbp
> 0x000000000000007e <+62>:    pop    rbp
> 0x000000000000007f <+63>:    ret

As you can see, the call is calling the function right below it, but where it returns to depends on what is on the stack (since ip is being popped into rbp).

To me, and this is a guess, this looks like some type of table of functions being called (the ret is being redirected to somewhere other than the place it was called from).

So there is much more going on than meets the eye. It would be easier to understand if you stepped through the code to see where the ret is headed.



April 01, 2013
On 04/01/13 12:24, js.mdnq wrote:
> On Monday, 1 April 2013 at 01:54:10 UTC, John Colvin wrote:
> What's after the code?
> 
> The 0x76 call is an inline call function, the ret returns it. The stuff before it is setting up the registers for the call and what comes after
> 
>> 0x0000000000000076 <+54>:    call   0x7b <_D3sse5addtoFAiAiZv+59>
>> 0x000000000000007b <+59>:    mov    rsp,rbp
>> 0x000000000000007e <+62>:    pop    rbp
>> 0x000000000000007f <+63>:    ret
> 
> As you can see, the call is calling the function right below it, [...]

This is just how objdump/gdb shows the code - it does *not* display relocations inline, so you get this misleading output. The call instruction will not end up having a zero offset (that is why it seems to point at the next op), but will be fixed up to call the right function. Run

   objdump -dr your_obj_or_exe_file

and the real call target will be shown as a relocation entry after the call instruction.
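
To see why a zero offset looks like a call to the next line: a near call is encoded as an opcode byte followed by a 32-bit displacement, and that displacement is relative to the address of the instruction after the call. Below is a minimal sketch of that arithmetic in shell, using the addresses from the listing above. The helper name call_target is made up for illustration, and the 5-byte instruction length is an assumption read off the gap between 0x76 and 0x7b.

```shell
# Hypothetical helper: the address a near call appears to target.
#   $1 = address of the call instruction
#   $2 = 32-bit displacement stored in the instruction
# Assumes a 5-byte call (1 opcode byte + 4 displacement bytes).
call_target() {
    # target = address of the next instruction + displacement
    printf '0x%x\n' $(( $1 + 5 + $2 ))
}

# Unrelocated object file: the displacement field is still zero,
# so the call seems to point at the instruction right after it.
call_target 0x76 0
```

With a displacement of zero this prints 0x7b, which is exactly the "call to the next line" in the gdb output. Once the displacement field has been fixed up, the same arithmetic gives the real target.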

artur
April 01, 2013
On Monday, 1 April 2013 at 11:10:56 UTC, Artur Skawina wrote:
> This is just how objdump/gdb shows the code - it does *not* display
> relocations inline, so you get this misleading output. The call
> instruction will not end up having a zero offset (that is why it
> seems to point at the next op), but will be fixed up to call the
> right function. Run
>
>    objdump -dr your_obj_or_exe_file
>
> and the real call target will be shown as a relocation entry after
> the call instruction.
>
> artur

Thanks, that explains it.