two mains

Jan 26, 2013

Tyro[17]

January 26, 2013

Posted by Tyro[17]

Trying to learn from the ground up and would appreciate some assistance making sense of the following:

// void main(){} [1]
.text._Dmain	segment
	assume	CS:.text._Dmain
_Dmain:
		push	RBP
		mov	RBP,RSP
		xor	EAX,EAX
		pop	RBP
		ret
.text._Dmain	ends
.cto

// void main(string[] args) {} [2]
.text._Dmain	segment
	assume	CS:.text._Dmain
_Dmain:
		push	RBP
		mov	RBP,RSP
		sub	RSP,010h
		xor	EAX,EAX
		leave
		ret
.text._Dm

Both segments of code deal with a minimal D program: the first taking no arguments and the second taking a string array. Information prior to the ".text._Dmain segment" in both files mirrors each other, with the exception of module name differences.

The first difference that jumps out at me is that the ".text._Dmain segment" in [1] properly terminates with ".text._Dmain ends" like all other segments in the file up to this point. [2] ends improperly with ".text._Dm".

The second is the use of leave in [2]. If I understand correctly, leave is the exact same as:

		mov	RBP,RSP
		pop	RBP

So why do we need to mov RBP, RSP in [2] but not in [1]? I'm thinking this is because RBP contains the address of args but not sure.

Last is the .cto at the end of [1], what on earth is it and what is it used for? Why does it not exist in [2]?

Thanks,
Andrew

On 1/26/13 3:42 PM, Tyro[17] wrote:
[Snip]
> Both segments of code deal with a minimal D program: the first taking no
> arguments and the second taking a string array. Information prior to the
> ".text._Dmain segment" in both files mirrors each other, with the exception
> of module name differences.
>
> The first difference that jumps out at me is that the ".text._Dmain
> segment" in [1] properly terminates with ".text._Dmain ends" like all
> other segments in the file up to this point. [2] ends improperly with
> ".text._Dm".

[1] This seems to be a problem with the disassembler @DPaste

> The second is the use of leave in [2]. If I understand correctly, leave
> is the exact same as:
>
> mov RBP,RSP
> pop RBP
>
> So why do we need to mov RBP, RSP in [2] but not in [1]? I'm thinking
> this is because RBP contains the address of args but not sure.

Still not clear on this choice.

> Last is the .cto at the end of [1], what on earth is it and what is it
> used for? Why does it not exist in [2]?

See [1] above.

> Thanks,
> Andrew

On Saturday, 26 January 2013 at 20:42:27 UTC, Tyro[17] wrote:
> So why do we need to mov RBP, RSP in [2] but not in [1]? I'm thinking this is because RBP contains the address of args but not sure.

The x64 calling convention passes the first few arguments via registers. I think it's most likely that the function prolog is allocating stack space to save the value of whatever registers (RDI/RDX?) which contain the string[] parameter, so that it could reuse those registers in the code of the function - but the assignment seems to have been optimized out, yet the stack allocation wasn't.

FWIW, on Windows x64, DMD generates slightly different code, presumably because it's using the Microsoft x64 calling convention instead of the System V one. There is no stack allocation when compiled with -O, however without -O, DMD adds a "mov [RBP+10h], RCX" instruction. I assume it makes use of the 32-byte "shadow space" to "spill" ECX:

http://en.wikipedia.org/wiki/X86_calling_conventions#x86-64_calling_conventions
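One way to sanity-check the stack-allocation explanation above: a D slice such as string[] is a length plus a pointer, so on a 64-bit target it occupies 16 bytes, the same amount that "sub RSP,010h" reserves in [2]. The sketch below only compares the sizes; it does not prove what DMD intended the space for. The names sliceBytes, reservedBytes and sliceFitsReservation are made up for illustration and do not come from the thread.

```d
// A D slice is a length plus a pointer, i.e. two machine words.
enum size_t sliceBytes = (string[]).sizeof;
static assert(sliceBytes == 2 * size_t.sizeof);

// Illustrative constant: the 010h bytes reserved by "sub RSP,010h" in [2].
enum size_t reservedBytes = 0x10;

// True when a string[] argument would fit into the reserved stack space.
bool sliceFitsReservation()
{
    return sliceBytes <= reservedBytes;
}
```

On a 64-bit build the slice is exactly 16 bytes, which is at least consistent with the idea that the space was set aside for spilling the args parameter.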

On Saturday, 26 January 2013 at 20:42:27 UTC, Tyro[17] wrote:
> Trying to learn from the ground up and would appreciate some assistance making sense of the following:
>
> // void main(){} [1]
> [...]

This might not be directly relevant here, but in general, I'd steer clear of main() for such experiments. Due to its special nature (several possible signatures, but the calling code in druntime stays the same), there is quite a bit of extra magic going on internally.

For example, you wouldn't normally find "xor EAX, EAX" in a void-returning function, as its purpose is to set the return value to 0, which implicitly happens to make the void main() and void main(string[]) variants conform to the "full" int main(string[]) signature.

David
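Following the suggestion to avoid main(), a minimal sketch of a source file for such experiments might look like the one below. The function names are hypothetical; the idea is to compile the file to an object file (for example with dmd -c) and disassemble these functions instead of _Dmain.

```d
// Ordinary functions mirroring the two main() variants from the question.
// None of these names come from the thread; they are placeholders.
void noArgs() {}

void withArgs(string[] args) {}

// Explicitly returns 0, which is what "xor EAX,EAX" sets up in the
// main() listings.
int returnsZero()
{
    return 0;
}
```

If David's explanation holds, one would expect "xor EAX,EAX" to show up in returnsZero but not in the two void functions. The exact output depends on the compiler version and flags.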

On Saturday, 26 January 2013 at 20:57:54 UTC, Tyro[17] wrote:
> On 1/26/13 3:42 PM, Tyro[17] wrote:
>> The second is the use of leave in [2]. If I understand correctly, leave
>> is the exact same as:
>>
>> mov RBP,RSP
>> pop RBP
>>
>> So why do we need to mov RBP, RSP in [2] but not in [1]? I'm thinking
>> this is because RBP contains the address of args but not sure.
>
> Still not clear on this choice.

Both functions could be replaced with
---
_Dmain:
xor EAX, EAX
ret
---
as they don't need to store anything on the stack at all. What you are seeing is just an odd result of the way the compiler generates the code internally, especially if you are compiling without optimizations on.

For further information on what EBP/RBP is needed for, try searching for discussions about the (mis)use of GCC's omit-frame-pointer option, such as: http://stackoverflow.com/questions/579262/what-is-the-purpose-of-the-frame-pointer

David
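To see why [1] can end with a bare "pop RBP" while [2] uses "leave", it may help to step through both listings on a toy model. With the destination operand written first, as in the listings, leave behaves like "mov RSP,RBP" followed by "pop RBP": it first restores RSP from RBP and only then pops the saved frame pointer. The D sketch below is a simulation under simplifying assumptions (the stack is a small array of 8-byte slots, RSP and RBP are slot indices), and every name in it is invented for illustration.

```d
// Stand-in value for the caller's frame pointer (arbitrary).
enum ulong callerFrame = 0xCA11E4;

// Toy model of the registers and stack touched by the two listings.
struct Cpu
{
    ulong[16] stack;          // simulated stack, one slot = 8 bytes
    size_t rsp = 16;          // one past the top; the stack grows downward
    ulong rbp = callerFrame;  // caller's RBP on entry

    void push(ulong v) { stack[--rsp] = v; }
    ulong pop() { return stack[rsp++]; }

    // push RBP / mov RBP,RSP
    void prologue() { push(rbp); rbp = rsp; }

    // sub RSP,010h reserves 16 bytes, i.e. two slots
    void reserve(size_t slots) { rsp -= slots; }

    // pop RBP: only correct while RSP still points at the saved RBP
    void epiloguePop() { rbp = pop(); }

    // leave: mov RSP,RBP followed by pop RBP
    void leave() { rsp = cast(size_t) rbp; rbp = pop(); }
}

// Listing [1]: push RBP / mov RBP,RSP / pop RBP
Cpu runListing1()
{
    Cpu cpu;
    cpu.prologue();
    cpu.epiloguePop();
    return cpu;
}

// Listing [2]: push RBP / mov RBP,RSP / sub RSP,010h / leave
Cpu runListing2()
{
    Cpu cpu;
    cpu.prologue();
    cpu.reserve(2);
    cpu.leave();
    return cpu;
}
```

In this model both listings finish with RSP and RBP back at their entry values. Calling epiloguePop() right after reserve(2) would instead pop one of the reserved slots into RBP, which is why [2] has to restore RSP first, here via leave.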

On 1/27/13 8:57 AM, David Nadlinger wrote:
> On Saturday, 26 January 2013 at 20:42:27 UTC, Tyro[17] wrote:
>> Trying to learn from the ground up and would appreciate some
>> assistance making sense of the following:
>>
>> // void main(){} [1]
>> [...]
>
> This might not be directly relevant here, but in general, I'd steer
> clear of main() for such experiments. Due to its special nature (several
> possible signatures, but the calling code in druntime stays the same),
> there is quite a bit of extra magic going on internally.
>
> For example, you wouldn't normally find "xor EAX, EAX" in a
> void-returning function, as its purpose is to set the return value to
> 0, which implicitly happens to make the void main() and void
> main(string[]) variants conform to the "full" int main(string[]) signature.
>
> David

Thank you much. Things are starting to get a little clearer.

Oh, thanks Vlad.
