March 17, 2007
Frits van Bommel wrote:

> Lars Ivar Igesund wrote:
>> Frits van Bommel wrote:
>> 
>>> [2]: At least, I presume this discussion was brought on by the non-portable code in tango.text.convert.Layout?
>> 
>> Rather that that code breaks because GDC doesn't follow the D ABI :)
> 
> You may notice that the same text that specifies _argptr should be a
> void* then goes on to say[1] "To protect against the vagaries of stack
> layouts on different CPU architectures, use std.stdarg to access the
> variadic arguments".
> Doesn't this pretty much mean there's no good reason for _argptr to be a
> void* instead of a CPU-specific va_list type as declared by std.stdarg,
> which would be aliased to whatever the appropriate type for the CPU
> architecture is (i.e. gcc.builtins.__builtin_va_list for GDC, void* for
> DMD)?
> 
> In short: I don't think GDC is broken here; the ABI is just still a bit x86-centric and is the one that should be fixed. There is simply no good reason to decide in the standard that _argptr must be a pointer. It may be simpler (compiler) implementation-wise, but if you can't portably use the pointer anyway you may as well leave _argptr's type up to the compiler and just specify it as std.stdarg.va_list, whatever that may be.

If there is an actual win in having _argptr as a pointer as opposed to the current behaviour, for instance the possibility of easily indexing into the argument list, then there is a good reason to have it in the specification.

It is possible that other ways to have this working can be devised, but not being able to index into them portably, when they in any case need to be in the correct order, seems to me to be silly.

-- 
Lars Ivar Igesund
blog at http://larsivi.net
DSource, #d.tango & #D: larsivi
Dancing the Tango
March 17, 2007
Lars Ivar Igesund wrote:
> If there is an actual win in having _argptr as a pointer as opposed to the
> current behaviour, for instance the possibility of easily indexing into the
> argument list, then there is a good reason to have it in the specification.
> 
> It is possible that other ways to have this working can be devised, but not
> being able to index into them portably, when they in any case need to be in
> the correct order, seems to me to be silly.

I agree that indexing is important to have, especially given how Tango's format strings work.
However I don't think that justifies the language standard basically requiring that va_list is a pointer, especially if it can't be portably used without special standard library functions (i.e. va_arg and friends) anyway.
If on a given architecture a pointer happens to be the most efficient or natural implementation, fine. But I think that decision should be up to the compiler, not the language.
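
For illustration only, here is one way indexing can be done without assuming anything about how va_list is represented. This is a sketch in C rather than D (C's <stdarg.h> stands in for std.stdarg), the names nth_int and pick_int are made up for this example, and it assumes every variadic argument is an int.

```c
#include <assert.h>
#include <stdarg.h>

/* Hypothetical helper: return the argument at position `index` from a list
   assumed to hold only ints. It walks a copy of the list with va_arg, so it
   never relies on va_list being a pointer. */
int nth_int(va_list args, int index)
{
    va_list walker;
    int value = 0;

    assert(index >= 0);
    va_copy(walker, args);              /* leave the caller's list untouched */
    for (int i = 0; i <= index; i++)
        value = va_arg(walker, int);
    va_end(walker);
    return value;
}

/* Hypothetical variadic front end: `count` ints follow the named parameters.
   Returns 0 when `index` is out of range. */
int pick_int(int index, int count, ...)
{
    va_list args;
    int result = 0;

    va_start(args, count);
    if (index >= 0 && index < count)
        result = nth_int(args, index);
    va_end(args);
    return result;
}
```

The cost is a linear walk per lookup, which is the price of not knowing where the arguments live.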
March 17, 2007
Frits van Bommel wrote:
> Sean Kelly wrote:
>>
>> So they define an ABI for C to facilitate interoperability across compilers.  This is laudable, but it has nothing to do with calling D functions.
> 
> At the moment it does, since GDC 0.23 seems to follow it on amd64...
> (IIRC GDC just passes the context/this pointer (if any) as a hidden first argument, ensuring it'll always be in a register)
> I would wholly support using a slight variation when it comes to varargs though, to put them in a contiguous area on the stack.

Same here.  I'm actually fine with the calling convention otherwise, and the D spec doesn't address AMD64 anyway.  A naive translation might be to require the first argument in RAX and the rest on the stack, just as in x86, but given that there are so many new registers available it does seem reasonable to use them.

> It's slightly better in that only one set of registers is used for vararg parameters (vararg floats & doubles are passed in general-purpose registers) but that's about it. This may fix the immediate problem[2] though: parameter location is not so much dependent on type anymore, as long as D's value types remain free of non-trivial copy constructors & destructors. That means it may be possible to determine an argument location based solely on its size and position in the list.

But the argument would still not be addressable, assuming it's a vararg.  For D functions, I think it makes sense to follow the AMD64/C calling convention for normal function parameters but to always pass varargs on the stack.  The alternatives are just too messy from a user perspective.

> Oh, but in order to be able to use this for amd64 you'd also have to define a new register mapping. (Note that IA-64 has *way* more registers than amd64, and seems to do some funky magic with them to rename most of them on calls)

I actually like the IA-64 spec--it has a well-defined memory model and seems well considered overall.  But experience has shown that it was too big a move for the market.  Still, Intel has retaken control of the market with Core, and it seems to be moving towards IA-64 in terms of how it works.  But things are interesting everywhere.  SSE128 is even coming soon, which should restore high-precision floating point as a solid option on new PCs.  FWIW, there is a pretty good discussion of upcoming CPU features here:

http://www.anandtech.com/cpuchipsets/showdoc.aspx?i=2939

> [2]: At least, I presume this discussion was brought on by the non-portable code in tango.text.convert.Layout?

It was.  That code is correct according to the D spec, but obviously doesn't work on GDC/AMD64.  The only reasonable option right now is to perform two passes across the vararg list--the first to compute the required size for a dynamic buffer, and the second to copy all vararg data into that buffer.  This isn't terrible as a temporary fix (perhaps wrapped in a version(GCC) block), but it isn't something we really want to do as a long-term solution.
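
A rough sketch of that two-pass workaround, written in C rather than D so that it only depends on <stdarg.h>. Everything named here is hypothetical: pack_args, code_size, and the type codes 'i' and 'd', which stand in for the TypeInfo array a D variadic function receives. It is not the Tango code.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdlib.h>
#include <string.h>

/* Size in bytes of one packed argument; 0 for an unknown code. */
size_t code_size(char code)
{
    switch (code) {
    case 'i': return sizeof(int);
    case 'd': return sizeof(double);
    default:  return 0;
    }
}

/* Copy the variadic arguments described by `codes` into one malloc'd buffer.
   offsets[i] receives the byte offset of argument i, so the caller can reach
   arguments in any order. Returns NULL on an unknown code or failed malloc. */
void *pack_args(const char *codes, size_t *offsets, ...)
{
    size_t total = 0;
    size_t n;

    assert(codes != NULL && offsets != NULL);
    n = strlen(codes);

    /* Pass 1: work out each offset and the total size from the codes alone. */
    for (size_t i = 0; i < n; i++) {
        size_t size = code_size(codes[i]);
        if (size == 0)
            return NULL;
        offsets[i] = total;
        total += size;
    }

    unsigned char *buffer = malloc(total ? total : 1);
    if (buffer == NULL)
        return NULL;

    /* Pass 2: fetch each argument with va_arg and copy it into place. */
    va_list args;
    va_start(args, offsets);
    for (size_t i = 0; i < n; i++) {
        if (codes[i] == 'i') {
            int v = va_arg(args, int);
            memcpy(buffer + offsets[i], &v, sizeof v);
        } else {
            double v = va_arg(args, double);
            memcpy(buffer + offsets[i], &v, sizeof v);
        }
    }
    va_end(args);
    return buffer;
}
```

Once packed, every argument is addressable in ordinary memory regardless of how the calling convention delivered it. The extra allocation and copy are why this only looks acceptable as a stopgap.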

I'm beginning to think that some time should be spent on an AMD64 version of the D ABI, even if GDC is the only compiler it matters for at the moment.  In fact, the more I think about this the more I feel that D needs another compiler in development for comparison.  I'm starting to really wish I had time to play with Microsoft's Phoenix.  Maybe once Tango goes 1.0 :-p


Sean
March 17, 2007
Frits van Bommel wrote:
> Lars Ivar Igesund wrote:
>> If there is an actual win in having _argptr as a pointer as opposed to the
>> current behaviour, for instance the possibility of easily indexing into the
>> argument list, then there is a good reason to have it in the specification.
>>
>> It is possible that other ways to have this working can be devised, but not
>> being able to index into them portably, when they in any case need to be in
>> the correct order, seems to me to be silly.
> 
> I agree that indexing is important to have, especially given how Tango's format strings work.
> However I don't think that justifies the language standard basically requiring that va_list is a pointer, especially if it can't be portably used without special standard library functions (i.e. va_arg and friends) anyway.
> If on a given architecture a pointer happens to be the most efficient or natural implementation, fine. But I think that decision should be up to the compiler, not the language.

For what it's worth, I think the compiler could likely optimize many uses of register-based varargs down to single-instruction operations. Addressing is the only issue that's a problem.  However, it's a sufficiently large one that I'm not convinced that passing varargs via registers is truly a good idea.


Sean
March 17, 2007
Sean Kelly wrote:
> Frits van Bommel wrote:
>> It's slightly better in that only one set of registers is used for vararg parameters (vararg floats & doubles are passed in general-purpose registers) but that's about it. This may fix the immediate problem[2] though: parameter location is not so much dependent on type anymore, as long as D's value types remain free of non-trivial copy constructors & destructors. That means it may be possible to determine an argument location based solely on its size and position in the list.
> 
> But the argument would still not be addressable, assuming it's a vararg.  For D functions, I think it makes sense to follow the AMD64/C calling convention for normal function parameters but to always pass varargs on the stack.  The alternatives are just too messy from a user perspective.

I said "slightly" for a reason ;). But I do think this means it may be at least *possible* to do it without distinguishing arguments on anything other than .tsize() and argument position. Though I may be wrong when it comes to aggregate stack-based types, such as structs/unions and dynamic array references. If those are split up like in the amd64 version, we'd also need TypeInfo to communicate the TypeInfo for the fields for this to work.
Of course, for this to be usable in any portable way we'd also need some abstraction layer over this that uses TypeInfo instead of template arguments, and returns a pointer to memory storing the extracted data. That memory would of course have to be allocated somehow. Perhaps something like a "void* va_arg()(TypeInfo, inout va_list, void[] buffer = null)" overload? (following the idiom I've seen in Tango: an optional argument containing a buffer, that's re-allocated if empty or too small)
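
To make the shape of such an overload concrete, here is a sketch in C rather than D. The enum arg_kind is a hypothetical stand-in for TypeInfo, and va_arg_dynamic and sum_tagged are invented names. The optional buffer is used when it is large enough; otherwise storage is malloc'd and the caller frees it.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical runtime type tag, standing in for D's TypeInfo. */
enum arg_kind { ARG_INT, ARG_DOUBLE };

size_t arg_kind_size(enum arg_kind kind)
{
    return kind == ARG_INT ? sizeof(int) : sizeof(double);
}

/* Extract the next argument of the given kind from *args and return a
   pointer to memory holding it. `buffer` is used if it is non-NULL and big
   enough; otherwise the result is malloc'd (returned pointer != buffer). */
void *va_arg_dynamic(enum arg_kind kind, va_list *args,
                     void *buffer, size_t buffer_size)
{
    size_t size = arg_kind_size(kind);
    void *dest = buffer;

    assert(args != NULL);
    if (dest == NULL || buffer_size < size) {
        dest = malloc(size);
        if (dest == NULL)
            return NULL;
    }
    if (kind == ARG_INT) {
        int v = va_arg(*args, int);
        memcpy(dest, &v, sizeof v);
    } else {
        double v = va_arg(*args, double);
        memcpy(dest, &v, sizeof v);
    }
    return dest;
}

/* Example caller: `count` pairs of (kind, value) follow; returns their sum. */
double sum_tagged(int count, ...)
{
    va_list args;
    double total = 0.0;
    unsigned char scratch[sizeof(double)];

    va_start(args, count);
    for (int i = 0; i < count; i++) {
        enum arg_kind kind = (enum arg_kind)va_arg(args, int);
        void *p = va_arg_dynamic(kind, &args, scratch, sizeof scratch);
        if (p == NULL)
            break;
        if (kind == ARG_INT) {
            int v;
            memcpy(&v, p, sizeof v);
            total += v;
        } else {
            double v;
            memcpy(&v, p, sizeof v);
            total += v;
        }
        if (p != scratch)
            free(p);
    }
    va_end(args);
    return total;
}
```

The point of the sketch is that the caller never learns where the argument came from, only where the copy ended up.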

But a purely stack-based approach when it comes to varargs would seem to clearly be superior. At least, on architectures I know of.
If any future architecture (or a current one I'm unaware of) allows, for instance, register indexing (i.e. put the contents of register number N into r10, where N is the value of rax) then on that architecture it may be worth it to use register-based varargs. I don't know if that's likely to happen (though I rather doubt it) but for this sort of reason I don't think the spec should limit varargs to memory-based calling conventions.
Especially, as I've mentioned quite often by now, since you can't use the pointer directly anyway in a portable manner.

>> Oh, but in order to be able to use this for amd64 you'd also have to define a new register mapping. (Note that IA-64 has *way* more registers than amd64, and seems to do some funky magic with them to rename most of them on calls)
> 
> I actually like the IA-64 spec

I never said I didn't like it, just that register renaming seemed a bit strange to me. I only read about it today, so I have no idea how well this works (nor do I have an IA-64 computer to run benchmarks on :) ) so I don't have an opinion on it other than at first sight it looked a bit weird...

[snip]
> I'm beginning to think that some time should be spent on an AMD64 version of the D ABI, even if GDC is the only compiler it matters for at the moment.

This may well be a good idea. But who would write it? If it's David it might just become GDC documentation, but if it's Walter or anyone else it might be flawed because of inexperience with GCC or general compiler internals and/or the amd64 CPU. (nothing personal, just speculating)
To make sure all aspects are thought of, you'd likely need to form a committee, but design-by-committee is often a bad idea...

Personally, my first instinct would be to just copy the 32-bit spec pretty closely for the data layout, but base the function calling convention on the amd64 C one (that GDC currently seems to use) with the exception that varargs are passed on the stack.
Though perhaps there could be some modifications made to data layout for classes while we're at it.
For one thing, field reordering can decrease space consumption by putting fields with larger .alignof at the front (on a per-class basis, but perhaps filling up 'holes' at the end of base classes as well).
Also, the implementation of interfaces may merit some thought.
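
The field-reordering point can be seen with plain C structs, used here only as an analogy since D class layout is a separate matter. The struct and function names are made up for this example. On a typical x86-64 C ABI the two structs come out at 24 and 16 bytes, but the exact figures are ABI-dependent.

```c
#include <assert.h>
#include <stddef.h>

/* Fields in declaration order: padding is typically inserted before `b`
   and after `c` so that the double stays aligned. */
struct declared_order {
    char   a;
    double b;
    char   c;
};

/* The same fields with the largest alignment first, so the two chars can
   share what would otherwise be padding. */
struct alignment_order {
    double b;
    char   a;
    char   c;
};

/* Bytes saved by the reordering on the current ABI (0 if none). */
size_t bytes_saved_by_reordering(void)
{
    size_t before = sizeof(struct declared_order);
    size_t after = sizeof(struct alignment_order);

    /* Sanity check: a struct is at least as large as the sum of its fields. */
    assert(after >= sizeof(double) + 2 * sizeof(char));
    return before > after ? before - after : 0;
}
```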

And then there are of course all the areas not documented in the current ABI, like the names and signatures of internal functions, RTTI, static (~)this(), unit tests, etc.
(interfaces also fall into this category but were already mentioned above)

> In fact, the more I think about this the more I feel that D
> needs another compiler in development for comparison.  I'm starting to really wish I had time to play with Microsoft's Phoenix.  Maybe once Tango goes 1.0 :-p

LLVM has also been previously mentioned as an option.
March 17, 2007
Frits van Bommel wrote:
> 
> But a purely stack-based approach when it comes to varargs would seem to clearly be superior. At least, on architectures I know of.
> If any future architecture (or a current one I'm unaware of) allows, for instance, register indexing (i.e. put the contents of register number N into r10, where N is the value of rax) then on that architecture it may be worth it to use register-based varargs. I don't know if that's likely to happen (though I rather doubt it) but for this sort of reason I don't think the spec should limit varargs to memory-based calling conventions.
> Especially, as I've mentioned quite often by now, since you can't use the pointer directly anyway in a portable manner.

Good point.  I think it would also be possible for the compiler to detect the addressing of varargs and generate code to push that data onto the stack.  This optimization wouldn't be much more difficult than codegen for direct-accessing varargs in registers (or so it seems).

> [snip]
>> I'm beginning to think that some time should be spent on an AMD64 version of the D ABI, even if GDC is the only compiler it matters for at the moment.
> 
> This may well be a good idea. But who would write it? If it's David it might just become GDC documentation, but if it's Walter or anyone else it might be flawed because of inexperience with GCC or general compiler internals and/or the amd64 CPU. (nothing personal, just speculating)
> To make sure all aspects are thought of, you'd likely need to form a committee, but design-by-committee is often a bad idea...

I figured Walter would write it based on the AMD64 spec and feedback from the community.  It probably couldn't be set in stone, however, since some issues never seem to be discovered until implementation time.

> Personally, my first instinct would be to just copy the 32-bit spec pretty closely for the data layout, but base the function calling convention on the amd64 C one (that GDC currently seems to use) with the exception that varargs are passed on the stack.

Mine as well.  There is really no reason why the AMD64 spec for D shouldn't largely mirror the one for C in terms of parameter passing. The rules for register use do seem somewhat complicated, but doing so is more efficient than stack operations.  I can see it wreaking havoc with naked asm code however.  As it is, I don't use naked asm in extern (D) functions because of the optional use of EAX--I can't imagine trying to sort out what parameter is where under the new rules.


Sean
March 17, 2007
Frits van Bommel wrote:
>>>> kris wrote:
>>>>
>>>>> The D spec says:
>>>>>
>>>>> "The implementations of these variadic functions have a special local variable declared for them, _argptr, which is a void* pointer to the first of the variadic arguments. To access the arguments, _argptr must be cast to a pointer to the expected argument type"
>>>>>
>>>>>
>>>>> To me, this means that a D compiler must implement _argptr in these terms. Is that indeed the case, Walter? Or is my interpretation incorrect?
[snip]
> [2]: At least, I presume this discussion was brought on by the non-portable code in tango.text.convert.Layout?

Indeed, although non-portable holds true only if you think in C instead of D. According to the D documentation that code is written "correctly", and thus it ought to be portable.

Walter, it would be helpful if you'd clear the air on this one? Is _argptr expected to be memory-based/addressable (per the documentation)?

- Kris
March 17, 2007
kris wrote:
> Frits van Bommel wrote:
>>>>> kris wrote:
>>>>>
>>>>>> The D spec says:
>>>>>>
>>>>>> "The implementations of these variadic functions have a special local variable declared for them, _argptr, which is a void* pointer to the first of the variadic arguments. To access the arguments, _argptr must be cast to a pointer to the expected argument type"
>>>>>>
>>>>>>
>>>>>> To me, this means that a D compiler must implement _argptr in these terms. Is that indeed the case, Walter? Or is my interpretation incorrect?
> [snip]
>> [2]: At least, I presume this discussion was brought on by the non-portable code in tango.text.convert.Layout?
> 
> Indeed, although non-portable holds true only if you think in C instead of D. According to the D documentation that code is written "correctly", and thus it ought to be portable.

Assuming _argptr is a pointer and dereferencing it may be according to the spec as it is currently written, but that's not all the code does.
It also *increments* the pointer, at which time it _assumes_ all arguments are aligned at a multiple of int.sizeof (4). Even if varargs were entirely stack-based, that would likely break in the case of amd64 since the most natural alignment for most parameters would be 8 (aka size_t.sizeof and (void*).sizeof). That's the size of anything a 'push' pushes and anything a 'pop' pops, as well as the size of return addresses. This could be 'fixed' for this case by using size_t or void*, but even that isn't guaranteed to work anywhere else.
Note that incrementing _argptr manually is discouraged:
"To protect against the vagaries of stack layouts on different CPU architectures, use std.stdarg to access the variadic arguments"
Nowhere on that page (that I can see) is the minimum alignment for arguments specified.
So even given that dereferencing the pointer may be portable, there's no way to portably increment it without using va_arg.
In fact, AFAICT it's not even guaranteed you need increment instead of decrement to get to the next argument (other than considerable difficulties / impossibility of implementing va_arg for that case).
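
To put numbers on the alignment problem, here is a small C sketch. It assumes a hypothetical argument area that is contiguous, grows upward, and pads every argument to a multiple of one fixed slot size. None of that is promised by the spec, which is exactly the issue. The names padded_size and arg_offset are invented for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes an argument of `arg_size` bytes occupies when every argument is
   padded to a multiple of `slot` bytes. `slot` must be a power of two. */
size_t padded_size(size_t arg_size, size_t slot)
{
    assert(slot != 0 && (slot & (slot - 1)) == 0);
    return (arg_size + slot - 1) & ~(slot - 1);
}

/* Byte offset of argument `index` in the assumed argument area, given the
   unpadded size of each argument that precedes it. */
size_t arg_offset(const size_t *sizes, size_t index, size_t slot)
{
    size_t offset = 0;

    for (size_t i = 0; i < index; i++)
        offset += padded_size(sizes[i], slot);
    return offset;
}
```

For arguments of sizes 1, 4 and 8 bytes, the third one sits at offset 8 with 4-byte slots and at offset 16 with 8-byte slots. Code that hard-wires int.sizeof as the step therefore lands in the wrong place as soon as the slot size changes.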

So even though GDC may be going against the spec here, I'm still pretty sure that piece of code is not portable.
March 17, 2007
Frits van Bommel wrote:
> kris wrote:
> 
>> Frits van Bommel wrote:
>>
>>>>>> kris wrote:
>>>>>>
>>>>>>> The D spec says:
>>>>>>>
>>>>>>> "The implementations of these variadic functions have a special local variable declared for them, _argptr, which is a void* pointer to the first of the variadic arguments. To access the arguments, _argptr must be cast to a pointer to the expected argument type"
>>>>>>>
>>>>>>>
>>>>>>> To me, this means that a D compiler must implement _argptr in these terms. Is that indeed the case, Walter? Or is my interpretation incorrect?
>>
>> [snip]
>>
>>> [2]: At least, I presume this discussion was brought on by the non-portable code in tango.text.convert.Layout?
>>
>>
>> Indeed, although non-portable holds true only if you think in C instead of D. According to the D documentation that code is written "correctly", and thus it ought to be portable.
> 
> 
> Assuming _argptr is a pointer and dereferencing it may be according to the spec as it is currently written, but that's not all the code does.
> It also *increments* the pointer, at which time it _assumes_ all arguments are aligned at a multiple of int.sizeof (4). Even if varargs were entirely stack-based, that would likely break in the case of amd64 since the most natural alignment for most parameters would be 8 (aka size_t.sizeof and (void*).sizeof). That's the size of anything a 'push' pushes and anything a 'pop' pops, as well as the size of return addresses. This could be 'fixed' for this case by using size_t or void*, but even that isn't guaranteed to work anywhere else.
> Note that incrementing _argptr manually is discouraged:
> "To protect against the vagaries of stack layouts on different CPU architectures, use std.stdarg to access the variadic arguments"
> Nowhere on that page (that I can see) is the minimum alignment for arguments specified.
> So even given that dereferencing the pointer may be portable, there's no way to portably increment it without using va_arg.
> In fact, AFAICT it's not even guaranteed you need increment instead of decrement to get to the next argument (other than considerable difficulties / impossibility of implementing va_arg for that case).
> 
> So even though GDC may be going against the spec here, I'm still pretty sure that piece of code is not portable.

That's why "correctly" was quoted, Frits. I think you'll find a similar concern in the Phobos stdarg? There's a couple of issues going on here, and it would be useful to tease them apart first. Let's at least figure out if the D doc is correct first, before we get all pedantic? :)
March 17, 2007
kris wrote:
> Frits van Bommel wrote:
>> kris wrote:
>>> Indeed, although non-portable holds true only if you think in C instead of D. According to the D documentation that code is written "correctly", and thus it ought to be portable.
>>
>>
>> Assuming _argptr is a pointer and dereferencing it may be according to the spec as it is currently written, but that's not all the code does.
>> It also *increments* the pointer, at which time it _assumes_ all arguments are aligned at a multiple of int.sizeof (4). Even if varargs were entirely stack-based, that would likely break in the case of amd64 since the most natural alignment for most parameters would be 8 (aka size_t.sizeof and (void*).sizeof).
[snip]
>> So even though GDC may be going against the spec here, I'm still pretty sure that piece of code is not portable.
> 
> That's why "correctly" was quoted, Frits.

I thought that to mean the code followed the spec to the letter but still failed, while IMHO it didn't follow the spec exactly (even though that wasn't the direct reason it failed on amd64).

> I think you'll find a similar concern in the Phobos stdarg?

If used with GDC, yes. But the current implementation is just fine when it comes to DMD, since that only compiles for 32-bit x86-family processors.
You'll note that GDC distributes a different version :) (and has some special-case magic for the module).

> There's a couple of issues going on here,
> and it would be useful to tease them apart first. Let's at least figure out if the D doc is correct first, before we get all pedantic? :)

And how do you suggest we figure that out, except by waiting for Walter? And if that would be your suggestion, we may as well work ahead on the pedantic-ness until he comes by ;). (It's not like there's anything good on tv tonight over here...)