March 16, 2007
[Fair warning: this seems to have become a pretty big post]

Sean Kelly wrote:
> Frits van Bommel wrote:
>>
>> I attached a patch, though I'm not sure how much use it'll be if you insist on your own version of std.c.stdarg ...
> 
> Thanks.  I think the stdarg issues should now be resolved  [...]

Sorry to disappoint again.
Unfortunately, build-gdc.sh only compiles files in lib/ (and some in std/ imported by them). These compile successfully with my patch, AFAICT.

The stuff in tango/ doesn't work quite as well though.
For instance, it turns out tango.text.convert.Layout is pretty screwed in the current implementation. Again, this has to do with varargs.

At the top it has:
-----
/*******************************************************************************

        Platform issues ...

*******************************************************************************/

version (DigitalMars)
         alias void* Arg;
   else
      alias char* Arg;
-----
and it then proceeds to use Arg to receive va_list arguments. Well, I thought this would be an easy fix:
-----
private import tango.core.Vararg;
alias va_list Arg;
-----

Unfortunately, that doesn't work. Specifically because of how Arg is used in "public final uint convert (Sink sink, TypeInfo[] arguments, Arg args, T[] formatStr)":
-----
Arg[64] arglist = void;
foreach (i, arg; arguments)
	{
	arglist[i] = args;
	args += (arg.tsize + int.sizeof - 1) & ~ (int.sizeof - 1);
	}

return parse (formatStr, arguments, arglist, sink);
-----
Here it looks like it tries to manually extract the arguments, assuming they all lie (on the stack) at int.sizeof boundaries. While this may work on (32-bit?) platforms where varargs are all passed on the stack like x86 with C or D calling convention[1], on amd64 this is not the case: va_list isn't a pointer and the variable arguments aren't necessarily all on the stack.

Since it seems GDC basically uses the C calling convention, I'll give a short description of the problems I see with it:
The C(++?) calling convention on amd64[2] passes the first several parameters in registers, and it seems not to deviate from this in case of varargs. Worse, it uses different sets of registers for different types of arguments (6 general-purpose registers for integer types & pointers etc., 8 SSE regs for float/double, and the rest in memory). Aggregates are potentially split up(!) if certain conditions are met (< 16 bytes, all members are naturally aligned, fits entirely into registers, and a few more).

Basically, what I'm trying to say is: This would most likely be hell to implement manually, and the code using it still unportable to yet other calling conventions.
IMHO what we really need to do is either something like educating TypeInfo to know how to "increment" a va_list to skip the appropriate type[3], or changing GDC to just pass varargs on the stack when compiling an extern(D) function, no matter the local C calling convention :).
In the mean time, perhaps this function (and any vararg functions calling it with their own va_list) could be changed to use variadic template args (I do hope none of these are overridden anywhere, or this won't work). This will likely result in some code bloat though, with most calls using their own private instantiation, so it's not a really nice long-term solution. It _should_ be pretty portable though :).


There were some other errors I less thoroughly investigated, but most of these look like they can probably easily be fixed (in the first two cases perhaps even by simply using the same code as for x86?):
* tango.sys.linux.socket static asserts(0) if version(X86) isn't defined, with comment "// Different values on other platforms." (the X86 branch defines a constant named "SOL_SOCKET", whatever that may be)
* tango.math.IEEE doesn't define the FPU masks for amd64 (just for x86 and PPC it seems).
* tango.text.Regex also seems to assume that va_list is a pointer to a 1-byte type instead of using va_start/va_arg/va_end.
* tango.stdc.posix.setjmp straight static asserts(false, "Architecture not supported.") on version(X86_64) (though the actual error is an undefined identifier used after that, presumably because static asserts are evaluated a bit late in the parsing process)


[1]: IIRC x86 GDC uses the same calling convention for both, and DMD uses a very similar convention in case of varargs.

[2]: As detailed in http://www.x86-64.org/documentation/abi.pdf, specifically the section "Parameter passing" on pages 15-22.

[3]: Or a va_arg variant that uses a run-time TypeInfo argument instead of a compile-time template type argument...
March 16, 2007
Frits van Bommel wrote:
[snip]

Thanks for digging into this, Frits

> [Fair warning: this seems to have become a pretty big post]
> Unfortunately, that doesn't work. Specifically because of how Arg is used in "public final uint convert (Sink sink, TypeInfo[] arguments, Arg args, T[] formatStr)":
> -----
> Arg[64] arglist = void;
> foreach (i, arg; arguments)
>     {
>     arglist[i] = args;
>     args += (arg.tsize + int.sizeof - 1) & ~ (int.sizeof - 1);
>     }
> 
> return parse (formatStr, arguments, arglist, sink);
> -----
> Here it looks like it tries to manually extract the arguments, assuming they all lie (on the stack) at int.sizeof boundaries. While this may work on (32-bit?) platforms where varargs are all passed on the stack like x86 with C or D calling convention[1], on amd64 this is not the case: va_list isn't a pointer and the variable arguments aren't necessarily all on the stack.

Aye, it is intended to make each argument indexable. Only works on 32-bit platform, making the same assumptions as the code in std.stdarg

> 
> Since it seems GDC basically uses the C calling convention, I'll give a short description of the problems I see with it:
> The C(++?) calling convention on amd64[2] passes the first several parameters in registers, and it seems not to deviate from this in case of varargs. Worse, it uses different sets of registers for different types of arguments (6 general-purpose registers for integer types & pointers etc., 8 SSE regs for float/double, and the rest in memory). Aggregates are potentially split up(!) if certain conditions are met (< 16 bytes, all members are naturally aligned, fits entirely into registers, and a few more).

stdarg.d would fail also. That is not good :(

[snip]
March 16, 2007
kris wrote:
> Frits van Bommel wrote:
> [snip]
> 
> Thanks for digging into this, Frits

No problem. I assure you, it's mostly for selfish motives ;). I just want it to work without having to resort to 32-bit compilers on a 64-bit system.
(Though I get by with Phobos just fine)

>> [Fair warning: this seems to have become a pretty big post]
>> Unfortunately, that doesn't work. Specifically because of how Arg is used in "public final uint convert (Sink sink, TypeInfo[] arguments, Arg args, T[] formatStr)":
>> -----
>> Arg[64] arglist = void;
>> foreach (i, arg; arguments)
>>     {
>>     arglist[i] = args;
>>     args += (arg.tsize + int.sizeof - 1) & ~ (int.sizeof - 1);
>>     }
>>
>> return parse (formatStr, arguments, arglist, sink);
>> -----
>> Here it looks like it tries to manually extract the arguments, assuming they all lie (on the stack) at int.sizeof boundaries. While this may work on (32-bit?) platforms where varargs are all passed on the stack like x86 with C or D calling convention[1], on amd64 this is not the case: va_list isn't a pointer and the variable arguments aren't necessarily all on the stack.
> 
> Aye, it is intended to make each argument indexable.

That's what I thought.
I actually tried to change it to a more portable equivalent but gave up once I realized that wasn't easily done without having access to the actual types of the parameters at compile time.

> Only works on
> 32-bit platform, making the same assumptions as the code in std.stdarg

AFAICT only DMD's version of std.stdarg makes those assumptions. (And it technically has every right to, since DMD is currently 32-bit only)

>> Since it seems GDC basically uses the C calling convention, I'll give a short description of the problems I see with it:
>> The C(++?) calling convention on amd64[2] passes the first several parameters in registers, and it seems not to deviate from this in case of varargs. Worse, it uses different sets of registers for different types of arguments (6 general-purpose registers for integer types & pointers etc., 8 SSE regs for float/double, and the rest in memory). Aggregates are potentially split up(!) if certain conditions are met (< 16 bytes, all members are naturally aligned, fits entirely into registers, and a few more).
> 
> stdarg.d would fail also. That is not good :(

It shouldn't for the GDC version, I think.
Note these snippets from the code for std.stdarg distributed with GDC:
---
    // va_arg is handled magically by the compiler
// [...]
template va_arg(T)
{
    T va_arg(inout va_list _argptr)
    {
	/*
	T arg = *cast(T*)_argptr;
	_argptr = _argptr + ((T.sizeof + int.sizeof - 1) & ~(int.sizeof - 1));
	return arg;
	*/
	T t; return t;
    }
}
---
Note how little va_arg does (just enough to compile), and the comment mentioning special compiler magic that presumably fixes it to do the Right Thing(TM).

I haven't tried it out though, other than using writefln() in programs and getting the expected output :).
March 16, 2007
Frits van Bommel wrote:
> [Fair warning: this seems to have become a pretty big post]
> 
> Sean Kelly wrote:
>> Frits van Bommel wrote:
>>>
>>> I attached a patch, though I'm not sure how much use it'll be if you insist on your own version of std.c.stdarg ...
>>
>> Thanks.  I think the stdarg issues should now be resolved  [...]
> 
> Sorry to disappoint again.
> Unfortunately, build-gdc.sh only compiles files in lib/ (and some in std/ imported by them). These compile successfully with my patch, AFAICT.

Is this related to the stdarg issues?  Just want to make sure I understand what you meant by "sorry to disappoint."
> 
> The stuff in tango/ doesn't work quite as well though.
> For instance, it turns out tango.text.convert.Layout is pretty screwed in the current implementation. Again, this has to do with varargs.

Saw that.  I passed the issue to Kris, but I suspect we may have to work together on this one.

> There were some other errors I less thoroughly investigated, but most of these look like they can probably easily be fixed (in the first two cases perhaps even by simply using the same code as for x86?):
> * tango.sys.linux.socket static asserts(0) if version(X86) isn't defined, with comment "// Different values on other platforms." (the X86 branch defines a constant named "SOL_SOCKET", whatever that may be)
> * tango.math.IEEE doesn't define the FPU masks for amd64 (just for x86 and PPC it seems).
> * tango.text.Regex also seems to assume that va_list is a pointer to a 1-byte type instead of using va_start/va_arg/va_end.

Thanks.  We'll look into these.  And please feel free to submit tickets for these or any other problem you find.

> * tango.stdc.posix.setjmp straight static asserts(false, "Architecture not supported.") on version(X86_64) (though the actual error is an undefined identifier used after that, presumably because static asserts are evaluated a bit late in the parsing process)

This was deliberate, as a clear and obvious reminder to expand support as needed.  But perhaps this should be allowed to compile, accompanied by a pragma(msg) that Fibers won't work?  That said, if someone wants to pass on the relevant setjmp headers for a 64-bit glibc then I'll see about expanding support.  With this in mind, I don't suppose GDC yet supports inline ASM for the new 64-bit registers, etc?


Sean
March 17, 2007
Sean Kelly wrote:
> Frits van Bommel wrote:
>> [Fair warning: this seems to have become a pretty big post]
>>
>> Sean Kelly wrote:
>>> Frits van Bommel wrote:
>>>>
>>>> I attached a patch, though I'm not sure how much use it'll be if you insist on your own version of std.c.stdarg ...
>>>
>>> Thanks.  I think the stdarg issues should now be resolved  [...]
>>
>> Sorry to disappoint again.
>> Unfortunately, build-gdc.sh only compiles files in lib/ (and some in
>> std/ imported by them). These compile successfully with my patch, AFAICT.
> 
> Is this related to the stdarg issues?  Just want to make sure I understand what you meant by "sorry to disappoint."

Well, the biggest issue was definitely vararg-related, yeah. And specifically unportable assumptions made in regards to them.

>> The stuff in tango/ doesn't work quite as well though.
>> For instance, it turns out tango.text.convert.Layout is pretty screwed
>> in the current implementation. Again, this has to do with varargs.
> 
> Saw that.  I passed the issue to Kris, but I suspect we may have to work together on this one.
> 
>> There were some other errors I less thoroughly investigated, but most
>> of these look like they can probably easily be fixed (in the first two
>> cases perhaps even by simply using the same code as for x86?):
>> * tango.sys.linux.socket static asserts(0) if version(X86) isn't
>> defined, with comment "// Different values on other platforms." (the
>> X86 branch defines a constant named "SOL_SOCKET", whatever that may be)
>> * tango.math.IEEE doesn't define the FPU masks for amd64 (just for x86
>> and PPC it seems).
>> * tango.text.Regex also seems to assume that va_list is a pointer to a
>> 1-byte type instead of using va_start/va_arg/va_end.
> 
> Thanks.  We'll look into these.  And please feel free to submit tickets for these or any other problem you find.

I don't think I'll find any others until these are fixed, since all I
did was try to compile every .d file in the tango/ hierarchy with "find
-name *.d -exec gdc [...]" :).
Though that last one should be trivial to fix.

>> * tango.stdc.posix.setjmp straight static asserts(false, "Architecture not supported.") on version(X86_64) (though the actual error is an undefined identifier used after that, presumably because static asserts are evaluated a bit late in the parsing process)
> 
> This was deliberate, as a clear and obvious reminder to expand support as needed.

Consider yourself reminded :).

 > But perhaps this should be allowed to compile, accompanied
> by a pragma(msg) that Fibers won't work?

Oh, I didn't even know what it was used for, or that it was used by Tango itself at all. Like mentioned above, I just indiscriminately tried to compile everything...

 > That said, if someone wants to
> pass on the relevant setjmp headers for a 64-bit glibc then I'll see about expanding support.

You mean the ones attached? Or do you need any others?

 > With this in mind, I don't suppose GDC yet
> supports inline ASM for the new 64-bit registers, etc?

Not using the regular DMD-style with Intel syntax, IIRC. But the "extended asm" (using at&t syntax in strings) supports them, I think. Actually, if it's anything like regular GCC then it's almost literally dumped into the input sent to gas (the gnu assembler).


March 17, 2007
Frits van Bommel wrote:
> Sean Kelly wrote:
> 
>  > That said, if someone wants to
>> pass on the relevant setjmp headers for a 64-bit glibc then I'll see about expanding support.
> 
> You mean the ones attached? Or do you need any others?

That should do it.  Though if there are any related comments on what registers are stored in __jmp_buf, that would come in handy.  The ones I have for Linux contain this in bits/setjmp.h:

#if defined __USE_MISC || defined _ASM
# define JB_BX	0
# define JB_SI	1
# define JB_DI	2
# define JB_BP	3
# define JB_SP	4
# define JB_PC	5
# define JB_SIZE 24
#endif

Your headers don't contain that however, so I'm not entirely sure how to slice & dice jmp_buf to inject the proper information.  I'll google a bit and try to find out as well.

>  > With this in mind, I don't suppose GDC yet
>> supports inline ASM for the new 64-bit registers, etc?
> 
> Not using the regular DMD-style with Intel syntax, IIRC. But the "extended asm" (using at&t syntax in strings) supports them, I think. Actually, if it's anything like regular GCC then it's almost literally dumped into the input sent to gas (the gnu assembler).

Oh, I didn't know GDC supported this the at&t string syntax.  Interesting.


Sean
March 17, 2007
Sean Kelly wrote:
> Frits van Bommel wrote:
>> Sean Kelly wrote:
>>
>>  > That said, if someone wants to
>>> pass on the relevant setjmp headers for a 64-bit glibc then I'll see about expanding support.
>>
>> You mean the ones attached? Or do you need any others?
> 
> That should do it.  Though if there are any related comments on what registers are stored in __jmp_buf, that would come in handy.  The ones I have for Linux contain this in bits/setjmp.h:
> 
> #if defined __USE_MISC || defined _ASM
> # define JB_BX    0
> # define JB_SI    1
> # define JB_DI    2
> # define JB_BP    3
> # define JB_SP    4
> # define JB_PC    5
> # define JB_SIZE 24
> #endif
> 
> Your headers don't contain that however, so I'm not entirely sure how to slice & dice jmp_buf to inject the proper information.  I'll google a bit and try to find out as well.

Sorry, a "grep JB_ * -rn" didn't find anything...

[some time later]

I disassembled libc, this is the result:
0x00 rbx
0x08 rbp XOR POINTER_GUARD
0x10 r12
0x18 r13
0x20 r14
0x28 r15
0x30 (rsp before function call) XOR POINTER_GUARD
0x38 (return address) XOR POINTER_GUARD

Where POINTER_GUARD is the 8-byte value at fs:0x30.
Not sure why this is done, perhaps to avoid accidental dereferences?

later verified by apt-getting the source:
// from glibc-2.4/sysdeps/x86_64/jmpbuf-offsets.h
#define JB_RBX	0
#define JB_RBP	1
#define JB_R12	2
#define JB_R13	3
#define JB_R14	4
#define JB_R15	5
#define JB_RSP	6
#define JB_PC	7
#define JB_SIZE (8*8)

>>  > With this in mind, I don't suppose GDC yet
>>> supports inline ASM for the new 64-bit registers, etc?
>>
>> Not using the regular DMD-style with Intel syntax, IIRC. But the "extended asm" (using at&t syntax in strings) supports them, I think. Actually, if it's anything like regular GCC then it's almost literally dumped into the input sent to gas (the gnu assembler).
> 
> Oh, I didn't know GDC supported this the at&t string syntax.  Interesting.

I think I tried it a while back because I remembered that GCC passes the string along almost literally to the assembler, after failing to get the extended names to work in "regular" asm{} blocks.
March 17, 2007
Frits van Bommel wrote:
> Sean Kelly wrote:
>> Frits van Bommel wrote:
>>> Sean Kelly wrote:
>>>
>>>  > That said, if someone wants to
>>>> pass on the relevant setjmp headers for a 64-bit glibc then I'll see about expanding support.
>>>
>>> You mean the ones attached? Or do you need any others?
>>
>> That should do it.  Though if there are any related comments on what registers are stored in __jmp_buf, that would come in handy.  The ones I have for Linux contain this in bits/setjmp.h:
>>
>> #if defined __USE_MISC || defined _ASM
>> # define JB_BX    0
>> # define JB_SI    1
>> # define JB_DI    2
>> # define JB_BP    3
>> # define JB_SP    4
>> # define JB_PC    5
>> # define JB_SIZE 24
>> #endif
>>
>> Your headers don't contain that however, so I'm not entirely sure how to slice & dice jmp_buf to inject the proper information.  I'll google a bit and try to find out as well.
> 
> Sorry, a "grep JB_ * -rn" didn't find anything...
> 
> [some time later]
> 
> I disassembled libc, this is the result:
> 0x00 rbx
> 0x08 rbp XOR POINTER_GUARD
> 0x10 r12
> 0x18 r13
> 0x20 r14
> 0x28 r15
> 0x30 (rsp before function call) XOR POINTER_GUARD
> 0x38 (return address) XOR POINTER_GUARD
> 
> Where POINTER_GUARD is the 8-byte value at fs:0x30.
> Not sure why this is done, perhaps to avoid accidental dereferences?
> 
> later verified by apt-getting the source:
> // from glibc-2.4/sysdeps/x86_64/jmpbuf-offsets.h
> #define JB_RBX    0
> #define JB_RBP    1
> #define JB_R12    2
> #define JB_R13    3
> #define JB_R14    4
> #define JB_R15    5
> #define JB_RSP    6
> #define JB_PC    7
> #define JB_SIZE (8*8)

Perfect.  Thanks!


Sean
March 17, 2007
Frits van Bommel wrote:
> 
> However, copying GDC's std.c.stdarg module over it lets it compile a bit further. Unfortunately, it then chokes on something else:
> ---
> gcc -c -O -m32 core/ThreadASM.S -ocore/ThreadASM.o
> gcc -c -O -m32 stdc/wrap.c -ostdc/wrap.o
> In file included from /usr/include/features.h:346,
>                  from /usr/include/errno.h:29,
>                  from stdc/wrap.c:1:
> /usr/include/gnu/stubs.h:7:27: error: gnu/stubs-32.h: No such file or directory
> make[1]: *** [stdc/wrap.o] Error 1
> make[1]: Leaving directory `/home/urxae/opt/tango/svn-trunk/lib/common/tango'
> make: *** [lib] Error 2

Okay, I've changed the makefiles a bit.  Please let me know if you still see this error (or if I've created a new one).


Sean
March 17, 2007
Sean Kelly wrote:
> Frits van Bommel wrote:
>>
>> However, copying GDC's std.c.stdarg module over it lets it compile a bit further. Unfortunately, it then chokes on something else:
>> ---
>> gcc -c -O -m32 core/ThreadASM.S -ocore/ThreadASM.o
>> gcc -c -O -m32 stdc/wrap.c -ostdc/wrap.o
>> In file included from /usr/include/features.h:346,
>>                  from /usr/include/errno.h:29,
>>                  from stdc/wrap.c:1:
>> /usr/include/gnu/stubs.h:7:27: error: gnu/stubs-32.h: No such file or directory
>> make[1]: *** [stdc/wrap.o] Error 1
>> make[1]: Leaving directory `/home/urxae/opt/tango/svn-trunk/lib/common/tango'
>> make: *** [lib] Error 2
> 
> Okay, I've changed the makefiles a bit.  Please let me know if you still see this error (or if I've created a new one).

The above error is fixed.

However, after reverting my changes, I still get the old va_arg error:
---
gdc -o lifetime.o -g -frelease -O2 -fversion=GC_Use_Alloc_MMap -fversion=GC_Use_Stack_GLibC -fversion=GC_Use_Data_Fixed -nostdinc -pipe -I../../..   \
        -c lifetime.d
../../../std/c/stdarg.di:19: Error: cannot have out or inout parameter of type ubyte[24][1]
../../../std/c/stdarg.di:815: template instance std.c.stdarg.va_start!(uint) error instantiating
lifetime.d:815: Error: cannot change reference to static array 'va'
make[2]: *** [lifetime.o] Error 1
make[2]: Leaving directory `/home/urxae/opt/tango/svn-trunk/lib/compiler/gdc'
make[1]: *** [all] Error 2
make[1]: Leaving directory `/home/urxae/opt/tango/svn-trunk/lib/compiler/gdc'
make: *** [lib] Error 2
---

I dug into it a bit more and it turns out GDC chokes if va_start and/or va_arg are inside version(GNU)(!).
This patch fixes it, at least for GDC:
=====
Index: std/c/stdarg.di
===================================================================
--- std/c/stdarg.di     (revision 1923)
+++ std/c/stdarg.di     (working copy)
@@ -14,6 +14,7 @@
     alias __builtin_va_end  va_end;
     alias __builtin_va_copy va_copy;

+}
     template va_start(T)
     {
         void va_start( out va_list ap, inout T parmn )
@@ -29,4 +30,3 @@
             return T.init;
         }
     }
-}
=====
(just moves the closing brace of version(GNU) above those functions)

I don't think this should cause problems for DMD since the only import of this module is inside version(GNU) itself...
User-level code importing std.c.stdarg may be a problem. Unfortunately, this seems unavoidable since the code is needed by tango.stdc.stdarg. Unless perhaps it could be moved to lib/compiler/gdc and only copied to an import path when building/installing for GDC?