April 16, 2011
On 4/16/2011 11:43 AM, Iain Buclaw wrote:
> I was thinking more of a case of FPU precision rather than ordering: as in you get
> a different result computing on SSE in double precision mode on the one hand, and
> by computing on x87 in double precision then writing to a double variable in memory.

You're right on that one.
April 16, 2011
On Apr 16, 2011, at 11:43 AM, dsimcha wrote:
>> 
>> Close: If I add this instruction to the function for the new thread, the difference goes away. The relevant statement is:
>> 
>> auto t = new Thread( {
>> asm { fninit; }
>> res2 = sumRange(terms);
>> } );
>> 
>> At any rate, this is a **huge** WTF that should probably be fixed in druntime. Once I understand it a little better, I'll file a bug report.
> 
> Read up a little on what fninit does, etc.  This is IMHO a druntime bug.  Filed as http://d.puremagic.com/issues/show_bug.cgi?id=5847 .

Really a Windows bug that should be fixed in druntime :-)  I know I'm splitting hairs.  This will be fixed for the next release.

April 16, 2011
On 4/16/2011 11:51 AM, Sean Kelly wrote:
>
> On Apr 16, 2011, at 11:43 AM, dsimcha wrote:
>>>
>>> Close: If I add this instruction to the function for the new thread, the
>>> difference goes away. The relevant statement is:
>>>
>>> auto t = new Thread( { asm { fninit; } res2 = sumRange(terms); } );
>>>
>>> At any rate, this is a **huge** WTF that should probably be fixed in
>>> druntime. Once I understand it a little better, I'll file a bug report.
>>
>> Read up a little on what fninit does, etc.  This is IMHO a druntime bug.
>> Filed as http://d.puremagic.com/issues/show_bug.cgi?id=5847 .
>
> Really a Windows bug that should be fixed in druntime :-)  I know I'm
> splitting hairs.  This will be fixed for the next release.


The dmd startup code (actually the C startup code) does an fninit. I never thought about new thread starts. So, yeah, druntime should do an fninit on thread creation.
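
For readers following along, here is a minimal sketch of the workaround under discussion: resetting the x87 state as the first action of a new thread. It is not the actual druntime patch. `resetFpu` and `sumInNewThread` are hypothetical names introduced for illustration, and `sumRange` is a simple stand-in for the function from the original post.

```d
import core.thread : Thread;

// Hypothetical stand-in for the sumRange function discussed in the thread.
double sumRange(const(double)[] terms)
{
    double sum = 0;
    foreach (t; terms)
        sum += t;
    return sum;
}

// Hypothetical helper: reset the x87 FPU to its FNINIT defaults.
// Compiles to a no-op where DMD-style x86 inline asm is unavailable.
void resetFpu()
{
    version (D_InlineAsm_X86)
    {
        asm { fninit; }
    }
    else version (D_InlineAsm_X86_64)
    {
        asm { fninit; }
    }
}

// Hypothetical helper: run sumRange on a fresh thread after resetting the FPU.
double sumInNewThread(const(double)[] terms)
{
    double result;
    auto t = new Thread({
        resetFpu();               // first thing the new thread does
        result = sumRange(terms);
    });
    t.start();
    t.join();
    return result;
}
```

Calling `sumInNewThread(terms)` and `sumRange(terms)` on the main thread should then run under the same FPU settings, assuming the main thread still has the state left by the startup code's fninit.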
April 16, 2011
Walter:

> The dmd startup code (actually the C startup code) does an fninit. I never thought about new thread starts. So, yeah, druntime should do an fninit on thread creation.

My congratulations to all the (mostly two) people involved in finding this bug and its causes :-)
I'd like to see this module in Phobos.

Bye,
bearophile
April 16, 2011
On Sat, 16 Apr 2011 15:32:12 -0400, Walter Bright <newshound2@digitalmars.com> wrote:

> On 4/16/2011 11:51 AM, Sean Kelly wrote:
>>
>> On Apr 16, 2011, at 11:43 AM, dsimcha wrote:
>>>>
>>>> Close: If I add this instruction to the function for the new thread, the
>>>> difference goes away. The relevant statement is:
>>>>
>>>> auto t = new Thread( { asm { fninit; } res2 = sumRange(terms); } );
>>>>
>>>> At any rate, this is a **huge** WTF that should probably be fixed in
>>>> druntime. Once I understand it a little better, I'll file a bug report.
>>>
>>> Read up a little on what fninit does, etc.  This is IMHO a druntime bug.
>>> Filed as http://d.puremagic.com/issues/show_bug.cgi?id=5847 .
>>
>> Really a Windows bug that should be fixed in druntime :-)  I know I'm
>> splitting hairs.  This will be fixed for the next release.
>
>
> The dmd startup code (actually the C startup code) does an fninit. I never thought about new thread starts. So, yeah, druntime should do an fninit on thread creation.

The documentation I've found on fninit seems to indicate it defaults to 64-bit precision, which means that by default we aren't seeing the benefit of D's reals. I'd much prefer 80-bit precision by default.
April 19, 2011
On Apr 16, 2011, at 1:02 PM, Robert Jacques wrote:

> On Sat, 16 Apr 2011 15:32:12 -0400, Walter Bright <newshound2@digitalmars.com> wrote:
>> 
>> 
>> The dmd startup code (actually the C startup code) does an fninit. I never thought about new thread starts. So, yeah, druntime should do an fninit on thread creation.
> 
> The documentation I've found on fninit seems to indicate it defaults to 64-bit precision, which means that by default we aren't seeing the benefit of D's reals. I'd much prefer 80-bit precision by default.

There is no option to set "80-bit precision" via the FPU control word.  Section 8.1.5.2 of the Intel 64 SDM says the following:

"The precision-control (PC) field (bits 8 and 9 of the x87 FPU control word) determines the precision (64, 53, or 24 bits) of floating-point calculations made by the x87 FPU (see Table 8-2). The default precision is double extended precision, which uses the full 64-bit significand available with the double extended-precision floating-point format of the x87 FPU data registers. This setting is best suited for most applications, because it allows applications to take full advantage of the maximum precision available with the x87 FPU data registers."

So it sounds like finit/fninit does what we want.
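
As a rough illustration of the PC field described in the quoted passage, the sketch below decodes bits 8 and 9 of a control word into a significand width. `significandBits` and `readControlWord` are hypothetical names, not druntime or Phobos functions. The value 0x037F is what FNINIT loads into the control word, and it is used here only as a placeholder when x87 inline asm is unavailable.

```d
// Hypothetical helper: significand width in bits selected by the PC field
// (bits 8 and 9) of an x87 control word. Returns 0 for the reserved
// encoding 01b.
int significandBits(ushort controlWord)
{
    switch ((controlWord >> 8) & 0b11)
    {
        case 0b00: return 24; // single precision
        case 0b10: return 53; // double precision
        case 0b11: return 64; // double extended precision
        default:   return 0;  // 01b is reserved
    }
}

// Hypothetical helper: read the current x87 control word.
// Assumption: where DMD-style x86 inline asm is unavailable, this simply
// returns the FNINIT default rather than the real hardware state.
ushort readControlWord()
{
    ushort cw = 0x037F;
    version (D_InlineAsm_X86)
    {
        asm { fstcw cw; }
    }
    else version (D_InlineAsm_X86_64)
    {
        asm { fstcw cw; }
    }
    return cw;
}
```

Calling `significandBits(readControlWord())` at the start of a thread body, before and after an fninit, is one way to check which precision a new thread actually starts with.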
April 20, 2011
On Tue, 19 Apr 2011 14:18:46 -0400, Sean Kelly <sean@invisibleduck.org> wrote:

> On Apr 16, 2011, at 1:02 PM, Robert Jacques wrote:
>
>> On Sat, 16 Apr 2011 15:32:12 -0400, Walter Bright <newshound2@digitalmars.com> wrote:
>>>
>>>
>>> The dmd startup code (actually the C startup code) does an fninit. I never thought about new thread starts. So, yeah, druntime should do an fninit on thread creation.
>>
>> The documentation I've found on fninit seems to indicate it defaults to 64-bit precision, which means that by default we aren't seeing the benefit of D's reals. I'd much prefer 80-bit precision by default.
>
> There is no option to set "80-bit precision" via the FPU control word.  Section 8.1.5.2 of the Intel 64 SDM says the following:
>
> "The precision-control (PC) field (bits 8 and 9 of the x87 FPU control word) determines the precision (64, 53, or 24 bits) of floating-point calculations made by the x87 FPU (see Table 8-2). The default precision is double extended precision, which uses the full 64-bit significand available with the double extended-precision floating-point format of the x87 FPU data registers. This setting is best suited for most applications, because it allows applications to take full advantage of the maximum precision available with the x87 FPU data registers."
>
> So it sounds like finit/fninit does what we want.

Yes, that sounds right. Thanks for clarifying.
April 20, 2011
Sean Kelly wrote:
> On Apr 16, 2011, at 1:02 PM, Robert Jacques wrote:
> 
>> On Sat, 16 Apr 2011 15:32:12 -0400, Walter Bright <newshound2@digitalmars.com> wrote:
>>>
>>> The dmd startup code (actually the C startup code) does an fninit. I never thought about new thread starts. So, yeah, druntime should do an fninit on thread creation.
>> The documentation I've found on fninit seems to indicate it defaults to 64-bit precision, which means that by default we aren't seeing the benefit of D's reals. I'd much prefer 80-bit precision by default.
> 
> There is no option to set "80-bit precision" via the FPU control word.  

??? Yes there is.

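enum PrecisionControl : short {
    PRECISION80 = 0x300, // 80-bit format, 64-bit significand
    PRECISION64 = 0x200, // 64-bit format, 53-bit significand
    PRECISION32 = 0x000  // 32-bit format, 24-bit significand
};

/** Set the number of bits of precision used by 'real'.
 *
 * Returns: the old precision.
 * This is not supported on all platforms.
 */
PrecisionControl reduceRealPrecision(PrecisionControl prec) {
    version(D_InlineAsm_X86) {
        short cont;
        asm {
            fstcw cont;      // Store the current control word
            mov CX, cont;
            mov AX, cont;
            and EAX, 0x0300; // Form the return value (old PC field)
            and CX,  0xFCFF; // Clear the PC field (bits 8 and 9)
            or  CX,  prec;   // Insert the requested precision
            mov cont, CX;
            fldcw cont;      // Load the modified control word
        }
    } else {
        assert(0, "Not yet supported");
    }
}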
April 20, 2011
On Apr 20, 2011, at 5:06 AM, Don wrote:

> Sean Kelly wrote:
>> On Apr 16, 2011, at 1:02 PM, Robert Jacques wrote:
>>> On Sat, 16 Apr 2011 15:32:12 -0400, Walter Bright <newshound2@digitalmars.com> wrote:
>>>> 
>>>> The dmd startup code (actually the C startup code) does an fninit. I never thought about new thread starts. So, yeah, druntime should do an fninit on thread creation.
>>> The documentation I've found on fninit seems to indicate it defaults to 64-bit precision, which means that by default we aren't seeing the benefit of D's reals. I'd much prefer 80-bit precision by default.
>> There is no option to set "80-bit precision" via the FPU control word.
> 
> ??? Yes there is.
> 
> enum PrecisionControl : short {
>    PRECISION80 = 0x300,
>    PRECISION64 = 0x200,
>    PRECISION32 = 0x000
> };

So has Intel deprecated 80-bit FPU support?  Why do the docs for this say that 64-bit is the highest precision?  And more importantly, does this mean that we should be setting the PC field explicitly instead of relying on fninit?  The docs say that fninit initializes to 64-bit precision.  Or is that inaccurate as well?
April 20, 2011
"Sean Kelly" <sean@invisibleduck.org> wrote in message news:mailman.3597.1303316625.4748.digitalmars-d@puremagic.com...
> On Apr 20, 2011, at 5:06 AM, Don wrote:

> Sean Kelly wrote:
>> On Apr 16, 2011, at 1:02 PM, Robert Jacques wrote:
>>> On Sat, 16 Apr 2011 15:32:12 -0400, Walter Bright <newshound2@digitalmars.com> wrote:
>>>>
>>>> The dmd startup code (actually the C startup code) does an fninit. I never thought about new thread starts. So, yeah, druntime should do an fninit on thread creation.
>>> The documentation I've found on fninit seems to indicate it defaults to 64-bit precision, which means that by default we aren't seeing the benefit of D's reals. I'd much prefer 80-bit precision by default.
>> There is no option to set "80-bit precision" via the FPU control word.
>
> ??? Yes there is.
>
> enum PrecisionControl : short {
>    PRECISION80 = 0x300,
>    PRECISION64 = 0x200,
>    PRECISION32 = 0x000
> };
>
> So has Intel deprecated 80-bit FPU support?  Why do the docs for this say that 64-bit is the highest precision?  And more importantly, does this mean that we should be setting the PC field explicitly instead of relying on fninit?  The docs say that fninit initializes to 64-bit precision.  Or is that inaccurate as well?

You misread the docs. They are talking about precision, which is just the size of the mantissa, not the full size of the floating-point data. I.e.:

80-bit float = 64-bit precision
64-bit float = 53-bit precision
32-bit float = 24-bit precision
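
The same distinction can be checked from D itself, since every floating-point type exposes its significand width as the `.mant_dig` property. The sketch below is only an illustration, and `significandOf` is a hypothetical helper name. Note that `real.mant_dig` is 64 only on targets where `real` maps to the x87 80-bit format, so it can differ on other platforms.

```d
import std.format : format;

// Hypothetical helper: describe the significand width of a floating-point type.
string significandOf(T)()
{
    return format("%s has a %d-bit significand", T.stringof, T.mant_dig);
}
```

For example, `significandOf!float()` and `significandOf!double()` report 24 and 53 bits respectively, while `significandOf!real()` depends on the target.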