April 16, 2011
== Quote from dsimcha (dsimcha@yahoo.com)'s article
> On 4/16/2011 10:11 AM, dsimcha wrote:
> Output:
> Rounding mode:  0
> 0.7853986633972191094
> Rounding mode:  0
> 0.7853986633972437348

This is not something I can replicate on my workstation.
April 16, 2011
On 4/16/2011 11:16 AM, Iain Buclaw wrote:
> == Quote from dsimcha (dsimcha@yahoo.com)'s article
>> On 4/16/2011 10:11 AM, dsimcha wrote:
>> Output:
>> Rounding mode:  0
>> 0.7853986633972191094
>> Rounding mode:  0
>> 0.7853986633972437348
>
> This is not something I can replicate on my workstation.

Interesting.  Since I know you're a Linux user, I fired up my Ubuntu VM and tried out my test case.  I can't reproduce it either on Linux, only on Windows.
April 16, 2011
"dsimcha" <dsimcha@yahoo.com> wrote in message news:iocalv$2h58$1@digitalmars.com...
> Output:
> Rounding mode:  0
> 0.7853986633972191094
> Rounding mode:  0
> 0.7853986633972437348

Could be something somewhere is getting truncated from real to double, which would mean 12 fewer bits of mantissa. Maybe the FPU is set to lower precision in one of the threads?


April 16, 2011
On 4/16/11 9:59 AM, dsimcha wrote:
> On 4/16/2011 10:55 AM, Andrei Alexandrescu wrote:
>> On 4/16/11 9:52 AM, dsimcha wrote:
>>> Output:
>>> Rounding mode: 0
>>> 0.7853986633972191094
>>> Rounding mode: 0
>>> 0.7853986633972437348
>>
>> I think at this precision the difference may be in random bits. Probably
>> you don't need to worry about it.
>>
>> Andrei
>
> "random bits"? I am fully aware that these low order bits are numerical
> fuzz and are meaningless from a practical perspective. I am only
> concerned because I thought these bits are supposed to be deterministic
> even if they're meaningless. Now that I've ruled out a bug in
> std.parallelism, I'm wondering if it's a bug in druntime or DMD.

I seem to remember that some of the last bits printed in such a result are essentially arbitrary. I forget what could cause this.

Andrei
April 16, 2011
On 4/16/2011 6:46 AM, Iain Buclaw wrote:
> == Quote from Walter Bright (newshound2@digitalmars.com)'s article
>> That's a good thought. FP addition results can differ dramatically depending on
>> associativity.
>
> And not to forget optimisations too. ;)

The dmd optimizer is careful not to reorder evaluation in such a way as to change the results.
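
To make the associativity point concrete, here is a minimal D sketch. The helper names sumLeft and sumRight are hypothetical, chosen only for this illustration, and the inputs are picked so that the outcome is the same whether intermediates are kept in double or in 80-bit real precision.

```d
import std.stdio : writefln;

// Hypothetical helpers, named for this sketch only.
// Same three operands, two different groupings.
double sumLeft(double a, double b, double c)  { return (a + b) + c; }
double sumRight(double a, double b, double c) { return a + (b + c); }

void main()
{
    // 1e20 is exactly representable as a double, but 1.0 is far below
    // its unit in the last place, so (b + c) rounds back to b.
    double a = 1e20, b = -1e20, c = 1.0;

    writefln("(a + b) + c = %s", sumLeft(a, b, c));
    writefln("a + (b + c) = %s", sumRight(a, b, c));
}
```

The first grouping cancels the large terms before adding 1.0 and yields 1; the second absorbs 1.0 into -1e20 and yields 0. An optimizer that regrouped the additions would therefore change the result, which is why reordering has to be avoided.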
April 16, 2011
> Could be something somewhere is getting truncated from real to double, which would mean 12 fewer bits of mantissa. Maybe the FPU is set to lower precision in one of the threads?

Yes indeed, this is a _Windows_ "bug".
I have experienced this on Windows before: the main thread's FPU state register is
initialized to a lower FPU precision (64-bit double) by default by the OS, presumably to
make FP calculations faster. However, when you start a new thread, the FPU will
use the whole 80 bits for computation because, curiously, the FPU is not
reconfigured for those.
Suggested fix: Add

asm { fninit; }

to the beginning of your main function, and the difference between the two will be gone.

This would be a DMD/Windows compatibility issue that effectively disables the "real" data type. You might want to file a bug report for druntime if my suggested fix works. (It would imply that the real type has been basically identical to the double type on Windows all along!)
April 16, 2011
On 4/16/2011 2:15 PM, Timon Gehr wrote:
>
>> Could be something somewhere is getting truncated from real to double, which
>> would mean 12 fewer bits of mantissa. Maybe the FPU is set to lower precision
>> in one of the threads?
>
> Yes indeed, this is a _Windows_ "bug".
> I have experienced this in Windows before, the main thread's FPU state register is
> initialized to lower FPU-Precision (64bits) by default by the OS, presumably to
> make FP calculations faster. However, when you start a new thread, the FPU will
> use the whole 80 bits for computation because, curiously, the FPU is not
> reconfigured for those.
> Suggested fix: Add
>
> asm { fninit; }
>
> to the beginning of your main function, and the difference between the two will be
> gone.
>
> This would be a compatibility issue DMD/windows which disables the "real" data
> type. You might want to file a bug report for druntime if my suggested fix works.
> (This would imply that the real type was basically identical to the double type in
> Windows all along!)

Close:  If I add this instruction to the function for the new thread, the difference goes away.  The relevant statement is:

    auto t = new Thread( {
        asm { fninit; }
        res2 = sumRange(terms);
    } );

At any rate, this is a **huge** WTF that should probably be fixed in druntime.  Once I understand it a little better, I'll file a bug report.
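
For anyone who wants to check this directly rather than infer it from the sums, here is a rough sketch that reads the x87 control word in the main thread and in a new thread. It assumes an x86 target and a compiler that accepts DMD-style inline asm; the helper names readControlWord and precisionControl are made up for this example. The precision-control field is bits 8-9 of the control word: 2 means a 53-bit significand and 3 means the full 64-bit significand, which is what fninit sets (control word 0x037F).

```d
import core.thread : Thread;
import std.stdio : writefln;

// Hypothetical helper: extracts the precision-control (PC) field,
// bits 8-9 of the x87 control word.
// 0 = 24-bit, 2 = 53-bit, 3 = 64-bit significand.
uint precisionControl(ushort cw)
{
    return (cw >> 8) & 3;
}

// Hypothetical helper: reads the calling thread's x87 control word.
// Assumes DMD-style inline asm on x86 or x86-64.
ushort readControlWord()
{
    ushort cw;
    asm { fstcw cw; }
    return cw;
}

void main()
{
    writefln("main thread PC field: %s", precisionControl(readControlWord()));

    ushort cwThread;
    auto t = new Thread( {
        cwThread = readControlWord();
    } );
    t.start();
    t.join();

    writefln("new thread PC field:  %s", precisionControl(cwThread));
}
```

If the two printed values differ, the threads are computing at different precisions, which would account for the mismatch in the low-order digits.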
April 16, 2011
== Quote from Walter Bright (newshound2@digitalmars.com)'s article
> On 4/16/2011 6:46 AM, Iain Buclaw wrote:
> > == Quote from Walter Bright (newshound2@digitalmars.com)'s article
> >> That's a good thought. FP addition results can differ dramatically depending on associativity.
> >
> > And not to forget optimisations too. ;)
> The dmd optimizer is careful not to reorder evaluation in such a way as to change the results.

And so it rightly shouldn't!

I was thinking more of a case of FPU precision rather than ordering: you can get one result when computing on SSE in double precision mode, and a different one when computing on x87 in double precision and then writing to a double variable in memory.


Classic example (which could either be a bug or non-bug depending on your POV):

void test(double x, double y)
{
  double y2 = x + 1.0;  // may stay in an x87 register at higher precision
  assert(y == y2);      // triggers with -O
}

void main()
{
  double x = .012;
  double y = x + 1.0;
  test(x, y);
}

April 16, 2011
On 4/16/2011 2:24 PM, dsimcha wrote:
> On 4/16/2011 2:15 PM, Timon Gehr wrote:
>>
>>> Could be something somewhere is getting truncated from real to
>>> double, which
>>> would mean 12 fewer bits of mantissa. Maybe the FPU is set to lower
>>> precision
>>> in one of the threads?
>>
>> Yes indeed, this is a _Windows_ "bug".
>> I have experienced this in Windows before, the main thread's FPU state
>> register is
>> initialized to lower FPU-Precision (64bits) by default by the OS,
>> presumably to
>> make FP calculations faster. However, when you start a new thread, the
>> FPU will
>> use the whole 80 bits for computation because, curiously, the FPU is not
>> reconfigured for those.
>> Suggested fix: Add
>>
>> asm { fninit; }
>>
>> to the beginning of your main function, and the difference between the
>> two will be
>> gone.
>>
>> This would be a compatibility issue DMD/windows which disables the
>> "real" data
>> type. You might want to file a bug report for druntime if my suggested
>> fix works.
>> (This would imply that the real type was basically identical to the
>> double type in
>> Windows all along!)
>
> Close: If I add this instruction to the function for the new thread, the
> difference goes away. The relevant statement is:
>
> auto t = new Thread( {
> asm { fninit; }
> res2 = sumRange(terms);
> } );
>
> At any rate, this is a **huge** WTF that should probably be fixed in
> druntime. Once I understand it a little better, I'll file a bug report.

I read up a little on what fninit does.  This is IMHO a druntime bug.  Filed as http://d.puremagic.com/issues/show_bug.cgi?id=5847 .
April 16, 2011
On 4/16/2011 9:52 AM, Andrei Alexandrescu wrote:
> I seem to remember that some of the last bits printed in such a
> result are essentially arbitrary. I forget what could cause this.

To see what the exact bits are, print using %A format.

In any case, floating point bits are not random. They are completely deterministic.
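
As a small sketch of the %A suggestion, the two values from the output earlier in the thread can be printed in hexadecimal floating-point form. The helper name hexBits is hypothetical, and the literals are simply the decimal digits quoted above, so they only approximate the values the original program computed.

```d
import std.format : format;
import std.stdio : writeln;

// Hypothetical helper: formats a value as hexadecimal floating point,
// which shows the significand bits exactly instead of a rounded decimal.
string hexBits(real x)
{
    return format("%A", x);
}

void main()
{
    // Decimal digits copied from the output quoted in this thread.
    real r1 = 0.7853986633972191094L;
    real r2 = 0.7853986633972437348L;

    writeln(hexBits(r1));
    writeln(hexBits(r2));
}
```

Comparing the two hex strings shows exactly which significand digits differ, rather than leaving it to how the decimal conversion happened to round.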