November 27, 2010
On 26.11.2010 23:02, Don wrote:
> The code below compiles to a single machine instruction, yet the results are CPU manufacturer-dependent.
> ----
> import std.math;
>
> void main()
> {
>      assert( yl2x(0x1.0076fc5cc7933866p+40L, LN2)
>     == 0x1.bba4a9f774f49d0ap+4L); // Pass on Intel, fails on AMD
> }
> ----
> The results for yl2x(0x1.0076fc5cc7933866p+40L, LN2) are:
>
> Intel:  0x1.bba4a9f774f49d0ap+4L
> AMD:    0x1.bba4a9f774f49d0cp+4L
>
> The least significant bit is different. This corresponds only to a fraction of a bit (that is, it's hardly important for accuracy. For comparison, sin and cos on x86 lose nearly sixty bits of accuracy in some cases!). Its importance is only that it is an undocumented difference between manufacturers.
>
> The difference was discovered through the unit tests for the mathematical Special Functions which will be included in the next compiler release. Discovery of the discrepancy happened only because of several features of D:
>
> - built-in unit tests (encourages tests to be run on many machines)
>
> - built-in code coverage (the tests include extreme cases, simply because I was trying to increase the code coverage to high values)
>
> - D supports the hex format for floats. Without this feature, the discrepancy would have been blamed on differences in the floating-point conversion functions in the C standard library.
>
> This experience reinforces my belief that D is an excellent language for scientific computing.
>
> Thanks to David Simcha and Dmitry Olshansky for help in tracking this down.
Glad to help!
I was genuinely intrigued, because not more than a few weeks ago I discussed with a friend of mine the possibility of differences in FP calculations between AMD and Intel.
You see, his scientific app yielded different results at home and at work, which is a frustrating experience. Since it's exactly the same binary, written in Delphi (no C run-time involved and so on), and the environment is pretty much the same... I suggested checking the CPU vendors just in case... of course, they were different.

In the meantime, I sort of "ported" the test case to MSVC C++ inline asm and posted it on the AMD forums; let's see what they have to say.
http://forums.amd.com/forum/messageview.cfm?catid=319&threadid=142893&enterthread=y
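
For reference, here is a rough D equivalent of that inline-asm port, bypassing std.math and issuing fyl2x directly. This is only a sketch: it assumes an x86 target where real is the 80-bit x87 format and DMD's inline assembler; rawYl2x is just an illustrative name.

----
import std.math : LN2;
import std.stdio : writefln;

real rawYl2x(real x, real y)
{
    real result;
    asm
    {
        fld y;       // ST(0) = y
        fld x;       // ST(0) = x, ST(1) = y
        fyl2x;       // ST(0) = y * log2(x) -- the single instruction under discussion
        fstp result; // store the 80-bit result and pop the x87 stack
    }
    return result;
}

void main()
{
    // %a prints the exact hex-float representation, so a one-bit
    // difference between vendors is visible directly in the output.
    writefln("%a", rawYl2x(0x1.0076fc5cc7933866p+40L, LN2));
}
----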

-- 
Dmitry Olshansky

November 27, 2010
Don Wrote:

> The great tragedy was that an early AMD processor gave much more accurate sin and cos than the 387. But people complained that it was different from Intel! So their next processor duplicated Intel's hopelessly wrong trig functions.

The same question goes to you: why do you call this a bug?
November 27, 2010
On 28-11-2010 5:49, Dmitry Olshansky wrote:
> On 26.11.2010 23:02, Don wrote:
>> [...]
> [...]
> In the meantime, I sort of "ported" the test case to MSVC C++ inline asm and
> posted it on the AMD forums; let's see what they have to say.
> http://forums.amd.com/forum/messageview.cfm?catid=319&threadid=142893&enterthread=y

http://forums.amd.com/forum/messageview.cfm?catid=29&threadid=135771
This post also talks about a fyl2x bug. Wonder if it's the same bug.

L.
November 28, 2010
Kagamin wrote:
> Don Wrote:
> 
>> The great tragedy was that an early AMD processor gave much more accurate sin and cos than the 387. But people complained that it was different from Intel! So their next processor duplicated Intel's hopelessly wrong trig functions.
> 
> The same question goes to you: why do you call this a bug?

The Intel CPU gives the correct answer, but AMD's is wrong. They should both give the correct result.
November 28, 2010
Don Wrote:

> The Intel CPU gives the correct answer, but AMD's is wrong. They should both give the correct result.

Really? I think the answer is neither correct nor wrong; it's approximate.
If your friend's program operates on ~0x1p+40 values and critically depends on the value of the last bit, then double precision doesn't suit his needs (on either Intel or AMD); he should take a couple of classes on computational theory and rewrite his algorithm, or use higher-precision arithmetic.
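
As a practical aside: if a test has to pass on both vendors' fyl2x, comparing with a one-ULP tolerance instead of == is enough for this case. A minimal sketch, assuming an x86 target with 80-bit reals as in Don's original post; the tolerance of real.mant_dig - 1 bits is chosen purely for illustration:

----
import std.math : yl2x, feqrel, LN2;

void main()
{
    real expected = 0x1.bba4a9f774f49d0ap+4L;              // Intel's result
    real actual   = yl2x(0x1.0076fc5cc7933866p+40L, LN2);

    // feqrel counts the leading significand bits on which the two values
    // agree; requiring all but the last bit lets the assert pass on either
    // vendor while still catching genuinely wrong results.
    assert(feqrel(actual, expected) >= real.mant_dig - 1);
}
----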
November 28, 2010
If it happens once it's a bug; if it's repeatable it's a feature ;-)

-=mike=-

"Kagamin" <spam@here.lot> wrote in message news:icth6h$1nqe$1@digitalmars.com...
> Don Wrote:
>
>> The Intel CPU gives the correct answer, but AMD's is wrong. They should both give the correct result.
>
> Really? I think the answer is neither correct nor wrong; it's approximate.
> If your friend's program operates on ~0x1p+40 values and critically
> depends on the value of the last bit, then double precision doesn't
> suit his needs (on either Intel or AMD); he should take a couple of
> classes on computational theory and rewrite his algorithm, or use
> higher-precision arithmetic.


November 28, 2010
Kagamin wrote:
> Don Wrote:
> 
>> The Intel CPU gives the correct answer, but AMD's is wrong. They should both give the correct result.
> 
> Really? I think the answer is neither correct nor wrong; it's approximate.

The rules for rounding the mathematical value to a representable one are precise, so there is such a thing as the correctly rounded result, and anything else is a wrong result.

An FPU should strive to always produce the correctly rounded result.
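
For the concrete values above, a quick check confirms that the two reported results are adjacent representable reals, i.e. they differ by exactly one ULP. A sketch, assuming real is the 80-bit x87 format as on the machines in question:

----
import std.math : nextUp, feqrel;
import std.stdio : writefln;

void main()
{
    real intelResult = 0x1.bba4a9f774f49d0ap+4L;
    real amdResult   = 0x1.bba4a9f774f49d0cp+4L;

    // The AMD value is the very next representable real after the Intel one.
    assert(nextUp(intelResult) == amdResult);

    // Number of leading significand bits on which the two values agree.
    writefln("bits in agreement: %s", feqrel(intelResult, amdResult));
}
----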
November 29, 2010
> import std.conv: to;
> void main() {
>     auto r = to!real("0x1.0076fc5cc7933866p+40");
>     auto d = to!double("0x1.0076fc5cc7933866p+40");
>     auto f = to!float("0x1.0076fc5cc7933866p+40");
> }
>
>
>> Regarding unit tests, I should really use them :(
>
> Yep, and DbC too, and compile your D code with -w.
>
> Bye,
> bearophile

I have an unrelated question; this is not a criticism, just an honest question. Why don't you write these three lines like this:

>     auto r = to!real  ("0x1.0076fc5cc7933866p+40");
>     auto d = to!double("0x1.0076fc5cc7933866p+40");
>     auto f = to!float ("0x1.0076fc5cc7933866p+40");

Thank you.

-- 
Using Opera's revolutionary email client: http://www.opera.com/mail/
November 29, 2010
> The same question goes to you: why do you call this a bug?

It is approximate, but an approximation is not "undefined behavior".
Returning anything other than the correctly rounded result is the same as saying "2 + 1 = 4".

-- 
Using Opera's revolutionary email client: http://www.opera.com/mail/
November 29, 2010
On 27/11/2010 06:26, Don wrote:
> I haven't seen any examples of values which are calculated differently
> between the processors. I only found one vague reference in a paper from
> CERN.

And because of that comment, I've once again checked
http://hasthelargehadroncolliderdestroyedtheworldyet.com/ just to make sure... :P
CERN better be aware of that stuff! :D

-- 
Bruno Medeiros - Software Engineer