Equality == comparisons with floating point numbers
December 06, 2013
A dangerous topic for everyone :-)

I've been working with some unittests involving comparing the output of different but theoretically equivalent versions of the same calculation.  To my surprise, calculations which I assumed would produce identical output were failing equality tests.

It seemed unlikely this would be due to any kind of different rounding error, but I decided to check by writing out the whole floating-point numbers formatted with %.80f.  This confirmed my suspicion that the numbers were indeed identical.  You can read the detailed story here:
https://github.com/D-Programming-Language/phobos/pull/1740

It seems like I can probably use isIdentical for the unittests, but I am more concerned about the equality operator.  I completely understand that equality comparisons between FP values are dangerous in general, as tiny rounding errors may induce a difference, but == in D seems to see a difference in circumstances where (so far as I can see) it really shouldn't happen.

Can anybody offer an explanation, a prognosis for improving things, and possible coping strategies in the meantime (other than the ones I already know, isIdentical and approxEqual)?
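For readers unfamiliar with the first of those two strategies, here is a minimal sketch of how `std.math.isIdentical` differs from `==`. It is only an illustration of the documented semantics (signed zeros are distinguished, NaNs with the same payload compare as identical), not a reproduction of the failing unittests; the variable names are made up for the example.

```d
import std.math : isIdentical;
import std.stdio : writefln;

void main()
{
    double posZero = 0.0;
    double negZero = -0.0;
    double nan = double.nan;

    // == treats the two zeros as equal; isIdentical looks at the sign as well.
    writefln("0.0 == -0.0            : %s", posZero == negZero);
    writefln("isIdentical(0.0, -0.0) : %s", isIdentical(posZero, negZero));

    // == is never true for NaN; isIdentical accepts NaNs with the same payload.
    writefln("nan == nan             : %s", nan == nan);
    writefln("isIdentical(nan, nan)  : %s", isIdentical(nan, nan));
}
```

So isIdentical is the stricter check for ordinary values and the more lenient one for NaN, which is why it suits unittests that expect bit-for-bit reproducible output.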
December 06, 2013
On 12/06/2013 05:47 AM, Joseph Rushton Wakeling wrote:

> I decided to check by writing out the whole floating-point
> numbers formatted with %.80f.  This confirmed my suspicion that the
> numbers were indeed identical.

Are they identical when printed with %a?

Ali

December 06, 2013
On 06/12/13 15:02, Ali Çehreli wrote:
> Are they identical when printed with %a?

On my 64-bit Linux system, yes.  I'll push an updated patch to test and see if the various 32-bit systems report similar results (I was getting failures on 32-bit Darwin, BSD and Linux).

Thanks very much for the suggestion, as that's a print formatting option I wasn't familiar with.
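For anyone else meeting `%a` for the first time, a small sketch of the two format specifiers discussed above. `%.80f` prints a long decimal expansion, while `%a` prints the significand in hexadecimal together with a binary exponent, which makes the stored bits directly visible. The helper `hexOf` is a hypothetical name introduced only for this example.

```d
import std.format : format;
import std.stdio : writeln;

// Hypothetical helper: hexadecimal floating-point text of a double.
string hexOf(double x)
{
    return format("%a", x);
}

void main()
{
    double x = 0.1;

    // Decimal expansion of the stored value, padded to 80 fractional digits.
    writeln(format("%.80f", x));

    // Hexadecimal significand and binary exponent of the same value.
    writeln(hexOf(x));

    // Values with the same bits as doubles produce the same %a text.
    writeln(hexOf(0.5) == hexOf(0.25 * 2));
}
```

Note that both specifiers format the value at the precision of the argument's type, here `double`.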
December 06, 2013
On 06/12/13 15:02, Ali Çehreli wrote:
> Are they identical when printed with %a?

Yes.  You can see some of the results here (for the 32-bit systems where I was getting failures):
https://d.puremagic.com/test-results/pull.ghtml?projectid=1&runid=811923&logid=6
https://d.puremagic.com/test-results/pull.ghtml?projectid=1&runid=811924&logid=6
https://d.puremagic.com/test-results/pull.ghtml?projectid=1&runid=811927&logid=6
https://d.puremagic.com/test-results/pull.ghtml?projectid=1&runid=811930&logid=6

So, as I said, it's baffling why the equality operator is not returning true.
December 07, 2013
On Friday, 6 December 2013 at 14:58:31 UTC, Joseph Rushton Wakeling wrote:
> On 06/12/13 15:02, Ali Çehreli wrote:
>> Are they identical when printed with %a?
>
> Yes.  You can see some of the results here (for the 32-bit systems where I was getting failures):
> https://d.puremagic.com/test-results/pull.ghtml?projectid=1&runid=811923&logid=6
> https://d.puremagic.com/test-results/pull.ghtml?projectid=1&runid=811924&logid=6
> https://d.puremagic.com/test-results/pull.ghtml?projectid=1&runid=811927&logid=6
> https://d.puremagic.com/test-results/pull.ghtml?projectid=1&runid=811930&logid=6
>
> So, as I said, it's baffling why the equality operator is not returning true.

Some time ago, in a test I had written in C++, apparently identical floating-point operations were returning different answers (in the 17th/18th significant figure) when re-running the same code with the same data. A paper I read described how the result could change if the numbers remained in the FPU (which had a few bits of extra precision over the normal register size) during the course of the calculation, as opposed to being swapped in and out of the main registers. Depending on when numbers could get flushed out of the FPU (task swapping, I suppose), you would get slightly different answers.

Could this be a factor?
Abdulhaq



December 07, 2013
e.g. see this paper http://software.intel.com/sites/default/files/article/164389/fp-consistency-122712_1.pdf

December 07, 2013
On Friday, 6 December 2013 at 13:47:12 UTC, Joseph Rushton Wakeling wrote:
> A dangerous topic for everyone :-)
>
> I've been working with some unittests involving comparing the output of different but theoretically equivalent versions of the same calculation.  To my surprise, calculations which I assumed would produce identical output were failing equality tests.
>
> It seemed unlikely this would be due to any kind of different rounding error, but I decided to check by writing out the whole floating-point numbers formatted with %.80f.  This confirmed my suspicion that the numbers were indeed identical.
>  You can read the detailed story here:
> https://github.com/D-Programming-Language/phobos/pull/1740
>
> It seems like I can probably use isIdentical for the unittests, but I am more concerned about the equality operator.  I completely understand that equality comparisons between FP values are dangerous in general, as tiny rounding errors may induce a difference, but == in D seems to see a difference in circumstances where (so far as I can see) it really shouldn't happen.
>
> Can anybody offer an explanation, a prognosis for improving things, and possible coping strategies in the meantime (other than the ones I already know, isIdentical and approxEqual)?

When you print out, you print out at type-precision. The comparison could be happening at higher precision with trailing precision from the last calculation.

I'm pretty sure D is free to do this; it goes with the whole more-precision-is-better-precision philosophy.
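A rough sketch of the effect described in this post. It does not reproduce what the compiler does with FPU registers; instead it uses an explicit `real` variable as a stand-in for an intermediate result held at higher precision, and a `double` for the same value after rounding to its declared type. The function name `equalAtRealPrecision` is hypothetical, and the outcome depends on the platform: where `real` is no wider than `double`, the two values compare equal.

```d
import std.stdio : writefln;

// Hypothetical illustration: compare a value held at `real` precision
// with the same value after it has been rounded to `double`.
bool equalAtRealPrecision()
{
    real wide = 1.0L / 3.0L;           // stand-in for a value still at extended precision
    double narrow = cast(double) wide; // the same value rounded to the declared type
    return wide == narrow;             // narrow is promoted to real for the comparison
}

void main()
{
    real wide = 1.0L / 3.0L;
    double narrow = cast(double) wide;

    // Printed at double precision, the two are indistinguishable.
    writefln("%a", cast(double) wide);
    writefln("%a", narrow);

    // Compared at real precision, they differ wherever real carries more bits.
    writefln("equal at real precision: %s", equalAtRealPrecision());
    writefln("real.mant_dig = %s, double.mant_dig = %s",
             real.mant_dig, double.mant_dig);
}
```

This matches the symptom in the thread: the printed values agree because printing happens at type precision, while the comparison may see extra trailing bits.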
December 07, 2013
On 07/12/13 09:29, Abdulhaq wrote:
> Some time ago in a test I had written (C++) apparently identical floating point
> operations were returning different answers (in the 17th/18th sign fig), when
> re-running the same code with the same data. The paper described how the result
> could change if the numbers remained in the FPU (which had a few bits extra
> precision over the normal register size) during the course of the calculation as
> opposed to being swapped in and out of the main registers. Depending on when
> numbers could get flushed out of the FPU (task swapping I suppose) you would get
> slightly different answers.
>
> Could this be a factor?

Yes, I think you are right.  In fact monarch_dodra had already pointed me towards this, but I slightly missed the point as I assumed this was about real vs. double (for example) comparisons, rather than type precision vs. FPU register precision.

Interestingly, it appears to only hit 32-bit D.  There's a bug report related to this here:
http://d.puremagic.com/issues/show_bug.cgi?id=8745

Anyway, I think I now have a firm idea how to move forward; I thought I'd ask around here first just to see if there was anything I'd missed or that I otherwise wasn't aware of.  So thanks to everybody for your input! :-)
December 07, 2013
On 07/12/13 12:08, Joseph Rushton Wakeling wrote:
> On 07/12/13 09:29, Abdulhaq wrote:
>> Some time ago in a test I had written (C++) apparently identical floating point
>> operations were returning different answers (in the 17th/18th sign fig), when
>> re-running the same code with the same data. The paper described how the result
>> could change if the numbers remained in the FPU (which had a few bits extra
>> precision over the normal register size) during the course of the calculation as
>> opposed to being swapped in and out of the main registers. Depending on when
>> numbers could get flushed out of the FPU (task swapping I suppose) you would get
>> slightly different answers.
>>
>> Could this be a factor?
>
> Yes, I think you are right.  In fact monarch_dodra had already pointed me
> towards this, but I slightly missed the point as I assumed this was about real
> vs. double (for example) comparisons, rather than type vs. FPU cache.
>
> Interestingly, it appears to only hit 32-bit D.  There's a bug report related to
> this here:
> http://d.puremagic.com/issues/show_bug.cgi?id=8745
>
> Anyway, I think I now have a firm idea how to move forward

... I thought I did, but now I'm up against an interesting conundrum: while equality == comparison can fail here for 32-bit, isIdentical comparison can fail even for 64-bit, although only for the release-mode build.

What's particularly odd is that if before calling assert(isIdentical( ... )) I use writeln to print the value of isIdentical(...) to the screen, then it prints true, and the assertion passes.  If I don't have the print statement, then the assert fails.

I'm presuming that calling writefln to print the variable involves it being taken off the FPU?

December 08, 2013
>
> ... I thought I did, but now I'm up against an interesting conundrum: while equality == comparison can fail here for 32-bit, isIdentical comparison can fail even for 64-bit, although only for the release-mode build.
>
> What's particularly odd is that if before calling assert(isIdentical( ... )) I use writeln to print the value of isIdentical(...) to the screen, then it prints true, and the assertion passes.  If I don't have the print statement, then the assert fails.
>
> I'm presuming that calling writefln to print the variable involves it being taken off the FPU?

I'm just guessing now, but it seems that you are in an area where behaviour changes depending on which compiler you are using (how does it compile the FP instructions, does it use SSE instructions, how does it check equality) and which exact processor you are on (does it support IEEE 754, and does the compiler try to support IEEE 754 exactly?). I haven't seen much in the forums about FP behaviour in e.g. dmd. For example, how does it deal with the issues raised in http://www.cs.berkeley.edu/~wkahan/JAVAhurt.pdf? The people who know these things can be found discussing them at http://forum.dlang.org/thread/khbodtjtobahhpzmadap@forum.dlang.org?page=3#post-l4rj5o:24292k:241:40digitalmars.com :-).

It's generally held that checking FP numbers for exact equality isn't practical and it's better to go for equality within a certain tolerance - any reason why you're not happy with that :-)?
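As a sketch of what "equality within a certain tolerance" can look like, here is a small hand-written comparison. `nearlyEqual` and its default tolerances are hypothetical choices made for this example, not a Phobos function; Phobos has its own routine for this (approxEqual, mentioned earlier in the thread), and the helper below only spells out the arithmetic.

```d
import std.math : fabs, fmax, isInfinity;
import std.stdio : writefln;

// Hypothetical helper, not part of Phobos: true when a and b agree to within
// a relative tolerance, with an absolute floor for values close to zero.
bool nearlyEqual(double a, double b, double relTol = 1e-9, double absTol = 1e-12)
{
    if (a == b)
        return true;                   // exact matches, including equal infinities
    if (isInfinity(a) || isInfinity(b))
        return false;                  // an infinity is never close to anything else
    immutable diff = fabs(a - b);      // NaN operands give NaN, so the test below fails
    return diff <= fmax(relTol * fmax(fabs(a), fabs(b)), absTol);
}

void main()
{
    double sum = 0.0;
    foreach (i; 0 .. 10)
        sum += 0.1;

    writefln("sum              : %a", sum);
    writefln("sum == 1.0       : %s", sum == 1.0);
    writefln("nearlyEqual(sum) : %s", nearlyEqual(sum, 1.0));
}
```

The tolerances have to be chosen per application; a unittest that is meant to prove two code paths are bit-for-bit equivalent is exactly the case where a tolerance hides the thing being tested.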
