August 16, 2005 Re: Submission: Floating point near-equality
Posted in reply to Don Clugston

>>> It'll be hard to avoid magic numbers. Even a statement of "4 bits of
>>> precision" is a magic number - just in binary. I actually wouldn't mind
>>> being up-front about the fact that there are two notions of equality,
>>> absolute and relative, and have the test take two parameters (the
>>> relative and absolute tolerances) and go with those.
>
> The point is that in IEEE arithmetic, the relative and absolute tolerances
> are related via the number of bits of precision.
> Near zero, "bits of precision" looks like an absolute tolerance. Away from
> zero, it looks like relative tolerance.

Do you mean "near one"? Near zero life gets very interesting. If I understand your algorithm correctly, the number 1e-200 is still very far away from 0, or even from 1e-201 or -1e-200. Then again, I might not understand the algorithm.

> And there's a region in between where it's really difficult to think in
> terms of absolute or relative tolerance; they kind of blur together.

Agreed (I think). Over numbers that aren't "too small or too large", the absolute and relative measure pretty much the same thing. The distinction only gets important at the extremes.

> Note that there are many uses beyond simple "equal-enough" tests.
> This function allows you to say: at what value of x are f(x) and g(x) most
> different? E.g. when is exp(ln(x)) most different from x?
> You can just try a range of different x values, and find the one where
> precisionEquality(exp(ln(x)), x) is smallest. You don't need to worry about
> whether you should be using absolute or relative tolerance; that's already
> taken care of. So, if writing a function, you can find where the precision
> is least adequate. Once you've done the best you can, you can put that into
> a unit test.
>
> This gives you a simple way of reporting the accuracy of library functions
> and identities, and for performing more powerful "second-order" unit tests
> (e.g. acos(cos(x)) must be close to x for all -PI/2 <= x <= PI/2).
> Then, if you know that cos(x) is accurate, you've proved that acos(x) is
> also accurate. My goal is to provide one-line unit tests for most of the
> math functions.

In another post I pointed out that sin(PI) (which I just actually tried and got something around -1e-20) tests as having no digits of precision in common with 0. So I think in general an absolute component to the notion of equality will be important if you want trig functions to behave "normally".

> You're right: for equality tests, it probably is necessary to have separate
> implementations for double and float, because subnormals become normal when
> moved to higher precision. It's only a problem if your intermediate
> calculations were done in double or float precision, instead of reals.
> What is true, though, is that
> if precisionEquality(real x, real y) >= double.mant_dig
> then cast(double)x == cast(double)y.
> The converse does not apply because of the aforementioned subnormals. (But
> in this case you kind of "got lucky".)

Yup. I have no problem with doing things with real - I think that's the way to go. I'm just suggesting adding some overloads to make the common case of double and float easier to use.

> I did the real case first because it is the most difficult. It's trivial to
> extend it to other precisions.
>
> -Don.
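For readers following the thread, here is a minimal D sketch of the two-parameter test Ben suggests above. The function name approxEq and the parameter names absTol/relTol are assumptions for illustration, not code posted in the thread:

import std.math;

// Sketch of a combined test: pass if x and y are within absTol of each
// other, or within relTol relative to the larger magnitude.
bool approxEq(real x, real y, real absTol, real relTol)
{
    real diff = fabs(x - y);
    return diff <= absTol || diff <= relTol * fmax(fabs(x), fabs(y));
}

unittest
{
    assert(approxEq(3.0L, 3.001L, 0.0L, 1e-3L));   // relative tolerance match
    assert(approxEq(0.0L, 1e-12L, 1e-9L, 1e-3L));  // absolute tolerance match near zero
    assert(!approxEq(0.0L, 1e-3L, 1e-9L, 1e-3L));  // fails both
}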
August 16, 2005 Re: Submission: Floating point near-equality
Posted in reply to Ben Hinkle

Ben Hinkle wrote:
>> The point is that in IEEE arithmetic, the relative and absolute tolerances
>> are related via the number of bits of precision.
>> Near zero, "bits of precision" looks like an absolute tolerance. Away from
>> zero, it looks like relative tolerance.
>
> Do you mean "near one"? Near zero life gets very interesting. If I
> understand your algorithm correctly, the number 1e-200 is still very far
> away from 0, or even from 1e-201 or -1e-200. Then again, I might not
> understand the algorithm.

I did mean "near zero". You just need to go much smaller with 80-bit reals! real.min = 3.3e-4932 !!
And the smallest representable number is real.min * real.epsilon = 3.3e-4951. It's just amazing how much range you have with long doubles.
So 1e-200 is a very poor approximation of 0; there are *billions* of better IEEE reals!
My algorithm only counts 4e-4932 as close to 0 (and it's also close to -4e-4932). It measures closeness in "representability space".
You could reasonably argue that 1e-4900 is close to 1e-4930. But in representability space they are no closer than 1 vs 1e30. The question then is, is representability space what we want?
(I suspect the answer is: sometimes, but not always. Maybe about half the time?)

> In another post I pointed out that sin(PI) (which I just actually tried and
> got something around -1e-20) tests as having no digits of precision in
> common with 0. So I think in general an absolute component to the notion of
> equality will be important if you want trig functions to behave "normally".

Ah. I'm learning a lot here. 1e-20 is normally an exceedingly poor approximation to 0. But, since sin(x) maps "linearly" to the interval (-1, 1), 1e-20 is as close to 0 as sin(x) will get.
Or alternatively, sin(PI) is small but not close to 0, because PI is not close enough to the mathematical pi. sin(n*PI) == 0 is not an identity which is preserved! But sin(x)*sin(x) + cos(x)*cos(x) == 1 seems to be preserved for all x.
Yet there are clearly many contexts when you want to treat sin(PI) as zero. The relevant quantity seems to be the difference divided by the range.
For trig functions (sin and cos, at least), it seems you only want an absolute difference, regardless of how close to 0 you are.
A few weeks with D, and I've already learned more about numerical analysis than in years with C++ :-)

-Don.
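A rough D sketch of the "representability space" idea, counting how many bits of precision two reals share. This approximates what Don's precisionEquality (and the feqrel function that later appeared in std.math) computes, but it is not the posted implementation:

import std.math;

// Number of mantissa bits on which x and y agree, estimated from the
// relative difference. Infinities, NaNs and subnormals need extra care.
int bitsInCommon(real x, real y)
{
    if (x == y) return real.mant_dig;              // identical: full precision
    real scale = fmax(fabs(x), fabs(y));
    real diff  = fabs(x - y);
    if (diff >= scale) return 0;                   // not even the leading bit agrees
    int bits = cast(int)(-log2(diff / scale));     // -log2 of the relative error
    return bits > real.mant_dig ? real.mant_dig : bits;
}

Note how this metric behaves at zero: bitsInCommon(1e-200L, 0.0L) is 0, which is exactly the point Ben raises above.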
August 16, 2005 Re: Submission: Floating point near-equality
Posted in reply to Don Clugston

"Don Clugston" <dac@nospam.com.au> wrote in message news:ddrvuo$2e0g$1@digitaldaemon.com...
> Ben Hinkle wrote:
>>> The point is that in IEEE arithmetic, the relative and absolute
>>> tolerances are related via the number of bits of precision.
>>> Near zero, "bits of precision" looks like an absolute tolerance. Away
>>> from zero, it looks like relative tolerance.
>>
>> Do you mean "near one"? Near zero life gets very interesting. If I
>> understand your algorithm correctly, the number 1e-200 is still very far
>> away from 0, or even from 1e-201 or -1e-200. Then again, I might not
>> understand the algorithm.
>
> I did mean "near zero". You just need to go much smaller with 80-bit
> reals! real.min = 3.3e-4932 !!
> And the smallest representable number is real.min * real.epsilon =
> 3.3e-4951. It's just amazing how much range you have with long doubles.
> So 1e-200 is a very poor approximation of 0; there are *billions* of
> better IEEE reals!
> My algorithm only counts 4e-4932 as close to 0 (and it's also close
> to -4e-4932). It measures closeness in "representability space".

Ah, OK. I didn't think you meant the underflow numbers as the only things near 0. To me that's waaaay too near (typically). I'm talking just about normalized (i.e. "proper") numbers.
In regular everyday use the concept of how close to zero is "close" is context dependent. When you have a simple number like x = 1+eps (which is very close to 1 in anyone's book) and then you ask "is x-1 near zero?", the answer depends on whether you consider eps to be near zero. By your measurement, just about all the time nothing will be near zero except zero (in particular, eps will be extremely far away). Or to put it another way, a given application might consider 3 and 3.001 to be close, 2 and 2.001 close, and 1 and 1.001 close, but as soon as it tries to say 0 and .001 are close, your test would say they are infinitely far apart. In my experience a rough measurement of near zero is abs(x) < sqrt(eps). That's why people need to be able to control both the relative tolerance (i.e. bits of precision in agreement) and the absolute tolerance (i.e. distance to zero).

> You could reasonably argue that 1e-4900 is close to 1e-4930.
> But in representability space they are no closer than 1 vs 1e30. The
> question then is, is representability space what we want?
> (I suspect the answer is: sometimes, but not always. Maybe about half the
> time?)

You mean in your measurement they are not close. In typical numerical apps that manipulate numbers near 1, a value of 1e-4900 is definitely near zero. It all depends on the scale of the problem being solved. The numerical applications I can think of in fact force the user to choose what they mean when they say "are these roughly equal". There isn't just one notion of equal or approximate, so maybe the thing to do is make feqabs and feqrel. I'd say your precision test would make a great feqrel.
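To make Ben's point concrete, a small D sketch; feqrel here refers to the bits-of-precision function that eventually appeared in std.math, and is used only for illustration:

import std.math;
import std.stdio;

void main()
{
    // Plenty of agreement away from zero...
    writefln("3 vs 3.001:  %s bits", feqrel(3.0L, 3.001L));   // around 11 bits
    writefln("1 vs 1.001:  %s bits", feqrel(1.0L, 1.001L));   // around 10 bits
    // ...but essentially nothing is "close" to zero in this metric.
    writefln("0 vs 0.001:  %s bits", feqrel(0.0L, 0.001L));   // 0 bits

    // The rule-of-thumb absolute test mentioned above.
    real x = 1e-12L;
    writefln("near zero?   %s", fabs(x) < sqrt(real.epsilon));
}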
August 17, 2005 Re: Submission: Floating point near-equality
Posted in reply to Ben Hinkle

Ben Hinkle wrote:
> you consider eps to be near zero. By your measurement, just about all the
> time nothing will be near zero except zero (in particular, eps will be
> extremely far away). Or to put it another way, a given application might
> consider 3 and 3.001 to be close, 2 and 2.001 close, and 1 and 1.001 close,
> but as soon as it tries to say 0 and .001 are close, your test would say
> they are infinitely far apart.

Yes. There is so much precision available near zero, it's insane. It's clear that with IEEE extendeds, you'll almost never underflow to zero.

> In my experience a rough measurement of near zero is abs(x) < sqrt(eps).
> That's why people need to be able to control both the relative tolerance
> (i.e. bits of precision in agreement) and the absolute tolerance (i.e.
> distance to zero).

Agreed.

>> You could reasonably argue that 1e-4900 is close to 1e-4930.
>> But in representability space they are no closer than 1 vs 1e30. The
>> question then is, is representability space what we want?
>> (I suspect the answer is: sometimes, but not always. Maybe about half the
>> time?)
>
> You mean in your measurement they are not close. In typical numerical apps
> that manipulate numbers near 1, a value of 1e-4900 is definitely near zero.

Well, it is a genuine feature of the representation. I think that almost all of the time, having 64 bits available at 1e-4900 is not much use. They are definitely less useful than having 64 bits near 1!

> It all depends on the scale of the problem being solved. The numerical
> applications I can think of in fact force the user to choose what they mean
> when they say "are these roughly equal". There isn't just one notion of
> equal or approximate, so maybe the thing to do is make feqabs and feqrel.
> I'd say your precision test would make a great feqrel.

I concede -- you're right. There are actually *three* 'close enough' tests, not two:
(1) relative error
(2) absolute error near zero, to cope with subnormals/limited precision of FP numbers
(3) absolute error near zero, which is application-specific.
My code deals with (1) and (2), which are intimately related via the machine representation. But test (3) is important. As you say, it is going to be very rare that the difference between 1e-4900 and 1e-4930 is significant.
It seems like a difficult problem in general, though. Consider the case you mentioned earlier, sin(PI). Consider writing a unit test that sin(n*PI) == 0 for all n.

sin(0)        =  0
sin(PI)       = -5e-20
sin(PI*1e5)   =  3.7e-15
sin(PI*1e10)  =  1.4e-9
sin(PI*1e15)  =  6.7e-5
sin(PI*1e18)  =  0.041
sin(PI*1e20)  =  3.14159e+20  ??????  Hmmm. Also cos(PI*1e20) = 3.14159e+20
sin(infinity) =  nan

It appears to be impossible to come up with an absolute tolerance for what "close enough" means for all these cases; clearly it depends on the tolerance of the input. Can this be codified somehow?
I think you'd hope that the following should be true for all x, n:

-1 <= sin(x) <= 1
sin(x)*sin(x) + cos(x)*cos(x) = 1
sin(n*PI) is closer to zero than sin(n*PI + x.epsilon) and sin(n*PI - x.epsilon), if n*PI + PI >= n*PI * (1 + 4*epsilon)
(i.e., sin(x) crosses zero between sin(n*PI) and sin(n*PI +- epsilon)).

Anyway, sin(PI*1e20) fails any test I can think of. I think it should return nan.

-Don.
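One way to codify "it depends on the tolerance of the input" for this particular identity, an assumption on my part rather than something proposed in the thread, is to let the zero-tolerance grow with the argument, since n*PI itself only carries a relative error on the order of real.epsilon:

import std.math;

// Sketch: accept sin(x) as "zero" if it is no larger than the error we
// expect from representing x itself (the factor 4 is an arbitrary margin).
bool sinOfPiMultipleIsZero(real x)
{
    return fabs(sin(x)) <= 4.0L * fabs(x) * real.epsilon;
}

unittest
{
    assert(sinOfPiMultipleIsZero(PI));          // sin(PI) ~ -5e-20
    assert(sinOfPiMultipleIsZero(PI * 1e10L));  // small, and the tolerance scales too
    assert(!sinOfPiMultipleIsZero(1.0L));       // sin(1) ~ 0.84 is nowhere near zero
}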
August 17, 2005 Re: Floating point near-equality: near zero and infinity
Posted in reply to Ben Hinkle

Ben Hinkle wrote:
> In typical numerical apps that manipulate numbers near 1, a value of
> 1e-4900 is definitely near zero.
> It all depends on the scale of the problem being solved.

IDEA: Do you think the same applies for very large numbers?
For example, I think 1e4900 would often be acceptably close to 1e4930, and both are reasonable approximations to infinity (and in fact they'll be equal to infinity if you later truncate them to double or float precision).

The IEEE number line is basically

-inf         -1           0           1         +inf
  |-----------|-----------|-----------|-----------|

with all four sections of almost equal length.

If this is the case, then actually you don't want an absolute tolerance: you want a comparison of the exponent bits.

So, the appropriate algorithm is something like:
if it is near 1, then compare the mantissa bits. Result: 0 to mant_dig.
if it is near 0 or infinity, compare the exponent bits (4930 vs 4900).

This would mean that 0 and real.min would be equal to 16 bits, even before taking the mantissa into account. But you'd only start counting from the first set bit in the exponent.
E.g. 0x1.0p+1 and 0x1.1111p+1 would have only one bit in common from the exponent + 1 bit from the mantissa = 2 bits in total.
But 0x1.00p+0x3FFF (6e4931) and 0x1.1111p+0x3F00 (1e4850) would have 6 bits in common from the exponent + 0 from the mantissa.

Q. Is there any way to combine these two?
You might hope that
precisionEquality(x, y) >= double.something
if and only if cast(double)x == cast(double)y;
and likewise for float.
This means that double.min would have to have as much closeness to 0 as 1+double.epsilon does to 1.
I think that when you add the requirements for float, it becomes mathematically impossible. But it might lead in the right direction.

-Don.
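A speculative D sketch of the "compare the exponent bits" half of the algorithm described above. The helper name is an assumption, the bias assumes 80-bit x87 reals, and zero, subnormals, NaN and infinity would all need separate treatment:

import std.math;     // ilogb
import core.bitop;   // bsr

// For two finite, normalized, nonzero reals, count how many leading bits
// of the 15-bit biased exponent field agree.
int exponentBitsInCommon(real x, real y)
{
    enum bias = 16383;                        // 80-bit extended exponent bias
    int ex = ilogb(x) + bias;
    int ey = ilogb(y) + bias;
    if (ex == ey) return 15;                  // whole exponent field matches
    int highestDiff = bsr(cast(uint)(ex ^ ey)); // highest bit where they differ
    return 15 - 1 - highestDiff;              // bits above it are in common
}

With this, 0x1.00p+0x3FFF and 0x1.1111p+0x3F00 come out as 6 exponent bits in common, matching the count in the post above.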
August 17, 2005 Re: Floating point near-equality: near zero and infinity
Posted in reply to Don Clugston

"Don Clugston" <dac@nospam.com.au> wrote in message news:ddudr8$1nqo$1@digitaldaemon.com...
> Ben Hinkle wrote:
>> In typical numerical apps that manipulate numbers near 1, a value of
>> 1e-4900 is definitely near zero.
>> It all depends on the scale of the problem being solved.
>
> IDEA: Do you think the same applies for very large numbers?
> For example, I think 1e4900 would often be acceptably close to 1e4930,
> and both are reasonable approximations to infinity (and in fact they'll be
> equal to infinity if you later truncate them to double or float precision).
>
> The IEEE number line is basically
>
> -inf         -1           0           1         +inf
>   |-----------|-----------|-----------|-----------|
>
> with all four sections of almost equal length.

Actually I like your original relative equality for large numbers. It's only at zero that relative inequality diverges from people's expectations of the real line (which is how one thinks of floating point numbers casually). Now if the real line were a real circle instead (connect -inf to +inf), then I'd say numbers near +-inf should be treated like numbers near zero.

> If this is the case, then actually you don't want an absolute tolerance:
> you want a comparison of the exponent bits.
>
> So, the appropriate algorithm is something like:
> if it is near 1, then compare the mantissa bits. Result: 0 to mant_dig.
> if it is near 0 or infinity, compare the exponent bits (4930 vs 4900).
>
> This would mean that 0 and real.min would be equal to 16 bits, even before
> taking the mantissa into account.
> But you'd only start counting from the first set bit in the exponent.
> E.g. 0x1.0p+1 and 0x1.1111p+1 would have only one bit in common from the
> exponent + 1 bit from the mantissa = 2 bits in total.
> But 0x1.00p+0x3FFF (6e4931) and 0x1.1111p+0x3F00 (1e4850) would have 6 bits
> in common from the exponent + 0 from the mantissa.
>
> Q. Is there any way to combine these two?
> You might hope that
> precisionEquality(x, y) >= double.something
> if and only if cast(double)x == cast(double)y;
> and likewise for float.
> This means that double.min would have to have as much closeness to 0 as
> 1+double.epsilon does to 1.
> I think that when you add the requirements for float, it becomes
> mathematically impossible. But it might lead in the right direction.
>
> -Don.

I still think the way to go is either
1) have one function with two knobs - one for abs tol and one for rel tol, or
2) have two functions each with one knob - one for abs tol and one for rel tol.
The function I posted originally was feq with two knobs, but now I'm thinking it's probably a better idea to have two functions. With one, it's easy to forget which is the abstol and which is the reltol, since D doesn't have named optional parameters. So I'd prefer two functions, feqrel and feqabs. The name feqrel tells the user that essentially nothing is close to zero, since a relative tolerance is used (I don't mind that feqrel doesn't actually use a relative tolerance for denormalized numbers - it might just be easier to say underflow values are all approximately 0, but who knows). The name feqabs tells users not to use it for large numbers and that it matches regular closeness notions for numbers near zero.
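A minimal sketch of the two-function interface Ben proposes. The feqabs body is my guess at an implementation, and the relative test simply defers to a bits-in-common function such as the precisionEquality/feqrel discussed above (feqrel here is the function that later landed in std.math):

import std.math;

// Absolute-tolerance test: matches everyday "close to zero" intuition,
// but is not meaningful for very large numbers.
bool feqabs(real x, real y, real absTol)
{
    return fabs(x - y) <= absTol;
}

// Relative test expressed as "at least minBits bits of precision agree";
// under it essentially nothing is close to zero except zero itself.
bool feqrelBits(real x, real y, int minBits)
{
    return feqrel(x, y) >= minBits;
}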
August 17, 2005 Re: Floating point near-equality: near zero and infinity
Posted in reply to Don Clugston

In article <ddudr8$1nqo$1@digitaldaemon.com>, Don Clugston says...
>
> The IEEE number line is basically
>
> -inf         -1           0           1         +inf
>   |-----------|-----------|-----------|-----------|
>
> with all four sections of almost equal length.

What is the formula that defines this coordinate system? Why not use this to redefine the numbers in terms of this system, and then take their distance apart and compare that to an absolute tolerance for approximately equal? That would scale correctly.

-Sha
August 18, 2005 Re: Floating point near-equality: near zero and infinity
Posted in reply to Shammah Chancellor

Shammah Chancellor wrote:
> In article <ddudr8$1nqo$1@digitaldaemon.com>, Don Clugston says...
>
>>The IEEE number line is basically
>>
>> -inf         -1           0           1         +inf
>>   |-----------|-----------|-----------|-----------|
>>
>>with all four sections of almost equal length.
>
>
> What is the formula that defines this coordinate system? Why not use this to
> redefine the numbers in terms of this system and then take their distance apart and compare that to an
> absolute tolerance for approximately equal. That would scale correctly.
>
>
> -Sha
>
Yes, and that's exactly what my code does, but as Ben has pointed out, the internal representation frequently doesn't match your application.
Specifically, IEEE reals are fantastic at distinguishing really small numbers, but you often don't care whether a number is 1e-2000 or 1e-4000: it's just "really really small".
I think the problem only really comes up when comparing numbers to zero,
or possibly to infinity.
Even 0.000000000000000000001 is very close to 1, and very very far from 0 on the IEEE number line.
Actually comparison with 0 is complicated, because sometimes you demand exact equality, and sometimes 'tiny' is acceptable. Also sometimes sign
matters (a tiny positive number might be completely different to a tiny negative number).
-Don.
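A tiny sketch of how the sign-sensitive zero comparison described above might look; this is my own assumption, not code from the thread:

import std.math;

// "Near zero" with an application-chosen tolerance; optionally reject
// negative values when the sign matters (e.g. a quantity that must not
// go below zero). -0.0 still counts as zero here.
bool nearZero(real x, real tol, bool rejectNegative = false)
{
    if (rejectNegative && signbit(x))
        return x == 0;
    return fabs(x) <= tol;
}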
August 18, 2005 Re: Floating point near-equality: near zero and infinity
Posted in reply to Don Clugston

"Don Clugston" <dac@nospam.com.au> wrote in message news:de0ms5$o94$1@digitaldaemon.com...
> Shammah Chancellor wrote:
>> In article <ddudr8$1nqo$1@digitaldaemon.com>, Don Clugston says...
>>
>>> The IEEE number line is basically
>>>
>>> -inf         -1           0           1         +inf
>>>   |-----------|-----------|-----------|-----------|
>>>
>>> with all four sections of almost equal length.
>>
>> What is the formula that defines this coordinate system? Why not use this
>> to redefine the numbers in terms of this system, and then take their
>> distance apart and compare that to an absolute tolerance for approximately
>> equal? That would scale correctly.
>>
>> -Sha
>
> Yes, and that's exactly what my code does, but as Ben has pointed out, the
> internal representation frequently doesn't match your application.
> Specifically, IEEE reals are fantastic at distinguishing really small
> numbers, but you often don't care whether a number is 1e-2000 or 1e-4000:
> it's just "really really small".
> I think the problem only really comes up when comparing numbers to zero,
> or possibly to infinity.
> Even 0.000000000000000000001 is very close to 1, and very very far from 0
> on the IEEE number line.
> Actually comparison with 0 is complicated, because sometimes you demand
> exact equality, and sometimes 'tiny' is acceptable. Also sometimes sign
> matters (a tiny positive number might be completely different to a tiny
> negative number).
>
> -Don.

Another way to describe your metric is as the usual Euclidean metric on the following number lines:

   4           2           1          1/2         1/4
---|-----------|-----------|-----------|-----------|----

   subnormal  0  -subnormal
 --|--|--|--|--|--|--|--|--|--

 -1/4        -1/2         -1          -2          -4
---|-----------|-----------|-----------|-----------|----

For example, if two numbers like 2.001 and 2.002 are d distance apart, then 4.002 and 4.004 are also d distance apart, because they have the same mantissas and just differ in the exponents. That's also why essentially nothing is close to 0, and why the positive and negative axes are "not comparable". It's also why 0 is so special. That is, the formula for the coordinate system is log2(x).
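For illustration, the log2 "coordinate system" Ben describes can be written as a one-line distance function in D; the names and the unittest tolerance are assumptions:

import std.math;

// Distance between two positive reals in the log2 coordinate system.
// Zero, negative numbers and the subnormal range sit outside this metric.
real log2Distance(real x, real y)
{
    return fabs(log2(x) - log2(y));
}

unittest
{
    // 2.001 vs 2.002 and 4.002 vs 4.004 have the same mantissas and differ
    // only in the exponent, so they are (nearly) the same distance apart.
    real d1 = log2Distance(2.001L, 2.002L);
    real d2 = log2Distance(4.002L, 4.004L);
    assert(fabs(d1 - d2) < 1e-10L);
}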