November 12, 2013
On 11/12/13 9:59 AM, Vladimir Panteleev wrote:
> On Tuesday, 12 November 2013 at 16:48:01 UTC, Andrei Alexandrescu wrote:
>> On 11/12/13 8:00 AM, Vladimir Panteleev wrote:
>>> 1) NaN values are accepted as input almost anywhere where doubles are.
>>> std.conv recognizes the string "nan" and converts it to double.nan,
>>
>> They are still singular values. Code has no business comparing them
>> unwittingly.
>>
>>> 2) It is unreasonable to expect every D user out there dealing with
>>> floating point numbers to be forced to check every source of floating
>>> point values in their program against NaNs.
>>
>> I find that perfectly reasonable. How do you mean that?
>
> Both of the above hinge on the relative obscurity of NaNs and the
> problems they could cause. People who are not specifically familiar with
> NaNs and how they interact with other floating-point values will treat
> floating-point values as "just numbers", and expect them to compare and
> sort in the same way as integers.

I think that's their problem, not ours.

>> The language allows NaN floating point number with the semantics of
>> IEEE 754. We hardly have any leeway in the matter unless we want to
>> irate a lot of people for no good reason.
>
>> It's a judgment call, not a cut and dried decision.
>
> Agreed - however, I think there might be more than one way to deal with
> this.
>
> 1) As above, introduce a "less" function which guarantees transitivity
> for basic types, and use it in examples throughout.

This is a bit much.

> 2) Improve documentation and visibility of this problem. For example,
> add this to the documentation of sort:
>
> "The predicate must be transitive (`a<b && b<c` implies `a<c`) in order
> for `sort` to behave properly - otherwise, the program may fail on
> certain inputs (but not others) when not compiled in release mode. Note
> that `a<b` is not transitive for floating points, because the expression
> will always be `false` when either `a` or `b` is `NaN`."

Great idea, just file an enh request or just write the pull request.

>> We should simply add an assert to isSorted.
>
> How would this be implemented, anyway? I stumbled upon this problem with
> this code:
>
> enum sortPred = q{a.hits > b.hits || (a.hits == b.hits && a.value <
> b.value)};
> all.sort!sortPred();
>
> (I know about multiSort, but it currently seems to be buggy - I do have
> a test case, but it has a few million entries and I need to reduce it.)
>
>> if (a[i] < a[i + 1]) { assert(!(a[i + 1] < a[i])); ... }
>> else { assert(a[i + 1] >= a[i]); ... }
>
> Sort et al don't know how to check "a>=b"... and they can't use "a<b"
> either, because !(a<b) && !(b<a) implies a==b.

It really implies "a and b are in the same equivalence class". Building on that notion, we can figure what's wrong with NaNs and how to assert against their failings.

Consider we define equiv(a, b) as (a, b) => !less(a, b) && !less(b, a).

If at least one of the numbers is NaN, all comparisons return false. That puts NaN in the same equivalence class with all numbers, including numbers that are deemed in distinct equivalence classes. That's where NaNs mess things up, and that's what we need to assert. Consider three numbers a, b, and c. Then the following must pass:

assert((equiv(a, b) && equiv(b, c)) <= equiv(a, c));

("<=" on Booleans is actually implication.)

That test will fail with NaNs and should be part of isSorted, sort etc.

We should also check such as:

assert(!less(a, a));
assert(less(a, b) <= !less(b, a));

These should be checked in sort only; isSorted does not need these.


Andrei

November 12, 2013
On 11/12/13 10:31 AM, tn wrote:
> On Tuesday, 12 November 2013 at 15:56:25 UTC, Andrei Alexandrescu wrote:
>> On 11/12/13 12:54 AM, tn wrote:
>>> ...
>>> An alternative would be to check for nans explicitly.
>>
>> I think NaNs are singular unordered values that make invalid inputs
>> for either sort or isSorted. We should simply add an assert to isSorted.
>
> But sort and isSorted both allow user to provide a custom "less"
> function. What if the user needs NaNs and, for example, wants to sort
> them before all the other values? Then he calls "arr.sort!((a,b) => a <
> b || (isnan(a) && !isnan(b)))" and the code asserts?

I think this is a misunderstanding. I never advocated inserting specialized asserts for floating-point numbers. http://goo.gl/3dGGkf takes care of that the right way - by making sure that the less predicate is sensible for the kind of work isSorted and sort do. That is regardless of the nature of the objects being sorted.

Andrei

November 12, 2013
On 11/12/13 11:06 AM, Vladimir Panteleev wrote:
> That doesn't change the reality that not everyone is aware of these
> issues, and that even people who are aware of them may overlook
> something somewhere. We can't pretend that there is no problem just
> because it's something you "have to know" and "have to be careful about"
> - if we can improve the situation without making a compromise elsewhere,
> then it must be done.

I agree. I'll just add that people who plan to get work done with floating point will inevitably stumble upon NaN. There's no two ways about it.

> I don't see this stance as any different than shouting "RTFM!!!" at
> anyone who makes a mistake which, in theory, could be avoided by
> studying, memorization and careful application of the appropriate
> documentation located ... somewhere. This approach does not work - even
> if that user has learned his lesson the hard way, more new users will
> make the same mistake. Instead, you must actually make the problem less
> likely to manifest - improve your documentation, or improve the design
> of your system such that such errors will be less likely to occur.

I agree here, too. Again, I'll add that "improving the documentation" helps little unless RTFM is at work :o).

> This particular situation affects only one specific property of
> floating-point numbers, so other properties are not relevant to this
> discussion. Personally, I've been taught about floating-point precision
> issues back in school, but I've only learned about NaNs while learning
> D. Call me an unlearned idiot if that makes you feel better, but it is
> rather likely that there will be more users making the same mistake in
> the future if the situation is not addressed.

You're young. I guarantee you were to hit NaN whether or not D had anything to do with it.

> Furthermore, the problem is aggravated by that it is hidden by generic
> code. This thread isn't a complaint that "the line `if (a<b)` in my
> program behaves in a funny way" - there are no explicit floating-point
> comparisons in the example in the original post. We are calling sort, a
> function in the standard library, with its default predicate, as
> specified in the standard library, using a built-in type, which is part
> of the language, and it fails its own assertion on certain input. This
> is a problem!

Yes.

> I have suggested possible improvements earlier in this thread, so I'd
> like to ask you to comment on those rather than hand-waving the problem
> away.

I think your solutions are not good. I also think my solution in http://goo.gl/3dGGkf is good. (I know it sounds awful, but I hope you'll appreciate the brevity :o).) The solution I propose explains exactly how NaN messes up the ordering comparison operator, in a way that doesn't make the tests refer to NaN or floating point numbers in particular.


Andrei

November 12, 2013
On Tuesday, 12 November 2013 at 19:39:19 UTC, Vladimir Panteleev wrote:
> On Tuesday, 12 November 2013 at 19:26:12 UTC, tn wrote:
>> Thus, the correct solution would be to modify the functions to take "lessOrEqual" as a template parameter instead of "less".
>
> Too late for that now... just like it's too late to change IEEE comparison logic.

Yes, I was expecting this. :)

>> If "less" is kept, then we need another template argument to separate equality and incomparability (something like "equals(a,b)", or "incomparable(a,b)").
>
> Is this to allow generic partial ordering (rather than to fix sorting NaNs specifically)? I remember seeing a thread about partial ordering and opCmp:
> http://forum.dlang.org/post/dpfbtu$1ocf$1@digitaldaemon.com

I had generic partial orderings in my mind too, but I think it is also needed for user defined floating point types such as bigfloat. (Unless "isnan" function is defined for the bigfloat type too and detected automatically.)
November 12, 2013
On 11/12/2013 12:24 PM, Andrei Alexandrescu wrote:
> You're young. I guarantee you were to hit NaN whether or not D had anything to
> do with it.

I want to add that NaN has been a reality on x86 machines since about 1983. C (and C++) compilers for decades have utterly ignored its existence - but that didn't make it go away. It just meant that things like the fp functions in the Standard library exhibited undefined behavior with NaNs, which is far worse than at least having sensible documented behavior.

The C compilers I wrote (and C++) always made an effort to handle NaN correctly (and overflows, underflows, and subnormals), and I often felt that I was the only one who cared about it :-)

What D does is simply recognize that NaNs are an inevitable characteristic of the floating point hardware, and deal with them in the most straightforward manner practical. Deciding and documenting what .sort should do with NaNs is part of that. Trying to pretend that NaNs aren't a fact of life with IEEE floating point hardware is not going to work.

Note:

http://dlang.org/phobos/std_math.html#.exp

and how the behavior with NaN arguments is all carefully documented.
November 12, 2013
On Tuesday, 12 November 2013 at 20:03:37 UTC, Andrei Alexandrescu wrote:
>> Both of the above hinge on the relative obscurity of NaNs and the
>> problems they could cause. People who are not specifically familiar with
>> NaNs and how they interact with other floating-point values will treat
>> floating-point values as "just numbers", and expect them to compare and
>> sort in the same way as integers.
>
> I think that's their problem, not ours.

(I partially disagree here - see my reply to Walter.)

>> Sort et al don't know how to check "a>=b"... and they can't use "a<b"
>> either, because !(a<b) && !(b<a) implies a==b.
>
> It really implies "a and b are in the same equivalence class". Building on that notion, we can figure what's wrong with NaNs and how to assert against their failings.
>
> Consider we define equiv(a, b) as (a, b) => !less(a, b) && !less(b, a).
>
> If at least one of the numbers is NaN, all comparisons return false. That puts NaN in the same equivalence class with all numbers, including numbers that are deemed in distinct equivalence classes. That's where NaNs mess things up, and that's what we need to assert. Consider three numbers a, b, and c. Then the following must pass:
>
> assert((equiv(a, b) && equiv(b, c)) <= equiv(a, c));

OK, we're on the same page so far (although you've presented the problem more eloquently).

> ("<=" on Booleans is actually implication.)

(Cool!)

> That test will fail with NaNs and should be part of isSorted, sort etc.

OK, but which a, b and c will be checked? Taking all adjacent triples will not work with two adjacent NaNs.

> We should also check such as:
>
> assert(!less(a, a));
> assert(less(a, b) <= !less(b, a));

Again, for which a/b?
November 12, 2013
On 11/12/2013 10:25 AM, Bigsandwich wrote:
> Please, please, please just no.  As someone who works with floating point daily,
> you cannot idiot proof it like this.  They will never behave like "just numbers"
> and you can't make them.
>
> Even leaving out NAN, you have issues with precision, accumulated error,
> denormals, equality comparison, and on and on.
>
> If you don't know what you are doing with them, then you just shouldn't be
> touching them.  Unicode has similar issues.

I think it's apropos to reference this famous document:

"What Every Computer Scientist Should Know About Floating-Point Arithmetic"

http://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html

I know, I know, RTFM, but one cannot ignore this stuff and write professional quality fp code, nor is it practical to have the language ignore it for you.
November 12, 2013
On Tuesday, 12 November 2013 at 20:24:11 UTC, Andrei Alexandrescu wrote:
> Again, I'll add that "improving the documentation" helps little unless RTFM is at work :o).

The distinction between good and bad documentation in this context is whether the respective information is discoverable. That is, don't make users ask "How should I have known that?". A warning on the documentation of std.algorithm.sort is good, but an article on floating-points merely included together with the documentation, and not referenced from anywhere, isn't.

> I think your solutions are not good. I also think my solution in http://goo.gl/3dGGkf is good. (I know it sounds awful, but I hope you'll appreciate the brevity :o).)

Brevity appreciated :D
November 12, 2013
On Tuesday, 12 November 2013 at 20:03:37 UTC, Andrei Alexandrescu wrote:
> Consider we define equiv(a, b) as (a, b) => !less(a, b) && !less(b, a).
>
> If at least one of the numbers is NaN, all comparisons return false. That puts NaN in the same equivalence class with all numbers, including numbers that are deemed in distinct equivalence classes. That's where NaNs mess things up, and that's what we need to assert. Consider three numbers a, b, and c. Then the following must pass:
>
> assert((equiv(a, b) && equiv(b, c)) <= equiv(a, c));
>
> ("<=" on Booleans is actually implication.)

Shouldn't the implication be to the other direction? Then it becomes

assert((equiv(a, b) && equiv(b, c)) => equiv(a, c));

> That test will fail with NaNs and should be part of isSorted, sort etc.

But it requires that the input contains at least three elements. And it has to test a lot of possible triples (a,b,c), which I think requires at least O(n^2) time. And it does not fail consistently whenever the input contains at least one NaN (consider an input that contains only NaNs).

> We should also check such as:
>
> assert(!less(a, a));
> assert(less(a, b) <= !less(b, a));
>
> These should be checked in sort only; isSorted does not need these.

The latter fails also if a==b. (Unless the direction of the implication is again wrong.)
November 12, 2013
On 11/12/13 1:03 PM, tn wrote:
> On Tuesday, 12 November 2013 at 20:03:37 UTC, Andrei Alexandrescu wrote:
>> Consider we define equiv(a, b) as (a, b) => !less(a, b) && !less(b, a).
>>
>> If at least one of the numbers is NaN, all comparisons return false.
>> That puts NaN in the same equivalence class with all numbers,
>> including numbers that are deemed in distinct equivalence classes.
>> That's where NaNs mess things up, and that's what we need to assert.
>> Consider three numbers a, b, and c. Then the following must pass:
>>
>> assert((equiv(a, b) && equiv(b, c)) <= equiv(a, c));
>>
>> ("<=" on Booleans is actually implication.)
>
> Shouldn't the implication be to the other direction? Then it becomes
>
> assert((equiv(a, b) && equiv(b, c)) => equiv(a, c));

It is in the other direction, but the latter doesn't compile :o). Again, it just turns out "a <= b" has the meaning of "a implies b". It's not a particularly intuitive way to express it.

>> That test will fail with NaNs and should be part of isSorted, sort etc.
>
> But it requires that the input contains at least three elements. And it
> has to test a lot of possible triples (a,b,c), which I think requires at
> least O(n^2) time. And it does not fail consistently whenever the input
> contains at least one NaN (consider an input that contains only NaNs).

I agree. Ideas?

>> We should also check such as:
>>
>> assert(!less(a, a));
>> assert(less(a, b) <= !less(b, a));
>>
>> These should be checked in sort only; isSorted does not need these.
>
> The latter fails also if a==b. (Unless the direction of the implication
> is again wrong.)

Try it!


Andrei