Why don't we switch to C-like floating point arithmetic instead of automatic expansion to reals?
August 03, 2016 (Seb)
There was a recent discussion on Phobos about D's floating point behavior [1]. I think Ilya summarized our problem quite elegantly:

> We need to argue with @WalterBright to switch to C-like floating point arithmetic instead of automatic expansion to reals (this is so horrible that it may kill D for science forever; @wilzbach can tell a story about Tinflex, and I can tell one about precise and KBN summations, which do not work correctly with 32-bit DMD). D has had a lot of long discussions about math and the GC. We are moving towards a great D without GC now. Well, I hope we will also have a D with normal FP arithmetic. But how many years do we need to change @WalterBright's opinion on this problem to the common view?

Here's my reply and experience with this; maybe it helps to show the problem.
I started to document all the bumps with FP math that I ran into on our mir wiki [2]. While some of these are expected, there are some that are horrible, cruel, and yield a constant fight against the compiler.
FP behavior that differs depending on (1) the target, (2) the compiler, or (3) the optimization level is very hard to work with. Wasn't the entire point of D to get this right and avoid weird, unproductive corner cases across architectures?
The problem with Tinflex and a lot of other scientific math is that you need reproducible, predictable behavior. For example, the Tinflex algorithm is quite sensitive because it (1) uses pow and exp, so errors accumulate quickly, and (2) keeps an ordered heap of remaining FP values, which means that due to the FP magic explained below I get totally different resulting values depending on the architecture. Note that the ordering itself is well defined for equality, e.g. the tuples (mymath(1.5), 100) and (mymath(1.5), 200) need to result in the same ordering. I don't want my program to fail just because I compiled it for 32-bit, but maybe code examples show more than words.

Consider the following program; it fails on 32-bit :/

```
alias S = double; // same for float

S fun(S)(in S x)
{
    return -1 / x;
}

void main()
{
    S i = fun!S(3);
    assert(i == S(-1) / 3); // this line passes
    assert(fun!S(3) == S(-1) / 3); // error on 32-bit

    // just to be clear, the following variants don't work either on 32-bit
    assert(fun!S(3) == S(-1.0 / 3));
    assert(fun!S(3) == cast(S) S(-1) / 3);
    assert(fun!S(3) == S(S(-1) / 3));
    assert(S(fun!S(3)) == S(-1) / 3);
    assert(cast(S) fun!S(3) == S(-1) / 3);
}
```

Maybe it's easier to see why this behavior is tricky when we look at it in action. For example, with the following program DMD for x86_64 will yield the same result for f1 and f2, whereas x86_32 will yield different ones.

```
import std.stdio;

alias S = float;
float shL = 0x1.1b95eep-4; // 0.069235
float shR = 0x1.9c7cfep-7; // 0.012588

F fun(F)(F x)
{
    return 1.0 + x * 2;
}

S f1()
{
    S r = fun(shR);
    S l = fun(shL);
    return (r - l);
}

S f2()
{
    return (fun(shR) - fun(shL));
}

void main()
{
    writefln("f1: %a", f1()); //  -0x1.d00cap-4
    writefln("f2: %a", f2());  // on 32-bit: -0x1.d00c9cp-4
    assert(f1 == f2); // fails on 32-bit
}
```

To make matters worse, std.math yields different results than the compiler/assembly intrinsics. Note that in this example importing std.math.pow adds about 1K instructions to the output assembly, whereas llvm_powf boils down to the powf assembly routine. The performance of powf is also a lot better: I measured [3] that std.math.pow takes ~1.5x as long for both LDC and DMD. If you need to call it very often, this cost isn't acceptable.

```
void main()
{
    alias S = float;
    S s1 = 0x1.24c92ep+5;
    S s2 = -0x1.1c71c8p+0;

    import std.math : std_pow = pow;
    import core.stdc.stdio : printf;
    import core.stdc.math: powf;

    printf("std: %a\n", std_pow(s1, s2)); // 0x1.2c155ap-6
    printf("pow: %a\n", s1 ^^ s2); // 0x1.2c155ap-6
    printf("powf: %a\n", powf(s1, s2)); // 0x1.2c1558p-6

    version(LDC)
    {
        import ldc.intrinsics : llvm_pow;
        printf("ldc: %a\n", llvm_pow(s1, s2)); // 0x1.2c1558p-6
    }
}
```

I excluded the discrepancies in FP arithmetic between Windows and Linux/macOS, as they are hopefully just a bug [4].

[1] https://github.com/dlang/phobos/pull/3217
[2] https://github.com/libmir/mir/wiki/Floating-point-issues
[3] https://github.com/wilzbach/d-benchmarks
[4] https://issues.dlang.org/show_bug.cgi?id=16344
August 04, 2016 (Andrew Godfrey)
On Wednesday, 3 August 2016 at 23:00:11 UTC, Seb wrote:
> There was a recent discussion on Phobos about D's floating point behavior [1]. I think Ilya summarized quite elegantly our problem:
>
> [...]

In my experience (production-quality FP coding in C++), you are in error merely by combining floating point with exact comparison (==). Even if you have just one compiler and architecture to target, you can expect instability if you do this. Writing robust FP algorithms is an art, and it's made harder if you use mathematical thinking, because FP arithmetic lacks many properties that integer or fixed-point arithmetic has.
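To make that concrete, the usual alternative to raw == is a tolerance-based comparison. A minimal sketch in D (the values and tolerances are mine, purely illustrative, and of course no substitute for the reproducibility being asked for):

```
import std.math : approxEqual, feqrel;
import std.stdio : writeln;

void main()
{
    double a = 1.0 / 3.0;
    double b = (1.0 / 30.0) * 10.0;   // algebraically equal to a, numerically maybe not

    writeln(a == b);                  // may be false: a last-bit difference is enough
    writeln(approxEqual(a, b));       // tolerance-based comparison: true
    writeln(feqrel(a, b) >= double.mant_dig - 1); // almost all significand bits agree
}
```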

I'm not saying D gets it right (I haven't explored that at all) but I am saying you need better examples.

Now, my major experience is in the context of Intel non-SIMD FP, where internal precision is 80-bit. I can see the appeal of asking for the ability to reduce internal precision to match the data type you're using, and I think I've read something written by Walter on that topic. But this would hardly be "C-like" FP support, so I'm not sure that's the topic at hand.
August 04, 2016 (Walter Bright)
On 8/4/2016 7:08 AM, Andrew Godfrey wrote:
> Now, my major experience is in the context of Intel non-SIMD FP, where internal
> precision is 80-bit. I can see the appeal of asking for the ability to reduce
> internal precision to match the data type you're using, and I think I've read
> something written by Walter on that topic. But this would hardly be "C-like" FP
> support so I'm not sure that's he topic at hand.

Also, on a careful reading of the C Standard, D's behavior is allowed by it. The idea that C requires rounding of all intermediate values to the target precision is incorrect, and is not "C-like". C floating point semantics can and do vary from platform to platform, and vary based on optimization settings, and this is all allowed by the C Standard.

It has been proposed many times that the solution for D is to have a function called toFloat() or something like that in core.math, which guarantees a round to float precision for its argument. But so far nobody has written such a function.
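For illustration, something along these lines (a hypothetical sketch with a name I made up, not an actual core.math API; a real version would need a guarantee from the compiler that the narrowing store is not kept in a wider register or folded away):

```
// Hypothetical sketch only. The intent is one IEEE rounding of the
// (possibly extended-precision) argument to T's precision; whether the
// assignment below actually guarantees that depends on codegen, which
// is exactly the part that needs compiler support.
T roundToPrecision(T)(real x) @safe pure nothrow @nogc
    if (is(T == float) || is(T == double) || is(T == real))
{
    T narrowed = x;   // intended: round x to T's precision here
    return narrowed;
}
```

With such a helper, Seb's failing assert could be spelled assert(roundToPrecision!S(fun!S(3)) == S(-1) / 3), assuming the compiler honours the narrowing.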
August 04, 2016 (Fool)
On Thursday, 4 August 2016 at 18:53:23 UTC, Walter Bright wrote:
> It has been proposed many times that the solution for D is to have a function called toFloat() or something like that in core.math, which guarantees a round to float precision for its argument. But so far nobody has written such a function.

How can we ensure that toFloat(toFloat(x) + toFloat(y)) does not involve double-rounding?
August 04, 2016 (Walter Bright)
On 8/4/2016 11:53 AM, Walter Bright wrote:
> It has been proposed many times that the solution for D is to have a function
> called toFloat() or something like that in core.math, which guarantees a round
> to float precision for its argument. But so far nobody has written such a function.

https://github.com/dlang/druntime/pull/1621
August 04, 2016 (Walter Bright)
On 8/4/2016 12:03 PM, Fool wrote:
> How can we ensure that toFloat(toFloat(x) + toFloat(y)) does not involve
> double-rounding?

It's the whole point of it.
August 04, 2016 (deadalnix)
On Thursday, 4 August 2016 at 18:53:23 UTC, Walter Bright wrote:
> On 8/4/2016 7:08 AM, Andrew Godfrey wrote:
>> Now, my major experience is in the context of Intel non-SIMD FP, where internal
>> precision is 80-bit. I can see the appeal of asking for the ability to reduce
>> internal precision to match the data type you're using, and I think I've read
>> something written by Walter on that topic. But this would hardly be "C-like" FP
>> support so I'm not sure that's he topic at hand.
>
> Also, carefully reading the C Standard, D's behavior is allowed by the C Standard. The idea that C requires rounding of all intermediate values to the target precision is incorrect, and is not "C-like". C floating point semantics can and do vary from platform to platform, and vary based on optimization settings, and this is all allowed by the C Standard.
>

It is actually very common for C compilers to work with double for intermediate values, which isn't far from what D does.
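For instance (a small sketch with values I picked; whether you get 0 or 1 here depends on whether the intermediate sum is widened, and constant folding at higher optimization levels can change it again):

```
import std.stdio : writefln;

void main()
{
    float a = 1e8f;   // exactly representable as a float
    float b = -1e8f;
    float c = 1.0f;

    // Strict float evaluation: a + c rounds back to 1e8 (the 1 is lost),
    // so the result is 0. With double or x87-real intermediates the 1
    // survives until the final store to r, so the result is 1.
    float r = a + c + b;
    writefln("%g", r);
}
```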

August 04, 2016 (Fool)
On Thursday, 4 August 2016 at 20:00:14 UTC, Walter Bright wrote:
> On 8/4/2016 12:03 PM, Fool wrote:
>> How can we ensure that toFloat(toFloat(x) + toFloat(y)) does not involve
>> double-rounding?
>
> It's the whole point of it.

I'm afraid I don't understand your implementation. Isn't toFloat(x) + toFloat(y) computed in real precision (first rounding)? Why doesn't toFloat(toFloat(x) + toFloat(y)) involve another rounding?
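To make the concern concrete, here is the kind of double rounding I have in mind, with example values I picked myself (shown as double going through real, assuming real is the 80-bit x87 type; whether a given compiler and target actually evaluate the first sum at real precision is exactly the question):

```
import std.stdio : writefln;

void main()
{
    // Exact sum is 1 + 2^-53 + 2^-71.
    double x = 1.0;
    double y = 0x1.00004p-53;          // 2^-53 + 2^-71, exactly representable

    // One rounding, straight to double: the sum lies just above the midpoint
    // between 1 and 1 + 2^-52, so it rounds up to 0x1.0000000000001p+0.
    double once = x + y;               // if the addition really happens in double

    // Two roundings: first to real (64-bit significand), where the 2^-71
    // remainder is below half an ulp and is dropped, leaving exactly the
    // midpoint 1 + 2^-53; then to double, where the tie rounds to even, i.e. 1.0.
    real   wide  = cast(real) x + y;
    double twice = cast(double) wide;  // 0x1p+0

    writefln("once:  %a", once);       // may already equal `twice` on x87 targets
    writefln("twice: %a", twice);
}
```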
August 04, 2016 (Walter Bright)
On 8/4/2016 1:03 PM, deadalnix wrote:
> It is actually very common for C compiler to work with double for intermediate
> values, which isn't far from what D does.

In fact, it used to be specified that C behave that way!
August 04, 2016 (Ilya Yaroshenko)
IEEE behaviour by default is required by numerical software. @fastmath (as in recent LDC) or something like that can be used to allow extended precision.
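For example, with recent LDC (a sketch assuming the @fastmath UDA from ldc.attributes; other compilers would need their own switch or attribute):

```
version (LDC)
{
    import ldc.attributes : fastmath;

    // Relaxed IEEE semantics for this one function only: the optimizer may
    // reassociate, fuse, or use extended precision here, while the rest of
    // the program keeps strict IEEE behaviour by default.
    @fastmath
    double dot(const double[] a, const double[] b)
    {
        assert(a.length == b.length);
        double s = 0;
        foreach (i; 0 .. a.length)
            s += a[i] * b[i];
        return s;
    }
}
```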

Ilya