August 04, 2016
On 8/4/2016 1:29 PM, Fool wrote:
> I'm afraid, I don't understand your implementation. Isn't toFloat(x) +
> toFloat(y) computed in real precision (first rounding)? Why doesn't
> toFloat(toFloat(x) + toFloat(y)) involve another rounding?

You're right, in that case, it does. But C does, too:

http://www.exploringbinary.com/double-rounding-errors-in-floating-point-conversions/

This is important to remember when advocating for "C-like" floating point - because C simply does not behave as most programmers seem to assume it does.
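
As a quick illustration (my own sketch, not from the linked article - it shows the same double-rounding effect with an x87 extended intermediate rather than a decimal conversion):

#include <stdio.h>

int main(void) {
  volatile double a = 1.0;
  volatile double b = 0x1.0000008p-53;  /* 2^-53 + 2^-78, exact as a double */
  volatile double s = a + b;

  /* Rounded once to double (e.g. x86-64 SSE): s == 1 + 2^-52.
     Rounded to x87 extended first, then to double on the store
     (e.g. gcc -m32 -mfpmath=387): the tie rounds to even and s == 1.0. */
  printf("s %s 1.0\n", s == 1.0 ? "==" : "!=");
  return 0;
}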

What toFloat() does is guarantee that its argument is rounded to float.

The best way to approach this when designing fp algorithms is to not require them to have reduced precision.

It's also important to realize that on some machines, the hardware does not actually support float precision operations, or may do so at a large runtime penalty (x87).
August 04, 2016
On 4 August 2016 at 01:00, Seb via Digitalmars-d <digitalmars-d@puremagic.com> wrote:
>
> Consider the following program, it fails on 32-bit :/
>

It would be nice if explicit casts were honoured by CTFE here.
toDouble(a + b) just seems to be avoiding the question of why CTFE ignores
the cast in cast(double)(a + b).

> To make matters worse, std.math yields different results than compiler/assembly intrinsics - note that in this example importing std.math.pow adds about 1K instructions to the generated assembly, whereas llvm_powf boils down to the powf assembly routine. The performance of powf is also a lot better; I measured [3] that e.g. std.math.pow takes ~1.5x as long for both LDC and DMD. If you need to run this very often, that cost isn't acceptable.
>

This could be something specific to your architecture.  I get the same result from all versions of powf, and from GCC builtins too, regardless of optimization tunings.
August 04, 2016
On 8/4/2016 2:13 PM, Iain Buclaw via Digitalmars-d wrote:
> This could be something specific to your architecture.  I get the same
> result from all versions of powf, and from GCC builtins too,
> regardless of optimization tunings.


It's important to remember that what gcc does and what the C standard allows are not necessarily the same - even if the former is standard-compliant. C allows for a lot of implementation-defined FP behavior.
August 04, 2016
On Thursday, 4 August 2016 at 21:13:23 UTC, Iain Buclaw wrote:
> On 4 August 2016 at 01:00, Seb via Digitalmars-d <digitalmars-d@puremagic.com> wrote:
>> To make matters worse, std.math yields different results than compiler/assembly intrinsics - note that in this example importing std.math.pow adds about 1K instructions to the generated assembly, whereas llvm_powf boils down to the powf assembly routine. The performance of powf is also a lot better; I measured [3] that e.g. std.math.pow takes ~1.5x as long for both LDC and DMD. If you need to run this very often, that cost isn't acceptable.
>>
>
> This could be something specific to your architecture.  I get the same result from all versions of powf, and from GCC builtins too, regardless of optimization tunings.

I can reproduce this on DPaste (also x86_64).

https://dpaste.dzfl.pl/c0ab5131b49d

Behavior with a recent LDC build is similar (as annotated with the comments).
August 05, 2016
On Thursday, 4 August 2016 at 20:58:57 UTC, Walter Bright wrote:
> On 8/4/2016 1:29 PM, Fool wrote:
>> I'm afraid, I don't understand your implementation. Isn't toFloat(x) +
>> toFloat(y) computed in real precision (first rounding)? Why doesn't
>> toFloat(toFloat(x) + toFloat(y)) involve another rounding?
>
> You're right, in that case, it does. But C does, too:
>
> http://www.exploringbinary.com/double-rounding-errors-in-floating-point-conversions/

Yes. It seems, however, as if Rick Regan is not advocating this behavior.


> This is important to remember when advocating for "C-like" floating point - because C simply does not behave as most programmers seem to assume it does.

That's right. "C-like" might be what they say, but what they want is for double-precision computations to be carried out in double precision.


> What toFloat() does is guarantee that its argument is rounded to float.
>
> The best way to approach this when designing fp algorithms is to not require them to have reduced precision.

I understand your point of view. However, there are (probably rare) situations where one requires more control. I think that simulating double-double precision arithmetic using the Veltkamp split was mentioned earlier as a reasonable example.
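
For reference, a minimal sketch of that splitting step in C (the constant assumes IEEE double); the point is that it is only correct if every operation is rounded to double exactly once, which is precisely what excess intermediate precision breaks:

/* Veltkamp split: decompose a into hi + lo with at most 26 significant
   bits each, so that products of the halves are exact in double.
   Requires each operation to be rounded once to IEEE double; keeping
   intermediates at higher precision can silently break the invariant. */
void veltkamp_split(double a, double *hi, double *lo) {
  const double c = 134217729.0;  /* 2^27 + 1 */
  double t = c * a;
  *hi = t - (t - a);
  *lo = a - *hi;
}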


> It's also important to realize that on some machines, the hardware does not actually support float precision operations, or may do so at a large runtime penalty (x87).

That's another story.
August 04, 2016
On 8/4/2016 11:05 PM, Fool wrote:
> I understand your point of view. However, there are (probably rare) situations
> where one requires more control. I think that simulating double-double precision
> arithmetic using Veltkamp split was mentioned as a reasonable example, earlier.

There are cases where doing things at higher precision results in double rounding and a less accurate result. But I am pretty sure there are far fewer of those cases compared to routine computations that get a more accurate result with more precision.

If that wasn't true, we wouldn't ever need double precision.
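
A throwaway C example of the routine case (numbers picked arbitrarily, and assuming the float arithmetic really is carried out in single precision, e.g. with SSE):

#include <stdio.h>

int main(void) {
  float  fsum = 0.0f;
  double dsum = 0.0;
  for (int i = 0; i < 10000000; ++i) {
    fsum += 0.1f;  /* single-precision accumulator */
    dsum += 0.1f;  /* same terms, double-precision accumulator */
  }
  /* The float accumulator drifts well away from the expected ~1000000;
     the double accumulator stays very close to it. */
  printf("float:  %f\ndouble: %f\n", fsum, dsum);
  return 0;
}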
August 05, 2016
On Friday, 5 August 2016 at 06:59:21 UTC, Walter Bright wrote:
> On 8/4/2016 11:05 PM, Fool wrote:
>> I understand your point of view. However, there are (probably rare) situations
>> where one requires more control. I think that simulating double-double precision
>> arithmetic using Veltkamp split was mentioned as a reasonable example, earlier.
>
> There are cases where doing things at higher precision results in double rounding and a less accurate result. But I am pretty sure there are far fewer of those cases compared to routine computations that get a more accurate result with more precision.
>
> If that wasn't true, we wouldn't ever need double precision.

You are wrong that there are far fewer of those cases; that is a naive point of view. A lot of netlib math functions require exact IEEE arithmetic. Tinflex requires it. Python's C backend and the Mir library require exact IEEE arithmetic. The Atmosphere package requires it; Atmosphere is used as reference code for my publication in JMS (Springer). And the most important case: no top scientific laboratory will use a language without exact IEEE arithmetic by default.
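
To be concrete (my own example, not code from any of the libraries above): error-free transformations such as Knuth's TwoSum, which this kind of numeric code is built on, only work if every operation is rounded once to the working precision:

/* TwoSum: s = fl(a + b) and err such that a + b == s + err exactly,
   provided each operation is rounded once to IEEE double.
   Double rounding through a wider intermediate format can make err wrong. */
void two_sum(double a, double b, double *s, double *err) {
  *s = a + b;
  double bb = *s - a;
  *err = (a - (*s - bb)) + (b - bb);
}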
August 05, 2016
On Friday, 5 August 2016 at 07:43:19 UTC, Ilya Yaroshenko wrote:
> On Friday, 5 August 2016 at 06:59:21 UTC, Walter Bright wrote:
>> On 8/4/2016 11:05 PM, Fool wrote:
>>> I understand your point of view. However, there are (probably rare) situations
>>> where one requires more control. I think that simulating double-double precision
>>> arithmetic using Veltkamp split was mentioned as a reasonable example, earlier.
>>
>> There are cases where doing things at higher precision results in double rounding and a less accurate result. But I am pretty sure there are far fewer of those cases compared to routine computations that get a more accurate result with more precision.
>>
>> If that wasn't true, we wouldn't ever need double precision.
>
> You are wrong that there are far fewer of those cases; that is a naive point of view. A lot of netlib math functions require exact IEEE arithmetic. Tinflex requires it. Python's C backend and the Mir library require exact IEEE arithmetic. The Atmosphere package requires it; Atmosphere is used as reference code for my publication in JMS (Springer). And the most important case: no top scientific laboratory will use a language without exact IEEE arithmetic by default.

Most C compilers always promote float to double, so I'm not sure what point you are trying to make here.

August 05, 2016
On Friday, 5 August 2016 at 07:59:15 UTC, deadalnix wrote:
> On Friday, 5 August 2016 at 07:43:19 UTC, Ilya Yaroshenko wrote:
>> You are wrong that there are far fewer of those cases; that is a naive point of view. A lot of netlib math functions require exact IEEE arithmetic. Tinflex requires it. Python's C backend and the Mir library require exact IEEE arithmetic. The Atmosphere package requires it; Atmosphere is used as reference code for my publication in JMS (Springer). And the most important case: no top scientific laboratory will use a language without exact IEEE arithmetic by default.
>
> Most C compilers always promote float to double, so I'm not sure what point you are trying to make here.

1. Could you please provide an assembler example with clang or recent gcc?
2. C compilers do not promote double to 80-bit reals anyway.
August 05, 2016
On Friday, 5 August 2016 at 08:17:00 UTC, Ilya Yaroshenko wrote:
> 1. Could you please provide an assembler example with clang or recent gcc?

I have something better: compile your favorite project with -Wdouble-promotion and enjoy the rain of warnings.

But try it yourself:

float foo(float a, float b) {
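  // 3.0 is a double constant, so the multiplication and division are done in double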
  return 3.0 * a / b;
}

GCC 5.3 gives me

foo(float, float):
        cvtss2sd        xmm0, xmm0
        cvtss2sd        xmm1, xmm1
        mulsd   xmm0, QWORD PTR .LC0[rip]
        divsd   xmm0, xmm1
        cvtsd2ss        xmm0, xmm0
        ret
.LC0:
        .long   0
        .long   1074266112

Which clearly uses double precision.

And clang 3.8:

LCPI0_0:
        .quad   4613937818241073152     # double 3
foo(float, float):                               # @foo(float, float)
        cvtss2sd        xmm0, xmm0
        mulsd   xmm0, qword ptr [rip + .LCPI0_0]
        cvtss2sd        xmm1, xmm1
        divsd   xmm0, xmm1
        cvtsd2ss        xmm0, xmm0
        ret

which uses double as well.

> 2. C compilers do not promote double to 80-bit reals anyway.

VC++ does it on 32-bit builds, but initializes the x87 unit to double precision (on 80-bit floats; yes, that's an x87 setting).

VC++ will keep using float for x64 builds.
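
For the record, a hedged sketch of that x87 precision-control setting, using the MSVC-specific _controlfp_s from <float.h> (32-bit x86 only):

#include <float.h>

/* Round x87 significands to 53 bits (double precision control), as the
   32-bit VC++ runtime sets up at startup; the registers are still 80 bits
   wide and keep the extended exponent range. */
void set_x87_double_precision(void) {
  unsigned int current;
  _controlfp_s(&current, _PC_53, _MCW_PC);
}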

The Intel compiler uses compiler flags to decide whether or not to promote.

In case you were wondering, this is not limited to x86/64, as GCC gives me on ARM:

foo(float, float):
        fmov    d2, 3.0e+0
        fcvt    d0, s0
        fmul    d0, d0, d2
        fcvt    d1, s1
        fdiv    d0, d0, d1
        fcvt    s0, d0
        ret

Which also promotes to double.