May 16, 2016
On Monday, 16 May 2016 at 08:47:03 UTC, Iain Buclaw wrote:
> But you *didn't* request coercion to 32 bit floats.  Otherwise you would have used 1.30f.

	const float f = 1.3f;
	float c = f;
	assert(c*1.0 == f*1.0); // Fails! SHUTDOWN!


May 16, 2016
Am Mon, 16 May 2016 00:52:41 -0700
schrieb Walter Bright <newshound2@digitalmars.com>:

> On 5/15/2016 11:14 PM, Ola Fosheim Grøstad wrote:
> > But hey, here is another one:
> >
> > const real x = f();
> > assert(0<=x && x<1);
> > x += 1;
> >
> > const float f32 = cast(float)(x);
> > const real residue = x - cast(real)f32; // ZERO!!!!
> >
> > output(dither(f32, residue)); // DITHERING IS FUBAR!!!
> 
> Why would anyone dither based on the difference in precision of floats and doubles? Dithering is usually done based on different pixel densities.

To be fair, several image editors have various dithering
algorithms that seek to reduce the apparent banding when
converting an image from a higher to a lower bit depth.
3dfx also performed dithering in hardware when going from
internal 24-bit RGB calculations to 16-bit output.
Ola's example could be some X-ray imaging format. Although
it is still a vague example, this "cast doesn't do anything
to floating point constants" semantic surprised me too, I
have to say. I don't want to dive too deep into the matter
though.

-- 
Marco

May 16, 2016
On Monday, 16 May 2016 at 09:00:49 UTC, Marco Leise wrote:
> Ola's example could be some X-ray imaging format. Although
> it is still a vague example,

I don't think quantization is a particularly vague example! I think that is rather common.

Doing something like floor(x*(1<<24))*(1.0/(1<<24)) to quantize to 24 bit is just silly and inefficient.

Error diffusion is useful in scenarios where you want to reduce accumulation error or avoid banding. It can be used in audio and in many other settings.

Anyway, irregular higher precision is almost always worse than regular lower precision when computing time series, comparing results, or doing unit testing.

May 16, 2016
On 16 May 2016 at 10:52, Ola Fosheim Grøstad via Digitalmars-d <digitalmars-d@puremagic.com> wrote:
> On Monday, 16 May 2016 at 08:47:03 UTC, Iain Buclaw wrote:
>>
>> But you *didn't* request coercion to 32 bit floats.  Otherwise you would have used 1.30f.
>
>
>         const float f = 1.3f;
>         float c = f;
>         assert(c*1.0 == f*1.0); // Fails! SHUTDOWN!
>
>

You're still using doubles.  Are you intentionally missing the point?

May 16, 2016
On 5/16/2016 1:43 AM, Timon Gehr wrote:
> My examples were not contrived.

I don't see where you posted examples in this thread. You did link to two wikipedia articles, which seemed to be more conjecture than actual examples.


> Manu's even less so.

Manu has said he's had trouble with it. I believe him. But I haven't seen his code to see just what he was doing.


> What you call "illegitimate" are really just legitimate examples that you
> dismiss because they do not match your own specific experience.

Of course, legitimate is a matter of opinion. Can code be written to rely on lower precision? Of course. Is it portable to any standard conforming C/C++ compiler? Nope. Can algorithms be coded to not require reduced precision? I'm pretty sure they can be, and should be.

The links I've posted here make it clear that there is no "obvious" right way for compilers to implement FP. If you're going to write sensitive FP algorithms, it only makes sense to take the arbitrariness of FP implementations into account.

My proposal for D means less is implementation-defined than in C/C++, the semantics fall within the "implementation defined" latitude of the C/C++ standards, and the implementation is completely IEEE compliant.

I don't believe it is a service to programmers to have the panoply of FP behavior switches sported by typical compilers that apparently very few people understand. Worse, writing code that is invisibly, subtly, and critically dependent on specific compiler switches and versions is not something I can recommend as a best practice.


> I think that's a little disrespectful.

Perhaps, but disrespecting code is not an ad hominem argument.

May 16, 2016
On Monday, 16 May 2016 at 08:52:16 UTC, Ola Fosheim Grøstad wrote:
> On Monday, 16 May 2016 at 08:47:03 UTC, Iain Buclaw wrote:
>> But you *didn't* request coercion to 32 bit floats.  Otherwise you would have used 1.30f.
>
> 	const float f = 1.3f;
> 	float c = f;
> 	assert(c*1.0 == f*1.0); // Fails! SHUTDOWN!

IIRC, there are circumstances where you can get an equality failure even for non-compile-time calculations that superficially look identical: e.g. if one of the sides of the comparison is still stored in one of the 80-bit registers used for calculation.  I ran into this when doing some work in std.complex ... :-(

Reworking Ola's example:

    import std.stdio : writefln;

    void main()
    {
        const float constVar = 1.3f;
        float var = constVar;

        writefln("%.64g", constVar);
        writefln("%.64g", var);
        writefln("%.64g", constVar * 1.0);
        writefln("%.64g", var * 1.0);
        writefln("%.64g", 1.30 * 1.0);
        writefln("%.64g", 1.30f * 1.0);
    }

... produces:

1.2999999523162841796875
1.2999999523162841796875
1.3000000000000000444089209850062616169452667236328125
1.2999999523162841796875
1.3000000000000000444089209850062616169452667236328125
1.3000000000000000444089209850062616169452667236328125

... which is unintuitive, to say the least; the `f` suffix on the manifest constant being assigned to `const float constVar` is completely ignored for the purposes of the multiplication by 1.0.

In other words, the specified precision of floating-point values is being ignored for calculations performed at compile-time, regardless of how much you "qualify" what you want.

May 16, 2016
On Monday, 16 May 2016 at 09:54:51 UTC, Iain Buclaw wrote:
> On 16 May 2016 at 10:52, Ola Fosheim Grøstad via Digitalmars-d <digitalmars-d@puremagic.com> wrote:
>> On Monday, 16 May 2016 at 08:47:03 UTC, Iain Buclaw wrote:
>>>
>>> But you *didn't* request coercion to 32 bit floats.  Otherwise you would have used 1.30f.
>>
>>
>>         const float f = 1.3f;
>>         float c = f;
>>         assert(c*1.0 == f*1.0); // Fails! SHUTDOWN!
>>
>>
>
> You're still using doubles.  Are you intentionally missing the point?

I think you're missing what Ola means when he's talking about 32-bit floats.  Of course when you multiply a float by a double (here, 1.0) you promote it to a double; but you'd expect the result to reflect the data available in the 32-bit float.

Whereas in Ola's example, the fact that `const float f` is known at compile-time means that the apparent 32-bit precision is in fact entirely ignored when doing the * 1.0 calculation.

In other words, Ola's expecting

    float * double == float * double

but is getting something more like

    float * double == double * double

or maybe even

    float * double == real * real

... because of the way FP constants are treated at compile time.

May 16, 2016
On Monday, 16 May 2016 at 09:56:07 UTC, Walter Bright wrote:
>> What you call "illegitimate" are really just legitimate examples that you
>> dismiss because they do not match your own specific experience.
>
> Of course, legitimate is a matter of opinion. Can code be written to rely on lower precision? Of course. Is it portable to any standard conforming C/C++ compiler? Nope. Can algorithms be coded to not require reduced precision? I'm pretty sure they can be, and should be.

If I've understood people's arguments right, the point of concern is that there are use cases where the programmer wants to be able to guarantee _a specific precision of their choice_.

That strikes me as a legitimate use-case that it would be worth trying to support.

May 16, 2016
On 5/16/16 12:37 AM, Walter Bright wrote:
> Me, I think of that as "Who cares that you paid $$$ for an 80 bit CPU,
> we're going to give you only 64 bits."

I'm not sure about this. My understanding is that SSE has hardware support for 32- and 64-bit floats, and that the 80-bit hardware is pretty much cut-and-pasted from the x87 days without anyone really looking into improving it. And that's been the case for more than a decade. Is that correct?

I'm looking for example at http://nicolas.limare.net/pro/notes/2014/12/12_arit_speed/ and see that on all Intel and compatible hardware, the speed of 80-bit floating point operations ranges from much slower to disastrously slower.

I think it's time to revisit our attitudes to floating point, which were formed last century in the heyday of x87. My perception is the world has moved to SSE and 32- and 64-bit floats; the "real" type is a distraction for D; the whole "let's do things in 128 bits during compilation" idea is a time waster; and many of the original things we want to do with floating point are a distinction without a difference, and a further waste of our resources.


Andrei

May 16, 2016
On 5/16/16 2:46 AM, Walter Bright wrote:
> I used to do numerics work professionally. Most of the troubles I had
> were catastrophic loss of precision. Accumulated roundoff errors when
> doing numerical integration or matrix inversion are major problems. 80
> bits helps dramatically with that.

Aren't good algorithms helping dramatically with that?

Also, do you have a theory that reconciles your assessment of the importance of 80-bit math with the fact that the computing world is moving away from it? http://stackoverflow.com/questions/3206101/extended-80-bit-double-floating-point-in-x87-not-sse2-we-dont-miss-it


Andrei