May 17, 2016
On Tuesday, 17 May 2016 at 02:00:24 UTC, Manu wrote:
> If Ethan and Remedy want to expand their use of D, the compiler CAN
> NOT emit x87 code. It's just a matter of time before a loop is in a
> hot path.

I really need to see what the codegen for the latest DMD looks like; I have no idea what the current state is. But this isn't just true for Remedy, it's true for any engine programmer at any other company thinking of using D.

In the context of this entire discussion though, a compiler switch to give me the codegen I want is my preference. I have no desire to dictate how other people should use D/DMD; I just want the option to use it the way I need to.
May 17, 2016
On Monday, 16 May 2016 at 15:11:21 UTC, Daniel Murphy wrote:
> On 16/05/2016 10:37 PM, Walter Bright wrote:
>> Some counter points:
>>
>> 1. Go uses 256 bit soft float for constant folding.
>>
>
> Then we should use 257 bit soft float!

Go uses at least 288 bits!!!!! (a 256-bit mantissa and a 32-bit exponent).

But Go takes a completely different approach to constant expressions, and a much better one than that of the C family D belongs to (though still far from ideal). From https://golang.org/ref/spec:

«
Numeric constants represent exact values of arbitrary precision and do not overflow. Consequently, there are no constants denoting the IEEE-754 negative zero, infinity, and not-a-number values.

Constants may be typed or untyped. Literal constants, true, false, iota, and certain constant expressions containing only untyped constant operands are untyped.

A constant may be given a type explicitly by a constant declaration or conversion, or implicitly when used in a variable declaration or an assignment or as an operand in an expression. It is an error if the constant value cannot be represented as a value of the respective type. For instance, 3.0 can be given any integer or any floating-point type, while 2147483648.0 (equal to 1<<31) can be given the types float32, float64, or uint32 but not int32 or string.

An untyped constant has a default type which is the type to which the constant is implicitly converted in contexts where a typed value is required, for instance, in a short variable declaration such as i := 0 where there is no explicit type. The default type of an untyped constant is bool, rune, int, float64, complex128 or string respectively, depending on whether it is a boolean, rune, integer, floating-point, complex, or string constant.

Implementation restriction: Although numeric constants have arbitrary precision in the language, a compiler may implement them using an internal representation with limited precision. That said, every implementation must:

- Represent integer constants with at least 256 bits.
- Represent floating-point constants, including the parts of a complex constant, with a mantissa of at least 256 bits and a signed exponent of at least 32 bits.
- Give an error if unable to represent an integer constant precisely.
- Give an error if unable to represent a floating-point or complex constant due to overflow.
- Round to the nearest representable constant if unable to represent a floating-point or complex constant due to limits on precision.
These requirements apply both to literal constants and to the result of evaluating constant expressions.
»

See also https://golang.org/ref/spec#Constant_expressions

May 17, 2016
On Tuesday, 17 May 2016 at 07:47:58 UTC, Ethan Watson wrote:
>
> Unless you're doing game/graphics work ;-) 4x3 or 4x4 matrices are commonly used to represent transforms in 3D space in every 3D polygon-based rendering pipeline I know of. It's even a requirement for fixed-function OpenGL 1.x.
>
> Video games - also known around here as "The Exception To The Rule".
>
> (Side note: My own preference is to represent transforms as a quaternion and vector. Inverting such a transform is a simple matter of negating a few components. Generating a matrix from such a transform for rendering purposes is trivial compared to matrix inversion.)

I don't know much about computer graphics, but if you're solving equations, then you can use the techniques mentioned above.

Nevertheless, I'm not really sure what the fastest approach to inverting small matrices would be. I would definitely try the LU or Cholesky approaches. It might be that for a small matrix a Gaussian elimination approach would be fast. There are some analytic tricks you could use if you have some information about the matrices, like when you can represent them as blocks. If some of the blocks are zero or identity matrices, that simplifies the calculations too.
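
For what it's worth, the quaternion-and-vector inverse Ethan mentions might look roughly like this. This is a minimal sketch with invented names (Vec3, Quat, Transform), not code from any actual engine:

struct Vec3 { float x, y, z; }
struct Quat { float x, y, z, w; }   // unit quaternion, w = scalar part

Quat conjugate(Quat q) { return Quat(-q.x, -q.y, -q.z, q.w); }

// Rotate v by unit quaternion q: v' = v + w*t + cross(q.xyz, t),
// where t = 2 * cross(q.xyz, v).
Vec3 rotate(Quat q, Vec3 v)
{
    float tx = 2 * (q.y * v.z - q.z * v.y);
    float ty = 2 * (q.z * v.x - q.x * v.z);
    float tz = 2 * (q.x * v.y - q.y * v.x);
    return Vec3(v.x + q.w * tx + (q.y * tz - q.z * ty),
                v.y + q.w * ty + (q.z * tx - q.x * tz),
                v.z + q.w * tz + (q.x * ty - q.y * tx));
}

// A rigid transform applying p -> rotate(rot, p) + pos.
struct Transform { Quat rot; Vec3 pos; }

// Inverse: p -> rotate(conj(rot), p - pos). Just a conjugate (three
// negations) and one vector rotation, versus a full matrix inverse.
Transform inverse(Transform t)
{
    Quat qi = conjugate(t.rot);
    Vec3 p = rotate(qi, t.pos);
    return Transform(qi, Vec3(-p.x, -p.y, -p.z));
}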
May 17, 2016
If you try to make compile-time FP math behave exactly like run-time FP math, you'd not only have to use the same precision in the compiler, but also the same rounding mode, denormal handling, etc., which can be changed at run time (http://dlang.org/phobos/core_stdc_fenv.html), so the exact same piece of code can yield different results depending on the run-time FP context.
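
A small sketch of that point using core.stdc.fenv (and assuming the compiler doesn't constant-fold the divisions away): the same expression yields two different results under two run-time rounding modes, so a constant folder can match at most one of them.

import core.stdc.fenv;
import std.stdio;

void main()
{
    float x = 1.0f, y = 3.0f;   // run-time values, to discourage folding

    fesetround(FE_DOWNWARD);
    float lo = x / y;           // rounded toward -infinity

    fesetround(FE_UPWARD);
    float hi = x / y;           // rounded toward +infinity: one ulp above lo,
                                // since 1/3 is inexact in float

    fesetround(FE_TONEAREST);   // restore the default mode
    writefln("%a vs %a", lo, hi);
}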

May 17, 2016
On Monday, 16 May 2016 at 12:37:58 UTC, Walter Bright wrote:
>
> 7. 80 bit reals are there and they work. The support is mature, and is rarely worked on, i.e. it does not consume resources.
>
This may not be true for much longer: both Intel and AMD are slowly phasing out the x87 FPU.  I think Intel already announced a server chip that omits it entirely, though I can't find the corroborating link.

-Wyatt
May 17, 2016
On Monday, 16 May 2016 at 19:38:10 UTC, Joakim wrote:
> Regarding floating-point, I'll go farther than you and say that if an algorithm depends on lower-precision floating-point to be accurate, it's a bad algorithm.

No. In system level programming a good algorithm is an effective and efficient algorithm on the hardware at hand.

A good system-level programming language gives you control at the hardware level, including the floating-point unit, in the language itself.

There are lots of algorithms that will break if you randomly switch precision of different expressions.

Heck, nearly all of the SIMD optimizations on differentials that I use will break if one lane is computed with a different precision than the other lanes. I don't give a rat's ass about increased precision. I WANT THE DAMN LANES TO BE IN THE SAME PHASE (or close to it)!! Phase-locking is much more important than value accuracy.

Or to put it differently: It does not matter if all clocks are too slow, as long as they are running at the same speed. It is a lot worse if some clocks are too slow and others are too fast. That would lead to some serious noise in a time series.
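
As an invented illustration of the lane-phase point (a plain array standing in for a SIMD register):

void main()
{
    float[4] lanes = 0.0f;          // four "lanes" of a SIMD register
    enum float dt = 1.0f / 60.0f;   // per-frame step, not exactly representable

    foreach (frame; 0 .. 10_000)
        foreach (ref lane; lanes)
            lane += dt;             // must round identically in every lane

    // Same inputs, same precision, same rounding => bit-identical lanes.
    // If one lane were silently computed at higher precision, it would
    // drift out of phase with the others, "more accurate" or not.
    assert(lanes[0] == lanes[1] && lanes[1] == lanes[2] && lanes[2] == lanes[3]);
}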

Of course, no hardware vendor is crazy enough to randomly switch the precision of their ALUs. Hardware vendors do understand that this _WILL_ lead to disaster. Sadly, many in the D community don't. Presumably because they don't actually try to write performant floating-point code; they are not system-level programmers.

(Btw, many error correcting strategies break too if you randomly switch precision.)

> Now, people can always make mistakes in their implementation and unwittingly depend on lower precision somehow, but that _should_ fail.

People WILL make mistakes, but if you cannot control precision then you cannot:

1. create a reference implementation to compare with

2. unit test floating point code in a reliable way

3. test for convergence/divergence in feedback loops (which can have _disastrous_ results and could literally ruin your speakers/hearing in the case of audio).
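
To make point 2 concrete, a hypothetical sketch (recipSq and the bound are invented for illustration): validate a float routine against a double-precision reference with a fixed error bound. The bound is only trustworthy if the float version really runs at float precision; silently widened intermediates would hide errors that reappear on other targets.

import std.math : abs;

float recipSq(float x) { return 1.0f / (x * x); } // stand-in for a real kernel

unittest
{
    foreach (i; 1 .. 1000)
    {
        float x = i * 0.25f;
        double reference = 1.0 / (cast(double) x * x);
        // Two float roundings => relative error well under 1e-6.
        assert(abs(recipSq(x) - reference) < 1e-6 * reference);
    }
}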

> None of this is controversial to me: you shouldn't be comparing floating-point numbers with anything other than approxEqual,

I don't agree.

1. Comparing typed constants for equality should be unproblematic. In D that is broken.

2. Testing for zero is a necessity when doing division.

3. Comparing for equality is the same as subtraction followed by testing for zero.

So, the rule is: you shouldn't compare at all unless you know the error bounds, but that depends on WHY you are comparing.

However, with constants/sentinels and some methods you do know the bounds... Also, for some inputs you do know that the algorithm WILL fail for certain values at a _GIVEN_ precision. Testing for equality against those values makes a lot of sense, until some a**hole decides to randomly "improve" precision where it was typed to something specific and known.

Take this:

f(x) = 1/(2-x)

Should I not be able to test for the exact value "2" here? I don't see why "1.3" typed to a given precision should be any different. You want to force me into a test that is more than 3x as expensive, just to satisfy some useless FP semantics that provide no real-world benefit whatsoever?
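
For instance, a sketch of guarding that pole with an exact comparison (the choice of returning infinity is just for illustration):

float f(float x)
{
    // Exact test against the singular value. This is only sound if `x`
    // still holds exactly the float it was assigned, i.e. the compiler
    // has not silently widened or re-rounded it in the meantime.
    if (x == 2.0f)
        return float.infinity;
    return 1.0f / (2.0f - x);
}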


> increasing precision should never bother your algorithm, and a higher-precision, common soft-float for CTFE will help cross-compiling and you'll never notice the speed hit.

Randomly increasing precision is never a good idea. Yes, having different precision in different code paths can ruin the quality of both rendering and data analysis, and it can break algorithms.

Having an infinite-precision untyped real that may downgrade to, say, a 64-bit mantissa is acceptable. Or, in the case of Go, a 256-bit mantissa. That's a different story.

Having single precision floats that randomly are turned into arbitrary precision floats is not acceptable. Not at all.

May 17, 2016
On 16.05.2016 22:10, Andrei Alexandrescu wrote:
>
>> I bring
>> up our very own Phobos sum algorithm, which was re-implemented later
>> with the Kahan method to reduce precision loss.
>
> Kahan is clear, ingenious, and understandable and a great part of the
> stdlib. I don't see what the point is here. Naive approaches aren't
> going to take anyone far, regardless of precision.

Kahan summation is not guaranteed to work in D.
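
For reference, a textbook Kahan sketch (not Phobos's actual implementation). The compensation term `(t - sum) - y` is algebraically zero; it only recovers the lost low-order bits if every intermediate is rounded to the same precision as `sum`. A constant folder or optimizer that widens the intermediates to 80-bit real can make the correction vanish, which is the sense in which it is not guaranteed to work:

double kahanSum(const double[] xs)
{
    double sum = 0.0;
    double c = 0.0;                // running compensation
    foreach (x; xs)
    {
        double y = x - c;          // apply the previous correction
        double t = sum + y;        // low-order bits of y can be lost here...
        c = (t - sum) - y;         // ...and are recovered here
        sum = t;
    }
    return sum;
}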
May 17, 2016
On Tuesday, 17 May 2016 at 14:05:54 UTC, Matthias Bentrup wrote:
> If you try to make compile-time FP math behave exactly like run-time FP math, you'd not only have to use the same precision in the compiler, but also the same rounding mode, denormal handling etc., which can be changed at run time (http://dlang.org/phobos/core_stdc_fenv.html), so the exact same piece of code can yield different results based on the run-time FP-context.

Yes, heeding the rounding mode is an absolute requirement. If the system doesn't heed the rounding mode, then it will change the semantics and break the program when computing upper vs lower bounds.

I've never liked the stateful approach to rounding. If I were to design a language, I'd probably make it part of the type or the operator, or something along those lines.
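
A sketch of the upper/lower-bound pattern I mean (bracketSum is an invented name, and this assumes the compiler neither constant-folds nor reorders the additions across the fesetround calls):

import core.stdc.fenv;

// Bracket the true sum of xs between a lower and an upper bound.
double[2] bracketSum(const double[] xs)
{
    immutable saved = fegetround();

    fesetround(FE_DOWNWARD);
    double lo = 0.0;
    foreach (x; xs) lo += x;    // every addition rounds toward -infinity

    fesetround(FE_UPWARD);
    double hi = 0.0;
    foreach (x; xs) hi += x;    // every addition rounds toward +infinity

    fesetround(saved);
    return [lo, hi];            // the exact sum lies within [lo, hi]
}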


May 17, 2016
On 16.05.2016 06:26, Walter Bright wrote:
>>
>> Incidentally, I made the mistake of mentioning this thread (due to my
>> astonishment that CTFE ignores float types)
>
> Float types are not selected because they are less accurate,

(AFAIK, accuracy is a property of a value given some additional context. Types have precision.)

> they are selected because they are smaller and faster.

Right. Hence, the 80-bit CTFE results have to be converted to the final precision at some point in order to commence the runtime computation. This means that additional rounding happens, which was not present in the original program. The additional total roundoff error this introduces can exceed the roundoff error you would have suffered by using the lower precision in the first place, sometimes completely defeating precision-enhancing improvements to an algorithm.

This might be counter-intuitive, but this is floating point. The precision should just stay the specified one during the entire computation (even if part of it is evaluated at compile-time).

The claim here is /not/ that lower precision throughout delivers more accurate results. The additional rounding is the problem.
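
A concrete instance of that extra rounding, under the assumption that the constant folder computes at real (or double) precision while the run-time code rounds to float after each addition:

import std.stdio;

void main()
{
    enum float a = 1.0f, b = 0x1p-24f, c = 0x1p-50f;

    // Step by step at float precision: 1 + 2^-24 is an exact halfway
    // case and ties-to-even rounds it back down to 1.0f, after which
    // adding 2^-50 changes nothing.
    float stepwise = a;
    stepwise += b;
    stepwise += c;   // stepwise == 1.0f

    // Folded wide and rounded once: 1 + 2^-24 + 2^-50 is *above* the
    // halfway point, so the single final rounding goes up to 1 + 2^-23.
    enum float folded = a + b + c;

    writefln("%a vs %a", stepwise, folded);
}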


There are other reasons why I think that this kind of implementation-defined behaviour is a terribly bad idea, eg.:

- it breaks common assumptions about code, especially how it behaves under seemingly innocuous refactorings, or with a different set of compiler flags.

- it breaks reproducibility, which is sometimes more important than being close to the infinite-precision result (which you cannot guarantee with any finite floating-point type anyway).
  (E.g. in a game, it is enough if the result seems plausible, but it should be the same for everyone. For some scientific experiments, the ideal case is to have 100% reproducibility of the computation, even if it is horribly wrong, such that other scientists can easily uncover and diagnose the problem, for example.)
May 17, 2016
On Monday, 16 May 2016 at 19:01:19 UTC, Timon Gehr wrote:

>> You are not even guaranteed to get the same result on two different x86
>> implementations.
>
> Without reading the x86 specification, I think it is safe to claim that you actually are guaranteed to get the same result.

Correct. Sorry for the noise.

>
>> AMD64:
>>
>> "The processor produces a floating-point result defined by the IEEE
>> standard to be infinitely precise.
>> This result may not be representable exactly in the destination format,
>> because only a subset of the
>> continuum of real numbers finds exact representation in any particular
>> floating-point format."
>>
>
> This just says that results of computations will need to be rounded to fit into constant-size storage.

Right.