May 18, 2016
On Wednesday, 18 May 2016 at 07:56:58 UTC, Ethan Watson wrote:
> Not in the standards, no. But older gaming hardware was never known to be standards-conformant.
>
> As it turns out, the original hardware manuals can be found on the internet.

 Hmmm... I can't help but look at this and think about x86 instructions of old, and how language specs usually don't take full or proper advantage of them: the carry and overflow flags/checks, or the way the CPUs handle arithmetic. I'm referring mostly to multiplication, where multiplying two 16-bit values yields a 32-bit result in DX:AX (the same pattern holds for the 32- and 64-bit instructions), yet the upper half is most likely just outright ignored rather than put to use.

 Heh, I'd love to see more hardware-level abstraction built into the language. Almost like:

 try {}    // Considers the result of 1 line of basic math to be caught by:
 carry     {} //only activates if carry is set
 overflow  {} //if overflowed during some math
 modulus(m){} //get the remainder as m after a division operation
 mult(dx)  {} //get upper 32/64/whatever after a multiply and set as dx

 Of course I'd understand if some hardware doesn't offer such support, so an else clause could be added to let workaround code handle such an event, or the feature could be allowed only on compliant architectures. A software workaround is always possible, just not as fast as hardware support.

 Although I'm not fully familiar with all the results an FPU can produce, or whether such a system would be beneficial to the current discussion on floats. I would prefer not to inject fixed x86 instructions if I can avoid it.
May 18, 2016
On 5/18/2016 1:30 AM, Ethan Watson wrote:
>> You're also asking for a mode where the compiler for one machine is supposed
>> to behave like hand-coded assembler for another machine with a different
>> instruction set.
>
> Actually, I'm asking for something exactly like the arch option for MSVC/-mfmath
> option for GCC/etc, and have it respect that for CTFE.


MSVC doesn't appear to have a switch that does what you ask for:

  https://msdn.microsoft.com/en-us/library/e7s85ffb.aspx

May 18, 2016
On 18 May 2016 at 07:49, Ola Fosheim Grøstad via Digitalmars-d <digitalmars-d@puremagic.com> wrote:
> On Wednesday, 18 May 2016 at 03:01:14 UTC, Joakim wrote:
>>
>> There is nothing "random" about increasing precision till the end, it follows a well-defined rule.
>
>
> Can you please quote that well-defined rule?
>
> It is indeed random, or arbitrary (which is the same thing):
>
> if(x<0){
>   // DMD choose 64 bit mantissa
>   const float y = ...
>   ...
>
> } else {
>   // DMD choose 24 bit mantissa
>   float y = ...
>   ...
> }
>
> How is this not arbitrary?
>

Can you back that up statistically?  Try running this same operation 600 million times, then plot a graph of the result from each run so we can get an idea of just how random or arbitrary it really is.

If you get the same result back each time, maybe it isn't as arbitrary or random as you would have us believe.

May 18, 2016
On Wednesday, 18 May 2016 at 07:21:30 UTC, Joakim wrote:
> On Wednesday, 18 May 2016 at 05:49:16 UTC, Ola Fosheim Grøstad wrote:
>> On Wednesday, 18 May 2016 at 03:01:14 UTC, Joakim wrote:
>>> There is nothing "random" about increasing precision till the end, it follows a well-defined rule.
>>
>> Can you please quote that well-defined rule?
>
> It appears to be "the compiler carries everything internally to 80 bit precision, even if they are typed as some other precision."
> http://forum.dlang.org/post/nh59nt$1097$1@digitalmars.com

"The compiler" means: implementation defined. That is the same as not being well-defined. :-)

>> It is indeed random, or arbitrary (which is the same thing):
>
> No, they're not the same thing: rules can be arbitrarily set yet consistent over time, whereas random usually means both arbitrary and inconsistent over time.

In this case it is the same thing then. I have no guarantee that my unit tests and production code will behave the same.

> I believe that means any calculation used to compute y at compile-time will be done in 80-bit or larger reals, then rounded to a const float for run-time, so your code comments would be wrong.

No. The "const float y" will not be coerced to 32 bit, but the "float y" will be coerced to 32 bit. So you get two different y values. (On a specific compiler, i.e. DMD.)

> I don't understand why you're using const for one block and not the other, seems like a contrived example.  If the precision of such constants matters so much, I'd be careful to use the same const float everywhere.

Now, that is a contrived defense for brittle language semantics! :-)

> If matching such small deltas matters so much, I wouldn't be using floating-point in the first place.

Why not? The hardware gives the same delta. It only goes wrong if the compiler decides to "improve".

>> It depends on the unit tests running with the exact same precision as the production code.
>
> What makes you think they don't?

Because the language says that I cannot rely on it and the compiler implementation proves that to be correct.

>> Fast floating point code depends on the specifics of the hardware. A system level language should not introduce a different kind of bias that isn't present in the hardware!
>
> He's doing this to take advantage of the hardware, not the opposite!

I don't understand what you mean.  He is not taking advantage of the hardware?


>> D is doing it wrong because it is thereby forcing programmers to use algorithms that are 10-100x slower to get reliable results.
>>
>> That is _wrong_.
>
> If programmers want to run their code 10-100x slower to get reliably inaccurate results, that is their problem.

Huh?

What I said is that D is doing it wrong because the "improvements" are forcing me to write code that is 10-100x slower to get the same level of reliability and required accuracy that I would get without the "improvements".


> If you're so convinced it's exact for a few cases, then check exact equality there.  For most calculation, you should be using approxEqual.

I am sorry, but this is not a normative rule at all. The rule is that you check for the bounds required. If the result is exact, that just means the bounds collapse to a single value (i.e. they are tight).

It does not help to say that people should use "approxEqual", because it does not improve correctness. Saying such things just makes non-expert programmers assume that guessing the bounds will be sufficient. Well, it isn't sufficient.


> Since the real error bound is always larger than that, almost any error bound you pick will tend to be closer to the real error bound, or at least usually bigger and therefore more realistic, than checking for exact equality.

I disagree. It is much better to get extremely wrong results frequently and therefore detect the error in testing.

What you are saying is that it is better to get extremely wrong results infrequently, which usually leads to errors passing testing and entering production.

In order to test well you also need to understand what input makes the algorithm unstable/fragile.


> You can still test with approxEqual, so I don't understand why you think that's not testing.

It is not testing anything if the compiler can change the semantics when you use a different context.


> The computer doesn't know that, so it will just plug that x in and keep cranking, till you get nonsense data out the end, if you don't tell it to check that x isn't too close to 2 and not just 2.

Huh? I am not getting nonsense data. I am getting what I am asking for. I only want to avoid dividing by zero, because it will make the given hardware 100x slower than the test.


> You have a wrong mental model that the math formulas are the "real world," and that the computer is mucking it up.

Nothing wrong with my mental model. My mental model is the hardware specification + the specifics of the programming platform. That is the _only_ model that matters.

What D prevents me from getting at is the specifics of the programming platform, by hiding those specifics.


> The truth is that the computer, with its finite maximums and bounded precision, better models _the measurements we make to estimate the real world_ than any math ever written.

I am not estimating anything. I am synthesising artificial worlds. My code is the model; the world is my code running on specific hardware.

It is self-contained. I don't want the compiler to change my model, because that will generate the wrong world. ;-)


>>> Oh, it's real world alright, you should be avoiding more than just 2 in your example above.
>>
>> Which number would that be?
>
> I told you, any numbers too close to 2.

All numbers close to 2 in the same precision will work out ok.


> On the contrary, it is done because 80-bit is faster and more precise, whereas your notion of reliable depends on an incorrect notion that repeated bit-exact results are better.

80 bit is much slower. An 80-bit mul takes 3 micro-ops, a 64-bit one takes 1. Without SIMD, 64 bit is at least twice as fast. With SIMD, multiply-add is maybe 10x faster in 64 bit.


And it is neither more precise nor more accurate when you don't get consistent precision.

In the real world you can get very good performance for the desired accuracy by using unstable algorithms and adding a stage that compensates for the instability. That does not mean it is acceptable to have differences in the bias, as that can accumulate an offset that drives the result away from zero (thus a loss of precision).


> You noted that you don't care that the C++ spec says similar things, so I don't see why you care so much about the D spec now.
>  As for that scenario, nobody has suggested it.

I care about what the C++ spec says. I care about how the platform interprets the spec. I never rely _only_ on the C++ spec for production code.

You have said previously that you know the ARM platform. On Apple CPUs you have 3 different floating point units: 32 bit NEON, 64 bit NEON and 64 bit IEEE.

It supports 1x64-bit IEEE, 2x64-bit NEON and 4x32-bit NEON.

You have to know the language, the compiler and the hardware to make this work out.


>> And so is "float" behaving differently than "const float".
>
> I don't believe it does.

I have proven that it does, and posted it in this thread.

May 18, 2016
On Wednesday, 18 May 2016 at 08:55:03 UTC, Walter Bright wrote:
> MSVC doesn't appear to have a switch that does what you ask for

I'm still not entirely sure what the /fp switch does for x64 builds. The documentation is not clear in the slightest and I haven't been able to find any concrete information. As near as I can tell it has no effect, as the original behaviour was tied to how it handles the x87 control words. But it might also be that the SSE instructions emitted can differ depending on what operation you're trying to do. I have not dug deep to see exactly how the code gen differs. I can guess that /fp:precise was responsible for promoting my float to a double to call CRT functions, but I have not tested that, so it's purely theoretical at the moment.

Of course, while this conversation has mostly been about compile-time constant folding, the example of passing a value from the EE and treating it as a constant in the VU is still analogous to calculating a value at compile time in D at higher precision than the instruction set the runtime code is compiled to work with.

/arch:sse2 is the default with MSVC x64 builds (Xbox One defaults to /arch:avx), and it sounds like DMD has defaulted to SSE2 for a long time, the exception being the compile-time behaviour. Compile-time behaviour conforming to the runtime behaviour is an option I want, with the default being whatever is decided here. Executing code at compile time at a higher precision than what SSE dictates is effectively undesired behaviour for our use cases.

And in cases where we compile code for another architecture on x64 (let's say ARM code with NEON instructions, as it's the most common case thanks to iOS development), it would be forced to fall back to the default. That's fine for most use cases as well. It would be up to the user to compile their ARM code on an ARM processor if they need the code execution to match.

May 18, 2016
On Wednesday, 18 May 2016 at 08:21:18 UTC, Walter Bright wrote:
> Trying to make D behave exactly like various C++ compilers do, with all their semi-documented behavior and semi-documented switches that affect constant folding behavior, is a hopeless task.
>
> I doubt various C++ compilers are this compatible, even if they follow the same ABI.
>

They aren't. For instance, GCC use arbitrary precision FB, and LLVM uses 128 bits soft floats in their innards.
May 18, 2016
On Wednesday, 18 May 2016 at 08:38:07 UTC, Era Scarecrow wrote:
>  try {}    // Considers the result of 1 line of basic math to be caught by:
>  carry     {} //only activates if carry is set
>  overflow  {} //if overflowed during some math
>  modulus(m){} //get the remainder as m after a division operation
>  mult(dx)  {} //get upper 32/64/whatever after a multiply and set as dx
>
>  Of course I'd understand if some hardware doesn't offer such support, so the else could be thrown in to allow a workaround code to detect such an event, or only allow it if it's a compliant architecture. Although workaround detection is always possible, just not as fast as hardware supplied.

https://code.dlang.org/packages/checkedint
https://dlang.org/phobos/core_checkedint.html

May 18, 2016
On Wednesday, 18 May 2016 at 09:13:35 UTC, Iain Buclaw wrote:
> Can you back that up statistically?  Try running this same operation 600 million times plot a graph for the result from each run for it so we can get an idea of just how random or arbitrary it really is.

Huh? This isn't about statistics. It is about math. The magnitude of the difference depends on what you do with the constant. It can be exponentially boosted.



May 18, 2016
On 5/18/2016 3:15 AM, deadalnix wrote:
> On Wednesday, 18 May 2016 at 08:21:18 UTC, Walter Bright wrote:
>> Trying to make D behave exactly like various C++ compilers do, with all their
>> semi-documented behavior and semi-documented switches that affect constant
>> folding behavior, is a hopeless task.
>>
>> I doubt various C++ compilers are this compatible, even if they follow the
>> same ABI.
>>
>
> They aren't. For instance, GCC use arbitrary precision FB, and LLVM uses 128
> bits soft floats in their innards.

Looks like LLVM had the same idea as myself.

Anyhow, this pretty much destroys the idea that I have proposed some sort of cowboy FP that's going to wreck FP programs.

(What is arbitrary precision FB?)
May 18, 2016
On Wednesday, 18 May 2016 at 09:21:30 UTC, Ola Fosheim Grøstad wrote:
> No. The "const float y" will not be coerced to 32 bit, but the "float y" will be coerced to 32 bit. So you get two different y values. (On a specific compiler, i.e. DMD.)

I'm not sure that the `const float` vs `float` is the difference per se.  The difference is that in the examples you've given, the `const float` is being determined (and used) at compile time.

But a `const float` won't _always_ be determined or used at compile time, depending on the context and manner in which the value is set.

Let's be clear about the problem -- compile time vs. runtime, rather than `const` vs non-`const`.