August 13

On Thursday, 8 August 2024 at 10:31:32 UTC, Carsten Schlote wrote:
> Ok, the fractal code uses 'double' floats, which are by their very nature limited in precision. But assuming that the math emulation of CTFE works identically to the CPU at runtime, the outcome should be identical.
>
> Any experiences or thoughts on this?

It's worth mentioning that in the wider field, it is generally held that when working with floating point operations we don't expect to see identical results every time. It is a mistake to compare for equality; you should be checking for "almost equals" instead.

For your particular application it might feel like a "bug" to see different results, but in fact this is expected behaviour and generally not held to be a fault.

One little piece of anecdata: when working with Python I saw differing results in the 17th decimal place when running the exact same calculation in the same Python session. The reason turned out to be which registers were being used each time, which could vary. This is not considered to be a bug.

August 13
On 8/13/24 09:56, Abdulhaq wrote:
> 
> It's worth mentioning that in the wider field, it is generally held that when working with floating point operations we don't expect to see the same duplicate results every time.

This is not true at the level of hardware, where results are actually reproducible on the same processor architecture (and often even across architectures) with the same sequence of instructions. Treating floats as "fuzzy" is perhaps common, but it is not correct. Rounding is a pure function like any other.

> It is a mistake to compare for equality, you should be checking for "almost equals".

That is a different point.

> 
> For your particular application it might feel like a "bug" to see different results, but in fact this is expected behaviour and generally not held to be a fault.
> ...

It's annoying anyway, and we should hold programming languages to a higher standard than that.

> One little piece of anecdata, when working with python I saw differing results in the 17th d.p. when running the exact same calculation in the same python session. The reason turned out to be which registers were being used each time, which could vary. This is not considered to be a bug. 

Well, I think the x87 is a bad design as a compilation target. If you are targeting that with a compiler whose developers by default do not care about reproducibility, expect weird behavior. These instructions are now deprecated, so the calculation should change a bit.
August 13

On Monday, 12 August 2024 at 11:06:15 UTC, IchorDev wrote:
> On Monday, 12 August 2024 at 10:22:52 UTC, Quirin Schroll wrote:
>> On almost all non-embedded CPUs, doing non-vector calculations in float is more costly than doing them in double or real because for single arguments, the floats are converted to double or real. I consider float to be a type used for storing values in arrays that don’t need the precision and save me half the RAM.
>
> I don’t care. Only one family of CPU architectures supports ‘extended precision’ floats (because it’s a waste of time), so I would like to know a way to always perform calculations with double precision for better cross-platform consistency. Imagine trying to implement the JRE without being able to do native double-precision maths.

I honestly don’t know how the JRE implemented double operations on e.g. the Intel 80486, but if I try using gcc -mfpmath=387 -O3 and add some double values, intermediate results are stored and loaded again. Very inefficient. If I use long double, that does not happen.

The assertion that only one CPU family supports extended floats is objectively wrong. You probably meant the x86 with its 80-bit format, which is still noteworthy, as x86 is very, very common. However, at least the POWER9 family supports 128-bit IEEE-754 quadruple-precision floats. IIUC, RISC-V also supports them.

August 13
On Tuesday, 13 August 2024 at 08:33:51 UTC, Timon Gehr wrote:
> It's annoying anyway, and we should hold programming languages to a higher standard than that.
>
> Well, I think the x87 is a bad design as a compilation target. If you are targeting that with a compiler whose developers by default do not care about reproducibility, expect weird behavior. These instructions are now deprecated, so the calculation should change a bit.

It's an interesting discussion, but we are where we are, and I think you'd agree that the OP should understand that in the current world of "systems" programming languages, this behaviour would not be viewed as a "bug".

I'm only being somewhat tongue-in-cheek when I say that his application, viewing fractals, could be considered a kind of "chaos amplifier" that is especially prone to these sorts of issues.

I'm going to guess that the situation we're in is due to a preference of speed over reproducibility in the days when computers ran much more slowly than they do now. Would we choose a different priority these days? I don't know.
August 13

On Tuesday, 13 August 2024 at 11:02:07 UTC, Quirin Schroll wrote:
> I honestly don’t know how JRE did implement double operations on e.g. the Intel 80486, but if I try using gcc -mfpmath=387 -O3 and add some double values, intermediate results are stored and loaded again. Very inefficient. If I use long double, that does not happen.

For those who haven't really run into numerical analysis before, this paper is worth reading. I've given a link to the HN discussion around it:

https://news.ycombinator.com/item?id=37028310

August 13
On Tuesday, 13 August 2024 at 11:14:35 UTC, Abdulhaq wrote:
> ...
> For those who haven't really run into numerical analysis before, this paper is worth reading. I've given a link to the HN discussion around it:
>
> https://news.ycombinator.com/item?id=37028310

Since I can't access that site, could you please share the URL directly to this paper?

Matheus.
August 13
On Tuesday, 13 August 2024 at 14:20:29 UTC, matheus wrote:
> On Tuesday, 13 August 2024 at 11:14:35 UTC, Abdulhaq wrote:
>> ...
>> For those who haven't really run into numerical analysis before, this paper is worth reading. I've given a link to the HN discussion around it:
>>
>> https://news.ycombinator.com/item?id=37028310
>
> Since I can't access that site, could you please share the URL directly to this paper?
>
> Matheus.

Sure, it's https://people.eecs.berkeley.edu/~wkahan/JAVAhurt.pdf
August 13

On Tuesday, 13 August 2024 at 11:02:07 UTC, Quirin Schroll wrote:

> I honestly don’t know how JRE did implement double operations on e.g. the Intel 80486

Probably in software, but modern x86 CPUs can just use the hardware, so the difference isn’t so meaningful anymore.

> if I try using gcc -mfpmath=387 -O3 and add some double values, intermediate results are stored and loaded again. Very inefficient. If I use long double, that does not happen.

Who cares? In a situation where we must reach the same result on every platform (cross-platform determinism), the performance can suffer a bit. You are just avoiding my question by making excuses. Do you make sure your program’s data is passed around exclusively via data registers? Do you only write branchless code? Does your entire program fit inside the CPU cache? No, because we make performance sacrifices to achieve the desired outcome. The idea that the only valid way of coding something is the way that compromises the integrity of the output in favour of performance is a step away from programming nihilism.

> You probably meant the x86 with its 80-bit format, which is still noteworthy

Yes, because it’s referred to as ‘extended precision’ and doesn’t have a proper name, because it’s an unstandardised atrocity.
https://en.wikipedia.org/wiki/Extended_precision#x86_extended_precision_format

> However, at least the POWER9 family supports 128-bit IEEE-754 quadruple-precision floats. IIUC, RISC-V also supports them.

binary128 is obviously not the same as x86’s ‘extended precision’. You will not get cross-platform deterministic results from using them interchangeably; you will get misery.

August 13
On 8/13/24 13:09, Abdulhaq wrote:
> On Tuesday, 13 August 2024 at 08:33:51 UTC, Timon Gehr wrote:
> 
>>
>> It's annoying anyway, and we should hold programming languages to a higher standard than that.
>>
>>
>> Well, I think the x87 is a bad design as a compilation target. If you are targeting that with a compiler whose developers by default do not care about reproducibility, expect weird behavior. These instructions are now deprecated, so the calculation should change a bit.
> 
> It's an interesting discussion, but we are where we are, and I think you'd agree that the OP should understand that in the current world of "systems" programming languages, this behaviour would not be viewed as a "bug".
> ...

Well, I think one thing is where we are now, another question is where we should be in the future.

> I'm only being somewhat tongue-in-cheek when I say that his application, viewing fractals, could be considered a kind of "chaos amplifier" that is especially prone to these sorts of issues.
> ...

Not really. I think it is valid to expect reproducible results for addition, absolute value, and squaring, no matter how many times you repeat them. The reason why this issue happened is that D does not even try a little bit, and explicitly just goes ahead and uses the wrong data type, namely the 80-bit one only supported by the x87.

> I'm going to guess that the situation we're in is due to a preference of speed over reproducibility in the days when computers ran much more slowly than they do now.

Ironically, the x87 design is mostly about getting accurate numerics. It is a rather inefficient design. One issue is that it was designed for assembly programmers, not C compilers. C language features do not map well to it. x87 results are reproducible in assembly as it is clear where rounding happens.

It is true that computers used to run more slowly, and the x87 is some part of the reason why that was true. There are many reasons why the design was ditched; efficiency is one of them.

> Would we choose a different priority these days? I don't know.

The x87 results in both slower code and worse reproducibility when used as a target for `float`/`double`/`real` computations.

Ironically, IIRC Kahan was involved in the x87 design, yet algorithms attributed to him, like Kahan summation, actually cannot be implemented correctly using D floating-point types, because of the insane rules introduced to support targeting the x87 without changing rounding precision or spilling data to the cache and reloading it.

At the same time, "better accuracy" is part of the reason why CTFE uses higher floating-point precision, but higher precision does not mean the result will be more accurate. Unless arbitrary precision is used, which is even slower, it leads to double-rounding issues which can cause the result to be less accurate.
August 13
On Tuesday, 13 August 2024 at 21:03:00 UTC, Timon Gehr wrote:
> On 8/13/24 13:09, Abdulhaq wrote:
>
> At the same time, "better accuracy" is part of the reason why CTFE uses higher floating-point precision, but higher precision does not mean the result will be more accurate. Unless arbitrary precision is used, which is even slower, it leads to double-rounding issues which can cause the result to be less accurate.

If you have a pristine 24-bit audio sample, it has maybe 120 dB SNR due to DAC limitations. Processing it at 16 bits will automatically lose you 4 bits of accuracy: it'll drop the SNR to 96 dB. If you process it at 32 bits, you still have 120 dB SNR: greater precision but the same accuracy as the source.

The point is that the statement "higher precision does not mean the result will be more accurate" is only half true.

If the precision you are doing the calculations at is already higher than the accuracy of your data, more precision won't get you much of anything.

But if the converse is true, i.e. you are processing the data at lower precision than the accuracy of your source data, then increasing precision will absolutely increase accuracy.