October 24, 2013 Re: Question/request/bug(?) re. floating-point in dmd
Posted in reply to Walter Bright | On Wednesday, 23 October 2013 at 16:50:52 UTC, Walter Bright wrote:
> Java initially tried to enforce a maximum precision, and it was a major disaster for them. If I have been unable to convince you, I suggest reviewing that case history.

It was recently on Hacker News. Here is one of the relevant rants:

http://www.cs.berkeley.edu/~wkahan/JAVAhurt.pdf

Page 16 is probably the core argument "Linguistically legislated exact reproducibility is unenforceable", but everything Kahan says here is worth listening to (despite the ugly presentation).
October 24, 2013 Re: Question/request/bug(?) re. floating-point in dmd
Posted in reply to qznc | On 10/23/2013 10:45 PM, qznc wrote:
> Page 16 is probably the core argument "Linguistically legislated exact
> reproducibility is unenforceable", but everything Kahan says here is worth
> listening to (despite the ugly presentation).
I agree that Prof Kahan is well worth listening to for anyone with even a passing interest in floating point arithmetic.
October 24, 2013 Re: Question/request/bug(?) re. floating-point in dmd
Posted in reply to Apollo Hogan | On Wednesday, 23 October 2013 at 15:44:54 UTC, Apollo Hogan wrote:
> For example, the appended code produces the following output when compiled (DMD32 D Compiler v2.063.2 under WinXP/cygwin) with no optimization:
>
> immutable(pair)(1.1, -2.03288e-20)
> pair(1, 0.1)
> pair(1.1, -8.32667e-17)
>
> and the following results when compiled with optimization (-O):
>
> immutable(pair)(1.1, -2.03288e-20)
> pair(1, 0.1)
> pair(1.1, 0)
>
> The desired result would be:
>
> immutable(pair)(1.1, -8.32667e-17)
> pair(1, 0.1)
> pair(1.1, -8.32667e-17)
>
> Cheers,
> --Apollo
>
> import std.stdio;
> struct pair { double hi, lo; }
> pair normalize(pair q)
> {
>     double h = q.hi + q.lo;
>     double l = q.lo + (q.hi - h);
>     return pair(h,l);
> }
> void main()
> {
>     immutable static pair spn = normalize(pair(1.0,0.1));
>     writeln(spn);
>     writeln(pair(1.0,0.1));
>     writeln(normalize(pair(1.0,0.1)));
> }

I can replicate it here. Here is an objdump diff of normalize:

Optimized:                                       | Unoptimized:
08076bdc <_D6fptest9normalizeFS6fptest4pairZS6fptest4pair>:   08076bdc <_D6fptest9normalizeFS6fptest4pairZS6fptest4pair>:
 8076bdc: 55        push %ebp                       8076bdc: 55        push %ebp
 8076bdd: 8b ec     mov %esp,%ebp                   8076bdd: 8b ec     mov %esp,%ebp
 8076bdf: 83 ec 10  sub $0x10,%esp               |  8076bdf: 83 ec 14  sub $0x14,%esp
 8076be2: dd 45 08  fldl 0x8(%ebp)                  8076be2: dd 45 08  fldl 0x8(%ebp)
 8076be5: d9 c0     fld %st(0)                   |  8076be5: dc 45 10  faddl 0x10(%ebp)
 8076be7: 89 c1     mov %eax,%ecx                |  8076be8: dd 5d ec  fstpl -0x14(%ebp)
 8076be9: dc 45 10  faddl 0x10(%ebp)             |  8076beb: dd 45 08  fldl 0x8(%ebp)
 8076bec: dd 55 f0  fstl -0x10(%ebp)             |  8076bee: dc 65 ec  fsubl -0x14(%ebp)
 8076bef: de e9     fsubrp %st,%st(1)            |  8076bf1: dc 45 10  faddl 0x10(%ebp)
 8076bf1: dd 45 f0  fldl -0x10(%ebp)             |  8076bf4: dd 5d f4  fstpl -0xc(%ebp)
 8076bf4: d9 c9     fxch %st(1)                  |  8076bf7: dd 45 ec  fldl -0x14(%ebp)
 8076bf6: dc 45 10  faddl 0x10(%ebp)             |  8076bfa: dd 18     fstpl (%eax)
 8076bf9: dd 5d f8  fstpl -0x8(%ebp)             |  8076bfc: dd 45 f4  fldl -0xc(%ebp)
 8076bfc: dd 45 f8  fldl -0x8(%ebp)              |  8076bff: dd 58 08  fstpl 0x8(%eax)
 8076bff: d9 c9     fxch %st(1)                  |  8076c02: c9        leave
 8076c01: dd 19     fstpl (%ecx)                 |  8076c03: c2 10 00  ret $0x10
 8076c03: dd 59 08  fstpl 0x8(%ecx)              |  8076c06: 90        nop
 8076c06: 8b e5     mov %ebp,%esp                |  8076c07: 90        nop
 8076c08: 5d        pop %ebp                     |  8076c08: 90        nop
 8076c09: c2 10 00  ret $0x10                    |  8076c09: 90        nop
                                                 >  8076c0a: 90        nop
                                                 >  8076c0b: 90        nop

I cannot see any significant difference. The fadd-fsub-fadd sequence seems to be the same in both cases.
October 24, 2013 Re: Question/request/bug(?) re. floating-point in dmd
Posted in reply to qznc | On Thursday, 24 October 2013 at 06:12:03 UTC, qznc wrote:
>> import std.stdio;
>> struct pair { double hi, lo; }
>> pair normalize(pair q)
>> {
>>     double h = q.hi + q.lo;
>>     double l = q.lo + (q.hi - h);
>>     return pair(h,l);
>> }
>> void main()
>> {
>>     immutable static pair spn = normalize(pair(1.0,0.1));
>>     writeln(spn);
>>     writeln(pair(1.0,0.1));
>>     writeln(normalize(pair(1.0,0.1)));
>> }
> I cannot see any significant difference. The fadd-fsub-fadd sequence seems to be the same in both cases.

More observations: it requires -m32 to reproduce. On 64-bit it gives the desired output even in optimized form.

I wrote a small C program for comparison:

------
#include <stdio.h>

typedef struct pair {
    double hi, lo;
} pair;

pair normalize(pair q) {
    double h = q.hi + q.lo;
    double l = q.lo + (q.hi - h);
    //double l = q.lo + (q.hi - (q.hi + q.lo));
    pair res = {h,l};
    return res;
}

int main() {
    pair x = {1.0, 0.1};
    pair y = normalize(x);
    printf("%g %g\n", y.hi, y.lo);
    return 0;
}
------
gcc -m32 normtest_replicate.c; ./a.out
Output: 1.1 -8.32667e-17

Note the commented-out line: if you use it instead of the line above it, then
Output: 1.1 0

C semantics require h to be rounded to double. In the second case, l is computed at full hardware precision instead.

For a shorter example, two versions:

// version 1:
double y = c + (a - b);

// version 2:
double x = a - b;
double y = c + x;

In C these have different semantics with respect to precision, but in D they are equivalent.
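(Editor's illustration, not part of the original post: a minimal D sketch of the mechanism described above, assuming an x86 target where real is the 80-bit x87 format and where plain double arithmetic is strictly rounded, e.g. 64-bit SSE code generation. Holding the intermediate sum at extended precision makes the compensation term cancel to exactly zero, matching the "1.1 0" output.)

------
import std.stdio;

void main()
{
    double hi = 1.0, lo = 0.1;

    // Strict double rounding: h absorbs a rounding error e,
    // and lo + (hi - h) recovers exactly -e.
    double h = hi + lo;
    double l = lo + (hi - h);
    writeln(l);    // -8.32667e-17 when doubles are strictly rounded

    // Excess precision, emulated explicitly with 80-bit real:
    // 1.0 + lo is exactly representable in the wider format,
    // so the recovered error is exactly zero.
    real hx = cast(real)hi + lo;
    double lx = cast(double)(lo + (hi - hx));
    writeln(lx);   // 0
}
------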
October 24, 2013 Re: Question/request/bug(?) re. floating-point in dmd
Posted in reply to qznc | On Thursday, 24 October 2013 at 07:27:09 UTC, qznc wrote:
> On Thursday, 24 October 2013 at 06:12:03 UTC, qznc wrote:
>>> import std.stdio;
>>> struct pair { double hi, lo; }
>>> pair normalize(pair q)
>>> {
>>>     double h = q.hi + q.lo;
>>>     double l = q.lo + (q.hi - h);
>>>     return pair(h,l);
>>> }
>>> void main()
>>> {
>>>     immutable static pair spn = normalize(pair(1.0,0.1));
>>>     writeln(spn);
>>>     writeln(pair(1.0,0.1));
>>>     writeln(normalize(pair(1.0,0.1)));
>>> }
>> I cannot see any significant difference. The fadd-fsub-fadd sequence seems to be the same in both cases.
>
> More observations: it requires -m32 to reproduce. On 64-bit it gives the desired output even in optimized form.
>
> I wrote a small C program for comparison:
>
> ------
> #include <stdio.h>
>
> typedef struct pair {
>     double hi, lo;
> } pair;
>
> pair normalize(pair q) {
>     double h = q.hi + q.lo;
>     double l = q.lo + (q.hi - h);
>     //double l = q.lo + (q.hi - (q.hi + q.lo));
>     pair res = {h,l};
>     return res;
> }
>
> int main() {
>     pair x = {1.0, 0.1};
>     pair y = normalize(x);
>     printf("%g %g\n", y.hi, y.lo);
>     return 0;
> }
> ------
> gcc -m32 normtest_replicate.c; ./a.out
> Output: 1.1 -8.32667e-17
>
> Note the commented-out line: if you use it instead of the line above it, then
> Output: 1.1 0
>
> C semantics require h to be rounded to double. In the second case, l is computed at full hardware precision instead.
>
> For a shorter example, two versions:
>
> // version 1:
> double y = c + (a - b);
>
> // version 2:
> double x = a - b;
> double y = c + x;
>
> In C these have different semantics with respect to precision, but in D they are equivalent.

Ah, finally found the relevant part of the spec:

"It's possible that, due to greater use of temporaries and common subexpressions, optimized code may produce a more accurate answer than unoptimized code."

http://dlang.org/float.html
October 29, 2013 Re: Question/request/bug(?) re. floating-point in dmd
Posted in reply to Walter Bright | On Wednesday, 23 October 2013 at 16:50:52 UTC, Walter Bright wrote:
> On 10/23/2013 9:22 AM, David Nadlinger wrote:
>> On Wednesday, 23 October 2013 at 16:15:56 UTC, Walter Bright wrote:
>>> A D compiler is allowed to compute floating point results at arbitrarily large
>>> precision - the storage size (float, double, real) only specify the minimum
>>> precision.
>>>
>>> This behavior is fairly deeply embedded into the front end, optimizer, and
>>> various back ends.
>>
>> I know we've had this topic before, but just for the record, I'm still not sold
>> on the idea of allowing CTFE to yield different results than runtime execution.
>
> Java initially tried to enforce a maximum precision, and it was a major disaster for them. If I have been unable to convince you, I suggest reviewing that case history.
>
> Back when I designed and built digital electronics boards, it was beaten into my skull that chips always get faster, never slower, and the slower parts routinely became unavailable. This means that the circuits got designed with maximum propagation delays in mind, and with a minimum delay of 0. Then, when they work with a slow part, they'll still work if you swap in a faster one.
>
> FP precision is the same concept. Swap in more precision, and your correctly designed algorithm will still work.
THIS IS COMPLETELY WRONG. You cannot write serious floating-point code under such circumstances. This takes things back to the bad old days before IEEE, where results were implementation-dependent.
We have these wonderful properties, float.epsilon, etc, which allow code to adapt to machine differences. The correct approach is to write generic code which will give full machine precision and will work on any machine configuration. That's actually quite easy.
But to write code which will function correctly when an unspecified and unpredictable error can be added to any calculation -- I believe that's impossible. I don't know how to write such code.
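(Editor's illustration: a minimal sketch of the precision-generic style Don is pointing at, where the tolerance is derived from the type's own epsilon instead of being hard-coded, so the same source gives full machine precision at whatever precision T provides. The function name and the factor of 4 are arbitrary choices for this example.)

------
import std.math : fabs;

/// True if a and b agree to within a few ulps of b,
/// at whatever precision T actually provides.
bool closeEnough(T)(T a, T b, T relTol = 4 * T.epsilon)
{
    return fabs(a - b) <= relTol * fabs(b);
}

unittest
{
    assert( closeEnough!float (1.0f, 1.0f + float.epsilon));
    assert( closeEnough!double(1.0,  1.0  + double.epsilon));
    assert(!closeEnough!double(1.0,  1.000001));
}
------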
October 29, 2013 Re: Question/request/bug(?) re. floating-point in dmd
Posted in reply to Don | On 10/29/2013 5:08 AM, Don wrote:
> On Wednesday, 23 October 2013 at 16:50:52 UTC, Walter Bright wrote:
>> On 10/23/2013 9:22 AM, David Nadlinger wrote:
>>> On Wednesday, 23 October 2013 at 16:15:56 UTC, Walter Bright wrote:
>>>> A D compiler is allowed to compute floating point results at arbitrarily large
>>>> precision - the storage size (float, double, real) only specify the minimum
>>>> precision.
>>>>
>>>> This behavior is fairly deeply embedded into the front end, optimizer, and
>>>> various back ends.
>>>
>>> I know we've had this topic before, but just for the record, I'm still not sold
>>> on the idea of allowing CTFE to yield different results than runtime execution.
>>
>> Java initially tried to enforce a maximum precision, and it was a major
>> disaster for them. If I have been unable to convince you, I suggest reviewing
>> that case history.
>>
>> Back when I designed and built digital electronics boards, it was beaten into
>> my skull that chips always get faster, never slower, and the slower parts
>> routinely became unavailable. This means that the circuits got designed with
>> maximum propagation delays in mind, and with a minimum delay of 0. Then, when
>> they work with a slow part, they'll still work if you swap in a faster one.
>>
>> FP precision is the same concept. Swap in more precision, and your correctly
>> designed algorithm will still work.
>
>
> THIS IS COMPLETELY WRONG. You cannot write serious floating-point code under
> such circumstances. This takes things back to the bad old days before IEEE,
> where results were implementation-dependent.
>
> We have these wonderful properties, float.epsilon, etc, which allow code to
> adapt to machine differences. The correct approach is to write generic code
> which will give full machine precision and will work on any machine
> configuration. That's actually quite easy.
>
> But to write code which will function correctly when an unspecified and
> unpredictable error can be added to any calculation -- I believe that's
> impossible. I don't know how to write such code.
Unpredictable, sure, but it is unpredictable in that the error is less than a guaranteed maximum error. The error falls in a range 0<=error<=epsilon. As an analogy, all engineering parts are designed with a maximum deviation from the ideal size, not a minimum deviation.
October 30, 2013 Re: Question/request/bug(?) re. floating-point in dmd
Posted in reply to Walter Bright | On Tuesday, 29 October 2013 at 19:42:08 UTC, Walter Bright wrote:
> On 10/29/2013 5:08 AM, Don wrote:
>> On Wednesday, 23 October 2013 at 16:50:52 UTC, Walter Bright wrote:
>>> On 10/23/2013 9:22 AM, David Nadlinger wrote:
>>>> On Wednesday, 23 October 2013 at 16:15:56 UTC, Walter Bright wrote:
>>>>> A D compiler is allowed to compute floating point results at arbitrarily large
>>>>> precision - the storage size (float, double, real) only specify the minimum
>>>>> precision.
>>>>>
>>>>> This behavior is fairly deeply embedded into the front end, optimizer, and
>>>>> various back ends.
>>>>
>>>> I know we've had this topic before, but just for the record, I'm still not sold
>>>> on the idea of allowing CTFE to yield different results than runtime execution.
>>>
>>> Java initially tried to enforce a maximum precision, and it was a major
>>> disaster for them. If I have been unable to convince you, I suggest reviewing
>>> that case history.
>>>
>>> Back when I designed and built digital electronics boards, it was beaten into
>>> my skull that chips always get faster, never slower, and the slower parts
>>> routinely became unavailable. This means that the circuits got designed with
>>> maximum propagation delays in mind, and with a minimum delay of 0. Then, when
>>> they work with a slow part, they'll still work if you swap in a faster one.
>>>
>>> FP precision is the same concept. Swap in more precision, and your correctly
>>> designed algorithm will still work.
>>
>>
>> THIS IS COMPLETELY WRONG. You cannot write serious floating-point code under
>> such circumstances. This takes things back to the bad old days before IEEE,
>> where results were implementation-dependent.
>>
>> We have these wonderful properties, float.epsilon, etc, which allow code to
>> adapt to machine differences. The correct approach is to write generic code
>> which will give full machine precision and will work on any machine
>> configuration. That's actually quite easy.
>>
>> But to write code which will function correctly when an unspecified and
>> unpredictable error can be added to any calculation -- I believe that's
>> impossible. I don't know how to write such code.
>
> Unpredictable, sure, but it is unpredictable in that the error is less than a guaranteed maximum error. The error falls in a range 0<=error<=epsilon. As an analogy, all engineering parts are designed with a maximum deviation from the ideal size, not a minimum deviation.
I don't think the analogy is strong. There's no reason for there to be any error at all.
Besides, in the x87 case, there are exponent errors as well as precision errors. E.g., double.min * double.min can be zero on some systems, but non-zero on others. This causes a total loss of precision.
If this is allowed to happen anywhere (and not even consistently) then it's back to the pre-IEEE 754 days: underflow and overflow lead to unspecified behaviour.
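(Editor's illustration: a minimal sketch of the exponent effect just mentioned, emulating the extended range with real; double.min_normal is the current name for what the post calls double.min. The non-zero result assumes a target where real is the 80-bit x87 format.)

------
import std.stdio;

void main()
{
    double tiny = double.min_normal;   // ~2.2e-308, smallest normal double

    // Rounded to double, the product underflows to zero...
    double pd = tiny * tiny;
    writeln(pd);    // 0 -- if it is actually rounded to double

    // ...but it survives easily in the 80-bit extended range.
    real pr = cast(real)tiny * tiny;
    writeln(pr);    // ~5e-616 where real is x87 80-bit
}
------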
The idea that extra precision is always a good thing is simply incorrect.
The problem is that, if calculations can carry extra precision, double rounding can occur. This is a form of error that doesn't otherwise exist. If all calculations are allowed to do it, there is absolutely nothing you can do to fix the problem.
Thus we lose the other major improvement from IEEE 754: predictable rounding behaviour.
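(Editor's illustration: a concrete double-rounding case, as a sketch. Both operands below are exact doubles and their exact sum is 1 + 2^-53 + 2^-66. Rounded once to double, that sum rounds up to 1 + 2^-52; rounded first to 80-bit extended and then to double, it rounds down to 1 + 2^-53 and then, half-to-even, to exactly 1. The same source thus prints different answers depending on whether the add carried excess precision.)

------
import std.stdio;

void main()
{
    double a = 1.0;
    double b = 0x1.0008p-53;   // 2^-53 + 2^-66, exactly representable

    double s = a + b;

    // Rounded once to double:            0x1.0000000000001p+0  (1 + 2^-52)
    // Via 80-bit extended, then double:  0x1p+0                (exactly 1)
    writefln("%a", s);
}
------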
Fundamentally, there is a primitive operation "discard extra precision" which is crucial to most mathematical algorithms but which is rarely explicit.
In theory in C and C++ this is applied at each sequence point, but in practice that's not actually done (for x87 anyway) -- for performance, you want to be able to keep values in registers sometimes. So C didn't get this exactly right. I think we can do better. But the current behaviour is worse.
This issue is becoming more obvious in CTFE because the extra precision is not merely theoretical, it actually happens.
October 30, 2013 Re: Question/request/bug(?) re. floating-point in dmd
Posted in reply to Walter Bright | On 23/10/13 18:16, Walter Bright wrote:
> To precisely control maximum precision, I suggest using inline assembler to use
> the exact sequence of instructions needed for double-double operations.
Could be relevant here: last year I wrote some code which divided up the closed interval [0, 1] into small increments. If the division was by powers of 10, dmd would wind up miscalculating; with powers of 2, it was fine.
By contrast GDC and LDC were fine, which makes me think their backends must implement effective decimal support for floating-point within certain limited bounds.
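(Editor's illustration: a minimal sketch of why power-of-10 step sizes misbehave while power-of-2 steps do not. 0.1 has no finite binary representation, so every stored increment carries a rounding error that accumulates, and how much of it accumulates depends on the precision of the intermediates -- exactly where backends can diverge. 1.0/16 is exact, so the second loop has no error at any precision.)

------
import std.stdio;

void main()
{
    double x = 0.0;
    foreach (i; 0 .. 10)
        x += 0.1;              // each step adds a representation error
    writefln("%.17g", x);      // typically 0.99999999999999989, not 1

    double y = 0.0;
    foreach (i; 0 .. 16)
        y += 1.0 / 16.0;       // exact steps
    writefln("%.17g", y);      // exactly 1
}
------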
October 30, 2013 Re: Question/request/bug(?) re. floating-point in dmd
Posted in reply to Apollo Hogan | It's called excess precision and regularly causes confusion.

http://d.puremagic.com/issues/show_bug.cgi?id=6531