September 22, 2006
http://d.puremagic.com/issues/show_bug.cgi?id=360


smjg@iname.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |smjg@iname.com




------- Comment #7 from smjg@iname.com  2006-09-22 15:06 -------
(In reply to comment #5)
> const float A = 0.2;  // infinitely accurate 0.2, but type inference on A should return a float.
> 
> const float B = 0.2f; // a 32-bit approximation to 0.2
> const real C = 0.2; // infinitely accurate 0.2
> const real D = 0.2f; // a 32-bit approximation to 0.2, but type inference will
> give an 80-bit quantity.

I agree.  Only I'm not sure about A.  If you want it to be "infinitely accurate", then why would you declare it to be a float?  It appears to me to be a means by which a float can hold more precision than it really can.  On the other hand, D (the constant above) should definitely be a 32-bit approximation to 0.2: with the 'f' suffix, a 32-bit approximation is exactly what the programmer asked for.


-- 

September 22, 2006
To summarize: ---

The compiler is allowed to evaluate intermediate results at a greater precision than that of the operands. The literal type suffix (like 'f') only indicates the type. For purposes of constant folding, the compiler may internally maintain as much precision as possible. Committing the result to its actual precision is done as late as possible.

For a low-precision constant, put the value into a static, non-const variable. Since this is not really a constant, it cannot be constant folded, and is therefore not affected by a possible compile-time increase in precision. However, if it is mixed with higher-precision values at runtime, an increase in precision will still occur.

The way to write robust floating point calculations in D is to ensure
that increasing the precision of the calculations will not break the result.

--- end of summary

This is the explanation I was looking for. It was already clear that D evaluates intermediate results at high precision at runtime, but the compile-time behavior (using a const) is different from the runtime behavior (using a static), and I don't think that is clearly explained in the documentation.
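
To check my understanding, here is a minimal sketch of the const/static
difference (the comments reflect my reading of the summary, not verified
compiler output):

  void example()
  {
      const float cf = 0.2f;   // foldable: may be carried at full precision
      static float sf = 0.2f;  // not foldable: a real 32-bit 0.2 at runtime

      real r1 = cf * 10;  // may fold as (near-exact 0.2) * 10, i.e. 2.0
      real r2 = sf * 10;  // the float-rounded 0.2 enters the multiply,
                          // so r2 ends up slightly above 2.0
  }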

Would you please add this information to the D documentation? Perhaps an addition to the Floating Point page (http://www.digitalmars.com/d/float.html). Of course, if any of the above is incorrect, please change as necessary.

A follow-on question would be: how does one create a low-precision constant that is guaranteed to actually stay constant? A static won't do, since a static is really non-const, and a programming error could change the value.


Thanks,
  Bradley
September 23, 2006
Walter Bright wrote:
> Don Clugston wrote:
>> Walter Bright wrote:
>>> Not in D. The 'f' suffix only indicates the type.
>>
>> And therefore, it only matters in implicit type deduction, and in function overloading. As I discuss below, I'm not sure that it's necessary even there.
>> In many cases, it's clearly a programmer error. For example in
>> real BAD = 0.2f;
>> where the f has absolutely no effect.
> 
> It may come about as a result of source code generation, though, so I'd be reluctant to make it an error.
> 
> 
>>> You can by putting the constant into a static, non-const variable. Then it cannot be constant folded.
>>
>> Actually, in this case you still want it to be constant folded.
> 
> A static variable's value can change, so it can't be constant folded. To have it participate in constant folding, it needs to be declared as const.

But if it's const, then it's not float precision! I want both!

>> I agree. But it seems that D is currently in a halfway house on this issue. Somehow, 'double' is privileged, and don't think it's got any right to be.
>>
>>     const XXX = 0.123456789123456789123456789f;
>>     const YYY = 1 * XXX;
>>     const ZZZ = 1.0 * XXX;
>>
>>    auto xxx = XXX;
>>    auto yyy = YYY;
>>    auto zzz = ZZZ;
>>
>> // now xxx and yyy are floats, but zzz is a double.
>> Multiplying by '1.0' causes a float constant to be promoted to double.
> 
> That's because 1.0 is a double. A double*float => double.
> 
>>    real a = xxx;
>>    real b = zzz;
>>    real c = XXX;
>>
>> Now a, b, and c all have different values.
>>
>> Whereas the same operation at runtime causes it to be promoted to real.
>>
>> Is there any reason why implicit type deduction on a floating point constant doesn't always default to real? After all, you're saying "I don't particularly care what type this is" -- why not default to maximum accuracy?
>>
>> Concrete example:
>>
>> real a = sqrt(1.1);
>>
>> This only gives a double precision result. You have to write
>> real a = sqrt(1.1L);
>> instead.
>> It's easier to do the wrong thing, than the right thing.
>>
>> IMHO, unless you specifically take other steps, implicit type deduction should always default to the maximum accuracy the machine could do.
> 
> It is a good idea, but it isn't that way, for these reasons:
> 
> 1) It's the way C, C++, and Fortran work. Changing the promotion rules would mean that, when translating solid, reliable libraries from those languages to D, one would have to be very, very careful.

That's very important. Still, those languages don't have implicit type deduction. Also, none of those languages guarantee accuracy of decimal->binary conversions, so there's always some error in decimal constants. Incidentally, I recently read that GCC uses something like 160 bits for constant folding, so it's always going to give results that are different to those on other compilers.

Why doesn't D behave like C with respect to 'f' suffixes?
(Ie, do the conversion, then truncate it to float precision).
Actually, I can't imagine many cases where you'd actually want a 'float' constant instead of a 'real' one.

> 2) Float and double are expected to be implemented in hardware. Longer precisions are often not available. I wanted to make it practical for a D implementation on those machines to provide a software long precision floating point type, rather than just making real==double. Such a type would be very slow compared with double.

Interesting. I thought that 'real' was supposed to be the highest accuracy fast floating point type, and would therefore be either 64, 80, or 128 bits. So it could also be a double-double?
For me, the huge benefit of the 'real' type is that it guarantees that optimisation won't change the results. In C, using doubles, it's quite unpredictable when a temporary will be 80 bits, and when it will be 64 bits. In D, if you stick to real, you're guaranteed that nothing weird will happen. I'd hate to lose that.

> 3) Real, even in hardware, is significantly slower than double. Doing constant folding at max precision at compile time won't affect runtime performance, so it is 'free'.

In this case, the initial issue remains: in order to write code which maintains accuracy regardless of machine precision, it is sometimes necessary to specify the precision that should be used for constants.
The original code was an example where weird things happened because
that wasn't respected.
September 24, 2006
Don Clugston wrote:
> Walter Bright wrote:
>> A static variable's value can change, so it can't be constant folded. To have it participate in constant folding, it needs to be declared as const.
> But if it's const, then it's not float precision! I want both!

You can always use hex float constants. I know they're not pretty, but the point of them is to be able to specify exact floating point bit patterns. There are no rounding errors with them.
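
For example, 0x1.99999Ap-3 is (if my arithmetic is right) the nearest float
to 0.2, so:

  const float LOW = 0x1.99999Ap-3; // exactly the 32-bit approximation of 0.2
  // No decimal->binary rounding is involved: the literal names an exact bit
  // pattern, so even full-precision constant folding starts from this value.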

>> 1) It's the way C, C++, and Fortran work. Changing the promotion rules would mean that, when translating solid, reliable libraries from those languages to D, one would have to be very, very careful.
> 
> That's very important. Still, those languages don't have implicit type deduction. Also, none of those languages guarantee accuracy of decimal->binary conversions, so there's always some error in decimal constants. Incidentally, I recently read that GCC uses something like 160 bits for constant folding, so it's always going to give results that are different to those on other compilers.
> 
> Why doesn't D behave like C with respect to 'f' suffixes?
> (Ie, do the conversion, then truncate it to float precision).
> Actually, I can't imagine many cases where you'd actually want a 'float' constant instead of a 'real' one.

A float constant would be desirable to keep the calculation all floats for speed reasons. I can't think of many reasons one would want reduced precision.

>> 2) Float and double are expected to be implemented in hardware. Longer precisions are often not available. I wanted to make it practical for a D implementation on those machines to provide a software long precision floating point type, rather than just making real==double. Such a type would be very slow compared with double.
> 
> Interesting. I thought that 'real' was supposed to be the highest accuracy fast floating point type, and would therefore be either 64, 80, or 128 bits. So it could also be a double-double?
> For me, the huge benefit of the 'real' type is that it guarantees that optimisation won't change the results. In C, using doubles, it's quite unpredictable when a temporary will be 80 bits, and when it will be 64 bits. In D, if you stick to real, you're guaranteed that nothing weird will happen. I'd hate to lose that.

I don't see how one would lose that if real were done in software.

>> 3) Real, even in hardware, is significantly slower than double. Doing constant folding at max precision at compile time won't affect runtime performance, so it is 'free'.
> 
> In this case, the initial issue remains: in order to write code which maintains accuracy regardless of machine precision, it is sometimes necessary to specify the precision that should be used for constants.
> The original code was an example where weird things happened because
> that wasn't respected.

Weird things always happen with floating point. It's just a matter of where one chooses the seams to show (you pointed out where seams show in C with temporary precision). I've seen a lot of cases where people were surprised that 0.2f (or similar) was even rounded off, and got caught by the roundoff error.
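
The classic surprise looks something like this (0.2 is not exactly
representable in binary, so the float and double approximations differ):

  unittest
  {
      float  f = 0.2f;  // nearest float to 0.2
      double d = 0.2;   // nearest double to 0.2 -- a different value
      assert(f != d);   // f is promoted to double for the comparison,
                        // and the two approximations are not equal
  }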

I used to work in mechanical engineering where a lot of numerical calculations were done. Accumulating roundoff errors were a huge problem, and a lot (most?) of engineers didn't understand it. They were using calculators for long chains of calculation, and rounding off after each step instead of carrying the full calculator precision. They were mystified by getting answers at the end that were way off.

It's my experience with that (and also in college where we were taught to never round off anything but the final answer) that led to the D design decision to internally carry around consts in full precision, regardless of type.

Deliberately reduced precision is something that only experts would want, and only for special cases. So it's reasonable that that would be harder to do (i.e. using hex float constants).

P.S. I also did some digital electronic design work long ago. The cardinal rule there was that since TTL devices got faster all the time, and old slower TTL parts became unavailable, one designed so that swapping in a faster chip would not cause the failure of the system. Hence the rule that increasing the precision of a calculation should not cause the program to fail <g>.
September 24, 2006
Walter Bright wrote:
> Don Clugston wrote:
>> Walter Bright wrote:
>>> A static variable's value can change, so it can't be constant folded. To have it participate in constant folding, it needs to be declared as const.
>> But if it's const, then it's not float precision! I want both!
> 
> You can always use hex float constants. I know they're not pretty, but the point of them is to be able to specify exact floating point bit patterns. There are no rounding errors with them.


>>> 1) It's the way C, C++, and Fortran work. Changing the promotion rules would mean that, when translating solid, reliable libraries from those languages to D, one would have to be very, very careful.
>>
>> That's very important. Still, those languages don't have implicit type deduction. Also, none of those languages guarantee accuracy of decimal->binary conversions, so there's always some error in decimal constants. Incidentally, I recently read that GCC uses something like 160 bits for constant folding, so it's always going to give results that are different to those on other compilers.
>>
>> Why doesn't D behave like C with respect to 'f' suffixes?
>> (Ie, do the conversion, then truncate it to float precision).
>> Actually, I can't imagine many cases where you'd actually want a 'float' constant instead of a 'real' one.
> 
> A float constant would be desirable to keep the calculation all floats for speed reasons. I can't think of many reasons one would want reduced precision.

Me, too. In fact I've seen a lot of code where ignorant programmers were adding 'f' to the end of every floating point constant. It could be that the number of cases where you actually care about the precision is so small that hex constants are adequate.

>>> 2) Float and double are expected to be implemented in hardware. Longer precisions are often not available. I wanted to make it practical for a D implementation on those machines to provide a software long precision floating point type, rather than just making real==double. Such a type would be very slow compared with double.
>>
>> Interesting. I thought that 'real' was supposed to be the highest accuracy fast floating point type, and would therefore be either 64, 80, or 128 bits. So it could also be a double-double?
>> For me, the huge benefit of the 'real' type is that it guarantees that optimisation won't change the results. In C, using doubles, it's quite unpredictable when a temporary will be 80 bits, and when it will be 64 bits. In D, if you stick to real, you're guaranteed that nothing weird will happen. I'd hate to lose that.
> 
> I don't see how one would lose that if real were done in software.
> 
>>> 3) Real, even in hardware, is significantly slower than double. Doing constant folding at max precision at compile time won't affect runtime performance, so it is 'free'.
>>
>> In this case, the initial issue remains: in order to write code which maintains accuracy regardless of machine precision, it is sometimes necessary to specify the precision that should be used for constants.
>> The original code was an example where weird things happened because
>> that wasn't respected.
> 
> Weird things always happen with floating point. It's just a matter of where one chooses the seams to show (you pointed out where seams show in C with temporary precision). I've seen a lot of cases where people were surprised that 0.2f (or similar) was even rounded off, and got caught by the roundoff error.
> 
> I used to work in mechanical engineering where a lot of numerical calculations were done. Accumulating roundoff errors were a huge problem, and a lot (most?) of engineers didn't understand it. They were using calculators for long chains of calculation, and rounding off after each step instead of carrying the full calculator precision. They were mystified by getting answers at the end that were way off.
> 
> It's my experience with that (and also in college where we were taught to never round off anything but the final answer) that led to the D design decision to internally carry around consts in full precision, regardless of type.
> 
> Deliberately reduced precision is something that only experts would want, and only for special cases. So it's reasonable that that would be harder to do (i.e. using hex float constants).

OK, you've convinced me. It needs to be better documented, though.

> P.S. I also did some digital electronic design work long ago. The cardinal rule there was that since TTL devices got faster all the time, and old slower TTL parts became unavailable, one designed so that swapping in a faster chip would not cause the failure of the system. Hence the rule that increasing the precision of a calculation should not cause the program to fail <g>.

I think it would be useful to specify more precisely what happens in constant folding. Eg, mention that all constant folding will be done in IEEE round-to-nearest, ties-to-even.
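
A halfway case would make the rule concrete (this assumes the folding really
is exact and the final rounding is to-nearest, ties-to-even):

  // 1 + 2^-24 lies exactly halfway between 1.0f and the next float up
  const float t = 1.0 + 0x1p-24; // folded exactly, then committed to float
  static assert(t == 1.0f);      // the tie goes to the even mantissa, 1.0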

In the longer term, I've been wondering if the precision for real constants even needs to be the same as for the 'real' type. I can see some distinct benefits that would come if the precision of literals was defined to always be IEEE quadruple precision. Of course they'd always be rounded to 64 or 80-bit reals when the time came for them to actually be used.

Looking at the spec for the forthcoming IEEE 754R standard, and the state of SSE3 on AMD-64, it seems that Intel/AMD could very easily add a quadruple precision type (they already have 16 128 bit registers, two 64 bit mantissa units, and the quadruple exponent is the same as for x87. So I don't think it would require much silicon, and it would mean they could emulate the x87 stuff entirely on SSE). Some forward-compatibility things to consider in DMD 2.0; ignore for now.
September 24, 2006
Don Clugston wrote:
> Walter Bright wrote:
> OK, you've convinced me. It needs to be better documented, though.

I agree with you and Bradley Smith on that.

>> P.S. I also did some digital electronic design work long ago. The cardinal rule there was that since TTL devices got faster all the time, and old slower TTL parts became unavailable, one designed so that swapping in a faster chip would not cause the failure of the system. Hence the rule that increasing the precision of a calculation should not cause the program to fail <g>.
> 
> I think it would be useful to specify more precisely what happens in constant folding. Eg, mention that all constant folding will be done in IEEE round-to-nearest, ties-to-even.

Yes.

> In the longer term, I've been wondering if the precision for real constants even needs to be the same as for the 'real' type. I can see some distinct benefits that would come if the precision of literals was defined to always be IEEE quadruple precision. Of course they'd always be rounded to 64 or 80-bit reals when the time came for them to actually be used.

I agree.

> Looking at the spec for the forthcoming IEEE 754R standard, and the state of SSE3 on AMD-64, it seems that Intel/AMD could very easily add a quadruple precision type (they already have 16 128 bit registers, two 64 bit mantissa units, and the quadruple exponent is the same as for x87. So I don't think it would require much silicon, and it would mean they could emulate the x87 stuff entirely on SSE). Some forward-compatibility things to consider in DMD 2.0; ignore for now.

I was disappointed in the AMD-64 because it didn't do 128 bit floats; in fact, it relegated 80 bit floats to a backwater in the instruction set. Few computer people seem to understand the value of high precision floating point.
September 25, 2006
Walter Bright wrote:
> Don Clugston wrote:
>> Walter Bright wrote:
>> OK, you've convinced me. It needs to be better documented, though.
> 
> I agree with you and Bradley Smith on that.
> 
>>> P.S. I also did some digital electronic design work long ago. The cardinal rule there was that since TTL devices got faster all the time, and old slower TTL parts became unavailable, one designed so that swapping in a faster chip would not cause the failure of the system. Hence the rule that increasing the precision of a calculation should not cause the program to fail <g>.
>>
>> I think it would be useful to specify more precisely what happens in constant folding. Eg, mention that all constant folding will be done in IEEE round-to-nearest, ties-to-even.
> 
> Yes.
> 
>> In the longer term, I've been wondering if the precision for real constants even needs to be the same as for the 'real' type. I can see some distinct benefits that would come if the precision of literals was defined to always be IEEE quadruple precision. Of course they'd always be rounded to 64 or 80-bit reals when the time came for them to actually be used.
> 
> I agree.

One consequence of that would be in the name mangling for floating point constants in templates. Currently it's 20 hex characters, which only makes sense for a system with 80-bit reals; it might be better to make it 32 hex characters, even if the extra 12 are all '0'.

> 
>> Looking at the spec for the forthcoming IEEE 754R standard, and the state of SSE3 on AMD-64, it seems that Intel/AMD could very easily add a quadruple precision type (they already have 16 128 bit registers, two 64 bit mantissa units, and the quadruple exponent is the same as for x87. So I don't think it would require much silicon, and it would mean they could emulate the x87 stuff entirely on SSE). Some forward-compatibility things to consider in DMD 2.0; ignore for now.
> 
> I was disappointed in the AMD-64 because it didn't do 128 bit floats; in fact, it relegated 80 bit floats to a backwater in the instruction set. Few computer people seem to understand the value of high precision floating point.

Intel seems to be better than AMD in this regard. Intel added an 82 bit floating point type to the Itanium so that it could do 80-bit hypot() without overflow (in fact, Itanium seems to have by far the best floating point support that I've seen); AMD's 3DNow! didn't even support subnormals, infinity, or NaN.
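
To illustrate why the extra exponent range matters: the textbook formula
squares its arguments, so x*x can overflow even when the result is
representable. Without the headroom, a library has to scale by hand --
a rough sketch:

  import std.math; // sqrt, fabs

  real safeHypot(real x, real y)
  {
      real ax = fabs(x), ay = fabs(y);
      if (ax < ay) { real t = ax; ax = ay; ay = t; } // ensure ax >= ay
      if (ax == 0) return 0;
      real r = ay / ax;            // |r| <= 1, so r*r cannot overflow
      return ax * sqrt(1 + r * r); // sqrt(x^2 + y^2) without forming x^2
  }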
September 25, 2006
Don Clugston wrote:
> One consequence of that would be in the name mangling for floating point constants in templates. Currently it's 20 hex characters, which only makes sense for a system with 80-bit reals; it might be better to make it 32 hex characters, even if the extra 12 are all '0'.

I'm reluctant to do that because there are already problems with the mangled names getting too long.
September 25, 2006
Walter Bright wrote:
> Don Clugston wrote:
>> One consequence of that would be in the name mangling for floating point constants in templates. Currently it's 20 hex characters, which only makes sense for a system with 80-bit reals; it might be better to make it 32 hex characters, even if the extra 12 are all '0'.
> 
> I'm reluctant to do that because there are already problems with the mangled names getting too long.

What if you used characters other than A-F to compress the zeros?

G = 2 * '0'
H = 3 * '0'
...
Z = 21 * '0'
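
A rough sketch of such an encoder (names are mine; the run is capped at 21
since 'Z' is the last letter available):

  // Compress runs of '0' in a mangled hex string: 2 zeros -> 'G', 21 -> 'Z'.
  char[] compressZeros(char[] hex)
  {
      char[] result;
      size_t i = 0;
      while (i < hex.length)
      {
          size_t run = 0;
          while (i + run < hex.length && hex[i + run] == '0' && run < 21)
              run++;
          if (run >= 2)
          {
              result ~= cast(char)('G' + run - 2); // letter encodes the run
              i += run;
          }
          else
          {
              result ~= hex[i];
              i++;
          }
      }
      return result;
  }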


xs0
September 25, 2006
Don Clugston wrote:
> Walter Bright wrote:
>>
>> I was disappointed in the AMD-64 because it didn't do 128 bit floats; in fact, it relegated 80 bit floats to a backwater in the instruction set. Few computer people seem to understand the value of high precision floating point.
> 
> Intel seems to be better than AMD in this regard. Intel added an 82 bit floating point type to the Itanium so that it could do 80-bit hypot() without overflow (in fact, Itanium seems to have by far the best floating point support that I've seen); AMD's 3DNow! didn't even support subnormals, infinity, or NaN.

I think AMD simply set its sights on the game industry as the battleground, which seems to be supported by the presence of forums on LAN parties and system modding (http://forums.amd.com/).  This stands in contrast with Intel, which has an entire set of forums for software development (http://softwareforums.intel.com/).  I decided to ask whether AMD has another venue for software development discussion.  I have no idea whether science-minded software companies or developers tell AMD that they'd like improved floating-point support, but a bit more communication couldn't hurt.


Sean