Question/request/bug(?) re. floating-point in dmd (page 4)

On Wednesday, 6 November 2013 at 06:28:59 UTC, Walter Bright wrote:
> On 11/5/2013 8:19 AM, Don wrote:
>> On Wednesday, 30 October 2013 at 18:28:14 UTC, Walter Bright wrote:
>>> Not exactly what I meant - I mean the algorithm should be designed so that
>>> extra precision does not break it.
>>
>> Unfortunately, that's considerably more difficult than writing an algorithm for
>> a known precision.
>> And it is impossible in any case where you need full machine precision (which
>> applies to practically all library code, and most of my work).
>
> I have a hard time buying this. For example, when I wrote matrix inversion code, more precision always gave more accurate results.

With matrix inversion you're normally far from full machine precision. If half the bits are correct, you're doing very well.

The situations I'm referring to are the ones where the result is correctly rounded when no extra precision is present. If you then go and add extra precision to some or all of the intermediate results, the results will no longer be correctly rounded.

eg, the simplest case is rounding to integer:
3.499999999999999999999999999
must round to 3. If you round it twice, you'll get 4.

But we can test this. I predict that adding some extra bits to the internal calculations in CTFE (to make it have eg 128 bit intermediate values instead of 80), will cause Phobos math unit tests to break. Perhaps this can already be done trivially in GCC.

>> A compiler intrinsic, which generates no code (simply inserting a barrier for
>> the optimiser) sounds like the correct approach.
>>
>> Coming up with a name for this operation is difficult.
>
> float toFloatPrecision(real arg) ?

Meh. That's wordy and looks like a rounding operation. I'm interested in the operation float -> float and double -> double (and perhaps real->real), where no conversion is happening, and on most architectures it will be a no-op.

It should be a name that indicates that it's not generating any code, you're just forbidding the compiler from doing funky weird stuff.

And for generic code, the name should be the same for float, double, and real.

Perhaps an attribute rather than a function call.

double x;
double y = x.strictfloat;
double y = x.strictprecision;

ie, (expr).strictfloat would return expr, discarding any extra precision. That's the best I've come up with so far.
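
The rounding-to-integer case in the post above can be reproduced in a few lines. This is only a minimal sketch using Python's decimal module (Python is used because the D snippets in this thread are proposed syntax that does not compile). The one-decimal-place intermediate is an assumption made for illustration; it stands in for any "extra precision" step that sits between the exact value and the final rounding.

```python
from decimal import Decimal, ROUND_HALF_EVEN

x = Decimal("3.499999999999999999999999999")

# Round once, directly to an integer: the correctly rounded answer.
once = x.quantize(Decimal("1"), rounding=ROUND_HALF_EVEN)

# Round twice: first to an intermediate precision (one decimal place,
# chosen only for illustration), then to an integer.
intermediate = x.quantize(Decimal("0.1"), rounding=ROUND_HALF_EVEN)
twice = intermediate.quantize(Decimal("1"), rounding=ROUND_HALF_EVEN)

print(once)          # 3
print(intermediate)  # 3.5
print(twice)         # 4
```

The first rounding lands exactly on a tie (3.5), and the second rounding has no way to know the original value was below it.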

November 06, 2013

Re: Question/request/bug(?) re. floating-point in dmd

Posted by Iain Buclaw
in reply to Don

On 6 November 2013 09:09, Don <x@nospam.com> wrote:

> On Wednesday, 6 November 2013 at 06:28:59 UTC, Walter Bright wrote:
>
>> On 11/5/2013 8:19 AM, Don wrote:
>>
>>> On Wednesday, 30 October 2013 at 18:28:14 UTC, Walter Bright wrote:
>>>
>>>> Not exactly what I meant - I mean the algorithm should be designed so that
>>>> extra precision does not break it.
>>>>
>>>
>>> Unfortunately, that's considerably more difficult than writing an algorithm for
>>> a known precision.
>>> And it is impossible in any case where you need full machine precision (which
>>> applies to practically all library code, and most of my work).
>>>
>>
>> I have a hard time buying this. For example, when I wrote matrix inversion code, more precision always gave more accurate results.
>>
>
> With matrix inversion you're normally far from full machine precision. If half the bits are correct, you're doing very well.
>
> The situations I'm referring to, are the ones where the result is correctly rounded, when no extra precision is present. If you then go and add extra precision to some or all of the intermediate results, the results will no longer be correctly rounded.
>
> eg, the simplest case is rounding to integer:
> 3.499999999999999999999999999
> must round to 3. If you round it twice, you'll get 4.
>
> But we can test this. I predict that adding some extra bits to the internal calculations in CTFE (to make it have eg 128 bit intermediate values instead of 80), will cause Phobos math unit tests to break. Perhaps this can already be done trivially in GCC.
>
>
>
The only tests that break in GDC because GCC operates on 160 bit intermediate values are the 80-bit specific tests  (the unittest in std.math with the comment "Note that these are only valid for 80-bit reals").

Saying that though, GCC isn't exactly IEEE 754 compliant either...
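
For readers who want a binary version of the double-rounding concern, here is a hedged sketch using exact rationals. The helper `round_to_bits` is hypothetical (written for this illustration, not taken from Phobos, GDC, or GCC), and the widths 24 and 53 are the float and double significand sizes, not the 160-bit intermediates mentioned above. It shows that a wider intermediate changes the final result only when the intermediate rounding lands exactly on a tie, which may be one reason so few tests are affected in practice.

```python
from fractions import Fraction

def round_to_bits(x: Fraction, bits: int) -> Fraction:
    """Round a positive rational to `bits` significant binary digits, ties to even.

    Hypothetical helper for illustration only.
    """
    # Find e such that 2**e <= x < 2**(e + 1).
    e = x.numerator.bit_length() - x.denominator.bit_length()
    if Fraction(2) ** e > x:
        e -= 1
    # Spacing between representable values at this magnitude.
    ulp = Fraction(2) ** (e - bits + 1)
    # round() on a Fraction rounds half to even.
    return round(x / ulp) * ulp

# Exact value slightly above the midpoint between two adjacent floats.
x = 1 + Fraction(1, 2**24) + Fraction(1, 2**60)

direct = round_to_bits(x, 24)                         # exact -> float
via_double = round_to_bits(round_to_bits(x, 53), 24)  # exact -> double -> float

print(direct == 1 + Fraction(1, 2**23))  # True: correctly rounded up
print(via_double == 1)                   # True: the tie was broken the other way
```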





>>> A compiler intrinsic, which generates no code (simply inserting a barrier for
>>> the optimiser) sounds like the correct approach.
>>>
>>> Coming up with a name for this operation is difficult.
>>>
>>
>> float toFloatPrecision(real arg) ?
>>
>
> Meh. That's wordy and looks like a rounding operation. I'm interested in the operation float -> float and double -> double (and perhaps real->real), where no conversion is happening, and on most architectures it will be a no-op.
>
> It should be a name that indicates that it's not generating any code, you're just forbidding the compiler from doing funky weird stuff.
>
> And for generic code, the name should be the same for float, double, and real.
>
> Perhaps an attribute rather than a function call.
>
> double x;
> double y = x.strictfloat;
> double y = x.strictprecision;
>
> ie, (expr).strictfloat  would return expr, discarding any extra precision. That's the best I've come up with so far.
>

double y = cast(float) x;  ?  :o)


-- 
Iain Buclaw

*(p < e ? p++ : p) = (c & 0x0f) + '0';

"Don" <x@nospam.com> writes:
> On Wednesday, 6 November 2013 at 06:28:59 UTC, Walter Bright wrote:
> Perhaps an attribute rather than a function call.
>
> double x;
> double y = x.strictfloat;
> double y = x.strictprecision;
>
> ie, (expr).strictfloat would return expr, discarding any extra precision. That's the best I've come up with so far.

What about something like the following?

double x;
double y;
with (strictprecision) {
    y = x;
}

The idea being that you can create a scope within which operations are executed with no extra precision.

Jerry

On 11/7/2013 8:55 AM, Jerry wrote:
> What about something like the following?
>
> double x;
> double y;
> with (strictprecision) {
> y = x;
> }

That has immediate problems with things like function calls that might or might not be inlined.

On Thursday, 7 November 2013 at 20:02:05 UTC, Walter Bright wrote:
> On 11/7/2013 8:55 AM, Jerry wrote:
>> What about something like the following?
>>
>> double x;
>> double y;
>> with (strictprecision) {
>> y = x;
>> }
>
> That has immediate problems with things like function calls that might or might not be inlined.

it could apply only to operations on fundamental types within the region and guarantee nothing for any called code. It could even guarantee to not apply to any called code even if inlined. I think in practice this wouldn't be particularly inconvenient.

On 11/7/2013 12:09 PM, John Colvin wrote:
> On Thursday, 7 November 2013 at 20:02:05 UTC, Walter Bright wrote:
>> On 11/7/2013 8:55 AM, Jerry wrote:
>>> What about something like the following?
>>>
>>> double x;
>>> double y;
>>> with (strictprecision) {
>>> y = x;
>>> }
>>
>> That has immediate problems with things like function calls that might or
>> might not be inlined.
>
> it could apply only to operations on fundamental types within the region and
> guarantee nothing for any called code. It could even guarantee to not apply to
> any called code even if inlined. I think in practice this wouldn't be
> particularly inconvenient.

I think it would be very inconvenient, as it will make problems for use of generic code. Also, it is too blunt - it'll cover a whole set of code, rather than just the one spot where it would matter.
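
As a rough analogy for the scoped proposal and the question of called code, Python's decimal module has a scoped precision setting. This is a sketch of the general idea only, not a model of how D or any D compiler behaves; the function name `third` and the precision value 6 are arbitrary choices for illustration. In this library the scope is dynamic, so it also affects code called from inside the block, which is one of the two possible answers discussed above.

```python
from decimal import Decimal, localcontext

def third() -> Decimal:
    # Uses whatever decimal context is active when it is called.
    return Decimal(1) / Decimal(3)

with localcontext() as ctx:
    ctx.prec = 6                        # precision applies to the whole block
    inside = Decimal(1) / Decimal(3)    # computed with 6 significant digits
    called = third()                    # the callee is affected as well

outside = third()                       # back to the surrounding precision

print(inside)             # 0.333333
print(called == inside)   # True
print(outside == inside)  # False: the surrounding context keeps more digits
```

The block-level setting covers every operation inside it, including ones where the precision did not matter, which is the "too blunt" objection in concrete form.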

On 11/6/2013 7:07 AM, Iain Buclaw wrote:
> double y = cast(float) x; ? :o)

I don't like overlaying a new meaning onto the cast operation. For example, if one was using it for type coercion, that is different from wanting precision reduction. There'd be no way to separate the two effects if one desires only one.

An intrinsic function solves the issue neatly and cleanly.
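
To picture what a dedicated "discard extra precision" function looks like when it is kept separate from type conversion, here is a small sketch, again in Python's decimal module. The names `strict_precision` and `FLOAT_LIKE` and the 7-digit precision are hypothetical, chosen for this illustration; nothing here is a D or Phobos API. The function returns a value of the same type, leaves it unchanged when it already fits, and rounds it once when it carries extra digits.

```python
from decimal import Context, Decimal

# Hypothetical target precision, standing in for "the declared type's precision".
FLOAT_LIKE = Context(prec=7)

def strict_precision(value: Decimal, ctx: Context = FLOAT_LIKE) -> Decimal:
    """Return `value` rounded to the context precision; same type in and out."""
    # Context.plus applies the context's precision and rounding to the operand.
    return ctx.plus(value)

already_fits = Decimal("1.234567")
extra_digits = Decimal("1.23456789")

print(strict_precision(already_fits))  # 1.234567 (unchanged, effectively a no-op)
print(strict_precision(extra_digits))  # 1.234568 (extra digits discarded by rounding)
```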

On 11/6/2013 1:09 AM, Don wrote:
> But we can test this. I predict that adding some extra bits to the internal
> calculations in CTFE (to make it have eg 128 bit intermediate values instead of
> 80), will cause Phobos math unit tests to break.

Bring it on! Challenge accepted!
