about float & double

Jan 18, 2011

spir

Jan 18, 2011

Ali Çehreli

Jan 19, 2011

bearophile

Jan 19, 2011

dennis luehring

Hello,

Is there somewhere a (clear) doc about float/double internals?

Some more particular questions:

What is the internal bit layout? (mantissa, sign, exponent)

Can I assume the "integral range" is [-2^(m-1) .. 2^(m-1)-1], where m is the number of mantissa bits?

What are the values used to represent thingies like NaNs, inf, error? (Or are they not represented as values?)

How would you get a float's integral and fractional parts without performing arithmetic? (I am thinking of bit ops, indeed)

Denis
_________________
vita es estrany
spir.wikidot.com

January 18, 2011

Re: about float & double

Posted by Ali Çehreli
in reply to spir


spir wrote:

> Is there somewhere a (clear) doc about float/double internals?

A very good read is:

  http://digitalmars.com/d/2.0/d-floating-point.html

> Some more particular questions:
>
> What is the internal bit layout? (mantissa, sign, exponent)

IEEE floating point format. This page has links to different representations:

  http://en.wikipedia.org/wiki/Ieee_floating_point

Specifically:

  http://en.wikipedia.org/wiki/Single_precision_floating-point_format
  http://en.wikipedia.org/wiki/Double_precision_floating-point_format
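As a rough illustration of the layout those pages describe, here is a minimal sketch that splits a double into its three fields with shifts and masks. It assumes the IEEE 754 binary64 layout (1 sign bit, 11 exponent bits, 52 stored mantissa bits); `DoubleParts` and `decompose` are hypothetical names made up for this example, not Phobos functions.

```d
import std.stdio;

// Hypothetical helper type for this example; not part of Phobos.
struct DoubleParts
{
    bool sign;       // true for negative values
    uint exponent;   // 11-bit biased exponent (bias 1023)
    ulong mantissa;  // the 52 explicitly stored fraction bits
}

// Assumes the IEEE 754 binary64 layout for double.
DoubleParts decompose(double value)
{
    // Reinterpret the 8 bytes of the double as one 64-bit integer.
    // Working on the integer sidesteps the byte-order question.
    ulong bits = *cast(ulong*)&value;

    DoubleParts parts;
    parts.sign     = (bits >> 63) != 0;
    parts.exponent = cast(uint)((bits >> 52) & 0x7FF);
    parts.mantissa = bits & 0x000F_FFFF_FFFF_FFFFUL;
    return parts;
}

void main()
{
    // -6.5 is -1.625 * 2^2, so the biased exponent is 1023 + 2.
    auto p = decompose(-6.5);
    writefln("sign=%s exponent=%s mantissa=%013x",
             p.sign, p.exponent, p.mantissa);
}
```

The same approach should carry over to float with a 32-bit integer, 8 exponent bits and 23 stored mantissa bits.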

> What are the values used to represent thingies like NaNs, inf, error?
> (Or are they not represented as values?)

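They are described in the documents linked above. In short, they are ordinary bit patterns: infinity has an all-ones exponent field with a zero mantissa, and NaN has an all-ones exponent field with a non-zero mantissa.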
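A small sketch of how those special values can be recognized from the bits alone, again assuming the IEEE 754 binary64 layout. `bitsOf`, `looksLikeInfinity` and `looksLikeNaN` are hypothetical names for this example; in real code the existing `isNaN` and `isInfinity` from std.math are the better choice.

```d
import std.stdio;

// Hypothetical helper: the raw 64-bit pattern of a double.
ulong bitsOf(double value)
{
    return *cast(ulong*)&value;
}

// Hypothetical helper: exponent field all ones, mantissa field zero.
// The sign bit is masked off, so both +inf and -inf match.
bool looksLikeInfinity(double value)
{
    return (bitsOf(value) & 0x7FFF_FFFF_FFFF_FFFFUL) == 0x7FF0_0000_0000_0000UL;
}

// Hypothetical helper: exponent field all ones, mantissa field non-zero.
bool looksLikeNaN(double value)
{
    ulong bits = bitsOf(value);
    return ((bits >> 52) & 0x7FF) == 0x7FF
        && (bits & 0x000F_FFFF_FFFF_FFFFUL) != 0;
}

void main()
{
    writefln("+inf: %016x", bitsOf(double.infinity));
    writefln("-inf: %016x", bitsOf(-double.infinity));
    writefln("nan : %016x", bitsOf(double.nan));
    writeln(looksLikeInfinity(-double.infinity), " ", looksLikeNaN(double.nan));
}
```

Many different mantissa values encode a NaN, so the exact pattern printed for `double.nan` should not be relied upon.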

> How would you get a float's integral and fractional parts without
> performing arithmetic? (I am thinking of bit ops, indeed)

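Here is a function with endianness "issues" that I have used with different types. It prints the bytes in memory order, so on a little-endian machine the least significant byte comes first: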

import std.stdio;

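// Prints the bytes of any variable in memory order.
void display_bytes(T)(ref T variable)
{
    // Mutable pointer to read-only bytes, starting at the variable's first byte.
    const(ubyte)* begin = cast(const(ubyte)*)&variable;

    writefln("type          : %s", T.stringof);
    writefln("value         : %s", variable);
    writefln("address       : %s", begin);
    writef(  "representation: ");

    // Walk the T.sizeof bytes from the lowest address upward.
    foreach (p; begin .. begin + T.sizeof) {
        writef("%02x ", *p);
    }

    writeln();
    writeln();
}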

void main()
{
    auto d_nan = double.nan;
    auto d_inf = double.infinity;
    display_bytes(d_nan);
    display_bytes(d_inf);
}

Ali

spir:

> Is there somewhere a (clear) doc about float/double internals?
> Some more particular questions:
>
> What is the internal bit layout? (mantissa, sign, exponent)

There is the real type too (>= 10 bytes).

Bye,
bearophile
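To compare the three types on a given machine, a short sketch using the built-in `.sizeof` and `.mant_dig` properties may help. The values for real are platform dependent, so the example only prints them.

```d
import std.stdio;

void main()
{
    // mant_dig counts significand bits, including the implicit leading 1
    // that float and double do not store.
    writefln("float : %s bytes, %s significand bits", float.sizeof, float.mant_dig);
    writefln("double: %s bytes, %s significand bits", double.sizeof, double.mant_dig);

    // real maps to the largest hardware floating point type, so both
    // numbers vary between platforms.
    writefln("real  : %s bytes, %s significand bits", real.sizeof, real.mant_dig);
}
```

This also bears on the "integral range" question: with p = mant_dig, every integer from -2^p to 2^p is exactly representable, which gives 2^24 for float and 2^53 for double.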

> Is there somewhere a (clear) doc about float/double internals?
> Some more particular questions:
>
> What is the internal bit layout? (mantissa, sign, exponent)
>
> Can I assume the "integral range" is [-2^(m-1) .. 2^(m-1)-1], where m is
> the number of mantissa bits?
>
> What are the values used to represent thingies like NaNs, inf, error?
> (Or are they not represented as values?)
>
> How would you get a float's integral and fractional parts without
> performing arithmetic? (I am thinking of bit ops, indeed)

Wikipedia is very informative, and the Phobos math implementation is a very good source for the bit ops stuff:

  http://www.dsource.org/projects/phobos/browser/trunk/phobos/std/math.d
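For the integral-part question, one possible bit-level approach is to clear the mantissa bits that lie below the binary point. The sketch below assumes the IEEE 754 binary64 layout; `integralPart` is a hypothetical name for this example and is not how std.math is claimed to do it.

```d
import std.stdio;

// Hypothetical helper: returns value rounded toward zero, using only
// integer operations on the bit pattern. Assumes IEEE 754 binary64.
double integralPart(double value)
{
    ulong bits = *cast(ulong*)&value;

    // Unbiased exponent: the value is 1.mantissa * 2^exponent.
    int exponent = cast(int)((bits >> 52) & 0x7FF) - 1023;

    if (exponent < 0)
    {
        // Magnitude below 1 (including subnormals): only the sign
        // survives, which yields +0.0 or -0.0.
        bits &= 0x8000_0000_0000_0000UL;
    }
    else if (exponent < 52)
    {
        // The lowest (52 - exponent) mantissa bits are fractional: clear them.
        bits &= ~(0x000F_FFFF_FFFF_FFFFUL >> exponent);
    }
    // exponent >= 52: already integral, infinite or NaN; left unchanged.

    return *cast(double*)&bits;
}

void main()
{
    foreach (v; [6.75, -6.75, 0.25, 1e300])
    {
        // The fractional part is obtained with one subtraction here.
        writefln("%s -> integral %s, fractional %s",
                 v, integralPart(v), v - integralPart(v));
    }
}
```

Getting the fractional part without any arithmetic is also possible, but the remaining bits have to be renormalized (shifted up and the exponent adjusted), which this sketch does not attempt.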
