July 04, 2013
On 07/04/2013 06:16 PM, Ali Çehreli wrote:
> I am commenting without fully understanding the context: Both size_t and double are 64 bit types on a 64-bit system. double.mant_dig being 53, converting from size_t to double loses information for many values.

Oh, bugger.  You mean that because it needs space to store the exponent, it has a reduced number of significant figures compared to size_t?

> import std.stdio;
> import std.conv;
> 
> void main()
> {
>     size_t i = 0x8000_0000_0000_0001;
>     double d0 = i.to!double;
>     double d1 = cast(double)i;
> 
>     writefln("%s", i);
>     writefln("%f", d0);
>     writefln("%f", d1);
> }
> 
> Prints
> 
> 9223372036854775809
> 9223372036854775808.000000
> 9223372036854775808.000000

Don't understand why the error arises here, as the number of significant figures is the same for all the numbers ... ?
July 04, 2013
On 07/04/2013 09:43 AM, Joseph Rushton Wakeling wrote:

> On 07/04/2013 06:16 PM, Ali Çehreli wrote:
>> I am commenting without fully understanding the context: Both size_t and double
>> are 64 bit types on a 64-bit system. double.mant_dig being 53, converting from
>> size_t to double loses information for many values.
>
> Oh, bugger.  You mean that because it needs space to store the exponent, it has
> a reduced number of significant figures compared to size_t?

Exactly.

>> import std.stdio;
>> import std.conv;
>>
>> void main()
>> {
>>      size_t i = 0x8000_0000_0000_0001;
>>      double d0 = i.to!double;
>>      double d1 = cast(double)i;
>>
>>      writefln("%s", i);
>>      writefln("%f", d0);
>>      writefln("%f", d1);
>> }
>>
>> Prints
>>
>> 9223372036854775809
>> 9223372036854775808.000000
>> 9223372036854775808.000000
>
> Don't understand why the error arises here, as the number of significant figures
> is the same for all the numbers ... ?

It is about the number of significant bits, not decimal digits. The size_t value above has bits 63 and 0 set, so it spans 64 bits. double's 53-bit mantissa has no room for bit 0, so that bit is rounded away.

In contrast, the following size_t value can be represented exactly in a double because its representation uses only 53 bits:

import std.stdio;
import std.conv;

void main()
{
    size_t i = 0x001f_ffff_ffff_ffff;
    double d = i.to!double;

    writefln("%s", i);
    writefln("%f", d);
}

Prints:

9007199254740991
9007199254740991.000000

However, set any one of the currently-unset upper 11 bits (64 - 53 == 11), and you get an inexact conversion.
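
As a rough sketch of that rule (fitsInDouble is a made-up helper name, not something in Phobos, and I am assuming a 64-bit target where size_t is 64 bits wide), you can count the span from the highest set bit to the lowest set bit and compare it against double.mant_dig:

```d
import core.bitop : bsf, bsr;
import std.stdio;

// Hypothetical helper (not from Phobos): true when v converts to double
// without rounding, judged purely from the positions of its set bits.
bool fitsInDouble(size_t v)
{
    if (v == 0)
        return true; // bsr/bsf are undefined for 0, and 0 is exact anyway
    // Number of bits from the highest set bit down to the lowest set bit.
    immutable span = bsr(v) - bsf(v) + 1;
    return span <= double.mant_dig; // mant_dig is 53 for double
}

void main()
{
    writeln(fitsInDouble(0x001f_ffff_ffff_ffff)); // bits 52..0: span of 53
    writeln(fitsInDouble(0x003f_ffff_ffff_ffff)); // bits 53..0: span of 54
    writeln(fitsInDouble(0x8000_0000_0000_0001)); // bits 63 and 0: span of 64
    writeln(fitsInDouble(0x8000_0000_0000_0000)); // bit 63 only: span of 1
}
```

Note that a large value is not necessarily inexact: what matters is how far apart its highest and lowest set bits are, not its magnitude.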

Ali

July 04, 2013
On Thu, Jul 04, 2013 at 06:43:16PM +0200, Joseph Rushton Wakeling wrote:
> On 07/04/2013 06:16 PM, Ali Çehreli wrote:
> > I am commenting without fully understanding the context: Both size_t and double are 64 bit types on a 64-bit system. double.mant_dig being 53, converting from size_t to double loses information for many values.
> 
> Oh, bugger.  You mean that because it needs space to store the exponent, it has a reduced number of significant figures compared to size_t?
[...]

Yes. See:

http://en.wikipedia.org/wiki/Double-precision_floating-point_format

Of the 64 bits, only 53 are available for the mantissa (well, actually 52 are stored, but the leading bit is always 1 except when the exponent field is zero, so it's not stored -- you get it for free). Of the remaining 12 bits, 11 are reserved for storing the exponent, and the last for storing the sign.

So the maximum precision a double can have is 53 bits. If you have a value that requires more than that, the representation will be inexact.
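
If you want to look at the layout yourself, here is a minimal sketch that pulls the three fields apart. DoubleFields and splitDouble are names I made up for illustration, and the code assumes the usual IEEE 754 binary64 representation described on that Wikipedia page:

```d
import std.stdio;

// Hypothetical helper type for this sketch.
struct DoubleFields
{
    ulong sign;      // 1 bit
    ulong exponent;  // 11 bits, biased by 1023
    ulong fraction;  // 52 stored bits; the leading 1 is implicit
}

DoubleFields splitDouble(double d)
{
    // Reinterpret the 64 bits of the double as an integer.
    immutable ulong bits = *cast(ulong*) &d;
    return DoubleFields(bits >> 63,
                        (bits >> 52) & 0x7FF,
                        bits & 0x000F_FFFF_FFFF_FFFF);
}

void main()
{
    // 2^63, the value the earlier example rounded to.
    auto f = splitDouble(9223372036854775808.0);
    writefln("sign=%s exponent=%s (unbiased %s) fraction=0x%013x",
             f.sign, f.exponent, cast(long) f.exponent - 1023, f.fraction);
}
```

For 2^63 the fraction field is all zeros: the only set bit is the implicit leading one, and the exponent field carries the 63.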


T

-- 
Programming is not just an act of telling a computer what to do: it is also an act of telling other programmers what you wished the computer to do. Both are important, and the latter deserves care. -- Andrew Morton
July 04, 2013
On Thursday, 4 July 2013 at 16:16:08 UTC, Ali Çehreli wrote:
> On 07/04/2013 03:15 AM, Joseph Rushton Wakeling wrote:
>
> > The cast should be safe, as it's a size_t to a double.
>
> I am commenting without fully understanding the context: Both size_t and double are 64 bit types on a 64-bit system. double.mant_dig being 53, converting from size_t to double loses information for many values.

It's not about losing information, it's about being out of range.

For example, to!int(2.0^^50) will throw an exception, because 2.0^^50 is larger than int.max.
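
A small sketch of what I mean, using int as the target type since 2.0^^50 is well above int.max. throwsOnToInt is a made-up helper name, and the last part assumes a 64-bit size_t as in the example upthread:

```d
import std.conv : to, ConvOverflowException;
import std.math; // harmless on new compilers; older ones want it for ^^
import std.stdio;

// Hypothetical helper: reports whether to!int rejects x as out of range.
bool throwsOnToInt(double x)
{
    try
    {
        cast(void) to!int(x);
        return false;
    }
    catch (ConvOverflowException)
    {
        return true;
    }
}

void main()
{
    double big = 2.0 ^^ 50;
    writeln(throwsOnToInt(big));    // out of int's range
    writeln(throwsOnToInt(1000.0)); // comfortably in range

    // In range for double, so to!double does not throw,
    // even though the low bit is rounded away.
    size_t i = 0x8000_0000_0000_0001;
    double d = i.to!double;
    writefln("%f", d);
}
```

So to checks that the value fits in the target's range; it does not check that every bit survives the trip.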