Thread overview
Bug of the sqrt() compiled by DMD
Jun 19, 2022
Salih Dincer
Jun 19, 2022
Johan
Jun 19, 2022
Salih Dincer
Jun 19, 2022
kdevel
Jun 20, 2022
Salih Dincer
Jun 19, 2022
jmh530
June 19, 2022

Hi,

Actually this error is not directly related to sqrt() but probably related to cast() and DMD Compiler v2.0.99. Because this no problem occurs in LDC.

Lets start...

n is an integer but n² doesn't have to be... sqrt() does not accept whole number! So the type of n² is correct.

When the root line is hidden, the 2-stage lines below it will toggle and there is no problem anymore. The problem appears exactly at 94906267 and odd numbers.

import std.math, std.stdio;
alias integer = ulong;

void main()
{
  integer n = 94_906_260;
  for( ; n <= 94_906_270; ++n)
  {
    integer root;
    double root2, n2 = n * n;

    root = cast(integer) sqrt(n2);/*
    root2 = sqrt(n2);
    root = cast(integer) root2;
    //**** no problem ****/

    root.write;
    if(root != n)
    {
      n.writefln!" != %s";
    }
    else n.writefln!" == %s";
  }
}

DMD Compiler v2.0.99:

>

94906260 == 94906260
94906261 == 94906261
94906262 == 94906262
94906263 == 94906263
94906264 == 94906264
94906265 == 94906265
94906266 == 94906266
94906266 != 94906267
94906268 == 94906268
94906268 != 94906269
94906270 == 94906270

    //...

    //root = cast(integer) sqrt(n2);/*
    root2 = sqrt(n2);
    root = cast(integer) root2;
    //**** no problem ****/

    //...

2-stage casting with DMD:

>

94906260 == 94906260
94906261 == 94906261
94906262 == 94906262
94906263 == 94906263
94906264 == 94906264
94906265 == 94906265
94906266 == 94906266
94906266 == 94906267
94906268 == 94906268
94906268 == 94906269
94906270 == 94906270

SDB@79

June 19, 2022

On Sunday, 19 June 2022 at 08:36:33 UTC, Salih Dincer wrote:

>
import std.math, std.stdio;
alias integer = ulong;

void main()
{
  integer n = 94_906_260;
  for( ; n <= 94_906_270; ++n)
  {
    integer root;
    double root2, n2 = n * n;

For some of the values of n that you are using, a double cannot represent n*n.

94,906,260 * 94,906,260 = 9,007,198,187,187,600 = 0x1F FFFF C05E 6D90

94,906,267 * 94,906,267 = 9,007,199,515,875,289 = 0x20 0000 0F90 97D9

Both results are 56 bit integers.
An IEEE 64-bit double (D's double) has 52 bits of mantissa. It can represent the first number without loss of precision because of the 4 zero bits at the end (56 - 4 = 52). It can not represent the second number: instead of 0x20 0000 0F90 97D9, n2 will store the value 0x20 0000 0F90 97D0, and you calculate the sqrt of that.

LDC creates instructions that do the sqrt in double domain, but DMD converts the double to a real (80bit floating point on x86) and does the sqrt in the real domain. Probably that is why it seems to work correctly for LDC and not for DMD (representation differences of the sqrt result for double and real, plus rounding when going back to ulong).

-Johan

June 19, 2022

On Sunday, 19 June 2022 at 10:41:32 UTC, Johan wrote:

>

An IEEE 64-bit double (D's double) has 52 bits of mantissa. It can represent the first number without loss of precision because of the 4 zero bits at the end (56 - 4 = 52). It can not represent the second number: instead of 0x20 0000 0F90 97D9, n2 will store the value 0x20 0000 0F90 97D0, and you calculate the sqrt of that.

I think the problem is not with bits. Because the size of the number is only twice uint.max. When integer type is changed qua long, the problem is fixed. Please try and see with your own eyes:

alias integer = long;

SDB@79

June 19, 2022

On Sunday, 19 June 2022 at 20:26:27 UTC, Salih Dincer wrote:

>

On Sunday, 19 June 2022 at 10:41:32 UTC, Johan wrote:

>

An IEEE 64-bit double (D's double) has 52 bits of mantissa. It can represent the first number without loss of precision because of the 4 zero bits at the end (56 - 4 = 52). It can not represent the second number: instead of 0x20 0000 0F90 97D9, n2 will store the value 0x20 0000 0F90 97D0, and you calculate the sqrt of that.

I think the problem is not with bits.

It is not debatable that 94906267² = 9007199515875289 has no exact representation in the double type:

import std.stdio;

union Double {
   double d;
   ulong u;
}

void main ()
{
   Double d = Double (94906267);
   d.d *= 94906267;
   for (int i = -4; i < 5; ++i) {
      Double e = d;
      e.d += i;
      writefln!"%4d %19.2f %14x" (i, e.d, e.u);
   }
}

prints

  -4 9007199515875284.00 4340000007c84bea
  -3 9007199515875284.00 4340000007c84bea
  -2 9007199515875286.00 4340000007c84beb
  -1 9007199515875288.00 4340000007c84bec
   0 9007199515875288.00 4340000007c84bec
   1 9007199515875288.00 4340000007c84bec
   2 9007199515875290.00 4340000007c84bed
   3 9007199515875292.00 4340000007c84bee
   4 9007199515875292.00 4340000007c84bee
>

Because the size of the number is only twice uint.max.

The mantissa of double has only 53 bits which is less than twice the number of bits in an (u)int. 94906267 occupies 27 bits:

                               |                          |
           94906267 0000000000000000000000000000000000000101101010000010011110011011
   9007199515875289 0000000000100000000000000000000000001111100100001001011111011001
>

When integer type is changed qua long, the problem is fixed.

I would say that the problem is revealed. IMHO the result should not depend on wether the long type is signed or not. dmd seems to generate different code for the signed/unsigned cases:

import std.math;
ulong r1 (double d)
{
   return cast (ulong) sqrt (d);
}

long r2 (double d)
{
   return cast (long) sqrt (d);
}

dmd -O -c / objdump -Cdarw shows what goes on:

Disassembly of section .text.ulong sub.r1(double):

0000000000000000 <ulong sub.r1(double)>:
   0:   55                      push   %rbp
   1:   48 8b ec                mov    %rsp,%rbp
   4:   48 83 ec 10             sub    $0x10,%rsp
   8:   f2 0f 11 45 f0          movsd  %xmm0,-0x10(%rbp)
   d:   dd 45 f0                fldl   -0x10(%rbp)
  10:   d9 fa                   fsqrt
  12:   b9 00 00 00 80          mov    $0x80000000,%ecx
  17:   c7 45 f0 00 00 00 00    movl   $0x0,-0x10(%rbp)
  1e:   89 4d f4                mov    %ecx,-0xc(%rbp)
  21:   c7 45 f8 3e 40 bf 0f    movl   $0xfbf403e,-0x8(%rbp)
  28:   db 6d f0                fldt   -0x10(%rbp)
  2b:   d8 d9                   fcomp  %st(1)
  2d:   df e0                   fnstsw %ax
  2f:   d9 7d fc                fnstcw -0x4(%rbp)
  32:   d9 6d fa                fldcw  -0x6(%rbp)
  35:   f6 c4 01                test   $0x1,%ah
  38:   74 18                   je     52 <ulong sub.r1(double)+0x52>
  3a:   db 6d f0                fldt   -0x10(%rbp)
  3d:   de e9                   fsubrp %st,%st(1)
  3f:   df 7d f0                fistpll -0x10(%rbp)
  42:   48 8b 45 f0             mov    -0x10(%rbp),%rax
  46:   48 c1 e1 20             shl    $0x20,%rcx
  4a:   48 03 c1                add    %rcx,%rax
  4d:   d9 6d fc                fldcw  -0x4(%rbp)
  50:   eb 0a                   jmp    5c <ulong sub.r1(double)+0x5c>
  52:   df 7d f0                fistpll -0x10(%rbp)
  55:   48 8b 45 f0             mov    -0x10(%rbp),%rax
  59:   d9 6d fc                fldcw  -0x4(%rbp)
  5c:   c9                      leaveq
  5d:   c3                      retq
        ...

Disassembly of section .text.long sub.r2(double):

0000000000000000 <long sub.r2(double)>:
   0:   55                      push   %rbp
   1:   48 8b ec                mov    %rsp,%rbp
   4:   48 83 ec 10             sub    $0x10,%rsp
   8:   f2 0f 11 45 f0          movsd  %xmm0,-0x10(%rbp)
   d:   dd 45 f0                fldl   -0x10(%rbp)
  10:   d9 fa                   fsqrt
  12:   dd 5d f0                fstpl  -0x10(%rbp)
  15:   f2 0f 10 45 f0          movsd  -0x10(%rbp),%xmm0
  1a:   f2 48 0f 2c c0          cvttsd2si %xmm0,%rax
  1f:   c9                      leaveq
  20:   c3                      retq
  21:   00 00                   add    %al,(%rax)
        ...

The unsigned part

June 19, 2022

On Sunday, 19 June 2022 at 08:36:33 UTC, Salih Dincer wrote:

>

[snip]

Not sure why the discrepancy between ulongs and longs since the maximum ulong and long should be below n^2, but the issue with doubles is exactly as others described: 94,906,267^2 can't be represented exactly as a double.

import std.stdio;

void main()
{
    ulong n_ulong = 94_906_267;
    double n_ulong2 = n_ulong * n_ulong;
    long n_ulong3 = n_ulong * n_ulong;
    ulong n_ulong4 = n_ulong * n_ulong;
    writefln!("%f")(n_ulong2);
    writefln!("%f")(n_ulong3);
    writefln!("%f")(n_ulong4);

    long n_long = 94_906_267;
    double n_long2 = n_long * n_long;
    long n_long3 = n_long * n_long;
    ulong n_long4 = n_long * n_long;
    writefln!("%f")(n_long2);
    writefln!("%f")(n_long3);
    writefln!("%f")(n_long4);
}
June 20, 2022

On Sunday, 19 June 2022 at 22:19:18 UTC, kdevel wrote:

>

On Sunday, 19 June 2022 at 20:26:27 UTC, Salih Dincer wrote:

>

Because the size of the number is only twice uint.max.

The mantissa of double has only 53 bits which is less than twice the number of bits in an (u)int. 94906267 occupies 27 bits:

                               |                          |
           94906267 0000000000000000000000000000000000000101101010000010011110011011
   9007199515875289 0000000000100000000000000000000000001111100100001001011111011001
>

When integer type is changed qua long, the problem is fixed.

Good job, thank you very much. However, I am still not completely convinced. I need to do some work on God Bolt.

SDB@79