[Issue 6660] New: Problem with SSE registers in array ops
September 13, 2011
http://d.puremagic.com/issues/show_bug.cgi?id=6660

           Summary: Problem with SSE registers in array ops
           Product: D
           Version: D1 & D2
          Platform: Other
        OS/Version: Windows
            Status: NEW
          Severity: normal
          Priority: P2
         Component: DMD
        AssignedTo: nobody@puremagic.com
        ReportedBy: clugdbug@yahoo.com.au


--- Comment #0 from Don <clugdbug@yahoo.com.au> 2011-09-13 01:12:12 PDT ---
This program, arrayop.d,

void main()
{
    double[4] a;
    double[4] b;
    a[] = b[] + b[];
}
compiled and run repeatedly in a batch file

dmd arrayop
arrayop
dmd arrayop
arrayop
dmd arrayop
arrayop
... (I repeated the pair about 20 times)
eventually generates this error on a Sandy Bridge processor running Windows 7:

C:\sandbox\bugs>dmd arrayop
DMD v2.055 DEBUG
OPTLINK (R) for Win32  Release 8.00.12
Copyright (C) Digital Mars 1989-2010  All rights reserved.
http://www.digitalmars.com/ctg/optlink.html
OPTLINK : Error 3: Cannot Create File arrayop.exe
--- errorlevel 1
It also happens with the release build of DMD 2.055.

I think it is an SSE issue, since it only happens with arrays of floats and doubles (not reals). But I'm just guessing. Maybe it is corrupting the stack.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
September 27, 2011


Brad Roberts <braddr@puremagic.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |braddr@puremagic.com


--- Comment #1 from Brad Roberts <braddr@puremagic.com> 2011-09-26 23:50:42 PDT ---
Another data point...

In the auto tester, which builds each test with a sequence of different parameter combinations, builds used to fail every once in a while with this same error. Changing it to write to a different executable every time (I just added a counter, so it's testfoo_0.exe, testfoo_1.exe, etc.) completely fixed that problem. I have no recollection of which tests were failing. I thought it was pretty random, but it might not have been.

My assumption is/was that Windows isn't releasing the exclusive write lock on the executable file synchronously with the exiting of the application.
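
The counter workaround described above could be sketched roughly as follows. The helper names `uniqueExeName` and `dmdCommand` are hypothetical and not taken from the auto tester; the only real interface assumed is DMD's `-of<filename>` switch for naming the output file.

```d
import std.format : format;

/// Hypothetical helper: builds a per-run executable name such as
/// "testfoo_0.exe", so that no two consecutive builds write the same file.
string uniqueExeName(string base, size_t counter)
{
    return format("%s_%d.exe", base, counter);
}

/// Hypothetical helper: the dmd command line for one build, using the
/// -of switch to direct the output to the per-run name.
string[] dmdCommand(string source, string base, size_t counter)
{
    return ["dmd", "-of" ~ uniqueExeName(base, counter), source];
}
```

For example, `dmdCommand("testfoo.d", "testfoo", 1)` yields the arguments for `dmd -oftestfoo_1.exe testfoo.d`. This sidesteps a lingering lock on the previous executable; it does not explain it.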

Have you tried the same loop with an empty main?

September 27, 2011



--- Comment #2 from Don <clugdbug@yahoo.com.au> 2011-09-27 01:00:39 PDT ---
(In reply to comment #1)
> Have you tried the same loop with an empty main?

Yes, I have, and it never fails. It also never fails when 'double' is replaced by 'real'. This makes it very hard for me to blame Windows for this.

I found three tests from the test suite which failed: test15, arrayop, and hospital. I reduced arrayop down to that minimum size. Might be worth trying to reduce the others as well.

It's also possible that it could be an issue with core.cpuid.

September 27, 2011


Don <clugdbug@yahoo.com.au> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|Problem with SSE registers  |Problem with core.cpuid on
                   |in array ops                |Windows7


--- Comment #3 from Don <clugdbug@yahoo.com.au> 2011-09-27 01:07:33 PDT ---
Yup, it's core.cpuid. This one fails (intermittently):
-----
import core.cpuid;

void main()
{
    bool b = sse();
}

September 27, 2011



--- Comment #4 from Don <clugdbug@yahoo.com.au> 2011-09-27 01:42:17 PDT ---
Reduced test case is very, very strange:

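void main()
{
    __gshared uint a;
    asm {
        mov EAX, 2;     // select CPUID leaf 2
        cpuid;
        mov a, EAX;
    }
    // Low byte of EAX: how many times CPUID leaf 2 must be executed.
    uint numinfos = a & 0xFF;
    do {
    } while (--numinfos);
}

It only happens with CPUID leaf 2 (EAX = 2).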

September 27, 2011



--- Comment #5 from Don <clugdbug@yahoo.com.au> 2011-09-27 03:57:59 PDT ---
This is really incredible. I've removed all of the D code, and I can still reproduce the behaviour. If you uncomment the jz line, it won't happen. The 'int 3' line is just a breakpoint, to prove that the branch is never taken.

void main()
{
    int ctr; // also works with __gshared int ctr;
    asm {
        mov EAX, 2;
        cpuid;
        and EAX, 0xFF;
        mov ctr, EAX;
//        jz was_zero;
Lxx:
        dec int ptr ctr;
        jnz Lxx;
        jmp done;
was_zero:
        int 3;
done:   ;
    }
}

Wild speculation: there's a bug in CPUID 2: it's not clearing the loopback buffer. The loop is executed as if 'ctr' were still zero, so the first decrement wraps around and it loops 2^^32 times. This is long enough that Windows does a task switch.

In Core 2, the loopback buffer was between the predecoders and the decoders, but on Core i7, they moved it after the decoders. I tried to confirm this by extending the size of the loop, by padding with nops. When the loop is 63 bytes of code (56 nops), it fails. Once I add a 57th nop, it stops failing.

These aren't the numbers I expected: the loopback buffer is 256 bytes on the Core i7. However, I have a Core i3; perhaps it's different, or it may be a decoding bug. Regardless, this looks very much like a CPU erratum.

My guess is that it is affecting the loop predictor, which isn't the same thing as the branch predictor.
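
To make the 2^^32 figure concrete: a `do { } while (--ctr);` loop entered with a zero counter wraps around on the first decrement and only stops after a full cycle through the counter's range. The sketch below is an illustration only; the helper `iterations` is hypothetical, and it uses narrow counter types so that the wrap-around finishes instantly. A 32-bit counter would give 2^^32 iterations by the same logic.

```d
/// Hypothetical helper: counts how many times the body of
/// `do { } while (--ctr);` executes for an unsigned counter of type T
/// that starts at `start`.
ulong iterations(T)(T start)
{
    T ctr = start;
    ulong n = 0;
    do
    {
        ++n;            // one pass through the loop body
    } while (--ctr);    // unsigned decrement wraps from 0 to T.max
    return n;
}
```

With an 8-bit counter, `iterations!ubyte(3)` is 3, while `iterations!ubyte(0)` is 256: one pass with the counter at zero, then 255 more as it counts down from 255.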

September 27, 2011



--- Comment #6 from Don <clugdbug@yahoo.com.au> 2011-09-27 12:20:31 PDT ---
My theory is not correct. I figured that I could check whether the number of iterations was wrong by using rdtsc to time the loop. But it shows nothing unusual, so I'm no longer convinced that this is a loopback issue.

I also found that if I include a writefln after the relevant code, the critical length of the loop drops from 64 (0x40) to 40 (0x28) bytes. It doesn't seem to be affected by code alignment, so it's not a cache line issue. This whole thing is very, very strange.
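
A timing check of this kind can be sketched in plain D. Note the substitution: this version uses druntime's monotonic clock (`core.time.MonoTime`) instead of rdtsc, and the helper name `timeCountdown` is hypothetical. The idea is the same: a loop that really ran 2^^32 times would take on the order of seconds, far longer than a loop that ran only a handful of times.

```d
import core.time : Duration, MonoTime;

/// Hypothetical helper: measures how long `do { } while (--ctr);` takes
/// for a given starting count. Passing 0 wraps around and runs 2^^32
/// iterations, so expect that call to be slow.
Duration timeCountdown(uint start)
{
    uint ctr = start;
    auto before = MonoTime.currTime;
    do
    {
    } while (--ctr);
    return MonoTime.currTime - before;
}
```

An optimizing build may remove the empty loop entirely, so a measurement like this is only meaningful in a non-optimized build or with the loop written in inline asm as in the test cases above.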

September 28, 2011



--- Comment #7 from Don <clugdbug@yahoo.com.au> 2011-09-27 18:12:05 PDT ---
The reduced test case from test15.d looks _completely_ different:

void main()
{
    char[] a = new char[0];
    uint c = 20000;
    while (c--)
        a ~= 'x';
}

This looks as though the GC is still running after the app has exited.

December 22, 2011



--- Comment #8 from Don <clugdbug@yahoo.com.au> 2011-12-22 02:41:41 PST ---
This is interesting.

http://msdn.microsoft.com/en-us/library/windows/hardware/ff538528%28v=vs.85%29.aspx

"A CPUID intercept message is delivered by the hypervisor when a virtual processor executes a CPUID instruction and the parent partition previously called the HvInstallIntercept hypercall function to install an intercept on such instructions."

Wow. There is a hypervisor running on my laptop. And it's buggy. Could it be a rootkit?
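
One way to test the hypervisor suspicion directly: CPUID leaf 1 reports a hypervisor-present flag in bit 31 of ECX, which hypervisors set for their guests and which bare hardware leaves clear. A minimal sketch of the bit test follows. The helper name `hypervisorBitSet` is hypothetical, and the ECX value is assumed to have been captured with the same inline-asm pattern as the test cases above (`mov EAX, 1; cpuid; mov x, ECX;`).

```d
/// Hypothetical helper: `ecx` is the ECX value returned by CPUID with
/// EAX = 1. Bit 31 is the hypervisor-present flag.
bool hypervisorBitSet(uint ecx) pure nothrow @nogc @safe
{
    return (ecx & (1u << 31)) != 0;
}
```

A clear bit does not rule out every kind of interception, so this is only a first check.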

March 28, 2012


Don <clugdbug@yahoo.com.au> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED


--- Comment #9 from Don <clugdbug@yahoo.com.au> 2012-03-28 00:06:39 PDT ---
Turns out to be caused by Windows Defender.
Disabling it in the development directory solves the problem.

Looks like a bug in Windows Defender to me.
