[Issue 21919] darwin: SEGV in core.thread tests on OSX 11
May 13, 2021
https://issues.dlang.org/show_bug.cgi?id=21919

Iain Buclaw <ibuclaw@gdcproject.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |ibuclaw@gdcproject.org

--- Comment #1 from Iain Buclaw <ibuclaw@gdcproject.org> ---
Confirmed on DMD when running the unittests.

generated/osx/release/64/unittest/test_runner core.thread.threadgroup
make[1]: *** [generated/osx/release/64/unittest/core/thread/fiber] Bus error: 10
make[1]: *** Deleting file `generated/osx/release/64/unittest/core/thread/fiber'
make[1]: *** Waiting for unfinished jobs....
generated/osx/release/64/unittest/test_runner core.thread.types
make: *** [unittest-release] Error 2

--
May 13, 2021
https://issues.dlang.org/show_bug.cgi?id=21919

--- Comment #2 from Iain Buclaw <ibuclaw@gdcproject.org> ---
$ sw_vers
ProductName:    macOS
ProductVersion: 11.1
BuildVersion:   20C69

$ clang --version
Apple clang version 12.0.0 (clang-1200.0.32.27)

$ xcodebuild -version
Xcode 12.2
Build version 12B45b

$ uname -v
Darwin Kernel Version 20.2.0: Wed Dec  2 20:39:59 PST 2020; root:xnu-7195.60.75~1/RELEASE_X86_64

druntime: a79bb0eb0424f77159eb72e1c527db3b2ae2a57d

dmd: 97aa2ae5ee19ce6a2979ca1627479df713f99252

--
May 13, 2021
https://issues.dlang.org/show_bug.cgi?id=21919

--- Comment #3 from Iain Buclaw <ibuclaw@gdcproject.org> ---
Process 65900 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=2, address=0x100c8cc70)
    frame #0: 0x00007fff203794e0 libsystem_pthread.dylib`___chkstk_darwin + 96
libsystem_pthread.dylib`___chkstk_darwin:
->  0x7fff203794e0 <+96>:  testq  %rcx, (%rcx)
    0x7fff203794e3 <+99>:  popq   %rcx
    0x7fff203794e4 <+100>: retq
libsystem_pthread.dylib`pthread_getspecific:
    0x7fff203794e5 <+0>:   movq   %gs:(,%rdi,8), %rax
Target 0: (test_runner) stopped.
(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=2, address=0x100c8cc70)
  * frame #0: 0x00007fff203794e0 libsystem_pthread.dylib`___chkstk_darwin + 96
    frame #1: 0x00007fff20379480 libsystem_pthread.dylib`thread_start + 20
    frame #2: 0x00007fff2a542a9c libunwind.dylib`libunwind::UnwindCursor<libunwind::LocalAddressSpace, libunwind::Registers_x86_64>::step() + 476
    frame #3: 0x00007fff2a5446ee libunwind.dylib`_Unwind_RaiseException + 189
    frame #4: 0x00000001001bbeb8 test_runner`_d_throwdwarf at dwarfeh.d:317
    frame #5: 0x0000000100188b84 test_runner`_D4core6thread5fiber19__unittest_L1679_C1FZ9__lambda1MFNaNbNfZv at fiber.d:1686
    frame #6: 0x000000010018fb85 test_runner`_D4core6thread7context8Callable6opCallMFZv at context.d:46
    frame #7: 0x0000000100187ac5 test_runner`_D4core6thread5fiber5Fiber3runMFZv at fiber.d:869
    frame #8: 0x000000010018749f test_runner`fiber_entryPoint at fiber.d:157

--
May 13, 2021
https://issues.dlang.org/show_bug.cgi?id=21919

--- Comment #4 from Iain Buclaw <ibuclaw@gdcproject.org> ---
This was discovered in December, hence the git commit hashes are 5 months old.

--
May 13, 2021
https://issues.dlang.org/show_bug.cgi?id=21919

--- Comment #5 from Iain Buclaw <ibuclaw@gdcproject.org> ---
To describe what appears to be happening:

1. A D fiber context switch occurs.
2. An exception is thrown.
3. libunwind's entry point for raising exceptions is called.
4. Segfault somewhere deep in libc/pthread.

The unittest block that matches the encoded line numbers in the function name is:
---
// Test exception handling inside fibers.
unittest
{
    enum MSG = "Test message.";
    string caughtMsg;
    (new Fiber({
        try
        {
            throw new Exception(MSG);
        }
        catch (Exception e)
        {
            caughtMsg = e.msg;
        }
    })).call();
    assert(caughtMsg == MSG);
}
---

--
September 11, 2021
https://issues.dlang.org/show_bug.cgi?id=21919

Lionello Lunesu <lio+bugzilla@lunesu.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |lio+bugzilla@lunesu.com

--- Comment #6 from Lionello Lunesu <lio+bugzilla@lunesu.com> ---
I suspect I'm running into this same bug while running the DMD 2.097.2 test suite on Big Sur:

$ test_results/runnable/test15779_0
fish: “test_results/runnable/test15779…” terminated by signal SIGBUS (Misaligned address error)

$ lldb test_results/runnable/test15779_0
(lldb) target create "test_results/runnable/test15779_0"
Current executable set to '/Users/llunesu/repos/d/dmd/test/test_results/runnable/test15779_0' (x86_64).
(lldb) r
Process 854 launched: '/Users/llunesu/repos/d/dmd/test/test_results/runnable/test15779_0' (x86_64)
Process 854 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=2, address=0x1001edc60)
    frame #0: 0x00007fff2031b4a8 libsystem_pthread.dylib`___chkstk_darwin + 96
libsystem_pthread.dylib`___chkstk_darwin:
->  0x7fff2031b4a8 <+96>:  testq  %rcx, (%rcx)
    0x7fff2031b4ab <+99>:  popq   %rcx
    0x7fff2031b4ac <+100>: retq

libsystem_pthread.dylib`pthread_getspecific:
    0x7fff2031b4ad <+0>:   movq   %gs:(,%rdi,8), %rax
Target 0: (test15779_0) stopped.
(lldb)

--
September 11, 2021
https://issues.dlang.org/show_bug.cgi?id=21919

--- Comment #7 from Lionello Lunesu <lio+bugzilla@lunesu.com> ---
Stack trace for previous crash:

(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=2, address=0x1001e9c60)
  * frame #0: 0x00007fff2031b4a8 libsystem_pthread.dylib`___chkstk_darwin + 96
    frame #1: 0x00007fff2031b448 libsystem_pthread.dylib`thread_start + 20
    frame #2: 0x00007fff2a4bfb2d libunwind.dylib`libunwind::UnwindCursor<libunwind::LocalAddressSpace, libunwind::Registers_x86_64>::getInfoFromDwarfSection(unsigned long, libunwind::UnwindInfoSections const&, unsigned int) + 191
    frame #3: 0x00007fff2a4bfa01 libunwind.dylib`libunwind::UnwindCursor<libunwind::LocalAddressSpace, libunwind::Registers_x86_64>::setInfoBasedOnIPRegister(bool) + 999
    frame #4: 0x00007fff2a4c1ec9 libunwind.dylib`libunwind::UnwindCursor<libunwind::LocalAddressSpace, libunwind::Registers_x86_64>::step() + 461
    frame #5: 0x00007fff2a4c3a18 libunwind.dylib`_Unwind_RaiseException + 189
    frame #6: 0x000000010002fca5 test15779_0`_d_throwdwarf + 185
    frame #7: 0x0000000100002410 test15779_0`_D9test157793barFZ9__lambda1FNaNfZv + 80
    frame #8: 0x000000010002cc2f test15779_0`_D4core6thread7context8Callable6opCallMFZv + 27
    frame #9: 0x00000001000293a7 test15779_0`fiber_entryPoint + 99

--
September 11, 2021
https://issues.dlang.org/show_bug.cgi?id=21919

--- Comment #8 from Lionello Lunesu <lio+bugzilla@lunesu.com> ---
$ sw_vers
ProductName:    macOS
ProductVersion: 11.5.2
BuildVersion:   20G95

$ clang --version
Apple clang version 12.0.5 (clang-1205.0.22.11)
Target: x86_64-apple-darwin20.6.0
Thread model: posix
InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin

$ xcodebuild -version
Xcode 12.5.1
Build version 12E507

$ uname -v
Darwin Kernel Version 20.6.0: Wed Jun 23 00:26:31 PDT 2021; root:xnu-7195.141.2~5/RELEASE_X86_64

dmd, druntime, Phobos: tag v2.097.2

--
September 11, 2021
https://issues.dlang.org/show_bug.cgi?id=21919

Lionello Lunesu <lio+bugzilla@lunesu.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           See Also|                            |https://issues.dlang.org/show_bug.cgi?id=15779

--
November 07, 2021
https://issues.dlang.org/show_bug.cgi?id=21919

--- Comment #9 from Iain Buclaw <ibuclaw@gdcproject.org> ---
After some prodding around, the root cause is that Darwin's libunwind now overflows the Fiber's small 16 KB stack.

The fix, then, is to bump the stack size allocated for Fibers.

     version (Windows)
         // exception handling walks the stack, invoking DbgHelp.dll which
         // needs up to 16k of stack space depending on the version of DbgHelp.dll,
         // the existence of debug symbols and other conditions. Avoid causing
         // stack overflows by defaulting to a larger stack size
         enum defaultStackPages = 8;
+    else version (OSX)
+    {
+        version (X86_64)
+            enum defaultStackPages = 8;
+        else
+            enum defaultStackPages = 4;
+    }
     else
         enum defaultStackPages = 4;

The Darwin x86 page size is 4k, whilst on arm64 it is 16k, so this fix only needs to be applied to x86_64 code.
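
For readers who cannot rebuild druntime, here is a minimal sketch of how the same effect might be had from user code. It assumes the `Fiber` constructor's optional stack-size argument is enough to sidestep the overflow. The constants `x86PageSize` and `arm64PageSize` and the helper `catchInFiber` are hypothetical names made up for this illustration, not druntime symbols. The static asserts spell out the page arithmetic behind the patch.

```d
import core.thread : Fiber;

// Illustrative constants taken from the page sizes quoted above
// (hypothetical names, not druntime symbols).
enum x86PageSize   = 4 * 1024;   // Darwin x86_64 page size: 4k
enum arm64PageSize = 16 * 1024;  // Darwin arm64 page size: 16k

static assert(4 * x86PageSize   == 16 * 1024); // old x86_64 default: 16k
static assert(8 * x86PageSize   == 32 * 1024); // patched x86_64 default: 32k
static assert(4 * arm64PageSize == 64 * 1024); // arm64 with 4 pages: 64k

// Hypothetical helper: throws and catches an exception inside a Fiber whose
// stack size is requested explicitly, then returns the caught message.
string catchInFiber(size_t stackSize)
{
    enum MSG = "Test message.";
    string caughtMsg;
    auto fib = new Fiber({
        try
        {
            throw new Exception(MSG);
        }
        catch (Exception e)
        {
            caughtMsg = e.msg;
        }
    }, stackSize); // second constructor argument is the stack size in bytes
    fib.call();
    return caughtMsg;
}

void main()
{
    // Request the same 32k that the patched default gives on x86_64.
    assert(catchInFiber(8 * x86PageSize) == "Test message.");
}
```

Whether 32k is enough on a given macOS release is an assumption carried over from the patch; code that unwinds through deeper frames may need a larger value.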

--