Re: GC buckets in 2.067
December 01, 2015
On 1 December 2015 at 09:46, Iain Buclaw <ibuclaw@gdcproject.org> wrote:

>
> Where:
>
> *(cast(List*)p) = {next = 0x4c9fad <__gdc_exception_cleanup>, pool = 0x0}
>
> Not sure what is going on, but it seems to happen after allocating memory a couple dozen or so times.
>
> David, did you get anything like this when moving to 2.067?
>
>
I removed the line in EH where `__gdc_exception_cleanup` is assigned (xh is GC'd memory):

221│   //xh.unwindHeader.exception_cleanup = & __gdc_exception_cleanup;


The unittest runner carries on a little longer until it segfaults here:

1796│         // Return next item from free list
1797│         bucket[bin] = (cast(List*)p).next;
1798│         auto pool = (cast(List*)p).pool;
1799│         if (bits)
1800├>            pool.setBits((p - pool.baseAddr) >> pool.shiftBy, bits);

Where:

*cast(List*)p = {next = 0xa, pool = 0x0}


Martin, you've been making changes to the GC, no? Any idea why the bucket free list could be storing garbage pointers? Any hints on how to narrow this down? (I could turn on memory stomping.)
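
For readers following along, here is a minimal sketch of one way a bucket free list can end up with a garbage `next` pointer like the ones in the dumps above: a write through a stale pointer after the block has gone back on the free list. This is not druntime's code and it does not claim to be the actual cause of the crash. `Bucket`, `Header`, `dummyCleanup`, `fill`, `release` and `staleWriteCorruptsFreeList` are hypothetical names made up for the illustration; only the `List` layout (`next`, `pool`) is taken from the gdb output.

```d
import core.stdc.stdlib : malloc, free;

// Free-list node stored inside each free block (same two fields as in the gdb dump).
struct List
{
    List* next;
    void* pool; // stand-in for the owning Pool* (assumption: kept opaque here)
}

// Hypothetical object whose first field is a function pointer,
// loosely modelled on an exception header with a cleanup callback.
struct Header
{
    void function() cleanup;
    size_t padding;
}

void dummyCleanup() {}

enum blockSize = List.sizeof;
enum blockCount = 4;
static assert(Header.sizeof <= blockSize);

struct Bucket
{
    List* head;

    // Thread a raw chunk into a singly linked free list.
    void fill(void* chunk, size_t count, void* pool)
    {
        head = null;
        foreach (i; 0 .. count)
        {
            auto node = cast(List*)(cast(ubyte*)chunk + i * blockSize);
            node.next = head;
            node.pool = pool;
            head = node;
        }
    }

    // Pop the next free block; trusts whatever is stored in it.
    void* alloc()
    {
        auto p = head;
        assert(p !is null, "bucket exhausted");
        head = p.next;
        return p;
    }

    // Push a block back; its first bytes become the List node again.
    void release(void* p, void* pool)
    {
        auto node = cast(List*)p;
        node.next = head;
        node.pool = pool;
        head = node;
    }
}

// Returns true when a write through a stale pointer replaced the free-list link.
bool staleWriteCorruptsFreeList()
{
    void* chunk = malloc(blockSize * blockCount);
    assert(chunk !is null);
    scope (exit) free(chunk);

    int poolToken; // any address works as a fake pool
    Bucket bucket;
    bucket.fill(chunk, blockCount, &poolToken);

    auto h = cast(Header*) bucket.alloc();
    bucket.release(h, &poolToken);     // block is back on the free list
    auto expectedNext = bucket.head.next;

    h.cleanup = &dummyCleanup;         // use-after-free: lands on List.next

    return bucket.head.next !is expectedNext
        && cast(void*) bucket.head.next is cast(void*) &dummyCleanup;
}

void main()
{
    import std.stdio : writeln;
    writeln("free list corrupted: ", staleWriteCorruptsFreeList());
}
```

In this toy version the next `alloc()` would hand out the address of `dummyCleanup` as if it were a free block. A stomping mode that fills released blocks with a known pattern would make this kind of late write easier to spot.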


December 04, 2015
On 1 December 2015 at 09:46, Iain Buclaw <ibuclaw@gdcproject.org> wrote:

> When running the unittest program for druntime.
>
> ---
> Program received signal SIGSEGV, Segmentation fault.
> __memset_avx2 () at ../sysdeps/x86_64/multiarch/memset-avx2.S:101
>
> backtrace:
> #0  __memset_avx2 () at ../sysdeps/x86_64/multiarch/memset-avx2.S:101
> #1  0x00000000004d45a0 in gc.gc.GC.malloc(ulong, uint, ulong*, const(TypeInfo)) (this=..., size=8, bits=0, alloc_size=0x7fffffffd428, ti=0x714050 <TypeInfo_PS2rt3aaA4Impl.init$>) at ../../../../dev/libphobos/libdruntime/gc/gc.d:459
> #2  0x00000000004c5948 in gc_qalloc (sz=8, ba=0, ti=0x714050 <TypeInfo_PS2rt3aaA4Impl.init$>) at ../../../../dev/libphobos/libdruntime/gc/proxy.d:196
> #3  0x00000000004450de in core.memory.GC.qalloc(ulong, uint, const(TypeInfo)) (sz=8, ba=0, ti=0x714050 <TypeInfo_PS2rt3aaA4Impl.init$>) at ../../../../dev/libphobos/libdruntime/core/memory.d:368
> #4  0x0000000000420e31 in _d_newitemT (_ti=0x714050 <TypeInfo_PS2rt3aaA4Impl.init$>) at ../../../../dev/libphobos/libdruntime/rt/lifetime.d:1096
> #5  0x0000000000411f6c in _aaGetX (aa=0x7ffff7ed2090, keyti=0x7191a0 <ClassInfo for core.thread.Thread>, valuesize=8, pkey=0x7fffffffd598) at ../../../../dev/libphobos/libdruntime/rt/aaA.d:172
>

DMD dropped calling this function in favour of _aaGetY().

Maybe I'm chasing a dead end, but maybe, *maybe* something changed and _aaGetX was not updated in parallel?

Iain.


December 05, 2015
On 4 December 2015 at 21:52, Iain Buclaw <ibuclaw@gdcproject.org> wrote:

>
> DMD dropped calling this function in favour of _aaGetY().
>
> Maybe I'm chasing a dead end, but maybe, *maybe* something changed and _aaGetX was not updated in parallel?
>
> Iain.
>

Well, after reverting all of druntime 2.067 (minus the bits that produce new errors), I don't hit this error.

At least I have a (rather large) starting point to bisect down. :-)


December 05, 2015
On 5 December 2015 at 00:40, Iain Buclaw <ibuclaw@gdcproject.org> wrote:

>
> Well, after reverting all of druntime 2.067 (minus the bits that produce new errors), I don't hit this error.
>
> At least I have a (rather large) starting point to bisect down. :-)
>
>
Squashed down to a 1600-line diff of rt.lifetime; everything else has been applied and passes the unittests just fine.


December 04, 2015
On 12/4/15 7:33 PM, Iain Buclaw via Digitalmars-d wrote:
> Squashed down to a 1600-line diff of rt.lifetime; everything else has
> been applied and passes the unittests just fine.

I'm interested in hearing what this is. lifetime.d went through a major update with struct destructor support in the GC. We've found a couple of bugs in there. You should examine the history between 2.067 and now.

-Steve


December 05, 2015
On 5 December 2015 at 03:46, Steven Schveighoffer via Digitalmars-d < digitalmars-d@puremagic.com> wrote:

> On 12/4/15 7:33 PM, Iain Buclaw via Digitalmars-d wrote:
>
>> Squashed down to a 1600-line diff of rt.lifetime; everything else has been applied and passes the unittests just fine.
>>
>
> I'm interested in hearing what this is. lifetime.d went through a major update with struct destructor support in the GC. We've found a couple of bugs in there. You should examine the history between 2.067 and now.
>
> -Steve
>

Applying the changes patch by patch using git-format, it didn't take long to find the bad patch:

https://github.com/D-Programming-Language/druntime/pull/941

There's apparently a dependency on a compiler change, though I haven't looked at that yet.

Iain