Thread overview
[phobos] opinions on bug 5370
Aug 04, 2011
Brad Roberts
Aug 04, 2011
Dmitry Olshansky
Aug 04, 2011
Dmitry Olshansky
Aug 05, 2011
Walter Bright
Aug 05, 2011
Dmitry Olshansky
Aug 08, 2011
Walter Bright
Aug 08, 2011
Dmitry Olshansky
Aug 05, 2011
Sean Kelly
August 03, 2011
http://d.puremagic.com/issues/show_bug.cgi?id=5370

It's listed as a regression, but it looks like a deliberate behavior change.  My personal opinion is that buffers don't contain pointers.  They shouldn't participate in garbage collection.

Either way, a decision needs to be made and either the changes applied or the bug closed as won't fix.

Thoughts?

Later,
Brad
August 05, 2011
On 04.08.2011 7:21, Brad Roberts wrote:
> http://d.puremagic.com/issues/show_bug.cgi?id=5370
>
> It's listed as a regression, but it looks like a deliberate behavior change.  My personal opinion is that buffers don't contain pointers.  They shouldn't participate in garbage collection.
>
> Either way, a decision needs to be made and either the changes applied or the bug closed as won't fix.
>
> Thoughts?

When I filed it I was certain it's a regression, I discovered it while
chasing memory corruption in my D2 port of DMDscript.
All boiled down to "it works untill garbage collection", and then I
found the cause:  the original code stores pointers to heap allocated
objects in OutBuffer. So I think it _was_ supposed to be able to hold
pointers to heap allocated stuff.
Also grep code for:
GC.clrAttr(data.ptr, GC.BlkAttr.NO_SCAN);
I think this made it to scan GC  before array LRU append cache. Nowdays
this part of code does nothing, since it somehow doesn't point to
_block_. At any rate, I can wrap up the patch as pull request if I
convinced anybody.

-- 
Dmitry Olshansky

August 05, 2011
On 04.08.2011 7:21, Brad Roberts wrote:
> http://d.puremagic.com/issues/show_bug.cgi?id=5370
>
> It's listed as a regression, but it looks like a deliberate behavior change.  My personal opinion is that buffers don't contain pointers.  They shouldn't participate in garbage collection.
>
> Either way, a decision needs to be made and either the changes applied or the bug closed as won't fix.
>
> Thoughts?

When I filed it I was certain it's a regression, I discovered it while
chasing memory corruption in my D2 port of DMDscript.
All boiled down to "it works untill garbage collection", and then I
found the cause:  the original code stores pointers to heap allocated
objects in OutBuffer. So I think it _was_ supposed to be able to hold
pointers to heap allocated stuff.
Also grep code for:
GC.clrAttr(data.ptr, GC.BlkAttr.NO_SCAN);
I think this made it to scan GC  before array LRU append cache. Nowdays
this part of code does nothing, since it somehow doesn't point to
_block_. At any rate, I can wrap up the patch as pull request if I
convinced anybody.

-- 
Dmitry Olshansky

August 05, 2011
On Aug 3, 2011, at 8:21 PM, Brad Roberts wrote:

> http://d.puremagic.com/issues/show_bug.cgi?id=5370
> 
> It's listed as a regression, but it looks like a deliberate behavior change.  My personal opinion is that buffers don't contain pointers.  They shouldn't participate in garbage collection.

Agreed.  Buffers contain bytes, not arbitrary data.
August 05, 2011

On 8/4/2011 4:24 PM, Dmitry Olshansky wrote:
> On 04.08.2011 7:21, Brad Roberts wrote:
>> http://d.puremagic.com/issues/show_bug.cgi?id=5370
>>
>> It's listed as a regression, but it looks like a deliberate behavior change.
>> My personal opinion is that buffers don't
>> contain pointers.  They shouldn't participate in garbage collection.
>>
>> Either way, a decision needs to be made and either the changes applied or the bug closed as won't fix.
>>
>> Thoughts?
>
> When I filed it I was certain it's a regression, I discovered it while chasing
> memory corruption in my D2 port of DMDscript.
> All boiled down to "it works untill garbage collection", and then I found the
> cause:  the original code stores pointers to heap allocated objects in
> OutBuffer. So I think it _was_ supposed to be able to hold pointers to heap
> allocated stuff.
> Also grep code for:
> GC.clrAttr(data.ptr, GC.BlkAttr.NO_SCAN);
> I think this made it to scan GC  before array LRU append cache. Nowdays this
> part of code does nothing, since it somehow doesn't point to _block_. At any
> rate, I can wrap up the patch as pull request if I convinced anybody.
>

Can you add a bug report about DMDScript, then?
August 06, 2011
On 05.08.2011 22:04, Walter Bright wrote:
>
>
> On 8/4/2011 4:24 PM, Dmitry Olshansky wrote:
>> On 04.08.2011 7:21, Brad Roberts wrote:
>>> http://d.puremagic.com/issues/show_bug.cgi?id=5370
>>>
>>> It's listed as a regression, but it looks like a deliberate behavior
>>> change.  My personal opinion is that buffers don't
>>> contain pointers.  They shouldn't participate in garbage collection.
>>>
>>> Either way, a decision needs to be made and either the changes applied or the bug closed as won't fix.
>>>
>>> Thoughts?
>>
>> When I filed it I was certain it's a regression, I discovered it
>> while chasing memory corruption in my D2 port of DMDscript.
>> All boiled down to "it works untill garbage collection", and then I
>> found the cause:  the original code stores pointers to heap allocated
>> objects in OutBuffer. So I think it _was_ supposed to be able to hold
>> pointers to heap allocated stuff.
>> Also grep code for:
>> GC.clrAttr(data.ptr, GC.BlkAttr.NO_SCAN);
>> I think this made it to scan GC  before array LRU append cache.
>> Nowdays this part of code does nothing, since it somehow doesn't
>> point to _block_. At any rate, I can wrap up the patch as pull
>> request if I convinced anybody.
>>
>
> Can you add a bug report about DMDScript, then?

Sorry if I'm not entirely clear on this what kind of report that would be?
If it's that it uses wrong facility (std.outbuffer) for heap allocated
objects, the whole IR codegen is, sort of, centered over this ability.
What should it do then if outbuffer is supposed to be NO_SCAN?
Also I think it may still scan outbuffer in D1, I never checked how the
original source works.

-- 
Dmitry Olshansky

August 07, 2011
Like this: http://bugzilla.digitalmars.com/issues/show_bug.cgi?id=91

On 8/5/2011 2:14 PM, Dmitry Olshansky wrote:
>
> Sorry if I'm not entirely clear on this what kind of report that would be?
> If it's that it uses wrong facility (std.outbuffer) for heap allocated
> objects, the whole IR codegen is, sort of, centered over this ability. What
> should it do then if outbuffer is supposed to be NO_SCAN?
> Also I think it may still scan outbuffer in D1, I never checked how the
> original source works.
>
August 08, 2011


>________________________________
>From: Dmitry Olshansky <dmitry.olsh at gmail.com>
>
>On 04.08.2011 7:21, Brad Roberts wrote:
>> http://d.puremagic.com/issues/show_bug.cgi?id=5370
>> 
>> It's listed as a regression, but it looks like a deliberate behavior change.? My personal opinion is that buffers don't contain pointers.? They shouldn't participate in garbage collection.
>> 
>> Either way, a decision needs to be made and either the changes applied or the bug closed as won't fix.
>> 
>> Thoughts?
>
>When I filed it I was certain it's a regression, I discovered it while chasing memory corruption in my D2 port of DMDscript.
>All boiled down to "it works untill garbage collection", and then I found the cause:? the original code stores pointers to heap allocated objects in OutBuffer. So I think it _was_ supposed to be able to hold pointers to heap allocated stuff.
>Also grep code for:
>GC.clrAttr(data.ptr, GC.BlkAttr.NO_SCAN);
>I think this made it to scan GC? before array LRU append cache. Nowdays this part of code does nothing, since it somehow doesn't point to _block_. At any rate, I can wrap up the patch as pull request if I convinced anybody.

I think I can shed some light on this.

It's not so much the LRU cache but the array append code (that is the code that stores the number of *used* bytes).

In the case where a shared array gets longer than 4096 bytes (including metadata), there is the possibility to *extend* into consecutive pages.? If I store the metadata at the end of the block, then this scenario may occur:

1. thread A appends, gets the block info

2. thread B appends, gets the block info

3. thread A is able to extend to a consecutive page.? The metadata is moved to the end of the new page, and overwrites the original metadata with the appended data.
4. thread B, using its stale block info, reads the overwritten metadata, corruption ensues.

So I decided for blocks >= PAGESIZE should store the metadata at the beginning of the block.? This way, at least when using stale block info, thread B still sees valid metdata, and reallocates.

The other option I think is to hold the GC lock for the entire append operation, not just for the block lookup (I have a separate lock used to prevent races when modifying shared metadata).

If you have other ideas, I'd be open to changing it.

BTW, in my opinion, I think OutBuffer should simply be templated on the array type.

-Steve

August 08, 2011
On 08.08.2011 17:07, Steve Schveighoffer wrote:
>
>> ________________________________
>> From: Dmitry Olshansky<dmitry.olsh at gmail.com>
>>
>> On 04.08.2011 7:21, Brad Roberts wrote:
>>> http://d.puremagic.com/issues/show_bug.cgi?id=5370
>>>
>>> It's listed as a regression, but it looks like a deliberate behavior change.  My personal opinion is that buffers don't contain pointers.  They shouldn't participate in garbage collection.
>>>
>>> Either way, a decision needs to be made and either the changes applied or the bug closed as won't fix.
>>>
>>> Thoughts?
>> When I filed it I was certain it's a regression, I discovered it while chasing memory corruption in my D2 port of DMDscript.
>> All boiled down to "it works untill garbage collection", and then I found the cause:  the original code stores pointers to heap allocated objects in OutBuffer. So I think it _was_ supposed to be able to hold pointers to heap allocated stuff.
>> Also grep code for:
>> GC.clrAttr(data.ptr, GC.BlkAttr.NO_SCAN);
>> I think this made it to scan GC  before array LRU append cache. Nowdays this part of code does nothing, since it somehow doesn't point to _block_. At any rate, I can wrap up the patch as pull request if I convinced anybody.
> I think I can shed some light on this.
>
> It's not so much the LRU cache but the array append code (that is the code that stores the number of *used* bytes).
>
> In the case where a shared array gets longer than 4096 bytes (including metadata), there is the possibility to *extend* into consecutive pages.  If I store the metadata at the end of the block, then this scenario may occur:
>
> 1. thread A appends, gets the block info
>
> 2. thread B appends, gets the block info
>
> 3. thread A is able to extend to a consecutive page.  The metadata is moved to the end of the new page, and overwrites the original metadata with the appended data.
> 4. thread B, using its stale block info, reads the overwritten metadata, corruption ensues.
>
> So I decided for blocks>= PAGESIZE should store the metadata at the beginning of the block.  This way, at least when using stale block info, thread B still sees valid metdata, and reallocates.
>
> The other option I think is to hold the GC lock for the entire append operation, not just for the block lookup (I have a separate lock used to prevent races when modifying shared metadata).
>
> If you have other ideas, I'd be open to changing it.
Now it makes it abundantly clear why the code here is failing, in my fix I just changed underlying array type to void[] avoiding all the tricky calls to GC.clearAttr. As for the position of metadata, I'm no expert on runtime and GC can't really say help here. I think outbuffer might be not so extensively used as other parts of phobos so the this error hasn't surfaced before.
>
> BTW, in my opinion, I think OutBuffer should simply be templated on the array type.
>
Agreed, it looks like the best solution to me. Still the default was to scan array, so if we are changing this default, at minimum it should be noted it in  documentation and changelog.

-- 
Dmitry Olshansky

August 08, 2011



----- Original Message -----
> From: Dmitry Olshansky <dmitry.olsh at gmail.com>

>>  BTW, in my opinion, I think OutBuffer should simply be templated on the
> array type.
>> 
> Agreed, it looks like the best solution to me. Still the default was to scan array, so if we are changing this default, at minimum it should be noted it in? documentation and changelog.

Actually, looking at OutBuffer, I think maybe the others are right.? Distinctly missing is any method that writes a pointer-containing type.

Sorry, but I think I switched sides :)? Outbuffer should not use void[].? And anything that is using OutBuffer for storing pointers is incorrectly written.

OutBuffer almost seems like outdated code anyways...? I wonder if dmdscript shouldn't be rewritten to use Appender.

-Steve