Jump to page: 1 26  
Page
Thread overview
An important pull request: accessing shared affix for immutable data
Feb 12, 2016
Timon Gehr
Feb 13, 2016
Timon Gehr
Feb 13, 2016
ZombineDev
Feb 12, 2016
deadalnix
Feb 13, 2016
John Colvin
Feb 13, 2016
deadalnix
Feb 13, 2016
tsbockman
Feb 13, 2016
tsbockman
Feb 13, 2016
tsbockman
Feb 13, 2016
tsbockman
Feb 13, 2016
tsbockman
Feb 13, 2016
Dicebot
Feb 13, 2016
Dicebot
Feb 13, 2016
Dicebot
Feb 13, 2016
ZombineDev
Feb 13, 2016
Timon Gehr
Feb 13, 2016
Timon Gehr
Feb 13, 2016
Timon Gehr
Feb 13, 2016
ZombineDev
Apr 24, 2016
Marco Leise
Feb 13, 2016
ZombineDev
Feb 13, 2016
ZombineDev
Feb 13, 2016
Mathias Lang
Feb 13, 2016
deadalnix
Feb 13, 2016
ZombineDev
Feb 14, 2016
deadalnix
Feb 13, 2016
rsw0x
Feb 13, 2016
rsw0x
Feb 13, 2016
Iakh
Feb 13, 2016
Iakh
Feb 13, 2016
ZombineDev
Feb 13, 2016
ZombineDev
Feb 14, 2016
Sönke Ludwig
February 12, 2016
https://github.com/D-Programming-Language/phobos/pull/3991

A short while ago Dicebot discussed the notion of using the allocator to store the reference count of objects (and generally metadata). The allocator seems to be a good place because in a way it's a source of "ground truth" - no matter how data is qualified, it originated as untyped mutable bytes from the allocator.

So after thinking a bit I managed to convince myself that the affixes in an AffixAllocator can be accessed with removing immutable, but without actually breaking any guarantee made by the type system. (Affixes are extra bytes allocated before and after the actual allocation.) The logic goes as follows:

* If the buffer is mutable, then the allocator assumes it hasn't been shared across threads, so it returns a reference to a mutable affix.

* If the buffer is const, then the allocator must conservatively assume it might have been immutable and subsequently shared among threads. Therefore, several threads may request the affix of the same buffer simultaneously. So it returns a reference to a shared affix.

* If the buffer is shared, then the allocator assumes again several threads may access the affix so it returns a reference to a shared affix.

One simple way to look at this is: the allocator keeps an associative array mapping allocated buffers to metadata (e.g. reference counts). The allocated buffers may be immutable, which doesn't require the metadata to be immutable as well. The only difference between an approach based on an associative array and AffixAllocator is that the latter is faster (the association is fixed by layout).


Destroy!

Andrei



February 13, 2016
On 12.02.2016 20:12, Andrei Alexandrescu wrote:
> https://github.com/D-Programming-Language/phobos/pull/3991
>
> A short while ago Dicebot discussed the notion of using the allocator to
> store the reference count of objects (and generally metadata). The
> allocator seems to be a good place because in a way it's a source of
> "ground truth" - no matter how data is qualified, it originated as
> untyped mutable bytes from the allocator.
>
> So after thinking a bit I managed to convince myself that the affixes in
> an AffixAllocator can be accessed with removing immutable, but without
> actually breaking any guarantee made by the type system. (Affixes are
> extra bytes allocated before and after the actual allocation.) The logic
> goes as follows:
>
> * If the buffer is mutable, then the allocator assumes it hasn't been
> shared across threads, so it returns a reference to a mutable affix.
>
> * If the buffer is const, then the allocator must conservatively assume
> it might have been immutable and subsequently shared among threads.
> Therefore, several threads may request the affix of the same buffer
> simultaneously. So it returns a reference to a shared affix.
>
> * If the buffer is shared, then the allocator assumes again several
> threads may access the affix so it returns a reference to a shared affix.
>
> One simple way to look at this is: the allocator keeps an associative
> array mapping allocated buffers to metadata (e.g. reference counts). The
> allocated buffers may be immutable, which doesn't require the metadata
> to be immutable as well. The only difference between an approach based
> on an associative array and AffixAllocator is that the latter is faster
> (the association is fixed by layout).
>
>
> Destroy!
>
> Andrei
>

The first thing that comes to mind is that accessing a global associative array is not 'pure'.
February 12, 2016
On Friday, 12 February 2016 at 19:12:48 UTC, Andrei Alexandrescu wrote:
> https://github.com/D-Programming-Language/phobos/pull/3991
>
> A short while ago Dicebot discussed the notion of using the allocator to store the reference count of objects (and generally metadata). The allocator seems to be a good place because in a way it's a source of "ground truth" - no matter how data is qualified, it originated as untyped mutable bytes from the allocator.
>
> So after thinking a bit I managed to convince myself that the affixes in an AffixAllocator can be accessed with removing immutable, but without actually breaking any guarantee made by the type system. (Affixes are extra bytes allocated before and after the actual allocation.) The logic goes as follows:
>
> * If the buffer is mutable, then the allocator assumes it hasn't been shared across threads, so it returns a reference to a mutable affix.
>
> * If the buffer is const, then the allocator must conservatively assume it might have been immutable and subsequently shared among threads. Therefore, several threads may request the affix of the same buffer simultaneously. So it returns a reference to a shared affix.
>
> * If the buffer is shared, then the allocator assumes again several threads may access the affix so it returns a reference to a shared affix.
>
> One simple way to look at this is: the allocator keeps an associative array mapping allocated buffers to metadata (e.g. reference counts). The allocated buffers may be immutable, which doesn't require the metadata to be immutable as well. The only difference between an approach based on an associative array and AffixAllocator is that the latter is faster (the association is fixed by layout).
>
>
> Destroy!
>
> Andrei

So if I may, I think we should avoid affix data in the general case.

Providing some metadata in the allocate is in itself a good idea. Locating these data with the object is usually not :
 - Mutating the metadata will create sharing overhead on the whole cache line. Sharing of immutable would become inefficient.
 - It tends to create allocation size that aren't friendly to underlying allocators. For instance, an alocation of size 2^n + 8 bumps you to the next size class, often 2^(n+1) or alike. This creates a lot of internal fragmentation.
 - Buffer overflow/underflow WILL spill in the alocator metadata. You don't want that. This pretty much guaranteed that the worse thing that can happen will happen: corrupting the alocator.

jemalloc moved away from using them for small runs for these reasons.
February 12, 2016
On 02/12/2016 06:29 PM, Timon Gehr wrote:
> The first thing that comes to mind is that accessing a global
> associative array is not 'pure'.

The weird thing is in this case is. The analogy is limited. -- Andrei
February 12, 2016
On 02/12/2016 06:52 PM, deadalnix wrote:
> Providing some metadata in the allocate is in itself a good idea.
> Locating these data with the object is usually not :
>   - Mutating the metadata will create sharing overhead on the whole
> cache line. Sharing of immutable would become inefficient.
>   - It tends to create allocation size that aren't friendly to
> underlying allocators. For instance, an alocation of size 2^n + 8 bumps
> you to the next size class, often 2^(n+1) or alike. This creates a lot
> of internal fragmentation.
>   - Buffer overflow/underflow WILL spill in the alocator metadata. You
> don't want that. This pretty much guaranteed that the worse thing that
> can happen will happen: corrupting the alocator.
>
> jemalloc moved away from using them for small runs for these reasons.

I think we're good there. -- Andrei

February 13, 2016
On Friday, 12 February 2016 at 19:12:48 UTC, Andrei Alexandrescu wrote:
> * If the buffer is const, then the allocator must conservatively assume it might have been immutable and subsequently shared among threads. Therefore, several threads may request the affix of the same buffer simultaneously. So it returns a reference to a shared affix.

I understand the reasoning here, but I really dislike the idea of `const` as a de-optimization. However, given the way that const is used, I guess this wouldn't be much of a problem in practice provided that some form of borrowing or inc/dec pair elision is implemented.
February 13, 2016
On Friday, 12 February 2016 at 19:12:48 UTC, Andrei Alexandrescu wrote:
> * If the buffer is const, then the allocator must conservatively assume it might have been immutable and subsequently shared among threads. Therefore, several threads may request the affix of the same buffer simultaneously. So it returns a reference to a shared affix.

What about providing a way to mark a piece of data as thread-local at allocation time?

The allocator could then stick it in a separate memory range from the (potentially) shared stuff, and just issue a fatal `Error` if an attempt was made to access it from the wrong thread.

This would prevent `const` from triggering unnecessary atomic ops or locks. The main disadvantage I see is that the logic to find the correct reference count address would be a little more complicated, but I suspect this would be a good trade-off given how expensive thread-safe memory operations can be.
February 13, 2016
On 02/12/2016 09:12 PM, Andrei Alexandrescu wrote:
> So after thinking a bit I managed to convince myself that the affixes in an AffixAllocator can be accessed with removing immutable, but without actually breaking any guarantee made by the type system. (Affixes are extra bytes allocated before and after the actual allocation.) The logic goes as follows:
> 
> * If the buffer is mutable, then the allocator assumes it hasn't been shared across threads, so it returns a reference to a mutable affix.
> 
> * If the buffer is const, then the allocator must conservatively assume it might have been immutable and subsequently shared among threads. Therefore, several threads may request the affix of the same buffer simultaneously. So it returns a reference to a shared affix.
> 
> * If the buffer is shared, then the allocator assumes again several threads may access the affix so it returns a reference to a shared affix.

Thank you for moving forward with this. I haven't yet had chance to play with the code but it looks like a good start. Some minor notes on implementation (I have also discussed it a bit with deadalnix in IRC):

- cache thrashing for shared allocations can be reduced by aligning
those on cache line size (for thread-local ones it is not needed)
- considering this API is going to be dangerously @system, possible
corruption of metadata is indeed very scary - though not a blocker
within my own value list
- I wonder if would make for a cleaner design to have distinct allocator
instances for thread-local and shared memory instead of branching on
qualifiers of requested type

Right now the main concern to me is how such design is going to interact with existing code like `assumeUnique`. With GC which treats all memory similarly it is legal (and somewhat common) to allocate mutable non-shared data first and only later cook it into immutable via `assumeUnique`. But if allocator has to treat such memory differently (i.e. add barriers/atomics  for non-TLS one) such code will start silently introduce major bugs.

Sadly, don't have any specific suggeestions about all of this yet.
February 12, 2016
On 02/12/2016 08:02 PM, tsbockman wrote:
> What about providing a way to mark a piece of data as thread-local at
> allocation time?

Yah, could be part of the metadata. Then after inspecting that bit, shared can be casted away. I consider that an optimization that doesn't add or remove generality. -- Andrei
February 12, 2016
On 02/12/2016 08:24 PM, Dicebot wrote:
> - considering this API is going to be dangerously @system, possible
> corruption of metadata is indeed very scary - though not a blocker
> within my own value list

I don't think this is an argument worth minding, either.

> - I wonder if would make for a cleaner design to have distinct allocator
> instances for thread-local and shared memory instead of branching on
> qualifiers of requested type

TypedAllocator does exactly that. Not difficult to integrate here, but the key here is removing immutability without subverting the type system (which is a distinct matter).

> Right now the main concern to me is how such design is going to interact
> with existing code like `assumeUnique`. With GC which treats all memory
> similarly it is legal (and somewhat common) to allocate mutable
> non-shared data first and only later cook it into immutable via
> `assumeUnique`. But if allocator has to treat such memory differently
> (i.e. add barriers/atomics  for non-TLS one) such code will start
> silently introduce major bugs.

Folks who defined refcounted types use the special AffixAllocator internally in an encapsulated manner. They define it, they use it - outside intervention is not possible. Those who use their own allocation calls and then use assumeUnique cannot use it on the new types. Or maybe I'm not understanding something.


Andrei

« First   ‹ Prev
1 2 3 4 5 6