September 24, 2013
24-Sep-2013 19:56, Andrei Alexandrescu wrote:
> On 9/24/13 1:46 AM, Dmitry Olshansky wrote:
>> 23-Sep-2013 03:49, Andrei Alexandrescu wrote:
>> Do you imagine Typed allocators as something more than adapters that
>> simplify a common pattern of allocate + emplace / destroy + deallocate?
>> (create!T/destroy!T)
>
> Well as a simple matter tracing allocators will have to store dynamic
> type information (just like the current GC does).
>
> Also, I hope we'll be able to allow allocators to define Pointer(T),
> Ref(T) etc. that serve as replacements for the built-in notions of
> pointer, reference etc. Then, user code that uses these notions instead
> of the built-in ones will be guaranteed some nice properties (e.g.
> automatic reference counting). Unfortunately I don't see a way for an
> allocator to enforce that its users don't do illegal things such as
> escaping addresses etc. So it would be a discipline-backed scheme.
> Notable beneficiaries will be containers.

Indeed, I thought of Ref!T and Pointer!T for ARC. And just like you said, I do not see a way of avoiding potential abuse. But it is a desirable power nonetheless.
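
For illustration only, here is the kind of thing I have in mind - a bare-bones Ref!T sketch with plain malloc/free standing in for the allocator (names and layout are made up, error handling omitted):

import core.stdc.stdlib : malloc, free;
import std.conv : emplace;

// Minimal reference-counted handle; a sketch, not a proposed API.
struct Ref(T)
{
    private static struct Payload { size_t count; T value; }
    private Payload* p;

    static Ref make(Args...)(Args args)
    {
        Ref r;
        r.p = cast(Payload*) malloc(Payload.sizeof);
        r.p.count = 1;
        emplace(&r.p.value, args);
        return r;
    }

    this(this) { if (p) ++p.count; }

    ~this()
    {
        if (p && --p.count == 0)
        {
            destroy(p.value);
            free(p);
        }
    }

    ref T get() { return p.value; }
}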

>> First things first - can't an allocator return alignment as a run-time
>> value - a property (just like 'available' does)? The implementation
>> contract is that it must be an O(1), vanilla, syscall-free function.
>> (Read this as: consult system info exactly once, at construction).
>
> It could, but as I mentioned to Manu - at this level any cost is
> significant. Even changing from a compile-time constant to a global
> static dynamically-initialized constant has a cost. Making alignment an
> instance variable turns stateless allocators into stateful ones.

Hm most could just get away with an enum. Am I missing something?

>> Thinking more about it - could it also be an immutable size_t? Then it
>> gets the proper value at construction and is never changed afterwards.
>
> I'm hoping to get away with a static function "static uint alignment()"
> that (a) is CTFEable for most allocators/platforms, (b) uses a
> dynamically-initialized static constant for a few. Then derived
> allocators will inherit CTFEability.

A check for CTFE-ability might do the trick. The run-time-dependent ones would usually hit a wall immediately in the form of a syscall or a WinAPI call.

> Would a per-allocator-type alignment suffice?

I see alignment as an inherent property of the way an allocator acquires memory. Sad but true, one can't determine the actual alignment of some important ones at CTFE. Even worse, adapters would in turn depend on these system-specific run-time values such as half a page, 2 cache lines, etc.

auto my2PageAligner = AligningAllocator(page_size*2, rootAllocator);

where page_size is calculated elsewhere.
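
To make it concrete, a rough sketch of such an adapter carrying a run-time alignment (the Parent API is assumed, deallocation bookkeeping is omitted, and alignment must be a power of two):

struct AligningAllocator(Parent)
{
    size_t alignment;   // run-time instance value, e.g. page_size * 2
    Parent* parent;

    void[] allocate(size_t n)
    {
        // Over-allocate so an aligned sub-block of length n always fits.
        auto raw = parent.allocate(n + alignment);
        if (raw is null) return null;
        auto addr = (cast(size_t) raw.ptr + alignment - 1) & ~(alignment - 1);
        return (cast(void*) addr)[0 .. n];
    }
}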


>>> * deallocate(b) deallocates the memory allocated for b. b must have been
>>> previously allocated by the same allocator. This method is usually
>>> unsafe (but there are notable implementations that may offer safety,
>>> such as unbounded freelists.) This method is optional.
>>
>> Does the following implication hold "have a deallocate" --> must be
>> manually managed? Otherwise how would one reliably work with it and not
>> leak? This brings us to traits that allocators may (should) have
>> a-la automatic? free-all on termination? Zeros on allocate (more common
>> than one would think)? etc.
>
> The global GC does offer manual deallocation. It's the presence of
> collect() that indicates tracing abilities. If deallocateAll() is
> present, user code must assume it will be called during destruction.
>

I was looking into how some library would have to tweak its code based on the presence/absence of some methods. For instance, if there is collect, it doesn't have to call deallocate (even if present). If not, then it has to call deallocate (again, if present). Ditto with deallocateAll, with or without explicit deallocate.
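
Roughly the kind of compile-time branching I mean (a sketch; only the method names follow your proposal):

import std.traits : hasMember;

void release(Allocator, T)(ref Allocator a, T* p, void[] block)
{
    static if (hasMember!(Allocator, "collect"))
    {
        // Tracing allocator: nothing to call here, collect() reclaims later.
    }
    else static if (hasMember!(Allocator, "deallocate"))
    {
        destroy(*p);
        a.deallocate(block);
    }
    else
    {
        // Only deallocateAll (or nothing): memory goes away in bulk when
        // the allocator itself is torn down.
        destroy(*p);
    }
}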

> That said, I've also been thinking of adding a variety of Boolean
> telling properties. "Zeros on allocate" has an important relationship
> with safety of higher-level objects - zeros offer "not really
> initialized but without forged pointers so memory-safe" objects.

Great.

>>> * deallocateAll() deallocates in one shot all memory previously
>>> allocated by this allocator. This method is optional, and when present
>>> is almost always unsafe (I see no meaningful @safe implementation.)
>>> Region allocators are notable examples of allocators that define this
>>> method.
>>
>> Okay... I presume region "mark" works by splitting off a subinstance of
>> allocator and "release" by deallocateAll().
>
> Nice, didn't think of this.

I made it up 'cause I thought you had thought of this ;)
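
Something along these lines, just to sketch the idea (alignment handling omitted):

// A bump-the-pointer region where "mark" splits off a sub-region over the
// unused tail and "release" is deallocateAll.
struct Region
{
    private ubyte* begin, cur, end;

    this(void[] store)
    {
        begin = cur = cast(ubyte*) store.ptr;
        end = cur + store.length;
    }

    void[] allocate(size_t n)
    {
        if (cur + n > end) return null;
        auto b = cur[0 .. n];
        cur += n;
        return b;
    }

    // "mark": a sub-region owning the yet-unused tail of this one.
    Region mark() { return Region(cur[0 .. end - cur]); }

    // "release": everything allocated from this (sub)region is dropped.
    void deallocateAll() { cur = begin; }
}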

[snip]

>> To the point - we have address space and optionally memory committed in
>> there, leading us to three states of an _address range_:
>> available, reserved, committed. Ranges in the 2nd state are dirt cheap
>> (abundant on 64bit) and could be easily flipped to committed and back.
>>
>> So I have a pattern I want to try to fit in your design. At present it
>> is a data structure + allocation logic. What I would want is clean and
>> nice composition for ultimate reuse.
> [snip]
>
> Interesting. I'll be looking forward to what you come up with. Yes, for
> large, coarse-granularity allocations, tapping into OS' paging
> capabilities is the best way to go.
>

Would love to do just that.. once we figure out some minor details.

>> Now how to fit this monster into the current API...
>> Plainly I need a way to tell hey Mr. Allocator, RESERVE me 1GB of RAM
>> but don't chew on it, please. In other words - allocate needs something
>> like a "guaranteed capacity of the block". By default it could be ==
>> size.
>>
>> struct Allocator
>> {
>> ...
>> // allocate size bytes within a block of no less than capacity bytes.
>> void[] allocate(size_t size, size_t capacity=size);
>> ...
>> }
>>
>> Or maybe somehow add a reserve method explicitly...
>
> Assuming you have an optional two-argument allocate() primitive, once
> you get the initial allocation, how do you commit it? By calling
> expand() I assume? Would that be enough?

expand should do the trick. I think that nicely saves on the primitive count. But we may have to add a 'shrink' if you are so bent on never decreasing the size in expand :)
And ... reallocate doesn't cut it as long as there is no big red tape on it that says: when decreasing the size, reallocation is ALWAYS in-place (that might not be true for some allocators - e.g. realloc doesn't guarantee even this IIRC).

>
> In the reserve()-based API - how would that work? Does reserve return a
> void[] with size zero and then you expand() it?

With reserve, curiously, one has to return a pointer to where the block is reserved _and_ the size of the reserved range. Yes, one can't touch said memory range yet, so an actual size of 0 is assumed implicitly.

void[] reserve(size_t maxSize);

Usage is like this:

auto range = reserve(1024*1024*1024);
auto memory = range[0..0];
auto capacity = range.length;
expand(memory, someSize, someSize);
...

The other option is to be "honest" and return a tuple of (pointer, capacity) rather than a slice. It's still a bit hairy, but then perhaps it has to be.
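
For reference, this is roughly how reserve/commit maps to POSIX calls (VirtualAlloc with MEM_RESERVE/MEM_COMMIT being the Windows analogue); a sketch only, the names are made up:

version (Posix)
{
    import core.sys.posix.sys.mman;

    // Reserve address space without committing memory.
    void[] reserve(size_t maxSize)
    {
        // PROT_NONE: address range only, nothing accessible (committed) yet.
        auto p = mmap(null, maxSize, PROT_NONE, MAP_PRIVATE | MAP_ANON, -1, 0);
        if (p == MAP_FAILED) return null;
        return (cast(ubyte*) p)[0 .. maxSize];
    }

    // Commit (make accessible) the first `size` bytes of a reserved range.
    bool commit(void[] range, size_t size)
    {
        return mprotect(range.ptr, size, PROT_READ | PROT_WRITE) == 0;
    }
}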

-- 
Dmitry Olshansky
September 24, 2013
On 9/24/13 11:20 AM, Dan Schatzberg wrote:
> On Tuesday, 24 September 2013 at 17:38:34 UTC, Dan Schatzberg wrote:
>> What is your objective though? Aren't you trying to define a hierarchy
>> of allocators where more specific allocators can be composed from
>> general ones? In which case what is the concern with including
>> locality at the base level? It seems to be one characteristic of
>> memory that programmers might be concerned with and rather trivially
>> you can compose a non-locality aware allocator from a locality aware
>> allocator.
>
> I'll elaborate on this point a bit just to make sure I'm clear. I
> understand your initial post here to be asking if the allocator
> interface you designed is comprehensive enough that all other allocators
> (within reason) could be constructed from it.

That is correct.

> Whether or not it is inefficient to pass alignment, or locality domain
> arguments to an allocate call seems orthogonal to whether or not one can
> construct an allocator which does not take such arguments from one that
> does.

That is correct too. A fat API for a core allocator does make everybody's life harder because they need to support it (e.g. by passing calls transparently or with tweaks). I think this will become a bit clearer after I publish the code.

> As I'm sure your original design is intended for, one can imagine
> allocators which have a fixed size (constructed from the original
> allocator you proposed). The allocate path can then very efficiently
> hand off memory from a free-list or some other efficient structure. One
> can make analogous allocators that do not allow for dynamic alignment or
> NUMA locality.
>
> The only reasonable argument against seems to be whether or not the
> flexibility is worth the semantic overhead of a larger hierarchy which
> is a completely reasonable and subjective argument to be had.

Yah, that sounds about right.


Thanks,

Andrei

September 24, 2013
On 9/24/13 1:02 PM, Dmitry Olshansky wrote:
> 24-Sep-2013 19:56, Andrei Alexandrescu wrote:
>> It could, but as I mentioned to Manu - at this level any cost is
>> significant. Even changing from a compile-time constant to a global
>> static dynamically-initialized constant has a cost. Making alignment an
>> instance variable turns stateless allocators into stateful ones.
>
> Hm most could just get away with an enum. Am I missing something?

Right now I do things like:

struct DopeAllocator(Allocator, Dope)
{
    static assert(Allocator.alignment >= Dope.alignof,
        "DopeAllocator does not work with allocators offering a smaller"
        " alignment than the dope alignment.");
    static assert(!hasElaborateDestructor!Dope,
        "DopeAllocator: ellaborate destructors for Dope not implemented.");

    enum alignment = Dope.alignof;

    ...
}

Such code would be significantly more involved if Allocator had the freedom to define alignment as either an enum, a static method, or a member function.

(Background: DopeAllocator!(A, D) allocates an object of type D just before every allocation and offers user code access to it. Allocations are made using A.)

>> I'm hoping to get away with a static function "static uint alignment()"
>> that (a) is CTFEable for most allocators/platforms, (b) uses a
>> dynamically-initialized static constant for a few. Then derived
>> allocators will inherit CTFEability.
>
> A check for CTFE-ability might do the trick. The run-time-dependent ones
> would usually hit a wall immediately in the form of a syscall or a WinAPI call.

It does. It's just some more work. Guess I'd need to put it in.
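
Something along these lines, I guess (sketch):

template hasStaticAlignment(Allocator)
{
    // True iff Allocator.alignment can be evaluated at compile time
    // (an enum, or a static function that happens to be CTFE-able).
    enum hasStaticAlignment = is(typeof({ enum a = Allocator.alignment; }));
}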

>> Would a per-allocator-type alignment suffice?
>
> I see alignment as an inherent property of the way an allocator acquires
> memory.

The question remains, is it a property of the allocator _object_ or a property of the allocator _type_? I.e. do we need to support allocators that store alignment as a nonstatic member variable?
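
If both forms were allowed, client code would need branching like this all over the place (sketch):

size_t alignmentOf(Allocator)(ref Allocator a)
{
    static if (__traits(compiles, { enum x = Allocator.alignment; }))
        return Allocator.alignment;   // per-type: enum or CTFE-able static
    else
        return a.alignment;           // per-instance run-time member
}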

> Sad but true, one can't determine the actual alignment of some
> important ones at CTFE. Even worse, adapters would in turn depend on
> these system-specific run-time values such as half a page, 2 cache lines,
> etc.
>
> auto my2PageAligner = AligningAllocator(page_size*2, rootAllocator);
>
> where page_size is calculated elsewhere.

That can be made to work; it's just that page_size would be a static variable that the user must set (not a template argument).
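
Sketch of what I mean (Posix shown; Windows would use GetSystemInfo, and whether a module constructor or the user fills it in is a detail):

__gshared size_t page_size;   // set once before main(), then read-only

shared static this()
{
    version (Posix)
    {
        import core.sys.posix.unistd;
        page_size = cast(size_t) sysconf(_SC_PAGESIZE);
    }
}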

>> The global GC does offer manual deallocation. It's the presence of
>> collect() that indicates tracing abilities. If deallocateAll() is
>> present, user code must assume it will be called during destruction.
>>
>
> I was looking into how some library would have to tweak its code based
> on the presence/absence of some methods. For instance, if there is collect,
> it doesn't have to call deallocate (even if present). If not, then it has
> to call deallocate (again, if present). Ditto with deallocateAll, with or
> without explicit deallocate.

Yah, my std.allocator prototype has a bunch of such logic.

> expand should do the trick. I think that nicely saves on the primitive
> count. But we may have to add a 'shrink' if you are so bent on never
> decreasing the size in expand :)
> And ... reallocate doesn't cut it as long as there is no big red tape on
> it that says: when decreasing the size, reallocation is ALWAYS in-place
> (that might not be true for some allocators - e.g. realloc doesn't
> guarantee even this IIRC).

OK, I'm willing to add the optional method shrink(void[], size_t newSize) if expand() and realloc() are insufficient. Indeed realloc() does not guarantee immovability, and we also shouldn't because Mallocator must be compatible with the C API.



Andrei
September 24, 2013

On 24.09.2013 21:36, Andrei Alexandrescu wrote:
> On 9/24/13 11:28 AM, Rainer Schuetze wrote:
>>
>>
>> expand is nice, but I don't like minDelta and maxDelta as arguments. On
>> a shared allocator this can lead to undesired side-effects, i.e. when
>> this function is called concurrently on the same block by two threads,
>> the net result will be the sum of the deltas of the two threads, while
>> later synchronization on the memory block might only allow one thread to
>> actually use the extended memory area. This can happen with gc_extend in
>> the current GC when appending to a shared array.
>>
>> I'd prefer the function to actually pass the desired sizes:
>>
>> void[] expand(void[] b, size_t minSize, size_t maxSize);
>
> I don't understand this argument. These parameters can be trivially
> computed from the others, after which locking etc. proceeds just the same.
>

Taking the current array implementation as an example, the deltas are computed before the actual GC lock happens inside gc_extend which means that the second of two concurrent requests leads to overallocation.


> FWIW I chose the "delta"-based signature because with it the
> precondition is:
>
> assert(minDelta <= maxDelta);
>
> In fact behavior can be easily defined at no cost even if this
> precondition is not satisfied. With the "absolute"-based signature the
> precondition is:
>
> assert(minSize <= maxSize && b.length <= minSize);
>
> which causes confusion - people will pass small sizes to expand()
> expecting it to contract, something that expand() can't support as a
> matter of principle (safety!!!).
>

Why would you expect a function "expand" to reduce the memory size?

September 24, 2013
On 9/24/13 2:02 PM, Rainer Schuetze wrote:
>
>
> On 24.09.2013 21:36, Andrei Alexandrescu wrote:
>> On 9/24/13 11:28 AM, Rainer Schuetze wrote:
>>>
>>>
>>> expand is nice, but I don't like minDelta and maxDelta as arguments. On
>>> a shared allocator this can lead to undesired side-effects, i.e. when
>>> this function is called concurrently on the same block by two threads,
>>> the net result will be the sum of the deltas of the two threads, while
>>> later synchronization on the memory block might only allow one thread to
>>> actually use the extended memory area. This can happen with gc_extend in
>>> the current GC when appending to a shared array.
>>>
>>> I'd prefer the function to actually pass the desired sizes:
>>>
>>> void[] expand(void[] b, size_t minSize, size_t maxSize);
>>
>> I don't understand this argument. These parameters can be trivially
>> computed from the others, after which locking etc. proceeds just the
>> same.
>>
>
> Taking the current array implementation as an example, the deltas are
> computed before the actual GC lock happens inside gc_extend which means
> that the second of two concurrent requests leads to overallocation.

(I'm confused - which array implementation?) Does this qualify as an implementation bug, or are you referring to the case where the void[] being reallocated is being shared?

>> FWIW I chose the "delta"-based signature because with it the
>> precondition is:
>>
>> assert(minDelta <= maxDelta);
>>
>> In fact behavior can be easily defined at no cost even if this
>> precondition is not satisfied. With the "absolute"-based signature the
>> precondition is:
>>
>> assert(minSize <= maxSize && b.length <= minSize);
>>
>> which causes confusion - people will pass small sizes to expand()
>> expecting it to contract, something that expand() can't support as a
>> matter of principle (safety!!!).
>>
>
> Why would you expect a function "expand" to reduce the memory size?

Ask Dmitry :o). Far as I can tell he assumed so.


Andrei

September 24, 2013

On 24.09.2013 23:05, Andrei Alexandrescu wrote:
>> Taking the current array implementation as an example, the deltas are
>> computed before the actual GC lock happens inside gc_extend which means
>> that the second of two concurrent requests leads to overallocation.
>
> (I'm confused - which array implementation?)

I mean the dynamic array implementation in rt.lifetime.

> Does this qualify as an
> implementation bug, or are you referring to the case where the void[]
> being reallocated is being shared?

Yes, I'm referring to a shared void[], e.g. strings, which use immutable shared buffers that you can still append to if you are not stomping on other data.

Appending to it does not need a lock for the whole operation, but only an atomic operation when adding to the stored size (though the implementation uses a synchronized block to change the size).
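
Roughly like this (a sketch, names made up): claim the bytes by bumping the stored size with a CAS, then write into the claimed slice without holding a lock.

import core.atomic : atomicLoad, cas;

bool tryAppend(ref shared size_t used, size_t capacity, size_t n,
               out size_t offset)
{
    for (;;)
    {
        auto old = atomicLoad(used);
        if (old + n > capacity)
            return false;             // caller must extend/reallocate first
        if (cas(&used, old, old + n))
        {
            offset = old;             // caller now owns [offset .. offset + n)
            return true;
        }
    }
}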
September 24, 2013
On Tue, 24 Sep 2013 08:29:39 -0700
Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> wrote:

> On 9/23/13 11:32 PM, Jacob Carlborg wrote:
> > On 2013-09-23 19:53, Andrei Alexandrescu wrote:
> >> I think this is debatable. For one, languages such as Java and C++ still have built-in "new" but quite ubiquitously recommend against its use in user code. Far as I can tell that's been a successful campaign.
> >
> > Since when is it _not_ recommended to use "new" in Java?
> 
> http://goo.gl/KVmAom
> 

Uhh, that's JavaScript, not Java. From what I can tell, the
arguments seem to be very specific to JavaScript, i.e., centering around
prototype inheritance and implicit declarations.

Paulo's comment about preferring dependency injection makes sense (Although they'd still get new'ed anyway, just at some other point, right?). Were there other reasons?

September 24, 2013
> I would rather want new to be overloadable and have 2 sets of parameters
>
> new (allocator)(arg1, arg2)
>
> Where "allocator" would go to the overloaded version of new and "arg1" and "arg2" will be passed to the constructor.

+1
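
In library form this could look roughly like the following (create and the allocate(size_t) signature are assumptions, not an agreed API; value types only for brevity):

import std.conv : emplace;

T* create(T, Allocator, Args...)(ref Allocator allocator, Args args)
{
    auto m = allocator.allocate(T.sizeof);
    if (m is null) return null;
    return emplace(cast(T*) m.ptr, args);   // construct in allocator memory
}

// usage: auto p = create!SomeStruct(myAllocator, arg1, arg2);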
September 24, 2013
> Also, how does it work with your deallocate interface? Suppose I request a 0x100-aligned block of 0x100 bytes. Your alignment allocator requests 0x200 from the GC, which maybe returns [0xffff0040-0xffff0240] and then returns an aligned buffer from that [0xffff0100-0xffff0200]. Later, I try to deallocate that buffer, which means your alignment allocator has to deallocate the original buffer. Where does the alignment allocator store the original GC-provided buffer ptr + size? It cannot be determined from the buffer I gave it.

It can store it just before the address it returns. I have an example
of this at https://github.com/jerro/pfft/blob/multidim/pfft/clib.d#L46 .
If the deallocate method takes the size of the returned block as a
parameter in addition to the pointer, the original pointer could
also be stored right after the end of the returned block.
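
In a nutshell, the "pointer just before the returned block" trick looks like this (a sketch; power-of-two alignment assumed):

import core.stdc.stdlib : free, malloc;

void* alignedAllocate(size_t size, size_t alignment)
{
    auto raw = malloc(size + alignment + (void*).sizeof);
    if (raw is null) return null;
    auto addr = cast(size_t) raw + (void*).sizeof;
    addr = (addr + alignment - 1) & ~(alignment - 1);
    (cast(void**) addr)[-1] = raw;   // remember where the block really starts
    return cast(void*) addr;
}

void alignedDeallocate(void* p)
{
    if (p) free((cast(void**) p)[-1]);
}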
September 24, 2013
25-Sep-2013 01:05, Andrei Alexandrescu wrote:
> On 9/24/13 2:02 PM, Rainer Schuetze wrote:
[snip]
>> Why would you expect a function "expand" to reduce the memory size?
>
> Ask Dmitry :o). Far as I can tell he assumed so.
>

I had read it as "reallocate in place". Yeah, my "semantic reading" must suck, though it usually speeds things up :o)

> Andrei
>


-- 
Dmitry Olshansky