September 24, 2013
On 9/23/13 5:12 PM, deadalnix wrote:
> On Monday, 23 September 2013 at 17:16:40 UTC, Andrei Alexandrescu wrote:
>> The singleton allocator "it" is instrumental for supporting stateful
>> and stateless allocators with ease. It took me a few iterations to get
>> to that, and I'm very pleased with the results. I think it would be
>> difficult to improve on it without making other parts more difficult.
>>
>
> The allocator can use a singleton internally without using "it". I'm
> unclear what is the problem solved.

Composability. My first design had stateful allocators define regular methods and stateless (better said, global) allocators define static methods. That seemed to work well, but then allocators composed with them also had to define static or non-static methods depending on whether the parent allocators had state.

Using a singleton instance for stateless allocators has simplified things somewhat: everybody defines regular methods, and allocators that are stateless additionally define the singleton instance.
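To illustrate, a minimal sketch (not the actual std.allocator code; FallbackAllocator here is just an invented example of a composing allocator):

struct Mallocator  // stateless: ordinary methods plus the singleton "it"
{
    enum alignment = real.alignof;

    void[] allocate(size_t s)
    {
        import core.stdc.stdlib : malloc;
        auto p = malloc(s);
        return p is null ? null : p[0 .. s];
    }

    void deallocate(void[] b)
    {
        import core.stdc.stdlib : free;
        free(b.ptr);
    }

    static Mallocator it;   // the singleton instance
}

// Composition is uniform: the parent is held by value and used through the
// same ordinary methods, whether it carries state or not.
struct FallbackAllocator(Primary, Fallback)
{
    Primary primary;
    Fallback fallback;

    void[] allocate(size_t s)
    {
        auto b = primary.allocate(s);
        return b !is null ? b : fallback.allocate(s);
    }
}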


Andrei
September 24, 2013
On Monday, 23 September 2013 at 03:58:51 UTC, Walter Bright wrote:
> On 9/22/2013 4:49 PM, Andrei Alexandrescu wrote:
>> Destroy!
>
> It should be compatible with:
>
> http://wiki.dlang.org/DIP46
>
> or the DIP should be adaptable to work with std.allocator.

It seems to me like DIP46 attempts to address Manu's concerns, and is (mostly) orthogonal to Andrei's designs.

I would like the current GC to be defined in terms of Andrei's new API.  It should be the one that handles all allocations by default at program start.

I have a desire for region-based allocation, and in this context I want to be able to push a region allocator onto the thread's "allocator stack" and thus make all D allocations (new's, string concats, etc) use the region allocator.  So I pretty much want DIP46, except I want "pushAllocator(...)" and "popAllocator()" instead of "gc_push()" and "gc_pop()".  The region allocator would also be defined according to Andrei's API.

By default, the region allocator should request giant chunks of memory from the GC (or whatever is below it in the thread's allocator stack).  This should be manually adjustable so that I can point it at a malloc-forwarding allocator (Mallocator) instead.

The allocator stack at program start should probably look like this:

[GCAllocator] <- current
[Mallocator]
<system>

My server might look like this:

// ----------------------
module main;
import core.thread;
import thread_main;

void main(string[] args)
{
	... initialization/finalization (thanks scope(exit)!) ...

	while ( true )
	{
		if ( !handshake(connection, ...) )
			continue;

		auto worker = new Thread(&threadMain);
		worker.start();
	}
}

// ----------------------
module thread_main;

import std.allocator;

void threadMain()
{
	// Construct a region allocator that uses the nearest
	// Mallocator to allocate regions.
	auto regionAllocator = new RegionAllocator(getMallocator());

	// Use it by default for all of this thread's allocations.
	pushAllocator( regionAllocator );

	scope(exit)
	{
		// Free 'foo' and everything still allocated by
		// the call to doStuff().
		regionAllocator.deallocateAll();

		// Return control to the GC.
		popAllocator();
	}

	auto foo = new ubyte[40]; // Region allocated.

	doStuff();
}

/// Gets the nearest Mallocator in the allocator stack, if it's
/// available.  Returns the allocator stack's front if there is
/// none.
Mallocator getMallocator()
{
	auto allocatorStackCopy = allocatorStack.save;
	auto result = allocatorStackCopy.front;
	while ( !allocatorStackCopy.empty )
	{
		// I just realized that allocators should be a class
		//   type and not a struct type, otherwise this function
		//   doesn't work!
		// <failure>
		// if (is(typeof(allocatorStackCopy.front) == Mallocator))
		// </failure>
		
		auto mallocator =
			cast(Mallocator)allocatorStackCopy.front;
		
		if ( mallocator !is null )
			result = mallocator;

		allocatorStackCopy.popFront(); // advance, or this loops forever
	}

	return result;
}

// ----------------------

Now, once inside doStuff(), the allocator stack looks like so:

[RegionAllocator]  -.  <- Current
[GCAllocator]       |
[Mallocator] <------'
<system>

// ----------------------

I think I might have managed to demonstrate that Manu's concerns (and DIP46) have possible implications for Andrei's design: allocators need dynamic dispatch to be reassignable at runtime, and classes with inheritance seem like a natural choice for this.  I like my no-cost abstractions though, so I hate to admit the dynamic nature of this.  Is there a better way?
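For what it's worth, one possible compromise, purely a sketch (IAllocator and AllocatorObject are names I'm making up here): keep the zero-cost struct allocators, and wrap whichever one gets pushed onto the thread's stack in a thin class adapter, so dynamic dispatch only happens at the stack boundary.

// Hypothetical dynamic-dispatch facade over the static allocator API.
interface IAllocator
{
    void[] allocate(size_t size);
    void deallocate(void[] block);
}

// Wraps any struct allocator A behind the interface.
class AllocatorObject(A) : IAllocator
{
    A impl;   // the no-cost struct allocator does the actual work

    void[] allocate(size_t size)  { return impl.allocate(size); }
    void deallocate(void[] block) { impl.deallocate(block); }
}

// The thread's allocator stack would then traffic only in IAllocator, so
// pushAllocator/popAllocator can reassign it at runtime, while code that
// statically knows its allocator keeps the template path.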

I definitely don't want 3rd-party code to be able to bypass my custom allocators by default at compile time (or link time).

Small aside: I don't like that statelessOp has to be nothrow.  Is there a reason we can't pre-allocate a fixed amount of memory for exception objects, including at least one "can't allocate a proper exception" fatal exception, and then throw the pre-allocated NoMemoryForException exception if we don't have enough memory to handle the throws that may occur?
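Something like the following sketch is what I have in mind (names are invented for illustration):

import core.exception : OutOfMemoryError;

class MyException : Exception
{
    this(string msg) { super(msg); }
}

// Built once, up front, so it can be thrown even when allocation fails.
private MyException noMemoryForException;

static this()
{
    noMemoryForException =
        new MyException("out of memory while constructing an exception");
}

void raise(string msg)
{
    try
        throw new MyException(msg);    // normal path: allocates the object
    catch (OutOfMemoryError)
        throw noMemoryForException;    // fallback: the pre-allocated one
}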
September 24, 2013
On 24 September 2013 09:21, Andrei Alexandrescu < SeeWebsiteForEmail@erdani.org> wrote:

> On 9/23/13 3:12 PM, Peter Alexander wrote:
>
>> On Monday, 23 September 2013 at 21:27:36 UTC, Andrei Alexandrescu wrote:
>>
>>> That allocator would allocate more memory (I suspect there's a gcd calculation somewhere :o)) and then adjust the starting pointer of the allocated block to reach the requested alignment.
>>>
>>
>> That's quite an inefficient use of memory for large alignment sizes.
>>
>
> I don't see a way around it unless the allocator natively provides a large alignment size, which it would then advertise in its "alignment" constant. The problem is independent of this particular design.


There's an important distinction between an allocator advertising a large
alignment (which it always applies), and an allocator being capable of
allocating a larger aligned block when requested.
Surely the advertised alignment is the alignment you will get in all cases,
even if you request a smaller alignment.
If over-allocation is your solution, you should advertise a minimum and a
maximum alignment.

>> Also, how does it work with your deallocate interface? Suppose I request
>> an 0x100 aligned block of 0x100 bytes. Your alignment allocator requests 0x200 from the GC, which maybe returns [0xffff0040-0xffff0240] and then returns an aligned buffer from that [0xffff0100-0xffff0200]. Later, I try to deallocate that buffer, which means your alignment allocator has to deallocate the original buffer. Where does the alignment allocator store the original GC-provided buffer ptr + size? It cannot be determined from the buffer I gave it.
>>
>
> Walter noted the same issue. I'd need to think about it. A simple solution is to simply store the size just before the pointer returned, in a dope block manner.


This means that if the allocator returns memory which already has the
desired alignment, you need to waste an entire aligned block just to store
the offset, which would have been 0.
I solve this problem by storing my allocation table in a separate hash
table, and entries in that table record the size and the alignment
offset. But it's not a good solution; it's still wasteful. It's just the
best option you have if you don't have any control over the system
allocator.
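Roughly like this, for illustration (invented names, not code from my allocator; it still over-allocates, the side table just replaces the dope block):

// Alignment must be a power of two in this sketch.
struct AligningAllocator(Parent)
{
    Parent parent;
    private void[][void*] original;   // aligned ptr -> parent's real block

    void[] allocate(size_t size, size_t alignment)
    {
        auto raw = parent.allocate(size + alignment);   // still over-allocates
        if (raw is null) return null;
        auto addr = (cast(size_t) raw.ptr + alignment - 1) & ~(alignment - 1);
        auto aligned = (cast(void*) addr)[0 .. size];
        original[aligned.ptr] = raw;   // remember the real block in the table
        return aligned;
    }

    void deallocate(void[] b)
    {
        if (auto p = b.ptr in original)
        {
            parent.deallocate(*p);
            original.remove(b.ptr);
        }
    }
}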

I also use large alignments, which I've detailed in prior discussions.
Common ones are the cache line (~64-256 bytes) and the GPU memory page (~4k).
You can't go wasting GPU memory by over-allocating every block.
It's definitely important that allocators are able to receive an alignment
request, and are given the opportunity to fulfill a dynamic alignment
request without always resorting to an over-allocation strategy.

>>> I'd need a handful of good examples where the alignment must be known
>>> at runtime. Can you link to some?
>>>
>>
>> mprotect requires a pointer that is a multiple of the system page size, which isn't constant.
>>
>> http://linux.die.net/man/2/mprotect
>>
>
> Thanks! I see posix_memalign is a corresponding call that returns a page-aligned chunk of memory. I assume calls like mmap also return page-aligned chunks. Indeed it is a problem for the current design that the alignment must be known during compilation.
>
>
>> Reading a file without buffering on Windows requires that you align the
>> destination buffer on volume sector size boundaries, which is not known at compile time (although many just assume 4096).
>>
>
>
>> http://msdn.microsoft.com/en-us/library/windows/desktop/cc644950(v=vs.85).aspx
>>
>
> I see there's a difference between x86 and Itanium in page size...
>
> I'm trying to lean toward handling these as rare/exotic cases without outright eliminating them, while providing simple and efficient handling of alignment for the frequent cases (i.e. allocate ints at addresses multiple of 4 etc).
>
>
>> As I mentioned previously, you may want to also align to the cache line
>> size (for performance). This is not known at compile time (although
>> again, is often assumed in practice).
>>
>
> Indeed.
>
>
> Andrei
>
>


September 24, 2013
On 9/23/13 9:56 PM, Manu wrote:
> You can't go wasting GPU memory by overallocating every block.

Only the larger chunk may need to be overallocated if all allocations are then rounded up.

> It's definitely important that allocators are able to receive an
> alignment request, and give them the opportunity to fulfill a dynamic
> alignment request without always resorting to an over-allocation strategy.

I'd need a bit of convincing. I'm not sure everybody needs to pay for the needs of a few, and it is quite possible that malloc_align suffers from the same fragmentation issues as the next guy. Also, there's always the possibility of leaving some bits to lower-level functions.


Andrei

September 24, 2013
On 24 September 2013 03:53, Andrei Alexandrescu < SeeWebsiteForEmail@erdani.org> wrote:

> On 9/23/13 8:32 AM, Manu wrote:
>
>> On 23 September 2013 23:58, Andrei Alexandrescu
>> <SeeWebsiteForEmail@erdani.org> wrote:
>>
>> This is a good first step though, I'm happy to discuss this, but I think
>> discussion about the practical application may also reveal design
>> details at this level.
>>
>
> Absolutely. This is refreshing since I've gone through the cycle of "types must be integrated into allocators and with the global GC" ... -> ... "untyped allocators can actually be extricated" once. It is now great to revisit the assumptions again.
>
> One matter I'm rather surprised didn't come up (I assumed everyone would be quite curious about it) is that the allocator does not store the allocated size, or at least does not allow it to be discovered easily. This is a definitely appropriate topic for the design at hand.
>
> The malloc-style APIs go like:
>
> void* malloc(size_t size);
> void free(void*);
>
> Note that the user doesn't pass the size to free(), which means the allocator is burdened with inferring the size of the block from the pointer efficiently. Given that most allocators make crucial strategic choices depending on the size requested, this API shortcoming is a bane of allocator writers everywhere, and a source of inefficiency in virtually all contemporary malloc-style allocators.
>
> This is most ironic since there is next to nothing that user code can do with a pointer without knowing the extent of memory addressed by it. (A notable exception is zero-terminated strings, another questionable design decision.) It all harkens back to Walter's claim of "C's biggest mistake" (with which I agree) of not defining a type for a bounded memory region.
>
> Upon studying a few extant allocators and D's lore, I decided that in D things have evolved sufficiently to have the user pass the size upon deallocation as well:
>
> void[] allocate(size_t size);
> void deallocate(void[] buffer);
>
> This is because the size of D objects is naturally known: classes have it in the classinfo, slices store it, and the few cases of using bald pointers for allocation are irrelevant and unrecommended.
>
> This all makes memory allocation in D a problem fundamentally simpler than in C, which is quite an interesting turn :o).


Yeah, that's definitely cool.
I had previously just presumed the allocator would stash the size somewhere
along with its allocation record (as usual), but this does potentially
simplify some allocators.

>> No, I certainly understand you mean 'them', but you lead to what I'm
>> asking, how do these things get carried/passed around. Are they discreet, or will they invade argument lists everywhere?
>>
>
> Since there's a notion of "the default" allocator, there will be ways to push/pop user-defined allocators that temporarily (or permanently) replace the default allocator. This stage of the design is concerned with allowing users to define such user-defined allocators without having to implement them from scratch.


I get that about this stage of design. But my question was about the next
stage, that is, usage/management of many instances of different allocators,
how they are carried around, and how they are supplied to things that want
to use them.
Prior comments lead me to suspect you were in favour of exclusive usage of
this new API, and that the 'old' new keyword would continue to exist just to
allocate GC memory. At least that's the impression I got.
This leads to a presumption that parameter lists will become clogged with
allocators if they must be micro-managed, and references to allocators will
end up all over the place. I'm not into that. I want to know how it will
look in practice.

I appreciate you think that's off-topic, but I find it pretty relevant.

>> Sure, but the common case is that the author will almost certainly use
>> keyword 'new'. How can I affect that as a 3rd party?
>> This would require me overriding the global allocator somehow... which
>> you touched on earlier.
>>
>
> The way I'm thinking of this is to install/uninstall user-defined allocators that will satisfy calls to new. Since not all allocators support tracing, some would require the user to deallocate stuff manually.
>

So we need to keep delete then? I also think it may remain useful for allocators that don't clean up automatically.

>>     The library wouldn't need to worry as there would be the notion of a
>>     default allocator (probably backed by the existing GC).
>>
>> Right. So it's looking like like the ability to override the global allocator is a critical requirement.
>>
>
> Again, this is difficult to handle in the same conversation as "should we
> pass size upon deallocation". Yes, we need road laws, but it's hard to talk
> about those and engine lubrication in the same breath. For me,
> "critical" right now is to assess whether the untyped API misses an
> important category of allocators, what safety level it has (that's why
> "expand" is so different from "realloc"!) etc.
>

Then we should fork the thread? Or I'll just wait for the next one? I'm happy to do that, but as I said before, I suspect 'that' discussion will have impact on this one though, so however awkward, I think it's relevant.

>>         And does that mean that applications+libraries are required to
>>         ALWAYS
>>         allocate through given allocator objects?
>>
>>
>>     Yes, they should.
>>
>>
>> Then we make keyword 'new' redundant?
>>
>
> Probably not. Typed allocators will need to integrate with (and occasionally replace) the global shared GC.
>

'Typed allocators'? Are they somehow fundamentally separate from this
discussion?
They seem like just a layer above which handles construct/destruct/casts.
Do you anticipate the typed allocation interface to be significantly more
than just a light wrapper?

>>     The problem is that installing an allocator does not get to define
>>     what a pointer is and what a reference is.
>>
>> Why not?
>>
>
> Because that requires a language change. I'm not sure you realize but you move the goalposts all the time. We were talking within the context of libraries and installing allocators dynamically and all of a sudden you get to change what the compiler thinks a pointer is.
>

Deprecating new is definitely a language change.

You said (down lower) "have allocators define their own pointers and reference types"... so _you_ said about changing what the compiler thinks a pointer is, that wasn't my idea, I am just trying to roll with that train of thought, and consider possibilities.

I don't think it's fair to accuse me of moving goal posts when it's
absolutely not clear where they are to begin with. I made the first reply
in this thread, at which point there was no conversation to go from.
I haven't declared any goal posts I'm aware of. I'm just humoring ideas,
and trying to adapt my thinking to your responses, and consider how it
affects my requirements. I'm also trying to read into vague comments about
practical usage that I can't properly visualise without examples.
This is a very important topic to me long-term. It has been the source of
innumerable headaches and messes throughout my career. D needs to get this
right.

I'm equally confused by your changing position on whether new is important,
should be deprecated, will support global 'new' overrides, assuming that
delete doesn't exist (rules out non-collecting allocators in conjunction
with the keyword allocators), allocation through an allocator object should
be encouraged as standard in user code, typed allocators will need to
integrate with (and occasionally replace) the global GC, or not.
Your position hasn't been static, I'm just trying to work out what it is,
and then consider if I'm happy with it.

You're implying I have a fixed position, I don't (I do have a few implementation requirements I think are important though, however they end out being expressed), I had no idea at all what you had in mind at the start of the thread, I'm just trying to get clarification, and generally throwing ideas in the pot.

>>     What I do hope to get to is to have allocators define their own
>>     pointers and reference types. User code that uses those will be
>>     guaranteed certain allocation behaviors.
>>
>>
>> Interesting, will this mangle the pointer type, or the object type being pointed to? The latter is obviously not desirable. Does the former actually work in theory?
>>
>
> I don't think I understand what you mean. Honest, it seems to me you're confused about what you want, how it can be done, and what moving pieces are involved.
>

Apparently I don't understand what you mean. What does "have allocators
define their own pointers and reference types" mean then?
I presumed you mean that an allocator would have some influence on the type
of the pointers they allocate, then it can be known how to handle them
throughout their life.

> One example is Rust, which defines several pointer types to implement its
> scoping and borrowing semantics. I think it's not easy to achieve its purported semantics with fewer types, and time will tell whether programmers will put up with the complexity for the sake of the corresponding benefits.
>

I don't think Rust allows user-defined allocators to declare new pointer types, does it?

>> I think a critical detail to keep in mind, is that (I suspect) people
>> simply won't use it if it doesn't interface with keyword 'new'.
>>
>
> I think this is debatable. For one, languages such as Java and C++ still have built-in "new" but quite ubiquitously unrecommend their usage in user code. Far as I can tell that's been a successful campaign.
>

I'm not sure what you mean... if by success you mean people don't use new
in C++, then yes, I have observed that pattern, and it's an absolute
catastrophe in my experience.
Every company/code base I've ever worked on has had some stupid new macro
they roll themselves. And that always leads to problems when interacting
with 3rd party libraries, and generic code in C++ can't work when there are
lots of different ways to allocate memory.

> For two, there are allocations that don't even entail calls to new, such as
> array concatenation.


This is a very important area of discussion; how will user-defined
allocators fit into implicit allocations?
Shall we split off a few threads so they can be considered 'on topic'?
There are a lot of things that need discussion.

>> It's extremely common that I want to enforce that a library exist
>> entirely within a designated heap. It can't fall back to the global GC.
>>
>> I work on platforms where memory is not unified. Different resources need to go into different heaps.
>>
>
> The proposed API would make that entirely possible. You seem to care only with the appendix "and mind you I only want to use new for all of those!" which is a bit out there. But nevertheless good feedback.


You often "quote" people, but totally paraphrase them such that it's
nothing like what they said.
I'm sure I've made the point a whole bunch of times that I care only
that "libraries don't hard-code an allocator which cannot be overridden by
the application that uses them".
That may sound like "I want to use new for all those", because libraries
invariably use new, and I can't control that. The reason for that is very
likely nothing more than 'because it's a keyword'; it's also convenient,
and perceived as the standard/proper way to allocate memory.
I think that's water well under the bridge, probably in the ocean by now;
new is out there, and I don't think it can be revoked.
So as far as I see it, unless you can present a practical solution to
deprecate it, and a path to migrate existing 'new' calls, then new must be
a part of the solution here; it must be overridable.

That seems to be the direction this thread is going anyway, so I'll drop this thread now. But you can't accuse me of changing my story when I'm only trying to clarify your story in the first place, which also seems to have changed (in favour of keeping new judging by the direction of the thread, and raising DIP46).


September 24, 2013
On 24 September 2013 15:31, Andrei Alexandrescu < SeeWebsiteForEmail@erdani.org> wrote:

> On 9/23/13 9:56 PM, Manu wrote:
>
>> You can't go wasting GPU memory by overallocating every block.
>>
>
> Only the larger chunk may need to be overallocated if all allocations are then rounded up.


I don't follow.
If I want to allocate 4k aligned, then 8k will be allocated (because it
wants to store an offset).
Any smaller allocation, say 16 bytes, will round up to 4k. You can't
waste precious GPU RAM like that.

A minimum and a maximum (guaranteed without over-allocating) alignment may
be useful.
But I think allocators need to be given the opportunity to do the best they
can.

>> It's definitely important that allocators are able to receive an
>> alignment request, and give them the opportunity to fulfill a dynamic alignment request without always resorting to an over-allocation strategy.
>>
>
> I'd need a bit of convincing. I'm not sure everybody needs to pay for a few, and it is quite possible that malloc_align suffers from the same fragmentation issues as the next guy. Also, there's always the possibility of leaving some bits to lower-level functions.


What are they paying exactly? An extra arg to allocate that can probably be
defaulted?
  void[] allocate(size_t bytes, size_t alignment = this.alignment) shared;

Or is it the burden of adding the over-allocation boilerplate logic to each
allocator, for simple allocators that don't want to deal with alignment in a
conservative way?
I imagine that could possibly be automated; the boilerplate could be
provided as a library.

void[] allocate(size_t size, size_t alignment)
{
  size_t allocSize =
    std.allocator.getSizeCompensatingForAlignment(size, alignment);

  void[] mem = ...; // allocation logic using allocSize

  // adjusts the range, and maybe writes the offset to the bytes before it
  return std.allocator.alignAllocation(mem, alignment);
}


September 24, 2013
On 2013-09-23 19:53, Andrei Alexandrescu wrote:

> I talked Walter's ear off a while ago at an ACCU conference about the
> notion that reference counting could be a switch passed to the compiler.
> Recently he's authored a DIP about the compiler inserting refcounting
> primitives in the generated code. Unfortunately I think the DIP is
> awfully incomplete and superficial (to the extent it would bring more
> damage to the language if any implementation effort would be invested in
> it at this point).
>
> Can it be done? Probably. But it would be an absolutely major effort.

I don't know if we're talking about the same things, but a couple of us had a private email conversation about implementing ARC (Automatic Reference Counting) in the compiler, as a result of my suggestion for adding support for Objective-C in D.

A DIP was never created; at least I cannot find it on the wiki. If I recall correctly, the conversation was never made public, which I think it should be.

> I think this is debatable. For one, languages such as Java and C++ still
> have built-in "new" but quite ubiquitously unrecommend their usage in
> user code. Far as I can tell that's been a successful campaign.

Since when is it _not_ recommended to use "new" in Java?

-- 
/Jacob Carlborg
September 24, 2013
On 2013-09-24 02:03, H. S. Teoh wrote:

> I thought Walter's DIP already addresses the issue of replacing the
> default allocator?
>
> 	http://wiki.dlang.org/DIP46
>
> I get the feeling that we don't have a good handle on the fundamental
> issues, though. Having a stack for managing the default allocator works
> for the simpler cases, but it's unclear how things would interact once
> you throw in other requirements, like class/struct-scoped allocators,
> block scoped allocators inside closures, user-specified allocators that
> must interact with the foregoing, etc..

Why not just let the user set the allocator explicitly, see:

http://forum.dlang.org/thread/l1nvn4$ca3$1@digitalmars.com?page=2#post-l1oqp2:2429fg:241:40digitalmars.com

-- 
/Jacob Carlborg
September 24, 2013
On 2013-09-23 22:32, Timon Gehr wrote:

> Some general remarks:
>
> One issue you will probably run into and maybe want to fix in some way
> during the typed allocator design is that private constructors cannot be
> called from templates parameterized on types.
>
> E.g.:
>
> module a;
>
> auto New(T,Args...)(Args args){
>      return new T(args);
> }
>
> // ---
>
> module b;
>
> class A{
>      private this(){}
> }
>
> void main(){
>      auto a = New!A(); // error
> }

Allocate the memory needed for T without using "new". Then a pointer can be used to bypass the protection of the constructor, something like:

extern (C) Object _d_newclass(in ClassInfo);

T New (T, Args ...) (Args args)
{
    auto object = cast(T) _d_newclass(T.classinfo);

    // use explicit type to handle overloads.
    T delegate (Args) dg = &object.__ctor;

    return dg(args);
}

-- 
/Jacob Carlborg
September 24, 2013
On 23-Sep-2013 03:49, Andrei Alexandrescu wrote:
> Hello,
>
>
> I am making good progress on the design of std.allocator, and I am
> optimistic about the way it turns out. D's introspection capabilities
> really shine through, and in places the design does feel really
> archetypal - e.g. "this is the essence of a freelist allocator". It's a
> very good feeling. The overall inspiration comes from Berger's
> HeapLayers, but D's introspection takes that pattern to a whole new level.
>
> Several details are still in flux, but at the top level it seems most
> natural to divide things in two tiers:
>
> 1. Typed allocator, i.e. every request for allocation comes with the
> exact type requested;
>
> 2. Untyped allocator - traffics exclusively in ubyte[].
>

Looks good (s/ubyte[]/void[] per current discussion).

Do you imagine typed allocators as something more than adapters that simplify a common pattern of allocate + emplace / destroy + deallocate? (create!T/destroy!T)

> struct NullAllocator
> {
>      enum alignment = real.alignof;
>      enum size_t available = 0;
>      ubyte[] allocate(size_t s) shared { return null; }
>      bool expand(ref ubyte[] b, size_t minDelta, size_t maxDelta) shared
>      { assert(b is null); return false; }
>      bool reallocate(ref ubyte[] b, size_t) shared
>      { assert(b is null); return false; }
>      void deallocate(ubyte[] b) shared { assert(b is null); }
>      void collect() shared { }
>      void deallocateAll() shared { }
>      static shared NullAllocator it;
> }
>
> Primitives:
>

First things first - can't the allocator return alignment as a run-time value - a property (just like 'available' does)? The implementation contract is that it must be a plain O(1), syscall-free function. (Read this as: consult system info exactly once, at construction.)

Thinking about it more - it could also be an immutable size_t? Then it gets the proper value at construction and is never changed afterwards.
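Something like this sketch, with invented names, just to show the shape:

struct MmapAllocator
{
    immutable size_t alignment;   // fixed at construction, never changes

    this(size_t pageSize)         // e.g. the value of sysconf(_SC_PAGESIZE)
    {
        alignment = pageSize;
    }

    @property size_t available() const { return size_t.max; }

    // allocate/deallocate would call mmap/munmap and honour `alignment`.
}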

> * expand(b, minDelta, maxDelta) grows b's length by at least minDelta
> (and on a best-effort basis by at least maxDelta) and returns true, or
> does nothing and returns false. In most allocators this should be @safe.
> (One key insight is that expand() can be made @safe whereas shrink() or
> realloc() are most often not; such mini-epiphanies are very exciting
> because they put the design on a beam guide with few principles and many
> consequences.) The precondition is that b is null or has been previously
> returned by allocate(). This method is optional.

Great to see this primitive. Classic malloc-ators are so lame...
(+ WinAPI Heaps fits here)

> * deallocate(b) deallocates the memory allocated for b. b must have been
> previously allocated by the same allocator. This method is usually
> unsafe (but there are notable implementations that may offer safety,
> such as unbounded freelists.) This method is optional.

Does the following implication hold: "has a deallocate" --> must be manually managed? Otherwise how would one reliably work with it and not leak? This brings us to traits that allocators may (should) have,
a la: automatic? free-all on termination? zeroes on allocate (more common than one would think)? etc.
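Most of these could be expressed as introspection-based traits, e.g. (a sketch, names invented):

// Compile-time capability queries over any allocator type.
enum hasDeallocateAll(A) = __traits(hasMember, A, "deallocateAll");
enum hasCollect(A)       = __traits(hasMember, A, "collect");

// Example: a region-style allocator has deallocateAll but no collect.
struct Region { void deallocateAll() {} }
static assert(hasDeallocateAll!Region && !hasCollect!Region);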

> * deallocateAll() deallocates in one shot all memory previously
> allocated by this allocator. This method is optional, and when present
> is almost always unsafe (I see no meaningful @safe implementation.)
> Region allocators are notable examples of allocators that define this
> method.

Okay.. I presume region "mark" works by splitting off a subinstance of the allocator, and "release" by deallocateAll().

>
> * collect() frees unused memory that had been allocated with this
> allocator. This optional method is implemented by tracing collectors and
> is usually @safe.
>

This seems hard and/or suboptimal w/o typeinfo --> typed allocators? I guess they could be more than a simple helper.

> There are quite a few more things to define more precisely, but this
> part of the design has become quite stable. To validate the design, I've
> defined some simple allocators (Mallocator, GCAllocator, Freelist,
> StackRegion, Region etc) but not the more involved ones, such as
> coalescing heaps, buddy system etc.

The problem I see is that allocators are rooted in a poor foundation - malloc is so out of fashion (its interface is simply too thin on guarantees), and sbrk is frankly a stone-age syscall.

I would love to make the root allocator one built on top of mmap/VirtualAlloc.
This leads me to my biggest problem with classical memory management - ~20 years of PCs having virtual memory support directly in the CPU, and yet hardly anything outside of the OS takes advantage of it. Allocators seem simply not aware of it - nothing in their interfaces implies anything that would help the user take advantage of it. I'm talking about (potentially) large allocations that may often be resized.

(Large allocations are actually almost always a blind spot/fallback in allocators.)

To the point - we have address space, and optionally memory committed within it, leading to three states of an _address range_:
available, reserved, committed. Ranges in the second state are dirt cheap (abundant on 64-bit) and can easily be flipped to committed and back.

So I have a pattern I want to try to fit into your design. At present it is a data structure + allocation logic. What I would want is clean and nice composition for ultimate reuse.

An example is a heavy-duty array (could be used for potentially large stack). The requirements are:
a) Never reallocate on resize - in certain cases it may even be too large to reallocate in RAM (!)
b) Never waste RAM beyond some limit (a few pages or a tiny faction of size is fine)
c) O(1) amortized appending as in plain array and other properties of an array -> it's a dynamic array after all

Currently I use mmap/madvise and related entities directly. It goes as follows.

Allocate a sufficiently large _address range_ - as large as "it" may ever get (1Mb, 1G, 10G, you name it). Call this size the 'capacity'.
Commit some memory (optionally on first resize) - call this the 'committed' amount. Appending uses up committed space as a typical array would. Whenever it needs more than that, it applies the same extending algorithm as usual, but on the (committed, capacity) pair.
Ditto on resizing back - (with some amortization) it asks the OS to decommit pages, decreasing the committed amount.

In short - a C++ "vector" that has 2 capacities (current and absolute maximum), with the bonus that it never reallocates. What to do if/when it hits the upper bound is up to the specific application (terminate, fall back to something else). It has the danger of sucking up address space on 32 bits though.. but who cares :)
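A rough Posix-only sketch of the shape of it (invented names; Windows would use VirtualAlloc/VirtualFree with MEM_RESERVE/MEM_COMMIT/MEM_DECOMMIT; page size is assumed rather than queried, and error handling is omitted):

import core.sys.posix.sys.mman;

struct ReservedBuffer
{
    private ubyte* base;       // start of the reserved (not committed) range
    private size_t capacity;   // bytes of reserved address space
    private size_t committed;  // bytes currently backed by committed pages
    private size_t length;     // bytes in use

    enum pageSize = 4096;      // assumed; query the OS in real code

    this(size_t reserveBytes)
    {
        capacity = reserveBytes;
        // Reserve address space only; PROT_NONE means no access, and no
        // physical memory is consumed until pages are committed below.
        base = cast(ubyte*) mmap(null, capacity, PROT_NONE,
                                 MAP_PRIVATE | MAP_ANON, -1, 0);
    }

    void append(const(ubyte)[] data)
    {
        auto needed = length + data.length;
        assert(needed <= capacity, "hit the reserved ceiling");
        if (needed > committed)
        {
            // Grow the committed region in place; the buffer never moves.
            auto newCommitted = ((needed + pageSize - 1) / pageSize) * pageSize;
            mprotect(base + committed, newCommitted - committed,
                     PROT_READ | PROT_WRITE);
            committed = newCommitted;
        }
        base[length .. needed] = data[];
        length = needed;
    }
}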


Now how to fit this monster into the current API...
Plainly I need a way to say: hey Mr. Allocator, RESERVE me 1Gb of RAM but don't chew on it, please. In other words - allocate needs something like a "guaranteed capacity of the block". By default it could be == size.

struct Allocator
{
    ...
    // allocate size bytes within a block of no less than capacity bytes
    void[] allocate(size_t size, size_t capacity = size);
    ...
}

Or maybe somehow add a reserve method explicitly...

Then expand() will easily do the second part of the job - in-place resize (commit/decommit as required) - and both hints make a lot of sense.

The pattern ideally would lower to: take a generic array, plug in a MemMap allocator, reserve a top amount of memory at the start ... profit!


-- 
Dmitry Olshansky