June 26, 2013
26-Jun-2013 21:35, Dicebot wrote:
> By the way, while this topic gets some attention, I want to make a
> note that there are actually two orthogonal entities that arise when
> speaking about configurable allocation - allocators themselves and
> global allocation policies. I think good design should address both of those.
>

Sadly I believe that global allocators would still have to be compatible with GC (to not break code in hard to track ways) thus basically being a GC. Hence we can easily stop talking about them ;)



-- 
Dmitry Olshansky
June 26, 2013
On Wed, 26 Jun 2013 16:30:50 +0200,
Robert Schadek <realburner@gmx.de> wrote:

> 
> >> Imagine we have two delegates:
> >>
> >> void* delegate(size_t);  // this one allocs
> >> void delegate(void*);    // this one frees
> >>
> >> you pass both to a function that constructs your object. The first is
> >> used for allocating the
> >> memory, the second gets attached to the TypeInfo and is used by the GC
> >> to free
> >> the object.
> >

Does it mean 16 extra bytes for every allocation?

-- 
Marco

June 26, 2013
26-Jun-2013 21:23, H. S. Teoh wrote:

>> Both suffer from
>> a) being totally unsafe and in fact bug prone since all references
>> obtained in there are now dangling (and there is no indication where
>> they came from)
>
> How is this different from using malloc() and free() manually? You have
> no indication of where a void* came from either, and the danger of
> dangling references is very real, as any C/C++ coder knows. And I assume
> that *some* people will want to be defining custom allocators that wrap
> around malloc/free (e.g. the game engine guys who want total control).

Why the heck do you people think I propose to use malloc directly as an alternative to whatever hackish allocator stack was proposed?

Use the darn container. For starters I'd make allocation strategy a parameter of each container. At least they do OWN memory.

Then refactor out common pieces into a framework of allocation helpers. Personally, in the end I'd separate concerns into 3 entities:

1. Memory area objects - think of them as allocators but without the circuitry to do the allocation; e.g. a chunk of memory returned by malloc/alloca can be wrapped into a memory area object.

2. Allocators (Policies) - a potentially nested combination of such "circuitry" that makes use of memory areas: free-lists, pools, stacks etc. Safe ones have ref-counting on memory areas, unsafe ones don't. (Though safety largely depends on the way you got that chunk of memory.)

3. Containers/Wrappers - objects, as above, that handle the life-cycle of their contents and make use of allocators. In fact allocators are part of the container's type, but memory area objects are not. A rough sketch of the split is below.
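Roughly, and purely as an illustration (none of these names exist anywhere; alignment and error handling are ignored for brevity):

import core.stdc.stdlib : malloc, free;

// 1. Memory area object: owns a raw chunk, no allocation circuitry.
struct MallocArea {
    void[] chunk;

    static MallocArea acquire(size_t size) {
        return MallocArea(malloc(size)[0 .. size]);
    }

    void release() {
        free(chunk.ptr);
        chunk = null;
    }
}

// 2. Allocation policy: bump-the-pointer "circuitry" over *any* memory
// area handed to it at run time (malloc'ed, alloca'ed, pooled, ...).
struct StackPolicy {
    void[] area;
    size_t used;

    void[] allocate(size_t n) {
        if (used + n > area.length) return null;
        auto blk = area[used .. used + n];
        used += n;
        return blk;
    }
}

// 3. Container: the policy is part of the type, the memory area is not.
struct SList(T, Policy) {
    static struct Node { T value; Node* next; }
    Policy* policy;
    Node* head;

    void insertFront(T value) {
        auto node = cast(Node*) policy.allocate(Node.sizeof).ptr;
        *node = Node(value, head);
        head = node;
    }
}

The same SList could then be tried with a free-list policy or a GC-backed one without touching the container code itself.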


>
>> b) imagine you need to use an allocator for a stateful object. Say
>> forward range of some other ranges (e.g. std.regex) both
>> scoped/stacked to allocate its internal stuff. 2nd one may handle it
>> but not the 1st one.
>
> Yeah this is a complicated area. A container basically needs to know how
> to allocate its elements. So somehow that information has to be
> somewhere.
>
>
>> c) transfer of objects allocated differently up the call graph
>> (scope graph?), is pretty much neglected I see.
>
> They're incompatible. You can't safely make a linked list that contains
> both GC-allocated nodes and malloc() nodes.

What I mean is that if the types are the same as the built-ins, it would be a horrible mistake. If not, then we are talking about containers anyway.
And if these have a ref-counted pointer to their allocator, then the whole thing is safe, albeit at the cost of performance.

Sadly, alias this to some built-in (e.g. a slice) allows squirreling away the underlying reference too easily.
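A contrived illustration of the kind of escape I mean (ManualArray here is made up):

import core.stdc.stdlib : malloc, free;

struct ManualArray {
    int[] data;
    alias data this;   // convenient, but hands out the raw slice

    this(size_t n) { data = (cast(int*) malloc(n * int.sizeof))[0 .. n]; }
    void release() { free(data.ptr); data = null; }
}

int[] escaped;
void consume(int[] slice) { escaped = slice; }  // happily keeps the built-in slice

void use() {
    auto a = ManualArray(8);
    consume(a);     // implicit conversion through alias this
    a.release();
    // escaped now dangles, and nothing at the call site hinted at the transfer
}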

As such I don't believe in either of these 2 *lies*:
a) built-ins can be refurbished to use custom allocators
b) we can add opSlice/alias this or whatever to our custom type to get access to the underlying built-ins safely and transparently

Both are just nuclear bombs waiting for a good time to explode.

> That's just a bomb waiting
> to explode in your face. So in that sense, Adam's idea of using a
> different type for differently-allocated objects makes sense.

Yes, but one should be careful here so as not to have an exponential explosion in code size. So some allocators have to be compatible, and if there is a way to transfer ownership that'd be bonus points (and a large pot of these, mind you).

> A
> container has to declare what kind of allocation its members are using;
> any other way is asking for trouble.

Hence my thought to move this piece of "circuitry" into the containers proper. The whole idea that by swapping malloc for myMalloc you can switch to a wildly different allocation scheme doesn't quite hold.

I think it may be interesting to try and put the "wall" in a different place, namely between the allocation strategy and the memory areas it works on.


>> I'm kind of wondering how our knowledgeable community has come to this.
>> (must have been starving w/o allocators way too long)
>
> We're just trying to provoke Andrei into responding. ;-)
>
>
Cool, then keep it coming but ... safety and other holes have to be taken care of.

> [...]
>> IMHO the only place for allocators is in containers; other kinds of
>> code may just ignore allocators completely.
>
> But some people clamoring for allocators are doing so because they're
> bothered by Phobos using ~ for string concatenation, which implicitly
> uses the GC. I don't think we can just ignore that.

~= would work with any sensible array-like container.
~ is sadly only a convenience for scripts and/or apps that aren't performance- (or determinism-) critical.
>
>
>> std.algorithm and friends should imho be customized on 2 things only:
>>
>> a) containers to use (instead of array)
>> b) optionally a memory source (or allocator) if the container is
>> temporary (scoped), to tie its life-time to smth.
>>
>> Want temporary stuff? Use temporary arrays, hashmaps and whatnot
>> i.e. types tailored for a particular use case (e.g. with a
>> temporary/scoped allocator in mind).
>> These would all be unsafe though. Alternative is ref-counting
>> pointers to an allocator. With word on street about ARC it could be
>> nice direction to pursue.
>
> Ref-counting is not fool-proof, though. There's always cycles to mess
> things up.

Surely you shouldn't have allocators reference each other cyclically? Then I see this as a DAG with the allocator at the bottom and objects referencing it.

>
>
>> Allocators (as Andrei points out in his video) have many kinds:
>> a) persistence: infinite, manual, scoped
>> b) size: unlimited vs fixed
>> c) block-size: any, fixed, or *any* up to some maximum size
>>
>> Most of these ARE NOT interchangeable!
>> Yet some are composable; however, I'd argue that allocators are not
>> composable but have some reusable parts that in turn are composable.
>
> I was listening to Andrei's talk this morning, but I didn't quite
> understand what he means by composable allocators. Is he talking about
> nesting, say, a GC inside a region allocated by a region allocator?

I'd say something like: a fixed-size region allocator with the GC as fallback.
Or a pool for small allocations + malloc/free with a free-list for bigger allocations, etc. And the stuff should be as easy to compose as it is to list.
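Something in the spirit of this sketch (nothing like it exists in Phobos today; alignment and thread safety are ignored):

// A fixed-size region that punts to a fallback when it can't serve a request.
struct Region(size_t size) {
    ubyte[size] store;
    size_t used;

    void[] allocate(size_t n) {
        if (used + n > size) return null;    // "can't serve this one"
        auto blk = store[used .. used + n];
        used += n;
        return blk;
    }

    bool owns(void[] blk) {
        auto p = cast(const(ubyte)*) blk.ptr;
        return p >= store.ptr && p < store.ptr + size;
    }

    void deallocate(void[] blk) {}           // a region frees everything at once
}

struct GCAllocator {
    void[] allocate(size_t n) {
        import core.memory : GC;
        return GC.malloc(n)[0 .. n];
    }
    void deallocate(void[] blk) {}           // let the GC collect it
}

// The composition itself: try Primary first, fall back to Secondary.
struct Fallback(Primary, Secondary) {
    Primary primary;
    Secondary secondary;

    void[] allocate(size_t n) {
        auto blk = primary.allocate(n);
        return blk !is null ? blk : secondary.allocate(n);
    }

    void deallocate(void[] blk) {
        if (primary.owns(blk)) primary.deallocate(blk);
        else secondary.deallocate(blk);
    }
}

alias RegionWithGCFallback = Fallback!(Region!4096, GCAllocator);

A pool for small requests plus malloc/free with a free-list for bigger ones would slot into the same Fallback shell.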

>> Code would have to cater for specific flavors of allocators still,
>> so we'd better reduce this problem to the selection of containers.
> [...]
>
> Hmm. Sounds like we have two conflicting things going on here:
>
> 1) En masse replacement of gc_alloc/gc_free in a certain block of code
> (which may be the entire program), e.g., for the avoidance of GC in game
> engines, etc.. Basically, the code is allocator-agnostic, but at some
> higher level we want to control which allocator is being used.

There is no allocator-agnostic code that allocates. It either happens to call free/dispose/destroy manually (or implicitly with ref-counts) or it does not. It either escapes references to who knows where or it doesn't.

>
> 2) Specific customization of containers, etc., as to which allocator(s)
> should be used, with (hopefully) some kind of support from the type
> system to prevent mistakes like dangling pointers, escaping references,
> etc.. Here, the code is NOT allocator-agnostic; it has to be written
> with the specific allocation model in mind. You can't just replace the
> allocator with another one without introducing bugs or problems.

With another one of the same _kind_ I'd say.

> These two may interact in complex ways... e.g., you might want to use
> malloc to allocate a pool, then use a custom gc_alloc/gc_free to
> allocate from this pool in order to support language built-ins like ~
> and ~= without needing to rewrite every function that uses strings.

I guess we have to re-write them. Or not allocate in string functions.


> Maybe we should stop conflating these two things so that we stop
> confusing ourselves, and hopefully it will be easier to analyse
> afterwards.
>


-- 
Dmitry Olshansky
June 26, 2013
On 06/26/2013 10:06 PM, Marco Leise wrote:
> Does it mean 16 extra bytes for every allocation?
>
Yes, or wrap it, and you have 4 or 8 bytes, but yes, you would have to save it somewhere.
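(For reference, one way to read those numbers on a 64-bit target - just a sketch, not any concrete proposal:)

alias FreeDg = void delegate(void*);
// FreeDg.sizeof is 16 on 64-bit (context pointer + function pointer),
// so storing the free delegate with every allocation costs those 16 bytes.

// "Wrapping" = each allocation stores only a pointer to a shared descriptor:
struct FreeInfo {
    FreeDg free;        // lives once per allocator, not once per allocation
}

struct AllocHeader {
    FreeInfo* info;     // 8 bytes per allocation on 64-bit, 4 on 32-bit
}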
June 26, 2013
On Wednesday, 26 June 2013 at 19:40:54 UTC, Dmitry Olshansky wrote:
> Sadly I believe that global allocators would still have to be compatible with GC (to not break code in hard to track ways) thus basically being a GC. Hence we can easily stop talking about them ;)

Nice way to say "we don't really need those embedded, kernel and gamedev guys". GC, as a safe and obvious approach, should be the default, but druntime needs to provide means for tight and dangerous control upon explicit request.
June 26, 2013
27-Jun-2013 00:53, Dicebot wrote:
> On Wednesday, 26 June 2013 at 19:40:54 UTC, Dmitry Olshansky wrote:
>> Sadly I believe that global allocators would still have to be
>> compatible with GC (to not break code in hard to track ways) thus
>> basically being a GC. Hence we can easily stop talking about them ;)
>
> Nice way to say "we don't really need those embedded, kernel and gamedev
> guys". GC, as a safe and obvious approach, should be the default, but
> druntime needs to provide means for tight and dangerous control upon
> explicit request.

Just don't use certain built-ins. Stub them out at run-time if you like. The only problematic point I see is closures allocated on the heap.

Frankly, I see embedded, kernel and gamedev guys using ref-counting and custom data structures all the time. They all want that level of control and determinism anyway, or are so resource-constrained that a GC is too much code space or run-time overhead.

Needless to say, a custom run-time for the first 2 categories is required anyway, so just hack druntime. It would be nice to have hooks readily available (and documented?) to do so, but hardly anything beyond that.

-- 
Dmitry Olshansky
June 26, 2013
On Wednesday, 26 June 2013 at 21:00:54 UTC, Dmitry Olshansky wrote:
> Needless to say, a custom run-time for the first 2 categories is required anyway, so just hack druntime. It would be nice to have hooks readily available (and documented?) to do so, but hardly anything beyond that.

It is an API issue. Hacking druntime is, unfortunately, inevitable, but keeping the ability to swap those two with no code changes simplifies the development process and makes it less tempting to forget about this use case when doing std lib / runtime stuff - it has been a second-class citizen for a rather long time.
June 26, 2013
On Wednesday, 26 June 2013 at 21:00:54 UTC, Dmitry Olshansky wrote:
> Just don't use certain built-ins. Stub them out at run-time if you like. The only problematic point I see is closures allocated on the heap.

Actually, I was kinda sorta able to solve this in my minimal d.

// this would be used for automatic heap closures, but there's no way to free it...
///*
extern(C)
void* _d_allocmemory(size_t bytes) {
        auto ptr = manual_malloc(bytes);
        debug(allocations) {
                char[16] buffer;
                write("warning: automatic memory allocation ", intToString(cast(size_t) ptr, buffer));
        }
        return ptr;
}


struct HeapClosure(T) if(is(T == delegate)) {
        mixin SimpleRefCounting!(T, q{
                char[16] buffer;
                write("\nfreeing closure ", intToString(cast(size_t) payload.ptr, buffer),"\n");
                manual_free(payload.ptr);
        });
}

HeapClosure!T makeHeapClosure(T)(T t) { // if(__traits(isNested, T)) {
        return HeapClosure!T(t);
}



void closureTest2(HeapClosure!(void delegate()) test) {
        write("\nptr is ", cast(size_t) test.ptr, "\n");
        test();

        auto b = test;
}

void closureTest() {
        string a = "whoa";
        scope(exit) write("\n\nexit\n\n");
        //throw new Exception("test");
        closureTest2( makeHeapClosure({ write(a); }) );
}




It worked in my toy tests. The trick, though, would be to never store or use a non-scope built-in delegate. Using RTInfo, I believe I can statically verify you don't do this in the whole program, but I haven't actually tried yet.


I also left built-in append unimplemented, but did custom types with ~= that are pretty convenient. Binary ~ is a loss though; too easy to lose pointers with that.
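Roughly what those ~= types boil down to (a trimmed sketch; plain malloc/free stand in here for the minimal runtime's manual_malloc/manual_free):

import core.stdc.stdlib : malloc, free;

// Appendable buffer with ~= but deliberately no binary ~, so the owning
// reference can't silently multiply.
struct ManualBuffer(T) {
    private T* data;
    private size_t len, cap;

    void opOpAssign(string op : "~")(T value) {
        if (len == cap) {
            cap = cap ? cap * 2 : 8;
            auto p = cast(T*) malloc(cap * T.sizeof);
            p[0 .. len] = data[0 .. len];
            if (data) free(data);
            data = p;
        }
        data[len++] = value;
    }

    T[] opSlice() { return data[0 .. len]; }   // a view; don't keep it past release()

    void release() {
        if (data) free(data);
        data = null;
        len = cap = 0;
    }
}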
June 26, 2013
27-Jun-2013 01:05, Adam D. Ruppe wrote:
> On Wednesday, 26 June 2013 at 21:00:54 UTC, Dmitry Olshansky wrote:
>> Just don't use certain built-ins. Stub them out at run-time if you
>> like. The only problematic point I see is closures allocated on the heap.
>
> Actually, I was kinda sorta able to solve this in my minimal d.
>
> // this would be used for automatic heap closures, but there's no way to
> free it...

[snip a cool hack]

Yeah, I suspected something like this might work. Basically defining your own ref-counted closure type and forgoing the delegate keyword in your codebase (except in the file that defines the heap closure). That still leaves chasing down code like auto dg = (...){ ... } though.

Maybe having it as a template Closure!(ret-type, arg-types...)
and an instantiator function called simply closure could be more
aesthetically pleasing (this is IMHO).
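Something like this, building on the HeapClosure/makeHeapClosure above (just a sketch):

// Closure!(ret-type, arg-types...) names the wrapper without spelling out
// the delegate type at the use site.
template Closure(R, Args...) {
    alias Closure = HeapClosure!(R delegate(Args));
}

// Instantiator, so call sites read closure(...) and the delegate keyword
// stays confined to the file defining HeapClosure.
auto closure(T)(T dg) if (is(T == delegate)) {
    return makeHeapClosure(dg);
}

// usage:
//   auto c = closure(delegate void() { write(a); });
//   closureTest2(c);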

> It worked in my toy tests. The trick would be though to never store or
> use a non-scope builtin delegate. Using RTInfo, I believe I can
> statically verify you don't do this in the whole program,  but haven't
> actually tried yet.
>
>
> I also left built in append unimplemented, but did custom types with ~=
> that are pretty convenient. Binary ~ is a loss though, too easy to lose
> pointers with that.


-- 
Dmitry Olshansky
June 26, 2013
So to try some ideas, I started implementing a simple container with replaceable allocators: a singly linked list.

All was going kinda well until I realized the forward range it offers to iterate its contents makes it possible to escape a reference to a freed node.

auto range = list.range;
auto range2 = range;
range.removeFront();

range2 now refers to a freed node. Maybe the nodes could be refcounted, though a downside there is that even the range won't be sharable; it would be a different type based on the allocation method. (I was hoping to make the range a sharable component, even as the list itself changed type with allocators.)

I guess we could @disable copy construction, and make it a forward range instead of an input one, but that takes some of the legitimate usefulness away.

Interestingly though, opApply would be ok here, since all it would expose is the payload.
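Something like this, assuming the usual head/Node layout inside the list (a sketch):

// Iteration hands out only the payload, never a Node*:
int opApply(scope int delegate(ref T) dg) {
    for (auto n = head; n !is null; n = n.next) {
        if (auto r = dg(n.value))
            return r;
    }
    return 0;
}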

(though if the payload is a reference type, does the container take ownership of it? How do we indicate that? Perhaps more interestingly, how do we indicate the /lack/ of ownership at the transfer point?)



This is all fairly easy if we just decide "we're going to do this with GC" or "we're going to do this C style" and do the whole program like that, libraries and all. But trying to mix and match just gets more complicated the more I think about it :( It makes the question of "allocators" look trivial.