|
June 25, 2013 why allocators are not discussed here | ||||
---|---|---|---|---|
| ||||
I know Andrei mentioned he was going to work on Allocators a year ago. In DConf 2013 he described the problems he needs to solve with Allocators. But I wonder if I am missing the discussion around that - I tried searching this forum, found a few threads that were not actually a brainstorm for Allocators design.
Please point me in the right direction
or
is there a reason it is not discussed
or
should we open the discussion?
The easiest approach for Allocators design I can imagine would be to let the user specify which Allocator operator new should get the memory from (introducing a new keyword allocator). This gives total control, but assumes the user knows what he is doing.
Example:
    CustomAllocator ca;
    allocator(ca) {
        auto a = new A; // operator new will use CustomAllocator::malloc()
        auto b = new B;
        free(a); // that should call CustomAllocator::free()
        // if free() is missing for allocated area, it is a user
        // responsibility to make sure custom Allocator can handle that
    }
By default the allocator is the druntime using GC, free(a) does nothing for it.
If some library defines its allocator (e.g. specialized container), there should be the ability to:
1. override the allocator
2. get access to the allocator used
I understand that I spent 5 mins thinking about the way Allocators may look.
My point is - if somebody is working on it, can you please share your ideas? |
June 25, 2013 Re: why allocators are not discussed here | ||||
---|---|---|---|---|
| ||||
Posted in reply to cybervadim | On Tuesday, 25 June 2013 at 22:22:09 UTC, cybervadim wrote:
> (introducing a new keyword allocator)
It would be easier to just pass an allocator object that provides the necessary methods and don't use new at all. (I kinda wish new wasn't in the language. It'd make this a little more consistent.)
The allocator's create function could also return wrapped types, like RefCounted!T or NotNull!T depending on what it does.
Though the devil is in the details here and I don't think I can say more without trying to actually do it.
|
June 25, 2013 Re: why allocators are not discussed here | ||||
---|---|---|---|---|
| ||||
Posted in reply to cybervadim | On Wed, Jun 26, 2013 at 12:22:04AM +0200, cybervadim wrote:
> I know Andrei mentioned he was going to work on Allocators a year ago. In DConf 2013 he described the problems he needs to solve with Allocators. But I wonder if I am missing the discussion around that - I tried searching this forum, found a few threads that were not actually a brainstorm for Allocators design.
>
> Please point me in the right direction
> or
> is there a reason it is not discussed
> or
> should we open the discussion?
That would be nice to get things going. :)
Ever since I found D and subscribed to this mailing list, I've been hearing rumors of allocators, but they seem to be rather lacking in the department of concrete evidence. They're like the Big Foot or Swamp Ape of D. Maybe it's time we got out into the field and produced some real evidence of these mythical beasts. :-P
> The easiest approach for Allocators design I can imagine would be to let the user specify which Allocator operator new should get the memory from (introducing a new keyword allocator). This gives total control, but assumes the user knows what he is doing.
>
> Example:
>
>     CustomAllocator ca;
>     allocator(ca) {
>         auto a = new A; // operator new will use CustomAllocator::malloc()
>         auto b = new B;
>
>         free(a); // that should call CustomAllocator::free()
>         // if free() is missing for allocated area, it is a user
>         // responsibility to make sure custom Allocator can handle that
>     }
>
> By default the allocator is the druntime using GC, free(a) does nothing
> for it.
I believe the current direction is to avoid needing new language features / syntax. So the above probably won't happen.
> If some library defines its allocator (e.g. specialized container),
> there should be the ability to:
> 1. override the allocator
> 2. get access to the allocator used
>
> I understand that I spent 5 mins thinking about the way Allocators
> may look.
> My point is - if somebody is working on it, can you please share
> your ideas?
Well, thanks for getting the ball rolling. Maybe Andrei can pipe up about any experimental designs he's currently considering.
But barring that, I'm thinking about how allocators would be used in user code. I think it's pretty much a given that the C++ way of sticking it to the end of template arguments doesn't really fly: it's just too much of a hassle to keep having to worry about passing allocators around template arguments, that people just don't bother. So coming back to square one, how would allocators be used?
1) Usually, the user would just be content with the GC, and not ever have to worry about allocators. So this means that whatever allocator design we adopt, it should be practically invisible to ordinary users unless they're specifically looking to change how memory is allocated.
2) Furthermore, it's unlikely that in the same piece of code, you'd want to use 3 or 4 different allocators for different objects; while such cases may exist, it seems to me to be more likely that you want either (a) a very specific object (say a class instance or container) to use a particular allocator, or (b) you want to transitively block off an entire section of code (which may be the entire program in some cases) to use a particular allocator.
As a first stab at it, I'd say (a) can be implemented by a static class member reference to an allocator, that can be set from user code.
And maybe (b) can be implemented by making gc_alloc / gc_free overridable function pointers? Then we can override their values and use scope guards to revert them back to the values they were before. This allows us to use the runtime stack to manage which allocator is currently active. This lets *all* memory allocations be rerouted through the custom allocator without needing to hand-edit every call to new down the call graph.
This is just a very crude first stab at the problem, though. In particular, (a) isn't very satisfactory. And also the interaction of allocated objects with the call stack: if any custom-allocated objects in (b) survive past the containing function which sets/resets the function pointers, there could be problems: if a member function of such an object needs to allocate memory, it will pick up the ambient allocator instead of the custom allocator in effect when the object was first created. Also, we may have the problem of the wrong allocator being used to free the object.
Anyone has better ideas?
T
-- All problems are easy in retrospect. |
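A minimal sketch of idea (b) above, assuming a pair of thread-local function pointers that a scope guard restores on exit. The names `currentAlloc`, `currentFree`, `withAllocator` and `countingAlloc` are hypothetical and exist only for this illustration; nothing here hooks into druntime or into `new`.

```d
import core.stdc.stdlib : malloc, free;

// Hypothetical allocation hooks; druntime does not expose these names.
alias AllocFn = void* function(size_t size);
alias FreeFn = void function(void* p);

void* defaultAlloc(size_t size) { return malloc(size); }
void defaultFree(void* p) { free(p); }

// Module-level variables are thread-local by default in D,
// so each thread gets its own "ambient" allocator.
AllocFn currentAlloc = &defaultAlloc;
FreeFn currentFree = &defaultFree;

// A custom allocator that only counts calls, to make the rerouting visible.
size_t countedAllocs;

void* countingAlloc(size_t size)
{
    ++countedAllocs;
    return malloc(size);
}

// Installs the given hooks, runs `work`, and restores the previous hooks
// when the scope exits, even if `work` throws.
void withAllocator(AllocFn a, FreeFn f, scope void delegate() work)
{
    auto savedAlloc = currentAlloc;
    auto savedFree = currentFree;
    currentAlloc = a;
    currentFree = f;
    scope (exit)
    {
        currentAlloc = savedAlloc;
        currentFree = savedFree;
    }
    work();
}
```

Code that runs inside `withAllocator(&countingAlloc, &defaultFree, { ... })` and allocates through `currentAlloc` picks up the override without any changes along the call graph. The sketch also makes the caveat from the post concrete: memory obtained inside the scope but released after it would be handed to whatever `currentFree` happens to be installed at that point.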
June 25, 2013 Re: why allocators are not discussed here | ||||
---|---|---|---|---|
| ||||
Posted in reply to Adam D. Ruppe | On Wed, Jun 26, 2013 at 12:50:36AM +0200, Adam D. Ruppe wrote:
> On Tuesday, 25 June 2013 at 22:22:09 UTC, cybervadim wrote:
> >(introducing a new keyword allocator)
>
> It would be easier to just pass an allocator object that provides the necessary methods and don't use new at all. (I kinda wish new wasn't in the language. It'd make this a little more consistent.)
It's not too late to introduce a default allocator object that maps to built-in GC primitives. Maybe something like:
    struct DefaultAllocator {
        // for non-class T, new T(args) yields a T*
        T* alloc(T, A...)(A args) {
            return new T(args);
        }
        void free(T)(T* p) {
            // no-op: the GC reclaims the memory
        }
    }
We can then change Phobos to always use allocator.alloc and allocator.free, which it gets from user code somehow, and in the default case it would do the Right Thing.
> The allocator's create function could also return wrapped types, like RefCounted!T or NotNull!T depending on what it does.
So maybe something like:
    struct RefCountedAllocator {
        RefCounted!T alloc(T, A...)(A args) {
            return allocRefCounted!T(args); // placeholder helper
        }
        void free(T)(RefCounted!T rc) {
            dotDotDotMagic(rc); // placeholder for the release logic
        }
    }
etc..
> Though the devil is in the details here and I don't think I can say more without trying to actually do it.
The main issue I see is how *not* to get stuck in C++'s situation where you have to specify allocator objects everywhere, which is highly inconvenient, so people are liable to avoid using them, which defeats the purpose of having allocators.
It would be nice, IMO, if we can somehow let the user specify a custom allocator for, say, the whole of Phobos, so that people who care about this sorta thing can just replace the GC wholesale and then use Phobos to their hearts' content without having to manually specify allocator objects everywhere and risk forgetting a single case that eventually leads to memory leakage.
T
-- Computers shouldn't beep through the keyhole. |
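One way the "gets from user code somehow" part could look, as a rough sketch: a container takes the allocator as a template parameter that defaults to the GC-backed one, so ordinary users never see it. `Stack`, `Node` and `CountingAllocator` are hypothetical names made up for this example, and the template-parameter route is exactly the C++-style approach the post is wary of; it is used here only because it is the shortest thing that runs.

```d
// GC-backed default, as proposed in the post.
struct DefaultAllocator
{
    // For non-class T, new T(args) yields a T*.
    T* alloc(T, A...)(A args) { return new T(args); }
    void free(T)(T* p) { /* no-op: the GC reclaims the memory */ }
}

// Hypothetical replacement that tracks how many objects are live.
struct CountingAllocator
{
    size_t live;
    T* alloc(T, A...)(A args) { ++live; return new T(args); }
    void free(T)(T* p) { --live; }
}

struct Node(T)
{
    T value;
    Node!T* next;
}

// A toy container that routes every allocation through its allocator.
struct Stack(T, Allocator = DefaultAllocator)
{
    Allocator allocator;
    Node!T* head;

    void push(T value)
    {
        head = allocator.alloc!(Node!T)(value, head);
    }

    T pop()
    {
        assert(head !is null, "pop on empty stack");
        auto n = head;
        head = n.next;
        T v = n.value;
        allocator.free(n);
        return v;
    }
}
```

`Stack!int` behaves like any GC-backed container, while `Stack!(int, CountingAllocator)` swaps the policy without touching `push` or `pop`. The cost is that the two are different types, which is the trade-off discussed later in the thread.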
June 25, 2013 Re: why allocators are not discussed here | ||||
---|---|---|---|---|
| ||||
Posted in reply to H. S. Teoh | On Tuesday, 25 June 2013 at 22:50:55 UTC, H. S. Teoh wrote:
> On Wed, Jun 26, 2013 at 12:22:04AM +0200, cybervadim wrote:
>
> That would be nice to get things going. :)
>
> Ever since I found D and subscribed to this mailing list, I've been
> hearing rumors of allocators, but they seem to be rather lacking in the
> department of concrete evidence. They're like the Big Foot or Swamp Ape
> of D. Maybe it's time we got out into the field and produced some real
> evidence of these mythical beasts. :-P
>
> Well, thanks for getting the ball rolling. Maybe Andrei can pipe up
> about any experimental designs he's currently considering.
>
> But barring that, I'm thinking about how allocators would be used in
> user code. I think it's pretty much a given that the C++ way of sticking
> it to the end of template arguments doesn't really fly: it's just too
> much of a hassle to keep having to worry about passing allocators around
> template arguments, that people just don't bother. So coming back to
> square one, how would allocators be used?
>
> 1) Usually, the user would just be content with the GC, and not ever
> have to worry about allocators. So this means that whatever allocator
> design we adopt, it should be practically invisible to ordinary users
> unless they're specifically looking to change how memory is allocated.
>
> 2) Furthermore, it's unlikely that in the same piece of code, you'd want
> to use 3 or 4 different allocators for different objects; while such
> cases may exist, it seems to me to be more likely that you want either
> (a) a very specific object (say a class instance or container) to use a
> particular allocator, or (b) you want to transitively block off an
> entire section of code (which may be the entire program in some cases)
> to use a particular allocator.
>
> As a first stab at it, I'd say (a) can be implemented by a static class
> member reference to an allocator, that can be set from user code.
>
> And maybe (b) can be implemented by making gc_alloc / gc_free
> overridable function pointers? Then we can override their values and use
> scope guards to revert them back to the values they were before. This
> allows us to use the runtime stack to manage which allocator is
> currently active. This lets *all* memory allocations be rerouted through
> the custom allocator without needing to hand-edit every call to new down
> the call graph.
>
> This is just a very crude first stab at the problem, though. In
> particular, (a) isn't very satisfactory. And also the interaction of
> allocated objects with the call stack: if any custom-allocated objects
> in (b) survive past the containing function which sets/resets the
> function pointers, there could be problems: if a member function of such
> an object needs to allocate memory, it will pick up the ambient
> allocator instead of the custom allocator in effect when the object was
> first created. Also, we may have the problem of the wrong allocator
> being used to free the object.
>
> Anyone has better ideas?
>
>
> T
From my experience, all objects may be divided into 2 categories:
1. temporaries. Programs usually have some kind of event loop. During one iteration of this loop some temporary objects are created and then discarded. This is the ideal case for a stack (or ranged or area) allocator, where you define the allocator at the beginning of the loop cycle, use it for all temporaries, then free all the memory in one go at the end of the iteration.
2. containers. The program receives an event from the outside and puts some data into a container OR updates the data if the record already exists.
The important thing here is - when updating the data in a container, you may want to resize the existing area.
If you are working with a temporary which should be placed into a container, a copy can be made (with the corresponding memory allocation from the container's allocator).
Not sure if there is anything better than a stack/area allocator for the first class. For the second class the user should be able to choose the default GC or more precise memory handling (e.g. explicit malloc/free for resizing).
Anything I am missing in this categorization?
So even if we only get allocators that let us deal with temporaries, that will be a huge benefit.
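For the first category, a bump-pointer region is probably the simplest thing that fits. The sketch below assumes a fixed-size backing buffer and plain-data temporaries without destructors; `Region` and `makeArray` are hypothetical names for this example, not Phobos types.

```d
// Hypothetical bump-pointer region for per-iteration temporaries.
struct Region
{
    ubyte[] buffer; // backing storage, allocated once up front
    size_t used;    // number of bytes handed out so far

    this(size_t capacity) { buffer = new ubyte[capacity]; }

    // Hands out `size` bytes, or null when the region is exhausted.
    // Offsets are rounded up to 16 bytes relative to the buffer start.
    void[] allocate(size_t size)
    {
        enum size_t alignment = 16;
        immutable start = (used + alignment - 1) & ~(alignment - 1);
        if (start + size > buffer.length)
            return null;
        used = start + size;
        return buffer[start .. start + size];
    }

    // Releases every temporary in one go.
    void reset() { used = 0; }
}

// Typed helper for arrays of plain data. The contents are whatever was
// left in the buffer, so callers must initialize the elements themselves.
T[] makeArray(T)(ref Region r, size_t n)
{
    auto raw = r.allocate(T.sizeof * n);
    return raw is null ? null : cast(T[]) raw;
}
```

In an event loop this would be one `Region` created before the loop, any number of `makeArray` calls while handling an event, and a single `reset()` at the end of the iteration. Anything that has to outlive the iteration needs to be copied into a container first, as described above.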
|
June 25, 2013 Re: why allocators are not discussed here | ||||
---|---|---|---|---|
| ||||
Posted in reply to H. S. Teoh | On Tuesday, 25 June 2013 at 22:50:55 UTC, H. S. Teoh wrote:
> And maybe (b) can be implemented by making gc_alloc / gc_free
> overridable function pointers? Then we can override their values and use scope guards to revert them back to the values they were before.
Yea, I was thinking this might be a way to go. You'd have a global (well, thread-local) allocator instance that can be set and reset through stack calls.
You'd want it to be RAII or delegate based, so the scope is clear.
    with_allocator(my_alloc, {
        do whatever here
    });
or
    {
        ChangeAllocator!my_alloc dummy;
        do whatever here
    } // dummy's destructor ends the allocator scope
I think the former is a bit nicer, since the dummy variable is a bit silly. We'd hope that delegate can be inlined.
But, the template still has a big advantage: you can change the type. And I think that is potentially enormously useful.
Another question is how to tie into output ranges. Take std.conv.to.
    auto s = to!string(10); // currently, this hits the gc
What if I want it to go on a stack buffer? One option would be to rewrite it to use an output range, and then call it like:
    char[20] buffer;
    auto s = to!string(10, buffer); // it returns the slice of the buffer it actually used
(and we can do overloads so to!string(10, radix) still works, as well as to!string(10, radix, buffer). Hassle, I know...)
Naturally, the default argument is to use the 'global' allocator, whatever that is, which does nothing special.
The fun part is the output range works for that, and could also work for something like this:
    import core.stdc.stdlib : realloc;
    struct malloced_string {
        char* ptr;
        size_t length;
        size_t capacity;
        void put(char c) {
            if(length >= capacity) {
                // grow geometrically; start at 16 so a zero capacity can grow
                capacity = capacity ? capacity * 2 : 16;
                ptr = cast(char*) realloc(ptr, capacity);
            }
            ptr[length++] = c;
        }
        char[] slice() { return ptr[0 .. length]; }
        alias slice this;
        mixin RefCounted!this; // pretend this works
    }
    {
        malloced_string str;
        auto got = to!string(10, str);
    } // str is out of scope, so it gets free()'d.
Unsafe though: if you stored a copy of got somewhere, it is now a pointer to freed memory. I'd kinda like language support of some sort to help mitigate that though, like being a borrowed pointer that isn't allowed to be stored, but that's another discussion.
And that should work. So then what we might do is provide these little output range wrappers for various allocators, and use them on many functions. So we'd write:
    import std.allocators;
    import std.range;
    // mallocator is provided in std.allocators and offers the goods
    OutputRange!(char, mallocator) str;
    auto got = to!string(10, str);
What's nice here is the output range is useful for more than just allocators. You could also to!string(10, my_file) or a delegate, blah blah blah. So it isn't too much of a burden, it is something you might naturally use anyway.
> Also, we may have the problem of the wrong allocator
> being used to free the object.
Another reason why encoding the allocator into the type is so nice. For the minimal D I've been playing with, the idea I'm running with is all allocated memory has some kind of special type, and then naked pointers are always assumed to be borrowed, so you should never store or free them.
    auto foo = HeapArray!char(capacity);
    void bar(char[] lol){}
    bar(foo); // allowed, foo has an alias this on slice
    // but....
    struct A {
        char[] lol; // not allowed, because you don't know when lol is going to be freed
    }
foo frees itself with refcounting. |
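The stack-buffer case can be tried out without touching std.conv. The sketch below assumes a tiny fixed-capacity sink with a `put` method and a hand-written decimal formatter standing in for the proposed `to!string(value, sink)` overload; `FixedSink` and `writeDecimal` are hypothetical names for this example.

```d
// Hypothetical fixed-capacity output range over caller-owned storage.
struct FixedSink
{
    char[] storage; // typically a slice of a stack buffer
    size_t length;  // number of characters written so far

    void put(char c)
    {
        assert(length < storage.length, "FixedSink overflow");
        storage[length++] = c;
    }

    // The part of the storage that was actually written.
    char[] slice() { return storage[0 .. length]; }
}

// Stand-in for the proposed overload: writes the decimal form of `value`
// into any sink that offers put(char) and slice(), and returns the slice.
char[] writeDecimal(Sink)(uint value, ref Sink sink)
{
    char[10] digits; // uint.max has 10 decimal digits
    size_t n;
    do
    {
        digits[n++] = cast(char)('0' + value % 10);
        value /= 10;
    } while (value != 0);
    // Digits were produced least-significant first, so emit them in reverse.
    while (n > 0)
        sink.put(digits[--n]);
    return sink.slice();
}
```

With `char[20] buffer; auto sink = FixedSink(buffer[]);` the result of `writeDecimal(1234, sink)` is a slice of `buffer`, so no GC allocation is involved, and the borrowed-slice hazard described above applies in the same way. Phobos's `std.format.formattedWrite` already takes an output range in this style.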
June 25, 2013 Re: why allocators are not discussed here | ||||
---|---|---|---|---|
| ||||
Posted in reply to cybervadim | cybervadim:
> From my experience all objects may be divided into 2 categories
> 1. temporaries. Program usually have some kind of event loop. During one iteration of this loop some temporary objects are created and then discarded. The ideal case for stack (or ranged or area) allocator, where you define allocator at the beginning of the loop cycle, use it for all temporaries, then free all the memory in one go at the end of iteration.
> 2. containers. Program receives an event from the outside and puts some data into container OR update the data if the record already exists.
> The important thing here is - when updating the data in container, you may want to resize the existing area.
Many garbage collectors use the same idea (and manage it automatically), with two or three different generations:
http://en.wikipedia.org/wiki/Garbage_collection_%28computer_science%29#Generational_GC_.28ephemeral_GC.29
Bye,
bearophile |
June 25, 2013 Re: why allocators are not discussed here | ||||
---|---|---|---|---|
| ||||
Posted in reply to bearophile | > Many garbage collectors use the same idea (and manage it automatically), with two or three different generations:
>
> http://en.wikipedia.org/wiki/Garbage_collection_%28computer_science%29#Generational_GC_.28ephemeral_GC.29
>
> Bye,
> bearophile
The problem with the GC is that it doesn't know which objects are temporary and which are not, so it has to traverse the tree to determine that. Allocators in my opinion should let the user explicitly specify the temporaries.
|
June 26, 2013 Re: why allocators are not discussed here | ||||
---|---|---|---|---|
| ||||
Posted in reply to Adam D. Ruppe | I was just quickly skimming some criticism of C++ allocators, since my thought here is similar to what they do. On one hand, maybe D can do it right by tweaking C++'s design rather than discarding it. On the other hand, with all the C++ I've done, I have never actually used STL allocators, which could say something about me or could say something about them.
One thing I saw said making the differently allocated object a different type sucks. ...but must it? The complaint there was "so much for just doing a function that takes a std::string".
But, the way I'd want to do it in D is the function would take a char[] instead, and our special allocated type provides that via opSlice and/or alias this.
So you'd only have to worry about the different type if you intend to take ownership of the container yourself. Which we already kinda think about in D: if you store a char[], someone else could overwrite it, so we prefer to store an immutable(char)[] aka string.
If you're given a char[] and want to store it, you might idup. So I don't think doing a private copy with some other allocation scheme is any more of a hassle.
(BTW immutable objects IMO should *always* be garbage collected, because part of immutability is infinite lifetime. So we might want to be careful with implicit conversions to immutable based on allocation method, which I believe we can protect through member functions.)
Anyway, bottom line is I don't think that criticism necessarily applies to D. But there's surely many others and I'm more or less a n00b re c++'s allocators so idk yet. |
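A small sketch of that argument, assuming a malloc-backed owner type that exposes its contents through `alias this`. `MallocBuffer` and `countSpaces` are hypothetical names for this example; the point is only that the consumer is written against a plain slice and never mentions the allocation scheme.

```d
import core.stdc.stdlib : malloc, free;

// Hypothetical malloc-backed owner. Copying is disabled so that exactly one
// object is responsible for calling free().
struct MallocBuffer
{
    private char* ptr;
    private size_t len;

    this(const(char)[] text)
    {
        // Request at least one byte so an empty text still gets valid storage.
        ptr = cast(char*) malloc(text.length ? text.length : 1);
        assert(ptr !is null, "malloc failed");
        len = text.length;
        ptr[0 .. len] = text[];
    }

    @disable this(this);

    ~this()
    {
        free(ptr); // free(null) is a no-op for a default-initialized buffer
        ptr = null;
    }

    // Borrowed view of the contents; it must not outlive the buffer.
    char[] slice() { return ptr[0 .. len]; }
    alias slice this;
}

// An ordinary function written against a slice, with no allocator in sight.
size_t countSpaces(const(char)[] s)
{
    size_t n;
    foreach (c; s)
        if (c == ' ')
            ++n;
    return n;
}
```

`countSpaces(buf)` compiles for a `MallocBuffer buf` because `alias this` supplies the slice. Code that wants to keep the data takes its own copy with `buf.slice().idup`, which lands on the GC heap and so is safe to treat as immutable.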
June 26, 2013 Re: why allocators are not discussed here | ||||
---|---|---|---|---|
| ||||
Posted in reply to Adam D. Ruppe | On 06/26/2013 12:50 AM, Adam D. Ruppe wrote:
> On Tuesday, 25 June 2013 at 22:22:09 UTC, cybervadim wrote:
>> (introducing a new keyword allocator)
>
> It would be easier to just pass an allocator object that provides the necessary methods and don't use new at all. (I kinda wish new wasn't in the language. It'd make this a little more consistent.)
>
I did think about this as well, but then I came up with something that IMHO is even simpler.
Imagine we have two delegates:
void* delegate(size_t); // this one allocs
void delegate(void*); // this one frees
You pass both to a function that constructs your object. The first is used for allocating the memory, the second gets attached to the TypeInfo and is used by the gc to free the object. This would be completely transparent to the user.
The use in a container is similar. Just use the alloc delegate to construct the objects and attach the free delegate to the typeinfo. You could even mix allocator strategies in the middle of the lifetime of the container.
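A rough sketch of the two-delegate idea, with one deliberate substitution: as far as I know there is no hook for attaching a free delegate to a TypeInfo, so here the delegate is stored in a small owner struct and called from its destructor instead of by the GC. `Owned` and `construct` are hypothetical names, and the sketch assumes plain-data types without destructors.

```d
import core.stdc.stdlib : malloc, free;

alias AllocDg = void* delegate(size_t); // this one allocs
alias FreeDg = void delegate(void*);    // this one frees

// Hypothetical owner that remembers which delegate must release its payload.
struct Owned(T)
{
    T* ptr;
    FreeDg release;

    @disable this(this); // a single owner, so the payload is freed once

    ~this()
    {
        if (ptr !is null)
        {
            release(ptr);
            ptr = null;
        }
    }
}

// Constructs a T in memory obtained from `alloc` and pairs it with `release`.
// Assumes T is plain data: the raw memory is overwritten by simple assignment.
Owned!T construct(T)(AllocDg alloc, FreeDg release, T initial)
{
    auto p = cast(T*) alloc(T.sizeof);
    assert(p !is null, "allocation failed");
    *p = initial;
    return Owned!T(p, release);
}
```

Because every object carries its own free delegate, objects created with different strategies can sit side by side in one container, which is the "mix allocator strategies" point above. The price in this version is an extra delegate (two words) stored per object.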
|