June 26, 2013
26-Jun-2013 18:27, cybervadim wrote:
> On Wednesday, 26 June 2013 at 14:17:03 UTC, Dmitry Olshansky wrote:
>> Awful. What has that extra syntax bought you? Except that now new is
>> unsafe by design?
>> Other questions involve how this allocation scope gets inside
>> functions, and what the mechanism is for passing it up and down the call stack.
>>
>> Last but not least I fail to see how scoped allocators alone (as
>> presented) solve even half of the problem.
>
> The extra syntax lets me avoid touching the existing code.
> Imagine you have stateless event processing. That is, an event comes in, you
> do some calculation, prepare the answer and send it back. It will look
> like:
>
> void onEvent(Event event)
> {
>     process();
> }
>
> Because it is stateless, you know all the memory allocated during
> processing will not be required afterwards.

Here is the chief problem: the assumption that is required to make it magically work.

Now what I see is:

T arr[];//TLS

//somewhere down the line
arr = ... ;
else{
...
allocator(myAlloc){
	arr = array(filter!....);
}
...
}
return arr;

Having an unsafe magic wand that may transmogrify some code to switch allocation strategy I consider naive and dangerous.

Whoever told you that process() returns before allocating a few gigs of RAM (and hoping for a GC collection)? Right, nobody. Maybe it's an event loop that may run forever.

What is being missed is that code to date assumes new == GC and works _like that_.
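
To make the failure mode concrete, here is a minimal sketch. ScopedRegion and staleRead are hypothetical names made up for illustration (nothing like them exists in druntime), and the region is just a bump allocator over a fixed buffer so the stale read stays well-defined:

```d
// Hypothetical scoped region: a bump allocator over a fixed buffer that is
// rewound in one go when its "scope" ends.
struct ScopedRegion
{
    ubyte[64] buffer;
    size_t used;

    // Hand out the next n ints from the buffer.
    int[] makeInts(size_t n)
    {
        auto bytes = n * int.sizeof;
        assert(used + bytes <= buffer.length, "region exhausted");
        auto block = cast(int[]) buffer[used .. used + bytes];
        used += bytes;
        return block;
    }

    // "End of scope": everything is released at once.
    void reset() { used = 0; }
}

// A reference that outlives the region's scope keeps pointing at recycled memory.
int staleRead()
{
    ScopedRegion region;
    int[] arr = region.makeInts(4); // plays the role of the escaping 'arr'
    arr[] = 7;
    region.reset();                 // the allocator scope ends here
    int[] other = region.makeInts(4);
    other[] = 42;                   // an unrelated allocation reuses the bytes
    return arr[0];                  // the old slice now sees the new data
}

unittest
{
    assert(staleRead() == 42);
}
```

Compile with something like dmd -unittest -main -run file.d. With a region that actually returns its memory to the OS, the same read would be a crash or silent corruption instead of a wrong number.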

> So the syntax I suggested
> requires very little change in code. process() may be implemented
> using the std lib, doing several news and resizes.
>
> With new syntax:
>
>
> void onEvent(Event event)
> {
>     ScopedAllocator alloc;
>     allocator(alloc) {
>       process();
>     }
> }
>
> So now you do not use GC for all that is created inside the process().
> ScopedAllocator is a simple stack that will free all memory in one go.
>
> It is up to the runtime implementation to make sure all memory that is
> allocated inside allocator{} scope is actually allocated using
> ScopedAllocator and not GC.
>
> Does it make sense?

Yes, but it's horribly broken.

-- 
Dmitry Olshansky
June 26, 2013
26-Jun-2013 05:24, Adam D. Ruppe wrote:
> I was just quickly skimming some criticism of C++ allocators, since my
> thought here is similar to what they do. On one hand, maybe D can do it
> right by tweaking C++'s design rather than discarding it.
>

Criticisms are:

A) Allocators were defined to not have any state (as noted in the standard).
B) They are parametrized on a type (T), yet a container parametrized on T may need to allocate something else entirely (a node holding T).
C) Containers are parametrized on allocators, so, say, 2 lists with different allocators are incompatible in the sense that, e.g., you can't splice pieces of them together.

From the above, IMHO, we can deduce that
a) We should support stateful allocators, but we have to make sure we don't pay storage space for stateless ones (global ones, e.g. mallocator).
b) Allocators should preferably be typeless and let containers define what they allocate.
c) This is hardly solvable unless we require a way to reassign objects between allocators (at least of similar kinds).
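
A rough sketch of what (a) and (b) could look like in D. All names here (Mallocator, Region, Stack) are assumptions for illustration, not an existing API; alignment and error handling are ignored:

```d
import core.stdc.stdlib : free, malloc;
import std.traits : Fields;

// Stateless and typeless: hands out raw void[] blocks from the C heap.
struct Mallocator
{
    static void[] allocate(size_t bytes)
    {
        auto p = malloc(bytes);
        return p is null ? null : p[0 .. bytes];
    }

    static void deallocate(void[] block) { free(block.ptr); }
}

// Stateful: bumps through a pool that somebody else provides.
struct Region
{
    void[] pool;
    size_t used;

    void[] allocate(size_t bytes)
    {
        if (used + bytes > pool.length) return null;
        auto block = pool[used .. used + bytes];
        used += bytes;
        return block;
    }

    void deallocate(void[] block) {} // a region frees everything at once
}

// The container decides what it allocates (here: arrays of T).
struct Stack(T, Allocator)
{
    static if (Fields!Allocator.length == 0)
        alias alloc = Allocator; // stateless: no field, call through the type
    else
        Allocator alloc;         // stateful: store an instance

    T[] items;
    size_t count;

    @disable this(this);
    ~this() { alloc.deallocate(items); }

    void push(T value)
    {
        if (count == items.length)
        {
            auto newCap = items.length ? items.length * 2 : 4;
            auto block = cast(T[]) alloc.allocate(newCap * T.sizeof);
            assert(block !is null, "out of memory");
            block[0 .. count] = items[0 .. count];
            alloc.deallocate(items);
            items = block;
        }
        items[count++] = value;
    }
}

unittest
{
    // No storage is paid for the stateless allocator.
    static assert(Stack!(int, Mallocator).sizeof == (int[]).sizeof + size_t.sizeof);
    static assert(Stack!(int, Region).sizeof > Stack!(int, Mallocator).sizeof);

    Stack!(int, Mallocator) s;
    foreach (i; 0 .. 10) s.push(i);
    assert(s.count == 10 && s.items[9] == 9);
}
```

Point (c) is untouched by this: Stack!(int, Mallocator) and Stack!(int, Region) are still distinct, incompatible types.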

>
> Anyway, bottom line is I don't think that criticism necessarily applies
> to D. But there's surely many others and I'm more or less a n00b re
> c++'s allocators so idk yet.


-- 
Dmitry Olshansky
June 26, 2013
On Wed, Jun 26, 2013 at 04:31:40PM +0200, cybervadim wrote:
> On Wednesday, 26 June 2013 at 14:26:03 UTC, H. S. Teoh wrote:
> >Yeah, I think the best approach would be one that doesn't require changing a whole mass of code to support. Also, one that doesn't require language changes would be far more likely to be accepted, as the core D devs are leery of adding yet more complications to the language.
> >
> >That's why I proposed that gc_alloc and gc_free be made into thread-global function pointers, that can be swapped with a custom allocator's version. This doesn't have to be visible to user code; it can just be an implementation detail in std.allocator, for example. It allows us to implement custom allocators across a block of code that doesn't know (and doesn't need to know) what allocator will be used.
> >
> 
> Yes, being able to change gc_alloc, gc_free would do the job. If the runtime remembers the stack of gc_alloc/gc_free functions, like pushd/popd, that would simplify its usage. I think this is a very nice and simple solution to the problem.

Adam's idea does this: tie each replacement of gc_alloc/gc_free to some stack-based object, that automatically cleans up in the dtor. So something along these lines:

	struct CustomAlloc(A) {
		void* function(size_t size) old_alloc;
		void  function(void* ptr)   old_free;

		this(A alloc) {
			old_alloc = gc_alloc;
			old_free  = gc_free;

			gc_alloc = &A.alloc;
			gc_free  = &A.free;
		}

		~this() {
			gc_alloc = old_alloc;
			gc_free  = old_free;

			// Cleans up, e.g., region allocator deletes the
			// region
			A.cleanup();
		}
	}

	class C {}

	void main() {
		auto c = new C();	// allocates using default allocator (GC)
		{
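			// A struct's constructor only runs when called with
			// arguments, so pass an instance explicitly.
			auto outer = CustomAlloc!MyAllocator(MyAllocator.init);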

			// Everything from here on until end of block
			// uses MyAllocator

			auto d = new C();	// allocates using MyAllocator

			{
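				// Needs its own name: D rejects a local that
				// shadows one from an enclosing scope.
				auto inner = CustomAlloc!AnotherAllocator(AnotherAllocator.init);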
				auto e = new C(); // allocates using AnotherAllocator

				// End of scope: auto cleanup, gc_alloc and
				// gc_free reverts back to MyAllocator
			}

			auto f = new C();	// allocates using MyAllocator

			// End of scope: auto cleanup, gc_alloc and
			// gc_free reverts back to default values
		}
		auto g = new C();	// allocates using default allocator
	}


So you effectively have an allocator stack, and user code never has to directly manipulate gc_alloc/gc_free (which would be dangerous).
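
Since gc_alloc/gc_free are not actually swappable function pointers today, here is a self-contained sketch of the same stack discipline using a made-up hook. currentAlloc, defaultAlloc, arenaAlloc and AllocScope are all hypothetical names; the two allocators simply forward to malloc so that only the save/restore behaviour is on display:

```d
import core.stdc.stdlib : free, malloc;

alias AllocFn = void* function(size_t);

// Two stand-in allocators; a real one would manage its own region or pool.
void* defaultAlloc(size_t n) { return malloc(n); }
void* arenaAlloc(size_t n)   { return malloc(n); }

// Module-level variables are thread-local in D, so every thread has its own hook.
AllocFn currentAlloc = &defaultAlloc;

// RAII guard: installs a replacement on construction, restores on destruction.
struct AllocScope
{
    private AllocFn saved;

    @disable this();
    @disable this(this);

    this(AllocFn replacement)
    {
        saved = currentAlloc;
        currentAlloc = replacement;
    }

    ~this() { currentAlloc = saved; }
}

unittest
{
    assert(currentAlloc is &defaultAlloc);
    {
        auto outer = AllocScope(&arenaAlloc);
        assert(currentAlloc is &arenaAlloc);
        free(currentAlloc(16)); // served by arenaAlloc
        {
            auto inner = AllocScope(&defaultAlloc);
            assert(currentAlloc is &defaultAlloc);
        }
        assert(currentAlloc is &arenaAlloc); // inner scope popped
    }
    assert(currentAlloc is &defaultAlloc);   // outer scope popped
}
```

Because the saved pointer lives in the guard itself, the "stack" is just the call stack; no separate data structure is needed.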


T

-- 
Almost all proofs have bugs, but almost all theorems are true. -- Paul Pedersen
June 26, 2013
Some type system help is required to guarantee that references to such scope-allocated data won't escape.
June 26, 2013
On Wed, Jun 26, 2013 at 06:51:54PM +0400, Dmitry Olshansky wrote:
> 26-Jun-2013 03:16, Adam D. Ruppe wrote:
> >On Tuesday, 25 June 2013 at 22:50:55 UTC, H. S. Teoh wrote:
> >>And maybe (b) can be implemented by making gc_alloc / gc_free overridable function pointers? Then we can override their values and use scope guards to revert them back to the values they were before.
> >
> >Yea, I was thinking this might be a way to go. You'd have a global (well, thread-local) allocator instance that can be set and reset through stack calls.
> >
> >You'd want it to be RAII or delegate based, so the scope is clear.
> >
> >with_allocator(my_alloc, {
> >      do whatever here
> >});
> >
> >
> >or
> >
> >{
> >    ChangeAllocator!my_alloc dummy;
> >
> >    do whatever here
> >} // dummy's destructor ends the allocator scope
> >
> 
> Both suffer from
> a) being totally unsafe and in fact bug prone since all references
> obtained in there are now dangling (and there is no indication where
> they came from)

How is this different from using malloc() and free() manually? You have no indication of where a void* came from either, and the danger of dangling references is very real, as any C/C++ coder knows. And I assume that *some* people will want to be defining custom allocators that wrap around malloc/free (e.g. the game engine guys who want total control).


> b) imagine you need to use an allocator for a stateful object. Say forward range of some other ranges (e.g. std.regex) both scoped/stacked to allocate its internal stuff. 2nd one may handle it but not the 1st one.

Yeah this is a complicated area. A container basically needs to know how to allocate its elements. So somehow that information has to be somewhere.


> c) transfer of objects allocated differently up the call graph
> (scope graph?), is pretty much neglected I see.

They're incompatible. You can't safely make a linked list that contains both GC-allocated nodes and malloc() nodes. That's just a bomb waiting to explode in your face. So in that sense, Adam's idea of using a different type for differently-allocated objects makes sense. A container has to declare what kind of allocation its members are using; any other way is asking for trouble.
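
One way to read "a different type for differently-allocated objects" is to carry the allocation source as a type parameter. A minimal sketch, where GCHeap, MallocHeap, Node and List are hypothetical names and the tag types carry no behaviour at all:

```d
// Empty tag types that only record where memory came from.
struct GCHeap {}
struct MallocHeap {}

struct Node(T, Heap)
{
    T value;
    Node* next; // same instantiation, hence same heap
}

struct List(T, Heap)
{
    Node!(T, Heap)* head;

    void prepend(Node!(T, Heap)* node)
    {
        node.next = head;
        head = node;
    }
}

unittest
{
    List!(int, GCHeap) list;
    auto gcNode = new Node!(int, GCHeap);
    gcNode.value = 1;
    list.prepend(gcNode);
    assert(list.head.value == 1);

    // A node tagged with another heap is rejected at compile time.
    Node!(int, MallocHeap) foreign;
    static assert(!__traits(compiles, list.prepend(&foreign)));
}
```

The tag is only as trustworthy as the code that creates nodes; nothing here stops someone from malloc-ing a Node!(int, GCHeap).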


> I'm kind of wondering how our knowledgeable community has come to this. (must have been starving w/o allocators way too long)

We're just trying to provoke Andrei into responding. ;-)


[...]
> IMHO the only place for allocators is in containers; other kinds of code may just ignore allocators completely.

But some people clamoring for allocators are doing so because they're bothered by Phobos using ~ for string concatenation, which implicitly uses the GC. I don't think we can just ignore that.


> std.algorithm and friends should imho be customized on 2 things only:
> 
> a) containers to use (instead of array)
> b) optionally a memory source (or allocator) if the container is
> temporary (scoped), to tie its lifetime to something.
> 
> Want temporary stuff? Use temporary arrays, hashmaps and whatnot
> i.e. types tailored for a particular use case (e.g. with a
> temporary/scoped allocator in mind).
> These would all be unsafe though. The alternative is ref-counting
> pointers to an allocator. With the word on the street about ARC, it could be
> a nice direction to pursue.

Ref-counting is not foolproof, though. There are always cycles to mess things up.
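
For anyone who hasn't been bitten by this: a tiny hand-rolled sketch of the cycle problem. Node, makeNode, retain and release are hypothetical helpers written for this example, not an existing library:

```d
import core.stdc.stdlib : calloc, free;

size_t liveNodes; // how many nodes are currently allocated

struct Node
{
    size_t refs;
    Node* next;
}

Node* makeNode()
{
    auto n = cast(Node*) calloc(1, Node.sizeof);
    assert(n !is null);
    n.refs = 1;
    ++liveNodes;
    return n;
}

Node* retain(Node* n) { ++n.refs; return n; }

// Drop one reference; free the node (and release its successor) at zero.
void release(Node* n)
{
    if (n is null || --n.refs != 0) return;
    release(n.next);
    free(n);
    --liveNodes;
}

// a -> b: releasing both handles frees everything.
size_t leakedAfterChain()
{
    auto before = liveNodes;
    auto a = makeNode(), b = makeNode();
    a.next = retain(b);
    release(b);
    release(a);
    return liveNodes - before;
}

// a -> b -> a: each node keeps the other alive forever.
size_t leakedAfterCycle()
{
    auto before = liveNodes;
    auto a = makeNode(), b = makeNode();
    a.next = retain(b);
    b.next = retain(a);
    release(a);
    release(b);
    return liveNodes - before; // both counts are stuck at 1
}

unittest
{
    assert(leakedAfterChain() == 0);
    assert(leakedAfterCycle() == 2);
}
```

The second function leaks its two nodes on purpose; that leak is the point of the demonstration.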


> Allocators (as Andrei points out in his video) have many kinds:
> a) persistence: infinite, manual, scoped
> b) size: unlimited vs fixed
> c) block-size: any, fixed, or *any* up to some maximum size
> 
> Most of these ARE NOT interchangeable!
> Yet some are composable; however, I'd argue that allocators are not
> composable but have some reusable parts that in turn are composable.

I was listening to Andrei's talk this morning, but I didn't quite understand what he means by composable allocators. Is he talking about nesting, say, a GC inside a region allocated by a region allocator?


> Code would still have to cater to specific flavors of allocators, so we'd better reduce this problem to the selection of containers.
[...]

Hmm. Sounds like we have two conflicting things going on here:

1) En masse replacement of gc_alloc/gc_free in a certain block of code (which may be the entire program), e.g., for the avoidance of GC in game engines, etc. Basically, the code is allocator-agnostic, but at some higher level we want to control which allocator is being used.

2) Specific customization of containers, etc., as to which allocator(s) should be used, with (hopefully) some kind of support from the type system to prevent mistakes like dangling pointers, escaping references, etc. Here, the code is NOT allocator-agnostic; it has to be written with the specific allocation model in mind. You can't just replace the allocator with another one without introducing bugs or problems.

These two may interact in complex ways... e.g., you might want to use malloc to allocate a pool, then use a custom gc_alloc/gc_free to allocate from this pool in order to support language built-ins like ~ and ~= without needing to rewrite every function that uses strings.

Maybe we should stop conflating these two things so that we stop confusing ourselves, and hopefully it will be easier to analyse afterwards.


T

-- 
You have to expect the unexpected. -- RL
June 26, 2013
By the way, while this topic is getting some attention, I want to point out that there are actually two orthogonal things that come up when speaking about configurable allocation: allocators themselves and global allocation policies. I think a good design should address both.

For example, swapping the global allocator for a custom one has limited usability: you are limited anyway by the language design, which makes only GC or ref-counting viable general options. However, some way to prohibit automatic allocations at runtime while still allowing manual ones may be useful, and it does not matter which allocator is actually used to get that memory. Once such an API is designed, tighter classification and control can be added over time.
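
For reference, D later gained a @nogc function attribute (it did not exist when this was written) that covers part of this at compile time rather than at runtime: GC allocations are rejected inside annotated code, while manual allocation stays available. A small sketch, with sumSquares being a made-up example function:

```d
import core.stdc.stdlib : free, malloc;

// Manual allocation is still allowed inside @nogc code.
@nogc nothrow int sumSquares(int n)
{
    auto buf = cast(int*) malloc(n * int.sizeof);
    if (buf is null) return -1;
    scope (exit) free(buf);

    int total = 0;
    foreach (i; 0 .. n)
    {
        buf[i] = i * i;
        total += buf[i];
    }
    return total;
}

unittest
{
    assert(sumSquares(4) == 14); // 0 + 1 + 4 + 9

    // Automatic (GC) allocation, here via ~=, does not compile under @nogc...
    static assert(!__traits(compiles, () @nogc { int[] a; a ~= 1; }));
    // ...while code that stays on the stack does.
    static assert(__traits(compiles, () @nogc { int[4] a; a[0] = 1; }));
}
```

This is a policy only; it says nothing about which allocator provides the memory, which matches the orthogonality argued for above.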
June 26, 2013
On Wednesday, 26 June 2013 at 17:25:24 UTC, H. S. Teoh wrote:
> I was listening to Andrei's talk this morning, but I didn't quite
> understand what he means by composable allocators. Is he talking about
> nesting, say, a GC inside a region allocated by a region allocator?

Maybe he was talking about a freelist allocator over a reap, as
described by the HeapLayers project http://heaplayers.org/ in the
paper from 2001 titled 'Composing High-Performance Memory
Allocators'. I'm pretty sure that web site was referenced in the
talk. A few publications there are from Andrei.
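
As a rough idea of what layering means there, a sketch of a freelist stacked on top of a plain region (not a reap; this is simplified). Region and FreeList are written from scratch for this example, ignore alignment, and the composed value must not be moved once blocks have been handed out:

```d
// Bottom layer: bump allocation out of a fixed buffer, no individual frees.
struct Region(size_t capacity)
{
    ubyte[capacity] store;
    size_t used;

    void[] allocate(size_t bytes)
    {
        if (used + bytes > capacity) return null;
        auto block = store[used .. used + bytes];
        used += bytes;
        return block;
    }
}

// Top layer: recycles freed blocks of one size, defers everything else
// to whatever allocator it is layered over.
struct FreeList(Parent, size_t blockSize)
{
    static assert(blockSize >= (void*).sizeof);

    Parent parent;
    private void* head; // singly linked list threaded through free blocks

    void[] allocate(size_t bytes)
    {
        if (bytes == blockSize && head !is null)
        {
            auto block = head[0 .. blockSize];
            head = *cast(void**) head; // pop
            return block;
        }
        return parent.allocate(bytes);
    }

    void deallocate(void[] block)
    {
        if (block.length != blockSize) return; // other sizes are dropped
        *cast(void**) block.ptr = head;        // push
        head = block.ptr;
    }
}

unittest
{
    FreeList!(Region!256, 16) heap;
    auto a = heap.allocate(16);
    auto b = heap.allocate(16);
    assert(a.ptr !is b.ptr);

    heap.deallocate(a);
    auto c = heap.allocate(16);
    assert(c.ptr is a.ptr);         // recycled by the freelist layer
    assert(heap.parent.used == 32); // the region was not touched again
}
```

Swapping Region!256 for any other type with an allocate method changes the bottom layer without touching FreeList, which is the composability being discussed.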

I agree that D should support programming without a GC, with
different GCs than the default one, and custom allocators, and
that features which demand a GC will be troublesome.

-- Brian
June 26, 2013
On Wednesday, 26 June 2013 at 17:25:24 UTC, H. S. Teoh wrote:
> malloc to allocate a pool, then use a custom gc_alloc/gc_free to
> allocate from this pool in order to support language built-ins like ~ and ~= without needing to rewrite every function that uses strings.

Blargh, I forgot about operator ~ on built-ins. For custom types it is easy enough to manage: just overload it. You can even do ~= on types that aren't allowed to allocate, if they have a certain capacity set up ahead of time (like a stack buffer).

But for built-ins, blargh, I don't think we can even give the GC hints about them. Maybe we should just go ahead and make the GC generational. (If you aren't using the GC, I say leave binary ~ unimplemented in all cases. Use ~= on a temporary instead whenever you would do that. It is easier to follow the lifetime if you explicitly declare your temporary.)
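
A sketch of the custom-type case, assuming a hypothetical FixedBuffer type whose capacity is fixed at compile time; only ~= is provided and binary ~ is deliberately left out:

```d
struct FixedBuffer(T, size_t capacity)
{
    private T[capacity] store;
    private size_t used;

    // buf ~= value: appends in place and never allocates.
    void opOpAssign(string op)(T value) if (op == "~")
    {
        assert(used < capacity, "FixedBuffer is full");
        store[used++] = value;
    }

    // buf[] gives a view of the elements appended so far.
    inout(T)[] opSlice() inout { return store[0 .. used]; }
}

unittest
{
    FixedBuffer!(char, 8) buf; // lives entirely on the stack
    foreach (c; "abc") buf ~= c;
    assert(buf[] == "abc");

    // Binary ~ is not defined for the type.
    static assert(!__traits(compiles, buf ~ 'x'));
}
```

The slice returned by buf[] points into the buffer itself, so it must not outlive buf.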
June 26, 2013
On Wednesday, 26 June 2013 at 14:59:41 UTC, Dmitry Olshansky wrote:
> Here is the chief problem: the assumption that is required to make it magically work.
>
> Now what I see is:
>
> T arr[];//TLS
>
> //somewhere down the line
> arr = ... ;
> else{
> ...
> allocator(myAlloc){
> 	arr = array(filter!....);
> }
> ...
> }
> return arr;
>
> Having an unsafe magic wand that may transmogrify some code to switch allocation strategy I consider naive and dangerous.
>
> Whoever told you that process() returns before allocating a few gigs of RAM (and hoping for a GC collection)? Right, nobody. Maybe it's an event loop that may run forever.
>
> What is being missed is that code to date assumes new == GC and works _like that_.

Not magic, but a tool that is quite powerful and thus may let you shoot yourself in the foot.
This is unsafe, but if you want it safe, don't use allocators; stay with the GC.
In the example above, the first arr gets freed by the GC, and the second arr may point to nothing if myAlloc was implemented to free it earlier. Or you may get a proper arr reference if myAlloc used malloc and didn't free it. The fact that you may write bad code does not make the language (or concept) bad.

June 26, 2013
26-Jun-2013 23:04, cybervadim wrote:
> On Wednesday, 26 June 2013 at 14:59:41 UTC, Dmitry Olshansky wrote:

>> Having an unsafe magic wand that may transmogrify some code to switch
>> allocation strategy I consider naive and dangerous.
>>
>> Whoever told you that process() returns before allocating a few gigs of
>> RAM (and hoping for a GC collection)? Right, nobody. Maybe it's an event
>> loop that may run forever.
>>
>> What is being missed is that code to date assumes new == GC and works
>> _like that_.
>
> Not magic, but a tool that is quite powerful and thus may let you shoot
> yourself in the foot.

I know what kind of thing you are talking about. It ain't powerful; it's just a hack that doesn't quite do what is advertised.

> This is unsafe, but if you want it safe, don't use allocators; stay with
> the GC.

BTW, you were talking about changing the allocation of code you didn't write.
There is not a single fact that makes the thing safe. It all works either by chance or because the code was designed to work with a scoped allocator to begin with.

I believe that in the 2nd case (designed to use scoped allocation):
a) The behavior is guaranteed (determinism vs GC etc.)
b) Safety is assured by the designer, not by pure luck (or a reasonable assumption that may not hold)

> In the example above, the first arr gets freed by the GC, and the second arr
> may point to nothing if myAlloc was implemented to free it earlier. Or you
> may get a proper arr reference if myAlloc used malloc and didn't free
> it.

Yeah, I know, hence I showed it. BTW, forget about malloc; I'm not talking about explicit malloc being an alternative to your scheme.

> The fact that you may write bad code does not make the language (or
> concept) bad.

It does, because it makes unreliable and bug-prone usage easy.

-- 
Dmitry Olshansky