February 06, 2014
On 2/6/14, 7:22 AM, Sönke Ludwig wrote:
> I'm just not convinced (far from it) that Phobos should be built on top
> of such an RCSlice type. I rather strongly agree with Dicebot that the
> API should be extended to work with ranges or pre-allocated buffers
> where possible + support for custom allocators where it makes sense. How
> the memory is managed is then totally up to the user and no Phobos
> function needs to be aware of that (e.g. just pass in a pre-allocated,
> reference counted slice).

That makes sense. One possibility I was thinking about was to make Phobos largely transparent wrt types trafficked and simply return the type received. Consider:

// lib code
struct RCSlice(T) { ... }
alias rcstring = RCSlice!(immutable char);
rcstring rc!(string s) { ... }

// user code
auto s1 = buildPath!("hello", "world");
auto s2 = buildPath!(rc!"hello", rc!"world");

In this example s1 will have type string and s2 will have type rcstring.

There are of course functions that would need to be given hints as to the output type.
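
To make the "return the type received" idea concrete, here is a minimal sketch. Everything in it is an assumption for illustration: RCString is a stand-in with no actual counting, joinPath is a hypothetical substitute for buildPath, and rc uses the corrected template syntax with a compile-time string argument.

```d
// Hypothetical stand-in for a reference-counted string; no real
// reference counting is implemented here.
struct RCString
{
    string payload;
}

// Hypothetical helper: the literal is passed as a compile-time argument.
RCString rc(string s)()
{
    return RCString(s);
}

// Overload for built-in strings: the result is a string.
string joinPath(string a, string b)
{
    return a ~ "/" ~ b;
}

// Overload for the wrapper type: the result is the wrapper type.
RCString joinPath(RCString a, RCString b)
{
    return RCString(a.payload ~ "/" ~ b.payload);
}

void main()
{
    auto s1 = joinPath("hello", "world");
    auto s2 = joinPath(rc!"hello"(), rc!"world"());

    // The output type follows the input type.
    static assert(is(typeof(s1) == string));
    static assert(is(typeof(s2) == RCString));
    assert(s1 == "hello/world");
    assert(s2.payload == "hello/world");
}
```

The caller never names the output type; it follows from the arguments, which is also why some functions would need an explicit hint.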


Andrei

February 06, 2014
On Thursday, 6 February 2014 at 16:25:37 UTC, Andrei Alexandrescu wrote:
> // lib code
> struct RCSlice(T) { ... }
> alias rcstring = RCSlice!(immutable char);
> rcstring rc!(string s) { ... }
>
> // user code
> auto s1 = buildPath!("hello", "world");
> auto s2 = buildPath!(rc!"hello", rc!"world");
>
> In this example s1 will have type string and s2 will have type rcstring.

Looks unnecessarily restrictive. Why can't one build an rc-string from stack buffers, or an Array!char from rc-strings? The type of the output buffer does not have to have anything to do with the input.
February 06, 2014
On 2/6/14, 7:47 AM, Johannes Pfau wrote:
> Am Thu, 6 Feb 2014 14:37:59 +0300
> schrieb Max Klyga <max.klyga@gmail.com>:
>
>>
>> My point is that we should not ruin the language ease of use. We do
>> need to deal with Phobos internal allocations, but we should not
>> switch to ARC as a default memory management scheme.
>
> What's with all this finger pointing and drawing battle lines in the
> last few days? GC-crowd vs ARC-crowd? Can we please all calm down?
[snip]

Nice. An interspersed point:

> I don't think there's anything wrong with the obvious solution: All
> phobos functions which allocate take an optional Allocator parameter,
> defaulting to GC. The little extra typing won't harm anyone and if you
> want to use things like stack-based buffers you'll have to write extra
> code and think about memory allocation anyway.
>
> auto gcString = toUpper("test");
> auto mallocString = toUpper!Malloc("test");
> ubyte[64] sbuf;
> auto stackString = toUpper(sbuf[], "test");
>
> What's so bad about this?

The issue here is that Phobos functions need to document whether e.g. they return memory that can be deallocated or not. Counterexamples would be returning static strings or subslices of allocations.

I'm not saying it's not solvable, but it'll take some thinking and some work.
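
A small sketch of the two counterexamples, assuming nothing beyond std.string.strip and std.conv.to from Phobos. The describe function is hypothetical and exists only for this illustration.

```d
import std.conv : to;
import std.string : strip;

// Hypothetical function: one branch returns static data, the other a
// fresh GC allocation. The return type alone does not say which.
string describe(int n)
{
    if (n == 0)
        return "zero";      // string literal, lives in static data
    return n.to!string;     // newly allocated
}

void main()
{
    string input = "  hello  ";
    string trimmed = strip(input);

    // strip returns a subslice: the result points into `input`, so it
    // is not the start of an allocation and must not be freed by itself.
    assert(trimmed == "hello");
    assert(trimmed.ptr == input.ptr + 2);

    // Both results have type string, but only one of them was allocated.
    assert(describe(0) == "zero");
    assert(describe(42) == "42");
}
```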

> It works for most of phobos, doesn't require
> language changes and it's easy to realize what's going on when reading
> the code. Having an 'application default allocator' or 'thread local
> default allocator' or 'per function default allocator' will actually
> hide the allocation strategy and I bet it would cause issues.

I think a crack should be given to the user to install their own allocator (per thread and/or shared). Perhaps we can limit that to the startup stage, i.e. before any allocation takes place.

> So the question then is: what about language feature which allocate
> using the GC? Wouldn't we want these to work with any kind of
> allocator? Answer: no, because:
>
> This is the list of language features which allocate:
[snip]

I think you forgot AAs.

> We just have to provide everyone with a way to choose their favorite
> implementation. Which means we provide public APIs which allow any kind
> of memory allocation and internally do not rely on automatic memory
> management (internal allocation in phobos should be done on the stack/
> with malloc / made configurable, but not with a GC).

I agree that's a nice goal. But I don't think it's easily attainable. The "choose the allocator" part is easy. The harder part is choosing the reclamation method. There are differences between GC and RC that are very difficult to unify under a common API.


Andrei

February 06, 2014
On 2/6/14, 8:25 AM, Andrei Alexandrescu wrote:
> rcstring rc!(string s) { ... }

I meant

rcstring rc(string s)() { ... }


Andrei
February 06, 2014
Am 06.02.2014 16:22, schrieb Sönke Ludwig:
> Am 06.02.2014 14:35, schrieb Meta:
>> On Thursday, 6 February 2014 at 13:23:14 UTC, Sönke Ludwig wrote:
>>> Am 06.02.2014 12:37, schrieb Max Klyga:
>>>> Anti-GC crowd tries to promote ARC as a deterministic alternative for
>>>> memory management.
>>>> I noticed that people promoting ARC do not provide any disadvantages
>>>> for
>>>> proposed approach.
>>>>
>>>> The thing is in gamedev and other soft-realtime software background
>>>> only a handful types of resources are really managed by RC and memory
>>>> usage patterns are VERY specific to their domain (mostly linear
>>>> allocation/deallocation and objects with non deterministic lifetime are
>>>> preallocated in pools).
>>>>
>>>> Trying to use RC as a general method of memory management leads to some
>>>> problems.
>>>> A pretty detailed view by Jon Harrop (He is somewhat known for
>>>> trolling
>>>> in PL community, but nonetheless knows what he is talking about) -
>>>> http://www.quora.com/Computer-Programming/How-do-reference-counting-and-garbage-collection-compare/answer/Jon-Harrop-1?srid=3Gvg&share=1#
>>>>
>>>>
>>>>
>>>>
>>>> So RC could also introduce unpredictable pause times at undesired
>>>> places.
>>>>
>>>> This is also confirmed by research from HP -
>>>> http://www.hpl.hp.com/personal/Hans_Boehm/popl04/refcnt.pdf
>>>>
>>>> My point is that we should not ruin the language ease of use. We do
>>>> need
>>>> to deal with Phobos internal allocations, but we should not switch to
>>>> ARC as a default memory management scheme. In practice people promoting
>>>> ARC will probably not use phobos anyway. Currently it's just an
>>>> excuse to
>>>> not use D.
>>>>
>>>> Look at c++ and STL, etc. People will roll their own solutions no
>>>> matter
>>>> what you try.
>>>>
>>>
>>> Full ACK! Reference counting should be well supported, but it
>>> shouldn't be the default scheme or built-in at a low level. From my
>>> personal experience it would be ideal to be able to customize certain
>>> types to be reference counted (allowing the user full flexibility
>>> implementing the actual reference counting and without ruling out weak
>>> references!), but have them accessible using the same syntax and type
>>> conversion semantics as normal references.
>>
>> I think the best way forward would be to look at the places in D where
>> allocations happen, and then figure out how we can optionally allow
>> reference counting in these situations. Andrei just made a thread on
>> this yesterday in regard to slices, which I think are the most promising
>> for a RC solution.
>
> I'm just not convinced (far from it) that Phobos should be built on top
> of such an RCSlice type. I rather strongly agree with Dicebot that the
> API should be extended to work with ranges or pre-allocated buffers
> where possible + support for custom allocators where it makes sense. How
> the memory is managed is then totally up to the user and no Phobos
> function needs to be aware of that (e.g. just pass in a pre-allocated,
> reference counted slice).

Although I seldom use D, I would like to say +1, if I may.

--
Paulo
February 06, 2014
On Thursday, 6 February 2014 at 16:40:32 UTC, Andrei Alexandrescu wrote:
> The issue here is that Phobos functions need to document whether e.g. they return memory that can be deallocated or not. Counterexamples would be returning static strings or subslices of allocations.

This is why specifying ownership by type is important. It documents the need, it makes sure the information doesn't get dropped, and it can automatically manage the details (via RAII).

Something that mallocs should return Malloced!T which calls the appropriate free (specified by the allocator) in the destructor. GC should return GC!T. Refcounted should return RefCounted!T, and so on.

alias this can easily allow interoperability... though, of course, not escaping things incorrectly would have to be taken care of, either manually or automatically. I keep coming back to this because it cannot be avoided, except by GC through and through. If the language does not help with this, it doesn't mean the complexity goes away. It just means it is moved onto the (fallible) programmer.
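
A rough sketch of what such an owning type could look like, under stated assumptions: Malloced and mallocArray are hypothetical names, not Phobos types, and this version only handles arrays of mutable T allocated with C malloc.

```d
import core.stdc.stdlib : free, malloc;

// Hypothetical owning wrapper: the type records that the memory came
// from malloc, and the destructor calls the matching free.
struct Malloced(T)
{
    private T[] slice;

    @disable this(this);    // single owner, so no implicit copies

    ~this()
    {
        if (slice.ptr !is null)
            free(slice.ptr);
    }

    // Borrowed view of the owned memory.
    inout(T)[] borrow() inout { return slice; }
    alias borrow this;      // lets the wrapper be used like a T[]
}

// Hypothetical factory: allocates n elements with malloc.
Malloced!T mallocArray(T)(size_t n)
{
    auto p = cast(T*) malloc(n * T.sizeof);
    if (p is null)
        assert(0, "out of memory");
    Malloced!T result;
    result.slice = p[0 .. n];
    return result;
}

void main()
{
    auto buf = mallocArray!char(4);

    // alias this gives interoperability with code expecting char[] ...
    char[] view = buf;
    view[] = 'x';
    assert(buf.length == 4);
    assert(buf[0] == 'x');

    // ... but nothing stops `view` from outliving `buf`; that escape
    // would have to be prevented manually or by the language.
}   // buf's destructor calls free here
```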

> I think a crack should be given to the user to install their own allocator (per thread and/or shared). Perhaps we can limit that to the startup stage, i.e. before any allocation takes place.

You could always link in your own _d_allocmemory, etc. I wouldn't do this, as it will make things hard to get right, but it is very easy: just add the functions to your main project. The linker will prefer your functions to the druntime functions.
February 06, 2014
On Thu, Feb 06, 2014 at 04:30:31PM +0000, Dicebot wrote:
> On Thursday, 6 February 2014 at 16:25:37 UTC, Andrei Alexandrescu wrote:
> >// lib code
> >struct RCSlice(T) { ... }
> >alias rcstring = RCSlice!(immutable char);
> >rcstring rc!(string s) { ... }
> >
> >// user code
> >auto s1 = buildPath!("hello", "world");
> >auto s2 = buildPath!(rc!"hello", rc!"world");
> >
> >In this example s1 will have type string and s2 will have type rcstring.
> 
> Looks unnecessarily restrictive. Why can't one build an rc-string from stack buffers, or an Array!char from rc-strings? The type of the output buffer does not have to have anything to do with the input.

Agree. Phobos algorithms that populate a data sink should migrate toward using output ranges instead of returning a predetermined type. This will not only address ARC needs, but a bunch of other things as well (output range support/use in Phobos is still rather scanty at the moment).


T

-- 
MSDOS = MicroSoft's Denial Of Service
February 06, 2014
On Thu, Feb 06, 2014 at 04:47:05PM +0100, Johannes Pfau wrote: [...]
> Some people seem to want some implicit way to set a 'default' allocator, but I haven't heard of any solution that works. (E.g. having a thread-local default allocator, per library default allocator, how would that even work?)
> 
> I don't think there's anything wrong with the obvious solution: All phobos functions which allocate take an optional Allocator parameter, defaulting to GC. The little extra typing won't harm anyone and if you want to use things like stack-based buffers you'll have to write extra code and think about memory allocation anyway.
> 
> auto gcString = toUpper("test");
> auto mallocString = toUpper!Malloc("test");
> ubyte[64] sbuf;
> auto stackString = toUpper(sbuf[], "test");
> 
> What's so bad about this? It works for most of phobos, doesn't require language changes and it's easy to realize what's going on when reading the code. Having an 'application default allocator' or 'thread local default allocator' or 'per function default allocator' will actually hide the allocation strategy and I bet it would cause issues.
[...]

I think a superior solution is to pass in an output range to toUpper, that does whatever form of allocation you prefer. There's nothing about toUpper that *fundamentally* depends on an allocator, therefore it shouldn't even *care* what an allocator is. Reduced to its absolute fundamentals, it just takes data from some input string, and produces some output data. Where this output data goes is none of its concern -- it can be a GC string, an ARC string, stdout, an interprocess pipe, a network socket, toUpper shouldn't have to care which one it is. Just take an output range.

Then on the complementary side, have Phobos provide a bunch of premade output ranges that allocate a GC string, or an ARC string, or whatever, and then the user can just pick one of those to pass to toUpper.
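
A minimal sketch of the output-range approach, with some assumptions: toUpperInto and FixedSink are hypothetical names made up for this example, while appender, put, isOutputRange, std.uni.toUpper and std.utf.encode are existing Phobos facilities.

```d
import std.array : appender;
import std.range.primitives : isOutputRange, put;
import std.uni : toUpper;
import std.utf : encode;

// Hypothetical sink-based variant: it never allocates on its own and
// only pushes upper-cased code points into whatever sink it is given.
void toUpperInto(Sink)(ref Sink sink, const(char)[] input)
    if (isOutputRange!(Sink, dchar))
{
    foreach (dchar c; input)        // decodes UTF-8 code point by code point
        put(sink, toUpper(c));
}

// Hypothetical output range writing into a caller-supplied buffer.
struct FixedSink
{
    char[] buf;
    size_t len;

    void put(dchar c)
    {
        char[4] tmp;
        immutable n = encode(tmp, c);       // UTF-8 encode one code point
        buf[len .. len + n] = tmp[0 .. n];  // bounds-checked copy
        len += n;
    }

    const(char)[] data() const { return buf[0 .. len]; }
}

void main()
{
    // GC-backed sink: Appender grows a GC-allocated string.
    auto gcSink = appender!string();
    toUpperInto(gcSink, "test");
    assert(gcSink.data == "TEST");

    // Stack-backed sink: same algorithm, no heap allocation by the sink.
    char[64] storage;
    auto stackSink = FixedSink(storage[]);
    toUpperInto(stackSink, "test");
    assert(stackSink.data == "TEST");
}
```

The algorithm is written once; the allocation decision lives entirely in the sink the caller chooses.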


T

-- 
Almost all proofs have bugs, but almost all theorems are true. -- Paul Pedersen
February 06, 2014
Am Thu, 06 Feb 2014 08:40:28 -0800
schrieb Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org>:

> >
> > auto gcString = toUpper("test");
> > auto mallocString = toUpper!Malloc("test");
> > ubyte[64] sbuf;
> > auto stackString = toUpper(sbuf[], "test");
> >
> > What's so bad about this?
> 
> The issue here is that Phobos functions need to document whether e.g. they return memory that can be deallocated or not. Counterexamples would be returning static strings or subslices of allocations.
> 
> I'm not saying it's not solvable, but it'll take some thinking and some work.

That's true. I wonder how common these cases are but slices are probably the bigger problem here. (OTOH if a function just slices the input, we'd have to document it but there's no bigger issue)

> > [...] Having an 'application default allocator' or
> > 'thread local default allocator' or 'per function default
> > allocator' will actually hide the allocation strategy and I bet it
> > would cause issues.
> 
> I think a crack should be given to the user to install their own allocator (per thread and/or shared). Perhaps we can limit that to the startup stage, i.e. before any allocation takes place.

If we can make that work then I won't complain. As long as the default
allocator can't be changed at a random point in time most problems
should be solved for a global default allocator.
For per-thread allocators this is difficult: If you allocate in one
thread and free in another how do you make sure you use the correct free
function?

There are some interesting possibilities though: For example we could add a delegate to object which points to the correct 'free' function. But then things get complicated if we have to manage the lifetime of the allocator as well....

> > This is the list of language features which allocate:
> [snip]
> 
> I think you forgot AAs.

I had AA literals in the list, but you're right some other AA features allocate as well. Good you mentioned that, I'll have to detect these cases in -nogc/-vgc code as well...

However, from a user point of view dcollections (and I hope at some point std.container as well) provides a nice replacement for all these operations, except for literals.

> 
> > We just have to provide everyone with a way to choose their favorite implementation. Which means we provide public APIs which allow any kind of memory allocation and internally do not rely on automatic memory management (internal allocation in phobos should be done on the stack/ with malloc / made configurable, but not with a GC).
> 
> I agree that's a nice goal. But I don't think it's easily attainable. The "choose the allocator" part is easy. The harder part is choosing the reclamation method. There are differences between GC and RC that are very difficult to unify under a common API.
> 

I'd guess that allocation is actually a bigger issue for those who are unhappy with the GC right now, but I have no way to prove that ;-) (Explicit manual freeing is annoying, but possible. But if a function internally allocates with the GC it can't be used at all).

But you're of course right, getting reclamation right is probably more difficult and also important.
February 06, 2014
On 2/6/14, 9:18 AM, Adam D. Ruppe wrote:
> Something that mallocs should return Malloced!T which calls the
> appropriate free (specified by the allocator) in the destructor. GC
> should return GC!T. Refcounted should return RefCounted!T, and so on.

That ain't going to work.

Malloced!T and GC!T suggest parameterization by the type of the allocator. So there would need to be a type per allocator, which is a losing proposition from std.allocator's viewpoint, since there can be so many of them via template combinatorics.

RefCounted!T is a whole different thing, because it doesn't encode allocation strategy but instead memory reclamation tactics. There's no "and so on" and RefCounted!T cannot occur in an enumeration that includes Malloced!T and GC!T.


Andrei