September 29, 2014
On Monday, 29 September 2014 at 15:18:40 UTC, Andrei Alexandrescu wrote:
> On 9/29/14, 5:29 AM, Dicebot wrote:
>> Any assumption that library code can get away with some set of
>> pre-defined allocation strategies is crap. This whole discussion was
>> about how important it is to move allocation decisions to user code
>> (ranges are just one tool to achieve that, Don has been presenting
>> examples of how we do that with plain arrays in his DConf 2014 talk).
>
> That's making exactly the confusion I was - that memory allocation strategy is the same as memory management strategy.

Yes, but neither decision belongs in library code except in very rare cases.

>
>> In that regard allocators + ranges are still the way to go in my
>> opinion. Yes, sometimes those result in very hard to use API - providing
>> GC-heavy but friendly alternatives for those shouldn't do any harm. But
>> in general full decoupling of algorithms from allocations is necessary.
>> If that makes D a poor cousin of C++, we may have to learn a few tricks from C++.
>
> As long as things are trivial they can be done with relative ease, albeit with more pain. But consider e.g. the recent JSON library by Sönke. It needs to create a lookup data structure and return things like strings from it. What primitives do you think could it define?

Sounds like it may have to define its own kind of allocator with certain implementation restrictions (and implement it in terms of the GC by default). I have not actually read the code for that proposal, so it is hard to guess. I will need to do that if it really matters.
September 29, 2014
On 9/29/14, 8:53 AM, Dicebot wrote:
> On Monday, 29 September 2014 at 15:18:40 UTC, Andrei Alexandrescu wrote:
>> On 9/29/14, 5:29 AM, Dicebot wrote:
>>> Any assumption that library code can get away with some set of
>>> pre-defined allocation strategies is crap. This whole discussion was
>>> about how important it is to move allocation decisions to user code
>>> (ranges are just one tool to achieve that, Don has been presenting
>>> examples of how we do that with plain arrays in his DConf 2014 talk).
>>
>> That's making exactly the confusion I was - that memory allocation
>> strategy is the same as memory management strategy.
>
> Yes but neither decision belongs to library code except for very rare
> cases.

You just assert it, so all I can say is "I understand you believe this". I've motivated my argument. You may want to do the same for yours.

>>> In that regard allocators + ranges are still the way to go in my
>>> opinion. Yes, sometimes those result in very hard to use API - providing
>>> GC-heavy but friendly alternatives for those shouldn't do any harm. But
>>> in general full decoupling of algorithms from allocations is necessary.
>>> If that makes D a poor cousin of C++, we may have to learn a few
>>> tricks from C++.
>>
>> As long as things are trivial they can be done with relative ease,
>> albeit with more pain. But consider e.g. the recent JSON library by
>> Sönke. It needs to create a lookup data structure and return things
>> like strings from it. What primitives do you think could it define?
>
> Sounds like it may have to define its own kind of allocator with certain
> implementation restrictions (and implement it in terms of the GC by
> default). I have not actually read the code for that proposal, so it is
> hard to guess. I will need to do that if it really matters.

So you don't have an answer. And again you are confusing memory allocation with memory management.

I have sketched an approach that works and will take us to Phobos being most transparently usable with tracing collection or with reference counting. Part of that is RCString (and generally reference counted slices and hashtables), and another part is the @refcounted attribute for classes. I will push it through. If you have any objections, it would be great if you argued them properly.


Thanks,

Andrei

September 29, 2014
Am 29.09.2014 12:49, schrieb Andrei Alexandrescu:
> [...]
>
> The three policies are:
>
> (a) gc is the classic garbage-collected style of management;
>
> (b) rc is a reference-counted style still backed by the GC, i.e. the GC
> will still be able to pick up cycles and other kinds of leaks.
>
> (c) mrc is a reference-counted style backed by malloc.
>
> (It should be possible to collapse rc and mrc together and make the
> distinction dynamically, at runtime. I'm distinguishing them statically
> here for expository purposes.)
>
> ...

Personally, I would go just for (b) with compiler support for increment/decrement removal, as I think it will be too complex having to support everything and this will complicate all libraries.

Anyway, that was just my 0.02€. I am stepping out of the thread, as I just toy around with D and cannot add much more to the discussion.

--
Paulo

September 29, 2014
On Monday, 29 September 2014 at 17:04:54 UTC, Andrei Alexandrescu wrote:
>> Yes but neither decision belongs to library code except for very rare
>> cases.
>
> You just assert it, so all I can say is "I understand you believe this". I've motivated my argument. You may want to do the same for yours.

I have probably missed the part with the arguments :) Your reasoning is not fundamentally different from "GC should be enough", just extended from a single option to several.

My argument is simple - one can't foresee everything. I remember reading a book by a guy who has been advocating a thing called "policy-based design", you may know him ;) I was quite impressed with the simple but practical basic idea - decoupling parts of the implementation that are not inherently related.

> So you don't have an answer. And again you are confusing memory allocation with memory management.

Yes, sorry, I don't have an answer. Or time to dive deeply into the code unless it is really important or my direct responsibility.

Unfortunately, I don't see how your proposal fits our code either. Most Sociomantic code relies on passing arrays as ref arguments to avoid creating new GC roots (no, we don't need/want to switch to ARC). This has been cited several times as the reason why Phobos in its current shape is largely unusable for our needs, even once the D2 switch is finished. I don't see how the proposal in the original post changes that.
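
For readers unfamiliar with the idiom mentioned above, here is a minimal sketch of passing a reusable array by ref so the caller owns the memory. The name `setExtensionInto` and its simplified extension handling (it ignores directory separators) are hypothetical; this is not Sociomantic or Phobos code.

```d
import std.stdio : writeln;

// Hypothetical helper (not a Phobos function): writes `path` with its
// extension replaced by `ext` into a buffer owned by the caller.
void setExtensionInto(ref char[] buf, const(char)[] path, const(char)[] ext)
{
    // Find the last '.', if any; everything before it is the stem.
    size_t stemLen = path.length;
    foreach_reverse (i, c; path)
    {
        if (c == '.')
        {
            stemLen = i;
            break;
        }
    }

    // Reset the length and tell the runtime that appending in place is
    // fine, because the caller promises no other slice uses this memory.
    buf.length = 0;
    buf.assumeSafeAppend();
    buf ~= path[0 .. stemLen];
    buf ~= ext;
}

void main()
{
    char[] buf; // owned by the caller and reused across calls

    setExtensionInto(buf, "hello.doc", ".txt");
    writeln(buf); // hello.txt

    setExtensionInto(buf, "world.d", ".o");
    writeln(buf); // world.o
}
```

The library function decides nothing about where the memory comes from; it only appends into whatever the caller handed in, which may let later calls reuse the block grown by earlier ones.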
September 29, 2014
On 2014-09-29 12:49, Andrei Alexandrescu wrote:

> Now that we clarified that these existing attempts are not going to work
> well, the question remains what does. For Phobos I'm thinking of
> defining and using three policies:
>
> enum MemoryManagementPolicy { gc, rc, mrc }
> immutable
>      gc = MemoryManagementPolicy.gc,
>      rc = MemoryManagementPolicy.rc,
>      mrc = MemoryManagementPolicy.mrc;
>
> The three policies are:
>
> (a) gc is the classic garbage-collected style of management;
>
> (b) rc is a reference-counted style still backed by the GC, i.e. the GC
> will still be able to pick up cycles and other kinds of leaks.
>
> (c) mrc is a reference-counted style backed by malloc.
>
> (It should be possible to collapse rc and mrc together and make the
> distinction dynamically, at runtime. I'm distinguishing them statically
> here for expository purposes.)
>
> The policy is a template parameter to functions in Phobos (and
> elsewhere), and informs the functions e.g. what types to return. Consider:
>
> auto setExtension(MemoryManagementPolicy mmp = gc, R1, R2)(R1 path, R2 ext)
> if (...)
> {
>      static if (mmp == gc) alias S = string;
>      else alias S = RCString;
>      S result;
>      ...
>      return result;
> }
>
> On the caller side:
>
> auto p1 = setExtension("hello", ".txt"); // fine, use gc
> auto p2 = setExtension!gc("hello", ".txt"); // same
> auto p3 = setExtension!rc("hello", ".txt"); // fine, use rc

How do allocators fit into this? Will an allocator be an additional argument to the function, or will there be a separate stack that one can push allocators to and pop them from?

-- 
/Jacob Carlborg
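
As a reading aid, the quoted proposal can be fleshed out into something that compiles. This is only a sketch under stated assumptions: `FakeRCString` is a hypothetical stand-in for the proposed RCString (which is not defined in this thread), the shorthand names are declared with `enum` rather than `immutable`, and the path handling is deliberately simplified.

```d
import std.stdio : writeln;

enum MemoryManagementPolicy { gc, rc, mrc }

// Shorthand names so callers can write setExtension!rc.
// Assumption: declared as enum constants here for simplicity.
enum gc = MemoryManagementPolicy.gc;
enum rc = MemoryManagementPolicy.rc;
enum mrc = MemoryManagementPolicy.mrc;

// Hypothetical stand-in for the proposed RCString; it does no
// reference counting and only makes the return type visibly different.
struct FakeRCString
{
    string payload;
}

// The policy is a defaulted template parameter that selects the return type.
auto setExtension(MemoryManagementPolicy mmp = gc)(string path, string ext)
{
    // Simplified: strip everything from the last '.' onwards, if present.
    size_t stemLen = path.length;
    foreach_reverse (i, c; path)
    {
        if (c == '.')
        {
            stemLen = i;
            break;
        }
    }
    string joined = path[0 .. stemLen] ~ ext;

    static if (mmp == gc)
        return joined;
    else
        return FakeRCString(joined);
}

void main()
{
    auto p1 = setExtension("hello", ".txt");    // default policy: gc
    auto p2 = setExtension!gc("hello", ".txt"); // same
    auto p3 = setExtension!rc("hello", ".txt"); // rc policy

    static assert(is(typeof(p1) == string));
    static assert(is(typeof(p2) == string));
    static assert(is(typeof(p3) == FakeRCString));

    writeln(p1);         // hello.txt
    writeln(p3.payload); // hello.txt
}
```

Note that the sketch says nothing about where an allocator would plug in, which is exactly the open question raised above.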
September 29, 2014
On Monday, 29 September 2014 at 12:29:33 UTC, Dicebot wrote:
> Any assumption that library code can get away with some set of pre-defined allocation strategies is crap. This whole discussion was about how important it is to move allocation decisions to user code (ranges are just one tool to achieve that, Don has been presenting examples of how we do that with plain arrays in his DConf 2014 talk).

I think the key to this sort of issue is to try and get as much functionality in Phobos marked @nogc as possible. After that, building new library-like functionality into a DUB package that assumes @nogc and only uses the @nogc code in Phobos would be the next step. Should that get to a state where it's popular and supported, pulling it in as std.nogc.* might make sense, but trying to redo Phobos as a manual memory collection library is infeasible.

Were I your company, I'd start working on leading such an effort.

Unlike Tango, I don't think a development like this would split the community nor the community's resources in a useless fashion.
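
To illustrate what marking functionality @nogc means in practice, here is a small sketch. The function `extensionStart` is a hypothetical example, not an existing Phobos symbol.

```d
// The compiler rejects this function if anything in its body could
// allocate from the GC heap.
@nogc nothrow pure @safe
size_t extensionStart(const(char)[] path)
{
    // Scanning a slice needs no allocation, so this is accepted.
    foreach_reverse (i, c; path)
    {
        if (c == '.')
            return i;
    }

    // By contrast, a line such as
    //     auto copy = path ~ ".txt";
    // would be a compile error here, because concatenating runtime
    // arrays allocates with the GC.
    return path.length;
}

void main() @nogc
{
    assert(extensionStart("hello.txt") == 5);
    assert(extensionStart("hello") == 5);
}
```

Code in a package built only on such functions can itself be marked @nogc, which is the layering suggested above.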
September 29, 2014
On Monday, 29 September 2014 at 10:49:53 UTC, Andrei Alexandrescu wrote:
> On the caller side:
>
> auto p1 = setExtension("hello", ".txt"); // fine, use gc
> auto p2 = setExtension!gc("hello", ".txt"); // same
> auto p3 = setExtension!rc("hello", ".txt"); // fine, use rc
>
> So by default it's going to continue being business as usual, but certain functions will allow passing in a (defaulted) policy for memory management.

Forcing someone (or rather, a team of someones) to call into the library in a consistent fashion like this seems like a rather risky venture. I suppose that you could add some special compiler checks to make sure that people are being consistent, but I'd probably rather see some way of templating modules so that the chances for human error are reduced.

--- foo.d ---
module std.foo(GC = gc);

void bar() {
   static if (GC == gc) {
      ...
   }
}

--- usercode.d ---
import std.foo!rc;

void fooCaller() {
    bar();
}

Though truthfully, I'd rather it be a compiler flag. But I presume that there's an issue with that, which it is too early for my brain to think of.
September 29, 2014
On 2014-09-29 10:49:52 +0000, Andrei Alexandrescu said:

> Back when I've first introduced RCString I hinted that we have a larger strategy in mind. Here it is.
> 
> The basic tenet of the approach is to reckon and act on the fact that memory allocation (the subject of allocators) is an entirely distinct topic from memory management, and more generally resource management. This clarifies that it would be wrong to approach alternatives to GC in Phobos by means of allocators. GC is not only an approach to memory allocation, but also an approach to memory management. Reducing it to either one is a mistake. In hindsight this looks rather obvious but it has caused me and many people better than myself a lot of headache.
> 
> That said allocators are nice to have and use, and I will definitely follow up with std.allocator. However, std.allocator is not the key to a @nogc Phobos.
> 
> Nor are ranges. There is an attitude that either output ranges, or input ranges in conjunction with lazy computation, would solve the issue of creating garbage. https://github.com/D-Programming-Language/phobos/pull/2423 is a good illustration of the latter approach: a range would be lazily created by chaining stuff together. A range-based approach would take us further than the allocators, but I see the following issues with it:
> 
> (a) the whole approach doesn't stand scrutiny for non-linear outputs, e.g. outputting some sort of associative array or really any composite type quickly becomes tenuous either with an output range (eager) or with exposing an input range (lazy);
> 
> (b) makes the style of programming without GC radically different, and much more cumbersome, than programming with GC; as a consequence, programmers who consider changing one approach to another, or implementing an algorithm neutral to it, are looking at a major rewrite;
> 
> (c) would make D/@nogc a poor cousin of C++. This is quite out of character; technically, I have long gotten used to seeing most elaborate C++ code like poor emulation of simple D idioms. But C++ has spent years and decades taking to perfection an approach without a tracing garbage collector. A departure from that would need to be superior, and that doesn't seem to be the case with range-based approaches.
> 
> ===========
> 
> Now that we clarified that these existing attempts are not going to work well, the question remains what does. For Phobos I'm thinking of defining and using three policies:
> 
> enum MemoryManagementPolicy { gc, rc, mrc }
> immutable
>      gc = MemoryManagementPolicy.gc,
>      rc = MemoryManagementPolicy.rc,
>      mrc = MemoryManagementPolicy.mrc;
> 
> The three policies are:
> 
> (a) gc is the classic garbage-collected style of management;
> 
> (b) rc is a reference-counted style still backed by the GC, i.e. the GC will still be able to pick up cycles and other kinds of leaks.
> 
> (c) mrc is a reference-counted style backed by malloc.
> 
> (It should be possible to collapse rc and mrc together and make the distinction dynamically, at runtime. I'm distinguishing them statically here for expository purposes.)
> 
> The policy is a template parameter to functions in Phobos (and elsewhere), and informs the functions e.g. what types to return. Consider:
> 
> auto setExtension(MemoryManagementPolicy mmp = gc, R1, R2)(R1 path, R2 ext)
> if (...)
> {
>      static if (mmp == gc) alias S = string;
>      else alias S = RCString;
>      S result;
>      ...
>      return result;
> }
> 
> On the caller side:
> 
> auto p1 = setExtension("hello", ".txt"); // fine, use gc
> auto p2 = setExtension!gc("hello", ".txt"); // same
> auto p3 = setExtension!rc("hello", ".txt"); // fine, use rc
> 
> So by default it's going to continue being business as usual, but certain functions will allow passing in a (defaulted) policy for memory management.
> 
> Destroy!
> 
> 
> Andrei

I don't like the idea of having to pass in template parameters everywhere -- even for allocators. Is there some way we could have "allocator contexts"?

E.G.

with( auto allocator = ReferenceCounted() )
{
	auto foo = setExtension("hello", "txt");
}

ReferenceCounted() could replace a thread-local "new" delegate with its own, and when it goes out of scope, it would reset the delegate to whatever it was before. This would create some runtime overhead -- but I'm not sure how much more than already exists.

-Shammah
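
The save-and-restore behaviour described above can be sketched with a scope guard over a thread-local setting. Everything here is hypothetical (`Policy`, `currentPolicy`, `PolicyScope`, `describe`); D has no replaceable "new" delegate, so the sketch switches a plain variable instead of the allocator itself.

```d
import std.stdio : writeln;

enum Policy { gc, rc }

// Module-level variables in D are thread-local by default, so each
// thread gets its own current policy.
Policy currentPolicy = Policy.gc;

// Scope guard: installs a policy on construction and restores the
// previous one when it goes out of scope.
struct PolicyScope
{
    private Policy saved;

    @disable this(this); // copying would restore the old policy twice

    this(Policy p)
    {
        saved = currentPolicy;
        currentPolicy = p;
    }

    ~this()
    {
        currentPolicy = saved;
    }
}

// Stand-in for a library function that consults the context at run time.
string describe()
{
    return currentPolicy == Policy.gc ? "gc" : "rc";
}

void main()
{
    writeln(describe()); // gc
    {
        auto ctx = PolicyScope(Policy.rc);
        writeln(describe()); // rc
    }
    writeln(describe()); // gc
}
```

One consequence of this design: a run-time context cannot change a function's static return type, so a callee used this way would have to return a single type that works under every policy.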

September 29, 2014
> auto p1 = setExtension("hello", ".txt"); // fine, use gc
> auto p2 = setExtension!gc("hello", ".txt"); // same
> auto p3 = setExtension!rc("hello", ".txt"); // fine, use rc
>
> So by default it's going to continue being business as usual, but certain functions will allow passing in a (defaulted) policy for memory management.
>
> Destroy!

I'll try to destroy ;) Before thinking up some answers to this
problem, let me ask a few more questions.

1. As far as I understand, allocation and memory management of
entities like classes (Object), dynamic arrays and associative
arrays are part of the language/runtime. What is proposed here is a
*fix* to the standard library. But the fact that allocation and MM
happen via the GC is not the *fault* of the standard library; it is
predefined behaviour of the D language itself and its runtime. The
standard library becomes a `hostage` of the runtime library in this
situation. Are you really sure that we should "fix" the standard
library in that way? To me it looks like adding struts to the
standard lib (which is not broken yet ;) ) in order to compensate
for the behaviour of the runtime lib.

2. The second question is slightly off-topic, but I still want to put
it here. What I dislike about ranges and the standard library is that
it's hard to understand what the returned value of a library
function is. I have some *pedals* (front, popFront) to push and do
some magic. Of course this was done for the purpose of making
universal algorithms. But the more I use ranges and *auto*, the less
I believe that I am using a statically typed language. What is needed
to make code clear is a distinct variable declaration with a
specification of its type. With all of these autos the logic of the
program becomes unclear, because the data structures are unclear. So
I came to the question: is the memory management or allocation policy
syntactically part of the declaration, or is it an inner
implementation detail that should not be shown in the declaration?

Should rc and gc strings look similar or not?

string str1 = makeGCString("test");
string str2 = makeRCString("test");

// --- vs ---

GCString str1 = "test";
RCString str2 = "test";

// --- or ---

String!GC str1 = "test";
String!RC str2 = "test";

// --- or even ---
@gc string str1 = "test";
@rc string str2 = "test";

As far as I understand, currently we will have:
string str1 = "test";
RCString str2 = "test";

So another question is why the same object "string" is
implemented as different types: an array and a struct (class)?

3. Should algorithms based on the range interface care about
allocation? A range is about iteration and access to elements, but
not about allocation and memory management.

I would like to have attributes @rc, @gc (or something like these) to
switch the MM policy, rather than *String!RC* or *RCString*, but we
cannot apply attributes to a literal. Passing something like this to
an algorithm:

find( @rc "test", @rc "t" )

is syntactically incorrect. But we can use this form:

find( RCString("test"), RCString("t") )

But the above form is more verbose. As a continuation of this
question, I have the next one.

4. How to deal with literals? How to make them ref-counted?

I ask this because even when writing RCString("test"),
syntactically the expression "test" is still a GC-managed literal. I
pass a GC-managed literal into a struct to make it RC-managed. Why
not just make it RC from the start?

Adding some additional template parameter to an algorithm will not
fix this. It is a problem of D itself and its runtime library.


So I assume that the std lib is not broken in this way and we should
not try to fix it this way. Thanks for your attention.
September 29, 2014
On 9/29/14, 10:16 AM, Paulo Pinto wrote:
> Personally, I would go just for (b) with compiler support for
> increment/decrement removal, as I think it will be too complex having to
> support everything and this will complicate all libraries.

Compiler already knows (after inlining) that ++i and --i cancel each other, so we should be in good shape there. -- Andrei