September 30, 2014
On Tuesday, 30 September 2014 at 12:32:08 UTC, Ola Fosheim Grøstad wrote:
> ...basic building blocks such as intrinsics to build your own RC with compiler support sounds like a more interesting option.

I agree.
October 01, 2014
On 30 September 2014 08:04, Andrei Alexandrescu via Digitalmars-d <digitalmars-d@puremagic.com> wrote:
> On 9/29/14, 10:16 AM, Paulo Pinto wrote:
>>
>> Personally, I would go just for (b) with compiler support for increment/decrement removal, as I think it will be too complex having to support everything and this will complicate all libraries.
>
>
> Compiler already knows (after inlining) that ++i and --i cancel each other, so we should be in good shape there. -- Andrei

The compiler doesn't know that MyLibrary_AddRef(Thing *t); and
MyLibrary_DecRef(Thing *t); cancel each other out though...
rc needs primitives that the compiler understands implicitly, so that
rc logic can be more complex than ++i/--i.
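
To make the pairing concrete, here is a minimal sketch (the Thing and Handle types and the peek function are hypothetical, made up for illustration). Passing a handle by value produces one increment and one matching decrement. Here the bodies are visible, so an optimizer could in principle cancel them after inlining; if they were opaque calls into a library, it would have no way of knowing they are inverses.

```d
// Hypothetical refcounted object; incs/decs only count operations.
struct Thing
{
    int refs = 1;   // the first owner is accounted for up front
    int incs;
    int decs;
}

// Hypothetical handle: copying adds a reference, destruction drops one.
struct Handle
{
    Thing* t;
    this(this) { if (t) { ++t.refs; ++t.incs; } }  // postblit, runs on copy
    ~this()    { if (t) { --t.refs; ++t.decs; } }
}

// Takes the handle by value, so each call copies and then destroys it.
int peek(Handle h) { return h.t.refs; }

void main()
{
    auto thing = new Thing;
    {
        auto h = Handle(thing);   // built in place, no copy
        auto seen = peek(h);      // copy: inc on entry, dec on exit
        assert(seen == 2);
        assert(thing.incs == 1 && thing.decs == 1);  // one cancellable pair
        assert(thing.refs == 1);
    }
    assert(thing.refs == 0);      // h itself has been destroyed
}
```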
October 01, 2014
On Wednesday, 1 October 2014 at 01:26:45 UTC, Manu via
Digitalmars-d wrote:
> On 30 September 2014 08:04, Andrei Alexandrescu via Digitalmars-d
> <digitalmars-d@puremagic.com> wrote:
>> On 9/29/14, 10:16 AM, Paulo Pinto wrote:
>>>
>>> Personally, I would go just for (b) with compiler support for
>>> increment/decrement removal, as I think it will be too complex having to
>>> support everything and this will complicate all libraries.
>>
>>
>> Compiler already knows (after inlining) that ++i and --i cancel each other,
>> so we should be in good shape there. -- Andrei
>
> The compiler doesn't know that MyLibrary_AddRef(Thing *t); and
> MyLibrary_DecRef(Thing *t); cancel each other out though...
> rc needs primitives that the compiler understands implicitly, so that
> rc logic can be more complex than ++i/--i.

Even with simple i++ and i--, the information that they always come
in pairs is lost on the compiler in many cases.
October 01, 2014
On 29 September 2014 20:49, Andrei Alexandrescu via Digitalmars-d <digitalmars-d@puremagic.com> wrote:
> [...]
>
> Destroy!
>
> Andrei

I generally like the idea, but my immediate concern is that it implies
that every function that may deal with allocation is a template.
This interferes with C/C++ compatibility in a pretty big way. Or more
generally, the idea of a lib. Does this mean that a lib will be
required to produce code for every permutation of functions according
to memory management strategy? Usually libs don't contain code for
uninstantiated templates.

With this in place, I worry that traditional use of libs, separate
compilation, external language linkage, etc, all become very
problematic.
Pervasive templates can only work well if all code is D code, and if
all code is compiled together.
Most non-OSS industry doesn't ship source, they ship libs. And if libs
are to become impractical, then dependencies become a problem; instead
of linking libphobos.so, you pretty much have to compile phobos
together with your app (already basically true for phobos, but it's
fairly unique).
What if that were a much larger library? What if you have 10s of
dependencies all distributed in this manner? Does it scale?

I guess this doesn't matter if this is only a proposal for phobos... but I suspect the pattern will become pervasive if it works, and yeah, I'm not sure where that leads.
October 01, 2014
On Monday, 29 September 2014 at 10:49:53 UTC, Andrei Alexandrescu wrote:
> Back when I first introduced RCString I hinted that we have a larger strategy in mind. Here it is.

Slightly related :)

https://github.com/D-Programming-Language/phobos/pull/2573
October 01, 2014
On 9/30/14, 9:10 AM, Sean Kelly wrote:
> On Monday, 29 September 2014 at 10:49:53 UTC, Andrei Alexandrescu wrote:
>>
>> The policy is a template parameter to functions in Phobos (and
>> elsewhere), and informs the functions e.g. what types to return.
>> Consider:
>>
>> auto setExtension(MemoryManagementPolicy mmp = gc, R1, R2)(R1 path, R2
>> ext)
>> if (...)
>> {
>>     static if (mmp == gc) alias S = string;
>>     else alias S = RCString;
>>     S result;
>>     ...
>>     return result;
>> }
>
> Is this for exposition purposes or actually how you expect it to work?

That's pretty much what it would take. The key here is that RCString is almost a drop-in replacement for string, so the code using it is almost identical. There will be places where code needs to be replaced, e.g.

auto s = "literal";

would need to become

S s = "literal";

So creation of strings will change a bit, but overall there's not a lot of churn.

> Quite honestly, I can't imagine how I could write a template function
> in D that needs to work with this approach.

You mean write a function that accepts a memory management policy, or a function that uses one?

> As much as I hate to say it, this is pretty much exactly what C++
> allocators were designed for.  They handle allocation, sure, but they
> also hold aliases for all relevant types for the data being allocated.
> If the MemoryManagementPolicy enum were replaced with an alias to a type
> that I could use to at least obtain relevant aliases, that would be
> something.  But even that approach dramatically complicates code that
> uses it.

I think making MemoryManagementPolicy a meaningful type is a great idea. It would e.g. define the string type, so the code becomes:

auto setExtension(alias MemoryManagementPolicy = gc, R1, R2)(R1 path, R2 ext)
if (...)
{
    MemoryManagementPolicy.string result;
    ...
    return result;
}

This is a lot more general and extensible. Thanks!
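
A minimal sketch of how that could look, under loud assumptions: ToyRCString is a hypothetical stand-in (it does no reference counting at all), the GC and RC policy structs are made-up names, and the function takes plain character slices rather than the ranges R1 and R2. The only point is that the policy type carries the string alias, so the function body never names RCString.

```d
import std.string : lastIndexOf;

// Hypothetical stand-in for RCString; a real one would own a refcount.
struct ToyRCString
{
    private immutable(char)[] payload;
    this(immutable(char)[] s) { payload = s; }
    immutable(char)[] toString() const { return payload; }
}

// Hypothetical policies: simple structs that define the relevant aliases.
struct GC { alias string = immutable(char)[]; }
struct RC { alias string = ToyRCString; }

auto setExtension(Policy = GC)(const(char)[] path, const(char)[] ext)
{
    auto dot = path.lastIndexOf('.');
    auto stem = dot < 0 ? path : path[0 .. dot];
    // The toy builds the text on the GC heap first; a real RC policy
    // would allocate through its own primitives instead.
    Policy.string result = (stem ~ "." ~ ext).idup;
    return result;
}

void main()
{
    auto a = setExtension("report.txt", "md");      // default policy
    auto b = setExtension!RC("report.txt", "md");   // explicit policy
    static assert(is(typeof(a) == immutable(char)[]));
    static assert(is(typeof(b) == ToyRCString));
    assert(a == "report.md");
    assert(b.toString() == "report.md");
}
```

Callers that don't care keep writing setExtension(path, ext) and get the GC flavor; only the explicit instantiation changes the return type.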

Why do you think there'd be dramatic complication of code? (Granted, at some point we must acknowledge that some egg breaking is necessary for the proverbial omelette.)

> Having written standards-compliant containers in C++, I honestly can't
> imagine the average user writing code that works this way. Once you
> assert that the reference type may be a pointer or it may be some
> complex proxy to data stored elsewhere, a lot of composability pretty
> much flies right out the window.

The thing is, again, we must make some changes if we want D to be usable without a GC. One of them is e.g. to not allocate built-in slices all over the place.

> For example, I have an implementation of C++ unordered_map/set/etc
> designed to be a customizable cache, so one of its template arguments is
> a policy type that allows eviction behavior to be chosen at declaration
> time.  Maybe the cache is size-limited, maybe it's age-limited, maybe
> it's a combination of the two or something even more complicated.  The
> problem is that the container defines all the aliases relating to the
> underlying data, but the policy, which needs to be aware of these, is
> passed as a template argument to this container.
>
> To make something that's fully aware of C++ allocators then, I'd have to
> define a small type that takes the container template arguments (the
> contained type and the allocator type) and generates the aliases and
> pass this to the policy, which in turn passes the type through to the
> underlying container so it can declare its public aliases and whatever
> else in true standards-compliant fashion (or let the container derive
> this itself, but then you run into the potential for disagreement). And
> while this is possible, doing so would complicate the creation of the
> cache policies to the point where it subverts their intent, which was to
> make it easy for the user to tune the behavior of the cache to their own
> particular needs by defining a simple type which implements a few
> functions.  Ultimately, I decided against this approach for the cache
> container and decided to restrict the allocators to those which defined
> a pointer to T as T* so the policies could be coded with basically no
> knowledge of the underlying storage.

That sounds like a rather involved artifact. Hopefully we can leverage D's better expressiveness to make building such complex libraries easier.

> So... while I support the goal you're aiming at, I want to see a much
> more comprehensive example of how this will work and how it will affect
> code written by D *users*.

Agreed.

> Because it isn't enough for Phobos to be
> written this way.  Basically all D code will have to take this into
> account for the strategy to be truly viable.  Simply outlining one of
> the most basic functions in Phobos, which already looks like it will
> have a static conditional at the beginning and *need to be aware of the
> fact that an RCString type exists* makes me terrified of what a
> realistic example will look like.

That would be overreacting :o).


Andrei

October 01, 2014
On 9/29/14, 11:44 AM, Shammah Chancellor wrote:
>
> I don't like the idea of having to pass in template parameters
> everywhere -- even for allocators.  Is there some way we could have
> "allocator contexts"?
>
> E.G.
>
> with( auto allocator = ReferenceCounted() )
> {
>      auto foo = setExtension("hello", "txt");
> }
>
> ReferenceCounted() could replace a thread-local "new" delegate with
> something it has, and when it goes out of scope, it would reset it to
> whatever it was before.   This would create some runtime overhead -- but
> I'm not sure how much more than already exists.

I'm not sure whether we can do this within D's type system. -- Andrei
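
For reference, the runtime half of the quoted idea can be sketched with a thread-local function pointer and a scope guard. Everything here is hypothetical (AllocFn, currentAlloc, AllocatorContext, makeBuffer, and both allocators, which are GC-backed stand-ins that merely count calls). One possible obstacle it exposes: the swap changes which allocator runs, but the static return type stays the same, so a function could not hand back string in one context and RCString in another.

```d
alias AllocFn = void[] function(size_t size);

// Module-level variables are thread-local by default in D.
int gcCalls;
int countedCalls;

void[] gcAlloc(size_t size)      { ++gcCalls;      return new ubyte[](size); }
// Stand-in for a reference-counting allocator; it only counts calls.
void[] countedAlloc(size_t size) { ++countedCalls; return new ubyte[](size); }

AllocFn currentAlloc = &gcAlloc;

// Scope guard: installs an allocator, restores the previous one on exit.
struct AllocatorContext
{
    private AllocFn saved;
    @disable this(this);
    this(AllocFn replacement)
    {
        saved = currentAlloc;
        currentAlloc = replacement;
    }
    ~this() { currentAlloc = saved; }
}

// Library code asks whatever allocator is current; its type never changes.
char[] makeBuffer(size_t n) { return cast(char[]) currentAlloc(n); }

void main()
{
    makeBuffer(4);                               // GC allocator
    {
        auto ctx = AllocatorContext(&countedAlloc);
        auto buf = makeBuffer(8);                // swapped allocator
        static assert(is(typeof(buf) == char[]));
        assert(buf.length == 8);
    }
    makeBuffer(4);                               // GC allocator restored
    assert(gcCalls == 2 && countedCalls == 1);
}
```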
October 01, 2014
On 9/29/14, 1:07 PM, Uranuz wrote:
> 1. As far as I understand, allocation and memory management of
> entities like classes (Object), dynamic arrays and associative
> arrays is part of the language/runtime. What is proposed here is a
> *fix* to the standard library. But allocation and MM happening
> via the GC is not the *fault* of the standard library; it is predefined
> behaviour of the D language itself and its runtime. The standard library
> becomes a `hostage` of the runtime library in this situation. Are you
> really sure that we should "fix" the standard library in that way?
> For me it looks like implementing struts for the standard lib (which
> is not broken yet ;) ) in order to compensate for the behaviour of
> the runtime lib.

The change will be to both the runtime and the standard library.

> 2. The second question is slightly off-topic, but I still want to put it
> here. What I dislike about ranges and the standard library is that
> it's hard to understand what the returned value of a library
> function is. I have some *pedals* (front, popFront) to push and do
> some magic. Of course it was made for the purpose of making universal
> algorithms. But the more I use ranges and *auto*, the less I believe
> that I am using a statically typed language. What is wanted to make code
> clear is a distinct variable declaration with a specification
> of its type. With all of these autos, the logic of the programme becomes
> unclear, because the data structures are unclear. So I came to the
> question: is the memory management or allocation policy
> syntactically part of the declaration, or is it an inner implementation
> detail that should not be shown in the decl?

Sadly this is the way things are going (not only in D, but other languages such as C++, Haskell, Scala, etc). Type proliferation has costs, but also a ton of benefits.

Most often the memory management policy will be part of function signatures because it affects data type definitions.

> Should rc and gc strings look similar or not?
>
> string str1 = makeGCString("test");
> string str2 = makeRCString("test");
>
> // --- vs ---
>
> GCString str1 = "test";
> RCString str2 = "test";
>
> // --- or ---
>
> String!GC str1 = "test";
> String!RC str2 = "test";
>
> // --- or even ---
> @gc string str1 = "test";
> @rc string str2 = "test";
>
> As far as I understand currently we will have:
> string str1 = "test";
> RCString str2 = "test";

Per Sean's idea things would go GC.string vs. RC.string, where GC and RC are two memory management policies (simple structs defining aliases and probably a few primitives).

> So another question is why the same object "string" is
> implemented as different types. Array and struct (class)?

A reference counted string has a different layout than immutable(char)[].

> 3. Should algorithms based on the range interface care about
> allocation? A range is about iteration and access to elements but
> not about allocation and memory management.

Most don't.

> I would like to have attributes @rc, @gc (or like these) to
> switch MM-policy versus *String!RC* or *RCString* but we cannot
> apply attributes to a literal. Passing something like this to an
> algorithm:
>
> find( @rc "test", @rc "t" )
>
> is syntactically incorrect. But we can use this form:
>
> find( RCString("test"), RCString("t") )
>
> But the above form is more verbose. As a continuation of this question,
> I have the next question.

If language changes are necessary, we will make language changes. I'm trying first to explore solutions within the language.

> 4. How to deal with literals? How to make them ref-counted?

I don't know yet.

> I ask this because even when writing RCString("test"),
> syntactically the expression "test" is still a GC-managed literal. I
> pass a GC-managed literal into a struct to make it RC-managed. Why
> not just make it RC from the start?
>
> Adding some additional template parameter to an algorithm will not
> fix this. It is a problem of D itself and its runtime library.

I understand. The problem is actually worse with array literals, which are silently dynamically allocated on the garbage-collected heap:

auto s = "hello"; // at least there's no allocation
auto a = [1, 2, 3]; // dynamic allocation

A language-based solution would change array literal syntax. A library-based solution would leave array literals with today's syntax and semantics and offer a controlled alternative a la:

auto a = MyMemPolicy.array(1, 2, 3); // cool
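
A rough sketch of what such a library-based alternative could look like, with hypothetical policy names (GCPolicy, MallocPolicy) and a made-up dispose primitive. The malloc-backed version hands ownership to the caller and does no reference counting; it only shows that the call site can look the same while the allocation strategy differs.

```d
import core.stdc.stdlib : malloc, free;

// Hypothetical policy: same behavior as today's array literals.
struct GCPolicy
{
    static T[] array(T)(T[] items...) { return items.dup; }
}

// Hypothetical policy: storage comes from malloc, the caller releases it.
struct MallocPolicy
{
    static T[] array(T)(T[] items...)
    {
        if (items.length == 0) return null;
        auto p = cast(T*) malloc(T.sizeof * items.length);
        if (p is null) assert(0, "out of memory");
        auto result = p[0 .. items.length];
        result[] = items[];   // copy the elements into the malloc'd block
        return result;
    }

    static void dispose(T)(T[] a) { free(a.ptr); }
}

void main()
{
    auto g = GCPolicy.array(1, 2, 3);       // garbage-collected heap
    auto m = MallocPolicy.array(1, 2, 3);   // C heap, freed explicitly
    scope (exit) MallocPolicy.dispose(m);

    assert(g == [1, 2, 3]);
    assert(m.length == 3 && m[0] == 1 && m[2] == 3);
}
```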

> So I assume that std lib is not broken this way and we should not
> try to fix it this way. Thanks for your attention.

And thanks for your great points.


Andrei

October 01, 2014
On 9/29/14, 3:11 PM, Freddy wrote:
>>
> Internally we should have something like:
>
> ---
> template String(MemoryManagementPolicy mmp=gc){
>       /++ ... +/
> }
> auto setExtension(MemoryManagementPolicy mmp = gc, R1, R2)(R1
> path, R2 ext)
> if (...)
> {
>       auto result=String!mmp();
>       /++ +/
> }
> ----
>
> or maybe even allowing user types in the template argument(the
> original purpose of templates)
>
> ---
> auto setExtension(String = string, R1, R2)(R1
> path, R2 ext){
>       /++ +/
> }
> ----

Good idea, and it seems Sean's is even better because it groups everything related to memory management where it belongs - in the memory management policy. -- Andrei
October 01, 2014
On 9/30/14, 7:07 AM, John Colvin wrote:
> Instead of adding a new template parameter to every function (which
> won't necessarily play nicely with existing IFTI and variadic
> templates), why not allow template modules?

Nice idea, but let's try and explore possibilities within the existing rich language. If a need for new language features arises, I trust we'll see it. -- Andrei