October 10, 2013
Rainer Schuetze wrote:

On 26.06.2013 13:12, Michel Fortin wrote:
> Le 26-juin-2013 à 5:38, Walter Bright  a
> écrit :
>
>> On 6/26/2013 12:19 AM, Rainer Schuetze wrote:
>>>
>>> As Michel also said, the reference count does not have to be in
>>> inside the object itself, so we might want to allow reference
>>> counting on other types aswell.
>>
>> That opens the question of what is the point of other RC types? For
>> example, C++ can throw any type - but it turns out that throwing
>> anything but class types is largely pointless.
>
> RC is just another garbage collection scheme. You might favor it for
> its performance characteristics, its determinism, or the lower memory
> footprint.
>
> Or you might need it to interact with foreign code that relies on it
> (COM, Objective-C, etc.), in which case it needs to be customizable
> (use the foreign implementation) or be manually managed.
>
> That's two different use cases. And in the later case you can't use
> the GC to release cycles because foreign code is using memory
> invisible to the GC. It is important to note that when foreign code
> calls AddRef you don't want the GC to collect that object, at least
> not until Release is called.

That means you have to maintain two reference counts if you have both ARC and COM used in the same class. ARC can only release the object if both counters are 0, while the COM implementation has to add/remove roots to the GC when counting from/to 0 to avoid that the object is collected while external references exist.

October 10, 2013
Michel Fortin wrote:

Le 2013-06-26 à 17:42, Rainer Schuetze <r.sagitario@gmx.de> a écrit :

> On 26.06.2013 13:12, Michel Fortin wrote:
>> Le 26-juin-2013 à 5:38, Walter Bright  a
>> écrit :
>>
>>> On 6/26/2013 12:19 AM, Rainer Schuetze wrote:
>>>>
>>>> As Michel also said, the reference count does not have to be in
>>>> inside the object itself, so we might want to allow reference
>>>> counting on other types aswell.
>>>
>>> That opens the question of what is the point of other RC types? For
>>> example, C++ can throw any type - but it turns out that throwing
>>> anything but class types is largely pointless.
>>
>> RC is just another garbage collection scheme. You might favor it for
>> its performance characteristics, its determinism, or the lower memory
>> footprint.
>>
>> Or you might need it to interact with foreign code that relies on it
>> (COM, Objective-C, etc.), in which case it needs to be customizable
>> (use the foreign implementation) or be manually managed.
>>
>> That's two different use cases. And in the later case you can't use
>> the GC to release cycles because foreign code is using memory
>> invisible to the GC. It is important to note that when foreign code
>> calls AddRef you don't want the GC to collect that object, at least
>> not until Release is called.
>
> That means you have to maintain two reference counts if you have both ARC and COM used in the same class. ARC can only release the object if both counters are 0, while the COM implementation has to add/remove roots to the GC when counting from/to 0 to avoid that the object is collected while external references exist.

Yes. The "external" reference count prevents the gc from releasing memory and the "internal" reference count keeps track of the number of references from gc-scanned memory. When both fall to zero, the memory is released immediately; when the external count falls to zero, the gc can collect memory if it isn't connected to any of its roots.

Note that you could implement the external count as a separate hash table that'll include a gc-scanned pointer to the object. That pointer would keep the internal count above zero as long as it is present in the table. The external count doesn't need to be formally part of the gc data structure.

October 10, 2013
On 6/26/2013 4:15 AM, Michel Fortin wrote:
> Well, it's mostly required to write runtime support functions. The attribute could be more obscure so people are less tempted to use it, but if you're going to implement the ref-counting code you'll need that.

I know the temptation is strong to create more attributes as an easy solution, but I'd really like to try hard to find other ways.

> D would need manual, RC and GC to coexist peacefully.

> The problem is how to make the three of those use the same codegen?
>
> - Druntime could have a flag to disable/enable refcounting. It'd make the retain/release functions no-ops, but it'd not prevent the GC from reclaiming memory as it does today.
> - Druntime could have a flag to disable/enable garbage collection (it already has). That'd prevent cycles from being collected, but you could use weak pointers to work around that or request a collection manually at the appropriate time.
> - A @noarc (or similar) attribute at the function level could be used to prevent the compiler from generating function calls on pointer assignments. You could make a whole module @noarc if you want by adding "@noarc:" at the top.
>
> Here's the annoying thing: @noarc is totally safe if reference counting is disabled and we rely entirely on the GC. @noarc is unsafe when reference counting is enabled.

I don't really understand your point. My proposal enables manual, RC and GC to coexist peacefully. The GC wouldn't even know about RC.


>
>>> The downside is that every assignment to a pointer anywhere has to call a function. While this is some overhead, it is more predictable than overhead from a GC scan and would be preferred in some situation (games I guess). Another downside is you have an object retained by being present on the stack frame of a C function, it'd have to be explicitly retained from elsewhere.
>> Doesn't this make it impractical to mix vanilla C with D code? An important feature of D is this capability, without worrying about a "JNI" style interface.
> It's not very different than with the GC today.
>
> If you call a C function by giving it a ref-counted pointer argument, that memory block is guarantied to live at least for that call's lifetime (because it is retained by the caller). So simple calls to C functions are not a problem.
>
> If the C function puts that pointer elsewhere you'll need to retain it some other way, but you have to do this with the GC too. If you're implementing a callback called from C you need to care about what you return because the caller's C code won't retain it, while with the GC you could manage if C code did not store that pointer outside of the stack.
>
> I think that's all you have to worry about.

D (like C and C++) loves to manipulate pointers. Having to call a function every time this is done would be a disaster. It means that people would be motivated to drop down to C to do the fast code, and we might as well throw in the towel.



>> As for D switching to a full refcounted GC for everything, I'm very hesitant for such a step. For one thing, reading the clang spec on all the various pointer and function annotations necessary is very off-putting.
> Don't let Clang intimidate you. The Clang spec is about four to five time more complicated than needed because of autoreleased objects and because it supports weak pointers. Weak pointers can be implemented as a struct templates (as long as we have @noarc). And all those annotations are for special cases, when you need to break the rules. You don't use them when doing normal programming, well except for __weak.
>

Ok.

October 10, 2013
On 6/26/2013 2:33 PM, Rainer Schuetze wrote:
> On 26.06.2013 11:38, Walter Bright wrote:
>>
>> On 6/26/2013 12:19 AM, Rainer Schuetze wrote:
>>>
>>> I imagine a few (constrained) templated functions for the different
>>> operations defined in the library could also do the job, though it
>>> might drown compilation speed. Also getting help from the optimizer to
>>> remove redundant calls will need some back doors.
>>
>> I don't see how this can be done without specific compiler knowledge in
>> a memory safe way.
>
> I currently don't see how it can be memory safe with this proposal.

I'm a little confused about what you are referring to here.


>
>>>> 3. Assignment to a class reference causes a call to AddRef() on the new
>>>> value
>>>> followed by a call to Release() on its original value.
>>>
>>> It might be common knowledge, but I want to point out that the usual
>>> COM implementation (atomic increment/decrement and free when refcount
>>> goes down to 0) is not thread-safe for shared pointers. That means you
>>> either have to guard all reads and writes with a lock to make the full
>>> assignment atomic or have to implement reference counting very
>>> different (e.g. deferred reference counting).
>>
>> Since the implementation of AddRef()/Release() is up to the user,
>> whether it uses locks or not and whether it supports shared or not is up
>> to the user.
>
> You have to put the lock around the pair of AddRef and Release, but if the compiler already splits this into two function calls, this cannot be done in the implementation.

Why is it necessary to put a lock around the pair?

>
>>
>>>> 12. AddRef() is not called when passed as the implicit 'this' reference.
>>>>
>>>
>>> Isn't this unsafe if a member function is called through the last
>>> existing reference and this reference is then cleared during execution
>>> of this member function or from another thread?
>>
>> No. The caller of the function still retains a reference in that thread.
>
> Hmmm, I guess I misunderstand the proposal. Assume for example a refcounted class R and this code
>
> class R : RefCounted
> {
>     int _x;
>     int readx() { return _x; }
> }
> int main()
> {
>     R r = new R;
>     return r.readx();
> }
>
> According to 12. there is no refcounting going on when calling or executing readx. Ok, now what happens here:
>
> class R : RefCounted
> {
>     int _x;
>     int readx(C c)
>     {
>         c.r = null; // "standard" rc deletes r here
>         return _x;  // reads garbage
>     }
> }
> class C
> {
>     R r;
> }
> int main()
> {
>     C c = new C;
>     c.r = new R;
>     return c.r.readx(c);
> }
>
> This reads garbage or crashes if there is no reference counting going on when calling readx.

I think you're right. Grrrr!

>
>>
>>>
>>>> 13. Taking the address of, or passing by reference, any fields of an RC
>>>> object
>>>> is not allowed in @safe code. Passing by reference an RC field is
>>>> allowed.
>>>
>>> Please note that this includes slices to fixed size arrays.
>>
>> As I suggested, arrays would not be supported with this proposal - but
>> the user can create ref counted array-like objects.
>
> Just to clarify, I meant taking a slice of a static array that is a field of a refcounted class. Is it forbidden to have a field like this in a refcounted class or is taking the address through slicing forbidden?

It would be forbidden to obtain a slice of a ref counted object in this way - or even to simply refer to a static array embedded in a ref counted object (in @safe code).


>
>>>
>>> I feel I'm hijacking this proposal, but the step to library defined
>>> read/write barriers seems pretty small. Make AddRef, Release and
>>> assignment free template functions, e.g.
>>>
>>> void ptrConstruct(T,bool stackOrHeap)(T*adr, T p);
>>> void ptrAssign(T,bool stackOrHeap)(T*adr, T p);
>>> void ptrRelease(T,bool stackOrHeap)(T*adr);
>>>
>>> and we are able to experiment with all kinds of sophisticated GC
>>> algorithms including RC. Eliding redundant addref/release pairs would
>>> need some extra support though, I read that LLVM does something like
>>> this, but I don't know how.
>>>
>>
>> It's pretty invasive into the code generation and performance, and could
>> completely disrupt the C compatibility of D.
>
> I don't see a big difference between a free function and a member function call, though the template character of it might hurt compilation performance.
>
> Two more notes:
>
> - I'm not sure it is mentioned, but I think you have to describe what happens when copying a struct. pre- and post-blit actions have to be taken if the struct contain pointers to refcounted objects.

Yes. It's analogous to copying a struct that has fields which contain constructors and destructors.


>
> > 10. Function returns have an AddRef() already done to the return value.
>
> - A refcounted reference returned from a function (including new) would have to be Released if the return value is ignored or if only used as part of an expression.
>
>
>

That's right. Just as if a function returned a struct with a destructor.

The model I am basing this on is C++'s shared_ptr<>, which makes use of all the various rules around construction, assignment, and destruction. The wrinkles we have are:

1. memory safety
2. working with the GC

October 10, 2013
Michel Fortin wrote:

Le 26-juin-2013 à 19:48, Walter Bright  a écrit :

> D (like C and C++) loves to manipulate pointers. Having to call a function every time this is done would be a disaster. It means that people would be motivated to drop down to C to do the fast code, and we might as well throw in the towel.

But at the same time, having a GC that stops the world at irregular interval is worse for certain kind of things, games notably. It's been stated often on the forum that game developers prefer increasing the overhead if it prevents those hiccups. Making everything in D reference-counted would undoubtedly increase the general overhead, but it'd allow game developers to leverage the whole language and its libraries instead of restricting themselves to a custom subset of the class hierarchy derived from a reference counted class.

And about pointer manipulation: for cases where you know for certain that the pointer still points to the same memory block before and after the assignment (when you call popFront on an array for instance), you have no reference count to update and can elide the call.

The downside of optional support for language-wide reference counting is that it requires two incompatible codegen (or rather one incompatible with RC). We could have only one if it's the one that calls the retain/release functions on pointer assignment, with those functions replaced with empty stubs in druntime when reference counting is disabled, but some overhead would remain for the function call.

I'm not claiming this is the right solution. It's just an idea I wanted to mention as an aside because it has some common points. It is however a mostly separate matter from your initial goal of supporting custom reference counting schemes for some object hierarchies. I decided to write about it mostly because you talked about reimplementing arrays using classes and that got me thinking. But perhaps I shouldn't have mentioned it because it seems to be side-tracking the discussion.

October 10, 2013
On 6/26/2013 6:33 PM, Michel Fortin wrote:
> Le 26-juin-2013 à 19:48, Walter Bright  a écrit :
>
>> D (like C and C++) loves to manipulate pointers. Having to call a function every time this is done would be a disaster. It means that people would be motivated to drop down to C to do the fast code, and we might as well throw in the towel.
> But at the same time, having a GC that stops the world at irregular interval is worse for certain kind of things, games notably. It's been stated often on the forum that game developers prefer increasing the overhead if it prevents those hiccups. Making everything in D reference-counted would undoubtedly increase the general overhead, but it'd allow game developers to leverage the whole language and its libraries instead of restricting themselves to a custom subset of the class hierarchy derived from a reference counted class.
>
> And about pointer manipulation: for cases where you know for certain that the pointer still points to the same memory block before and after the assignment (when you call popFront on an array for instance), you have no reference count to update and can elide the call.

I've never seen a scheme for "knows for certain" that did not involve extensive and intrusive pointer annotations, something we very much want to avoid. Pointer annotations work great in theory, but in practice no successful language I know of uses them (we'll see if Rust will become an exception).

>
> The downside of optional support for language-wide reference counting is that it requires two incompatible codegen (or rather one incompatible with RC). We could have only one if it's the one that calls the retain/release functions on pointer assignment, with those functions replaced with empty stubs in druntime when reference counting is disabled, but some overhead would remain for the function call.
>
> I'm not claiming this is the right solution. It's just an idea I wanted to mention as an aside because it has some common points. It is however a mostly separate matter from your initial goal of supporting custom reference counting schemes for some object hierarchies. I decided to write about it mostly because you talked about reimplementing arrays using classes and that got me thinking. But perhaps I shouldn't have mentioned it because it seems to be side-tracking the discussion.
>

This proposal is modeled after C++'s shared_ptr<T> in that it should have equivalent performance and capabilities. Since it has been well accepted into the C++ "best practices", I think we're on solid ground with it.

October 10, 2013
Rainer Schuetze wrote:

On 27.06.2013 02:11, Walter Bright wrote:
>
> On 6/26/2013 2:33 PM, Rainer Schuetze wrote:
>> On 26.06.2013 11:38, Walter Bright wrote:
>>>
>>> On 6/26/2013 12:19 AM, Rainer Schuetze wrote:
>>>>
>>>> I imagine a few (constrained) templated functions for the
>>>> different operations defined in the library could also do the
>>>> job, though it might drown compilation speed. Also getting help
>>>> from the optimizer to remove redundant calls will need some
>>>> back doors.
>>>
>>> I don't see how this can be done without specific compiler
>>> knowledge in a memory safe way.
>>
>> I currently don't see how it can be memory safe with this
>> proposal.
>
> I'm a little confused about what you are referring to here.

When preparing for dconf I read the "Garbage Collection Handbook" by
Richard Jones, and it very much supported my suspicion that the usual
reference counting cannot be both memory safe and high-performance.

>>
>> You have to put the lock around the pair of AddRef and Release, but
>> if the compiler already splits this into two function calls, this
>> cannot be done in the implementation.
>
> Why is it necessary to put a lock around the pair?

To be more accurate, it is the assignment and the Release that have to
be atomic, in addition to a concurrent read with AddRef. Imagine the
reading thread is suspended while just having read the pointer, but not
incremented the reference count yet. If an assignment with release and deletion is performed before the thread resumes, AddRef is called on garbage.

IIRC you also have the GC handbook book on your shelf. Check the
chapters on RC, especially algorithm 18.2 "Eager reference counting with
CompareAndSwap is broken".


>> Just to clarify, I meant taking a slice of a static array that is a
>>  field of a refcounted class. Is it forbidden to have a field like
>> this in a refcounted class or is taking the address through slicing
>> forbidden?
>
> It would be forbidden to obtain a slice of a ref counted object in
> this way - or even to simply refer to a static array embedded in a
> ref counted object (in @safe code).

Ok.

>> Two more notes:
>>
>> - I'm not sure it is mentioned, but I think you have to describe
>> what happens when copying a struct. pre- and post-blit actions have
>> to be taken if the struct contain pointers to refcounted objects.
>
> Yes. It's analogous to copying a struct that has fields which contain
>  constructors and destructors.

Ok, I tend to forget about the swapping to a temporary when assigning structs.

> The model I am basing this on is C++'s shared_ptr<>, which makes use
> of all the various rules around construction, assignment, and
> destruction. The wrinkles we have are:
>
> 1. memory safety

This is the hard part.

> 2. working with the GC

I don't think that the GC is getting in the way as long as it is
mark-and-sweep. A fully reference counting GC is a different story and can be made concurrent, but it is usually not eager and needs to defer actual reference counting to avoid locks. Instead it logs accesses to some thread local buffer which needs to be processed eventually.


October 10, 2013
Rainer Schuetze wrote:


>> Why is it necessary to put a lock around the pair?

> To be more accurate, it is the assignment and the Release that have to
> be atomic, in addition to a concurrent read with AddRef. Imagine the
> reading thread is suspended while just having read the pointer, but not
> incremented the reference count yet. If an assignment with release and
> deletion is performed before the thread resumes, AddRef is called on
> garbage.

On my way to work today, I figured that a safe but slow implementation can work, if the interface is not AddRef/Release, but

class C
{
  C readThis();
  void writeThis(ref C c);
}

where the function can include the necessary locks, e.g.

class C
{
  int refcnt;

  C readThis()
  {
    synchronized(this)
    {
      refcnt++;
      return this;
    }
  }
  void writeThis(ref C c)
  {
    synchronized(c)
    {
       C x = c;
       c = this;
       if (--c.refcnt == 0)
         delete c;
    }
  }
}

Reading/Writing null (e.g. when constructing or destructing a reference) would have to be special cased, that would not be necesessary with free functions.

October 10, 2013
Michel Fortin wrote:


Le 27-juin-2013 à 8:03, "Rainer Schuetze" <r.sagitario@gmx.de> a écrit :

> On my way to work today, I figured that a safe but slow implementation can work, if the interface is not AddRef/Release, but
>
> class C
> {
>  C readThis();
>  void writeThis(ref C c);
> }
>
> where the function can include the necessary locks, e.g.
>
> class C
> {
>  int refcnt;
>
>  C readThis()
>  {
>    synchronized(this)
>    {
>      refcnt++;
>      return this;
>    }
>  }
>  void writeThis(ref C c)
>  {
>    synchronized(c)
>    {
>       C x = c;
>       c = this;
>       if (--c.refcnt == 0)
>         delete c;
>    }
>  }
> }

There's an error in this code. You must synchronize on the lock protecting the pointer, not on the lock at the other end of the pointer's value.

Also, you only need to do this if the pointer pointing to the object is shared. If the pointer is thread-local, assignment does not need to be atomic. And if the object itself is thread-local, not even the reference counter need to be atomic.

October 10, 2013
On 6/26/2013 11:28 PM, Rainer Schuetze wrote:
> On 27.06.2013 02:11, Walter Bright wrote:
>>
>> On 6/26/2013 2:33 PM, Rainer Schuetze wrote:
>>> On 26.06.2013 11:38, Walter Bright wrote:
>>>>
>>>> On 6/26/2013 12:19 AM, Rainer Schuetze wrote:
>>>>>
>>>>> I imagine a few (constrained) templated functions for the
>>>>> different operations defined in the library could also do the
>>>>> job, though it might drown compilation speed. Also getting help
>>>>> from the optimizer to remove redundant calls will need some
>>>>> back doors.
>>>>
>>>> I don't see how this can be done without specific compiler
>>>> knowledge in a memory safe way.
>>>
>>> I currently don't see how it can be memory safe with this
>>> proposal.
>>
>> I'm a little confused about what you are referring to here.
>
> When preparing for dconf I read the "Garbage Collection Handbook" by
> Richard Jones, and it very much supported my suspicion that the usual
> reference counting cannot be both memory safe and high-performance.

I think with the rules in the proposal we have we can support it.

>
>>>
>>> You have to put the lock around the pair of AddRef and Release, but
>>> if the compiler already splits this into two function calls, this
>>> cannot be done in the implementation.
>>
>> Why is it necessary to put a lock around the pair?
>
> To be more accurate, it is the assignment and the Release that have to
> be atomic, in addition to a concurrent read with AddRef. Imagine the
> reading thread is suspended while just having read the pointer, but not
> incremented the reference count yet. If an assignment with release and deletion is performed before the thread resumes, AddRef is called on garbage.

I see. I'll have to think about that some more.

>
> IIRC you also have the GC handbook book on your shelf. Check the
> chapters on RC, especially algorithm 18.2 "Eager reference counting with
> CompareAndSwap is broken".

I have the book, but it is the first edition and there's no chapter 18 in it :-(


>
>
>>> Just to clarify, I meant taking a slice of a static array that is a
>>>  field of a refcounted class. Is it forbidden to have a field like
>>> this in a refcounted class or is taking the address through slicing
>>> forbidden?
>>
>> It would be forbidden to obtain a slice of a ref counted object in
>> this way - or even to simply refer to a static array embedded in a
>> ref counted object (in @safe code).
>
> Ok.
>
>>> Two more notes:
>>>
>>> - I'm not sure it is mentioned, but I think you have to describe
>>> what happens when copying a struct. pre- and post-blit actions have
>>> to be taken if the struct contain pointers to refcounted objects.
>>
>> Yes. It's analogous to copying a struct that has fields which contain
>>  constructors and destructors.
>
> Ok, I tend to forget about the swapping to a temporary when assigning structs.
>
>> The model I am basing this on is C++'s shared_ptr<>, which makes use
>> of all the various rules around construction, assignment, and
>> destruction. The wrinkles we have are:
>>
>> 1. memory safety
>
> This is the hard part.
>
>> 2. working with the GC
>
> I don't think that the GC is getting in the way as long as it is
> mark-and-sweep. A fully reference counting GC is a different story and can be made concurrent, but it is usually not eager and needs to defer actual reference counting to avoid locks. Instead it logs accesses to some thread local buffer which needs to be processed eventually.
>
>

I don't think we should do a fully ref counted GC anyway.