April 18, 2014
On Fri, 18 Apr 2014 01:53:00 -0400, Walter Bright <newshound2@digitalmars.com> wrote:

> On 4/17/2014 6:58 PM, Michel Fortin wrote:
>> Auto-nulling weak references are perfectly memory-safe. In Objective-C you use
>> the __weak pointer modifier for that. If you don't want it to be auto-nulling,
>> use __unsafe_unretained instead to get a raw pointer. In general, seeing
>> __unsafe_unretained in the code is a red flag however. You'd better know what
>> you're doing.
>>
>> If you could transpose the concept to D, __weak would be allowed in @safe
>> functions while __unsafe_unretained would not. And thus memory-safety is preserved.
>
> I recall our email discussion about implementing ARC in D that we couldn't even avoid an inc/dec for the 'this' when calling member functions. So I don't see how inc/dec can be elided in sufficient numbers to make ARC performant and unbloated.

The important thing to recognize is that it's the *caller* that increments/decrements. This means you can elide calls to an object where you already have a guarantee of its reference count being high enough.

I looked up the example you referred to, it was this:

> class R : RefCounted
> {
>    int _x;
>    int readx() { return _x; }
> }
> int main()
> {
>    R r = new R;
>    return r.readx();
> }
> According to 12. there is no refcounting going on when calling or executing readx. Ok, now what happens here:
>
> class R : RefCounted
> {
>    int _x;
>    int readx(C c)
>    {
>        c.r = null; // "standard" rc deletes r here
>        return _x;  // reads garbage
>    }
> }
> class C
> {
>    R r;
> }
> int main()
> {
>    C c = new C;
>    c.r = new R;
>    return c.r.readx(c);
> }
>
> This reads garbage or crashes if there is no reference counting going on when calling readx.

So essentially, main needs to increment c.r's ref count. But not c's, because it already knows that it owns one of c's reference counts. R.readx does NOT need to increment its own reference count.

I think the distinction is important.

Also, consider that if R is a final class, the compiler can inline readx, and can then possibly defer the ref-count increment, or cache _x before setting c.r to null (probably the better option).
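A sketch of that second option, as hypothetical optimizer output (D has no ARC today, so this is only what an ARC-aware compiler could emit):

```d
// Hypothetical compiler-transformed readx: _x is cached before the
// store that may release the last reference to 'this'.
int readx(C c)
{
    int tmp = _x;  // read while 'this' is known to be alive
    c.r = null;    // may drop the last reference to 'this'
    return tmp;    // no inc/dec of 'this' was ever needed
}
```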

Opportunities for elision are not as hopeless as you make it sound. The compiler has a lot of information.

The rules should be:

1. When compiling a function, you can assume each parameter arrives with at least one reference-count increment that will not go away for the duration of the call.
2. When passing an object into a function, ensure #1 holds for that function.
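Applied to the earlier example, rule 2 makes main retain c.r across the call, while rule 1 lets readx itself do no counting at all. A hypothetical lowering (retain/release are stand-ins for the compiler-inserted operations; D has no ARC):

```d
int main()
{
    C c = new C;          // main owns one reference to c
    c.r = new R;
    R tmp = c.r;
    tmp.retain();         // rule 2: guarantee c.r outlives the call
    scope(exit) tmp.release();
    return tmp.readx(c);  // no retain of c itself: main already owns
                          // a count on it, satisfying rule 1
}
```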

Given D's type system of knowing when variables are shared (and if we implement thread-local destruction of unshared data), we have a lot more power even than Objective-C to make better decisions on ref counting.

> Of course, you can always *manually* elide these things, but then if you make a mistake, then you've got a leak and memory corruption.

Manual eliding should be reserved for extreme optimization cases. It's similar to cast. Apple considers it dangerous enough to statically disallow it for ARC code.

-Steve
April 18, 2014
On Friday, 18 April 2014 at 12:55:59 UTC, Steven Schveighoffer wrote:
> The important thing to recognize is that it's the *caller* that increments/decrements. This means you can elide calls to an object where you already have a guarantee of its reference count being high enough.

That won't help you if you iterate over an array, so you need a mutex on the array in order to prevent inc/dec for every single object you inspect.

inc/dec with a lock prefix could easily cost you 150-200 cycles.
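The per-element lowering being described would look something like this (a sketch; retain/release stand for the compiler-inserted atomic operations):

```d
foreach (obj; sharedArray)
{
    obj.retain();   // locked inc on a shared count: ~150-200 cycles
    inspect(obj);
    obj.release();  // locked dec, potentially freeing on zero
}
```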


Ola.
April 18, 2014
On Fri, 18 Apr 2014 10:00:21 -0400, Ola Fosheim Grøstad <ola.fosheim.grostad+dlang@gmail.com> wrote:

> On Friday, 18 April 2014 at 12:55:59 UTC, Steven Schveighoffer wrote:
>> The important thing to recognize is that it's the *caller* that increments/decrements. This means you can elide calls to an object where you already have a guarantee of its reference count being high enough.
>
> That won't help you if you iterate over an array, so you need a mutex on the array in order to prevent inc/dec for every single object you inspect.

If the array is shared, and the elements are references, yes. It's also possible that each object uses a reference to the array, in which case the array could be altered inside the method, requiring an inc/dec even for unshared arrays.

> inc/dec with a lock prefix could easily cost you 150-200 cycles.

And an inc/dec may not necessarily need a lock if the array element is not shared, even if you inc/dec the ref count.

D offers opportunities to go beyond traditional ref count eliding.
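For instance, the count operations could be overloaded on the shared qualifier, so only genuinely shared objects pay for the lock prefix (a sketch, not an existing runtime API; RefCount and retain are hypothetical names):

```d
import core.atomic;

struct RefCount
{
    int count;

    // unshared object: a plain increment is enough
    void retain() { ++count; }

    // shared object: needs the atomic (locked) increment
    void retain() shared { atomicOp!"+="(count, 1); }
}
```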

But even still, 150-200 extra cycles here and there is not as bad as a 300ms pause to collect garbage for some apps.

I think nobody is arguing that ref counting is a magic bullet for memory management. It fits some applications better than GC, that's all.

-Steve
April 18, 2014
On Thu, 17 Apr 2014 11:55:14 +0000, w0rp wrote:

> I'm not convinced that any automatic memory management scheme will buy much with real time applications. Generally with real-time processes, you need to pre-allocate. I think GC could be feasible for a real-time application if the GC is precise and collections are scheduled, instead of run randomly. Scoped memory also helps.

I thought the current GC only ran on allocations? If so, @nogc is *very* useful for enforcing critical paths. If we added a @nogcscan for blocks that do not contain pointers, we may be able to reduce the collection time; not as good as a precise collector, but I would think we can get decent compiler support for this (i.e. no refs, pointers, classes, dynamic arrays).
April 18, 2014
On 4/17/14, 10:09 AM, Walter Bright wrote:
> On 4/17/2014 2:32 AM, Paulo Pinto wrote:
>> Similar approach was taken by Microsoft with their C++/CX and COM
>> integration.
>>
>> So any pure GC basher now uses Apple's example, with a high
>> probability of not
>> knowing the technical issues why it came to be like that.
>
> I also wish to reiterate that GC's use of COM with ref counting contains
> many, many escapes where the user "knows" that he can just use a pointer
> directly without dealing with the ref count. This is critical to making
> ref counting perform.
>
> But the escapes come with a huge risk for memory corruption, i.e. user
> mistakes.
>
> Also, in C++ COM, relatively few of the data structures a C++ program
> uses will be in COM. But ARC would mean using ref counting for EVERYTHING.

As a COM programmer a long time ago, I concur.

> Using ARC for *everything* means slow and bloat, unless Manu's
> assumption that a sufficiently smart compiler could eliminate nearly all
> of that bloat is possible.
>
> Which I am not nearly as confident of.

Well there's been work on that. I mentioned this recent paper in this group: http://goo.gl/tavC1M, which claims RC backed by a cycle collector can reach parity with tracing. Worth a close read.


Andrei

April 18, 2014
On Friday, 18 April 2014 at 10:06:35 UTC, Manu via Digitalmars-d wrote:
> D pointers are thread-local by default, you need to mark things 'shared'
> explicitly if they are to be passed between threads. This is one of the
> great advantages D has over C/C++/Obj-C.

There's nothing special about pointers in D. You can pass them between threads however you want. The type system has some constraints that you can *choose* to use/abuse/obey/disobey, but they definitely aren't in thread-local storage unless they are global or static variables.
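The type system does enforce this at some boundaries, though. std.concurrency's send, for example, rejects values with unshared mutable aliasing unless you cast:

```d
import std.concurrency;

void main()
{
    Tid worker = spawn({ /* receive and process messages... */ });

    int* p = new int;
    *p = 42;
    // send(worker, p);          // compile error: p aliases mutable
                                 // thread-local data
    send(worker, cast(shared) p); // only an explicit cast gets it across
}
```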
April 18, 2014
On Friday, 18 April 2014 at 14:15:00 UTC, Steven Schveighoffer wrote:
> And an inc/dec may not necessarily need a lock if the array element is not shared, even if you inc/dec the ref count.
>
> D offers opportunities to go beyond traditional ref count eliding.

In most situations where you need speed, you do need to share data, so that you can keep 8 threads busy without trashing the caches or making the memory bus a bottleneck.

Then you somehow have to tell the compiler what the mutex covers if ARC is going to be transparent… E.g. "this mutex covers all strings reachable from pointer P". So you need a meta-level language…

> But even still, 150-200 extra cycles here and there is not as bad as a 300ms pause to collect garbage for some apps.

I don't know. I think one unified management strategy will not work in most real time apps. I think C++ got that right.

I also think you need both meta level reasoning (program verification constructs) and whole program analysis to get a performant solution with automatic management.

> I think nobody is arguing that Ref counting is a magic bullet to memory management. It fits some applications better than GC, that's all.

As an addition to other management techniques, yes.
April 18, 2014
On Friday, 18 April 2014 at 14:45:37 UTC, Byron wrote:
> On Thu, 17 Apr 2014 11:55:14 +0000, w0rp wrote:
>
>> I'm not convinced that any automatic memory management scheme will buy
>> much with real time applications. Generally with real-time processes,
>> you need to pre-allocate. I think GC could be feasible for a real-time
>> application if the GC is precise and collections are scheduled, instead
>> of run randomly. Scoped memory also helps.
>
> I thought the current GC only ran on allocations? If so, @nogc is *very*
> useful for enforcing critical paths. If we added a @nogcscan for blocks
> that do not contain pointers, we may be able to reduce the collection
> time; not as good as a precise collector, but I would think we can get
> decent compiler support for this (i.e. no refs, pointers, classes,
> dynamic arrays).

You can actually prevent scanning/collection already without much difficulty:

import core.memory : GC;

GC.disable();
scope(exit) GC.enable();

I feel like @nogc is most useful in avoiding surprises by declaring your assumptions. Problems like how toUpperInPlace would still allocate (with gusto) could much more easily be recognized and fixed with @nogc available.
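Once @nogc is available, those surprise allocations become compile errors instead of profiler discoveries. A minimal sketch:

```d
// Compiles: no GC allocation anywhere in the body.
@nogc int sum(const(int)[] a)
{
    int s;
    foreach (x; a)
        s += x;
    return s;
}

@nogc void oops(int[] a)
{
    // a ~= 1;  // Error: appending may allocate, disallowed in @nogc
}
```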
April 18, 2014
On Fri, 18 Apr 2014 16:17:10 +0000, Brad Anderson wrote:
> 
> You can actually prevent scanning/collection already without much difficulty:
> 
> GC.disable();
> scope(exit) GC.enable();
> 
> I feel like @nogc is most useful in avoiding surprises by declaring your assumptions. Problems like how toUpperInPlace would still allocate (with gusto) could much more easily be recognized and fixed with @nogc available.


I am talking more about hinting to the conservative GC about blocks it doesn't need to scan for addresses.

struct Vertex { int x, y, z, w; }

@nogcscan
Vertex[10_000] vertices;

so when a GC scan does happen, it can skip scanning the vertices memory block completely, since we are promising not to hold on to addresses in it.
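Worth noting: the runtime can already do part of this for heap data. GC blocks carry a NO_SCAN attribute, and `new` sets it automatically for types whose TypeInfo reports no pointers, so a GC-allocated array of such a Vertex is already skipped during scanning. For manual allocations you can request it explicitly (real core.memory API; the variable names are just illustrative):

```d
import core.memory;

struct Vertex { int x, y, z, w; }

// Ask the GC for a block it will never scan for pointers:
Vertex* verts = cast(Vertex*)
    GC.malloc(Vertex.sizeof * 10_000, GC.BlkAttr.NO_SCAN);
```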
April 18, 2014
On 4/17/2014 6:43 AM, Manu via Digitalmars-d wrote:
> Well it's still not clear to me what all the challenges are... that's my point.

http://forum.dlang.org/thread/l34lei$255v$1@digitalmars.com