May 25, 2012
On Friday, 25 May 2012 at 11:23:40 UTC, Jacob Carlborg wrote:
> On 2012-05-25 11:02, foobar wrote:
>
>> It makes the call order deterministic like in C++.
>>
>> e.g.
>> class Foo {}
>>
>> struct A {
>> resource r;
>> ~this() { release(r); }
>> }
>>
>> struct B {
>> A* a;
>> Foo foo;
>> ~this() { delete a; } // [1]
>> }
>
> I thought we were talking about classes holding structs:
>
> class B {
>     A* a;
>     Foo foo;
>     ~this() { delete a; }
> }
>
> In this case you don't know when/if the destructor of B is called. It doesn't help to wrap it in a struct; you could just have put it directly in A. Is that correct?
>

No. see below.

>> Let's look at point [1]:
>> The "foo" instance is "managed" by the GC since the only resource it
>> holds is memory. The "a" member wraps some "non-managed" resource (e.g.
>> file descriptor) and in this model is still valid, thus allows me to
>> deterministically dispose of it as in c++.
>
> Ok, but if B is a class?
>
>> This can be simply checked at compile-time - you can only reference non
>> class instance members in the destructor, so adding a "delete foo;"
>> statement at point [1] simply won't compile.
>
> If you have a pointer to a struct you don't know how it was created. It's possible it's been created with "new", which means the garbage collector needs to delete it.

let's say we add two classes:
class FooA {
  A a;
}
class FooPA {
  A* pa;
}

For the first case, the class and the struct share the same lifetime, so when an instance of FooA is GC-ed, the GC would call A's d-tor and allow it to do whatever (self) cleaning it requires. This means the d-tor will always be called.

For the second case, the GC will only scan "pa" to find inner class instances but will *not* handle the struct value itself.
In order to clean what "pa" points to, you need to explicitly call the destructor yourself. One way to do this would be to register a callback with the GC to get notified when an instance of FooPA is collected and inside the callback function maintain a reference-counter.
This also means that if you allocate a struct value on the heap via "new", you are responsible for calling delete _yourself_; the GC will not call it for you.
I think that losing this small convenience is worth it - we gain more orthogonal semantics that are easier to reason about.
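
To make the FooA/FooPA distinction concrete, here is a minimal sketch. It uses explicit `destroy` calls (the later name for `clear`) so the order is deterministic instead of depending on when the GC runs. `releasedCount`, `handle` and `demo` are made-up names, and the integer handle is only a stand-in for a real resource such as a file descriptor. It shows explicit clean-up in D as it is, not the proposed GC changes themselves.

```d
import std.stdio : writeln;

// Counts releases so the effect of each destructor call is visible (hypothetical).
int releasedCount;

struct A
{
    int handle = -1; // -1 means "no resource held"

    ~this()
    {
        // Guarded, so running the destructor on an already-reset value is harmless.
        if (handle != -1)
        {
            ++releasedCount;
            handle = -1;
        }
    }
}

class FooA
{
    A a; // embedded by value: shares the lifetime of the FooA instance
}

class FooPA
{
    A* pa; // pointer: the pointee has its own lifetime
}

// Returns how many resources were released by the two explicit clean-ups.
int demo()
{
    releasedCount = 0;

    auto fa = new FooA;
    fa.a.handle = 3;
    destroy(fa); // finalizes fa, which also runs the destructor of the embedded A

    auto fpa = new FooPA;
    fpa.pa = new A;
    fpa.pa.handle = 4;
    destroy(*fpa.pa); // the pointee has to be destroyed explicitly

    return releasedCount;
}

void main()
{
    writeln(demo()); // prints 2
}
```

The embedded `A` needs no code in `FooA` at all, while the pointee in `FooPA` is only cleaned up at a known point because of the explicit call.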


May 25, 2012
On 5/25/12 12:07 AM, Mehrdad wrote:
> Now, there are two ways a FileStream can get destroyed:
>
> 1. Through a manual call to FileStream.Dispose(). In this case, all
> embedded objects (e.g. SafeFileHandle) are *guaranteed* to be valid, so
> we simply flush the file and call SafeFileHandle.Dispose() to dispose of
> the managed resources, and then dispose of all the unmanaged resources
> (which are primitive fields, guaranteed to be accessible). Furthermore,
> the object suppresses its own finalizer.
>
> 2. Through a garbage-collected call to ~FileStream(). In this case, the
> managed resources such as SafeFileHandle will be (or already are)
> destroyed SEPARATELY, and so we do _NOT_ access them. We ONLY dispose of
> the unmanaged resources, if any, and let the managed resources take care
> of themselves.

What happens in C# if an object A has a field referring to object B, and the object B in turn has a field referring to object A? That is:

import std.stdio;

class C { C another; ~this() { writeln(another.another); } }

void main() {
    auto a = new C;
    auto b = new C;
    a.another = b;
    b.another = a;
}

What happens then? Will the GC nullify references to destroyed objects, or will it put them in a zombie state?


Thanks,

Andrei
May 25, 2012
On 25-05-2012 16:35, Andrei Alexandrescu wrote:
> On 5/25/12 12:07 AM, Mehrdad wrote:
>> Now, there are two ways a FileStream can get destroyed:
>>
>> 1. Through a manual call to FileStream.Dispose(). In this case, all
>> embedded objects (e.g. SafeFileHandle) are *guaranteed* to be valid, so
>> we simply flush the file and call SafeFileHandle.Dispose() to dispose of
>> the managed resources, and then dispose of all the unmanaged resources
>> (which are primitive fields, guaranteed to be accessible). Furthermore,
>> the object suppresses its own finalizer.
>>
>> 2. Through a garbage-collected call to ~FileStream(). In this case, the
>> managed resources such as SafeFileHandle will be (or already are)
>> destroyed SEPARATELY, and so we do _NOT_ access them. We ONLY dispose of
>> the unmanaged resources, if any, and let the managed resources take care
>> of themselves.
>
> What happens in C# if an object A has a field referring to object
> B, and the object B in turn has a field referring to object A? That is:
>
> class C { C another; ~this() { writeln(another.another); } }
>
> void main() {
> auto a = new C;
> auto b = new C;
> a.another = b;
> b.another = a;
> }
>
> What happens then? Will the GC nullify references to destroyed objects,
> or will it put them in a zombie state?
>
>
> Thanks,
>
> Andrei

This is called resurrection: http://msdn.microsoft.com/en-us/magazine/bb985010.aspx (scroll down to Resurrection)

-- 
Alex Rønne Petersen
alex@lycus.org
http://lycus.org
May 25, 2012
On Friday, 25 May 2012 at 14:35:58 UTC, Andrei Alexandrescu wrote:
> What happens in C# if an object A has a field referring to object B, and the object B in turn has a field referring to object A? That is:
> What happens then? Will the GC nullify references to destroyed objects, or will it put them in a zombie state?

Depends.

If it's a _manual_ disposal, everything is fine -- neither is
GC'd yet.

If it's an _automatic_ disposal (a.k.a. garbage collection), then
the cross references must not be used. I believe their contents
are either undefined or null, but in either case, you don't worry
about disposing them because the objects will take care of
themselves.
May 25, 2012
On Friday, 25 May 2012 at 14:38:29 UTC, Alex Rønne Petersen wrote:
> This is called resurrection: http://msdn.microsoft.com/en-us/magazine/bb985010.aspx (scroll down to Resurrection)



Ah, yes, you're completely right; I missed this fact. Apparently under these conditions, you _can_ resurrect objects, but it's bad practice (and unnecessary) in most situations.


@Andrei: The reason this is allowed is that finalization is _separate_ from garbage collection in .NET. So an object can be finalized and yet still not GC'd. Or its finalizer might be suppressed, allowing it to get GC'd directly. This allows for many possibilities, although you don't usually need them.
May 25, 2012
On 25-05-2012 17:53, Mehrdad wrote:
> On Friday, 25 May 2012 at 14:38:29 UTC, Alex Rønne Petersen wrote:
>> This is called resurrection:
>> http://msdn.microsoft.com/en-us/magazine/bb985010.aspx (scroll down to
>> Resurrection)
>
>
>
> Ah, yes, you're completely right; I missed this fact. Apparently under
> these conditions, you _can_ resurrect objects, but it's bad practice
> (and unnecessary) in most situations.
>
>
> @Andrei: The reason this is allowed is that finalization is _separate_
> from garbage collection in .NET. So an object can be finalized and yet
> still not GC'd. Or its finalizer might be suppressed, allowing it to get
> GC'd directly. This allows for many possibilities, although you don't
> usually need them.

This is, in fact, how most GCs other than D's work. :)

-- 
Alex Rønne Petersen
alex@lycus.org
http://lycus.org
May 25, 2012
On Friday, May 25, 2012 17:53:45 Mehrdad wrote:
> @Andrei: The reason this is allowed is that finalization is _separate_ from garbage collection in .NET. So an object can be finalized and yet still not GC'd. Or its finalizer might be suppressed, allowing it to get GC'd directly. This allows for many possibilities, although you don't usually need them.

Finalization _can_ be separate from the GC in D thanks to clear, but it does normally occur as part of a collection cycle.

- Jonathan M Davis
May 25, 2012
On Friday, 25 May 2012 at 19:30:35 UTC, Jonathan M Davis wrote:
> On Friday, May 25, 2012 17:53:45 Mehrdad wrote:
>> @Andrei: The reason this is allowed is that finalization is
>> _separate_ from garbage collection in .NET. So an object can be
>> finalized and yet still not GC'd. Or its finalizer might be
>> suppressed, allowing it to get GC'd directly. This allows for
>> many possibilities, although you don't usually need them.
>
> Finalization _can_ be separate from the GC in D thanks to clear, but it does
> normally occur as part of a collection cycle.
>
> - Jonathan M Davis

Uhm... sure...

I wasn't really talking about D, so I'm not sure what you mean.

But, comparing to D:

I'm _not_ talking about the fact that you can call the finalizer
manually.
That has _nothing_ to do with the "separation" I was referring to
(even though it's nevertheless necessary for separating the GC
from the finalization queue).

I'm talking about the fact that, if an object has a finalizer,
its finalization stage is SEPARATE from (and obviously, before)
the GC stage. In other words, there can be TWO passes over all
objects that are going to be GC'd:

1. Unreachable finalizable objects are finalized, and their
finalizers are suppressed.
2. Unreachable objects with no finalizers are GC'd.


Therefore, resurrection is possible because after an object goes
through stage 1, it may no longer be eligible for stage 2 (it may
have strong references).

Note that this means merely _having_ a finalizer causes an object
to take 2 passes to be GC'd, instead of 1 -- even if the
finalizer is empty.

Also notice that NOTHING is left in an undefined state, and yet
everything is guaranteed to be reclaimed at some point. And
circular references cause no problems whatsoever, because if
they're unreachable from the GC root, no one cares if they have
references to each other.
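
A toy simulation of the two stages described above, written in D only to stay in this thread's language. It is not how the .NET collector is implemented: `Obj`, `collect` and the log strings are invented for illustration, and reachability is a plain flag instead of a real graph traversal.

```d
import std.stdio : writeln;

// Toy model of a heap object; every name here is hypothetical.
struct Obj
{
    string name;
    bool reachable;    // stands in for "reachable from a GC root"
    bool hasFinalizer;
    bool finalized;    // true once the finalizer has run (and is suppressed)
}

// Simulates one collection and returns the objects still occupying memory.
Obj[] collect(Obj[] heap, ref string[] log)
{
    Obj[] survivors;
    foreach (o; heap)
    {
        if (o.reachable)
        {
            survivors ~= o;
        }
        else if (o.hasFinalizer && !o.finalized)
        {
            // Stage 1: finalize now, reclaim in a later collection.
            log ~= "finalize " ~ o.name;
            o.finalized = true;
            survivors ~= o;
        }
        else
        {
            // Stage 2: nothing left to run, so the memory can go.
            log ~= "reclaim " ~ o.name;
        }
    }
    return survivors;
}

// Runs two collections over two unreachable objects and returns the event log.
string[] demo()
{
    string[] log;
    auto heap = [Obj("plain", false, false, false),
                 Obj("withFinalizer", false, true, false)];
    heap = collect(heap, log); // reclaims "plain", finalizes "withFinalizer"
    heap = collect(heap, log); // reclaims "withFinalizer" on the second pass
    return log;
}

void main()
{
    foreach (line; demo())
        writeln(line);
}
```

Resurrection would correspond to a finalizer making its object reachable again, i.e. setting `reachable = true` between the two calls, in which case the object survives the second pass.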


@Andrei: Does that make sense?
May 26, 2012
On 2012-05-25 14:05, foobar wrote:

>> If you have a pointer to a struct you don't know how it was created.
>> It's possible it's been created with "new", which means the garbage
>> collector needs to delete it.
>
> let's say we add two classes:
> class FooA {
> A a;
> }
> class FooPA {
> A* pa;
> }
>
> For the first case, both the class and the struct share the same
> lifetime thus when an instance of FooA is GC-ed, the GC would call A's
> d-tor and allow it to do what-ever (self) cleaning it requires. This
> means the d-tor will always be called.

Is that the case even if the destructor of FooA isn't called?

> For the second case, The GC will only scan "pa" to find inner class
> instances but will *not* handle the struct value itself.
> In order to clean what "pa" points to, you need to explicitly call the
> destructor yourself.

Are you saying that the GC won't collect a struct allocated with "new"?

http://dlang.org/expression.html#NewExpression

"NewExpressions are used to allocate memory on the garbage collected heap...". I thought that everything allocated via the GC was also collected by the GC.

> One way to do this would be to register a callback
> with the GC to get notified when an instance of FooPA is collected and
> inside the callback function maintain a reference-counter.
> This also means that if you allocate a struct value on the heap via
> "new" you are responsible to call delete _yourself_ and the gc will not
> call it for you.
> I think that losing this small convenience is worth it - we gain more
> orthogonal semantics that are easier to reason about.
>
>


-- 
/Jacob Carlborg
May 26, 2012
On Saturday, 26 May 2012 at 11:35:29 UTC, Jacob Carlborg wrote:
> On 2012-05-25 14:05, foobar wrote:
>
>>> If you have a pointer to a struct you don't know how it was created.
>>> It's possible it's been created with "new", which means the garbage
>>> collector needs to delete it.
>>
>> let's say we add two classes:
>> class FooA {
>> A a;
>> }
>> class FooPA {
>> A* pa;
>> }
>>
>> For the first case, both the class and the struct share the same
>> lifetime thus when an instance of FooA is GC-ed, the GC would call A's
>> d-tor and allow it to do what-ever (self) cleaning it requires. This
>> means the d-tor will always be called.
>
> Is that the case even if the destructor of FooA isn't called?

Huh? In my model FooA has no destructor.

>
>> For the second case, The GC will only scan "pa" to find inner class
>> instances but will *not* handle the struct value itself.
>> In order to clean what "pa" points to, you need to explicitly call the
>> destructor yourself.
>
> Are you saying that the GC won't collect a struct allocated with "new"?
>
> http://dlang.org/expression.html#NewExpression
>
> "NewExpressions are used to allocate memory on the garbage collected heap...". I thought that everything allocated via the GC was also collected by the GC.

I indeed propose that structs allocated with "new" be put in a region of the heap *not* managed by the GC. It's a tiny price to pay to get more orthogonal semantics which are easier to reason about.
Please note that this only affects code that directly uses pointers which is not common in D and is geared towards more advanced use cases where the programmer will likely want to manage the memory explicitly anyway.
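
For what it's worth, managing a struct outside the GC heap can already be done by hand. A minimal sketch, assuming `malloc`/`free` as the non-GC region and a made-up struct `A` whose integer `handle` stands in for a real resource; `create`, `dispose`, `demo` and `releasedCount` are hypothetical names, not an existing API.

```d
import core.stdc.stdlib : free, malloc;
import std.conv : emplace;
import std.stdio : writeln;

int releasedCount; // counts destructor-driven releases, for demonstration only

struct A
{
    int handle = -1; // -1 means "no resource held"

    ~this()
    {
        if (handle != -1)
        {
            ++releasedCount;
            handle = -1;
        }
    }
}

// Allocates an A outside the GC heap and initializes it in place.
A* create(int handle)
{
    auto mem = malloc(A.sizeof);
    assert(mem !is null, "out of memory");
    auto p = emplace(cast(A*) mem); // writes A.init into the raw memory
    p.handle = handle;
    return p;
}

// Runs the destructor, then returns the memory; the GC is never involved.
void dispose(A* p)
{
    destroy(*p);
    free(p);
}

// Returns the number of releases after one create/dispose round trip.
int demo()
{
    releasedCount = 0;
    auto p = create(4);
    dispose(p);
    return releasedCount;
}

void main()
{
    writeln(demo()); // prints 1
}
```

If `A` held references into the GC heap, the block would additionally have to be registered with `GC.addRange` from `core.memory` so that the collector scans it.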

>
>> One way to do this would be to register a callback
>> with the GC to get notified when an instance of FooPA is collected and
>> inside the callback function maintain a reference-counter.
>> This also means that if you allocate a struct value on the heap via
>> "new" you are responsible to call delete _yourself_ and the gc will not
>> call it for you.
>> I think that losing this small convenience is worth it - we gain more
>> orthogonal semantics that are easier to reason about.