What's the use case of clear?
August 10, 2010
I read the recent discussion on the subject (TDPL: Manual invocation of
destructor) but I still don't understand what is the use case clear() wants to
cover?

Assume for a moment that we have this class:

class Connection {
	this() { connect(); }
	~this() { if (connected) disconnect(); }
	void disconnect() { ... }
	...
}

Now I see the following use cases:

// The connection is closed at some later point if at all.
var c1 = new Connection;


// I want the connection closed at the end of the function.
// Should be safe everytime.
void foo() {
	var c2 = new Connection;
	scope(exit) c2.disconnect();
	...
}


// I want the connection closed and memory released.
// (I know what I'm doing)
void bar() {
	var c2 = new Connection;
	scope(exit) {
		c2.disconnect();	// Even better, call the constructor.
		GC.free(c2);
	}
	...
}

What's left for clear()? In which scenario would I like the default
constructor called again?

Thanks
August 10, 2010
Sorry, that should have been:

c2.disconnect(); // Even better, call the *des*tructor.
August 10, 2010
On Tue, 10 Aug 2010 10:49:16 -0400, django <django@thisinotmyrealemail.com> wrote:

> I read the recent discussion on the subject (TDPL: Manual invocation of
> destructor) but I still don't understand what is the use case clear() wants to
> cover?
>
> Assume for a moment that we have this class:
>
> class Connection {
> 	this() { connect(); }
> 	~this() { if (connected) disconnect(); }
> 	void disconnect() { ... }
> 	...
> }
>
> Now I see the following use cases:
>
> // The connection is closed at some later point if at all.
> var c1 = new Connection;
>
>
> // I want the connection closed at the end of the function.
> // Should be safe everytime.
> void foo() {
> 	var c2 = new Connection;
> 	scope(exit) c2.disconnect();
> 	...
> }
>
>
> // I want the connection closed and memory released.
> // (I know what I'm doing)
> void bar() {
> 	var c2 = new Connection;
> 	scope(exit) {
> 		c2.disconnect();	// Even better, call the constructor.
> 		GC.free(c2);
> 	}
> 	...
> }
>
> What's left for clear()? In which scenario would I like the default
> constructor called again?

None.  Why would you call the constructor again?  I think that's the point of the discussion.

But we should clarify a couple things.  First, destructors are *only* valid for releasing resources not allocated by the GC.  For example, in your Connection object, if you used a Phobos object to make the connection, and allocated that object via new, then you are *not* guaranteed that the GC hasn't already collected it by the time your destructor runs.  So it is not valid to use it in the destructor.

But if your connection is implemented via something like OS calls/handles, then you must release in the destructor.

So the idea is, in order for this object to be properly reclaimed during the GC collection cycle, it must release the resource in the destructor.  If the validity of the object is directly dependent on the resource, then what clear allows you to do is to have a standard way of reclaiming these resources without having to define a separate function.  It also takes the place of delete, which is a very unsafe way of doing the same thing.
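
To make that concrete, here is a minimal sketch of a class whose validity depends on a non-GC resource. The names NativeBuffer, liveHandles and useBuffer are hypothetical, and a malloc'd block stands in for an OS handle. In current D releases clear() is spelled destroy(), so the sketch uses that name.

```d
import core.stdc.stdlib : free, malloc;

// Hypothetical stand-in for an OS-level handle: a block of C-heap memory
// that the GC knows nothing about.
final class NativeBuffer
{
    private void* handle;
    static int liveHandles; // handles not yet released (illustration only)

    this(size_t size)
    {
        handle = malloc(size);
        if (handle is null)
            throw new Exception("malloc failed");
        ++liveHandles;
    }

    ~this()
    {
        // Only non-GC state is touched here, so this is safe whether the
        // destructor runs via destroy() or during a GC collection.
        if (handle !is null)
        {
            free(handle);
            handle = null;
            --liveHandles;
        }
    }
}

// Releases the native resource deterministically at scope exit.
int useBuffer()
{
    auto buf = new NativeBuffer(64);
    scope (exit) destroy(buf); // spelled clear(buf) in the 2010-era runtime
    return NativeBuffer.liveHandles; // evaluated before scope(exit) runs
}

unittest
{
    assert(useBuffer() == 1);
    assert(NativeBuffer.liveHandles == 0);
}
```

The caller never needs to know about a separate release function; the destructor is the only cleanup hook the class has to provide.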

The case which clear doesn't handle very well is releasing of GC resources.

For example, this class would be invalid:

class X
{
}

class C
{
   X x;
   this() { x = new X;}
   ~this() { clear(x);}
}

because the GC may have already deallocated x before deallocating the parent class.

One way to fix this is to have the storage for x be in the same memory block as C.  But this concept has not been implemented (I think it's a planned feature).  Then you can be sure that the memory for x is stored inside the C instance.  You can also be sure that the GC won't destroy the C if there are still references to its X, but not any references to the C itself.
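
A related technique that works today is a different one from the class-in-class storage described above: make the member a struct, since struct fields are stored inline in the enclosing class instance and their destructors run as part of the class's destruction. The sketch below assumes that approach; NativeBlock and Owner are hypothetical names.

```d
import core.stdc.stdlib : free, malloc;

// Hypothetical value type that owns a C-heap allocation.
struct NativeBlock
{
    private void* ptr;
    static int live; // blocks not yet freed (illustration only)

    this(size_t size)
    {
        ptr = malloc(size);
        if (ptr !is null)
            ++live;
    }

    @disable this(this); // no copies, so exactly one owner frees the block

    ~this()
    {
        if (ptr !is null)
        {
            free(ptr);
            ptr = null;
            --live;
        }
    }
}

final class Owner
{
    // Stored inline in the Owner instance, not in a separate GC block,
    // so it cannot be collected before the Owner itself.
    NativeBlock block;

    this() { block = NativeBlock(32); }
}

unittest
{
    auto o = new Owner;
    assert(NativeBlock.live == 1);
    destroy(o); // runs the field's destructor as part of Owner's teardown
    assert(NativeBlock.live == 0);
}
```

This only helps when the member can be a value type; a reference to another GC-allocated class instance is still off limits in the destructor.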

-Steve
August 10, 2010
django wrote:

> I read the recent discussion on the subject (TDPL: Manual invocation of
> destructor) but I still don't understand what is the use case clear() wants to
> cover?
> 
> Assume for a moment that we have this class:
> 
> class Connection {
> this() { connect(); }
> ~this() { if (connected) disconnect(); }
> void disconnect() { ... }
> ...
> }
> 
> Now I see the following use cases:
> 
> // The connection is closed at some later point if at all.
> var c1 = new Connection;
> 
> 
> // I want the connection closed at the end of the function.
> // Should be safe everytime.
> void foo() {
> var c2 = new Connection;
> scope(exit) c2.disconnect();
> ...
> }
> 
> 
> // I want the connection closed and memory released.
> // (I know what I'm doing)
> void bar() {
> var c2 = new Connection;
> scope(exit) {
> c2.disconnect();	// Even better, call the constructor.
> GC.free(c2);
> }
> ...
> }
> 
> What's left for clear()? In which scenario would I like the default
> constructor called again?
> 
> Thanks

Honestly, I think clear() does too much, but this is what I think was intended:

class Connection {
   invariant() { assert( pool !is null); }

   ConnectionPool pool = null;
   bool isOpen = false;

   this() {
      pool = ConnectionPool.get();
      enforce(pool !is null);
   }

   this(string connectionString) {
      this();
      connect(pool, connectionString);
      isOpen = true;
   }

   ~this() {
      disconnect(pool, connectionString);
   }
}

/* use case: although I do not really care the connection should at least be
closed when this object is collected
*/
auto c = new Connection(connectionString);

/* Now I want this too:
1 connection closed at the end of scope
2 isOpen will be reset to false
3 the pool Singleton reference must not be null

(2-3 are important when there may be other references to the connection that
attempt to use it)
*/
scope(exit)
   clear(c);

/* I also want this scheme to be standardized, so that derived classes can more
easily hook up with the disposal scheme and client code does not have to look up
the documentation, check what the actual polymorphic type is, etc.
*/
class MyConnection : Connection { ... }

clear(myConnection); // also does its magic for Connection

clear() wants to solve all of these use cases by itself in a way that the class implementor only has to implement a destructor and the user only needs to call clear().

But imho it conflicts with some (more) important ones. Like in your example, you do not expect to actually connect to the database when you clear() the connection; it would be a huge wtf! You also cannot rely on a destructor being called only once, nor can you check whether an object has already been destructed (without hacks). It is the whole point of clear() that you cannot do so, I think.
August 10, 2010
On Tuesday, August 10, 2010 08:09:13 Steven Schveighoffer wrote:
> But we should clarify a couple things.  First, destructors are *only* valid for releasing resources not allocated by the GC.  For example, in your Connection object, if you used a phobos object to make the connection, and allocated that connection via new, then you are *not* guaranteed that the GC hasn't collected that object already.  So it is not valid in the destructor.

Hmm. I don't recall ever reading that anywhere. I would not have expected anything that was referenced by an object to be garbage collected until the destructor had been called on that object. Granted, the need for destructors on classes is minimal, but that seems like the kind of thing that should be in big red lettering somewhere. You just know that there are going to be plenty of programmers out there who are going to reference newed data in the destructor. It's a very natural thing to do for cleanup if you have object references as member data.

- Jonathan M Davis
August 10, 2010
On Tue, 10 Aug 2010 13:16:31 -0400, Jonathan M Davis <jmdavisprog@gmail.com> wrote:

> On Tuesday, August 10, 2010 08:09:13 Steven Schveighoffer wrote:
>> But we should clarify a couple things.  First, destructors are *only*
>> valid for releasing resources not allocated by the GC.  For example, in
>> your Connection object, if you used a phobos object to make the
>> connection, and allocated that connection via new, then you are *not*
>> guaranteed that the GC hasn't collected that object already.  So it is not
>> valid in the destructor.
>
> Hmm. I don't recall ever reading that anywhere. I would not have expected
> anything that was referenced by an object to be garbage collected until the
> destructor had been called on that object. Granted, the need for destructors on
> classes is minimal, but that seems like the kind of thing that should be in big
> red lettering somewhere. You just know that there are going to be plenty of
> programmers out there who are going to referencing newed data in the destructor.
> It's a very natural thing to do for cleanup if you have object references as
> member data.

Yes, it is a common source of confusion to new programmers.

From this page: http://digitalmars.com/d/2.0/class.html#destructors

"The garbage collector is not guaranteed to run the destructor for all unreferenced objects. Furthermore, the order in which the garbage collector calls destructors for unreference objects is not specified. This means that when the garbage collector calls a destructor for an object of a class that has members that are references to garbage collected objects, those references may no longer be valid. This means that destructors cannot reference sub objects. This rule does not apply to auto objects or objects deleted with the DeleteExpression, as the destructor is not being run by the garbage collector, meaning all references are valid."

The second part, which says the references are still valid when you delete the object manually, is meaningless in practice.  You only get one destructor, and it's not notified whether it's the GC calling it or delete/clear calling it.  So if you write your destructor to cater to manual deletion, the program may crash if the GC destroys it.

Rule #1, destructors are only for non-GC allocated resources.  With GC allocated resources, you have to assume they are invalid in the destructor.  Always.
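
One way to live with that rule, sketched here as an assumption rather than an established idiom from this thread, is to keep all work on GC-allocated members in an explicit method that user code calls while every reference is still valid, and to leave the destructor out entirely when there is nothing non-GC to release. Journal, Session and runSession are hypothetical names.

```d
// Hypothetical GC-allocated helper that must be flushed before shutdown.
final class Journal
{
    bool flushed;
    void flush() { flushed = true; }
}

final class Session
{
    private Journal journal; // GC-allocated: off limits inside ~this()
    private bool closed;

    this() { journal = new Journal; }

    // Explicit teardown, called by user code while every reference is
    // still guaranteed to be valid. Safe to call more than once.
    void close()
    {
        if (closed)
            return;
        journal.flush();
        closed = true;
    }

    bool isClosed() const { return closed; }

    // Intentionally no ~this(): there is nothing non-GC to release, and
    // touching journal during a collection would be unsafe.
}

// Uses a session and closes it on every exit path.
bool runSession()
{
    auto s = new Session;
    scope (exit) s.close();
    return s.isClosed(); // evaluated before scope(exit) runs
}

unittest
{
    assert(!runSession());
    auto s = new Session;
    s.close();
    s.close(); // second call is a no-op
    assert(s.isClosed());
}
```

The cost is the one discussed above: the teardown is no longer standardized, so callers have to know that close() exists.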

-Steve
August 10, 2010
On Tue, 10 Aug 2010 13:01:37 -0400, Lutger <lutger.blijdestijn@gmail.com> wrote:

> Honestly I think that clear() as it does too much, but this is what I think was
> intended:
>
> Class Connection {
>    invariant() { assert( pool !is null); }
>
>    ConnectionPool pool = null;
>    bool isOpen = false;
>
>    this() {
>       pool = ConnectionPool.get();
>       enforce(pool !is null);
>    }
>   this(string connectionString) {
>       this();	
>       connect(pool, connectionString);
>       isOpen = true;
>    }
>
>    ~this() {
>       disconnect(pool, connectionString);
>    }
> }

No, this is bad.  ~this cannot access any GC resources.  Although you forgot to store it, connectionString may be a GC-allocated resource, so it may be invalid by the time the destructor is called.  The only way to fix this is to malloc the string:

class Connection {
   private string connectionString;

   ...

   this(string connectionString) {
       this();
       connect(pool, connectionString); // note, we use the GC'd version in case connect stores it.
       // copy the contents into malloc'd memory (malloc/free come from core.stdc.stdlib)
       auto buf = cast(char*)malloc(connectionString.length);
       buf[0..connectionString.length] = connectionString[];
       this.connectionString = cast(string)buf[0..connectionString.length];
   }

   ...

   ~this() {
      // since I malloc'd connectionString, I know the GC didn't collect it.
      disconnect(pool, connectionString);
      free(cast(void*)connectionString.ptr); // clean up my non-GC resources
   }
}

In fact, it might even be illegal to use pool.  If at the end of the program, the pool is destroyed, and then your object gets destroyed, you are screwed.

It is one of the severe limitations of the GC.  I wish there was a way to mark a piece of memory as "owned", that would be nice to have.  But even then, you need to own all the data you deal with in the destructor.  You can't rely on simple pointers to objects you don't own, they may be invalid.
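
For anyone who wants to try the "own everything you touch in the destructor" approach, here is a self-contained sketch. It is only an illustration: dupToCHeap and lastLength are hypothetical names, this Connection is a stripped-down stand-in for the one above, and there is no real connect/disconnect.

```d
import core.stdc.stdlib : free, malloc;

// Copies text into C-heap memory that the GC neither scans nor frees.
// Returns null for empty input.
char[] dupToCHeap(const(char)[] text)
{
    if (text.length == 0)
        return null;
    auto p = cast(char*) malloc(text.length);
    if (p is null)
        throw new Exception("malloc failed");
    p[0 .. text.length] = text[]; // malloc alone leaves the bytes uninitialized
    return p[0 .. text.length];
}

// Hypothetical connection that keeps its own non-GC copy of the string.
final class Connection
{
    private char[] connectionString;
    static size_t lastLength; // what ~this() saw (illustration only)

    this(const(char)[] connectionString)
    {
        this.connectionString = dupToCHeap(connectionString);
    }

    ~this()
    {
        // The copy was malloc'd, so it is still valid during a GC collection.
        lastLength = connectionString.length;
        free(connectionString.ptr);
        connectionString = null;
    }
}

unittest
{
    auto c = new Connection("host=demo");
    destroy(c);
    assert(Connection.lastLength == 9);
}
```

Note that the destructor avoids allocating from the GC as well; it only reads the owned copy and frees it.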

-Steve
August 10, 2010
Steven Schveighoffer wrote:

> On Tue, 10 Aug 2010 13:01:37 -0400, Lutger <lutger.blijdestijn@gmail.com> wrote:
> 
>> Honestly I think that clear() as it does too much, but this is what I
>> think was
>> intended:
>>
>> Class Connection {
>>    invariant() { assert( pool !is null); }
>>
>>    ConnectionPool pool = null;
>>    bool isOpen = false;
>>
>>    this() {
>>       pool = ConnectionPool.get();
>>       enforce(pool !is null);
>>    }
>>   this(string connectionString) {
>>       this();
>>       connect(pool, connectionString);
>>       isOpen = true;
>>    }
>>
>>    ~this() {
>>       disconnect(pool, connectionString);
>>    }
>> }
> 
> No, this is bad.  ~this cannot access any GC resources.  Although you forgot to store it, connectionString may be a GC-allocated resource, so it may be invalid by the time the destructor is called.  The only way to fix this is to malloc the string:

You are right, I ignored that issue. This example is broken.

> Class Connection {
>     private string connectionString;
> 
>     ...
> 
>     this(string connectionString) {
>         this();
>         connect(pool, connectionString); // note, we use the GC'd version in case connect stores it.
>         this.connectionString = (cast(immutable(char)*)malloc(connectionString.length))[0..connectionString.length];
>     }
> 
>     ...
> 
>     ~this() {
>        // since I malloc'd connectionString, I know the GC didn't collect it.
>        disconnect(pool, connectionString);
>        free(connectionString.ptr); // clean up my non-GC resources
>     }
> }
> 
> In fact, it might even be illegal to use pool.  If at the end of the program, the pool is destroyed, and then your object gets destroyed, you are screwed.
> 
> It is one of the severe limitations of the GC.  I wish there was a way to mark a piece of memory as "owned", that would be nice to have.  But even then, you need to own all the data you deal with in the destructor.  You can't rely on simple pointers to objects you don't own, they may be invalid.
> 
> -Steve

Again you are right, this only works for non-GC-owned resources. Frankly, at this point I wonder how to even begin to use class destructors for anything interesting, especially in the wake of clear(). Did you ever write one?

The spec says that this restriction does not apply when the user calls delete, since the GC isn't collecting then (this also applies to clear), but I don't think there is any way for the destructor to detect that.
August 10, 2010
On Tue, 10 Aug 2010 14:22:23 -0400, Lutger <lutger.blijdestijn@gmail.com> wrote:

> Again you are right, this only works for non-gc owned resources. Frankly at this
> point I wonder how to even begin use class destructors for anything interesting,
> especially in the wake of clear(). Did you ever write one?

No, I usually avoid them, simply because they are not much use :)  It's inevitably the place we all end up after thinking about the rules.

> The spec says that this restriction does not apply when the users calls delete
> since it isn't collecting then (also applies to clear), but I don't think there
> is any way to detect that.

Yeah, I posted an idea that maybe can fix that.