July 10, 2012
On Tuesday, 10 July 2012 at 02:43:05 UTC, Era Scarecrow wrote:
> On Tuesday, 10 July 2012 at 01:41:29 UTC, bearophile wrote:
>> David Piepgrass:
>>> This use case is pretty complex, so if I port this to D, I'd probably just cast away const/immutable where necessary.
>>
>> You are not the first person that says similar things. So D docs need to stress more that casting away const/immutable in D is rather more dangerous than doing the same thing in C++.
...
>  Let's say a class/struct is a book with page protectors signifying 'const(ant)'. You promise to return the book to the library without making any changes; although you promised you wouldn't make changes, you still take the page protectors off and make notes on the outer edges or make adjustments in the text, then return the book.
>
>  Is this wise? This isn't C++. If something shouldn't change, then don't change it god damn it. If it needs to change it isn't const(ant) and shouldn't suggest it is.

The difficulty, in case you missed it, is that somebody else (the Object class) says that certain functions are const, but in certain cases we really, really want to mutate something, either for efficiency or because "that's just how the data structure works". If a data structure needs to mutate itself when read, yeah, maybe its functions should not be marked const, but quite often the "const" is inherited from Object or some interface that (quite reasonably, it would seem) expects functions that /read stuff/ to be const.

And yet we can't drop const from Object or such interfaces, because there is other code elsewhere that /needs/ const to be there.

So far I have no solution to the dilemma in mind, btw. But the idea someone had of providing two (otherwise identical) functions, one const and one non-const, feels like a kludge to me, and note that anybody with an object would expect to be able to call the const version on any Object.

> Seriously, it's not that hard a concept. I guess if something doesn't port well from C++ then redesign it. Some things done in C++ are hacks due to the language's limitations and faults.

I was referring to a potential port from C#, which has no const. My particular data structure (a complex beast) contains a mutable tree of arbitrary size, which the user can convert to a conceptually immutable tree in O(1) time by calling Clone(). This marks a flag in the root node that says "read-only! do not change" and shares the root between the clones. At this point it should be safe to cast the clone to immutable. However, the original, mutable-typed version still exists. As the user requests changes to the mutable copy in the future, parts of the tree are duplicated to avoid changing the immutable nodes, with one exception: the read-only flag in various parts of the original, immutable tree will gradually be set to true.

In this case, I don't think the D type system could do anything to help ensure that I don't modify the original tree that is supposed to be immutable. Since the static type of internal references must either be all mutable or all immutable, they will be typed mutable in the mutable copy, and immutable in the immutable copy, even though the two copies are sharing the same memory.

And one flag, the read-only flag, must be mutable in this data structure, at least the transition from false->true must happen *after* the immutable copy is created; otherwise, Clone() would have to run in O(N) time, to mark every node read-only. This fact, however, does not affect the immutable copy in any way.
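
A rough sketch of the scheme, written in D with everything typed mutable (no cast to immutable), might look like the following. The names `Node`, `Tree`, `frozen`, `thaw` and `setLeftmost` are invented for illustration, and the real structure is far more complex; the only point is to show where the read-only flag gets written into nodes that the clone already shares.

```d
// Hypothetical sketch: all names are illustrative, not from the real library.
final class Node
{
    int value;
    Node left, right;
    bool frozen; // "read-only! do not change"

    this(int value, Node left = null, Node right = null)
    {
        this.value = value;
        this.left = left;
        this.right = right;
    }
}

struct Tree
{
    Node root;

    // O(1): mark only the root read-only and share it with the clone.
    Tree clone()
    {
        if (root !is null)
            root.frozen = true;
        return Tree(root);
    }

    // Change the leftmost value, duplicating any read-only node on the way.
    void setLeftmost(int v)
    {
        if (root is null)
            return;
        root = thaw(root);
        Node n = root;
        while (n.left !is null)
        {
            n.left = thaw(n.left);
            n = n.left;
        }
        n.value = v;
    }

    // Return a node that is safe to modify.
    private static Node thaw(Node n)
    {
        if (!n.frozen)
            return n;
        // The one mutation of shared nodes: the flag spreads false -> true
        // to the children just before the parent is duplicated.
        if (n.left !is null)
            n.left.frozen = true;
        if (n.right !is null)
            n.right.frozen = true;
        return new Node(n.value, n.left, n.right);
    }
}

void main()
{
    auto original = Tree(new Node(2, new Node(1), new Node(3)));
    Tree snapshot = original.clone();

    original.setLeftmost(10);

    assert(snapshot.root.left.value == 1);  // the snapshot is unaffected
    assert(original.root.left.value == 10);
    assert(snapshot.root.right is original.root.right); // untouched parts stay shared
    assert(snapshot.root.right.frozen);     // but their flag has been set
}
```

If `snapshot` were typed immutable, the two writes to `frozen` inside `thaw` would be exactly the writes the type system cannot permit.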
July 10, 2012
On Tue, Jul 10, 2012 at 05:39:45PM +0200, David Piepgrass wrote: [...]
> The difficulty, in case you missed it, is that somebody else (the Object class) says that certain functions are const, but in certain cases we really, really want to mutate something, either for efficiency or because "that's just how the data structure works". If a data structure needs to mutate itself when read, yeah, maybe its functions should not be marked const, but quite often the "const" is inherited from Object or some interface that (quite reasonably, it would seem) expects functions that /read stuff/ to be const.
> 
> And yet we can't drop const from Object or such interfaces, because there is other code elsewhere that /needs/ const to be there.
> 
> So far I have no solution to the dilemma in mind, btw. But the idea someone had of providing two (otherwise identical) functions, one const and one non-const, feels like a kludge to me, and note that anybody with an object would expect to be able to call the const version on any Object.

I think the trouble comes from conflating logical const with actual, bitwise, memory-representation const. Logical const is what C++ provides (or tries to, anyway). It's a way of saying that the object will not _visibly_ change, but says nothing about its underlying representation. So the object may be changing state internally all the time, but to the outside world it looks like it's not changing.

D's const, however, is a _physical_ const. It's saying that the underlying representation of the object will not change, and therefore it will not visibly change either. This is a much stronger guarantee, but unfortunately, that also narrows its scope. It cannot handle the case where the object needs to change internally while still retaining the outward appearance of non-change.

Conceptually, there shouldn't be any problem: if your object is one of those that changes internally but not visibly, then in D's viewpoint it's the same as a mutable object (which it is, physically speaking).

However, the trouble comes when parts of the D runtime need certain guarantees, for example, a built-in hash function may expect that taking the hash of an object shouldn't change its state. Logically speaking, it's OK for the object's toHash method to change it (say, by caching a value that takes a long time to compute). But the D runtime wants to give _guarantees_ that nothing unexpected will happen. And so it requires toHash to be const. That way, even a rogue object method will not be able to change it (not without breaking the type system, anyway), and the runtime will be able to give strong guarantees that yes, literally _nothing_ will cause unexpected mutation to the object when you call its toHash method. But requiring toHash to be const means that you cannot cache the results of an expensive computation, and so certain things that would work with a logical const system don't work in D.
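
To make that concrete, here is a minimal sketch. The struct `Expensive` and its members are made up for illustration and are not anything in druntime; the loop is just a stand-in for a computation that takes a long time.

```d
struct Expensive
{
    string text;
    size_t cache;  // hypothetical cache field
    bool cached;

    // Stand-in for an expensive hash; const, so it may only read.
    size_t hash() const
    {
        // Writing the cache from a const method is rejected at compile time.
        static assert(!__traits(compiles, cache = 0));
        size_t h = 0;
        foreach (char c; text)
            h = h * 31 + c;
        return h;
    }

    // The caching variant has to give up const.
    size_t cachingHash()
    {
        if (!cached)
        {
            cache = hash();
            cached = true;
        }
        return cache;
    }
}

void main()
{
    auto m = Expensive("abc");
    assert(m.cachingHash() == m.hash());
    assert(m.cached);

    const Expensive k = Expensive("abc");
    assert(k.hash() == m.hash());
    // A const object cannot reach the caching variant at all.
    static assert(!__traits(compiles, k.cachingHash()));
}
```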

So the bottom line boils down to, we want logical const for some objects, but D doesn't have logical const. Imagining that it does only breaks the type system and any guarantees the language provides. Whether we _should_ have logical const in D is, of course, something to be discussed, but the main objection against that is that it's not enforceable. What constitutes a "non-visible" change of state? It's not possible to tell without solving the halting problem, unfortunately. An object's state can be extremely complex, with only a certain subset of state changes being visible. There's no feasible way for the compiler to figure this out automatically, and so you end up with the C++ const, which can be cast away anytime, anyday, and therefore is pretty much useless except in theory. All it takes is for _one_ function to be const incorrect, and you have a hole in which supposedly immutable objects get changed when they aren't supposed to.


[...]
> I was referring to a potential port from C#, which has no const. My particular data structure (a complex beast) contains a mutable tree of arbitrary size, which the user can convert to a conceptually immutable tree in O(1) time by calling Clone(). This marks a flag in the root node that says "read-only! do not change" and shares the root between the clones. At this point it should be safe to cast the clone to immutable. However, the original, mutable-typed version still exists. As the user requests changes to the mutable copy in the future, parts of the tree are duplicated to avoid changing the immutable nodes, with one exception: the read-only flag in various parts of the original, immutable tree will gradually be set to true.
[...]

Yeah, this is logical const. Unfortunately, D doesn't have logical const.


T

-- 
In a world without fences, who needs Windows and Gates? -- Christian Surchi
July 10, 2012
On 07/10/2012 06:45 PM, H. S. Teoh wrote:
> Yeah, this is logical const. Unfortunately, D doesn't have logical
> const.
>

Then why on earth is druntime acting as if it does?
July 10, 2012
On Tue, Jul 10, 2012 at 06:48:51PM +0200, Timon Gehr wrote:
> On 07/10/2012 06:45 PM, H. S. Teoh wrote:
> >Yeah, this is logical const. Unfortunately, D doesn't have logical const.
> >
> 
> Then why on earth is druntime acting as if it does?

Y'know, this brings up an interesting question. Do methods like toString _need_ to be const? That is, _physical_ const?  Or are we unconsciously conflating physical const with logical const here?

Yes, certain runtime operations need to be able to work with const methods, but I wonder if those required const methods really belong to a core set of more primitive operations that guarantee physical const, and perhaps shouldn't be conflated with logical operations like "convert this object to a string representation", which _may_ require caching, etc.?

Or perhaps application code wants to define its own non-const versions of certain methods so that it can do whatever it needs to do with logical const, without worrying about breaking physical const-ness.

I'm starting to think that D's hardline approach to const is clashing with the principle of information hiding. Users of a class shouldn't _need_ to know if an object is caching the value of toString, toHash, or whatever it is. What they care for is that the object doesn't visibly change, that is, logical const. Binary const implies logical const, but the implication doesn't work the other way round. While it's nice to have binary const (strong, enforceable guarantee), it breaks encapsulation: just because a class needs to do caching, means its methods can't be const, and this is a visible (and viral, no less) change in its external API. What should just be an implementation detail has become a visible difference to the outside world -- encapsulation is broken.

I don't know how to remedy this. It's clear that physical const does have its value -- it's necessary to properly support immutable, allows putting data in ROM, etc. But it's also clear that something is missing from the picture. Implementation details are leaking past object APIs, caching and other abstractions can't work with const, etc., and that's not a good thing.


T

-- 
Doubt is a self-fulfilling prophecy.
July 10, 2012
On Tuesday, July 10, 2012 10:13:57 H. S. Teoh wrote:
> On Tue, Jul 10, 2012 at 06:48:51PM +0200, Timon Gehr wrote:
> > On 07/10/2012 06:45 PM, H. S. Teoh wrote:
> > >Yeah, this is logical const. Unfortunately, D doesn't have logical const.
> > 
> > Then why on earth is druntime acting as if it does?
> 
> Y'know, this brings up an interesting question. Do methods like toString _need_ to be const? That is, _physical_ const? Or are we unconsciously conflating physical const with logical const here?
> 
> Yes, certain runtime operations need to be able to work with const methods, but I wonder if those required const methods really belong to a core set of more primitive operations that guarantee physical const, and perhaps shouldn't be conflated with logical operations like "convert this object to a string representation", which _may_ require caching, etc.?

For a member function to be called on a const object, that function must be const. Whether it's logical const or physical const is irrelevant as far as that goes. As such, opEquals, opCmp, toHash, and toString all need to be const on Object, or it will be impossible for const Objects to work properly. Ideally, we'd also have a way to make it possible to have objects which aren't const and can't be const use those functions (which given physical constness obviously requires a separate function - be it an overload or an entirely separate function), but without those functions being const, const objects don't work.
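
A small sketch of that rule, using a made-up class `Account` (not part of druntime) rather than Object itself:

```d
class Account
{
    private string owner;

    this(string owner) { this.owner = owner; }

    // Callable on mutable, const and immutable references.
    string describe() const { return "Account(" ~ owner ~ ")"; }

    // Callable on mutable references only.
    void rename(string newOwner) { owner = newOwner; }
}

void main()
{
    Account m = new Account("ann");
    const(Account) c = m; // mutable converts to const implicitly

    m.rename("bob");
    assert(c.describe() == "Account(bob)"); // same object, seen through const

    static assert(__traits(compiles, c.describe()));
    static assert(!__traits(compiles, c.rename("eve")));
}
```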

Of greater debate is whether opEquals, opCmp, toString, and toHash on structs need to be const. Aside from druntime functions wanting to be able to take their arguments as const, I don't really see that as being necessary (though Walter wants to require that they all be @safe const pure nothrow regardless of whether they're classes or structs), and since druntime probably has to templatize the functions which would take const anyway (and it can use inout or templatize them anyway if it doesn't need to), I wouldn't expect that much in druntime would require const. It should work with it, but it shouldn't need it. So, I don't think that structs really need to have those functions be const.

Classes is where it's a big problem - because of inheritance.

- Jonathan M Davis
July 10, 2012
On Tue, Jul 10, 2012 at 02:04:04PM -0400, Jonathan M Davis wrote:
> On Tuesday, July 10, 2012 10:13:57 H. S. Teoh wrote:
[...]
> > Y'know, this brings up an interesting question. Do methods like toString _need_ to be const? That is, _physical_ const? Or are we unconsciously conflating physical const with logical const here?
> > 
> > Yes, certain runtime operations need to be able to work with const methods, but I wonder if those required const methods really belong to a core set of more primitive operations that guarantee physical const, and perhaps shouldn't be conflated with logical operations like "convert this object to a string representation", which _may_ require caching, etc.?
> 
> For a member function to be called on a const object, that function must be const. Whether it's logical const or physical const is irrelevant as far as that goes. As such, opEquals, opCmp, toHash, and toString all need to be const on Object, or it will be impossible for const Objects to work properly.

Yes, they have to be const, but by doing so, we are implicitly forcing physical constness on all of them (because that's the only const D knows). The question is whether this is the way we should go.


> Ideally, we'd also have a way to make it possible to have objects which aren't const and can't be const use those functions (which given physical constness obviously requires a separate function - be it an overload or an entirely separate function), but without those functions being const, const objects don't work.

Which is why I suggested in another post to have both const and non-const variants of the methods. But that isn't a good solution either, because what if some objects simply can't have const versions of those methods? Plus, it leads to needless code duplication -- even if you implement a non-const toString method, you still need to also implement a const toString method because people will expect to be able to call the const method.


> Of greater debate is whether opEquals, opCmp, toString, and toHash on structs need to be const. Aside from druntime functions wanting to be able to take their arguments as const, I don't really see that as being necessary (though Walter wants to require that they all be @safe const pure nothrow regardless of whether they're classes or structs), and since druntime probably has to templatize the functions which would take const anyway (and it can use inout or templatize them anyway if it doesn't need to), I wouldn't expect that much in druntime would require const. It should work with it, but it shouldn't need it. So, I don't think that structs really need to have those functions be const.
> 
> Classes is where it's a big problem - because of inheritance.
[...]

I think hidden somewhere in this is an unconscious conflation of physical const with logical const.

Take toHash, for example. Why does it need to be const? The naïve assumption would be, well, we're taking the hash of some field values, and that shouldn't require changing anything, so yeah, it's a const method. However, that doesn't fully cover all the use cases of toHash. For one thing, hashes _don't_ need to be based on every single field in the struct/object. I can easily decide, in my custom toHash function, to only compute the hash value based on two out of five fields in my struct (perhaps only those two fields matter for whatever I'm using the hash value for). So I don't care what the values of the other fields are. In particular, if the hash value is expensive to compute, I want to be able to cache the computed value in one of the other fields.

So here's a hidden assumption, that toHash must be const -- it must be logical const, yes, but that is in no way equivalent to physical const. In this case, I can't use toHash at all, because it doesn't permit caching, even though its computed value is based only on the unchanged fields. Or, to take this point further, what I _really_ mean is that if my struct is:

	struct S {
		string x,y;	// hash computed on these values
		hash_t cache;
		int p,q;	// not used by toHash
	}

then my toHash method really is expecting this struct:

	struct logical_const_S {
		const(string) x,y;
		hash_t cache;
		int p,q;	// these can be const or not, we don't care
	}

AFAIK, D currently doesn't allow implicit conversion from S to logical_const_S. If it did, and if there was a simple way to express this in the method signature of toHash, then I bet a lot of the complaints about const in druntime would go away, because then we'd have a way of doing caching or whatever it is people feel is indispensable, *without* breaking D's const system.
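
As a sketch of where things stand today: the field-by-field conversion is legal, the whole-struct conversion is not, and the "view" has to be built by hand as a copy. The free function `cachedHash` and the name `LogicalConstS` are hypothetical stand-ins, and treating a cache value of 0 as "not computed yet" is a simplification for the example.

```d
struct S
{
    string x, y;   // hash computed on these values
    size_t cache;
    int p, q;      // not used by the hash
}

struct LogicalConstS
{
    const(string) x, y;
    size_t cache;
    int p, q;
}

// Hypothetical stand-in for toHash: reads x and y, writes only the cache.
size_t cachedHash(ref LogicalConstS s)
{
    if (s.cache == 0) // simplification: 0 means "not computed yet"
    {
        size_t h = 17;
        foreach (char c; s.x) h = h * 31 + c;
        foreach (char c; s.y) h = h * 31 + c;
        s.cache = h;
    }
    return s.cache;
}

void main()
{
    // Field by field the conversion is fine...
    static assert(is(string : const(string)));
    // ...but the struct as a whole does not convert.
    static assert(!is(S : LogicalConstS));

    auto s = S("foo", "bar");
    // The view has to be built by hand, as a copy.
    auto view = LogicalConstS(s.x, s.y, s.cache, s.p, s.q);
    assert(cachedHash(view) == cachedHash(view));
    assert(view.cache != 0);
    static assert(!__traits(compiles, view.x = "changed"));
    assert(s.cache == 0); // the copy was updated, not the original
}
```

Because the view is a copy, the cached value never makes it back into `s`, which is exactly why an implicit, in-place conversion would be needed for this to be useful.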

Now to bring this to my other point: the conversion S -> logical_const_S would allow, to some limited extent, a non-leaky object API (and by leaky I mean breaks encapsulation). Currently, if I declare toHash as a const method, it means that I guarantee the object won't mutate in that method, not even mutation that *still retains the same logical value*. But the user of my class doesn't -- and shouldn't -- care about that. As long as the public methods of the class do not exhibit any visible change, then I should have the freedom to mutate whatever I like inside a logical const method.

For example, say I have this base class:

	class B {
		private string x, y;
		hash_t toHash() {
			// compute hash value based on x and y
		}
		string xGetter() const { return x; }
		string yGetter() const { return y; }
	}

Now say I have a derived class:

	class D : B {
		bool cached = false;
		hash_t hash_cache;

		override hash_t toHash() {
			if (!cached) {
				// expensive computation, done only once
				// (super.toHash() is just a stand-in here)
				hash_cache = super.toHash();
				cached = true;
			}
			return hash_cache;
		}
	}

What I _really_ want to be able to do, is to declare D.toHash() as
taking this class instead:

	class logical_const_D {
		private const(string) x, y;
		bool cached = false;
		hash_t hash_cache;

		...
	}

This class has const versions of the fields inherited from B, but _mutable_ versions of cached and hash_cache. Such an object can still be used with B.xGetter and B.yGetter, because as far as _they're_ concerned, the object is still const.

More importantly, if the language allows implicit conversion from D to logical_const_D, (since mutable x can implicitly convert to const x, and ditto for y), then the definition of logical_const_D doesn't have to be public.  Thus, I can declare my class D something like this:

	class D : B {
		private:
			// define logical_const_D here
		public:
			hash_t toHash() logical_const_D { ... }
	}

The end-user doesn't need to know what logical_const_D is; if he has an
object of type D that can implicitly convert to logical_const_D, then he
can use toHash() on it. If he has an immutable(D), then he can't use
toHash() (because immutable(bool) and immutable(hash_t) can't implicitly
convert to bool and hash_t).

This way, we preserve the type system, *and* allow a caching implementation of toHash, *and* preserve encapsulation (user doesn't need to know which fields actually get changed by toHash -- that's an implementation detail).


T

-- 
Being able to learn is a great learning; being able to unlearn is a greater learning.
July 10, 2012
On Tuesday, July 10, 2012 12:00:59 H. S. Teoh wrote:
> I think hidden somewhere in this is an unconscious conflation of physical const with logical const.

I completely disagree at least as far as classes go. opEquals, opCmp, toString, and toHash must be _physically_ const, because they must work with physically const objects. There is _no_ way around that, and whether the actual internals of those functions could conceivably mutate something if they were logically const is irrelevant. The fact that D's const is physical const and that we must be able to have const objects _mandates_ that those functions be const. There is no other option. There is no conflating of physical constness and logical constness there. It's purely a matter of making those functions callable by const objects (which happen to be physically const, because D's const is physical const, but even if D's const were logical const, the situation wouldn't change; those functions would still need to be const to work with const objects - it would just be logical const rather than physical const).

We may be able to make it possible to use non-const objects with those functions as well via overloading or whatnot, but as far as classes go, it's an absolute requirement that those functions be const. And they also need to be @safe, pure, and nothrow, or classes in general are similarly screwed with regards to those attributes.

Structs is the _only_ place where having those functions being required to be const is arguably conflating logical const and physical const. That's not the case with classes at all.

- Jonathan M Davis
July 10, 2012
On Tue, Jul 10, 2012 at 04:11:05PM -0400, Jonathan M Davis wrote:
> On Tuesday, July 10, 2012 12:00:59 H. S. Teoh wrote:
> > I think hidden somewhere in this is an unconscious conflation of physical const with logical const.
> 
> I completely disagree at least as far as classes go. opEquals, opCmp, toString, and toHash must be _physically_ const, because they must work with physically const objects.
[...]

Yes, this is because D only has physical const.

But physical const breaks encapsulation, and precludes a variety of applications such as caching, objects accessed over the network, etc. I think that's what this uproar is all about. Physical const is useful, and in many cases necessary, but I think we're deceiving ourselves if we imagine that physical const is the whole story.

With the current state of affairs, the scope of const is greatly limited. If I want objects which cache hash values, I'm out of luck, I have to write my own non-const methods. If I want objects accessed over the network, I'm out of luck, I can't use const. If I'm writing a base class that _might_ have _one_ derived class that requires a non-const version of a method, I'm out of luck, I can't use const at all. If my class could conceivably be inherited by third party code, then I can't use const -- because otherwise I might preclude my customers from writing code that caches values.

And when I can't use const, I have to write my own version of opEquals (and call it something else), my own convention for computing hash values, my own version of everything in druntime.  All that elaborate infrastructure in druntime becomes practically worthless.  It's all-or-nothing. If I'm OK with physical const, then all is fine and dandy.  But as soon as one thing can't be const, I've to re-engineer my entire framework from ground up.

Isn't there something we can do to improve this situation?


T

-- 
Sometimes the best solution to morale problems is just to fire all of the unhappy people. -- despair.com
July 10, 2012
On Tuesday, July 10, 2012 14:19:46 H. S. Teoh wrote:
> On Tue, Jul 10, 2012 at 04:11:05PM -0400, Jonathan M Davis wrote:
> > On Tuesday, July 10, 2012 12:00:59 H. S. Teoh wrote:
> > > I think hidden somewhere in this is an unconscious conflation of physical const with logical const.
> > 
> > I completely disagree at least as far as classes go. opEquals, opCmp, toString, and toHash must be _physically_ const, because they must work with physically const objects.

> Isn't there something we can do to improve this situation?

There may be. It's an open question. For opEquals, it was suggested (by Steven IIRC) that we could make it so that the free function opEquals works with both const and non-const objects such that if your class defines a non-const opEquals, and you compare mutable instances of that class, it would use the non-const version. It already has to be templated anyway. And if we make it so that Object has both a const and non-const overload of opEquals, then the appropriate one will be used. Classes which can be const have their non-const opEquals call their const opEquals, and those that can't throw an Error from the const version if it's ever called. We could probably do the same thing with the other 3 functions, and there have been other, similar proposals. Presumably, we can work _something_ out, but how best to do it is still an open question. However, regardless of whether we can sort out making classes which can't be const work, there's no question that we must have const versions of opEquals, opCmp, toString, and toHash on Object, otherwise const is horribly broken.
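
A minimal sketch of the overload idea, using a made-up class `Total` with its own method instead of Object.opEquals (none of these names exist in druntime, and the global counter is only there to show which overload ran):

```d
int constCalls; // counts how often the const overload runs (illustration only)

class Total
{
    private int[] data;
    private long cache;
    private bool cached;

    this(int[] data) { this.data = data; }

    // const overload: always recomputes, works for const and immutable references.
    long total() const
    {
        ++constCalls;
        long s = 0;
        foreach (x; data)
            s += x;
        return s;
    }

    // Mutable overload: forwards to the const one and caches the answer.
    long total()
    {
        if (!cached)
        {
            const Total self = this;
            cache = self.total();
            cached = true;
        }
        return cache;
    }
}

void main()
{
    auto m = new Total([1, 2, 3]);
    assert(m.total() == 6);
    assert(m.total() == 6);
    assert(constCalls == 1); // second call was served from the cache

    const Total c = m;
    assert(c.total() == 6);
    assert(constCalls == 2); // const reference picks the const overload
}
```

A class that genuinely cannot be const would, under the suggestion above, throw an Error from its const overload instead of computing anything.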

- Jonathan M Davis
July 10, 2012
On Tuesday, 10 July 2012 at 21:18:18 UTC, H. S. Teoh wrote:
> If I'm OK with physical const, then all is fine and
> dandy.  But as soon as one thing can't be const, I've to re-engineer my
> entire framework from ground up.
>
> Isn't there something we can do to improve this situation?
>
>
> T

I don't think the answer is to change D's const. D's const, unlike C++'s const, only exists to bridge immutable and mutable data. As soon as it becomes incompatible with immutable (which C++'s const very much is), it ceases to be purposeful.
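
A tiny sketch of that bridging role (the function `sum` is a made-up helper for illustration):

```d
// Hypothetical helper: takes a const view, so it accepts both kinds of data.
long sum(const(int)[] values)
{
    long total = 0;
    foreach (v; values)
        total += v;
    return total;
}

void main()
{
    int[] mutableData = [1, 2, 3];
    immutable(int)[] frozenData = [4, 5, 6];

    assert(sum(mutableData) == 6);
    assert(sum(frozenData) == 15);

    // Mutable and immutable do not convert to each other, only to const.
    static assert(!is(int[] : immutable(int)[]));
    static assert(!is(immutable(int)[] : int[]));
    static assert(is(int[] : const(int)[]));
    static assert(is(immutable(int)[] : const(int)[]));
}
```

If `sum` were allowed to write through its const parameter, the call with `frozenData` would silently modify immutable data.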

The problem arises when const is forced upon interfaces like toString and opEquals. We don't expect these functions to change observable state. In other words, we expect them to be logically constant, but not necessarily bitwise constant. That's not something the compiler can enforce, and it shouldn't try to. Other operators, e.g. opIndex, correctly leave the responsibility to the programmer.

There's also the reality that you want toString, opEquals, etc. to work with immutable (run-time type) class instances, requiring compatibility with const (compile-time type) references.

What I don't understand is why we've chosen const over mutable, when we should strive for allowing either and maybe even both. We've moved from one end of the scale to the opposite - our current situation is equally limiting to the previous situation.

I think these changes were rushed; understandable considering the amount of pressure, but it just fixes one problem and creates an equally big new problem.