April 14, 2008 A fresh look at comparisons | ||||
---|---|---|---|---|
| ||||
Every now again, someone (rightly) complains that opEquals should return bool, not int. However, that's not my biggest complaint with opEquals and opCmp. My biggest complaint is that it is very tedious having to write code like: class C { int x; int opEquals(Object o) { C c = cast(C) o; if (c is null) throw new Exception("Cannot compare C with wrong type"); return x == c.x; } } (and having to do so for every class I create). This is just wrong on so many levels. It's wrong because we return int, not bool, yes. It's wrong because opEquals should be a const function (it doesn't modify this). It's wrong because it should accept a const object (it doesn't modify it's input). But most of all, why should I have to do all the work of making sure the input parameter is the right type? What should I do if it's not? Should I return false? Should I throw an exception? If I throw an exception, what type of exception should it be? Is it OK to throw an object.Exception or should I roll my own kind of Exception? It there some standard message I should be using for the string? Does it matter that the message is in English and not localized? And it gets worse. For complex numbers we (correctly) have the following result: cdouble x = 1; cdouble y = 1i; (x !<>= y) == true Try doing that with a custom type! The problem is, opCmp() just isn't powerful enough. With opCmp, there is no way to say "not less than, not equal to, and not greater than". I think it's time to take a fresh look at comparisons, and to realise that comparison is /special/, in much the same way that a constructor is special. It shouldn't be a regular function. It should be a special function, with its own special syntax, and some extra special rules that make it (1) easy to write, (2) logical, and (3) behave sensibly. So with all that borne in mind, let's consider a new proposal to /replace/ opEquals. (And later down the page, I shall suggest a similar proposal to replace opCmp). Instead of: class C { int x; int opEquals(Object o) { C c = cast(C) o; if (c is null) throw new Exception("Cannot compare C with wrong type"); return x == c.x; } } we do class C { int x; is(this == c) { return x == c.x; } } Because this is no longer a regular function, the compiler can take care of all the casting and exception throwing, so we don't have to. Moreover, it is now possible for the compiler to generate a *default* equality comparison, if we don't supply one. The default equality comparison would be recursive memberwise equality comparison (which, after all, is usually the correct thing thing to do). And /of course/, the function should return bool, and should consider both "this" and "c" to be both of the same type, and both const. Likewise, we replace opCmp with: class C { int x; is(this < c) { return x < c.x; } } The default implementation of is(this < c) should be to return false always, which implies that comparison is not meaningful for this type. The compiler can auto-generate all fourteen possible comparisons using only these two functions, as follows: (a == b) == (a == b) (a < b) == (a < b) (a <= b) == (a == b || a < b) (a <> b) == (a < b || b < a) (a <>= b) == (a == b || a < b || b < a) (a > b) == (b < a) (a >= b) == (a == b || b < a) (a != b) == (!(a == b)) (a !<> b) == (!(a < b || b < a)) (a !<>= b) == (!(a == b || a < b || b < a)) (a !< b) == (!(a < b)) (a !<= b) == (!(a == b || a < b)) (a !> b) == (!(b < a)) (a !>= b) == (!(a == b || b < a)) We could allow the programmer to supply some or all of these, if they so desire, for reasons of efficiency, To my mind, this proposal fixes everything that is wrong with comparison in D. |
April 14, 2008 Re: A fresh look at comparisons | ||||
---|---|---|---|---|
| ||||
Posted in reply to Janice Caron |
I like this proposal but it might have efficiency issues when you need to find out quickly if you want to order two values (you would often have to call both functions).
What about this (much more primitive) approach:
class C {
int category;
int x;
ComparsionResult opCompare(C other) {
if(category == other.category) {
// This could be shortened to something like
// int.compare(x, other.x)
if(x == other.x) {
return ComparsionResult.EQUAL;
}
else if(x > other.x) {
return ComparsionResult.GREATER;
}
else {
// a < b is "unkown" meaning
// compiler will try to get it via
// b > a which is defined above
return ComparsionResult.UNKNOWN;
}
}
else {
// Objects are simply incomparable
// (like (1 - i) and (i - 1) for example)
return ComparsionResult.UNDEFINED;
}
} // opCompare()
} // class C
whereas ComparsionResult would be an enum defined in object.d or so. This should avoid all mentioned issues without a syntax change.
Henning
--
GPG Public Key: http://gpg-keyserver.de/pks/lookup?op=get&search=0xDDD6D36D41911851
|
April 14, 2008 Re: A fresh look at comparisons | ||||
---|---|---|---|---|
| ||||
Posted in reply to Henning Hasemann | On 14/04/2008, Henning Hasemann <hhasemann@web.de> wrote:
> What about this (much more primitive) approach:
One of the reasons that I believe comparison is "special", like constructors, is because of the following:
class A
{
int opEquals(Object o)
{
...
}
int opCmp(Object o)
{
...
}
}
class B :A
{
int n;
}
Hopefully, you can see the problem here immediately. B extends A, but B does not provide opEquals nor opCmp, so if two Bs are compared, A's compare functions will be called, which means that B's member n will be ignored.
This is, of course, absolutely normal behavior for classes. It's how inheritance works. But it's just not right for comparisons.
And important part of my proposal is that comparisons don't inherit. Instead, we have the default behaviors that I described. Thus, under my proposal, if B did not provide an is(this == c), then a default implementation would be provided by the compiler, equivalent to
is(this == c)
{
return super == c.super && n == c.n;
}
This is very different from just changing the signature of a function.
Also, there really isn't any need for any comparison result to be undefined. If (a < b) is not meaningful, it should suffice for (a < b) and (b < a) both to return false. That's because less-than really is a boolean question. If I ask "is a less than b?", and ordering is not defined for that type, then it seems to me that "no" is the right answer. (No, a is not less than b, because a and be cannot be ordered). It is far, far less complicated that way - less-than is a boolean question, not a yes/no/maybe question. However, what you /do/ need to lose, is the assumption that (a < b) implies (a !>= b), because that no longer holds.
|
April 14, 2008 Re: A fresh look at comparisons | ||||
---|---|---|---|---|
| ||||
Posted in reply to Henning Hasemann | On 14/04/2008, Henning Hasemann <hhasemann@web.de> wrote:
>
> I like this proposal but it might have efficiency issues when you need
> to find out quickly if you want to order two values (you would often
> have to call both functions).
The best way to fix the efficiency issue would be to tell the compiler whether the comparison was fully ordered or not. If the comparison is fully ordered, then the compiler may implement
a <= b
as
!(b < a)
whereas, if the comparison is /not/ fully ordered, then the compiler would need to go the long route, and implement it as
a == b || a < b
So the question then becomes, how we tell the compiler whether or not a comparison is fully ordered. My suggestion would be, assume yes by default, and use a slightly modified syntax if not. e.g.
class NotFullyOrdered
{
int x;
is(this < c, unordered)
{
return whatever;
}
}
The default, compiler-generated comparison function, if not supplied by the programmer, should be
is(this < c, unordered)
{
/* recursive memberwise comparison */
}
That buys you the best of all worlds.
|
April 14, 2008 Re: A fresh look at comparisons | ||||
---|---|---|---|---|
| ||||
Posted in reply to Janice Caron | Janice Caron wrote: > On 14/04/2008, Henning Hasemann <hhasemann@web.de> wrote: >> I like this proposal but it might have efficiency issues when you need >> to find out quickly if you want to order two values (you would often >> have to call both functions). > > > The best way to fix the efficiency issue would be to tell the compiler > whether the comparison was fully ordered or not. If the comparison is > fully ordered, then the compiler may implement > > a <= b > > as > > !(b < a) > > whereas, if the comparison is /not/ fully ordered, then the compiler > would need to go the long route, and implement it as > > a == b || a < b > > So the question then becomes, how we tell the compiler whether or not > a comparison is fully ordered. My suggestion would be, assume yes by > default, and use a slightly modified syntax if not. e.g. > > class NotFullyOrdered > { > int x; > > is(this < c, unordered) > { > return whatever; > } > } > > The default, compiler-generated comparison function, if not supplied > by the programmer, should be > > is(this < c, unordered) > { > /* recursive memberwise comparison */ > } > > That buys you the best of all worlds. Why not put the flag as an arg to is-- a la "cast(extra_arg)" or "foreach(extra_arg)(some;stuff)" as Andrei proposed. So it would be is(unordered)(this<c) I like the ideas you're throwing around here, anyway. Handling of comparisons could use some work. Writing "multi-column" lexical opCmps is also really annoying. Not to mention wanky-ness about opCmp happily compiling with ref/value/ or ptr members but only actually working in all cases when you use value parameters. For sorting, pointer parameters work. For comparisons in regular code ref parameters work. The whole thing is just badly in need of some loving. --bb |
April 14, 2008 Re: A fresh look at comparisons | ||||
---|---|---|---|---|
| ||||
Posted in reply to Henning Hasemann | On 14/04/2008, Henning Hasemann <hhasemann@web.de> wrote:
> // Objects are simply incomparable
> // (like (1 - i) and (i - 1) for example)
> return ComparsionResult.UNDEFINED;
This is worth spending a bit of time discussing. The fact is that, in D, that particular comparison is not undefined. In fact, it is /very well/ defined. It is defined as follows:
cdouble a = 1 - 1i;
cdouble b = 1i - 1;
(a == b) == false
(a < b) == false
(a <= b) == false
(a <> b) == false
(a <>= b) == false
(a > b) == false
(a >= b) == false
(a != b) == true
(a !<> b) == true
(a !<>= b) == true
(a !< b) == true
(a !<= b) == true
(a !> b) == true
(a !>= b) == true
It would be very bad indeed for D, if less-than suddenly became a tristate operation! In my opinion, we really don't want that. So, we don't want to be returning UNDEFINED - instead, we want to be defining the result of each of the fourteen possible comparisons so that each gives a purley boolean result, and that unorderedness is indicated by patterns such as the above. My suggestion does give us that.
|
April 14, 2008 Re: A fresh look at comparisons | ||||
---|---|---|---|---|
| ||||
Posted in reply to Janice Caron | "Janice Caron" <caron800@googlemail.com> wrote: > On 14/04/2008, Henning Hasemann <hhasemann@web.de> wrote: > > // Objects are simply incomparable > > // (like (1 - i) and (i - 1) for example) > > return ComparsionResult.UNDEFINED; > > This is worth spending a bit of time discussing. The fact is that, in > D, that particular comparison is not undefined. In fact, it is /very > well/ defined. It is defined as follows: > (snip) Yeah, maybe undefined was simply the wrong word. What I mean is "there is no meaningful order between the two" ie "it is neither '<' nor '>' nor '=='" ie "the ORDER between these two is not defined" (which is something you can't express with the current opCmp) Just as it is with (1 - i) and (i - 1) whatever the correct term for that would be. (I think UNORDERED would be way better) I just wanted to distinguish that from UNKNOWN which would allow the compiler to search for an alternative form. Here for clarity definitions of what I meant with the two terms in my previous posting: ComparsionResult.UNKNOWN: The comparsion function does not make a statement wether the two objects are equal or a is greater b or whatever. Think of it as the comparsion function is simply not defined this way. For example if the compiler would ask an object a for "a > b" and a would return ComparsionResult.UNKNOWN The compiler can and should ask b for "b < a". If the compiler doesn't find any rewriting of the term that returns something else than UNKNOWN, every comparsion yields false. (Ie the same as if one comparsion would have returned UNDEFINED/UNORDERED) This would normally not assume well-ordering (ie the compiler would *not* check for "a !<= b". One could imagine a special flag like suggested for the is-thingy) ComparsionResult.UNDEFINED/ComparsionResult.UNORDERED: If a is asked for a for being "<", ">", "<=", ">=" or "==" b, false is returned. Ie there is no order between the two objects. Naturally this implies that all the comparsion operators starting with "!" return true. Implications of this approach in contrast to the is-Approach: - As said, no special syntax except maybe for the well-ordered flag - This version *enforces* that two objects a and b can only be in one of the following 4 states: * a is greater than b * b is greater than a (same as a is lower than b) * a is equal to b * a and be are unordered to each other This is clearly a limitation (eg a and b can not be == and < at the same time). I guess it depends strongly on what you want to do with your operators if this can be considered a good thing. - You can inherit comparsion which might be considered useful, especially if your super class compares private members and you want to refer to super class equality: class A { int x,y; ComparsionResult opCompare(A other) { // We only care for equality if(x == other.x and y == other.y) return ComparsionResult.EQUAL; return ComparsionResult.UNORDERED; } } class B : A { int z; ComparsionResult opCompare(B other) { if(super.opCompare(other) == ComparsionResult.EQUAL && z == other.z) { return ComparsionResult.EQUAL; } return ComparsionResult.UNORDERED; } } - The is() approach has a much nicer syntax and its much more easy to write cleary what you want - This approach secretely assumes some magic in that the parameter to opCompare would always be of the same type as the class. (comparsion with other objects would be UNORDERED). The is approach addresses this better in so far that its clearly visible something special is going on here and that its NOT a regular method. To your argument: """ One of the reasons that I believe comparison is "special", like constructors, is because of the following: class A { int opEquals(Object o) { ... } int opCmp(Object o) { ... } } class B :A { int n; } Hopefully, you can see the problem here immediately. B extends A, but B does not provide opEquals nor opCmp, so if two Bs are compared, A's compare functions will be called, which means that B's member n will be ignored. This is, of course, absolutely normal behavior for classes. It's how inheritance works. But it's just not right for comparisons. """: I don't think this is bad or even would feel somehow incorrect. Depending on what type of code you write you don't want every member automagically being compared. In some game code I write curretly there are for example lots of members which would not apply for comparsion. I think its more a matter of taste if members should be compared by default or not. But with throwing away inheritance you throw away all the considerations you made in A which properties identify two equal objects. Is that really what you want? So enough for the war for now ;) I really like your idea just to make myself clear. (And except for a few points it seems to be the better way especially regarding clarity) I just had this idea popped up when I read yours and wanted to fully discuss it. Hereby done. Henning -- GPG Public Key: http://gpg-keyserver.de/pks/lookup?op=get&search=0xDDD6D36D41911851 |
April 14, 2008 Re: A fresh look at comparisons | ||||
---|---|---|---|---|
| ||||
Posted in reply to Henning Hasemann | On 14/04/2008, Henning Hasemann <hhasemann@web.de> wrote: > Depending on what type of code you write you don't want every member > automagically being compared. In some game code I write curretly there > are for example lots of members which would not apply for comparsion. > I think its more a matter of taste if members should be compared by > default or not. Yes, all we're talking about here is a sensible /default/. The fact that you can override the default, by supplying your own function, means that, in some sense, it almost doesn't matter what the default is, because if it's not what you want, then you can override it. But even so - it's nice to get the default covering the most common cases. To me, it seems sensible that the default implementation should be to compare all members (including super). This means that if a subclass adds new members, then by default, a test for subclass equality will be true iff (1) the superclass compares as equal, and (2) the new members also compare as equal. The inheritance mechanism (which is what we have now, thanks to the fact that opEquals() is a regular function), means that if a subclass adds new members, then by default, a test for subclass equality will be true iff the superclass compares as equal. All new members will be ignored. > But with throwing away inheritance you throw away all the > considerations you made in A which properties identify two equal > objects. Is that really what you want? It's not "throwing away", because the default implementation would include a test for superclass equality. ("super" is a member just like any other). So if B extends A, and also adds two member variables x and y, then the test (b1 == b2) would be evaluated as: // Your scheme b1.super == b2.super or // My scheme b1.super == b2.super && b1.x == b2.x && b1.y == b2.y So we don't lose inheritance - we just gain extra tests. Of course, if you don't /want/ those extra tests, then you would have to override the default - such is the nature of defaults - by doing something like class B : A { int x,y; is(this == that) { return super == that.super; // x and y are now ignored } } > So enough for the war for now ;) > I really like your idea just to make myself clear. Thanks. And there was never any question of warfare! :-) |
April 14, 2008 Re: A fresh look at comparisons | ||||
---|---|---|---|---|
| ||||
Posted in reply to Janice Caron | I like this, except for the semantic irregularities similar to Scala we'd be introducing by not having the operator evaluated.
Also, if you can't cast successfully I'd think you would simply return false for opEquals.
if o !is C, then they can't possibly be equal. ;-)
The idea behind opEquals acting this way, I assume naturally, is that it enables the ability for your type to know how to compare itself to multiple *other* types which exist. Somewhat similar to how you can freely compare a long and an int in many cases, without ramifications or need to know any bit-wise or type difference. So what is the answer? What makes two types comparable? Common inheritance? Where? How do I specify (if all I'm saying is "is(this == c)" the possible types I can be compared to within my system. What if I know these types, and they are not my descendants? Or if they are my descendants, when and how can I say that my descendant is not comparable to it's parent?
Cheers,
Scott S. McCoy
On Mon, 2008-04-14 at 09:18 +0100, Janice Caron wrote:
> Every now again, someone (rightly) complains that opEquals should return bool, not int. However, that's not my biggest complaint with opEquals and opCmp. My biggest complaint is that it is very tedious having to write code like:
>
> class C
> {
> int x;
>
> int opEquals(Object o)
> {
> C c = cast(C) o;
> if (c is null)
> throw new Exception("Cannot compare C with wrong type");
> return x == c.x;
> }
> }
>
> (and having to do so for every class I create). This is just wrong on so many levels. It's wrong because we return int, not bool, yes. It's wrong because opEquals should be a const function (it doesn't modify this). It's wrong because it should accept a const object (it doesn't modify it's input). But most of all, why should I have to do all the work of making sure the input parameter is the right type?
>
> What should I do if it's not? Should I return false? Should I throw an exception? If I throw an exception, what type of exception should it be? Is it OK to throw an object.Exception or should I roll my own kind of Exception? It there some standard message I should be using for the string? Does it matter that the message is in English and not localized?
>
> And it gets worse. For complex numbers we (correctly) have the following result:
>
> cdouble x = 1;
> cdouble y = 1i;
>
> (x !<>= y) == true
>
> Try doing that with a custom type! The problem is, opCmp() just isn't powerful enough. With opCmp, there is no way to say "not less than, not equal to, and not greater than".
>
> I think it's time to take a fresh look at comparisons, and to realise that comparison is /special/, in much the same way that a constructor is special. It shouldn't be a regular function. It should be a special function, with its own special syntax, and some extra special rules that make it (1) easy to write, (2) logical, and (3) behave sensibly.
>
> So with all that borne in mind, let's consider a new proposal to /replace/ opEquals. (And later down the page, I shall suggest a similar proposal to replace opCmp). Instead of:
>
> class C
> {
> int x;
>
> int opEquals(Object o)
> {
> C c = cast(C) o;
> if (c is null)
> throw new Exception("Cannot compare C with wrong type");
> return x == c.x;
> }
> }
>
> we do
>
> class C
> {
> int x;
>
> is(this == c)
> {
> return x == c.x;
> }
> }
>
> Because this is no longer a regular function, the compiler can take care of all the casting and exception throwing, so we don't have to. Moreover, it is now possible for the compiler to generate a *default* equality comparison, if we don't supply one. The default equality comparison would be recursive memberwise equality comparison (which, after all, is usually the correct thing thing to do).
>
> And /of course/, the function should return bool, and should consider both "this" and "c" to be both of the same type, and both const.
>
> Likewise, we replace opCmp with:
>
> class C
> {
> int x;
>
> is(this < c)
> {
> return x < c.x;
> }
> }
>
> The default implementation of is(this < c) should be to return false always, which implies that comparison is not meaningful for this type.
>
> The compiler can auto-generate all fourteen possible comparisons using only these two functions, as follows:
>
> (a == b) == (a == b)
> (a < b) == (a < b)
> (a <= b) == (a == b || a < b)
> (a <> b) == (a < b || b < a)
> (a <>= b) == (a == b || a < b || b < a)
> (a > b) == (b < a)
> (a >= b) == (a == b || b < a)
> (a != b) == (!(a == b))
> (a !<> b) == (!(a < b || b < a))
> (a !<>= b) == (!(a == b || a < b || b < a))
> (a !< b) == (!(a < b))
> (a !<= b) == (!(a == b || a < b))
> (a !> b) == (!(b < a))
> (a !>= b) == (!(a == b || b < a))
>
> We could allow the programmer to supply some or all of these, if they so desire, for reasons of efficiency,
>
> To my mind, this proposal fixes everything that is wrong with comparison in D.
|
April 14, 2008 Re: A fresh look at comparisons | ||||
---|---|---|---|---|
| ||||
Posted in reply to Henning Hasemann | Wouldn't that safely be 0?
If there is no order, there must be quality...assuming that all types do have a natural order, even if it's just it's ordinal value. Otherwise, they may not be comparable with opCmp, and maybe say, only opEquals :-)
On Mon, 2008-04-14 at 16:10 +0200, Henning Hasemann wrote:
> "the ORDER between these two is not defined" (which
> is something you can't express with the current opCmp)
|
Copyright © 1999-2021 by the D Language Foundation