Jump to page: 1 25  
Page
Thread overview
A fresh look at comparisons
Apr 14, 2008
Janice Caron
Apr 14, 2008
Henning Hasemann
Apr 14, 2008
Janice Caron
Apr 14, 2008
Janice Caron
Apr 14, 2008
Bill Baxter
Apr 14, 2008
Janice Caron
Apr 14, 2008
Henning Hasemann
Apr 14, 2008
Janice Caron
Apr 14, 2008
Scott S. McCoy
Apr 14, 2008
Janice Caron
Apr 14, 2008
Scott S. McCoy
Apr 14, 2008
Janice Caron
Apr 14, 2008
Paul D Anderson
Apr 15, 2008
Scott S. McCoy
Apr 15, 2008
Janice Caron
Apr 15, 2008
Bruce Adams
Apr 15, 2008
Janice Caron
Apr 15, 2008
Georg Wrede
Apr 15, 2008
Janice Caron
Apr 15, 2008
Bruce Adams
Apr 16, 2008
Janice Caron
Apr 15, 2008
Scott S. McCoy
Apr 15, 2008
Scott S. McCoy
Apr 15, 2008
Janice Caron
Apr 16, 2008
Bruce Adams
Apr 15, 2008
Georg Wrede
Apr 15, 2008
Janice Caron
Apr 14, 2008
Janice Caron
Apr 14, 2008
Robert Fraser
Apr 14, 2008
Graham St Jack
Apr 15, 2008
Yigal Chripun
Apr 16, 2008
Janice Caron
Apr 16, 2008
Yigal Chripun
Apr 16, 2008
Janice Caron
Apr 16, 2008
Yigal Chripun
Apr 16, 2008
Janice Caron
Apr 16, 2008
Yigal Chripun
Apr 17, 2008
Janice Caron
Apr 17, 2008
Yigal Chripun
Apr 17, 2008
Paul D Anderson
Apr 17, 2008
Paul D Anderson
Apr 25, 2008
Bruno Medeiros
Apr 25, 2008
Bruno Medeiros
Apr 25, 2008
Bruno Medeiros
April 14, 2008
Every now again, someone (rightly) complains that opEquals should return bool, not int. However, that's not my biggest complaint with opEquals and opCmp. My biggest complaint is that it is very tedious having to write code like:

    class C
    {
        int x;

        int opEquals(Object o)
        {
            C c = cast(C) o;
            if (c is null)
                throw new Exception("Cannot compare C with wrong type");
            return x == c.x;
        }
    }

(and having to do so for every class I create). This is just wrong on so many levels. It's wrong because we return int, not bool, yes. It's wrong because opEquals should be a const function (it doesn't modify this). It's wrong because it should accept a const object (it doesn't modify it's input). But most of all, why should I have to do all the work of making sure the input parameter is the right type?

What should I do if it's not? Should I return false? Should I throw an exception? If I throw an exception, what type of exception should it be? Is it OK to throw an object.Exception or should I roll my own kind of Exception? It there some standard message I should be using for the string? Does it matter that the message is in English and not localized?

And it gets worse. For complex numbers we (correctly) have the following result:

    cdouble x = 1;
    cdouble y = 1i;

    (x !<>= y) == true

Try doing that with a custom type! The problem is, opCmp() just isn't powerful enough. With opCmp, there is no way to say "not less than, not equal to, and not greater than".

I think it's time to take a fresh look at comparisons, and to realise that comparison is /special/, in much the same way that a constructor is special. It shouldn't be a regular function. It should be a special function, with its own special syntax, and some extra special rules that make it (1) easy to write, (2) logical, and (3) behave sensibly.

So with all that borne in mind, let's consider a new proposal to /replace/ opEquals. (And later down the page, I shall suggest a similar proposal to replace opCmp). Instead of:

    class C
    {
        int x;

        int opEquals(Object o)
        {
            C c = cast(C) o;
            if (c is null)
                throw new Exception("Cannot compare C with wrong type");
            return x == c.x;
        }
    }

we do

    class C
    {
        int x;

        is(this == c)
        {
            return x == c.x;
        }
    }

Because this is no longer a regular function, the compiler can take care of all the casting and exception throwing, so we don't have to. Moreover, it is now possible for the compiler to generate a *default* equality comparison, if we don't supply one. The default equality comparison would be recursive memberwise equality comparison (which, after all, is usually the correct thing thing to do).

And /of course/, the function should return bool, and should consider both "this" and "c" to be both of the same type, and both const.

Likewise, we replace opCmp with:

    class C
    {
        int x;

        is(this < c)
        {
            return x < c.x;
        }
    }

The default implementation of is(this < c) should be to return false always, which implies that comparison is not meaningful for this type.

The compiler can auto-generate all fourteen possible comparisons using only these two functions, as follows:

    (a == b)  ==  (a == b)
    (a < b)  ==  (a < b)
    (a <= b)  ==  (a == b || a < b)
    (a <> b)  ==  (a < b || b < a)
    (a <>= b)  ==  (a == b || a < b || b < a)
    (a > b)  ==  (b < a)
    (a >= b)  ==  (a == b || b < a)
    (a != b)  ==  (!(a == b))
    (a !<> b)  ==  (!(a < b || b < a))
    (a !<>= b)  ==  (!(a == b || a < b || b < a))
    (a !< b)  ==  (!(a < b))
    (a !<= b)  ==  (!(a == b || a < b))
    (a !> b)  ==  (!(b < a))
    (a !>= b)  ==  (!(a == b || b < a))

We could allow the programmer to supply some or all of these, if they so desire, for reasons of efficiency,

To my mind, this proposal fixes everything that is wrong with comparison in D.
April 14, 2008
I like this proposal but it might have efficiency issues when you need to find out quickly if you want to order two values (you would often have to call both functions).

What about this (much more primitive) approach:

class C {
	int category;
	int x;

	ComparsionResult opCompare(C other) {
		if(category == other.category) {
			// This could be shortened to something like
			// int.compare(x, other.x)
			if(x == other.x) {
				return ComparsionResult.EQUAL;
			}
			else if(x > other.x) {
				return ComparsionResult.GREATER;
			}
			else {
				// a < b is "unkown" meaning
				// compiler will try to get it via
				// b > a which is defined above
				return ComparsionResult.UNKNOWN;
			}
		}
		else {
			// Objects are simply incomparable
			// (like (1 - i) and (i - 1) for example)
			return ComparsionResult.UNDEFINED;
		}
	} // opCompare()
} // class C

whereas ComparsionResult would be an enum defined in object.d or so. This should avoid all mentioned issues without a syntax change.

Henning


-- 
GPG Public Key: http://gpg-keyserver.de/pks/lookup?op=get&search=0xDDD6D36D41911851

April 14, 2008
On 14/04/2008, Henning Hasemann <hhasemann@web.de> wrote:
>  What about this (much more primitive) approach:

One of the reasons that I believe comparison is "special", like constructors, is because of the following:

    class A
    {
        int opEquals(Object o)
        {
           ...
        }

        int opCmp(Object o)
        {
           ...
        }
    }

    class B :A
    {
        int n;
    }

Hopefully, you can see the problem here immediately. B extends A, but B does not provide opEquals nor opCmp, so if two Bs are compared, A's compare functions will be called, which means that B's member n will be ignored.

This is, of course, absolutely normal behavior for classes. It's how inheritance works. But it's just not right for comparisons.

And important part of my proposal is that comparisons don't inherit. Instead, we have the default behaviors that I described. Thus, under my proposal, if B did not provide an is(this == c), then a default implementation would be provided by the compiler, equivalent to

        is(this == c)
        {
            return super == c.super && n == c.n;
        }

This is very different from just changing the signature of a function.

Also, there really isn't any need for any comparison result to be undefined. If (a < b) is not meaningful, it should suffice for (a < b) and (b < a) both to return false. That's because less-than really is a boolean question. If I ask "is a less than b?", and ordering is not defined for that type, then it seems to me that "no" is the right answer. (No, a is not less than b, because a and be cannot be ordered). It is far, far less complicated that way - less-than is a boolean question, not a yes/no/maybe question. However, what you /do/ need to lose, is the assumption that (a < b) implies (a !>= b), because that no longer holds.
April 14, 2008
On 14/04/2008, Henning Hasemann <hhasemann@web.de> wrote:
>
>  I like this proposal but it might have efficiency issues when you need
>  to find out quickly if you want to order two values (you would often
>  have to call both functions).


The best way to fix the efficiency issue would be to tell the compiler whether the comparison was fully ordered or not. If the comparison is fully ordered, then the compiler may implement

    a <= b

as

    !(b < a)

whereas, if the comparison is /not/ fully ordered, then the compiler would need to go the long route, and implement it as

    a == b || a < b

So the question then becomes, how we tell the compiler whether or not a comparison is fully ordered. My suggestion would be, assume yes by default, and use a slightly modified syntax if not. e.g.

    class NotFullyOrdered
    {
        int x;

        is(this < c, unordered)
        {
            return whatever;
        }
    }

The default, compiler-generated comparison function, if not supplied by the programmer, should be

        is(this < c, unordered)
        {
            /* recursive memberwise comparison */
        }

That buys you the best of all worlds.
April 14, 2008
Janice Caron wrote:
> On 14/04/2008, Henning Hasemann <hhasemann@web.de> wrote:
>>  I like this proposal but it might have efficiency issues when you need
>>  to find out quickly if you want to order two values (you would often
>>  have to call both functions).
> 
> 
> The best way to fix the efficiency issue would be to tell the compiler
> whether the comparison was fully ordered or not. If the comparison is
> fully ordered, then the compiler may implement
> 
>     a <= b
> 
> as
> 
>     !(b < a)
> 
> whereas, if the comparison is /not/ fully ordered, then the compiler
> would need to go the long route, and implement it as
> 
>     a == b || a < b
> 
> So the question then becomes, how we tell the compiler whether or not
> a comparison is fully ordered. My suggestion would be, assume yes by
> default, and use a slightly modified syntax if not. e.g.
> 
>     class NotFullyOrdered
>     {
>         int x;
> 
>         is(this < c, unordered)
>         {
>             return whatever;
>         }
>     }
> 
> The default, compiler-generated comparison function, if not supplied
> by the programmer, should be
> 
>         is(this < c, unordered)
>         {
>             /* recursive memberwise comparison */
>         }
> 
> That buys you the best of all worlds.

Why not put the flag as an arg to is--  a la
"cast(extra_arg)"
or
"foreach(extra_arg)(some;stuff)" as Andrei proposed.

So it would be

   is(unordered)(this<c)


I like the ideas you're throwing around here, anyway.  Handling of comparisons could use some work.  Writing "multi-column" lexical opCmps is also really annoying.

Not to mention wanky-ness about opCmp happily compiling with ref/value/ or ptr members but only actually working in all cases when you use value parameters.  For sorting, pointer parameters work.   For comparisons in regular code ref parameters work.  The whole thing is just badly in need of some loving.

--bb
April 14, 2008
On 14/04/2008, Henning Hasemann <hhasemann@web.de> wrote:
>                         // Objects are simply incomparable
>                         // (like (1 - i) and (i - 1) for example)
>                         return ComparsionResult.UNDEFINED;

This is worth spending a bit of time discussing. The fact is that, in D, that particular comparison is not undefined. In fact, it is /very well/ defined. It is defined as follows:

   cdouble a = 1 - 1i;
   cdouble b = 1i - 1;

   (a == b)  ==  false
   (a < b)  ==  false
   (a <= b)  ==  false
   (a <> b)  ==  false
   (a <>= b)  ==  false
   (a > b)  ==  false
   (a >= b)  ==  false
   (a != b)  ==  true
   (a !<> b)  ==  true
   (a !<>= b)  ==  true
   (a !< b)  ==  true
   (a !<= b)  ==  true
   (a !> b)  ==  true
   (a !>= b)  ==  true

It would be very bad indeed for D, if less-than suddenly became a tristate operation! In my opinion, we really don't want that. So, we don't want to be returning UNDEFINED - instead, we want to be defining the result of each of the fourteen possible comparisons so that each gives a purley boolean result, and that unorderedness is indicated by patterns such as the above. My suggestion does give us that.
April 14, 2008
"Janice Caron" <caron800@googlemail.com> wrote:
> On 14/04/2008, Henning Hasemann <hhasemann@web.de> wrote:
> >                         // Objects are simply incomparable
> >                         // (like (1 - i) and (i - 1) for example)
> >                         return ComparsionResult.UNDEFINED;
> 
> This is worth spending a bit of time discussing. The fact is that, in
> D, that particular comparison is not undefined. In fact, it is /very
> well/ defined. It is defined as follows:
> (snip)

Yeah, maybe undefined was simply the wrong word. What I mean is "there
is no meaningful order between the two" ie "it is neither '<' nor '>'
nor '=='" ie "the ORDER between these two is not defined" (which
is something you can't express with the current opCmp)

Just as it is with (1 - i) and (i - 1) whatever the correct
term for that would be. (I think UNORDERED would be way better)

I just wanted to distinguish that from UNKNOWN which would allow the compiler to search for an alternative form.

Here for clarity definitions of what I meant with the two terms in my previous posting:

ComparsionResult.UNKNOWN:
  The comparsion function does not make a
  statement wether the two objects are equal or a is greater b or
  whatever. Think of it as the comparsion function is simply not
  defined this way. For example if the compiler would ask an object
  a for "a > b" and a would return ComparsionResult.UNKNOWN The compiler
  can and should ask b for "b < a".
  If the compiler doesn't find any rewriting of the term that returns
  something else than UNKNOWN, every comparsion yields false.
  (Ie the same as if one comparsion would have returned
  UNDEFINED/UNORDERED)

  This would normally not assume well-ordering (ie the compiler would
  *not* check for "a !<= b". One could imagine a special flag like
  suggested for the is-thingy)

ComparsionResult.UNDEFINED/ComparsionResult.UNORDERED:
  If a is asked for a for being "<", ">", "<=", ">=" or "==" b, false is
  returned. Ie there is no order between the two objects.
  Naturally this implies that all the comparsion operators starting
  with "!" return true.

Implications of this approach in contrast to the is-Approach:
- As said, no special syntax except maybe for the well-ordered flag
- This version *enforces* that two objects a and b can only be in one
  of the following 4 states:
  * a is greater than b
  * b is greater than a (same as a is lower than b)
  * a is equal to b
  * a and be are unordered to each other
  This is clearly a limitation (eg a and b can not be == and < at the
  same time). I guess it depends strongly on what you want to do with
  your operators if this can be considered a good thing.
- You can inherit comparsion which might be considered useful,
  especially if your super class compares private members and you want
  to refer to super class equality:

  class A {
    int x,y;
    ComparsionResult opCompare(A other) {
      // We only care for equality
      if(x == other.x and y == other.y)
        return ComparsionResult.EQUAL;
      return ComparsionResult.UNORDERED;
    }
  }

  class B : A {
    int z;
    ComparsionResult opCompare(B other) {
      if(super.opCompare(other) == ComparsionResult.EQUAL && z ==
other.z) {
        return ComparsionResult.EQUAL;
      }
      return ComparsionResult.UNORDERED;
    }
  }
- The is() approach has a much nicer syntax and its much more easy to
  write cleary what you want
- This approach secretely assumes some magic in that the parameter to
  opCompare would always be of the same type as the class.
  (comparsion with other objects would be UNORDERED).
  The is approach addresses this better in so far that its clearly
  visible something special is going on here and that its NOT a regular
  method.


To your argument:
"""
One of the reasons that I believe comparison is "special", like
constructors, is because of the following:

    class A
    {
        int opEquals(Object o)
        {
           ...
        }

        int opCmp(Object o)
        {
           ...
        }
    }

    class B :A
    {
        int n;
    }

Hopefully, you can see the problem here immediately. B extends A, but B does not provide opEquals nor opCmp, so if two Bs are compared, A's compare functions will be called, which means that B's member n will be ignored.

This is, of course, absolutely normal behavior for classes. It's how
inheritance works. But it's just not right for comparisons.
""":

I don't think this is bad or even would feel somehow incorrect.
Depending on what type of code you write you don't want every member
automagically being compared. In some game code I write curretly there
are for example lots of members which would not apply for comparsion.
I think its more a matter of taste if members should be compared by
default or not.
But with throwing away inheritance you throw away all the
considerations you made in A which properties identify two equal
objects. Is that really what you want?


So enough for the war for now ;)
I really like your idea just to make myself clear.
(And except for a few points it seems to be the better way especially
regarding clarity)
I just had this idea
popped up when I read yours and wanted to fully discuss it. Hereby
done.

Henning

-- 
GPG Public Key: http://gpg-keyserver.de/pks/lookup?op=get&search=0xDDD6D36D41911851

April 14, 2008
On 14/04/2008, Henning Hasemann <hhasemann@web.de> wrote:
>  Depending on what type of code you write you don't want every member
>  automagically being compared. In some game code I write curretly there
>  are for example lots of members which would not apply for comparsion.
>  I think its more a matter of taste if members should be compared by
>  default or not.

Yes, all we're talking about here is a sensible /default/. The fact that you can override the default, by supplying your own function, means that, in some sense, it almost doesn't matter what the default is, because if it's not what you want, then you can override it. But even so - it's nice to get the default covering the most common cases.

To me, it seems sensible that the default implementation should be to
compare all members (including super). This means that if a subclass
adds new members, then by default, a test for subclass equality will
be true iff (1) the superclass compares as equal, and (2) the new
members also compare as equal.

The inheritance mechanism (which is what we have now, thanks to the fact that opEquals() is a regular function), means that if a subclass adds new members, then by default, a test for subclass equality will be true iff the superclass compares as equal. All new members will be ignored.



>  But with throwing away inheritance you throw away all the
>  considerations you made in A which properties identify two equal
>  objects. Is that really what you want?

It's not "throwing away", because the default implementation would include a test for superclass equality. ("super" is a member just like any other). So if B extends A, and also adds two member variables x and y, then the test (b1 == b2) would be evaluated as:

    // Your scheme
    b1.super == b2.super

or

    // My scheme
    b1.super == b2.super
        && b1.x == b2.x
        && b1.y == b2.y

So we don't lose inheritance - we just gain extra tests. Of course, if you don't /want/ those extra tests, then you would have to override the default - such is the nature of defaults - by doing something like

    class B : A
    {
        int x,y;

        is(this == that)
        {
            return super == that.super;
            // x and y are now ignored
        }
    }


>  So enough for the war for now ;)
>  I really like your idea just to make myself clear.

Thanks. And there was never any question of warfare! :-)
April 14, 2008
I like this, except for the semantic irregularities similar to Scala we'd be introducing by not having the operator evaluated.

Also, if you can't cast successfully I'd think you would simply return false for opEquals.

if o !is C, then they can't possibly be equal. ;-)

The idea behind opEquals acting this way, I assume naturally, is that it enables the ability for your type to know how to compare itself to multiple *other* types which exist.  Somewhat similar to how you can freely compare a long and an int in many cases, without ramifications or need to know any bit-wise or type difference.  So what is the answer? What makes two types comparable?  Common inheritance?  Where?  How do I specify (if all I'm saying is "is(this == c)" the possible types I can be compared to within my system.  What if I know these types, and they are not my descendants?  Or if they are my descendants, when and how can I say that my descendant is not comparable to it's parent?

Cheers,
	Scott S. McCoy


On Mon, 2008-04-14 at 09:18 +0100, Janice Caron wrote:
> Every now again, someone (rightly) complains that opEquals should return bool, not int. However, that's not my biggest complaint with opEquals and opCmp. My biggest complaint is that it is very tedious having to write code like:
> 
>     class C
>     {
>         int x;
> 
>         int opEquals(Object o)
>         {
>             C c = cast(C) o;
>             if (c is null)
>                 throw new Exception("Cannot compare C with wrong type");
>             return x == c.x;
>         }
>     }
> 
> (and having to do so for every class I create). This is just wrong on so many levels. It's wrong because we return int, not bool, yes. It's wrong because opEquals should be a const function (it doesn't modify this). It's wrong because it should accept a const object (it doesn't modify it's input). But most of all, why should I have to do all the work of making sure the input parameter is the right type?
> 
> What should I do if it's not? Should I return false? Should I throw an exception? If I throw an exception, what type of exception should it be? Is it OK to throw an object.Exception or should I roll my own kind of Exception? It there some standard message I should be using for the string? Does it matter that the message is in English and not localized?
> 
> And it gets worse. For complex numbers we (correctly) have the following result:
> 
>     cdouble x = 1;
>     cdouble y = 1i;
> 
>     (x !<>= y) == true
> 
> Try doing that with a custom type! The problem is, opCmp() just isn't powerful enough. With opCmp, there is no way to say "not less than, not equal to, and not greater than".
> 
> I think it's time to take a fresh look at comparisons, and to realise that comparison is /special/, in much the same way that a constructor is special. It shouldn't be a regular function. It should be a special function, with its own special syntax, and some extra special rules that make it (1) easy to write, (2) logical, and (3) behave sensibly.
> 
> So with all that borne in mind, let's consider a new proposal to /replace/ opEquals. (And later down the page, I shall suggest a similar proposal to replace opCmp). Instead of:
> 
>     class C
>     {
>         int x;
> 
>         int opEquals(Object o)
>         {
>             C c = cast(C) o;
>             if (c is null)
>                 throw new Exception("Cannot compare C with wrong type");
>             return x == c.x;
>         }
>     }
> 
> we do
> 
>     class C
>     {
>         int x;
> 
>         is(this == c)
>         {
>             return x == c.x;
>         }
>     }
> 
> Because this is no longer a regular function, the compiler can take care of all the casting and exception throwing, so we don't have to. Moreover, it is now possible for the compiler to generate a *default* equality comparison, if we don't supply one. The default equality comparison would be recursive memberwise equality comparison (which, after all, is usually the correct thing thing to do).
> 
> And /of course/, the function should return bool, and should consider both "this" and "c" to be both of the same type, and both const.
> 
> Likewise, we replace opCmp with:
> 
>     class C
>     {
>         int x;
> 
>         is(this < c)
>         {
>             return x < c.x;
>         }
>     }
> 
> The default implementation of is(this < c) should be to return false always, which implies that comparison is not meaningful for this type.
> 
> The compiler can auto-generate all fourteen possible comparisons using only these two functions, as follows:
> 
>     (a == b)  ==  (a == b)
>     (a < b)  ==  (a < b)
>     (a <= b)  ==  (a == b || a < b)
>     (a <> b)  ==  (a < b || b < a)
>     (a <>= b)  ==  (a == b || a < b || b < a)
>     (a > b)  ==  (b < a)
>     (a >= b)  ==  (a == b || b < a)
>     (a != b)  ==  (!(a == b))
>     (a !<> b)  ==  (!(a < b || b < a))
>     (a !<>= b)  ==  (!(a == b || a < b || b < a))
>     (a !< b)  ==  (!(a < b))
>     (a !<= b)  ==  (!(a == b || a < b))
>     (a !> b)  ==  (!(b < a))
>     (a !>= b)  ==  (!(a == b || b < a))
> 
> We could allow the programmer to supply some or all of these, if they so desire, for reasons of efficiency,
> 
> To my mind, this proposal fixes everything that is wrong with comparison in D.

April 14, 2008
Wouldn't that safely be 0?

If there is no order, there must be quality...assuming that all types do have a natural order, even if it's just it's ordinal value. Otherwise, they may not be comparable with opCmp, and maybe say, only opEquals :-)

On Mon, 2008-04-14 at 16:10 +0200, Henning Hasemann wrote:
> "the ORDER between these two is not defined" (which
> is something you can't express with the current opCmp)

« First   ‹ Prev
1 2 3 4 5