Jump to page: 1 2
Thread overview
[phobos] Issue with opEquals(Object, Object)
April 23, 2010
Aside from some of the issues of the new object comparison routine, which has a couple outstanding bugs, I've ran into an interesting issue for dcollections.

For dcollections, I allow opEquals on two like types -- for example, you can call opEquals on two Set types, even if one is a TreeSet and one is a HashSet.

Now, when you compare for example a TreeSet to any other Set type, the TreeSet has enough smarts to do the comparison.  However, the new opEquals is going to run the comparison *twice* when they are equal, once against each set, to see that they both agree.  While this doesn't exactly affect the big-O complexity, it seems unnecessary in the case where I know that running the algorithm once is good enough.

One possibility is for opEquals to have a flag that indicates whether the other side has 'already verified' equality.  That way, I can skip the extra loop if I trust the other type knows what it is doing (e.g. if TreeSet knows the other object is a Set type, and it says we're equal, TreeSet just returns true).

Comments?  Ideas?  I realize this api is in TDPL, I hope we can figure this out.

-Steve



April 23, 2010
More thoughts...

What if an object wants say it's able to compare itself to another object, but that other object doesn't know how to compare itself to the first one.  Should this return not equal?  It might be that the first object was written after the second, or one object is part of a wrapper library.

To fix this, we would almost need a canCompare(Object o) function which would return one of several values:

DEFER -- defer to other object, I have no way to compare myself to it.
REQUIRED -- I can compare myself to the other object, but I want to ensure my routine is always called.
OPTIONAL -- I can compare myself to the other object, but I'm ok if you choose to only call the other object's routine.
NO -- Return false without doing any comparison, these objects cannot be equal.

The default would be DEFER.

-Steve



April 26, 2010
That sounds a bit too complicated. opEquals is a simplified collision engine. If you want to define a full-fledged one, it may be simpler to go the named functions route.

Andrei

On 04/23/2010 09:21 PM, Steve Schveighoffer wrote:
> More thoughts...
>
> What if an object wants say it's able to compare itself to another object, but that other object doesn't know how to compare itself to the first one.  Should this return not equal?  It might be that the first object was written after the second, or one object is part of a wrapper library.
>
> To fix this, we would almost need a canCompare(Object o) function which would return one of several values:
>
> DEFER -- defer to other object, I have no way to compare myself to it.
> REQUIRED -- I can compare myself to the other object, but I want to ensure my routine is always called.
> OPTIONAL -- I can compare myself to the other object, but I'm ok if you choose to only call the other object's routine.
> NO -- Return false without doing any comparison, these objects cannot be equal.
>
> The default would be DEFER.
>
> -Steve
>
>
>
> _______________________________________________
> phobos mailing list
> phobos at puremagic.com
> http://lists.puremagic.com/mailman/listinfo/phobos
April 26, 2010
Yah, it's very close to that plus a few tests for null etc. I attach an excerpt from the camera-ready TDPL that discusses the matter. Please don't share - this is the final product!

Andrei

On 04/26/2010 04:19 PM, Steve Schveighoffer wrote:
> So something that was as simple as defining a single opEquals previously now must be done through a separate function call?
>
> Another option is to define opequals so the result is:
>
> a.opEquals(b) || b.opEquals(a)
>
> instead of using&&.
>
> In fact, that might be the best solution.  Do we need to generically handle the case where a says it's equal to b, but b says it's not equal to a?  Wouldn't that be the bizarre case that should be handled through named functions?
>
> -Steve
>
>
>
> ----- Original Message ----
>> From: Andrei Alexandrescu<andrei at erdani.com> To: Discuss the phobos
>> library for D<phobos at puremagic.com> Cc: Steve
>> Schveighoffer<schveiguy at yahoo.com> Sent: Mon, April 26, 2010
>> 4:25:12 PM Subject: Re: [phobos] Issue with opEquals(Object,
>> Object)
>>
>> That sounds a bit too complicated. opEquals is a simplified collision
>>
> engine. If you want to define a full-fledged one, it may be simpler to
>>
> go the named functions route.
>
> Andrei
>
>
>
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: output.pdf
Type: application/pdf
Size: 159585 bytes
Desc: not available
URL: <http://lists.puremagic.com/pipermail/phobos/attachments/20100426/0978aa0c/attachment-0001.pdf>
April 26, 2010
Yah, I was half-joking. It's ok to share five pages out of five hundred...

Andrei

On 04/26/2010 04:37 PM, Steve Schveighoffer wrote:
>> Yah, it's very close to that plus a few tests for null etc. I attach an
>>
> excerpt from the camera-ready TDPL that discusses the matter. Please
>>
> don't share - this is the final product!
>
> Note, you CC'd the publically viewable mailing list!
>
> -Steve
>
>
>
April 27, 2010
I read the excerpt.

I understand the current implementation just fine.  The issue I have is with the line:

return lhs.opEquals(rhs) && rhs.opEquals(lhs);

I thought maybe || would be a better idea, but after reading the text, I agree the example shows that is not the best method.

Following the example in the book, let's say I make my own widget, called StevesTextWidget, which inherits Widget, not TextWidget.  It has some of the same functionality as TextWidget, but it's not a TextWidget.  If I want to be able to compare StevesTextWidget to a TextWidget, it's not possible, because the TextWidget will veto every time, even though StevesTextWidget has total understanding of TextWidget and can correctly do the comparison.  On the other hand, if || was used, comparing StevesTextWidget to a Widget will allow Widget's routine to override the opEquals of StevesTextWidget, and like the example in the book, could possibly cause an invalid equal response.

If Widget's routine is rewritten to return false when the actual derived class is not widget, then that forces derivative classes to always implement opEquals, even if it just calls the base method (which it can't because it will return false!).

We all agree that opEquals is a double-dispatch problem, but the "fix" being introduced causes 1) unreasonable performance problems and 2) unreasonable paranoia in some cases, to the point where opEquals will not be a universal way to determine equality.  You are making a decision both on equality and ability to determine equality with one bit.

Maybe the answer is to give more information than one bit.  This was my original idea, and I guess I'm back to that.  Like it or not, opEquals is a complex problem to solve, and taking a simple approach to it results in it's usefulness being extremely limited.  To be able to just use == when you want equality is a good goal to have, even if it means the opEquals regime is slightly more complex.

What I'm looking for is a way to give more information to the opEquals global function if you want to, but falls back to some reasonable default (such as the current behavior), if you don't care about performance or dealing with the complexity of the problem.  In other words, you don't *have* to override/define the canCompare function, the default simply always returns the same thing, and if you don't, you get the current behavior.  But if you want to optimize opEquals for specific cases (or make specific cases possible), you can override canCompare.

-Steve



April 27, 2010
On 04/27/2010 06:55 AM, Steve Schveighoffer wrote:
> Following the example in the book, let's say I make my own widget, called StevesTextWidget, which inherits Widget, not TextWidget.  It has some of the same functionality as TextWidget, but it's not a TextWidget.  If I want to be able to compare StevesTextWidget to a TextWidget, it's not possible, because the TextWidget will veto every time, even though StevesTextWidget has total understanding of TextWidget and can correctly do the comparison.  On the other hand, if || was used, comparing StevesTextWidget to a Widget will allow Widget's routine to override the opEquals of StevesTextWidget, and like the example in the book, could possibly cause an invalid equal response.

Let's suppose we allow the feature above. Then a StevesTextWidget can claim equality with a TextWidget in spite of the latter's opposition. But then a WaltersTextWidget could also claim equality with the same TextWidget, again in spite of the latter's opposition. But then StevesTextWidget may not compare equal with WaltersTextWidget, which would break transitivity.

Andrei
April 27, 2010
No loss there.  Transitivity is already not enforced by the current system.

An example:

interface I1
{
   int getX();
}

interface I2
{
   int getX();
}

class C1 : I1
{
   int x;
   int getX() { return x; }
   bool opEquals(Object o)
   { if(auto n = cast(I1)o) return x == n.getX(); return false; }
}

class C2 : I2
{
   int x;
   int getX() { return x; }
   bool opEquals(Object o)
   { if(auto n = cast(I2)o) return x == n.getX(); return false; }
}

class C3 : I1, I2
{
   int x;
   int getX() { return x; }
   bool opEquals(Object o)
   {
      if(auto n = cast(I1)o) return x == n.getX();
      if(auto n = cast(I2)o) return x == n.getX();
      return false;
   }
}

This would lead me to conclude the *only* provably transitive opEquals starts with:

if(auto n = cast(typeof(this))other)

which is quite limited IMO.

-Steve



----- Original Message ----
> From: Andrei Alexandrescu <andrei at erdani.com>
> Let's suppose we allow
> the feature above. Then a StevesTextWidget can claim equality with a TextWidget
> in spite of the latter's opposition. But then a WaltersTextWidget could also
> claim equality with the same TextWidget, again in spite of the latter's
> opposition. But then StevesTextWidget may not compare equal with
> WaltersTextWidget, which would break transitivity.




April 27, 2010
On 04/27/2010 03:11 PM, Steve Schveighoffer wrote:
> No loss there.  Transitivity is already not enforced by the current system.

Of course. One could after all return cast(bool) rand() from opEquals; no need for an elaborate example. That doesn't mean we should espouse breakage of transitivity. My previous message shows that the proposed system is corrupt because it breaks transitivity by otherwise well-intended and unwitting participants.

Andrei
April 27, 2010
The example was not some mindless example, and not even close to something like cast(bool)rand().  I consider the example I gave a well intended and unwitting example, which breaks transitivity via the current opEquals function.  Basically, the intention is, if I have two items which implement an interface, then I should be able to compare those two items via the interface.  A perfect example of this is the dcollections Map interface, I want to be able to compare for equality two maps, even if the implementations are different.

Say you had a collection that implemented the List and the Map interface (Like an ordered hash map).  I should be able to compare that to a map or a list, but I can't compare the map and the list with each other.

In fact, I would posit that unless lhs.opEquals and rhs.opEquals point to the exact same function, transitivity can be unwittingly broken.  Actually, you could use this check instead of checking the typeinfos are equal when determining if you can call just one function because it covers more ground.

-Steve



----- Original Message ----
> From: Andrei Alexandrescu <andrei at erdani.com>
> To: Steve Schveighoffer <schveiguy at yahoo.com>
> Cc: Phobos <phobos at puremagic.com>
> Sent: Tue, April 27, 2010 5:29:04 PM
> Subject: Re: [phobos] Issue with opEquals(Object, Object)
> 
> On 04/27/2010 03:11 PM, Steve Schveighoffer wrote:
> No loss there.
> Transitivity is already not enforced by the current system.

Of course.
> One could after all return cast(bool) rand() from opEquals; no need for an elaborate example. That doesn't mean we should espouse breakage of transitivity. My previous message shows that the proposed system is corrupt because it breaks transitivity by otherwise well-intended and unwitting participants.

Andrei



« First   ‹ Prev
1 2