July 23, 2014
https://issues.dlang.org/show_bug.cgi?id=13179

--- Comment #10 from bearophile_hugs@eml.cc ---
(In reply to Jonathan M Davis from comment #9)
> I would not expect `a == b` to _ever_ be converted to `a.opCmp(b) == 0`.
> That's what opEquals is for.

Why? When opCmp returns zero it means ==. What's wrong about using that possible return value to test for equality?

--
July 23, 2014
https://issues.dlang.org/show_bug.cgi?id=13179

--- Comment #11 from Jonathan M Davis <jmdavisProg@gmx.com> ---
(In reply to bearophile_hugs from comment #10)
> (In reply to Jonathan M Davis from comment #9)
> > I would not expect `a == b` to _ever_ be converted to `a.opCmp(b) == 0`.
> > That's what opEquals is for.
> 
> Why? When opCmp returns zero it means ==. What's wrong about using that possible return value to test for equality?

It's an unnecessary efficiency hit. The language has opEquals for a reason, and it's defined by the language to be ==, whereas opCmp is for <, <=, >=, and >. And code really shouldn't be calling opEquals or opCmp directly except under very rare circumstances. They're for generating overloaded operators, not for calling as normal functions.

I can see an argument for just continuing to have the compiler define == when opCmp is defined in order to avoid having to declare opEquals as well when the default would be fine, and I can see an argument that opEquals should be explicitly defined in that case in order to reduce the odds of there being a bug when someone defined opCmp in a way that was inconsistent with the built in opEquals. But opEquals is what's supposed to be defining equality. That's what it's _for_. That's not opCmp's job.

--
July 23, 2014
https://issues.dlang.org/show_bug.cgi?id=13179

--- Comment #12 from hsteoh@quickfur.ath.cx ---
(In reply to Jonathan M Davis from comment #11)
[...]
> And code really shouldn't be calling opEquals or opCmp directly
> except under very rare circumstances. They're for generating overloaded
> operators, not for calling as normal functions.

Nobody is talking about calling opCmp or opEquals directly here. The issue here is that == does not behave consistently with <, <=, >=, > when the user defines opCmp (but not opEquals). Either we should *reject* such code (i.e., the user must always explicitly define opEquals if opCmp is defined), or we should do the Right Thing, that is, make opEquals the same as opCmp()==0.

--
July 23, 2014
https://issues.dlang.org/show_bug.cgi?id=13179

--- Comment #13 from Jonathan M Davis <jmdavisProg@gmx.com> ---
(In reply to hsteoh from comment #12)
> Nobody is talking about calling opCmp or opEquals directly here. The issue here is that == does not behave consistently with <, <=, >=, > when the user defines opCmp (but not opEquals). Either we should *reject* such code (i.e., the user must always explicitly define opEquals if opCmp is defined), or we should do the Right Thing, that is, make opEquals the same as opCmp()==0.

I'd be in favor of either making it an error to define opCmp but not opEquals, letting the programmer just screws themselves if they define opCmp so that it doesn't match the compiler-generated opEquals, and they don't define opEquals themselves (which would basically be the same as if they defined both but didn't make them consistent).

The advantage of not requiring that opEquals be defined and still using the default one is that you wouldn't have to define opEquals just because you defined opCmp when the default opEquals did what you wanted (which it probably does in the majority of cases).

The advantage of requiring opEquals is that it makes it clear that the programmer needs to make sure that opCmp is consistent with opEquals, but it doesn't necessarily make it less error-prone, because the programmer then needs to define opEquals to be equivalent to a.opCmp(b) == 0, and they can screw that up, so there's no guarantee that a.opCmp(b) == 0 is equivalent to opEquals, even if it's supposed to be. And they're probably _more_ likely to define opEquals incorrectly than the default one if what they want is what the default one does (also, they'd generally lose out on whatever efficiency gains the default one might have if they define it themselves).

Using a.opCmp(b) == 0 for opEquals if opEquals isn't defined would guarantee correctness, but it would incur a performance hit, and I'm willing to bet that a lot of folks would expect the normal, compiler-generated opEquals to still be generated even when opCmp is declared (I certainly would have), so they wouldn't realize that they were getting a performance hit. And those that didn't think it through even that far wouldn't know about the performance hit either.

And there's toHash to consider as well. It needs to be consistent with opEquals, and it won't be if opEquals is defined by the user but not toHash - making it so that toHash should probably be required by the AA to be explicitly defined by the type if opEquals has been declared and so that if toHash is declared, opEquals should be required. Regardless, there is no option in that case to use another function to guarantee correctness (as you can do by using opCmp for opEquals). So, we _still_ require correctness of the user. Also, if the compiler-generate opEquals is used when opCmp is declared, then the compiler-generated toHash can still be used, whereas if a.opCmp(b) == 0 were used for opEquals, then toHash would have to be defined.

So, I think that we really should just use the compiler-generate opEquals unless the programmer defines opEquals themselves, even if opCmp is declared, and the compiler-generate toHash should be used if the compiler-generated opEquals is used, whereas it can't be otherwise. The programmer will just have to make sure that a.opCmp(b) == 0 is equivalent to opEquals if they don't want bugs. They should be unit testing that anyway.

--
July 23, 2014
https://issues.dlang.org/show_bug.cgi?id=13179

--- Comment #14 from hsteoh@quickfur.ath.cx ---
I think the toHash issue is really only applicable if you try to use your type as an AA key. If you never use it as an AA key, there's no reason to require the user to define toHash just because they defined their own opEquals. AFAIK the latest dmd git HEAD will reject using types as AA keys if they define one of toHash/opEquals but not both. So either you don't define either and get the default implementations, or you define both and the onus is on you to make them consistent with each other. The compiler rejects the code if you define one but not the other, which makes sense.

But there's no reason to require toHash to be defined if the user never tries to use it as an AA key, even if they define opEquals.

--
July 23, 2014
https://issues.dlang.org/show_bug.cgi?id=13179

--- Comment #15 from Jonathan M Davis <jmdavisProg@gmx.com> ---
(In reply to hsteoh from comment #14)
> But there's no reason to require toHash to be defined if the user never tries to use it as an AA key, even if they define opEquals.

Agreed. I don't think that I said otherwise. What I was trying to point out that is if you define opEquals, then if you use the type with an AA, then you're going to need to define toHash as well. So, forcing the user to define opEquals simply because they defined opCmp potentially forces them to define toHash. And if the compiler-generate opEquals and toHash would have been consistent with opCmp, then having to define them just increases the risk that you're going to screw up defining them (particular toHash). What it would mean would be that any type that you wanted to use as an AA key that defined opCmp (as was previously required) would then have to define opEquals and toHash as well even if it shouldn't be necessary.

So, I favor just having the compiler continue to generate opEquals and toHash even if opCmp is defined. If opCmp is not consistent with opEquals, then that's a bug, and the programmer will need to either fix opCmp or define opEquals (and toHash if using AAs) so that it's consistent with opCmp. Requiring that the user define opEquals won't necessarily make it less likely to be inconsistent with opCmp, because then instead of relying on the built-in opEquals in the cases where it would have been fine, you'd then have to define it and would potentially screw it up. And if they both need to be defined in order to be consistent, then it's always up to the programmer to get it right anyway.

But I think that making it so that defining opCmp results in opEquals being generated as a.opCmp(b) == 0 is pernicious, because it's a hidden performance hit.

--
July 24, 2014
https://issues.dlang.org/show_bug.cgi?id=13179

--- Comment #16 from Jonathan M Davis <jmdavisProg@gmx.com> ---
How about this, the programmer has to declare opEquals if they define opCmp (in order to force the programmer to be aware of the problem and choose how opEquals should work), but we provide a way to tell the compiler to use the default opEquals rather than forcing the programmer to define their own. e.g. something like

struct S
{
    bool opEquals(S s) = default;
    int opCmp()(auto ref S s) {...}
}

The same with toHash so that the programmer doesn't have to define that just because they defined opEquals in the cases where the default would work.

I believe that C++11 did something like this. By doing this, we force the programmer to consider it but don't force them to define opEquals or toHash themselves when the defaults should work.

Regardless, if the programmer wants opEquals to just call opCmp, I think that they should have to define that themselves, since having the compiler do it would incur a silent performance hit. At least the definition is short and easy:

bool opEquals()(auto ref S s) { return opCmp(s) == 0; }

--
July 24, 2014
https://issues.dlang.org/show_bug.cgi?id=13179

--- Comment #17 from Jacob Carlborg <doob@me.com> ---
(In reply to Kenji Hara from comment #6)

> So you can just remove opCmp completely. For the struct
> 
> struct TagIndex
> {
>     uint tag, index;
> }
> 
> In 2.066 AA will use default member-wise equality and hasing.

My point was that it was a regression _again_. This time DMD changes back and forwards between different versions. Add a function to workaround a regression, then remove it in the next release because of regression again. Walter keeps talking about we should not make regressions and then someone goes and does this.

--
July 24, 2014
https://issues.dlang.org/show_bug.cgi?id=13179

--- Comment #18 from Jacob Carlborg <doob@me.com> ---
(In reply to Jonathan M Davis from comment #15)

> Agreed. I don't think that I said otherwise. What I was trying to point out that is if you define opEquals, then if you use the type with an AA, then you're going to need to define toHash as well. So, forcing the user to define opEquals simply because they defined opCmp potentially forces them to define toHash. And if the compiler-generate opEquals and toHash would have been consistent with opCmp, then having to define them just increases the risk that you're going to screw up defining them (particular toHash). What it would mean would be that any type that you wanted to use as an AA key that defined opCmp (as was previously required) would then have to define opEquals and toHash as well even if it shouldn't be necessary.

That's how the current behavior in 2.066.0-b5 is. If opCmp is defined and the type is used as an AA key the compiler will force you to defined opEquals as well. And if opEquals is defined the compiler will require you to define toHash as well. I'm guessing this is to ease the transition from opCmp being used for AA keys to opEquals.

--
July 25, 2014
https://issues.dlang.org/show_bug.cgi?id=13179

--- Comment #19 from Jonathan M Davis <jmdavisProg@gmx.com> ---
Okay. Clearly, I was not paying enough attention to what was going on here. It always has been the case that opEquals is generated for for structs, if you don't define it. That hasn't changed for 2.066. Rather, in the case where you try and use it as an AA key, it now screams at you for not having defined opEquals, even though the default opEquals was almost certainly fine, because that's what was being used.

It's a major bug IMHO if lhs.opCmp(rhs) == 0 is not equivalent to lhs == rhs, so if we change the AAs to use opEquals but do not give the error that we're currently giving, then all we'd be doing is exposing an existing bug in someone's code, and that bug probably exists elsewhere in their code, since they probably use opEquals with their keys _somewhere_.

And it seems kind of messed up to me to require that opEquals be defined just because you're using an AA when the default opEquals already does the right thing.

IMHO, we should either require that opEquals be defined in general if opCmp is defined (which I think is a bad idea), or we should just use the default opEquals with AAs without complaining about it (much better idea IMHO). The only code that would be negatively affected is already broken anyway. But requiring folks to define opEquals instead of using the perfectly good default just because they're using an AA seems very wrong to me.

--