July 23, 2014
On 7/23/14, 12:04 PM, H. S. Teoh via Digitalmars-d wrote:
> If autogenerating opEquals to be opCmp()==0 is a no-go, then I'd much
> rather say it should be a compile error if the user defines opCmp but
> not opEquals.

No. There is this notion of partial ordering that makes objects not smaller and not greater than others, yet not equal. -- Andrei
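A minimal D sketch of such a partial ordering, using floating-point NaN (the canonical unordered value — this example is illustrative, not from the post):

```d
void main()
{
    double a = double.nan;
    double b = 1.0;

    // NaN is unordered with respect to every value, including itself:
    // it is neither smaller nor greater than b, yet not equal either.
    assert(!(a < b));
    assert(!(a > b));
    assert(a != b);
    assert(a != a); // NaN is not even equal to itself
}
```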
July 23, 2014
On 7/23/2014 2:36 PM, Andrei Alexandrescu via Digitalmars-d wrote:
> On 7/23/14, 12:04 PM, H. S. Teoh via Digitalmars-d wrote:
>> If autogenerating opEquals to be opCmp()==0 is a no-go, then I'd much
>> rather say it should be a compile error if the user defines opCmp but
>> not opEquals.
>
> No. There is this notion of partial ordering that makes objects not smaller and not greater than others, yet not equal. -- Andrei

Right, but in that case just define both.  It's not the dominant case, so it shouldn't define the default behavior.  Or if you truly have a case of comparable but no possibility of equal, just @disable opEquals (or define it and throw, or assert, or...).
July 24, 2014
On Wednesday, 23 July 2014 at 21:36:16 UTC, Andrei Alexandrescu wrote:
> On 7/23/14, 12:04 PM, H. S. Teoh via Digitalmars-d wrote:
>> If autogenerating opEquals to be opCmp()==0 is a no-go, then I'd much
>> rather say it should be a compile error if the user defines opCmp but
>> not opEquals.
>
> No. There is this notion of partial ordering that makes objects not smaller and not greater than others, yet not equal. --

I would strongly argue that if lhs.opCmp(rhs) == 0 is not equivalent to lhs == rhs, then that type is broken and should not be using opCmp to do its comparisons. std.algorithm.sort allows you to use any predicate you want, allowing for such orderings, but a type which defines opCmp and opEquals inconsistently does not work with generic code, because that's not consistent with how comparisons work for the built-in types.

- Jonathan M Davis
July 24, 2014
On Wednesday, 23 July 2014 at 16:47:40 UTC, H. S. Teoh via Digitalmars-d wrote:
> This morning, I discovered this major WAT in D:
>
> ----
> struct S {
>         int x;
>         int y;
>         int opCmp(S s) {
>                 return x - s.x; // compare only x
>         }
> }
>
> void main() {
>         auto s1 = S(1,2);
>         auto s2 = S(1,3);
>         auto s3 = S(2,1);
>
>         assert(s1 < s3); // OK
>         assert(s2 < s3); // OK
>         assert(s3 > s1); // OK
>         assert(s3 > s2); // OK
>         assert(s1 <= s2 && s2 >= s1); // OK
>         assert(s1 == s2); // FAIL -- WAT??
> }
> ----
>
> The reason for this is that the <, <=, >=, > operators are defined in
> terms of opCmp (which, btw, is defined to return 0 when the objects
> being compared are equal), but == is defined in terms of opEquals. When
> opEquals is not defined, it defaults to the built-in compiler
> definition, which is a membership equality test, even if opCmp *is*
> defined, and returns 0 when the objects are equal.
>
> Why isn't "a==b" rewritten as "a.opCmp(b)==0"?? I'm pretty sure TDPL
> says this is the case (unfortunately I'm at work so I can't check my
> copy of TDPL).
>
> https://issues.dlang.org/show_bug.cgi?id=13179
>
> :-(

I would argue that the compiler should still be generating opEquals even if opCmp is defined. Otherwise, even if opCmp is consistent with the built-in opEquals, you'll be forced to reimplement opEquals - and toHash if you're using that type as a key, since once you define opEquals, you have to define toHash.

If it makes sense for a type to define opCmp but not define opEquals (which I seriously question), then I think that it should be explicit, in which case, we can use @disable, e.g. something like

struct S
{
    @disable bool opEquals(ref S s);

    int opCmp(ref S s)
    {
        ...
    }

    ...
}

- Jonathan M Davis
July 24, 2014
On 7/23/14, 5:28 PM, Jonathan M Davis wrote:
> On Wednesday, 23 July 2014 at 21:36:16 UTC, Andrei Alexandrescu wrote:
>> On 7/23/14, 12:04 PM, H. S. Teoh via Digitalmars-d wrote:
>>> If autogenerating opEquals to be opCmp()==0 is a no-go, then I'd much
>>> rather say it should be a compile error if the user defines opCmp but
>>> not opEquals.
>>
>> No. There is this notion of partial ordering that makes objects not
>> smaller and not greater than others, yet not equal. --
>
> I would strongly argue that if lhs.opCmp(rhs) == 0 is not equivalent to
> lhs == rhs, then that type is broken and should not be using opCmp to
> do its comparisons. std.algorithm.sort allows you to use any predicate
> you want, allowing for such orderings, but it does not work with generic
> code for a type to define opCmp or opEquals such that they're not
> consistent, because that's not consistent with how comparisons work for
> the built-in types.

std.algorithm.sort does not use equality at all. It just deems objects for which neither pred(a, b) nor pred(b, a) holds as unordered. -- Andrei

July 24, 2014
On Thursday, 24 July 2014 at 00:31:55 UTC, Jonathan M Davis wrote:
> I would argue that the compiler should still be generating opEquals even if opCmp is defined.

I take this back. As I later suggested in a post somewhere else in this thread (and the bug report that H.S. Teoh opened), I think that we should continue to not define opEquals, but we should add a way to tell the compiler to use the default-generated one (similar to what C++11 does). That way, the programmer is forced to consider what opEquals is supposed to do when opCmp is defined, but they're still able to use the default-generated one. The same goes for toHash.

Regardless, because automatically making opEquals be lhs.opCmp(rhs) == 0 incurs a silent performance hit, I think that it's a bad idea.
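A sketch of the kind of performance hit being described (the type and fields here are hypothetical): deriving equality from a lexicographic opCmp walks elements that a direct opEquals could skip.

```d
struct Name
{
    string[] parts;

    // Lexicographic three-way compare over the elements.
    int opCmp(Name other)
    {
        import std.algorithm.comparison : cmp;
        return cmp(parts, other.parts);
    }

    // If opEquals were auto-generated as opCmp(other) == 0, comparing
    // two names of different lengths would still walk their common
    // prefix. A hand-written opEquals can reject on length at once
    // (the built-in array == does exactly that).
    bool opEquals(Name other)
    {
        return parts == other.parts;
    }
}
```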

- Jonathan M Davis
July 24, 2014
On Thursday, 24 July 2014 at 00:45:05 UTC, Andrei Alexandrescu wrote:
> std.algorithm.sort does not use equality at all. It just deems objects for which pred(a, b) and pred(b, a) as unordered. --

I know. It uses < by default. What I'm disputing is that it's okay to define a type whose opCmp does not match its opEquals - or even that it has opCmp but doesn't have opEquals. I would strongly argue that such a type should not use opCmp for its ordering, because otherwise, its comparison operators are not consistent with how the comparison operators work for the built-in types.

I brought up sort, because that's usually where the question of ordering comes up, and our sort function is designed so that it does not require opCmp (the same with other Phobos constructs such as RedBlackTree). So, if a type needs to do its ordering in a manner which is inconsistent with opEquals, it can do so by providing a function other than opCmp for those comparisons.

But I think that it's a poor idea to have opCmp not be consistent with opEquals, since most generic code will assume that they're consistent. I'd strongly argue that an overloaded operator should be consistent with how the built-in operators work, or that the functionality should not be using an overloaded operator at all. This is especially important when generic code comes into play, but it's also very important for understanding how code works in general. Operators bring with them certain expectations of behavior, and they should meet those expectations rather than be used in a manner which violates them. Having opCmp be inconsistent with opEquals violates them, indicating that opCmp should not be used in that case.
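A sketch of the assumption generic code typically makes (the function here is hypothetical): when a type defines opCmp, "not less and not greater" is taken to mean "equal".

```d
// Generic code routinely assumes trichotomy for ordered types.
void process(T)(T a, T b)
{
    if (a < b)      { /* handle a before b */ }
    else if (a > b) { /* handle b before a */ }
    else
    {
        // This assert fires for any type whose opCmp and opEquals
        // disagree -- e.g. the struct S from the start of the thread,
        // whose opCmp compares only x while the default opEquals
        // compares all members.
        assert(a == b);
    }
}
```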

As such, I think that arguing that opCmp should be able to exist without an opEquals is a bad argument. I think that if you have opCmp, you should be required to have opEquals (though not one automatically defined as lhs.opCmp(rhs) == 0, because that would incur a silent performance hit).

- Jonathan M Davis
July 24, 2014
On Thursday, 24 July 2014 at 00:28:06 UTC, Jonathan M Davis wrote:
> On Wednesday, 23 July 2014 at 21:36:16 UTC, Andrei Alexandrescu wrote:
>> On 7/23/14, 12:04 PM, H. S. Teoh via Digitalmars-d wrote:
>>> If autogenerating opEquals to be opCmp()==0 is a no-go, then I'd much
>>> rather say it should be a compile error if the user defines opCmp but
>>> not opEquals.
>>
>> No. There is this notion of partial ordering that makes objects not smaller and not greater than others, yet not equal. --
>
> I would strongly argue that if lhs.opCmp(rhs) == 0 is not equivalent to lhs == rhs, then that type is broken and should not be using opCmp to do its comparisons. std.algorithm.sort allows you to use any predicate you want, allowing for such orderings, but it does not work with generic code for a type to define opCmp or opEquals such that they're not consistent, because that's not consistent with how comparisons work for the built-in types.
>
> - Jonathan M Davis

Floating point?
July 24, 2014
On Thu, Jul 24, 2014 at 12:51:31AM +0000, Jonathan M Davis via Digitalmars-d wrote:
> On Thursday, 24 July 2014 at 00:31:55 UTC, Jonathan M Davis wrote:
> >I would argue that the compiler should still be generating opEquals even if opCmp is defined.
> 
> I take this back. As I later suggested in a post somewhere else in this thread (and the bug report that H.S. Teoh opened), I think that we should continue to not define opEquals, but we should add a way to tell the compiler to use the default-generated one (similar to what C++11 does). That way, the programmer is forced to consider what opEquals is supposed to do when opCmp is defined, but they're still able to use the default-generated one. The same goes for toHash.
> 
> Regardless, because automatically making opEquals be lhs.opCmp(rhs) == 0 incurs a silent performance hit, I think that it's a bad idea.
[...]

This sounds like a reasonable compromise. If the user defines opCmp, then it is an error not to define opEquals. However, opEquals can be specified to be default, to get the compiler-generated version:

	struct S {
		int opCmp(S s) { ... }
		bool opEquals(S s) = default;
	}

Optionally, we could also allow opEquals to be @disabled, perhaps.

Same goes with toHash.
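A sketch of the opEquals/toHash contract being referred to (the type and fields are illustrative): equal keys must produce equal hashes, so toHash has to look at exactly the state that opEquals compares.

```d
struct Key
{
    int id;
    string cachedLabel; // not part of the key's identity

    bool opEquals(Key other) const
    {
        return id == other.id; // identity is id alone
    }

    size_t toHash() const nothrow @safe
    {
        // Hash only what opEquals compares, so that equal keys
        // always land in the same AA bucket.
        return hashOf(id);
    }
}
```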

Keep in mind, though, that due to current AA changes in 2.066 beta, existing code WILL break unless we autogenerate opEquals to be opCmp()==0. In fact, issue 13179 was originally filed because 2.066 beta broke Tango. My current stance is that these AA changes are an improvement that we should keep, so then the question becomes, should we break code over it, or should we introduce opEquals = (opCmp()==0), which would allow existing code *not* to break?

Given the choice between (1) breaking code *and* allowing opCmp to be inconsistent with opEquals, as the current situation is, and (2) *not* breaking code and making opEquals consistent with opCmp by default, I would choose (2) as being clearly more advantageous. The above compromise solves the opEquals/opCmp consistency problem, but does not address the inevitable code breakage that will happen when 2.066 is released. Is it really worth the ire of D coders to have their existing code break, for the questionable benefit of being able to make opEquals inconsistent with opCmp just so we can support partial orderings on types?

I don't know about you, but if it were up to me, I would much rather go with the solution of setting opEquals = (opCmp()==0) by default, and let the user redefine opEquals if they want partial orderings or to eliminate performance hits, etc.


T

-- 
Skill without imagination is craftsmanship and gives us many useful objects such as wickerwork picnic baskets.  Imagination without skill gives us modern art. -- Tom Stoppard
July 24, 2014
On Thursday, 24 July 2014 at 01:39:01 UTC, H. S. Teoh via Digitalmars-d wrote:
> Keep in mind, though, that due to current AA changes in 2.066 beta,
> existing code WILL break unless we autogenerate opEquals to be
> opCmp()=0. In fact, issue 13179 was originally filed because 2.066 beta
> broke Tango.

This is exactly what I was referring to by "you will answer to user complaints". Given the declared backwards-compatibility efforts, the decision to reject a pragmatic solution in favor of a purist approach that breaks user code does sound strange.