July 25, 2014
On Friday, 25 July 2014 at 05:52:57 UTC, Manu via Digitalmars-d wrote:
> On 25 July 2014 14:50, Walter Bright via Digitalmars-d <
> digitalmars-d@puremagic.com> wrote:
>
>> On 7/23/2014 9:45 AM, H. S. Teoh via Digitalmars-d wrote:
>>
>>> https://issues.dlang.org/show_bug.cgi?id=13179
>>>
>>
>> Consider also:
>>
>> http://www.reddit.com/r/programming/comments/2bl51j/
>> programming_in_d_a_great_online_book_for_learning/cj75gm9
>>
>> The current scheme breaks existing code - code that was formerly correct
>> and working.
>>
>> AAs don't make sense if the notion of == on members is invalid. AAs
>> formerly required opCmp, and yes, opCmp could be constructed to give
>> different results for opCmp==0 than ==, but I would expect such an object
>> to not be used in an AA, again because it doesn't make sense.
>>
>> Using the default generated opEquals for AAs may break code, such as the
>> an AA of the structs in the parent post, but it seems unlikely that that
>> code was correct anyway in an AA (as it would give erratic results).
>>
>> Kenji's rebuttal https://issues.dlang.org/show_bug.cgi?id=13179#c2 is
>> probably the best counter-argument, and I'd go with it if it didn't result
>> in code breakage.
>>
>
> I don't really see how opCmp == 0 could be unreliable or unintended. It was
> deliberately written by the author, so definitely not unintended, and I
> can't imagine anybody would ever deliberately ignore the == 0 case when
> implementing an opCmp, or produce logic that works for less or greater, but
> fails for equal.
> <= and >= are expressed by opCmp, which imply that testing for equality
> definitely works as the user intended.
>
> In lieu of an opEquals, how can a deliberately implemented opCmp, which we
> know works in the == case (otherwise <= or >= wouldn't work either) ever be
> a worse choice than an implicitly generated opEquals?
>
> Personally, just skimming through this thread, I find it baffling that this
> is controversial.

So, in the case where opCmp was defined but not opEquals, instead of using the normal, built-in opEquals (which should already be equivalent to lhs.opCmp(rhs) == 0), we're going to make the compiler generate opEquals as lhs.opCmp(rhs) == 0? That's a silent performance hit for no good reason IMHO. It doesn't even improve correctness except in the cases where the programmer should have been defining opEquals in the first place, because lhs.opCmp(rhs) == 0 wasn't equivalent to the compiler-generate opEquals. So, we'd be making good code worse just to try and fix an existing bug in bad code in order to do what? Not break the already broken code?

I can understand wanting to avoid breaking code when changing from using opCmp to using opEquals with AAs, but it's only an issue if the code was already broken by defining opCmp in a way that didn't match opEquals, so if I find it odd that any part of this is controversial, it's the fact that anyone thinks that we should try and avoid breaking code where opEquals and opCmp weren't equivalent.

- Jonathan M Davis
July 25, 2014
On 7/24/2014 10:52 PM, Manu via Digitalmars-d wrote:
> I don't really see how opCmp == 0 could be unreliable or unintended. It was
> deliberately written by the author, so definitely not unintended, and I can't
> imagine anybody would ever deliberately ignore the == 0 case when implementing
> an opCmp, or produce logic that works for less or greater, but fails for equal.
> <= and >= are expressed by opCmp, which imply that testing for equality
> definitely works as the user intended.

Yes, that's why it's hard to see that it would break existing code, unless that existing code had a bug in it that was worked around in some peculiar way.


> In lieu of an opEquals, how can a deliberately implemented opCmp, which we know
> works in the == case (otherwise <= or >= wouldn't work either) ever be a worse
> choice than an implicitly generated opEquals?

Determining an ordering can sometimes be more expensive. It is, after all, asking for more information.

July 25, 2014
"Walter Bright"  wrote in message news:lqsunn$2ke5$1@digitalmars.com...

> Determining an ordering can sometimes be more expensive. It is, after all, asking for more information.

It could also be significantly cheaper, if only a subset of fields need to be compared. 

July 25, 2014
On Friday, 25 July 2014 at 07:21:11 UTC, Daniel Murphy wrote:
> "Walter Bright"  wrote in message news:lqsunn$2ke5$1@digitalmars.com...
>
>> Determining an ordering can sometimes be more expensive. It is, after all, asking for more information.
>
> It could also be significantly cheaper, if only a subset of fields need to be compared.

If that's the case, then the default opEquals isn't correct, and the programmer should have already defined opEquals. If they didn't, then their code is broken, and I see no reason to penalize the folks who wrote correct code just to fix someone else's broken code by then defining opEquals in terms of opCmp.

- Jonathan M Davis
July 25, 2014
"Jonathan M Davis"  wrote in message news:zcutsbuilcttvbuahmlc@forum.dlang.org...

> If that's the case, then the default opEquals isn't correct, and the programmer should have already defined opEquals. If they didn't, then their code is broken, and I see no reason to penalize the folks who wrote correct code just to fix someone else's broken code by then defining opEquals in terms of opCmp.

Just because not all fields _need_ to be compared doesn't mean the default opEquals was incorrect.  The ignored fields could be cached values calculated from the other fields. 

July 25, 2014
On Friday, 25 July 2014 at 07:34:29 UTC, Daniel Murphy wrote:
> "Jonathan M Davis"  wrote in message news:zcutsbuilcttvbuahmlc@forum.dlang.org...
>
>> If that's the case, then the default opEquals isn't correct, and the programmer should have already defined opEquals. If they didn't, then their code is broken, and I see no reason to penalize the folks who wrote correct code just to fix someone else's broken code by then defining opEquals in terms of opCmp.
>
> Just because not all fields _need_ to be compared doesn't mean the default opEquals was incorrect.  The ignored fields could be cached values calculated from the other fields.

True. I didn't think of that. But even if that's the case, if they don't define opEquals, then they've always been getting an opEquals which compares all of them. The only place that they would have gotten better performance would have be when the type was used as the key in an AA, since that will now use opEquals instead of opCmp. But if they want to get that efficiency gain, then they can just define opEquals themselves - and if they really cared about that gain, they would have already defined opEquals themselves anyway, because the cases other than AAs would have been using the default opEquals.

So, while you have a good point that opCmp _can_ be more efficient than opEquals, it usually isn't, and the places where that would make a difference should already be defining opEquals anyway, meaning that changing the default opEquals to use opCmp wouldn't gain them anything unless they didn't care about that efficiency gain.

- Jonathan M Davis
July 25, 2014
On 25/07/14 06:50, Walter Bright wrote:

> Consider also:
>
> http://www.reddit.com/r/programming/comments/2bl51j/programming_in_d_a_great_online_book_for_learning/cj75gm9
>
>
> The current scheme breaks existing code - code that was formerly correct
> and working.
>
> AAs don't make sense if the notion of == on members is invalid. AAs
> formerly required opCmp, and yes, opCmp could be constructed to give
> different results for opCmp==0 than ==, but I would expect such an
> object to not be used in an AA, again because it doesn't make sense.
>
> Using the default generated opEquals for AAs may break code, such as the
> an AA of the structs in the parent post, but it seems unlikely that that
> code was correct anyway in an AA (as it would give erratic results).

The code [1] from the original issue, 13179, does have an opCmp which handles equivalent.

> Kenji's rebuttal https://issues.dlang.org/show_bug.cgi?id=13179#c2 is
> probably the best counter-argument, and I'd go with it if it didn't
> result in code breakage.

I still don't see the problem:

1. If neither opCmp or opEquals are defined, the compiler will automatically generate these and will be used for comparison and equivalent

2. If opEquals is defined, lhs == rhs will be lowered to lhs.opEquals(rhs)

3. If opCmp is defined but no opEquals, lhs == rhs will be lowered to lhs.opCmp(rhs) == 0

4. If opCmp and opEquals is defined, lhs == rhs will be lowered to lhs.opEquals(rhs)

The only case this will break code is when opCmp was defined and opEquals was not defined where lhs.opCmp(rhs) == 0 and lhs == rhs gave different results. Many here think this is a bug in the first place. If anything, this change would fix broken code.

[1] https://github.com/SiegeLord/Tango-D2/blob/d2port/tango/text/Regex.d#L2345

-- 
/Jacob Carlborg
July 25, 2014
On 25/07/14 08:44, Jonathan M Davis wrote:

> So, in the case where opCmp was defined but not opEquals, instead of
> using the normal, built-in opEquals (which should already be equivalent
> to lhs.opCmp(rhs) == 0), we're going to make the compiler generate
> opEquals as lhs.opCmp(rhs) == 0? That's a silent performance hit for no
> good reason IMHO.

So are default initialized variables, virtual by default and other similar cases. D aims for correctness and safety first. With the option to get better performance by, possibly, writing some extra code.

> It doesn't even improve correctness except in the
> cases where the programmer should have been defining opEquals in the
> first place, because lhs.opCmp(rhs) == 0 wasn't equivalent to the
> compiler-generate opEquals. So, we'd be making good code worse just to
> try and fix an existing bug in bad code in order to do what? Not break
> the already broken code?

I don't understand this. How are we making good code worse? If the code was working previously, opCmp == 0 should have had the same result as the default generated opEquals. In that case it's perfectly safe to define opEquals to be opCmp == 0.

> I can understand wanting to avoid breaking code when changing from using
> opCmp to using opEquals with AAs, but it's only an issue if the code was
> already broken by defining opCmp in a way that didn't match opEquals, so
> if I find it odd that any part of this is controversial, it's the fact
> that anyone thinks that we should try and avoid breaking code where
> opEquals and opCmp weren't equivalent.

By defining opEquals to be opCmp == 0 we're:

1. We're not breaking code where it wasn't broken previously
2. We're fixing broken code. That is when opEqual and opCmp == 0 gave different results

-- 
/Jacob Carlborg
July 25, 2014
On 25/07/14 08:50, Walter Bright wrote:

> Yes, that's why it's hard to see that it would break existing code,
> unless that existing code had a bug in it that was worked around in some
> peculiar way.

If the type was only used as an AA key and never checked for equivalent then it worked correctly when opCmp was used for AA keys.

Also, adding an opEqual to call opCmp == 0 will make it work for equivalent as well, even though it's never used. And it will fix the breaking change with AA keys.

-- 
/Jacob Carlborg
July 25, 2014
On 25/07/14 08:31, Jonathan M Davis wrote:

> Exactly. The only reason that switching from using lhs.opCmp(rhs) == 0
> to opEquals would break code is if a type does not define them such that
> they're equivalent, which would mean that opEquals and/or opCmp was
> defined in a buggy manner. So, the only way that the change would break
> code is if it was broken in the first place. All it risks is making it
> so that the bug exhibits itself in an additional case.

If the type is only used as an AA key and never checked for equivalent it worked when opCmp as used for AA keys.

-- 
/Jacob Carlborg