July 25, 2014
On Friday, 25 July 2014 at 09:12:11 UTC, Jacob Carlborg wrote:
> On 25/07/14 10:48, Walter Bright wrote:
>> Putting it simply,
>>
>> 1. == uses opEquals. If you don't supply opEquals, the compiler will
>> make one for you.
>>
>> 2. AAs use ==. See rule 1.
>>
>>
>> Easy to understand, easy to explain, easy to document.
>
> It's very hard to use D when it constantly changes and breaks code. It's especially annoying reading your comments on reddit that we must stop breaking code. Then a few days later you go and break code. I really hope no one gets false hopes from those comments.

The _only_ code that would break would be code that's _already_ broken - code that defines opCmp in a way that's inconsistent with the default opEquals and then doesn't define opEquals. I see no reason to worry about making sure that we don't break code that's already broken.
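
A minimal sketch of the kind of type in question (the `Release` struct and its fields are hypothetical, invented purely for illustration). opCmp ignores one member and no opEquals is defined, so the compiler-generated memberwise opEquals disagrees with opCmp:

```d
struct Release
{
    int major;
    int minor;
    string label; // ignored by opCmp below, but compared by the default opEquals

    int opCmp(const Release rhs) const
    {
        if (major != rhs.major) return major < rhs.major ? -1 : 1;
        if (minor != rhs.minor) return minor < rhs.minor ? -1 : 1;
        return 0;
    }
}

void main()
{
    auto a = Release(1, 2, "alpha");
    auto b = Release(1, 2, "beta");

    assert(a.opCmp(b) == 0);  // the ordering says the two are equivalent
    assert(a <= b && a >= b); // so both comparisons hold
    assert(a != b);           // yet the generated opEquals also compares label
}
```

Whether the fix belongs in opCmp or in a hand-written opEquals depends on what the type is meant to model; the sketch only shows the mismatch.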

- Jonathan M Davis
July 25, 2014
"Jonathan M Davis"  wrote in message news:lzigfacgrlssjuemoqyg@forum.dlang.org...

> The compiler _never_ defines opCmp for you. You have to do that yourself. So, what you're suggesting would force people to define opEquals just because they defined opCmp unless they wanted to take a performance hit <<<<<<<< in the rare case that it actually matters >>>>>>>>>>.
> And once you define opEquals, you have to define toHash. So, what you're suggesting would force a lot more code to define toHash, which will likely cause far more bugs than simply requiring that the programmer define opEquals if that's required in order to make it consistent with opCmp.

July 25, 2014
Am 25.07.2014 12:07, schrieb Jonathan M Davis:
> And once you define opEquals, you have to define
> toHash. So, what you're suggesting would force a lot more code to define
> toHash, which will likely cause far more bugs than simply requiring that

Is it actually hard to define toHash, or should it be?
What is done by default? I guess some magic hash is built over all members of a type (like all members are compared in opEquals).
So couldn't there be some templated function that creates the hash for you in the same way as it's done now, but only for the values you want to hash?

e.g.
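hash_t createHash(T...)(T args) {
    hash_t h = 0;
    foreach (arg; args)      // one possible "magic": fold each argument
        h = hashOf(arg, h);  // into a running hash via druntime's hashOf
    return h;
}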


struct Foo {
    int x;
    int y;
    string str;
    int dontCare;

    bool opEquals()(auto ref const Foo o) const {
        return x == o.x  && y == o.y && str == o.str;
    }

    hash_t toHash() const nothrow @safe {
        return createHash(x, y, str);
    }
}

Cheers,
Daniel
July 25, 2014
On Friday, 25 July 2014 at 10:10:50 UTC, Daniel Murphy wrote:
> "Jonathan M Davis"  wrote in message news:lzigfacgrlssjuemoqyg@forum.dlang.org...
>
>> The compiler _never_ defines opCmp for you. You have to do that yourself. So, what you're suggesting would force people to define opEquals just because they defined opCmp unless they wanted to take a performance hit <<<<<<<< in the rare case that it actually matters >>>>>>>>>>.

Equality checks are a common operation, so it will affect a fair bit of code. Granted, how much it will really matter is an open question, but there will be a small reduction in speed to quite a bit of code out there.

But regardless of whether the efficiency cost is large, you're talking about incurring it just to fix the code of folks who couldn't be bothered to make sure that opEquals and lhs.opCmp(rhs) == 0 were equivalent. You'd be punishing correct code (however slight that punishment may be) in order to fix the code of folks who didn't even properly test basic functionality. I see no reason to care about trying to help out folks who can't even be bothered to test opEquals and opCmp, especially when that help isn't free.

- Jonathan M Davis
July 25, 2014
On Friday, 25 July 2014 at 10:27:27 UTC, Daniel Gibson wrote:
> Am 25.07.2014 12:07, schrieb Jonathan M Davis:
>> And once you define opEquals, you have to define
>> toHash. So, what you're suggesting would force a lot more code to define
>> toHash, which will likely cause far more bugs than simply requiring that
>
> Is it actually hard to define toHash, or should it be?
> What is done by default? I guess some magic hash is built over all members of a type (like all members are compared in opEquals).
> So couldn't there be some templated function that creates the hash for you in the same way as it's done now, but only for the values you want to hash?

Sure. We could create something like that, and we probably should. It would help out in cases where the default wasn't appropriate (e.g. only some of the member variables were part of opEquals). But why force folks to define opEquals and toHash when the defaults would have worked fine for them just to fix the code of folks who didn't make the effort to test that opEquals and lhs.opCmp(rhs) == 0 were equivalent? That seems to me like we're punishing the folks who actually write good code and test it in order to help those who don't even test the basic functionality of their types.
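
A rough sketch of that case, assuming druntime's `hashOf` is available (the `Foo` layout is only illustrative, and this is one possible way to write it, not an existing library helper). One member is left out of both opEquals and toHash, so two values that differ only in that member land on the same AA entry:

```d
struct Foo
{
    int x;
    string str;
    int dontCare; // deliberately excluded from equality and hashing

    bool opEquals(const Foo o) const
    {
        return x == o.x && str == o.str;
    }

    size_t toHash() const nothrow @safe
    {
        // Hash exactly the members that opEquals compares.
        return hashOf(str, hashOf(x));
    }
}

void main()
{
    int[Foo] counts;
    counts[Foo(1, "a", 10)] = 42;

    assert(Foo(1, "a", 99) in counts);     // dontCare does not affect lookup
    assert(counts[Foo(1, "a", 99)] == 42);
    assert(Foo(2, "a", 10) !in counts);    // x does
}
```

The point to keep in mind is that toHash has to agree with opEquals: any two values that compare equal must hash to the same value.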

- Jonathan M Davis
July 25, 2014
On Fri, 25 Jul 2014 09:39:11 +0100, Walter Bright <newshound2@digitalmars.com> wrote:

> On 7/25/2014 1:02 AM, Jacob Carlborg wrote:
>> 3. If opCmp is defined but no opEquals, lhs == rhs will be lowered to
>> lhs.opCmp(rhs) == 0
>
> This is the sticking point. opCmp and opEquals are separate on purpose, see Andrei's posts.

Sure, Andrei makes a valid point... for a minority of cases. The majority case will be that opEquals and opCmp==0 agree. In those minority cases where they are intended to disagree, the user will have intentionally defined both to be different. I cannot think of any case where a user will intend for these to be different and then not define both to ensure it.
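
For illustration, a hand-written version of what the proposed lowering would amount to. This is a sketch only: `Fraction` is a hypothetical type, and the explicit opEquals stands in for what the compiler would do under the proposal, not for anything it does today.

```d
struct Fraction
{
    int num;
    int den; // assumed to be positive

    int opCmp(const Fraction rhs) const
    {
        // Compare num/den with rhs.num/rhs.den by cross-multiplying.
        const long l = cast(long) num * rhs.den;
        const long r = cast(long) rhs.num * den;
        return (l > r) - (l < r);
    }

    // What "lhs == rhs is lowered to lhs.opCmp(rhs) == 0" looks like by hand.
    bool opEquals(const Fraction rhs) const
    {
        return opCmp(rhs) == 0;
    }
}

void main()
{
    assert(Fraction(1, 2) == Fraction(2, 4)); // memberwise comparison would say "not equal"
    assert(Fraction(1, 3) < Fraction(1, 2));
    assert(Fraction(1, 2) != Fraction(2, 3));
}
```

Here the memberwise default would be the wrong answer and opCmp==0 the intended one, which is the majority case described above.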

R

July 25, 2014
On 25/07/14 11:46, Jonathan M Davis wrote:

> Code that worked perfectly fine before is now slower, because it's using
> opCmp for opEquals when it wasn't before.

Who says opCmp needs to be slower than opEquals?

> Even worse, if you define
> opEquals, you're then forced to define toHash, which is much harder to
> get right.

That might be a problem. But you can always call the one in TypeInfo.

> So, in order to avoid a performance hit on opEquals from
> defining opCmp

Assuming there is a performance hit.

-- 
/Jacob Carlborg
July 25, 2014
On 25/07/14 12:07, Jonathan M Davis wrote:

> The compiler _never_ defines opCmp for you. You have to do that
> yourself. So, what you're suggesting would force people to define
> opEquals just because they defined opCmp unless they wanted to take a
> performance hit. And once you define opEquals, you have to define
> toHash. So, what you're suggesting would force a lot more code to define
> toHash, which will likely cause far more bugs than simply requiring that
> the programmer define opEquals if that's required in order to make it
> consistent with opCmp.

Again, you're assuming there will be a performance hit.

-- 
/Jacob Carlborg
July 25, 2014
On 25 July 2014 16:50, Walter Bright via Digitalmars-d <digitalmars-d@puremagic.com> wrote:

> On 7/24/2014 10:52 PM, Manu via Digitalmars-d wrote:
>
>> I don't really see how opCmp == 0 could be unreliable or unintended. It was deliberately written by the author, so definitely not unintended, and I can't imagine anybody would ever deliberately ignore the == 0 case when implementing an opCmp, or produce logic that works for less or greater, but fails for equal. <= and >= are expressed by opCmp, which implies that testing for equality definitely works as the user intended.
>>
>
> Yes, that's why it's hard to see that it would break existing code, unless that existing code had a bug in it that was worked around in some peculiar way.


Indeed.

>> In lieu of an opEquals, how can a deliberately implemented opCmp, which we know works in the == case (otherwise <= or >= wouldn't work either) ever be a worse choice than an implicitly generated opEquals?
>>
>
> Determining an ordering can sometimes be more expensive. It is, after all, asking for more information.
>

Correctness has always been the first criterion to satisfy in D. The user is always able to produce faster code with deliberate effort, and that's true in this case too, but you can't have something with a high probability of being incorrect be the default...?

When there is no opEquals but an opCmp exists, the probability of being correct is heavily biased towards the user-supplied opCmp==0, which must already support <=/>= and is therefore almost certainly correct, rather than towards some compiler-generated guess, which has no insight into the object and can only be correct if you're lucky...

I'm a user who's concerned with performance more than most, but there's no way I can buy into that argument in this case. It's just wrong, and the sort of bugs that this is likely to produce are highly surprising, very easily overlooked, and likely to cost many hours to track down. It's the sort of bug that nobody wants to be tracking down.

All that said, I'm not even convinced that there would be a performance
advantage anyway. I'd be surprised if the optimiser wouldn't produce
correct code for 'a-b==0' vs 'a==b'. These are trivial things that
optimisers have been extremely good at for decades.
If I had to guess at which one offered a performance advantage, I'd say
that they'd likely be the same (because optimisers work well with that sort
of input), or the advantage would go to the user opCmp.
The reason I say that, is that user supplied opCmp may compare *at most*
every field (and therefore likely perform the same), but in reality,
there's a good chance that the comparison requires comparing only a subset
of fields - a user struct is likely to contain some irrelevant fields,
cache data perhaps, whatever - and therefore comparing less stuff would
more likely yield a performance advantage.
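
As a small sketch of the "asking for more information" point quoted above, here is one case where an ordering does need more work than an equality test. The helper functions are hypothetical and only count element comparisons; they are not how druntime implements array comparison.

```d
// Element comparisons needed to decide whether two arrays are equal.
size_t stepsForEquals(const int[] a, const int[] b)
{
    if (a.length != b.length) return 0; // different lengths: unequal, nothing read
    size_t steps;
    foreach (i; 0 .. a.length)
    {
        ++steps;
        if (a[i] != b[i]) break;
    }
    return steps;
}

// Element comparisons needed to decide the lexicographic order.
size_t stepsForOrder(const int[] a, const int[] b)
{
    size_t steps;
    const n = a.length < b.length ? a.length : b.length;
    foreach (i; 0 .. n)
    {
        ++steps;
        if (a[i] != b[i]) break;
    }
    return steps; // if no element differed, the lengths decide the order
}

void main()
{
    auto a = [1, 2, 3, 4];
    auto b = [1, 2, 3, 4, 5];
    assert(stepsForEquals(a, b) == 0); // length check settles it
    assert(stepsForOrder(a, b) == 4);  // ordering must walk the common prefix

    // When the first elements differ, both cost the same.
    assert(stepsForEquals([9, 2], [1, 2]) == 1);
    assert(stepsForOrder([9, 2], [1, 2]) == 1);
}
```

For a struct whose opCmp looks at fewer fields than a memberwise comparison would, the balance can tip the other way, as argued above.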


July 25, 2014
On 25/07/14 12:39, Jonathan M Davis wrote:

> But regardless of whether the efficiency cost is large, you're talking
> about incurring it just to fix the code of folks who couldn't be
> bothered to make sure that opEquals and lhs.opCmp(rhs) == 0 were
> equivalent. You'd be punishing correct code (however slight that
> punishment may be) in order to fix the code of folks who didn't even
> properly test basic functionality. I see no reason to care about trying
> to help out folks who can't even be bothered to test opEquals and opCmp,
> especially when that help isn't free.

By Walter and Andrei's definition opCmp is not to be used for equivalence; therefore opCmp never needs to be equal to 0.

-- 
/Jacob Carlborg