recursive equal, and firstDifference functions (page 4) - D Programming Language Discussion Forum

March 20, 2013

Re: recursive equal, and firstDifference functions

Posted by Jonathan M Davis
in reply to Dan

On Wednesday, March 20, 2013 01:17:13 Dan wrote:
> This is true, but then my code is by definition not standard.
> However, theoretically, the language writers could. For example,
> any '==' could be lowered to a 'standard' function, probably
> better named 'instancesDeepEqual(a,b)' and that function could use
> reflection to provide equal when not available else call opEquals
> if it is available. Similar with opCmp, dup, idup, ...
> In other words, in the vein of the original poster, why not allow
> all of these nice goodies (equality comparison, opCmp comparison,
> dup) without requiring boilerplate code while still
> honoring/using it when it is provided.

There _are_ defaults for ==, and it really does quite a lot for you (e.g. recursively using == on each of a type's member variables), so you don't always have to define opEquals, but as soon as you have a layer of indirection (aside from arrays), you do have to define opEquals, or you'll generally end up checking for reference/pointer equality rather than equality of the actual objects. And in some cases, that's exactly the right thing to do. In others, it's not. It all depends on the context.
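A minimal sketch of the indirection case described above. The type names Inner, ByValue and ByPointer are made up for illustration and are not from the thread:

```d
struct Inner { int x; }

// No indirection: the generated == compares the nested member with ==.
struct ByValue { Inner inner; }

// One layer of indirection: the generated == compares the pointers.
struct ByPointer { Inner* inner; }

void main()
{
    assert(ByValue(Inner(1)) == ByValue(Inner(1)));

    auto a = new Inner;
    auto b = new Inner;
    a.x = 1;
    b.x = 1;

    assert(*a == *b);                      // the pointed-to values are equal
    assert(ByPointer(a) != ByPointer(b));  // but the two pointers differ
    assert(ByPointer(a) == ByPointer(a));  // same pointer, so equal
}
```

Whether the pointer comparison is the right answer depends on what the type is meant to model, which is why opEquals has to be written by hand in that case.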

As for comparing ranges, it would actually be very bad for performance if they defaulted to what equal does (particularly with regard to strings), and the language really doesn't do much to support ranges directly. Aside from foreach, they're completely a library construct. And really, equal exists with the idea that ranges of different types (but the same element type) can be compared, which isn't what you're generally trying to do with opEquals at all.

If you want to avoid boilerplate, then use mixins. It should be fairly trivial to define a mixin such that you can do something like

mixin(defineOpEquals!(typeof(this)));

which defines an opEquals that does a deep comparison. But also remember that what is necessary for deep comparison often depends on the types being compared (e.g. a range may or may not require equal to get the comparison that you want), so while it may be possible to define a mixin which creates an opEquals that works in most situations, there are plenty where it won't. For instance, given that the standard library supports ranges, it's likely that something like defineOpEquals would know about them and do what is most likely to be the correct thing for them, but if ranges weren't part of the standard library, then defineOpEquals wouldn't know about them and couldn't handle them properly. The same will be true for any user-defined type that requires a particular idiom in order to be compared correctly.
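One possible shape for such a mixin, as a rough sketch only. It uses a mixin template rather than the string mixin written above, and the names DefineOpEquals, deepEqual and Node are hypothetical. It only knows about pointers, so it has exactly the limitation described: any type needing another idiom (ranges, for instance) would not be handled.

```d
import std.traits : isPointer;

// Hypothetical helper: pointers are compared by what they point to,
// everything else falls back to ==.
bool deepEqual(T)(auto ref const T a, auto ref const T b)
{
    static if (isPointer!T)
    {
        if (a is b) return true;                   // same address, or both null
        if (a is null || b is null) return false;  // only one is null
        return deepEqual(*a, *b);
    }
    else
        return a == b;
}

// Hypothetical mixin template that generates a member-by-member opEquals.
mixin template DefineOpEquals()
{
    bool opEquals(const typeof(this) rhs) const
    {
        foreach (i, ref member; this.tupleof)
            if (!deepEqual(member, rhs.tupleof[i]))
                return false;
        return true;
    }
}

struct Node
{
    int* value;
    string name;
    mixin DefineOpEquals;
}

void main()
{
    auto a = Node(new int, "foo");
    auto b = Node(new int, "foo".idup);
    *a.value = 42;
    *b.value = 42;

    assert(a.value !is b.value);  // distinct pointers
    assert(a == b);               // equal once the pointers are followed

    *b.value = 7;
    assert(a != b);
}
```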

- Jonathan M Davis

March 20, 2013

Re: recursive equal, and firstDifference functions

Posted by H. S. Teoh
in reply to Dan

On Wed, Mar 20, 2013 at 01:17:13AM +0100, Dan wrote:
> On Tuesday, 19 March 2013 at 23:13:19 UTC, Jonathan M Davis wrote:
[...]
> >But the main problem that I'm pointing out is that you can't define your own, non-standard functions for equality or hashing or whatever and expect your types to play nicely with other stuff. If your stuff is wrapped in types that do define the proper functions for that (like in your example), then it can work, but the types which were wrapped won't play nice outside of the wrapper.
> >
> 
> This is true, but then my code is by definition not standard. However, theoretically, the language writers could. For example, any '==' could be lowered to a 'standard' function, probably better named 'instancesDeepEqual(a,b)' and that function could use reflection to provide equal when not available else call opEquals if it is available. Similar with opCmp, dup, idup, ...  In other words, in the vein of the original poster, why not allow all of these nice goodies (equality comparison, opCmp comparison, dup) without requiring boilerplate code while still honoring/using it when it is provided.
[...]

I like this idea. By default, provide something that recursively compares struct/class members, array elements, etc. But if at any level an opEquals is defined, then that is used instead. This maximizes convenience for those cases where you *do* just want a literal equality of all sub-structures, but also allows you to override that behaviour if your class/struct needs some other special processing.


T

-- 
"How are you doing?" "Doing what?"

March 20, 2013

Re: recursive equal, and firstDifference functions

Posted by Jonathan M Davis

On Tuesday, March 19, 2013 18:20:35 H. S. Teoh wrote:
> I like this idea. By default, provide something that recursively compares struct/class members, array elements, etc. But if at any level an opEquals is defined, then that is used instead. This maximizes convenience for those cases where you *do* just want a literal equality of all sub-structures, but also allows you to override that behaviour if your class/struct needs some other special processing.

We already get this. That's what == does by default. It's just that it uses == on each member, so if == doesn't work for a particular member variable and the semantics you want for == on the type it's in, you need to override opEquals.

- Jonathan M Davis

March 20, 2013

Re: recursive equal, and firstDifference functions

Posted by Dan
in reply to Jonathan M Davis

On Wednesday, 20 March 2013 at 02:03:31 UTC, Jonathan M Davis wrote:
> We already get this. That's what == does by default. It's just that it uses ==
> on each member, so if == doesn't work for a particular member variable and the
> semantics you want for == on the type it's in, you need to override opEquals.

Really?

string is one most people would like == to just work for. This writes true then false. This certainly takes getting used to. It alone is a good reason for the mixins and potentially a non-member instancesDeepEqual.

import std.stdio;
struct S {
  string s;
}
void main() {
  writeln("foo" == "foo".idup);
  writeln(S("foo") == S("foo".idup));
}

March 20, 2013

Re: recursive equal, and firstDifference functions

Posted by Jonathan M Davis
in reply to Dan

On Wednesday, March 20, 2013 03:48:38 Dan wrote:
> On Wednesday, 20 March 2013 at 02:03:31 UTC, Jonathan M Davis wrote:
> > We already get this. That's what == does by default. It's just that it uses ==
> > on each member, so if == doesn't work for a particular member variable and the
> > semantics you want for == on the type it's in, you need to override opEquals.
> 
> Really?
> 
> string is one most people would like == to just work for. This writes true then false. This certainly takes getting used to. It alone is a good reason for the mixins and potentially a non-member instancesDeepEqual.
> 
> import std.stdio;
> struct S {
>    string s;
> }
> void main() {
>    writeln("foo" == "foo".idup);
>    writeln(S("foo") == S("foo".idup));
> }

That's a bug:

http://d.puremagic.com/issues/show_bug.cgi?id=3789

- Jonathan M Davis

March 20, 2013

Re: recursive equal, and firstDifference functions

Posted by Jonathan M Davis
in reply to Dan

On Wednesday, March 20, 2013 01:17:13 Dan wrote:
> This is true, but then my code is by definition not standard.
> However, theoretically, the language writers could. For example,
> any '==' could be lowered to a 'standard' function, probably
> better named 'instancesDeepEqual(a,b)' and that function could use
> reflection to provide equal when not available else call opEquals
> if it is available. Similar with opCmp, dup, idup, ...
> In other words, in the vein of the original poster, why not allow
> all of these nice goodies (equality comparison, opCmp comparison,
> dup) without requiring boilerplate code while still
> honoring/using it when it is provided.

Okay. I'm going to take a second stab at replying to this, because I think that I can explain it much better. The standard function that == lowers to is opEquals. Nothing else is needed. The way that == works is

integral types, char types, pointers, and bool: bitwise comparison

floating point types: equality which is similar to bitwise comparison but takes NaN into account

arrays: The ptr and length attributes are checked, and if the lengths are equal but the ptrs are not, then element-wise comparison is used, where each element is compared with == (whereas if the lengths are different or if the ptr and length attributes are identical, no further comparison is needed).

structs and classes: Their opEquals is used. If a struct or class does not define opEquals, then one is generated where each member variable is compared in turn with ==. So, they are compared recursively.

The _only_ times that you need to worry about defining opEquals are when

1. the default comparison is comparing pointers, and you want to compare what they point to rather than the pointers themselves.

2. you want equality to mean something other than calling == on each member variable (including doing something other than == on a particular member variable as might be the case with a member variable which is a range).
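The rules listed above can be checked with a small program. This is just a sketch; Point and Line are made-up types with no opEquals of their own:

```d
struct Point { int x, y; }   // no opEquals defined, so one is generated
struct Line { Point a, b; }  // members are compared recursively with ==

void main()
{
    // Integral types: plain value comparison.
    assert(1 + 1 == 2);

    // Floating point: NaN never compares equal, not even to itself.
    double nan = double.nan;
    assert(nan != nan);

    // Arrays: different ptr, same length, so the elements are compared.
    int[] xs = [1, 2, 3];
    int[] ys = xs.dup;
    assert(xs.ptr !is ys.ptr);
    assert(xs == ys);
    assert(xs != [1, 2]);    // different lengths, so not equal

    // Structs: the generated opEquals compares each member with ==.
    assert(Line(Point(0, 0), Point(1, 1)) == Line(Point(0, 0), Point(1, 1)));
    assert(Line(Point(0, 0), Point(1, 1)) != Line(Point(0, 0), Point(1, 2)));
}
```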

The way == is defined is very clean and straightforward. There _are_ bugs which complicate things (e.g. http://d.puremagic.com/issues/show_bug.cgi?id=3789 ), but those can and will be fixed. The design itself is solid.

It's true that equal is often needed for comparing ranges of the same type, but that's arguably a defect in the implementations of those ranges. equal is specifically designed to compare ranges which are different types but have the same element type. == is for comparing objects of the same type. But most ranges don't currently define opEquals, even when they need it in order for == to do a proper comparison. That's arguably something that should be fixed, but it's a design issue of ranges, not ==. But if you want to ensure that equal is used, you can always use a mixin to define opEquals that way (though you actually risk making the comparison less efficient than it would be if the ranges themselves defined opEquals).
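To illustrate the split between equal and ==, here is a small sketch that compares an array with a lazily computed range. The variable names are arbitrary; equal, map and iota are the usual Phobos functions:

```d
import std.algorithm.comparison : equal;
import std.algorithm.iteration : map;
import std.range : iota;

void main()
{
    int[] arr = [0, 2, 4];
    auto lazyRange = iota(3).map!(i => i * 2);  // yields 0, 2, 4 on demand

    // Different range types, so == is not defined between them.
    static assert(!__traits(compiles, arr == lazyRange));

    // equal walks both ranges and compares element by element.
    assert(equal(arr, lazyRange));
}
```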

So, I can see a definite argument that ranges should make sure that they define opEquals (which wouldn't negate the general need for equal as that's specifically for comparing ranges of different types which have the same element type), but there's no need to change how == works. The design is actually very clean.

- Jonathan M Davis

March 20, 2013

Re: recursive equal, and firstDifference functions

Posted by Dan
in reply to Jonathan M Davis

On Wednesday, 20 March 2013 at 02:54:23 UTC, Jonathan M Davis wrote:
> On Wednesday, March 20, 2013 03:48:38 Dan wrote:
>> On Wednesday, 20 March 2013 at 02:03:31 UTC, Jonathan M Davis wrote:
>> > We already get this. That's what == does by default. It's just that it uses ==
>> > on each member, so if == doesn't work for a particular member variable and the
>> > semantics you want for == on the type it's in, you need to override opEquals.
>> 
>> Really?
>> 
>> string is one most people would like == to just work for. This
>> writes true then false. This certainly takes getting used to. It
>> alone is a good reason for the mixins and potentially a
>> non-member instancesDeepEqual.
>> 
>> import std.stdio;
>> struct S {
>>    string s;
>> }
>> void main() {
>>    writeln("foo" == "foo".idup);
>>    writeln(S("foo") == S("foo".idup));
>> }
>
> That's a bug:
>
> http://d.puremagic.com/issues/show_bug.cgi?id=3789
>

From Feb 2010. Maybe by now how it works is so well understood that fixing it at some point could be a problem. For some, the language is better defined by how the compiler treats your code than by what is listed in Bugzilla. Even looking through the history of that bug, I could not find anything definitive: some say it's a bug, others say it's not. You refer to TDPL, which is a good source, but if it is not viewed as a bug by Walter...

March 20, 2013

Re: recursive equal, and firstDifference functions

Posted by Dan
in reply to Jonathan M Davis

On Wednesday, 20 March 2013 at 03:10:41 UTC, Jonathan M Davis wrote:
>
> The way == is defined is very clean and straightforward. There _are_ bugs which
> complicate things (e.g. http://d.puremagic.com/issues/show_bug.cgi?id=3789 ),
> but those can and will be fixed. The design itself is solid.
>

Thanks for the detailed explanation. If 3789 covers strings, dynamic arrays and associative arrays and it were fixed then I agree, it is clean and straightforward. If I read it correctly with this fix it would change the current opEquals from semantically shallow to semantically deep by default. Semantically deep equality is comfortable to me, but I would imagine by now there is a fair amount of code that might rely on a false result from opEquals if the members (slices, associative arrays) are not bitwise the same.

Thanks
Dan

March 20, 2013

Re: recursive equal, and firstDifference functions

Posted by Jonathan M Davis
in reply to Dan

On Wednesday, March 20, 2013 12:47:38 Dan wrote:
> I would
> imagine by now there is a fair amount of code that might rely on
> a false result from opEquals if the members (slices, associative
> arrays) are not bitwise the same.

I think that it's more likely that there's a lot of code that expects strings and the like to have their elements checked for equality when they're in a struct and is therefore buggy at present due to the fact that they aren't.

As for AAs (which I forgot about), I don't remember what they're supposed to do with ==. They might just do a pointer comparison - or they may do a deeper comparison; I don't know. But the big problem there is that structs need to compare each of their members with ==, and that's not working properly at the moment, so that throws a major wrench in how == works.

- Jonathan M Davis
