May 15, 2019
On 5/14/19 10:35 PM, Mike Franklin wrote:
> On Wednesday, 15 May 2019 at 00:32:32 UTC, Andrei Alexandrescu wrote:
> 
>>> Although it would be much more work, perhaps what is needed is a new type (e.g. `struct CmpResult`) with 4 immutable instances representing each result and an `opCmp` and `opEquals` implementation that does the right thing comparing against 0 or whatever else is needed.  Yes, it's more complicated, but I think it would scale better.
>>
>> Not sure there's much to gain there. a < b is lowered to a.opCmp(b) < 0. So then... you define opCmp to return an instance of this:
>>
>> ---
>> import std.stdio;
>>
>> struct OverengineeredCmpResult {
>>     enum R { lt, eq, gt, ionno }
>>     private R payload;
>>     int opCmp(int alwaysZero) {
>>         writeln("b");
>>         return 0;
>>     }
>> }
>>
>> struct A {
>>     OverengineeredCmpResult opCmp(A rhs) {
>>         writeln("a");
>>         return OverengineeredCmpResult(OverengineeredCmpResult.R.ionno);
>>     }
>> }
>>
>> void main() {
>>     A a, b;
>>     if (a < b) {}
>> }
>> ---
>>
>> Much ado about nothing.
> 
> Cool! It actually looks much simpler than I imagined.

I don't think I made my point clear. Ultimately you're still relying on int. The entire code does nothing relevant. It was a joke in code form.
May 15, 2019
On 5/15/19 1:30 AM, H. S. Teoh wrote:
> On Tue, May 14, 2019 at 09:02:49PM -0400, Andrei Alexandrescu via Digitalmars-d wrote:
> [...]
>> Well there would be in some instances. People often implement
>> comparisons as a - b currently, where a and b are int expressions.
> [...]
> 
> FYI, the result of that is actually incorrect in some cases.  (Consider
> what happens when there is overflow involved, such as int.max
> - int.min, and remember the type of the result.)

Thanks.

> So, not exactly the
> kind of code we should recommend, let alone bend over backwards to
> support.

All integral arithmetic is subject to overflow, at every step of the way. A D coder implementing opCmp would need to figure when the range of the operators does not put comparisons at risk, and where it does, use e.g. CheckedInt or a more elaborate approach.
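
To make the overflow case concrete, here is a minimal sketch contrasting the subtraction idiom with a comparison-based one. The function names `cmpBySubtraction` and `cmpByComparison` are made up for illustration and are not from any library.

```d
// Hypothetical helper names; only the arithmetic is the point.
int cmpBySubtraction(int a, int b)
{
    return a - b; // wraps around when the true difference does not fit in int
}

int cmpByComparison(int a, int b)
{
    // Each comparison yields a bool, which promotes to int 0 or 1,
    // so the result is always -1, 0, or 1 and cannot overflow.
    return (a > b) - (a < b);
}

unittest
{
    // int.max - int.min wraps to a negative value: the wrong sign.
    assert(cmpBySubtraction(int.max, int.min) < 0);
    assert(cmpByComparison(int.max, int.min) > 0);
}
```

The subtraction form remains fine when the operands are known to stay in a range whose difference fits in an int, which is the judgment call described above.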
May 15, 2019
On 5/15/19 4:09 AM, Jonathan M Davis wrote:
> Except that this DIP doesn't need to define opCmp's signature - at least not
> if it's not including interfaces in the design. The rules for opCmp's
> signature on classes should be able to be the same as it is for structs.
> Classes derived from a class that defines opCmp will be restricted by the
> signature on the base class, but the base class should be able to define an
> opCmp the same way that a struct would, meaning that any spec changes that
> we might want to make about what opCmp returns or accepts should be able to
> be completely separate from this DIP.

It seems that could work real neat.

One thing that bothers me is inheritance. It seems to me most of the time just inheriting opCmp and opEquals does not work - they'd need to be overridden to account for the added state. However, sometimes they _do_ just work. So I'm in two minds on whether inheriting but not overloading those two should be an error or not.
May 15, 2019
On Wed, May 15, 2019 at 11:11:00AM -0400, Andrei Alexandrescu via Digitalmars-d wrote:
> On 5/15/19 1:30 AM, H. S. Teoh wrote:
> > On Tue, May 14, 2019 at 09:02:49PM -0400, Andrei Alexandrescu via Digitalmars-d wrote: [...]
> > > Well there would be in some instances. People often implement comparisons as a - b currently, where a and b are int expressions.
> > [...]
> > 
> > FYI, the result of that is actually incorrect in some cases. (Consider what happens when there is overflow involved, such as int.max - int.min, and remember the type of the result.)
> 
> Thanks.
> 
> > So, not exactly the kind of code we should recommend, let alone bend over backwards to support.
> 
> All integral arithmetic is subject to overflow, at every step of the way. A D coder implementing opCmp would need to figure when the range of the operators does not put comparisons at risk, and where it does, use e.g.  CheckedInt or a more elaborate approach.

In the case of integers, it's really just a matter of using the right constructs that translate to the correct hardware instruction(s) that do the right thing.


T

-- 
What do you call optometrist jokes? Vitreous humor.
May 15, 2019
On Wed, May 15, 2019 at 11:14:17AM -0400, Andrei Alexandrescu via Digitalmars-d wrote:
> On 5/15/19 4:09 AM, Jonathan M Davis wrote:
> > Except that this DIP doesn't need to define opCmp's signature - at least not if it's not including interfaces in the design. The rules for opCmp's signature on classes should be able to be the same as it is for structs.  Classes derived from a class that defines opCmp will be restricted by the signature on the base class, but the base class should be able to define an opCmp the same way that a struct would, meaning that any spec changes that we might want to make about what opCmp returns or accepts should be able to be completely separate from this DIP.
> 
> It seems that could work real neat.

Yeah, I think that's the right way to go.  Don't impose any specific
signature on opCmp, and let the user derive a base class with whatever
desired signature he wants, be it int opCmp(), or float opCmp(), or
OverlyElaborateOpCmpResult opCmp().  As long as the compiler can
translate x < y to x.opCmp(y) < 0 and have it compile, that's Good
Enough.
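
A rough sketch of what "as long as it compiles" could look like with compile-time introspection. It uses structs rather than classes to stay within what works today, and `isOrderable`, `IntRanked`, `FloatRanked`, `Unranked`, and `smaller` are all hypothetical names invented for the example.

```d
// True when `a < b` compiles for two values of type T and yields bool.
enum bool isOrderable(T) = is(typeof(T.init < T.init) == bool);

struct IntRanked // hypothetical type with the conventional int opCmp
{
    int rank;
    int opCmp(const IntRanked rhs) const
    {
        return (rank > rhs.rank) - (rank < rhs.rank);
    }
}

struct FloatRanked // hypothetical type whose opCmp returns float
{
    float score;
    float opCmp(const FloatRanked rhs) const
    {
        return score - rhs.score;
    }
}

struct Unranked { int payload; } // no opCmp at all

static assert(isOrderable!IntRanked);
static assert(isOrderable!FloatRanked);
static assert(!isOrderable!Unranked);

// Generic code only asks whether `<` compiles, not what opCmp returns.
T smaller(T)(T a, T b) if (isOrderable!T)
{
    return b < a ? b : a;
}
```

Generic code written this way never names opCmp's return type, so either signature is accepted.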


> One thing that bothers me is inheritance. It seems to me most of the time just inheriting opCmp and opEquals does not work - they'd need to be overridden to account for the added state. However, sometimes they _do_ just work. So I'm in two minds on whether inheriting but not overloading those two should be an error or not.

How would you enforce it, though?  If we go the compile-time introspection route, that means user base classes that define opCmp can define whatever they want, including allowing derived classes to simply inherit base class opCmp.  It wouldn't be ProtoObject's responsibility to enforce any policies concerning opCmp -- it'd be up to the user to get it right.


T

-- 
Freedom: (n.) Man's self-given right to be enslaved by his own depravity.
May 15, 2019
On Tue, May 14, 2019 at 08:32:32PM -0400, Andrei Alexandrescu via Digitalmars-d wrote: [...]
> ---
> import std.stdio;
> 
> struct OverengineeredCmpResult {
>     enum R { lt, eq, gt, ionno }
>     private R payload;
>     int opCmp(int alwaysZero) {
>         writeln("b");
>         return 0;
>     }
> }
> 
> struct A {
>     OverengineeredCmpResult opCmp(A rhs) {
>         writeln("a");
>         return OverengineeredCmpResult(OverengineeredCmpResult.R.ionno);
>     }
> }
> 
> void main() {
>     A a, b;
>     if (a < b) {}
> }
> ---
[...]

FYI, even with the above amusingly elaborate hack, you still cannot achieve proper 4-way comparison results.  Consider: what should OverengineeredCmpResult.opCmp return for payload == ionno, such that <, <=, >, >=, ==, != would all produce the correct result?

Answer: it's not possible unless you return float, because x < y translates to x.opCmp(y) < 0, and x > y translates to x.opCmp(y) > 0, so the only way to represent an incomparable state is for opCmp to return some value z for which z < 0 and z > 0 are *both* false.  There is no integer value that fits this description; the only candidate is float.nan. Substituting the return value of opCmp with a custom struct doesn't fix this problem; it only defers it to the custom struct's opCmp, which suffers from the same problem.

tl;dr: it's currently *not possible* to represent an incomparable state in opCmp with anything other than float.nan (or double.nan, etc.). No amount of hackery with opCmp returning custom structs is going to fix this without using float.nan at *some* point.
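
For illustration, a small sketch of an opCmp that returns float and uses float.nan for the incomparable case. The `Interval` type is a made-up example, not something from the DIP or from Phobos.

```d
struct Interval // hypothetical partially ordered type
{
    int lo, hi;

    float opCmp(const Interval rhs) const
    {
        if (hi < rhs.lo) return -1;                       // entirely before rhs
        if (lo > rhs.hi) return 1;                        // entirely after rhs
        if (lo == rhs.lo && hi == rhs.hi) return 0;       // same interval
        return float.nan; // overlapping but different: incomparable
    }
}

unittest
{
    auto a = Interval(0, 5), b = Interval(3, 8), c = Interval(6, 9);
    assert(a < c);
    // Every ordering comparison against NaN is false, so all four fail.
    assert(!(a < b) && !(a <= b) && !(a > b) && !(a >= b));
}
```

Note that `==` and `!=` go through opEquals rather than opCmp, so they are unaffected by the NaN result.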


T

-- 
There's light at the end of the tunnel. It's the oncoming train.
May 15, 2019
On Wednesday, May 15, 2019 10:13:47 AM MDT H. S. Teoh via Digitalmars-d wrote:
> On Wed, May 15, 2019 at 11:14:17AM -0400, Andrei Alexandrescu via Digitalmars-d wrote:
> > One thing that bothers me is inheritance. It seems to me most of the time just inheriting opCmp and opEquals does not work - they'd need to be overridden to account for the added state. However, sometimes they _do_ just work. So I'm in two minds on whether inheriting but not overloading those two should be an error or not.
>
> How would you enforce it, though?  If we go the compile-time introspection route, that means user base classes that define opCmp can define whatever they want, including allowing derived classes to simply inherit base class opCmp.  It wouldn't be ProtoObject's responsibility to enforce any policies concerning opCmp -- it'd be up to the user to get it right.

It wouldn't have anything to do with ProtoObject. Presumably, if a class defined opCmp but not opEquals, it would be an error, and if a base class defined opCmp, and the derived class defined either opEquals or opCmp, it would be an error if it hadn't also defined the other.

We could also choose to do something similar to what we do when a derived class overloads a base class function and involve alias. So, a derived class could either override neither opEquals nor opCmp, override both, or override one and provide an alias to the base class function for the other. I suppose that it could also alias both, though that would be rather pointless.
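
For reference, this is a sketch of the existing alias mechanism being alluded to, using ordinary member functions. It only shows how overload hiding and `alias` behave today; whether opEquals/opCmp would get an analogous rule is the open question. The class and function names are invented for the example.

```d
class Base
{
    string describe(int x) { return "Base.describe(int)"; }
}

class Hiding : Base
{
    // Declaring any `describe` here hides Base.describe(int).
    string describe(string s) { return "Hiding.describe(string)"; }
}

class Merging : Base
{
    alias describe = Base.describe; // pull the base overloads back in
    string describe(string s) { return "Merging.describe(string)"; }
}

unittest
{
    // Without the alias, the int overload is not visible on the derived type.
    static assert(!__traits(compiles, (new Hiding).describe(1)));
    assert((new Merging).describe(1) == "Base.describe(int)");
    assert((new Merging).describe("x") == "Merging.describe(string)");
}
```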

As to whether we _should_ make it an error to override one but not the other... I don't know. It seems highly unlikely that it would make sense to override one and not the other if any behavioral changes are being made (though you could certainly make one do something like log without caring about the other), and the risk of bugs when you override one but not the other would be high. So, requiring that either both or neither be overridden would probably be worth it even if it were annoying in some cases - especially if an alias were enough in those cases where you didn't really want to override both.

Another consideration is that a base class could define opEquals without defining opCmp while a derived class did define opCmp. And in that case, the odds are much higher that overriding opEquals is unnecessary - though requiring at least an alias could be worth it given the risk of bugs. Certainly, it's something to think about.

toHash has similar issues if both it and opEquals are defined, though it makes far more sense to override opEquals without overriding toHash than it makes to override opEquals without overriding opCmp.

- Jonathan M Davis



May 15, 2019
On Wed, May 15, 2019 at 09:56:53AM +0100, Steven Schveighoffer via Digitalmars-d wrote:
> On 5/14/19 9:36 PM, Eduard Staniloiu wrote:
> > Jonathan's question got us to the point raised: maybe it doesn't
> > make much sense to be able to compare two `ProtoObjects`, so maybe
> > you shouldn't be able to. This would change the interface to
> > ```
> > interface Ordered(T)
> > {
> >      int opCmp(scope const T rhs);
> > }
> > ```
> > 
> > Now the attributes of `opCmp` will be inferred.
> 
> Just wanted to make sure you understand this is not the case. opCmp in this instance is a virtual call, and will NOT have attributes inferred.
> 
> There isn't really a way to define an interface for this, nor do you need to.
> 
> Just define the opCmp you want in your own interface/base object, and then you can compare those. Almost nobody wants to compare 2 completely unrelated objects.
[...]

+1.  *This* is the right way to go.  Forget about using interfaces or other such inferior hacks; D has powerful compile-time introspection, why aren't we taking full advantage of it??  Let the user define their own base class (or interface) with whatever definition of opCmp they wish to have. This solves a host of issues:

1) How to define opCmp in a way that satisfies everyone: some people want int opCmp, some want float opCmp, etc. Why make the decision for them? Let them decide themselves which version their opCmp wants. Pass the buck to the user.

2) How to attribute opCmp in a non-restrictive way: pass the buck to the user.

3) How to compare two ProtoObjects?  If the user wants to compare two disparate objects, let them define their own common base class with an appropriate version of opCmp. Pass the buck to the user.

4) What if this doesn't work? (I.e., the user has two objects from two completely unrelated, opaque, binary-only libraries whose opCmp's are not compatible with each other.)  Easy, you already know nothing about the two objects, and since they are unrelated they don't have any meaningfully-comparable state anyway, so just wrap them in a struct whose opCmp just compares their respective pointer values:

	struct ComparableProtoObject {
		ProtoObject payload;
		int opCmp(in ComparableProtoObject o) @trusted {
			auto a = cast(void*)payload;
			auto b = cast(void*)o.payload;
			return (a < b) ? -1 : (a > b) ? 1 : 0;
		}
	}

IOW, pass the buck to the user.

5) What about inheritance? See (2).  Pass the buck to the user.

6) What about AA's?  For something to be hashable, you need .toHash, .opEquals, and perhaps .opCmp (depending on whether the AA's buckets need stuff to be orderable).  So either the user creates a Hashable interface for their objects with appropriate definitions of toHash, opEquals, and opCmp, or see (4). IOW, pass the buck to the user.
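
As a sketch of the AA point, here is a struct key that supplies its own toHash and opEquals, using the signatures the language spec shows for struct keys. The `Point` type and its `label` field are hypothetical; as far as I know the current built-in AA needs only these two functions, but treat that as an assumption rather than a guarantee.

```d
struct Point // hypothetical key type
{
    int x, y;
    string label; // deliberately ignored by equality and hashing

    size_t toHash() const @safe pure nothrow
    {
        return (cast(size_t) x * 31) ^ cast(size_t) y;
    }

    bool opEquals(ref const Point rhs) const @safe pure nothrow
    {
        return x == rhs.x && y == rhs.y;
    }
}

unittest
{
    int[Point] hits;
    hits[Point(1, 2, "first")] = 10;
    hits[Point(1, 2, "second")] += 5; // same key: labels are ignored
    assert(hits.length == 1);
}
```

The important property is that keys which compare equal also hash equal; the rest is up to the user.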


The user is not an idiot; give him the tools to do what he wants instead of making decisions for him and handing it down from on high.
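
The wrapper from item (4) can be tried today by substituting `Object` for `ProtoObject`, which does not exist yet. This is only a sketch under that assumption; `ComparableObject` is a made-up name, and the resulting order is arbitrary and only meaningful while the objects stay at the same address.

```d
struct ComparableObject // stand-in for ComparableProtoObject above
{
    Object payload;

    int opCmp(const ComparableObject o) const @trusted
    {
        // Compare the addresses of the referenced objects, nothing else.
        auto a = cast(const(void)*) payload;
        auto b = cast(const(void)*) o.payload;
        return (a > b) - (a < b);
    }
}

unittest
{
    auto x = ComparableObject(new Object), y = ComparableObject(new Object);
    assert((x < y) != (x > y));   // distinct objects: exactly one holds
    assert(!(x < x) && !(x > x)); // an object is never before itself
}
```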


T

-- 
This sentence is false.
May 15, 2019
On 5/14/2019 2:06 PM, Mike Franklin wrote:
> On Tuesday, 14 May 2019 at 20:36:08 UTC, Eduard Staniloiu wrote:
> 
>> Should `opCmp` return a float?
>>
>> The reason: when we attempt to compare two types that aren't comparable (an unordered relationship) we can return float.NaN. Thus we can differentiate between a valid -1, 0, 1 and an invalid float.NaN comparison.
> 
> Seems like a job for an enum, not a float or an integer.

D used to support the 4 states with floating point comparisons. There was even a set of operators for every case. Zero people used them. It was eventually deprecated, sat there for years, and finally removed. (It was proposed for C, and rejected, and C++ ignored it.)

    https://www.digitalmars.com/d/1.0/expression.html#floating_point_comparisons

Not a single user spoke up for it.

Here's how people write code for unordered cases:

    if (isNaN(f) || isNaN(g)) // deal with unordered cases
       ...
    else if (f < g) // only ordered cases considered here
       ...

I've seen no evidence that anyone would be interested in 4 state comparisons, and that's over two decades (yes, I implemented it in Digital Mars C++!).

I recommend we not waste time on this.
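
For completeness, the pattern above as a compilable sketch. `relate` is a hypothetical function name; `isNaN` is the one from std.math.

```d
import std.math : isNaN;

// Hypothetical helper: handle the unordered case first, then the ordered ones.
string relate(double f, double g)
{
    if (isNaN(f) || isNaN(g)) // deal with unordered cases
        return "unordered";
    else if (f < g)           // only ordered cases considered here
        return "less";
    else if (f > g)
        return "greater";
    return "equal";
}

unittest
{
    assert(relate(1.0, 2.0) == "less");
    assert(relate(double.nan, 2.0) == "unordered");
}
```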

May 15, 2019
On Wed, May 15, 2019 at 10:37:10AM -0700, Walter Bright via Digitalmars-d wrote: [...]
> I've seen no evidence that anyone would be interested in 4 state comparisons, and that's over two decades (yes, I implemented it in Digital Mars C++!).
> 
> I recommend we not waste time on this.

In one of my projects, I wrote a simple integer set type, which obviously included a test for the subset relation.  Originally I wanted to use opCmp for this purpose, but couldn't, because I was unaware of the possibility of opCmp returning a float, and therefore couldn't represent the incomparable state.

Now that I know about this possibility, I still decided against using opCmp for the subset relation, because opCmp draws a sharp distinction between strictly-less vs. less-or-equal, i.e.:

	x.opCmp(y) <= 0

requires deciding whether x was a strict subset of y so that opCmp knew whether to return -1 or 0, but in the end that extra work is thrown away anyway because the difference is ignored by the <= 0.  Computing the difference between -1 and 0 was useless work, even though the definition of opCmp required it.

So either way, implementing a custom .isSubsetOf() member function was a far better solution than trying to make any use of opCmp's support for partial orders.  And that's not to mention people's expectation when seeing an expression like x < y in code; I'd wager 99.999% of people would immediately think "linear order" rather than "partial order". Changing the meaning of < to a partial order sounds like borderline operator overloading abuse IMO, much as I like the concept of not arbitrarily limiting user options.
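
A minimal sketch of the named-member-function approach, assuming a bitmask-backed set of small integers. This is not the actual type from my project; `SmallIntSet` and `of` are invented for the example.

```d
struct SmallIntSet // hypothetical set of integers in the range 0 .. 63
{
    ulong bits; // bit i set means i is a member

    static SmallIntSet of(int[] values...)
    {
        SmallIntSet s;
        foreach (v; values)
        {
            assert(v >= 0 && v < 64, "element out of range");
            s.bits |= 1UL << v;
        }
        return s;
    }

    // One question, one answer: no need to tell strict from non-strict.
    bool isSubsetOf(const SmallIntSet other) const
    {
        return (bits & ~other.bits) == 0;
    }
}

unittest
{
    auto a = SmallIntSet.of(1, 2), b = SmallIntSet.of(1, 2, 3), c = SmallIntSet.of(4);
    assert(a.isSubsetOf(b) && !b.isSubsetOf(a));
    assert(!a.isSubsetOf(c) && !c.isSubsetOf(a)); // incomparable, no NaN needed
}
```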


T

-- 
MAS = Mana Ada Sistem?