split up cmp? - D Programming Language Discussion Forum

Forums

New users
- Learn
Community
- General
- Announce
Improvements
- DIP Ideas
- DIP Devel.
Ecosystem
- GDC
- LDC
- Debuggers
- IDEs
- DWT
Development
- Internals
- Issues
- Beta
- DMD
- Phobos
- Druntime
- Study
Turkish
- Genel
- Duyuru

Index » D » split up cmp?

Thread overview

split up cmp?
Sep 18, 2003 Hauke Duden
Sep 18, 2003 Daniel Yokomiso
Sep 18, 2003 Hauke Duden
Sep 18, 2003 Philippe Mori
Sep 18, 2003 Hauke Duden
Sep 18, 2003 Philippe Mori
Sep 19, 2003 Sean L. Palmer
Sep 19, 2003 Hauke Duden
Sep 19, 2003 Philippe Mori
Sep 18, 2003 Sean L. Palmer
Sep 18, 2003 Hauke Duden
Sep 18, 2003 Daniel Yokomiso
Sep 19, 2003 Hauke Duden
Sep 19, 2003 Daniel Yokomiso
Sep 19, 2003 Hauke Duden
Sep 19, 2003 Philippe Mori
Sep 18, 2003 John Boucher

September 18, 2003

Posted by Hauke Duden

Hauke Duden

I recently thought about the way D handles operator overloading and I noticed that the current implementation will not work well in a mathematical context.

In math you often work with sets of objects that are only partially ordered. You may be able to determine that a < b or b > a, but if none of these two things apply it doesn't necessarily mean that a==b.

For example, consider the set of integer tuples.

The following definitions apply:
(x1,y1) < (x2,y2) <=> x1<x2 && y1<y2
(x1,y1) > (x2,y2) <=> x1>x2 && y1>y2

The problem now is that you cannot use D's operator overloading for such tuples because there is no way to implement cmp properly.

Example:

You have the following code:

Tuple t1(2,2);
Tuple t2(1,3);

if( t1 < t2 )
	doStuff();

The compiler translates the comparison to a call of t1.cmp(t2).

But how to implement cmp for integer tuples?
- t1 is not < t2 since 3>2
- t2 is not > t2 since 1<2
- but t1 is also != t2 !!

There is not return value defined for such cases.

I really think it is necessary to have separate functions for <, <=, >, >= comparisons. That way also partially ordered objects could be implemented, since then it would be possible to express "t1 is not < t2" without having to state that "t1 >= t2" or "t1==t2".

Hauke

September 18, 2003

Re: split up cmp?

Posted by Daniel Yokomiso
in reply to Hauke Duden

Daniel Yokomiso

Posted in reply to Hauke Duden

"Hauke Duden" <H.NS.Duden@gmx.net> escreveu na mensagem news:bkbvpd$2bld$1@digitaldaemon.com...
> I recently thought about the way D handles operator overloading and I noticed that the current implementation will not work well in a mathematical context.
>
> In math you often work with sets of objects that are only partially ordered. You may be able to determine that a < b or b > a, but if none of these two things apply it doesn't necessarily mean that a==b.
>
> For example, consider the set of integer tuples.
>
> The following definitions apply:
> (x1,y1) < (x2,y2) <=> x1<x2 && y1<y2
> (x1,y1) > (x2,y2) <=> x1>x2 && y1>y2
>
> The problem now is that you cannot use D's operator overloading for such tuples because there is no way to implement cmp properly.
>
> Example:
>
> You have the following code:
>
> Tuple t1(2,2);
> Tuple t2(1,3);
>
> if( t1 < t2 )
> doStuff();
>
> The compiler translates the comparison to a call of t1.cmp(t2).
>
> But how to implement cmp for integer tuples?
> - t1 is not < t2 since 3>2
> - t2 is not > t2 since 1<2
> - but t1 is also != t2 !!
>
> There is not return value defined for such cases.
>
> I really think it is necessary to have separate functions for <, <=, >,
>  >= comparisons. That way also partially ordered objects could be
> implemented, since then it would be possible to express "t1 is not < t2"
> without having to state that "t1 >= t2" or "t1==t2".
>
> Hauke

"cmp" expresses total ordering, which is a relationship between the ordering operators and the comparison operator. Most people expect total ordering, IME it's the most common case. If D implemented both total ordering and partial ordering, a possible solution:


interface PartiallyOrdered {
    bool isLesser(PartiallyOrdered other);
    bool isGreater(PartiallyOrdered other);
}
interface TotallyOrdered : PartiallyOrdered {
    bool isLesser(TotallyOrdered other)
    out (result) {
        if (!result) {
            assert(this == other || this.isGreater(other));
        }
    };
    bool isGreater(TotallyOrdered other)
    out (result) {
        if (!result) {
            assert(this == other || this.isLesser(other));
        }
    };
}


we should need to decide between these classes to model our classes, but people should need to be specially attent to not do things like:


if (a <= b) {
    printf("a is not greater than b\r\n");
} else {
    printf("a is greater than b\r\n");
}


because the expected semantics would be wrong. Partial ordering is a nice concept but IMO people wouldn't expect it. There are lots of things people expect from mathematical operators, for example "+": nobody thinks that it shouldn't be associative and commutative, but it can.

    Best regards,
    Daniel Yokomiso.

"First they ignore you, then they laugh at you, then they fight you, then
you win."
 - Mohandas Gandi


---
Outgoing mail is certified Virus Free.
Checked by AVG anti-virus system (http://www.grisoft.com).
Version: 6.0.518 / Virus Database: 316 - Release Date: 12/9/2003

September 18, 2003

Re: split up cmp?

Posted by Hauke Duden
in reply to Daniel Yokomiso

Hauke Duden

Posted in reply to Daniel Yokomiso

Daniel Yokomiso wrote:
> "cmp" expresses total ordering, which is a relationship between the ordering
> operators and the comparison operator. Most people expect total ordering,
> IME it's the most common case.

Of course, if you're dealing with such Tuple objects you need to be aware that the ordering is only partial. That is an issue with understanding the program, not an issue with the language. Also, that doesn't change the fact that there is an ordering among the objects which cannot currently be expressed with the comparison operators.

Another possible solution just occurred to me. It is not as clean as having different operator functions but maybe easier to implement:

If cmp could simply return a value "undefined" if there is no ordering relation between the two objects, then everything would work. All comparison operators would simply have to return false if they encounter that value.

Hauke

September 18, 2003

Re: split up cmp?

Posted by Philippe Mori
in reply to Hauke Duden

Philippe Mori

Posted in reply to Hauke Duden

> Daniel Yokomiso wrote:
> > "cmp" expresses total ordering, which is a relationship between the
ordering
> > operators and the comparison operator. Most people expect total
ordering,
> > IME it's the most common case.
>
> Of course, if you're dealing with such Tuple objects you need to be aware that the ordering is only partial. That is an issue with understanding the program, not an issue with the language. Also, that doesn't change the fact that there is an ordering among the objects which cannot currently be expressed with the comparison operators.
>
> Another possible solution just occurred to me. It is not as clean as having different operator functions but maybe easier to implement:
>
> If cmp could simply return a value "undefined" if there is no ordering relation between the two objects, then everything would work. All comparison operators would simply have to return false if they encounter that value.
>
> Hauke
>

I already think that cmp might be inefficient in some cases... If a fourth
state is added, it will only be worst... because the compiler would have
to check that at every comparison and typically we won't knows
what to do if the result is undefined (in some case all comparison can
returns false, in other throwing an exception would be preferable)

So we would also need that cmp function could returns different classes according to the comparison... or have 2 cmp functions so that the compiler would know if there is an undefined state and what should we do in such a case (I think we should throw an exception!)

In fact, if you throw an exception for your undefined cases
in Tuple sample then default cmp would be adequate and
no language change would be required... and if performance
really matters (and exception are not a solution), you could
uses your own comparison function... and this would have
the benefit that it will better "document" the code...

Another case where undefined exist is when comparing floatting
points from what I understand because of case like NAN.
I think an exception should be normally thrown but in some
case, we need to be able to have a totally orderred answer
and in those case, NAN should probably be just before -INF.

Another case where undefined could be interesting if for mixed (class) type comparison... and in most case I think the best think would be to throw an exception...

I think we should be able to implement some operators and have the compiler provide others... If we provide < and =, the compiler should be able to provide all others and cmp. If we provide cmp, the compiler can provide <, <=, ==, !=, > and >=.

We can provide the operator we want... the compiler choose the better one... Also we should be able to specialize generic algorithm on weither cmp or <,= is defined...

September 18, 2003

Re: split up cmp?

Posted by Hauke Duden
in reply to Philippe Mori

Hauke Duden

Posted in reply to Philippe Mori

Philippe Mori wrote:
> I already think that cmp might be inefficient in some cases... If a fourth
> state is added, it will only be worst... because the compiler would have
> to check that at every comparison and typically we won't knows
> what to do if the result is undefined (in some case all comparison can
> returns false, in other throwing an exception would be preferable)

You shouldn't throw an exception in cmp if the relationship is not defined, because testing for e.g. < is a valid operation. Even if the relationship is not defined, it is still certain that it is not <. So the operator should simply return false.

I think exceptions should only ever be thrown if the two objects are not not of compatible types. And maybe not even then.

About the performance thing: I think a fourth result value wouldn't add any overhead at all.

If you define the result as a bit mask (with convenient constants, of course) everything can be checked with a single compare.

e.g.

enum
{
	UNDEFINED=0
	SMALLER=1
	EQUAL=2
	BIGGER=4
};

Then the < operator could be implemented as
return (cmp() == SMALLER);

and the <= operator as
return (cmp() & (SMALLER | EQUAL));

and so on.

Hauke

September 18, 2003

Re: split up cmp?

Posted by Sean L. Palmer
in reply to Hauke Duden

Sean L. Palmer

Posted in reply to Hauke Duden

We already have such a thing for floating point numbers.  And for floats there are operators defined to test for unordered.

Maybe if cmp returns an enum: less = -1, equal = 0, greater = 1, unordered = 0x80000000

Sean

"Hauke Duden" <H.NS.Duden@gmx.net> wrote in message news:bkc6o9$2kpc$1@digitaldaemon.com...
> Daniel Yokomiso wrote:
> > "cmp" expresses total ordering, which is a relationship between the
ordering
> > operators and the comparison operator. Most people expect total
ordering,
> > IME it's the most common case.
>
> Of course, if you're dealing with such Tuple objects you need to be aware that the ordering is only partial. That is an issue with understanding the program, not an issue with the language. Also, that doesn't change the fact that there is an ordering among the objects which cannot currently be expressed with the comparison operators.
>
> Another possible solution just occurred to me. It is not as clean as having different operator functions but maybe easier to implement:
>
> If cmp could simply return a value "undefined" if there is no ordering relation between the two objects, then everything would work. All comparison operators would simply have to return false if they encounter that value.
>
> Hauke

September 18, 2003

Re: split up cmp?

Posted by Hauke Duden
in reply to Sean L. Palmer

Hauke Duden

Posted in reply to Sean L. Palmer

Sean L. Palmer wrote:
> We already have such a thing for floating point numbers.  And for floats
> there are operators defined to test for unordered.

Really? Is this described anywhere? The docs on digitalmars.com seem to be unfinished when it comes to this:

"""
In addition to the usual < <= > >= == != comparison operators, D adds more that are specific to floating point. These are [blah, blah, blah] and match the semantics for the NCEG extensions to C.

[insert table here]
"""

Doesn't contain all that much information ;).

Hauke

September 18, 2003

Re: split up cmp?

Posted by John Boucher
in reply to Hauke Duden

John Boucher

Posted in reply to Hauke Duden

In article <bkbvpd$2bld$1@digitaldaemon.com>, Hauke Duden says...
>
>I recently thought about the way D handles operator overloading and I noticed that the current implementation will not work well in a mathematical context.
>
>In math you often work with sets of objects that are only partially ordered. You may be able to determine that a < b or b > a, but if none of these two things apply it doesn't necessarily mean that a==b.
>
>For example, consider the set of integer tuples.
>
>The following definitions apply:
>(x1,y1) < (x2,y2) <=> x1<x2 && y1<y2
>(x1,y1) > (x2,y2) <=> x1>x2 && y1>y2
>
>The problem now is that you cannot use D's operator overloading for such tuples because there is no way to implement cmp properly.
>
>Example:
>
>You have the following code:
>
>Tuple t1(2,2);
>Tuple t2(1,3);
>
>if( t1 < t2 )
>	doStuff();
>
>The compiler translates the comparison to a call of t1.cmp(t2).
>
>But how to implement cmp for integer tuples?
>- t1 is not < t2 since 3>2
>- t2 is not > t2 since 1<2
>- but t1 is also != t2 !!
>
>There is not return value defined for such cases.
>
>I really think it is necessary to have separate functions for <, <=, >,
> >= comparisons. That way also partially ordered objects could be
>implemented, since then it would be possible to express "t1 is not < t2" without having to state that "t1 >= t2" or "t1==t2".
>
>Hauke
>
>

I'm pretty sure I don't understand what you're talking about, but then of course
the compiler doesn't either.
Only _you_ know what these values relate to, and I can have another
interpetation. _You_ have to tell me (and the compiler) how these things relate.
To _me_ I'd say that these values are points on the Cartesian Plane and I don't
know which you intend to have be "lesser" or "greater" or if that even applies.
In some instances I could consider that the point closer to the X axis is
"lesser", while in others I could consider that the point closer to the Y axis
is "lesser". It depends on what I want to do with the values. And my program
would specify my intentions, I can't rely on the language to have a construct
for each and every thing I want to do.

The act of programming is to _apply_ the language to a problem. We should not expect (or want) a language to have specialized tools for everything.

<aside>
This also brings to mind a situation I was working on a few weeks ago dealing
with strings and applying greaterthan and lesserthan operators thereto.

It occurred to me that s1 >= s2 if s2 can be "subtracted" from s1. Certainly this is true of numbers, so why not strings?

Given that D defines +, and Perl has both + (well, .) and * (well, x) string
operators, I proceded to define -, /, and %.

I then proceded to define the comparison operators around this concept. Would anyone else in this forum want such a concept in their language? I doubt it, so I can make a string class that does it, but it should _not_ be built into the language.

However, I likewise am unsure that the comparison operators should have been
defined for strings in D the way they are.
With numbers, the comparison operators can be used for two purposes; determining
order, and determining whether or nor subtraction will succeed (without going
negative). But with strings you can't have one operator that does both, so there
should be no operator to do either. It's both or neither for me.
</aside>

<soapbox>
This gets back to the situation with the assignment operator; in most earlier
languages the equal sign was used for both assignment _and_ test for equality
and so the meaning was determined by context -- in an if statement (or other
test)it's a test for equality, otherwise it's assignment.
Now C came along and wanted to allow assignment in tests so they had to have two
separate operators, what they _should_ have done was to define two _new_
operators, one for each meaning and done away with the bare equal sign as an
operator. But they didn't, causing so many bugs and wasted time. And we're still
debating the situation thirty years later! The bare equal sign should be both or
neither.
</soapbox>

John Boucher
The King had Humpty pushed.

September 18, 2003

Re: split up cmp?

Posted by Philippe Mori
in reply to Hauke Duden

Philippe Mori

Posted in reply to Hauke Duden

>
> You shouldn't throw an exception in cmp if the relationship is not defined, because testing for e.g. < is a valid operation. Even if the relationship is not defined, it is still certain that it is not <. So the operator should simply return false.
>

But what will happen in case where everything must be ordered (like it is the case with C++ STL std::less when used for a std::set or std::map for example)? Adding an undefined would cause the undefined item to appears anywhere and the container would be invalid... So anytime someone want to add an item in a sorted container, undefined order is not acceptable...

So IMO, we should not support that forth state for any user defined comparison...

> I think exceptions should only ever be thrown if the two objects are not not of compatible types. And maybe not even then.
>

It should be possible to prevent unintended comparison and IMO by default we should not allows any mixed type comparison...

> About the performance thing: I think a fourth result value wouldn't add any overhead at all.
>

And what do you do when that forth state is not acceptable (IMO
most of the time)?

One case where there would be overhead is in a case like that:

if (a < b)
{
}
else if (a ==b)
{
}
else if (a > b)
{
}
else
{
}

If no undefined state is possible, the compiler can easily optimize the third test away and remove unreacheable code in the last else...

IMO, undefined is a hole in the language as it does not allows to be able to ensure that something is totally ordered. The only solution that I see is to have 2 posibilities:

enum CmpResult { smaller = -1, equal, greater };
enum CmpResultWithUndefined { undefined = -2, smaller, equal, greater };

CmpResult cmp(A a1, A a2)    // Undefined not supported
CmpResultWithUndefined cmp(A a1, A a2)    // Undefined supported

And it would be not possible when cmp is virtual to changes the result type to add it but only to remove it...

> If you define the result as a bit mask (with convenient constants, of course) everything can be checked with a single compare.
>
> e.g.
>
> enum
> {
> UNDEFINED=0
> SMALLER=1
> EQUAL=2
> BIGGER=4
> };
>
> Then the < operator could be implemented as
> return (cmp() == SMALLER);
>
> and the <= operator as
> return (cmp() & (SMALLER | EQUAL));
>
> and so on.
>
> Hauke
>

September 18, 2003

Re: split up cmp?

Posted by Daniel Yokomiso
in reply to Hauke Duden

Daniel Yokomiso

Posted in reply to Hauke Duden

"Hauke Duden" <H.NS.Duden@gmx.net> escreveu na mensagem news:bkc6o9$2kpc$1@digitaldaemon.com...
> Daniel Yokomiso wrote:
> > "cmp" expresses total ordering, which is a relationship between the
ordering
> > operators and the comparison operator. Most people expect total
ordering,
> > IME it's the most common case.
>
> Of course, if you're dealing with such Tuple objects you need to be aware that the ordering is only partial. That is an issue with understanding the program, not an issue with the language. Also, that doesn't change the fact that there is an ordering among the objects which cannot currently be expressed with the comparison operators.
>
> Another possible solution just occurred to me. It is not as clean as having different operator functions but maybe easier to implement:
>
> If cmp could simply return a value "undefined" if there is no ordering relation between the two objects, then everything would work. All comparison operators would simply have to return false if they encounter that value.
>
> Hauke

While I went to work I realized another reason why partial ordering shouldn't be modeled as a single trait. While a object can have at most one total ordering relation, therefore this trait can be modelled as a single method (or set of methods), any object can have multiple partial ordering relations. Using your int pair example, we have:

1. Pairwise
(x1,y1) < (x2,y2)   iff   (x1 < x2) and (y1 < y2)
(x1,y1) > (x2,y2)   iff   (x1 > x2) and (y1 > y2)

2. Sum
(x1,y1) < (x2,y2)   iff   (x1 + y1) < (x2 + y2)
(x1,y1) > (x2,y2)   iff   (x1 + y1) > (x2 + y2)

3. Norm
(x1,y1) < (x2,y2)   iff   sqrt(x1 + y1) < sqrt(x2 + y2)
(x1,y1) > (x2,y2)   iff   sqrt(x1 + y1) > sqrt(x2 + y2)


and so on. IME we should never model something that isn't as an unique trait, in this case a single "cmp" method. A solution would use object views:


a.pairWise < b.pairWise
a.sum < b.sum
a.norm < b.norm


    Best regards,
    Daniel Yokomiso.

"1. Out of clutter, find simplicity.
 2. From discord, find harmony.
 3. In the middle of difficulty lies opportunity."
 - Albert Einstein


---
Outgoing mail is certified Virus Free.
Checked by AVG anti-virus system (http://www.grisoft.com).
Version: 6.0.518 / Virus Database: 316 - Release Date: 11/9/2003

Top | Forum index | About this forum

Copyright © 1999-2021 by the D Language Foundation