split up cmp? (page 2) - D Programming Language Discussion Forum

Forums

New users
- Learn
Community
- General
- Announce
Improvements
- DIP Ideas
- DIP Devel.
Ecosystem
- GDC
- LDC
- Debuggers
- IDEs
- DWT
Development
- Internals
- Issues
- Beta
- DMD
- Phobos
- Druntime
- Study
Turkish
- Genel
- Duyuru

Index » D » split up cmp? (page 2)

September 19, 2003

Re: split up cmp?

Posted by Sean L. Palmer
in reply to Philippe Mori

Sean L. Palmer

Posted in reply to Philippe Mori

Check out std::multimap in C++ STL.

Some would want to #define LARGER BIGGER

Maybe there should be a more direct way of testing and setting flags than & and | operators.  Either some kind of test, set, and reset operators or maybe bitfields.  Or some kind of special flags type that's alot like an enum.

Sean

"Philippe Mori" <philippe_mori@hotmail.com> wrote in message news:bkdccj$18fj$1@digitaldaemon.com...
>
> >
> > You shouldn't throw an exception in cmp if the relationship is not defined, because testing for e.g. < is a valid operation. Even if the relationship is not defined, it is still certain that it is not <. So the operator should simply return false.
> >
>
> But what will happen in case where everything must be ordered (like it is the case with C++ STL std::less when used for a std::set or std::map for example)? Adding an undefined would cause the undefined item to appears anywhere and the container would be invalid... So anytime someone want to add an item in a sorted container, undefined order is not acceptable...
>
> So IMO, we should not support that forth state for any user defined comparison...
>
> > I think exceptions should only ever be thrown if the two objects are not not of compatible types. And maybe not even then.
> >
>
> It should be possible to prevent unintended comparison and IMO by default we should not allows any mixed type comparison...
>
> > About the performance thing: I think a fourth result value wouldn't add any overhead at all.
> >
>
> And what do you do when that forth state is not acceptable (IMO
> most of the time)?
>
> One case where there would be overhead is in a case like that:
>
> if (a < b)
> {
> }
> else if (a ==b)
> {
> }
> else if (a > b)
> {
> }
> else
> {
> }
>
> If no undefined state is possible, the compiler can easily optimize the third test away and remove unreacheable code in the last else...
>
> IMO, undefined is a hole in the language as it does not allows to be able to ensure that something is totally ordered. The only solution that I see is to have 2 posibilities:
>
> enum CmpResult { smaller = -1, equal, greater };
> enum CmpResultWithUndefined { undefined = -2, smaller, equal, greater };
>
> CmpResult cmp(A a1, A a2)    // Undefined not supported
> CmpResultWithUndefined cmp(A a1, A a2)    // Undefined supported
>
> And it would be not possible when cmp is virtual to changes the result type to add it but only to remove it...
>
> > If you define the result as a bit mask (with convenient constants, of course) everything can be checked with a single compare.
> >
> > e.g.
> >
> > enum
> > {
> > UNDEFINED=0
> > SMALLER=1
> > EQUAL=2
> > BIGGER=4
> > };
> >
> > Then the < operator could be implemented as
> > return (cmp() == SMALLER);
> >
> > and the <= operator as
> > return (cmp() & (SMALLER | EQUAL));
> >
> > and so on.
> >
> > Hauke
> >
>
>

September 19, 2003

Re: split up cmp?

Posted by Hauke Duden
in reply to Philippe Mori

Hauke Duden

Posted in reply to Philippe Mori

Philippe Mori wrote:
> But what will happen in case where everything must be ordered (like
> it is the case with C++ STL std::less when used for a std::set or
> std::map for example)? Adding an undefined would cause the
> undefined item to appears anywhere and the container would be
> invalid... So anytime someone want to add an item in a sorted
> container, undefined order is not acceptable...

I believe that is not really an issue. If a container requires total ordering then you cannot use it with partially ordered objects or else the results are undefined.

It is up to the programmer to make sure that an object complies with the requirements of storage classes. For example, I could easily implement a class whose < operator always returns false. That would probably have an adverse effect on container classes. However, does that mean that the language has to have a rule that makes it impossible to return false from a < operator?

It is up to the programmer to ensure that the components he uses work together. That is not the job of the language.

>>About the performance thing: I think a fourth result value wouldn't add
>>any overhead at all.
>>
> 
> And what do you do when that forth state is not acceptable (IMO
> most of the time)?

What do you mean? It is not a problem at all to implement the comparison operators with an undefined state (see my previous post).

> One case where there would be overhead is in a case like that:
> 
> if (a < b)
> {
> }
> else if (a ==b)
> {
> }
> else if (a > b)
> {
> }
> else
> {
> }
> 
> If no undefined state is possible, the compiler can easily optimize the
> third test away and remove unreacheable code in the last else...

If you write code like that then obviously you think that you are dealing with partially ordered objects. Otherwise, what would you write in that last else-block?

> IMO, undefined is a hole in the language as it does not allows to be
> able to ensure that something is totally ordered. The only solution
> that I see is to have 2 posibilities:

Why does total ordering have to be a requirement? It is a fact that not all kinds of objects can have a total ordering.

Hauke

September 19, 2003

Re: split up cmp?

Posted by Hauke Duden
in reply to Daniel Yokomiso

Hauke Duden

Posted in reply to Daniel Yokomiso

Daniel Yokomiso wrote:
> While I went to work I realized another reason why partial ordering
> shouldn't be modeled as a single trait. While a object can have at most one
> total ordering relation, therefore this trait can be modelled as a single
> method (or set of methods), any object can have multiple partial ordering
> relations. Using your int pair example, we have:

Actually, objects can also have multiple total orderings. The best example for this are strings, which can be ordered by different criteria. Digits could be regarded as smaller or bigger than alpha characters, for example. This is also not a theoretical thing, by the way. Different countries DO use different default string orderings.

It is up to the programmer to decide which possible definition for <,=,> he uses. That is true for partially ordered objects as well as for totally ordered ones.

Hauke

September 19, 2003

Re: split up cmp?

Posted by Daniel Yokomiso
in reply to Hauke Duden

Daniel Yokomiso

Posted in reply to Hauke Duden

In article <bkek1a$4q3$1@digitaldaemon.com>, Hauke Duden says...
>
>Daniel Yokomiso wrote:
>> While I went to work I realized another reason why partial ordering shouldn't be modeled as a single trait. While a object can have at most one total ordering relation, therefore this trait can be modelled as a single method (or set of methods), any object can have multiple partial ordering relations. Using your int pair example, we have:
>
>Actually, objects can also have multiple total orderings. The best example for this are strings, which can be ordered by different criteria. Digits could be regarded as smaller or bigger than alpha characters, for example. This is also not a theoretical thing, by the way. Different countries DO use different default string orderings.

I know this I live in a country with different default string ordering ;)

But I disagree with your example. Strings are sequences of characters, and characters are encodings, so an ASCII string is different from a EBCDIC string, even if they both are printed as "abcde". So the ordering is correctly defined, sequences are ordered by element and by length, while the characters are ordered according to the regional rules.

You are right about possible multiple total orderings (but I can't think of an example now), but that is a problem of every system that contains a subtype relationship. An example is the hierarchy of number types, usually defined as:

complex <- real <- rational <- integer

This "natural" relation isn't the only one possible. We can map the integer "2" to the rational "2", but we also can map the same integer to the rational "1/2". It's a possible morphism, as are others. If we want subtyping we want an implicit "natural" morphism, and we choose the one that is more convenient.

Ditto for total orderings, while theoretically we can have multiple total-orderings we choose the most common one as the default. I don't think that we can decide which partial ordering relation is the most common. For example IMO tuples of ordered elements should be ordered by element (giving a total ordering), as sequences are, instead of using another relation. If you want different semantics use a different type (e.g. define a pair or a point type).

>It is up to the programmer to decide which possible definition for <,=,> he uses. That is true for partially ordered objects as well as for totally ordered ones.

As this kind of definition is an implicit assumption, IMO we shouldn't change the expected "not(a <= b) implies (a > b)" property of the ordering operators. C# has a similar thing, you can overload true and false for your class, so it's possible to use a code where "(b != true) implies not(b)" doesn't hold. There's an quick explanation about it from the horse's mouth: <http://www.devhood.com/messages/message_view-2.aspx?thread_id=72626>. IMO this is breaks lots of assumptions programmers have about how some piece of code should work, and the same goes for partial ordering operators.

>Hauke

Best regards,
Daniel Yokomiso.

"Projects promoting programming in "natural languages" are intrinsically doomed
to fail."
- Edsger W. Dijkstra

September 19, 2003

Re: split up cmp?

Posted by Hauke Duden
in reply to Daniel Yokomiso

Hauke Duden

Posted in reply to Daniel Yokomiso

Daniel Yokomiso wrote:
>>Actually, objects can also have multiple total orderings. The best example for this are strings, which can be ordered by different criteria. Digits could be regarded as smaller or bigger than alpha characters, for example. This is also not a theoretical thing, by the way. Different countries DO use different default string orderings.
> 
> 
> I know this I live in a country with different default string ordering ;)
> 
> But I disagree with your example. Strings are sequences of characters, and
> characters are encodings, so an ASCII string is different from a EBCDIC string,
> even if they both are printed as "abcde". So the ordering is correctly defined,
> sequences are ordered by element and by length, while the characters are ordered
> according to the regional rules.

I didn't just mean different character sets, but also different opinions about which characters are smaller than others.

Usually there are two ways to compare a string. One is to compare the character indices. This is often inadequate when the ordering becomes visible to the user (e.g. in graphical lists and stuff like that). The other is often called "collate" and it applies custom, locale-specific rules as to which characters come first.

That is what I meant. Strings DO have different total orderings and they ARE used in practice. This is nothing inherent to partial orderings, so it cannot be used as an argument as to why partial order shouldn't be supported.

> You are right about possible multiple total orderings (but I can't think of an
> example now), but that is a problem of every system that contains a subtype
> relationship. An example is the hierarchy of number types, usually defined as:
> 
> 
> complex <- real <- rational <- integer
> 
> 
> This "natural" relation isn't the only one possible. We can map the integer "2"
> to the rational "2", but we also can map the same integer to the rational "1/2".
> It's a possible morphism, as are others. If we want subtyping we want an
> implicit "natural" morphism, and we choose the one that is more convenient. 
> 
> Ditto for total orderings, while theoretically we can have multiple
> total-orderings we choose the most common one as the default. I don't think that
> we can decide which partial ordering relation is the most common.

Well, this really has nothing to do with the issue of wether partial ordering should be supported or not (as I have pointed out above), but nevertheless:

The fact that an ordering is partial or not has no influence on wether there is a specific "canonical" ordering. As I tried to point out before, there are objects that also have different natural total orderings, depending on where you live.

And apart from that, I also think that partially ordered objects often DO have a natural ordering as well. For example, the usual ordering for (x,y) tuples is (x1,y1) < (x2,y2) if  x1<y1 && x2<y2. There are others (as has been pointed out before) but this one is common enough to be seen as a "default".

> As this kind of definition is an implicit assumption, IMO we shouldn't change
> the expected "not(a <= b) implies (a > b)" property of the ordering operators.
> C# has a similar thing, you can overload true and false for your class, so it's
> possible to use a code where "(b != true) implies not(b)" doesn't hold. There's
> an quick explanation about it from the horse's mouth:
> <http://www.devhood.com/messages/message_view-2.aspx?thread_id=72626>. IMO this
> is breaks lots of assumptions programmers have about how some piece of code
> should work, and the same goes for partial ordering operators.

Hmmm. I think this is another good point as to why partial orderings should be supported ;).

Maybe the problem is just that we have different basic assumptions about the comparison operators:

I think if a < b yields true then it means "a is smaller than b". And more importantly, if a < b yields false it means "a is not smaller than b" and nothing else.

For you, it seems that a<b == false also means "a is bigger or equal to b". I.e. you automatically apply rules that work for integers.

Maybe it is just that I have more of a mathematical background, but for me that last assumption is not automatically true, unless someone has proven that is so. Or, in a programming context, unless the documentation says so or it is immediately obvious from the type of the object.

Hauke

September 19, 2003

Re: split up cmp?

Posted by Philippe Mori
in reply to Hauke Duden

Philippe Mori

Posted in reply to Hauke Duden

> > But what will happen in case where everything must be ordered (like it is the case with C++ STL std::less when used for a std::set or std::map for example)? Adding an undefined would cause the undefined item to appears anywhere and the container would be invalid... So anytime someone want to add an item in a sorted container, undefined order is not acceptable...
>
> I believe that is not really an issue. If a container requires total ordering then you cannot use it with partially ordered objects or else the results are undefined.
>
> It is up to the programmer to make sure that an object complies with the requirements of storage classes. For example, I could easily implement a class whose < operator always returns false. That would probably have an adverse effect on container classes. However, does that mean that the language has to have a rule that makes it impossible to return false from a < operator?
>
> It is up to the programmer to ensure that the components he uses work together. That is not the job of the language.
>

I do not completly agree with that.. In a language that support design by
contract, I think we should be able to prevent the instanciation of a class
that would not satisfy one container or algorithm requirement if they
can be expressed...

So IMO, we should be able to know from any type which type of comparison are supported:

a) no comparison at all.
b) equality comparison only (equal or not equal, no ordering)
c) fully ordered comparisons.
d) partially ordered comparisons.

Knowing that, it is possible to write a map style container that would be optimized but works properly for any types that support comparison:

a) Error at instanciation - we cannot compare the key
b) Slow lineary search
c) Fast binary search
d) This might be a compromize between b and c
as we would probably be able to uses fast search
if defined and equality comparison otherwise...

> >>About the performance thing: I think a fourth result value wouldn't add any overhead at all.
> >>
> >
> > And what do you do when that forth state is not acceptable (IMO
> > most of the time)?
>
> What do you mean? It is not a problem at all to implement the comparison operators with an undefined state (see my previous post).
>

If I code a container that require fully ordered comparison, I would
not be able to detect thios a compile-time but only at run-time by
an additional test... while currently it is generally assumes that we do
not have an undefined state...

> > One case where there would be overhead is in a case like that:
> >
> > if (a < b)
> > {
> > }
> > else if (a ==b)
> > {
> > }
> > else if (a > b)
> > {
> > }
> > else
> > {
> > }
> >
> > If no undefined state is possible, the compiler can easily optimize the third test away and remove unreacheable code in the last else...
>
> If you write code like that then obviously you think that you are dealing with partially ordered objects. Otherwise, what would you write in that last else-block?
>

The point is that with the addition of the undefined state, the compiler
can do less validation... but in fact this is not really true since the
compiler knows predefined types (like int) and can still warn in such
case and for user type it might even be able to validate it in some
case.. and if not (as it would be the case in C++ with the defiition
is not visible), the compiler would not have known anyway... but
might still warn if it assumes for warning purpose that operators
do what they should and no side-effect occurs (but this would
be more a check for a lint style tool)...

> > IMO, undefined is a hole in the language as it does not allows to be able to ensure that something is totally ordered. The only solution that I see is to have 2 posibilities:
>
> Why does total ordering have to be a requirement? It is a fact that not all kinds of objects can have a total ordering.
>

It is not a requirement to have total ordering... I only want to be able to know if a type pretent to support total ordering or not...

IMO, this can be done by using a distinct return type for both cases
(or maybe 3 or 4 cases if we consider the other two (a and b above)).
In fact this would be a case where extensible enumeration would make
sens (can we extent enum in D - I was thinking so but I do see it in doc
so I think it might be a suggestion I have seen in this newgroup) or
it might some predefined type that have some special rules...

Also, for mixed type comparisons, generally, I do not want to allows
them and/or returns undefined... This might also be decided from
a base class how to handle such comparisons...

    class A;
    class B : A;

    A a;
    B b;

    a < b; // or a == b;

Here I like to be able from class A to choose how mixed type
comparison are handled. I was thinking on using final or virtual
to decide but I realize that both concept are essentially independant.

So IMO, we should either uses a specific return type to tell the
compiler what to do or have some attribute modifiers for the
function to tell about its properties or uses different functions
names or have some class attribute for that  (and the compiler
would knows which conversions are valids... and would
even be able to throw an exception if desired or not
compile...). And I would also like if the compiler allows
us to define < style operator and cmp style one (and
generate the other as required)

Also, in C++, one can do some type_trait for such things
although it is cumbersome... so if the programmer and
the compiler can give more informations on the properties
of types and methods, then it is possible to make the
compilation fails if we want so or select an appropriate
algorithm by specialisation.

Here since D does not support multiple inheritance, we cannot easily associate some trait to a class that way...

September 19, 2003

Re: split up cmp?

Posted by Philippe Mori
in reply to Hauke Duden

Philippe Mori

Posted in reply to Hauke Duden

>
> And apart from that, I also think that partially ordered objects often
> DO have a natural ordering as well. For example, the usual ordering for
> (x,y) tuples is (x1,y1) < (x2,y2) if  x1<y1 && x2<y2. There are others
> (as has been pointed out before) but this one is common enough to be
> seen as a "default".
>

IMO, the default should be :

    < when x1 < x2 || x1 == x2 && y1 < y2
    == when x1 == x2 && y1 == y2
    > otherwise (i.e. when x1 > x2 || x1 == x2 && y1 > y2)

This is easy to implement. Do the check for x. If the result is 0 (equal)
compare y.

>
> > As this kind of definition is an implicit assumption, IMO we shouldn't
change
> > the expected "not(a <= b) implies (a > b)" property of the ordering
operators.
> > C# has a similar thing, you can overload true and false for your class,
so it's
> > possible to use a code where "(b != true) implies not(b)" doesn't hold.
There's
> > an quick explanation about it from the horse's mouth: <http://www.devhood.com/messages/message_view-2.aspx?thread_id=72626>.
IMO this
> > is breaks lots of assumptions programmers have about how some piece of
code
> > should work, and the same goes for partial ordering operators.
>
> Hmmm. I think this is another good point as to why partial orderings should be supported ;).
>

I think that operator should always have the expected meaning and the compiler should have the right to validate it if possible and issues a warning if the test is done incorrectly since the compiler can often validate the code and this will prevent some bugs in complex comparisons...


> Maybe the problem is just that we have different basic assumptions about the comparison operators:
>
> I think if a < b yields true then it means "a is smaller than b". And more importantly, if a < b yields false it means "a is not smaller than b" and nothing else.
>
> For you, it seems that a<b == false also means "a is bigger or equal to b". I.e. you automatically apply rules that work for integers.
>
> Maybe it is just that I have more of a mathematical background, but for me that last assumption is not automatically true, unless someone has proven that is so. Or, in a programming context, unless the documentation says so or it is immediately obvious from the type of the object.
>

But this is what it is expected from most programmer and I think that
the programmer should be able to express is intent through code
instead of documentation as this allows better diagnostic from
the compiler, bugs prevention and code that is documentedby itself
will probably have a documentation that is more correct than a
comment inserted by a developper and not modified laster when the
function changes...

So I thing that we should be able to associate an attribute with
a class and/or with comparing function to indicate what kind
of ordering it offer...

And we could support multiple ordering functions that would be recognized as such and for which we will knows the ordering property. For partial ordering, it would be required to add the modifier (otherwise, it would not be possible to returns the undefined state). Does we have a syntax for such attibutes in D?

int total_ordering cmp(...);    // Total ordering --cannot returns undefined
int partial_ordering cmp(...);

A class cannot have both type with the same name but might have many comparison functions:

int total_ordering cmp_case_insentitive(...);
int total_ordering cmp_c_locale(...);
int total_ordering cmp_international_with_number(...);

total_ordering would be the default for cmp function
and other functions would not have such attribute by default.

It would also be possible to query at compile-time those
attributes such that we can make a sort function that
accept only total_ordering functions and issues a compile
time error (using static assert) if we try to instanciate with
the wrong type (since we would have different specialisation
(again, implicit template instanciation would be usefull -
if not fully automatic at least by a keyword instanciate
and automatic type detection).

Top | Forum index | About this forum

Copyright © 1999-2021 by the D Language Foundation