March 28, 2003
I have to agree with all comments by Sean, Andy, and Patrick.  Like Sean, I'd like a "free" operator overloading mechanism, and the syntax he's showing looks fine to me.

I guess if we had a "free" version, D would still need the method-based version.  It fits Walter's vision of C++ programmers feeling very comfortable in D.

Walter, would using the actual operator rather than an equivalent function name complicate D?

Bill

Sean L. Palmer wrote:
> You have got to the root of the problem, Bill.
> 
> There is no way in D to do an overloaded operator without having a class
> member override some specially named member function.  Thus no module-scope
> ("free") operators are possible.
> 
> You're wrong about C++.  It has these.  They are good.  Preferred in fact to
> making the binary operator part of the class.
> 
> The only reason you'd want a binary operator to be part of a class is if it
> needed access to some private section.  In C++ you use friend for this;  in
> D you put the operators in the same module as the class.  (but what if the
> operator needs private access to *two* classes that are in different
> modules?)
> 
> We should be able to define free operators:
> 
> // example free operator overload
> 
> BigInt operator + (BigInt a, BigInt b)
> {
>     return BigInt.add(a, b);
> }
> 
> As you can tell I also dislike the current D syntax for declaring operators.
> You have to use some specially named member function currently, which means
> you have to remember what the name is for each operator you may want to
> overload.  I prefer the direct approach;  If I want an operator I do not
> want it to have a name.  It just clutters up the namespace and litters it
> with landmines;  for instance the above example wouldn't work because member
> add is treated specially and would have already overloaded the + operator.
> 
> There are some unanswered questions in the D design.
> 
> Sean
> 
> "Bill Cox" <bill@viasic.com> wrote in message
> news:3E84520F.8080809@viasic.com...
> 
>>Ok, I'll kick in with a very very old point of view on the topic that
>>dates back to the beginning of time...
>>
>>Operator overloading should not be implemented as methods.  They should
>>be implemented as overloaded functions at the module level.  Then, this
>>whole problem goes away.  Besides, mathematically, a binary operator's
>>operands have equal status.  Why pass one to a method of the other?
>>It's just a stupid trick to try to avoid writing module-level routines.
>>
>>The fact that we can't even define the == operator without worrying
>>about the LEFT operand being null (but not the right) clearly shows the
>>flaw in C++ style operator overloading.
>>
>>Bill
>>

March 28, 2003

Matthew Wilson wrote:
> Bill, I think you're agreeing with what I'm saying, sort of. I suggested
> making the eq() and eqi() methods static simply because I'm not aware of a D
> mechanism for doing it outside the class, which is obviously the preferred
> approach.
> 
> I'm not quite sure what you mean wrt C++. In C++ one can define operator ==
> as a class member, or as a free function with or without friend access to
> the class. In my opinion (and I think it is the widely accepted preference)
> free functions without friend access is the preferred approach. Where's the
> flaw?

You're right.  My mistake, I just forgot.  For some reason, I haven't seen use of the free version in a while.  It does seem like the preferred approach to me.
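For anyone who, like me, hasn't seen the free version in a while, here is a minimal C++ sketch of a non-member, non-friend operator ==. The `Account` class and its accessors are hypothetical, invented only for illustration; the point is that both operands are ordinary parameters and the operator needs nothing beyond the public interface.

```cpp
#include <string>
#include <utility>

// Hypothetical value type, invented for this illustration.
class Account {
public:
    Account(std::string owner, long cents)
        : owner_(std::move(owner)), cents_(cents) {}

    const std::string& owner() const { return owner_; }
    long cents() const { return cents_; }

private:
    std::string owner_;
    long cents_;
};

// Free operators: neither operand is "this", so both have equal status,
// and only the public accessors are used (no friend declaration needed).
inline bool operator==(const Account& a, const Account& b) {
    return a.owner() == b.owner() && a.cents() == b.cents();
}

inline bool operator!=(const Account& a, const Account& b) {
    return !(a == b);
}
```

With this in place, `Account("ann", 100) == Account("ann", 100)` compares the two values, and the class itself never has to mention the operator.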


March 28, 2003
Man, this is all too nasty. Also, I'm confused by a lot of what is being said. Muddy stuff.


I'm not a great (language) theorist, rather I take a practical perspective. It seems the following can be / is being said:

1. evaluating x == y with x as null (and in many cases with y also) causes
an access violation

2. syntax using == on object types and primitive types looks the same and feels the same, yet isn't and needn't be the same

3. some people are advocating the use of contracts to guard against testing for equivalence on null references (and other contract transgressions)

4. some are advocating a change of mindset, which will enable those of us
that think (1.) is unacceptable to get in the D (or Smalltalk) way of
thinking.


My points are

A. Practical opposition
-----------------------

You can push away on (4.) until the universe reaches entropy death, but you
are not going to convince large parts of the development community that (2.)
is true. If that is the case, why has the (partial solution) of operator
overloading in C# been done, and why has it been broadly welcomed as a big
(but still incomplete) step up from Java's == vs equals(). So whatever is
theoretically correct - assuming that's determinable - doesn't really
matter. It's what is going to work and appeal that matters.

I know this is not a democracy (nor should it be, otherwise we'll all find D has the consistency and performance efficiency of Boost software), but I think it would be appropriate to gather a collective opinion on this. I'll start:

 "I am primarily, but not exclusively, a C/C++ programmer, and think that
the fact that calling == on null references results in a crash is a wart. It
will be a dissuasive factor against my using the language."

B. D is a C-family language
---------------------------

D is a C-family language. It uses { }. It uses = and ==, not, for example, := and =. etc. etc.

The situation, and what is being promulgated by some in this thread, is an inconsistent indulgence.

int    i1    =    . . .
int    i2    =    . . .

if(i1 == i2)

This compares the values of i1 and i2.

SomeClass o1    =    . . .
SomeClass o2    =    . . .

if(o1 === o2)

This compares the values of o1 and o2. Of course, the "values" are the values of the references, rather than the referenced instances. To compare the referenced instances we should write

if(o1 == o2)

That's inconsistent. However, if it works, it's an inconsistency we not only can live with, but one I think we would all want. It's the "if" that's the issue.
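For C++ readers, the identity/equivalence split can be sketched roughly with pointers. This is only an analogy under my own assumptions, not a statement of D's semantics: `SomeClass`, `same_instance` and `same_value` are hypothetical names, with pointer comparison playing the part of === and comparison of the pointed-to objects playing the part of ==.

```cpp
// Hypothetical class, used only to illustrate the two kinds of comparison.
struct SomeClass {
    int value;
};

// Value comparison between two live objects.
inline bool operator==(const SomeClass& a, const SomeClass& b) {
    return a.value == b.value;
}

// Identity (the role of ===): do both pointers designate the same instance?
// Comparing the pointers themselves is safe even when one or both are null.
inline bool same_instance(const SomeClass* a, const SomeClass* b) {
    return a == b;
}

// Equivalence (the role of ==): do the instances hold equal values?
// Both pointers are dereferenced, so the behaviour is undefined if either is null.
inline bool same_value(const SomeClass* a, const SomeClass* b) {
    return *a == *b;
}
```

The second function is where the trouble in this thread lives: it only makes sense when both sides actually refer to an object.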

There are three points of view, I think:

(a) The syntax is the same, but the semantics are different, and that is
intended and acceptable. It is the responsibility of the D
programmer/maintainer to be aware of the situation, and cater to its
requirements in their use of the language.
(b) The semantics are different, so the syntax should be different.
(c) The syntax is the same, and the semantics are the same. It is the
responsibility of the compiler to ensure that meaningful comparisons are
carried out even when the dichotomous nature of references from primitives
can lead to "no" object on one or the other side of equivalence expressions.

Obviously we are at (a). I contend that (a) is the only untenable option. If it's different, and we should know it's different, why does it have the same syntax? Is it a trick? Do we get to wear a badge of honour once we've been bitten by it enough times to know - in the same way one learns to spot the use of == rather than equals() in Java (and I was caught out on that in the last couple of days, despite many years of caffeinating) - when we see it in the code of an unwary neophyte?

(b) would be to remove === entirely, and disallow the use of == on anything
but primitives (otherwise we'd be just where Java is). Object comparison
would have to be via EqualValue() and EqualRef() methods/globals, or
similar. This would be syntactically ugly, but would at least make sense.

Personally, I like (c). I want the inconsistency, because it makes the language succinct, fits with my largely C/C++ background, and also provides the compiler lots of opportunity for fishing out unnecessary null-tests from the generated code. But it has to all make sense, and work in a sensible way. I'm not saying my eq() + eqi() proposal is the way to go - I actually like the idea of non-member operators, or at least static operators - but I am sure that between the many large brains in this newsgroup a robust, elegant and efficient solution can be achieved.

C. Contracts are great, but the world is full of liars!
---------------------------------------------------

(3.) is great up to a point. D makes it even better with the built-in unittests. It's a fab aid to development, and even more so to maintenance. However, this is all a dressed up assert, and as such can never be valid as the complete answer to broader system correctness. Sure it works in the Win32 API, because it is the standard (albeit a wriggling one) on Win32. We can rely on its crashes in a good way. (Arf arf)

But consider that one is building an application infrastructure that loads, say, COM components that can be sourced from arbitrary providers. Is programming to contract, employing assertions the valid approach here? Is it bollocks. M$ tried that with ISAPI for a vanishingly short time, before wrapping all calls to ISAPI extensions with __try __except, since the server came crashing down with crushing regularity. Now I'm *not* saying that this comment necessarily undercuts all the arguments on this topic about null references (though I do believe it should give all some pause), but I see a worrying degree of confidence throughout many of the threads on this group that suggest that the unittest mechanism is the complete answer to robustness. It is not.


Let's have your "statements", and comments on whether (a), (b) or (c) - with
ideas for impl (c) - is the way to go.

Matthew

P.S. It's very early here, so I will immediately resile from any offensive statement herein with abject and grovelling apologies.



"Matthew Wilson" <dmd@synesis.com.au> wrote in message news:b5r9l2$2r2u$1@digitaldaemon.com...
> In the documentation it says that the expression (a == b) is rewritten as
> a.equals(b).
>
> Given that, what happens if a is null? NullPointerException, or is there a "safe-this" mechanism, ie. a.this is automatically checked against null? If
> not automatic, is it possible to write something such as
>
> class X
> {
>   int value;
>
>   int eq(X rhs)
>   {
>     if(this === rhs)
>     {
>       // same instance
>       return 1;
>     }
>     else if(rhs === null)
>     {
>       // nothing to compare against
>       return 0;
>     }
>     else
>     {
>       return this.value == rhs.value;
>     }
>   }
> }
>
> (I expect I've got some syntax wrong, but presumably you know what I mean.)
>
> This implementation then supports
>
> X    x1     =     new X(1);
> X    x2    =    new X(3);
> X    x3    =    null;
>
> if(x1 == x2)  // (i)
> {}
> if(x1 == x3)  // (ii)
> {}
> if(x3 == x2)  // (iii)
> {}
>
> If all this is not supported, what would be the result of (iii) - crash?


March 28, 2003
Matthew Wilson wrote:
> SomeClass o1    =    . . .
> SomeClass o2    =    . . .
> 
> if(o1 === o2)
> 
> This compares the values of o1 and o2. Of course, the "values" are the
> values of the references, rather than the referenced instances. 

Why??? Has "another truth" become so obvious for you? More below.

> To do that we should write
> 
> if(o1 == o2)
> 
> That's inconsistent. However, if it works, it's an inconsistency we not only
> can live with, but one I think we would all want. It's the "if" that's the
> issue.

Wait! It is not necessarily inconsistent. You forget that in D an Object is a much more abstract concept than in C++. All of the changes made in D compared to C++ objects are targeted to be (mostly) able to think of an object just as of an "object", not as of a reference. An "object" is a user-defined type which supports Darwin's theory, and I'm not really sure it always has to be implemented as a reference. Anyway, the idea is to abstract away from references, pointers, and such. Yes, it works due to references. But don't we all think of other things when saying something different? And that's what C++ style so-called "consistency" advocates are probably especially proficient at, even to an extent that this lie replaces the truth for them.

Now, you are comparing 2 rabbits. Rabbits are objects. Oh, you don't want to accidentally compare them by their properties instead of their position? I see. ;>

Please don't take this post too seriously. ;) I know that a perfectly idealistic programmer would never turn to D, as to any C descendant. :>
I have done a more serious post in "identity & equivalence" thread.

-i.

March 28, 2003
> Please don't take this post too seriously. ;) I know that a perfectly idealistic programmer would never turn to D, as to any C descendant. :> I have done a more serious post in "identity & equivalence" thread.

I can't take it seriously. Mate, I _really_ mean no offence, but I have absolutely no idea what you're talking about. Clear as custard.

It may well be my fault, in that I'm no great thinker on language theory and all that, but then very few are, which kind of backs up my point that D should make sense to the masses and not to the theorists.



March 28, 2003
On Wed, 26 Mar 2003 15:17:36 +1100, "Matthew Wilson" <dmd@synesis.com.au> wrote:
>
I am a great admirer of Walter's work (Ever since he personally answered some questions about Zortech C) but I think he's !dead! wrong on this one.

int i;
MyObj o;

1)  i is not a value. i is a symbol that refers to some storage. When I type
   i = 5;
  I am not storing 5 into i, but into the storage that i refers to. One of the glories
of the compiler is that it allows me to ignore this distinction.
Why should not it do the same for o?

2)
   It makes sense to have a pointer that doesn't point to anything, but a reference that doesn't
refer to anything seems illogical. When I type 'MyObj o', am I not creating a reference to
something? Why did I type it then?

3) if r1 and r2 are references, then is 'if (r1==r2)' the same thing as 'if (*r1 == *r2)'
   where r1 and r2 are pointers? What then is the point? Save two asterisks?

4) 'int i' declares a bit of storage for an int. Why does 'MyObj o' just create a symbol
table entry? Are not 'int' and 'MyObj' both datatypes?  The difference here seems
entirely gratuitous.

5) Forcing an exception for uninitialized objects makes as much sense as forcing an exception for an uninitialized int.

I see 2 ways to deal with this:

a) 'MyObj o' creates an actual instance of MyObj. The language could allow access
  to this prototype object or not.

b) 'MyObj o'  creates a reference to a null object. o is immediately a reference to an object (like it claims to be!) but is unlikely to compare with an initialized object.


Karl Bochert




March 28, 2003
> C. Contracts are great, but the world is full of liars!
> ---------------------------------------------------

Contracts _are_ great. The world _is_ full of liars.  I think there is a more important reason contracts are not the solution here.  In most cases all you are going to do here is manually crash a program that would crash anyway. At the same time you are probably annoyingly moving the crash away from the line of code where the problem is.  Contracts might be useful for things like ensuring a function doesn't return a null, but this isn't really the problem we are trying to address here.

> SomeClass o1    =    . . .
> SomeClass o2    =    . . .
>
> if(o1 === o2)
>
> This compares the values of o1 and o2. Of course, the "values" are the values of the references, rather than the referenced instances. To do that we should write
>
> if(o1 == o2)
>
> That's inconsistent. However, if it works, it's an inconsistency we not only
> can live with, but one I think we would all want. It's the "if" that's the issue.

Life would definitely be easier if we weren't all such rotten creatures of habit :-)

> (a) The syntax is the same, but the semantics are different, and that is
> intended and acceptable. It is the responsibility of the D
> programmer/maintainer to be aware of the situation, and cater to its
> requirements in their use of the language.
> (b) The semantics are different, so the syntax should be different.
> (c) The syntax is the same, and the semantics are the same. It is the
> responsibility of the compiler to ensure that meaningful comparisons are
> carried out even when the dichotomous nature of references from primitives
> can lead to "no" object on one or the other side of equivalence expressions.
>
> Obviously we are at (a). I contend that (a) is the only untenable option.

I  have to agree.

> (b) would be to remove === entirely, and disallow the use of == on anything
> but primitives (otherwise we'd be just where Java is). Object comparison
> would have to be via EqualValue() and EqualRef() methods/globals, or
> similar. This would be syntactically ugly, but would at least make sense.

Ugliness is a big turnoff in a language though.  At least it is for me, and I suspect I'm not the only one.  It's completely irrational, and I'm sure I miss a lot of neat things, but there it is.

> Personally, I like (c). I want the inconsistency, because it makes the
> language succinct, fits with my largely C/C++ background,
[break]
> But it has to all make sense, and work in a sensible way.
> I actually like the idea of non-member operators.

I'm forced to agree.  It's the least evil option available :-)

I have a question though, and I hope it's not too stupid :-).  This entire problem is created by the idea of objects as references that has become so popular lately.  What is the rationale behind the object as reference idea as opposed to the C++ idea of objects as objects?  To me the C++ way seems much simpler and more consistent, even with the pointers that are required for any kind of runtime object creation.  It also seems like the C++ way has more potential for offloading a lot of the work that needs to be done from runtime to compile time, resulting in much better performance.



March 28, 2003
Jon Allen wrote:
> I have a question though, and I hope it's not too stupid:-).  This entire
> problem is created by the idea of object as references that has become so
> popular lately.  What is the rationale behind the object as reference idea
> as opposed to the C++ idea of objects as objects?  To me the C++ way seems
> much simpler and more consistent, even with the pointers that are required
> for any kind of runtime object creation.

It's rather the opposite: in D objects are supposed to be "just objects", while in C++ you always have underlying implementation issues. That's also the reason why the default is comparison by content in D, and not by reference.

C++ way being simpler or more consistent is IMO a hallucination.

> It also seems like the C++ way has
> more potential for offloading a lot of the work that needs to be done from
> runtime to compile time, resulting in much better performance.

What makes you think that? [-Eliza]

-i.

March 29, 2003
"Ilya Minkov" <midiclub@8ung.at> wrote in message news:b62neg$1iai$1@digitaldaemon.com...
> Jon Allen wrote:
> > I have a question though, and I hope it's not too stupid:-).  This entire
> > problem is created by the idea of object as references that has become so
> > popular lately.  What is the rationale behind the object as reference idea
> > as opposed to the C++ idea of objects as objects?  To me the C++ way seems
> > much simpler and more consistent, even with the pointers that are required
> > for any kind of runtime object creation.
>
> It's rather the opposite: in D objects supposed to be "just objects", while in C++ you always have underlying implementation issues. That's also the reason, why default is comparison by content in D, and not by reference.

That's patently false.  Take the simplest possible case (the same syntax declares two objects in both languages):

MyObj a;
MyObj b;

Now try to work with them.  Comparisons are what this thread is about so we'll do that:

if(a==b)
{
    //do stuff
}

Both languages say this should compare the content of the variables.

Does this work in C++? Yes.  Always.  No exceptions (okay it might not
always _do_ what you want, but that's something D has to worry about too).
No concern about the underlying implementation.  I don't know, don't care
whether the compiler made them references, threw them directly on the stack,
or did anything else with them.
Does it work in D?  Maybe.  To make sure that it works you have to concern
yourself with the fact that a and b don't describe the actual type that I
want, but that they are in fact cleverly disguised pointers to those
objects.  Underlying implementation becomes very important here and, to my
eyes, needlessly complicates my life.

> > It also seems like the C++ way has
> > more potential for offloading a lot of the work that needs to be done from
> > runtime to compile time, resulting in much better performance.
>
> What makes you think that? [-Eliza]

Why do we have stacks and data segments instead of just a heap?  The data segment can be mapped out at compile time, leaving little to be done at runtime for statics, but D objects can't be included here because an object declaration may refer to any number of object instances, even though it probably will only ever refer to one.  It's even tough (sometimes impossible) to say that it'll ever point to anything at all.

It's also much cheaper to throw crap on the stack than it is to put it in the heap(not to mention easier), but putting objects on the stack in D either takes a pretty impressive compiler, or a lot of work on the user's end which will further expose implementation details and cut portability.

How about the extra step the CPU has to take to dereference the link to that stuff we want to use?  I realize that most classes will probably be doing this a million times anyway from the vtable alone, but I think we all want to see D kick the pants off C++ in speed at least, and every little bit helps.

In C++ these things are no problem, and it's because pointers are pointers and variables are variables.  As opposed to the variables are sometimes pointers, sometimes variables idea.  That's not to mention the likelihood of forgetting those "new"s at least once sometime in any decently sized project, no matter how seasoned you are.  Bugs are performance problems too.

I can also see it taking a lot off the hands of the GC, since it would make it much easier to free a good percentage of data.  It's a lot faster to move a stack pointer or the like than it is to mark and sweep the entire heap.



March 29, 2003
> Ugliness is a big turnoff in a language though.  At least it is for me, and
> I suspect I'm not the only one.  It's completely irrational, and I'm sure I
> miss a lot of neat things, but there it is.

For me also. However, given the choice between ugliness and prettiness_with_inconsistency, then it has to be ugliness. (I keep calling it inconsistency, since the collegial nature of the Digital Mars newsgroups and my own good manners prevent me from using more vivid language.)

To digress for a moment, I am actually intrigued as to why this issue has me so hot under the collar. It's been precipitated by my being nearly all the way through a different article I'm writing about how to efficiently override C#'s Equals() method, in which I observe the nonsensical ways in which both Java and, to a lesser extent, C# handle the identity/equivalence issue, and I had written that D had the issue solved. That prompted the earlier post which then led Farmer to give me the news of the grim reality that we've been debating since.

I don't think this can explain it all, however. I think that, having been away from the D newsgroup for some months, now that I've dived back into D (and will be having more to do with it for a few different writing projects in the foreseeable future) it seems to be much closer to a _real_ language. Last year I thought it was kind of interesting in an intellectual way, but now I can actually imagine doing significant amounts of my work in D and, though I'll be shot by my C++ friends, even preferring D to C++ for some non-trivial tasks.

> I'm forced to agree.  It's the least evil option available :-)
>
> I have a question though, and I hope it's not too stupid:-).  This entire problem is created by the idea of object as references that has become so popular lately.  What is the rationale behind the object as reference idea as opposed to the C++ idea of objects as objects?  To me the C++ way seems much simpler and more consistent, even with the pointers that are required for any kind of runtime object creation.  It also seems like the C++ way has
> more potential for offloading a lot of the work that needs to be done from runtime to compile time, resulting in much better performance.

The answers to that one are endless, I fear.

I think some of it has to do with basing D on GC, which is arguably a good thing, as all the objects then have to be heap-based. This means references, or pointers, or whatever name you want to give the concept, but it means that the thing can be null.

C++ is arguably more flexible, in providing for both types. (And this is recognised by C# in that they place what they call value types on the stack, even though they still new() them. Bit Yucky.) But C++ also has lots of problems as a result of that flexibility.

The fact is, there is no perfect world in S/E, just as in anything else. I am increasingly persuaded that D is a very good local maximum. It may well soon become the most "sensible" language, at least that I'm aware of. But it will never be the perfect language, because such a thing is impossible. There are valid, and widely used, programming paradigms/facilities that are simply incongruent, so all languages must make a compromise in their support for them. (Of course, there are facilities that are omitted by mistake, nothing else. Java has more of these than any language I can think of - anyone think of a reason why Java cannot have const, for example? - but all suffer from them. C++ doesn't allow local functions. C# and Java don't support RAII, and we may justly scoff at them for it. C has all that struct/enum namespace guff.)

Thus, the issue we're talking about is a good example of the great language debate. == in D is a fudge, a con, a trick of the compiler's light, but it is one that we want, since we want to be able to use == on the values of references (for convenience), and we will likely want it for genericity within templates (for power). Assuming everyone can admit that, we can further admit that neither side is correct in an absolute sense.

Given that, we have to consider whether D's == operator serves the aims of D, and its likely success:

1. Is it likely to lead to more or fewer errors in its current guise?
2. Does it serve to encourage/discourage the use of D with C/C++ users? With
Java users? C#? Smalltalk? Delphi? etc.
3. Would changing it unnecessarily complicate the compiler?
4. Can it be altered to include identity in an efficient way?

These questions, apart from (3.), cannot be answered by a single poster to the group. It's something we should all opine on, such that a decision can be made.

I was thinking about this last night, and wondering whether it really was important enough to have stimulated all this debate (a lot of it from me, I grant you). I was thinking through the languages I know, and what they did.

C doesn't have the issue, because things are either pointers or they are
values.
C++ doesn't have it because references cannot be null (unless you're the
victim of some perverse pointer dereferencing trickery...)
Java is the signal example of this issue, and takes the position that
equals() must be applied to all object instance references, and == to all
primitives. This confounds the inexperienced a lot, and still the
experienced occasionally.
I think C# is where we need to look for the answer. They've left out, or
messed up, a great many concepts that C/C++ programmers hold dear,
presumably due to the pressures of keeping the language simple and of
getting it all developed in a reasonable time. However, they've taken the
effort to address the Java inconsistency, in providing overloading for ==.
We have to ask why? It's not just syntactic sugar, to win over C++ diehards,
because it's still nearly as painfully verbose as Java in other aspects. Can
anyone suggest a reason beyond that they don't want to suffer from Java's
identity problem (arf, arf)?

Further evidence that that is their intent is the fact that operators == and != are declared as static members of the class. This allows comparison with instances on one or the other or both sides of an expression, and it also protects against x == y with x (or y) as null. The advised implementation of == is

public static bool operator ==(MyClass x1, MyClass x2)
{
    return Object.Equals(x1, x2);
}

Object's static Equals(o1, o2) tests o1 and o2 for null, before calling the
instance Equals() on one of the operands.
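To make the null handling concrete, here is a minimal C++ sketch of that kind of static, null-safe comparison. It is my own illustration under stated assumptions, not the .NET implementation: `MyClass`, `equals` and `null_safe_equals` are hypothetical names, and raw pointers stand in for references that may be null.

```cpp
// Hypothetical class standing in for MyClass; names are illustrative only.
class MyClass {
public:
    explicit MyClass(int v) : value_(v) {}

    // Instance-level comparison; it may assume `other` is a live object.
    bool equals(const MyClass& other) const { return value_ == other.value_; }

private:
    int value_;
};

// Null-safe comparison in the spirit of a static Equals(o1, o2):
// neither operand is "this", so a null on either side can be handled
// before any member function is called.
inline bool null_safe_equals(const MyClass* a, const MyClass* b) {
    if (a == b) {
        return true;   // same instance, or both null
    }
    if (a == nullptr || b == nullptr) {
        return false;  // exactly one side is null
    }
    return a->equals(*b);
}
```

Because the null checks happen before any instance method runs, a null on the left is no more dangerous than a null on the right, which is exactly the x == y with x as null case that started this discussion.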

So again, why have the promulgators of VB, ASP, Fox, MFC, and all the other dumbed down and offensively simplistic guff spent effort on effecting this sophistication in C#? It's not the last bit of polish of their otherwise perfect language. It's not to win over C++ programmers, who have many other areas on which to wince. I can only surmise that they deemed it sufficiently important, and I suggest that we should do so for D.

Yours enthusiastically

Matthew