null == o? (page 4) - D Programming Language Discussion Forum

March 29, 2003

Posted by Matthew Wilson
in reply to Sean L. Palmer

Agreed.

Although I can understand that Walter may say it's easier for the compiler to parse a named function than C++'s operator syntax.

If that's the objection, then why not have the Python model? There are a limited number of operators in the character set, all of which can be named.

Your example would then be

BigInt __op_add__ (BigInt a, BigInt b)
{



"Sean L. Palmer" <seanpalmer@directvinternet.com> wrote in message news:b624r1$14db$1@digitaldaemon.com...
> You have got to the root of the problem, Bill.
>
> There is no way in D to do an overloaded operator without having a class member override some specially named member function.  Thus no module-scope ("free") operators are possible.
>
> You're wrong about C++.  It has these.  They are good.  Preferred in fact to making the binary operator part of the class.
>
> The only reason you'd want a binary operator to be part of a class is if it needed access to some private section.  In C++ you use friend for this; in D you put the operators in the same module as the class.  (but what if the operator needs private access to *two* classes that are in different modules?)
>
> We should be able to define free operators:
>
> // example free operator overload
>
> BigInt operator + (BigInt a, BigInt b)
> {
>     return BigInt.add(a, b);
> }
>
> As you can tell I also dislike the current D syntax for declaring operators. You have to use some specially named member function currently, which means you have to remember what the name is for each operator you may want to overload.  I prefer the direct approach;  If I want an operator I do not want it to have a name.  It just clutters up the namespace and litters it with landmines;  for instance the above example wouldn't work because member add is treated specially and would have already overloaded the + operator.
>
> There are some unanswered questions in the D design.
>
> Sean
>
> "Bill Cox" <bill@viasic.com> wrote in message news:3E84520F.8080809@viasic.com...
> > Ok, I'll kick in with a very very old point of view on the topic that dates back to the beginning of time...
> >
> > Operator overloading should not be implemented as methods.  They should be implemented as overloaded functions at the module level.  Then, this whole problem goes away.  Besides, mathematically, a binary operator's operands have equal status.  Why pass one to a method of the other? It's just a stupid trick to try to avoid writing module-level routines.
> >
> > The fact that we can't even define the == operator without worrying about the LEFT operand being null (but not the right) clearly shows the flaw in C++ style operator overloading.
> >
> > Bill
> >
>
>

March 29, 2003

Posted by Antti Sykari
in reply to Jon Allen

"Jon Allen" <jallen@minotstateu.edu> writes:

>> C. Contracts are great, but the world is full of liars!
>> ---------------------------------------------------
>
> Contracts _are_ great. The world _is_ full of liars.  I think there is a more important reason contracts are not the solution here.  In most cases all you are going to do here is manually crash a program that would crash anyway. At the same time you are probably annoyingly moving the crash away from the line of code where the problem is.

My impression is that contracts bring the crash to precisely where the
problem is.  Had the contract not been verified, the program might
crash soon (if you're lucky), unexpectedly after a long time (causing
a fun weekend of bug-hunting), or not at all (even worse).
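
A minimal sketch of that difference, written in Python rather than D. The `register`, `register_unchecked` and `dispatch` functions are hypothetical; the precondition is written as an explicit check standing in for a contract.

```python
def register(registry, name, handler):
    # Precondition (the "contract"): a handler must be supplied.
    if handler is None:
        raise ValueError("register: handler must not be None")
    registry[name] = handler


def register_unchecked(registry, name, handler):
    # No contract: a bad value is silently stored.
    registry[name] = handler


def dispatch(registry, name):
    # Calls the stored handler; fails only when a bad one is finally used.
    return registry[name]()


checked, unchecked = {}, {}
failed_at = None
failed_at_unchecked = None

try:
    register(checked, "save", None)          # fails here, at the faulty call
except ValueError:
    failed_at = "register"

register_unchecked(unchecked, "save", None)  # the mistake goes unnoticed
try:
    dispatch(unchecked, "save")              # fails later, far from the cause
except TypeError:
    failed_at_unchecked = "dispatch"
```

With the check, the failure is reported by the call that passed the bad argument; without it, the failure surfaces in `dispatch`, which did nothing wrong.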

-Antti

March 29, 2003

Posted by Antti Sykari
in reply to Matthew Wilson

I found the discussion about references confusing, so here's a short summary of how different languages handle equality. I include Java, C++, C# and Eiffel.  Eiffel, because it also includes the notion of "deep equality", which shows that the value/reference equality thing is not everything.  Eiffel also seems to have a lot of careful thought behind its design.

Leaving aside function pointers, delegates, strings, arrays and other borderline cases, there are two kinds of variables:

- those with value semantics (D: basic types and structs, objects
  in C++ when treated as objects instead of pointers to objects. Primitive
  types in Java. Value types in C#. Expanded types in Eiffel.)
- those with reference semantics (objects in C#, Java, D, Eiffel).

Then there are a couple of useful and often-used equality comparison methods. (And infinitely many others - you can always roll your own if you need a customized equivalence relation between objects and/or values.) Common ones that I can think of are:

1) reference equality: does X refer to the same object as Y?
 - "===" in D, "==" in Java/C# and C++ (if object is a pointer) "=" in Eiffel
 - if a reference type, compare the pointers
 - if a value, then use method 2 (logically, this is nonsense, since
   two different integers never refer to the same integer unless they
   really are the same integer - but so the common thinking seems to go)

2) value equality: does X have the same value as Y?
 - if it's a value, bitwise (or user-defined) comparison
    - "==" in D, Java/C#/C++, "=" in Eiffel
 - if it's a reference type:
   - "==" in C++ (if object is a reference or the thing itself)
   - "==" in D, "X.equals(Y)" in Java, "X.Equals(Y)" in C# but in C#
     you should overload operator==(X, Y) to do this
   - in D that's just a call of Object.eq(other), which does
     whatever the implementation decides (in current Phobos, it's just
     this === other)
   - IMO, should be something like:
     - compare its reference members with reference equality (method 1)
     - compare its value members with value equality (method 2)
   - which is precisely what it does in Eiffel ("equal" operator)

3) recursive value equality: does X have the same deep structure as Y?
 - not seen in many languages; compares the two objects recursively
 - "deep_equal" in Eiffel
 - if it's a value, bitwise comparison
 - if it's a reference type:
   - compare its reference members recursively (method 3)
   - compare its value members with value equality (method 2)
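
The three methods above can be sketched as follows, in Python rather than any of the languages being compared. The `Node` class and the three function names are hypothetical; `None` plays the role of a null reference.

```python
class Node:
    """Hypothetical object with one value member and one reference member."""

    def __init__(self, value, child=None):
        self.value = value    # value member (an int here)
        self.child = child    # reference member (another Node or None)


def reference_equal(x, y):
    # Method 1: do x and y refer to the same object?
    return x is y


def value_equal(x, y):
    # Method 2: value members by value, reference members by identity.
    if x is None or y is None:
        return x is y
    return x.value == y.value and x.child is y.child


def deep_equal(x, y):
    # Method 3: value members by value, reference members recursively.
    if x is None or y is None:
        return x is y
    return x.value == y.value and deep_equal(x.child, y.child)


leaf = Node(1)
a = Node(2, leaf)
b = Node(2, leaf)        # same value, shares the same child object
c = Node(2, Node(1))     # same shape, but its own child object
```

Here `a` and `b` are value-equal but not reference-equal, while `a` and `c` are only deep-equal, because their children are distinct objects with the same contents.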

It's interesting to note that you don't need deep equality very often. Maybe that's why many popular languages (including lack the notion in the first place. (Although deep clone might be useful -- for example, http://www.javaworld.com/javaworld/javatips/jw-javatip76.html. By serializing, you get also deep equality for free.)

Eiffel has methods clone and deep_clone which have as postconditions,
respectively, that result.equal(original) and result.deep_equal(original).
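
As a rough analogy only (this is Python's standard `copy` module, not Eiffel), the same pair of postconditions can be observed with `copy.copy` and `copy.deepcopy`. The `Node` class is hypothetical.

```python
import copy


class Node:
    """Hypothetical object with one value member and one reference member."""

    def __init__(self, value, child=None):
        self.value = value
        self.child = child


original = Node(2, Node(1))

shallow = copy.copy(original)      # new object, shares original's child
deep = copy.deepcopy(original)     # new object with its own copy of the child
```

The shallow copy satisfies the "equal" postcondition (same values, same referenced child), while the deep copy satisfies only the "deep_equal" one, since its child is a different object with the same contents.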

Despite what has been said here earlier, D (as it currently stands) seems to be the most consistent language: "==" is always "value equality" and "===" always "reference equality".  Still, I don't know if that's a good thing or not, taking into consideration the popularity of languages that seem to think differently: C++, Java and C#.  There are undoubtedly reasons behind that...

I have some more thoughts on this, but they will need some time to settle down, so they will be left to another posting.

-Antti

March 29, 2003

Posted by Sean L. Palmer
in reply to Jon Allen

I'm all for C++'s variable allocation technique.

It truly bothers me that the only place you can make an object in D is on the heap.  If I want a variable to be short-lived, I have to use the delete keyword.  I have to use new also (C# is the same way).  Way too wordy for me, for something that happens so often in a program.

Just make local object variables unable to be passed as parameters to other functions.  If you want to pass an object to another function, return it, or store pointers to it, you must declare an object reference instead and allocate with new.

I guess what we want is for objects to be able to act like structs in some situations and like classes in others?

I think let's have an explicit reference keyword also.  Say I have a rather large struct I want to pass around as an 'in' parameter.  I don't want it to be inout.  But I do want it to be by reference because it's large.  I would like to declare:

bool TestMyBigObject(in ref MyBigObject obj);

There's no hard line between structs and classes.  It's quite blurry.

Sean

"Jon Allen" <jallen@minotstateu.edu> wrote in message news:b62s2d$1ltc$1@digitaldaemon.com...
>
> "Ilya Minkov" <midiclub@8ung.at> wrote in message news:b62neg$1iai$1@digitaldaemon.com...
> > Jon Allen wrote:
> > > I have a question though, and I hope it's not too stupid :-).  This entire problem is created by the idea of objects as references that has become so popular lately.  What is the rationale behind the object-as-reference idea as opposed to the C++ idea of objects as objects?  To me the C++ way seems much simpler and more consistent, even with the pointers that are required for any kind of runtime object creation.
> >
> > It's rather the opposite: in D objects are supposed to be "just objects", while in C++ you always have underlying implementation issues. That's also the reason why the default is comparison by content in D, and not by reference.
>
> That's patently false.  Take the simplest possible case (the same syntax declares two objects in both languages):
>
> MyObj a;
> MyObj b;
>
> Now try to work with them.  Comparisons are what this thread is about, so we'll do that:
>
> if(a==b)
> {
>     //do stuff
> }
>
> Both languages say this should compare the content of the variables.
>
> Does this work in C++? Yes.  Always.  No exceptions (okay it might not always _do_ what you want, but that's something D has to worry about too). No concern about the underlying implementation.  I don't know, don't care whether the compiler made them references, threw them directly on the stack, or did anything else with them.
> Does it work in D?  Maybe.  To make sure that it works you have to concern yourself with the fact that a and b don't describe the actual type that I want, but that they are in fact cleverly disguised pointers to those objects.  Underlying implementation becomes very important here and, to my eyes, needlessly complicates my life.
>
> > > It also seems like the C++ way has more potential for offloading a lot of the work that needs to be done from runtime to compile time, resulting in much better performance.
> >
> > What makes you think that? [-Eliza]
>
> Why do we have stacks and data segments instead of just a heap?  The data segment can be mapped out at compile time, leaving little to be done at runtime for statics, but D objects can't be included here because an object declaration may refer to any number of object instances, even though it probably will only ever refer to one.  It's even tough (sometimes impossible) to say that it'll ever point to anything at all.
>
> It's also much cheaper to throw crap on the stack than it is to put it in the heap (not to mention easier), but putting objects on the stack in D either takes a pretty impressive compiler, or a lot of work on the user's end which will further expose implementation details and cut portability.
>
> How about the extra step the CPU has to take to dereference the link to that stuff we want to use?  I realize that most classes will probably be doing this a million times anyway from the vtable alone, but I think we all want to see D kick the pants off C++ in speed at least, and every little bit helps.
>
> In C++ these things are no problem, and it's because pointers are pointers and variables are variables.  As opposed to the variables are sometimes pointers, sometimes variables idea.  That's not to mention the likelihood of forgetting those "new"s at least once sometime in any decently sized project, no matter how seasoned you are.  Bugs are performance problems too.
>
> I can also see it taking a lot off the hands of the GC, since it would make it much easier to free a good percentage of data.  It's a lot faster to move a stack pointer or the like than it is to mark and sweep the entire heap.
>
>
>

March 29, 2003

Posted by Antti Sykari
in reply to Antti Sykari

IMO, "==" should mean value equality for value types and reference equality for reference types.

This is not a very religious issue for me, but there are some reasons as to why I would prefer the situation to be that way.


Reason 1. The thing that is done most often should look the simplest.

That's because the most common operation for value types is value comparison and the most common operation for reference types is reference comparison.  Then we follow the rule "the most common and simple things should look and feel simple."

What is now done for "==" is neither common (at least I find that I rarely use value comparisons in C++ objects, and where I do, they are struct-like objects which should not be heap-allocated in the first place) nor simple. I take it that currently it's translated to something like this:

bit operator==(object a, object b)
{
  return (a === null && b === null)
    || (a !== null && b !== null && (a.eq(b) || b.eq(a)));
}

At least if you believe what is said in http://www.digitalmars.com/d/expression.html.
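
A minimal sketch of that translation in Python, with `None` standing in for null. The `Thing` class, its `eq` method and both function names are hypothetical. `naive_eq` models a direct rewrite of `a == b` into a method call on the left operand; `null_safe_eq` models the null-checking form shown above.

```python
class Thing:
    """Hypothetical class whose eq method compares a single field."""

    def __init__(self, key):
        self.key = key

    def eq(self, other):
        return other is not None and self.key == other.key


def naive_eq(a, b):
    # Direct rewrite of a == b as a.eq(b): fails if a is None.
    return a.eq(b)


def null_safe_eq(a, b):
    # True if both are null, or both are non-null and equal by value.
    if a is None and b is None:
        return True
    if a is None or b is None:
        return False
    return a.eq(b) or b.eq(a)
```

Calling `naive_eq(None, None)` raises `AttributeError` in Python, which is the analogue of dereferencing a null left operand, while `null_safe_eq(None, None)` simply returns `True`.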

Let's see... actually, there seems to be internal inconsistency in the D documentation. expression.html says that

Object x = null, y = null;
x == y;

is legal and true (null == null is true), but operatoroverloading.html
tells us that == will be rewritten as x.equals(y), which will cause a
null pointer exception if x is null.  Which is even worse.

Wonder which one it is.  Might even go as far as to install the compiler myself and test, now that I even have Windows on my computer.

At any rate, I'd still very much prefer "===" or "equals" or
something like that to mean the more complicated case ("return true if
both null or both non-null and a.equals(b) is true, and otherwise
false") and "==" to mean the simpler case (compare identity).


Reason 2. Consistency with the operator "=".

"=" and "==" are closely related.  "=" copies the value or reference of a variable to another.  "==" inspects, or should inspect, if two values or two references are the same.

It's instructive to look at the following code snippet and think about its meaning (assume, for example, that you're in one of them dreaded maintenance programming positions)

{
  <Type> a, b;

  //... a lot of code here so you don't remember what the types of a and
  //    b are ...

  a = b;  // what does this mean?
}

Now, what happens during the assignment?  Well, it depends on if <Type> is a reference or value type!  If the language was stubbornly consistent, it would demand that "a = b" would call b.clone() if the type in question were a reference one, and otherwise do a simple bitwise copy.

My point is that because "a = b" does not call clone() on b, "a == b"
should not call equals() on a.
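
The same behaviour can be seen in Python, where object variables are also references. This is only an analogy to the D semantics discussed here, and the `Account` class is hypothetical.

```python
import copy


class Account:
    """Hypothetical mutable object."""

    def __init__(self, balance):
        self.balance = balance


a = Account(100)
b = a                  # assignment copies the reference, not the object
b.balance -= 30        # visible through a as well

c = copy.copy(a)       # an explicit clone is a separate object
c.balance += 1000      # does not affect a
```

After this runs, `a` and `b` name one object with balance 70, and `c` is a distinct object. Assignment never cloned anything, which is the asymmetry the argument above points at.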


And, to take the question further...

Reason 3.  Generic containers.

Here I'd rather attempt to grow more understanding and promote discussion than to blindly try to push my opinion (as was the case earlier ;-), since the generics in D are still quite unknown terrain.

Suppose that we have, for example, a Set template that can
contain both basic types and objects, and that we've just instantiated
Set(int) as intSet and Set(object) as objectSet.

f() {
  intSet intset = new intSet();
  objectSet objectset = new objectSet();

  intset.insert(5);
  intset.insert(5); // ok, overwrites the old integer
  objectset.insert(new XWindowSystem(":0.0"));
  objectset.insert(new TCPIPNetwork());
  objectset.insert(new MeaningOfLife(42));
  objectset.insert(new MeaningOfLife(42)); // ok, won't overwrite
                                           // ... or does it?
  objectset.insert(null); // surely we can have null objects in
                          // containers, right?
}

Do we want the objectset to use reference comparison or value comparison internally?  (Or do we want to get two or just one different MeaningOfLife objects when later iterating the set? Depends on the application, I'd guess...)

In C++ we would just use set<int> or set<Object*> and, hence, be explicit about our choice.  But in D, there is not much room for being explicit because pointers are hidden inside object references.

Anyway, let's say that I'd want reference comparison semantics if possible. We have then two alternatives:

- specialize the template for things of type Object and use ===
  there instead -- which would probably lead to duplication of code
  in every container which wants to store (or find, or replace, or
  whatever you do with objects inside containers!) objects
  identity-wise

- attack the "root" of the problem, and make == mean reference
  equality. Which, of course, greatly simplifies all template code.
  The downside being that we have to specialize templates for Objects
  if we suddenly want value semantics.  Can't get everything, can you?

Now, that might not have been the best of examples...
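
To make the two choices concrete, here is a small Python sketch. `MeaningOfLife` and `IdentitySet` are hypothetical; Python's built-in `set` compares members by value (through `__eq__` and `__hash__`), and `IdentitySet` is a stand-in for a container that compares by reference.

```python
class MeaningOfLife:
    """Hypothetical class with value-based equality."""

    def __init__(self, answer):
        self.answer = answer

    def __eq__(self, other):
        return isinstance(other, MeaningOfLife) and self.answer == other.answer

    def __hash__(self):
        return hash(self.answer)


class IdentitySet:
    """Minimal set that compares members by identity instead of by value."""

    def __init__(self):
        self._items = {}          # maps id(obj) -> obj, keeping obj alive

    def insert(self, obj):
        self._items[id(obj)] = obj

    def __len__(self):
        return len(self._items)


value_set = set()
identity_set = IdentitySet()
for obj in (MeaningOfLife(42), MeaningOfLife(42)):
    value_set.add(obj)
    identity_set.insert(obj)
```

The value-based set ends up with one member and the identity-based one with two, so the answer to "does it overwrite?" depends entirely on which comparison the container uses.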


Conclusion:

The thing about objects is not so much their values as their identity, so == should mean reference equality.

Identity is the essence of a reference type; value is the essence of a value type.

The essence is what should be compared against when the programmer says "==" -- regardless of whether he knows the type he's going to use, as in normal code, or doesn't, as when he's writing template code.

(Oh yeah, and Eiffel, Java and C# are pretty much on the same road,
too.)

-Antti

March 29, 2003

Posted by Luna Kid
in reply to Matthew Wilson

Matthew, I followed through this lively thread (not 100%
thoroughly though, so I may make a big mistake by this quick
message now...), and I'm still unsure about two points.
(So I'm not arguing, just asking below, to see more clearly.)

1.)

 You seem to say you don't want to check for "being
 not null" when doing ==, but want D doing that for
 you implicitly. Correct?

 But what would you want the compiler to do if, indeed,
 a == 0?

 Do you consider 0 valid or invalid in this generic
 context?

 If 0 is valid, then == should just return true for
 a == b, if both are null. Is that what you propose?
 (I have not thought about it being good or bad.)

 Or, an exception should be thrown, if you consider
 0 invalid. But in that case, of course, as you don't
 want to have a check in the code, we have not
 progressed from the access violation case.

2.)

 You seem to say that a == x in C++ is safe, because
 there is no access violation that way.

 Well, in fact, it *is* possible (as you said, though,
 by "some perverse pointer dereferencing trickery" --
 which I *have* encountered), and a C++ program will
 then crash the exact same way a D program would, if
 an object reference is 0.

 As I don't know enough about D pointers and references
 yet, it can only make sense to me if D lets you assign
 0 to a reference more "conveniently" than C++, possibly
 to the extent that D considers 0 references legal,
 while C++ considers them illegal.

 Do I understand correctly that that is the source of
 all this pain here?


Anyhow, it all boils down to this:

If D considers 0 a legal reference value (supporting
assignment of 0, for instance), then it should indeed check
for it as said in (1.) above, and not raise any error.

If D considers 0 references illegal, then it should blow
up programs loudly (as also said above), but then it also
must

    - clearly state that 0 references are illegal

    - prevent creating dangling object references
      as much as possible

(Walter?)

Thanks for your patience,
Luna Kid

March 29, 2003

Posted by Luna Kid
in reply to Luna Kid

>     - prevent creating dangling object references

Sorry, I meant: prevent creating zero references,
most importantly.

March 29, 2003

Posted by Luna Kid
in reply to Antti Sykari

> Let's see... actually, there seems to be internal inconsistency in the D documentation. expression.html says that
>
> Object x = null, y = null;
> x == y;
>
> is legal and true (null == null is true), but operatoroverloading.html
> tells us that == will be rewritten as x.equals(y), which will cause a
> null pointer exception if x is null.  Which is even worse.
>
> Wonder which one it is.


Oops, I just tried it:

 Object o = null;
 Object p = null;

 if (o == p) {
    ...
 }

This is legal, but still breaks!

So, Matt, now I also understand your point much
better -- and understand Walter's point much less...

Luna Kid

March 29, 2003

Posted by Antti Sykari
in reply to Antti Sykari

While I was writing the previous article, it occurred to me that different concepts of equivalence could be implemented if it were possible to overload operator == at the module scope. It would be quite useful.

The overloaded operators then become part of the module's interface, as in C++.

I also promoted consistency with the assignment operator, so that if == means value equivalence, then = should mean cloning, and if == means reference equivalence, then = should mean copying the reference, and so on. This scheme would make that possible as well.

As examples, I'll provide hypothetical classes "BankAccount" and "String".

module bankaccount;

// operator = that just copies the reference.
// this would be the one that's generated by default.
void assign(inout BankAccount lhs, BankAccount rhs)
{
  lhs = rhs; // let's pretend that lhs = rhs here means assigning the
             // reference for real, and outside the module it means
             // calling bankaccount.assign(lhs, rhs);
}

// operator == that compares the identities of the objects.
// again, this would be the default-generated one.
bit eq(BankAccount lhs, BankAccount rhs)
{
  return lhs == rhs; // the same caution as above
}

// Standard bank account example
class BankAccount
{
  this(int initialAmount) { ... }
  void withdraw(int amount) { ... }
  void deposit(int amount) { ... }
  // ...
private:
  // internal state
}

And there you go, a bank account with all the way reference semantics...


module string;

// operator = which clones the string (a real world example might do
// reference counting and copy-on-write or something like that)
void assign(inout String lhs, String rhs)
{
  lhs = rhs.clone();
}

// operator == which compares the values of the string
bit eq(String lhs, String rhs)
{
  return lhs.m_contents == rhs.m_contents;
}

class String
{
  this(const char*) { ... }
  String clone() { ... }
  char at(int index) { return m_contents[index]; }

private:
  char[] m_contents;
  // ptr
}

And there you go, a value-oriented class.

-Antti

March 29, 2003

Posted by Antti Sykari
in reply to Antti Sykari

Not enough replying to myself today, it seems :)

Antti Sykari <jsykari@gamma.hut.fi> writes:
> Reason 3.  Generic containers.
>
> Here I'd rather attempt to grow more understanding and promote discussion than to blindly try to push my opinion (as was the case earlier ;-), since the generics in D are still quite unknown terrain.
>
> Suppose that we have, for example, a Set template that can
> contain both basic types and objects, and that we've just instantiated
> Set(int) as intSet and Set(object) as objectSet.
>
> f() {
>   intSet intset = new intSet();
>   objectSet objectset = new objectSet();
>
>   intset.insert(5);
>   intset.insert(5); // ok, overwrites the old integer
>   objectset.insert(new XWindowSystem(":0.0"));
>   objectset.insert(new TCPIPNetwork());
>   objectset.insert(new MeaningOfLife(42));
>   objectset.insert(new MeaningOfLife(42)); // ok, won't overwrite
>                                            // ... or does it?
>   objectset.insert(null); // surely we can have null objects in
>                           // containers, right?
> }

Here we might also get "an impossible situation" easily.

Let's suppose, for the example, that == is value equivalence and === is reference equivalence.

A set of objects is supposed to maintain the invariant that an object is in the set only once, or that there are no objects x and y so that set.has(x) && set.has(y) && x == y && x !== y.  As we all know from mathematics.  Or, to put that into pseudocode:

foreach (x in set.objects, y in set.objects)
{
  if (x !== y)
    assert(x != y);
}

But now it could happen that we have...

class Integer
{
  this(int i) { this.i = i; }
  int i;
}

Integer first = new Integer(1), second = new Integer(2);
set.insert(first);
set.insert(second); // ok; invariant maintained; first != second

second.i = 1; // this will break the set invariant; first == second !
              // there are suddenly two objects inside the set that
              // have the same value (but still different identity)

And, so, I am forced to answer my own question:

> Do we want the objectset to use reference comparison or value comparison internally?

Reference comparison.  (Otherwise, integrity might be violated.)

Of course, that isn't a necessity for all containers.
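
The broken invariant can be reproduced in Python, whose built-in `set` compares members by value. This is an analogy rather than D code, and the `Integer` class here is a hypothetical mutable box with value-based `__eq__` and `__hash__`.

```python
class Integer:
    """Hypothetical mutable box with value-based equality and hashing."""

    def __init__(self, i):
        self.i = i

    def __eq__(self, other):
        return isinstance(other, Integer) and self.i == other.i

    def __hash__(self):
        return hash(self.i)


first, second = Integer(1), Integer(2)
members = {first, second}      # invariant holds: no two equal members

second.i = 1                   # mutate a member after insertion

# Pairs of distinct objects in the set that now compare equal.
duplicates = [
    (x, y) for x in members for y in members
    if x is not y and x == y
]
```

After the mutation the set still holds two objects, yet they compare equal, so `duplicates` is non-empty. A set keyed on identity cannot get into this state, because an object's identity does not change when its fields do.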

-Antti


Copyright © 1999-2021 by the D Language Foundation