March 29, 2003
Being quite thick in the head when it comes to reading - classic picture-thinker problems - I'm quite confused by your use of 0, a numeric, where I suspect you're actually meaning null. I'm answering with the assumption that you mean null, this predicated on a further assumption (shaky ground, assumptions on assumptions) that you're a good C++ chap and eschew NULL in favour of 0.

That being my assumption, answers inline

----- Original Message -----
From: "Luna Kid" <lunakid@neuropolis.org>
Newsgroups: D
Sent: Sunday, March 30, 2003 8:59 AM
Subject: Re: null == o?


> Matthew, I followed through this lively thread (not 100%
> thoroughly though, so I may make a big mistake by this quick
> message now...), and I'm still unsure about two points.
> (So I'm not arguing, just asking below, to see more clearly.)
>
> 1.)
>
>  You seem to say you don't want to check for "being
>  not null" when doing ==, but want D doing that for
>  you implicitly. Correct?
>

Yes


>  But what would you want the compiler to do if, indeed,
>  a == 0?

in a comparison x == y

x is null; y is null - result: true (result deduced by the compiler-inserted check)
x is null; y is non-null - result: false (result deduced by the compiler-inserted check)
x is non-null; y is null - result: false (result deduced by the compiler-inserted check, so any equals() method can be sure of the rhs parameter being non-null; makes things nice and simple)
x is non-null; y is non-null - result is the result of x.equals(y) (deduced in the same way)

In effect, and this is kind of inspired by Antti's comments (though I don't think I agree with much of what he said), I am suggesting that == is translated into a call to

 // __Reference is the built-in, inaccessible type whose instances hold
 // "pointers" to instances of programmer-accessible types
 static bit __Reference.equals(Object o1, Object o2)
 {
   return o1 === null ? (o2 === null ? true : false) : (o2 === null ? false : o1.equals(o2));
 }
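
The truth table above can be sketched in C++, with raw pointers standing in for D references. This is only an illustration under stated assumptions: Token and reference_equals are hypothetical names, not part of D or of any library, and a real compiler-inserted check could look quite different.

```cpp
#include <string>

// Hypothetical stand-in for a D class with a value-comparing equals().
struct Token {
    std::string text;
    // Only ever reached with a valid rhs; the wrapper below guarantees that.
    bool equals(const Token& rhs) const { return text == rhs.text; }
};

// Sketch of the check the compiler would insert for "x == y".
bool reference_equals(const Token* x, const Token* y) {
    if (x == nullptr) return y == nullptr; // null/null: true, null/non-null: false
    if (y == nullptr) return false;        // non-null/null: false
    return x->equals(*y);                  // both non-null: defer to equals()
}
```

Because the wrapper filters out every null case first, equals() itself never has to test its argument.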

>
>  Do you consider 0 valid or invalid in this generic
>  context?
>

null has to be valid, as there are simply too many times that one would wish to use a null value of a reference to indicate some special condition (whether that be failure of something, or uninitialised state of something, ad infinitum).

This is the source of the problem. We want to "hide" the notion of pointers behind references, but we still want to be able to have null ones. If we accept this contradiction for the sake of utility (which I believe we should), then we have the issue of what == means for reference types.

== is identity
-------------

Some would argue that, a la Java, the value of a reference is its "pointer", which does make some sense. (All points of view on this issue make some sense. That's the vexing part of it.) The problem in this case is that though the semantics are consistent, the syntax is not (or, rather, the syntax gives the reader an impression that is incorrect).

If I write

 x == y

then this compares by value for value types, and by identity for reference types. Some say consistency, some say inconsistency. Both are right, neither wrong. Perspective. Nasty.

Where the weight falls down on my side - but of course he'd say that!, you cry - is when one considers the use of templates.

If I write a template that, say, searches an unordered sequence to elide duplicates (by value), then I am stymied.

If I write it for value types it will work on identity for reference types. Not what I want.
If I write it for reference types (using equals()) then it won't work for value types (my assumption is that calling .equals() on an int is an error ...)

Why should I have to specialise this simple algorithm? Sucks bad. Of course, Java can survive this because it doesn't support templates. (It'll be interesting to see how it handles that when/if Sun ever get round to making good their promise to include them.)
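
A rough C++ sketch of the duplicate-eliding template described above. It assumes that comparison means "by value, and safe on null" for reference-like elements, which is the behaviour being argued for. The names same_value and elide_duplicates are hypothetical, and pointers again stand in for D references.

```cpp
#include <vector>

// Value types: plain comparison.
template <typename T>
bool same_value(const T& a, const T& b) { return a == b; }

// Pointer elements (stand-in for reference types): null-safe, compares pointees.
template <typename T>
bool same_value(T* a, T* b) {
    if (a == nullptr || b == nullptr) return a == b; // equal only if both are null
    return *a == *b;
}

// One algorithm for both kinds of element: keep the first of each value.
template <typename T>
std::vector<T> elide_duplicates(const std::vector<T>& in) {
    std::vector<T> out;
    for (const T& item : in) {
        bool seen = false;
        for (const T& kept : out) {
            if (same_value(kept, item)) { seen = true; break; }
        }
        if (!seen) out.push_back(item);
    }
    return out;
}
```

The algorithm is written once; only the comparison differs per element kind. If == itself carried these semantics, even the same_value helper would be unnecessary.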

== is value
-----------

This is the current situation with D. For general convenience, appropriate terseness, and templates I agree with it. However, the problem is that == doesn't take account of whether its operands are null, resulting in a crash.

Consider an example I've just been working on. I've written classes/funcs in a number of languages to tokenise a string based on char and/or string delimiters, and to elide or preserve blanks in respect of parameter values. The returned value is an array of char arrays. If blanks are preserved, then the entries for such are empty strings, in other words char arrays of zero length.

Now it's not too hard to imagine the same components returning instantiated objects, which have been serialised in a text format, for, say, passing an array of objects between hosts. A blank token ". . .xhfjdk][][sadjd2339 . . ." would result in a null object being returned in the array of polymorphic instances.

But now I want to apply an algorithm to the array. I pass the array as one argument, and a sentinel object to the other. (Imagine the D equivalent of STL's find_if). The algorithm uses ==. The program crashes on the null array entries. I have to write a separate algorithm. I am unhappy and dissatisfied. More code, bigger exe, more maintenance, less reuse, blah, blah.
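
The sentinel search can be sketched in C++ with std::find_if over an array that contains null entries. Record, safe_equals and find_first are hypothetical names chosen for this illustration; the point is only that a null-tolerant comparison lets one algorithm cover arrays with and without blanks.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical deserialised object; blank tokens become null entries.
struct Record {
    std::string key;
    bool equals(const Record& rhs) const { return key == rhs.key; }
};

// Null-tolerant comparison: two nulls are equal, null never equals non-null.
bool safe_equals(const Record* x, const Record* y) {
    if (x == nullptr || y == nullptr) return x == y;
    return x->equals(*y);
}

// Index of the first entry equal to the sentinel, or -1 if there is none.
long find_first(const std::vector<const Record*>& items, const Record* sentinel) {
    auto it = std::find_if(items.begin(), items.end(),
        [sentinel](const Record* item) { return safe_equals(item, sentinel); });
    return it == items.end() ? -1 : static_cast<long>(it - items.begin());
}
```

With a comparison that dereferenced its left operand unconditionally, the null entry in the middle of the array would be the crash described above.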

>  If 0 is valid, then == should just return true for
>  a == b, if both are null. Is that what you propose?
>  (I have not thought about it being good or bad.)
>

Yes, see above

>  Or, an exception should be thrown, if you consider
>  0 invalid. But in that case, of course, as you don't
>  want to have a check in the code, we have not
>  progressed from the access violation case.
>
> 2.)
>
>  You seem to say that a == x in C++ is safe, because
>  there is no access violation that way.
>
>  Well, in fact, it *is* possible (as you said, though,
>  by "some perverse pointer dereferencing trickery" --
>  which I *have* encountered), and a C++ program will
>  then crash the exact same way a D program would, if
>  an object reference is 0.
>

I don't think this is relevant to the debate. Any language that respects its practitioners as being professional and competent is inevitably going to leave holes that can be exploited by the informed but perverse individual.

>  As I don't know enough about D pointers and references
>  yet, it can only make sense to me if D lets you assign
>  0 to a reference more "conveniently" than C++, possibly
>  to the extent that D considers 0 references legal,
>  while C++ considers them illegal.
>
>  Do I understand correctly that that is the source of
>  all this pain here?
>

Pretty much, yes. We can happily pretend that pointers are references if references can never be null. But is it worth the strictures that this would impose?

>
> Anyhow, it all boils down to this:
>
> If D considers 0 to be a legal reference value (supporting
> assigning 0, for instance), then it should indeed check
> them as said in (1.) above, and not raise any error.
>

Glad you agree. :)

> If D considers 0 references illegal, then it should blow
> up programs loudly (as also said above), but then it also
> must
>
>     - clearly state that 0 references are illegal
>
>     - prevent creating dangling object references
>       as much as possible
>
> (Walter?)
>
> Thanks for your patience,
> Luna Kid
>
>


March 30, 2003
> ... I suspect you're actually meaning null. I'm answering with the assumption that you mean null, this predicated on a further assumption (shaky ground, assumptions on assumptions) that you're a good C++ chap and eschew NULL in favour of 0.


Sorry about driving you into that... But anyhow,
you guessed both things right. :)

And thanks for confirming _my_ assumptions
(regarding this == business).

Cheers,
Luna Kid
[need to find a better name than Szabolcs Szasz...]


March 31, 2003
On Fri, 28 Mar 2003 23:09:40 GMT, Karl Bochert <kbochert@copper.net> wrote:

> On Wed, 26 Mar 2003 15:17:36 +1100, "Matthew Wilson" <dmd@synesis.com.au> wrote:
>>
> I am a great admirer of Walter's work (Ever since he personally answered some questions
> about Zortech C) but I think he's !dead! wrong on this one.
>
> int i;
> MyObj o;
>
> 1)  i is not a value. i is a symbol that refers to some storage. When I type
> i = 5; I am not storing 5 into i, but into the storage that i refers to. One of the glories
> of the compiler is that it allows me to ignore this distinction.
> Why should not it do the same for o?

I have a similar point of view to Karl on this. Conceptually, I regard both 'i' and 'o' in this example to represent run-time entities. That is, something that exists in addressable memory for a period of time. And that all entities have certain properties (or attributes if you like), such as "CanBeWrittenTo", "Initialized", "CanBeUpdated", "Value", "Identity", "MemoryAddress", "LogicalSize", "PhysicalSize", "DataType", "ValueDataType", and the list goes on...

I know that most of these properties cannot be directly accessed by D coders, but I'm talking in conceptual terms here.

Amongst the operations that coders will need to perform with entities, is to check if two entities have the same value, and to check if two entities are actually the same entity or not.

To me, consistency would argue that "==" is the operation symbol for checking the value of the entities, and "===" is the operation symbol for checking the identity of the entities.

Such that "if o1 == o2" is like shorthand for "if o1.value is the same as o2.value" and "if o1 === o2" is like shorthand for "if o1.identity is the same as o2.identity".


As for the expression "if (o1)", to me this is like shorthand for "if o1.initialized is true".
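
The three conceptual operations can be written out as separate C++ functions. This is a sketch only: Point and the function names are hypothetical, and pointers model the entity handles. same_value takes references so that the unresolved null question does not arise in this illustration.

```cpp
// Hypothetical entity type with a two-field value.
struct Point { int x; int y; };

// "o1 == o2": do the two entities hold the same value?
bool same_value(const Point& a, const Point& b) {
    return a.x == b.x && a.y == b.y;
}

// "o1 === o2": are the two handles the very same entity?
bool same_identity(const Point* a, const Point* b) {
    return a == b;
}

// "if (o1)": does the handle refer to an entity at all?
bool initialized(const Point* a) {
    return a != nullptr;
}
```

Two distinct Point objects holding (1, 2) satisfy same_value but not same_identity, which is the distinction drawn between "==" and "===".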


-- 
Derek
March 31, 2003
Derek

Good points. However, you've not addressed the core issue of this current debate, which is whether == should crash if one or both operands are null (for reference types, of course), or whether it should do a "safe" comparison. This is the issue we cannot find agreement on.

Matthew

"Derek Parnell" <Derek.Parnell@No.Spam> wrote in message news:oprmvkc1jryj5swd@news.digitalmars.com...
> On Fri, 28 Mar 2003 23:09:40 GMT, Karl Bochert <kbochert@copper.net> wrote:
>
> > On Wed, 26 Mar 2003 15:17:36 +1100, "Matthew Wilson" <dmd@synesis.com.au> wrote:
> >>
> > I am a great admirer of Walter's work (Ever since he personally answered some questions
> > about Zortech C) but I think he's !dead! wrong on this one.
> >
> > int i;
> > MyObj o;
> >
> > 1)  i is not a value. i is a symbol that refers to some storage. When I type
> > i = 5; I am not storing 5 into i, but into the storage that i refers to. One of the glories
> > of the compiler is that it allows me to ignore this distinction.
> > Why should not it do the same for o?
>
> I have a similar point of view to Karl on this. Conceptually, I regard both 'i' and 'o' in this example to represent run-time entities. That is, something that exists in addressable memory for a period of time. And that all entities have certain properties (or attributes if you like), such as "CanBeWrittenTo", "Initialized", "CanBeUpdated", "Value", "Identity", "MemoryAddress", "LogicalSize", "PhysicalSize", "DataType", "ValueDataType", and the list goes on...
>
> I know that most of these properties cannot be directly accessed by D coders, but I'm talking in conceptual terms here.
>
> Amongst the operations that coders will need to perform with entities, is to check if two entities have the same value, and to check if two entities are actually the same entity or not.
>
> To me, consistency would argue that "==" is the operation symbol for checking the value of the entities, and "===" is the operation symbol for checking the identity of the entities.
>
> Such that "if o1 == o2" is like shorthand for "if o1.value is the same as o2.value" and "if o1 === o2" is like shorthand for "if o1.identity is the same as o2.identity".
>
>
> As for the expression "if (o1)", to me this is like shorthand for "if
> o1.initialized is true".
>
>
> --
> Derek


March 31, 2003
On Mon, 31 Mar 2003 11:45:25 +1000, Matthew Wilson <dmd@synesis.com.au> wrote:

> Derek
>
> Good points. However, you've not addressed the core issue of this current
> debate, which is whether == should crash if one or both operands are null
> (for reference types, of course), or whether it should do a "safe"
> comparison. This is the issue we cannot find agreement on.
>
> Matthew

Yeah, I know ;-)

I bow out of that debate for now. My mind is not so clear about which way to bend yet.

My current thinking (subject to massive fluctuation), depends on what is meant by "null". I think what people are trying to say is that it is possible for an entity (reference) to be unusable sometimes. If this is the case, and that we want coders to be micro-managing their code, then any attempt at using that entity should result in an exception. By "using" I mean any run-time access to any of the entity's properties.

To aid the coder in micro-managing their code, it would be nice to be able to avoid exceptions by explicitly testing if an entity is usable or not.

 if (available(o1) && available(o2))
 {
   if (o1 == o2)
   {
     . . .
   }
 }

Whether this availability checking should be implicit or not seems to be where the debate is mainly stuck.

If we assume 'implicit' is a possibility (that is, the compiler is allowed to do some micro-managing) then if either entity is not available, any relationship or identity testing should always return False. My reasoning is that if 'o1' is not available then how can its non-existent value be the same as o2.value? And if 'o1' is not available then its non-existent identity can never be the same as o2.identity.
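
A minimal C++ sketch of this "unavailable compares false" rule, assuming pointers model entities and nullptr models "not available". Entity, available, implicit_equals and implicit_same are hypothetical names used only for illustration.

```cpp
// Hypothetical entity with a single value.
struct Entity { int value; };

// Explicit availability test, as in the if (available(o1) ...) example.
bool available(const Entity* e) { return e != nullptr; }

// Implicit value test: false as soon as either side is unavailable.
bool implicit_equals(const Entity* a, const Entity* b) {
    if (!available(a) || !available(b)) return false;
    return a->value == b->value;
}

// Implicit identity test: an unavailable entity is identical to nothing.
bool implicit_same(const Entity* a, const Entity* b) {
    if (!available(a) || !available(b)) return false;
    return a == b;
}
```

Note the difference from the table earlier in the thread: there, null == null yields true, whereas under this rule two unavailable entities compare false, much as two NaN values do in floating point.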

My own opinion for now, is that I would like to see the language have both implicit 'null'/availability testing and the ability for coders to do explicit 'null'/availability testing.

By the way, this would extend to having some form of syntax that allowed the coder to explicitly make an entity unavailable.

-- 
Derek
March 31, 2003

Matthew Wilson wrote:
> 
> Derek
> 
> Good points. However, you've not addressed the core issue of this current debate, which is whether == should crash if one or both operands are null (for reference types, of course), or whether it should do a "safe" comparison. This is the issue we cannot find agreement on.

Then I'll add my three reasons why it shouldn't crash.

(1) It is against the expectations of programmers. If you do OO programming
    in C, C++ or Java, you will write exactly that (and expect it to be safe).

(2) It is against the very idea of operator overloading, which must have
    transparency to achieve its goal. That is, you must be able to forget that
    the operator may be a method of an object. i==0 can't crash, so null==object
    shouldn't be able to crash too.

(3) Any crash is bad. If there is a sensible way to continue operation in
    a situation we should take the chance to tolerate it. A "close" on a file
    that is not open - fine, let's go on. A free on some pointer that is NULL
    - what's the problem? An object that doesn't exist compared to null in D -
    well sure it should return "true". Nothing else would make more sense.

--
Helmut Leitner    leitner@hls.via.at Graz, Austria   www.hls-software.com
March 31, 2003
Helmut Leitner wrote:
>
> Then I'll add my three reasons why it shouldn't crash.
>
> (1) It is against the expectations of programmers. If you do OO programming
>     in C, C++ or Java, you will write exactly that (and expect it to be safe)

Yes.

> (2) It is against the very idea of operator overloading, which must have
>     transparency to achieve its goal. That is, you must be able to forget that
>     the operator may be a method of an object. i==0 can't crash, so null==object
>     shouldn't be able to crash too.

Yes.

> (3) Any crash is bad.

Only true for crashes in critical production systems,
where malfunction can be tolerable but downtime cannot.

Otherwise, as Walter also noted, crashes are usually
better than undetected malfunction.

Cheers,
Luna Kid


March 31, 2003
"Helmut Leitner" <helmut.leitner@chello.at> wrote in message
news:3E87DEA7.A5ABC430@chello.at...
|
| (2) It is against the very idea of operator overloading, which must have
|     transparency to achieve its goal. That is, you must be able to forget that
|     the operator may be a method of an object. i==0 can't crash, so null==object
|     shouldn't be able to crash too.
|
|

I don't understand much of what you guys have said about all this, but
here's what I think.
I don't want this to happen:

Object o1,o2;
...
if (o1==o2)   //access violation because o1 is null

I just hate it. But under a different POV, it makes sense. o1==o2 becomes o1.eq(o2) (or is it cmp?), which isn't possible because o1 is null, so we can't reference a function of a null object. That's all I can say.

-------------------------
Carlos Santander




March 31, 2003

"Carlos Santander B." wrote:
> 
> "Helmut Leitner" <helmut.leitner@chello.at> wrote in message
> news:3E87DEA7.A5ABC430@chello.at...
> |
> | (2) It is against the very idea of operator overloading, which must have
> |     transparency to achieve its goal. That is, you must be able to forget that
> |     the operator may be a method of an object. i==0 can't crash, so null==object
> |     shouldn't be able to crash too.
> |
> 
> I don't understand much of what you guys have said about all this, but
> here's what I think.
> I don't want this to happen:
> 
> Object o1,o2;
> ...
> if (o1==o2)   //access violation because o1 is null
> 
> I just hate it. But under a different POV, it makes sense. o1==o2 becomes o1.eq(o2) (or is it cmp?), which isn't possible because o1 is null, so we can't reference a function of a null object. That's all I can say.

I think your first impression is right. The other POV uses intimate knowledge about implementation detail. It's understandable like many weird things in programming languages and APIs, but it is a violation of common sense.

--
Helmut Leitner    leitner@hls.via.at Graz, Austria   www.hls-software.com
March 31, 2003
Hmm,
Well, let me first say that I love the current == and === operators. They
are probably my single favorite aspect of D, especially compared to Java. The
null issue is a pain, and I've been thinking about how to get around it.
How about a static object called "null", which for all practical purposes would
act just as a null pointer, but could have its own methods like eq(), so null
pointer exceptions wouldn't arise under comparison.

//somewhere in memory,
Object null;

//in our program
Object o1, o2;   //by default, &o1 == &null, &o2 == &null

if(o1 == o2)   //no longer an access violation, since both references are pointing to a valid object.

myObj = null;  //now just points to the null object, which is valid.

To make life simpler, classinfo, toString(), etc. would all just return "null",
to aid in debugging.

I'm no language designer or theorist, but this seemed to me to be an easy way to avoid those nasty crashes but still be assured that you will know if an object is uninitialized. Whether or not this is a kludge is up to you guys. =) Do any other languages do this?
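
This idea resembles what is usually called the Null Object pattern. A small C++ sketch of it follows, with the caveat that Object, NullObject, Name and null_object() are hypothetical names for illustration and say nothing about how D would actually implement it.

```cpp
#include <string>
#include <utility>

// Hypothetical root class; the default eq() compares identity.
class Object {
public:
    virtual ~Object() = default;
    virtual bool eq(const Object& rhs) const { return this == &rhs; }
    virtual std::string toString() const { return "Object"; }
};

// The "null" object: a real instance, so calling eq() on it cannot fault.
class NullObject : public Object {
public:
    bool eq(const Object& rhs) const override { return &rhs == this; }
    std::string toString() const override { return "null"; }
};

// The single shared instance that every "empty" reference would point at.
Object& null_object() {
    static NullObject instance;
    return instance;
}

// An ordinary class with value-based equality.
class Name : public Object {
public:
    explicit Name(std::string t) : text(std::move(t)) {}
    bool eq(const Object& rhs) const override {
        const Name* other = dynamic_cast<const Name*>(&rhs);
        return other != nullptr && other->text == text; // the null object is not a Name
    }
    std::string toString() const override { return text; }
private:
    std::string text;
};
```

Comparison against the null object is an ordinary virtual call in both directions, so no access violation can occur. The cost is that an uninitialised reference no longer fails loudly on other method calls unless NullObject is written to make them do so.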

-Jon
