July 21, 2005
Re: [Suggestion] Make if(array) illegal.
Hi Regan,

In article <opst8meeo123k2f5@nrage.netwin.co.nz>, Regan Heath says...
>
>On Thu, 21 Jul 2005 00:04:56 +0000 (UTC), AJG <AJG_member@pathlink.com>  
>wrote:
>>> Making difference between an empty array and a nonexistent one is flaky,
>>> if not directly ambiguous, thus D does not do it, as far as i can
>>> remember the statement of Walter. Thus if(array) is not ambiguous.
>>
>> Hm... not only does this distinction exist, it is in fact _very_ much
>> available in D. That's exactly the point Regan has made in some past
>> replies. I'm indifferent towards this distinction, but Regan seems fond
>> of it. Please look at my examples further below.
>
>It's true.

Praise the lord, agreement. ;)

>>> And at all, arrays have somewhat pointer-like semantics in D.
>>
>> No, they do not, IMHO. This is one of the points I've tried to make.
>> Arrays have completely different semantics in D compared to C. In D
>> arrays are first-class objects. They are handled via references, which
>> can't be nulled, they keep their own length, etc. I think this is a good
>> thing. Very different from C.
>
>The point I'm trying to make is that in D an array can be nulled, and it  
>has meaning, eg.
>
>char[] p = null;
>
>you're confusing the _implementation_ of arrays with the _behaviour_ of
>arrays, the above array _reference_ behaves just like any other reference
>that has been nulled(*) eg.

I'm well aware of the implementation vs. the behaviour. It just so happens the
two are married when it comes to the compiler. In fact, in the resulting
executable, they are indistinguishable. Confusion arises as a result.

>>> One of the reasons is that it seems
>>> familiar to C programmers.
>>
>> Indeed. It seems familiar, and people will misuse it because of that.
>
>How? When you write "if(x)" you're asking is 'x' null or 0. D's answer is  
>perfectly correct in all cases(*).

And except for static arrays. Oh, and strings, which must be compatible with C.
Since strings are a fairly important piece of the puzzle, I'd say this is
problematic.

>(*) except for the _BUG_ where you can write:
>
>char[] p = "";
>p.length = 0;
>if (p) { } // false: length = 0 resets the data pointer to null

Has Walter actually acknowledged this to be a bug? This seems more like what you
mentioned, a desire to make the distinction (empty/exist) disappear. If that's
the case, then why would you say it's a bug? If anything, it could only get
worse.

>> # int[0] emptyArray;
>> # if (emptyArray) writef("See, I'm empty, yet I exist!");
>> // The statement will print.
>
>This is a static array. Its data pointer can never be null, thus it
>always exists.
>(Nothing incongruous here)

My friend, that's the very definition of an incongruence. It means static arrays
do not follow the same principles as other kinds (just like strings).
# int[0] empty;              // Not null.
# int[ ] empty = new int[0]; // Yes null.

I even went ahead and _assigned_ an empty array (int[0]) to the reference, and
yet it remains _non_-existent. How do you explain that? You can't have a dynamic
array that is empty and non-existent, but you _can_ have a static one? (or at
least, not via the initializer?)

Let's analyze this carefully, and you will definitely see an incongruence:

# int[] A = null;
# int[] B = new int[0];

if (A) // this is false.
if (B) // this is false.

Since false == false, then A == B, and therefore null == int[0]. The very
distinction you are so fond of is gone! So in this case empty == non-existent,
but all over the place it isn't? _That's_ an incongruence.

>> // Let's try it again:
>> # int[] emptyArray = new int[0];
>> # if (emptyArray) writef("I'm still empty, but non-existant.");
>> // The statement will *not* print.
>
>Here you have not allocated any memory, thus nothing exists.
>(Nothing incongruous here)

Oh, so then it's purely about memory? How very semantic. Never mind the fact that
int[0] means an empty array. The distinction is lost, as shown above. IMHO
there's no way around this one.

>> // Think about strings:
>> # string emptyString = "";
>> # if (emptyString) writef("Empty, yet I exist");
>> // The statement will *not* print.
>
>Wrong, this statement will print (try it).
>
>The reason it prints is that memory _is_ allocated because string  
>constants are C compatible i.e. contain a null terminator. If this was not  
>the case then this would act as the previous example.

"If this was not the case". That's fine, but it happens to _be_ the case.
Therefore the docs should state: "There is an incongruence when it comes to
string literals. Because we want them to be compatible with C, it means an empty
string is not really empty. In other words, what should have been an empty array
is really not. Careful, folks!" 

>> But what about this:
>> # string emptyString = null;
>> # if (emptyString) writef("Empty, but now I don't exist");
>> // The statement will print.
>
>Wrong, it will not print. The array is null, nothing exists.
>(Nothing incongruous here)
>
>> Would you say the behaviour I showed above is consistent?

If you agree with the previous statements, you'll concur that the behaviour is
not consistent. It calls for exceptions to be made and explained. Once more
gratuitously: static vs. dynamic, and string literals, and the .length "bug,"
and the dynamic initializer problem.

>> You don't find it a tad, say, ambiguous?

If you at least agree it's inconsistent, then we are getting somewhere. The
ambiguity results in not knowing when which is going to happen. Since there is
no documentation on this, the problem is only aggravated.

>> You don't think people will be confused? I certainly was.
>
>That's because you're asking the wrong questions, and you didn't check  
>your answers.

I did check my answers, and now I know. I made the mistake, and by _chance_ one
case didn't work early on, so I started looking under the hood. But how many
people will go to their graves with bugs like that still coded? How many bugs
like that exist as we speak? Remember, for _most_ cases, it will not show up.

Tell me this, do you agree with this statement:
People (mistakenly) may use if (array) to test for the emptiness of an array.
What about this:
Moreover, this test will work most of the time.
And finally:
The remaining times, they are bugs.

My proposal aims to prevent those bugs. 

>>> makes the foreach..else syntax suggestion from AJG very unnecessary.
>>
>> Huh? I don't see how the two things are related. You may have a valid  
>> point, but I fail to see the connection.
>
>I'm not sure either. I suspect he's referring to foreach being usable on a
>null array equally well, i.e. you don't have to check whether it's a null
>array, it will iterate 0 times for both a null array and an empty array.

If this is true, Ilya, that was never the intention of my suggestion. I know
that foreach is "safe" even with "null" arrays. The suggestion is a way to deal
with the no-items case elegantly without using a separate if statement every
single time. As a matter of fact, no-items happens quite a bit IMHO.
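
A minimal sketch of that no-items case, assuming a current D2 compiler rather than the 2005 one under discussion. foreach runs zero times for a null array and for an empty slice alike, so the separate check exists only to handle "nothing to show". `showAll` is a hypothetical helper, not a library function.

```d
import std.conv : to;

// Hypothetical helper: renders the items, or a placeholder when there are none.
string showAll(const(int)[] items)
{
    // The separate if statement that a foreach..else would fold into the loop.
    if (items.length == 0)
        return "(no items)";

    string result;
    foreach (i, x; items)
    {
        if (i > 0) result ~= ", ";
        result ~= x.to!string;
    }
    return result;
}

unittest
{
    int[] none = null;
    int[] some = [1, 2, 3];
    assert(showAll(none) == "(no items)");         // null array
    assert(showAll(some[0 .. 0]) == "(no items)"); // empty but non-null slice
    assert(showAll(some) == "1, 2, 3");
}
```

Saved under a file name of your choice (say sketch.d), the checks can be run with `dmd -unittest -main -run sketch.d`.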

Thanks for reading,
--AJG.
July 21, 2005
Re: [Suggestion] Make if(array) illegal.
On Thu, 21 Jul 2005 02:18:27 +0000 (UTC), AJG <AJG_member@pathlink.com>  
wrote:
>>> Hm... not only does this distinction exist, it is in fact _very_ much
>>> available in D. That's exactly the point Regan has made in some past
>>> replies. I'm indifferent towards this distinction, but Regan seems fond
>>> of it. Please look at my examples further below.
>>
>> It's true.
>
> Praise the lord, agreement. ;)

We're both men of "distinction" ;)

>> you're confusing the _implementation_ of arrays with the _behaviour_ of
>> arrays, the above array _reference_ behaves just like any other reference
>> that has been nulled(*) eg.
>
> I'm well aware of the implementation vs. the behaviour. It just so
> happens the two are married when it comes to the compiler. In fact, in
> the resulting executable, they are indistinguishable. Confusion arises as
> a result.

Sorry, I don't see your point. The compiler isn't confused, neither am I.  
Arrays are references, treat them as such and there is no confusion.

>>>> One of the reasons is that it seems
>>>> familiar to C programmers.
>>>
>>> Indeed. It seems familiar, and people will misuse it because of that.
>>
>> How? When you write "if(x)" you're asking is 'x' null or 0. D's answer  
>> is perfectly correct in all cases(*).
>
> And except for static arrays.

No, this is no exception to the rule.

Yes, static arrays are different to dynamic ones, no surprises there. Yes,  
static arrays cannot have a null data pointer, no, it makes no difference  
to the behaviour of "if(x)", nor should it.

Static arrays are the same as dynamic ones that _exist_; this makes
perfect sense as static arrays always exist.

> Oh, and strings, which must be compatible with C.

Again, there is no exception to the rule here.
"bob" is a static string, it cannot be null.
"" is a static string, it cannot be null.

Yes, the last example has no items, i.e. has a 0 length, but it still  
_exists_.

If Walter decided to remove the trailing null and make it incompatible
with C then it could be optimised away, i.e. the compiler could decide ""
was meaningless and so could remove it, making it non-existent. In that
case it wouldn't exist. Otherwise it does. As long as it exists it has a
non-null data pointer. The length is meaningless when talking about
existence.

>> (*) except for the _BUG_ where you can write:
>>
>> char[] p = "";
>> p.length = 0;
>> if (p) { } // false: length = 0 resets the data pointer to null
>
> Has Walter actually acknowledged this to be a bug?

In short, no. But then he isn't known for his verbosity on many matters.
He just percolates and out pops a new compiler, possibly with the changes we
talk about.

> This seems more like what you mentioned, a desire to make the
> distinction (empty/exist) disappear.

I believe that was the original intent.

> If that's the case, then why would you say it's a bug?

In this case my impression is that the real intent was to remove the seg-v
problems associated with null strings, remove the need to check for null
all the time, etc. That has been achieved; what is great is that at the
same time we can preserve the distinction if we so choose (it takes so very
little to do this, from the current state).

> If anything, it could only get worse.

Oh ye of little faith!

>>> # int[0] emptyArray;
>>> # if (emptyArray) writef("See, I'm empty, yet I exist!");
>>> // The statement will print.
>>
>> This is a static array. Its data pointer can never be null, thus it
>> always exists.
>> (Nothing incongruous here)
>
> My friend, that's the very definition of an incongruence.

Whose definition?
  http://dictionary.reference.com/search?q=incongruous

The closest/best definition for this situation appears to be:
  "Not in keeping with what is correct, proper, or logical; inappropriate:  
incongruous behavior"

> It means static arrays do not follow the same principles as other kinds  
> (just like strings).

What "principles" are you referring to?

> # int[0] empty;              // Not null.
> # int[ ] empty = new int[0]; // Yes null.

> I even went ahead and _assigned_ an empty array (int[0]) to the
> reference, and yet it remains _non_-existent. How do you explain that?
> You can't have a dynamic array that is empty and non-existent, but you
> _can_ have a static one? (or at least, not via the initializer?)

Aha! This is a new (good) example. I agree this example shows
"incongruous behaviour".

I would suggest that "int[0] s;" be an error, as it's pretty meaningless..  
Except template programmers would likely be a little annoyed with that.

I would suggest that "int[0] s;" have a null data pointer (as the dynamic  
one does).. But I believe they're implemented in such a way that there is  
no such data pointer.

There seems to be no simple solution to this problem, perhaps Walter has  
an idea. I'll post to the bugs NG.

> Let's analyze this carefully, and you will definitely see an  
> incongruence:
>
> # int[] A = null;
> # int[] B = new int[0];
>
> if (A) // this is false.
> if (B) // this is false.
>
> Since false == false, then A == B, and therefore null == int[0]. The very
> distinction you are so fond of is gone!

Not true.

I suspect "new int[0]" allocates no memory, therefore it _is_ null.
This is different to C/C++ which can and do allocate a zero-length item in  
the heap.

This could be a solution to the problem above, if "new int[0]" allocated a  
zero length item on the heap it would be consistent with the static array  
case.

>>> // Let's try it again:
>>> # int[] emptyArray = new int[0];
>>> # if (emptyArray) writef("I'm still empty, but non-existant.");
>>> // The statement will *not* print.
>>
>> Here you have not allocated any memory, thus nothing exists.
>> (Nothing incongruous here)
>
> Oh, so then it's purely about memory?

In essence, yes. If no memory is allocated it doesn't exist. Exactly like  
your own C example earlier.

> How very semantic. Never mind the fact that int[0] means an empty array.

"new int[0]" means allocate an array of 0 ints. 0 * int.sizeof == 0. In
other words allocate 0 bytes. I suspect a shortcut is being done where it
does no allocation when you ask for 0 bytes. I think perhaps it should
allocate a zero-length item on the heap instead.

> The distinction is lost, as shown above. IMHO there's no way around this  
> one.

Sure, there is 1 problem in the static array vs dynamic array example.
Let's hope Walter agrees and has/likes the solution.

>>> // Think about strings:
>>> # string emptyString = "";
>>> # if (emptyString) writef("Empty, yet I exist");
>>> // The statement will *not* print.
>>
>> Wrong, this statement will print (try it).
>>
>> The reason it prints is that memory _is_ allocated because string
>> constants are C compatible i.e. contain a null terminator. If this was  
>> not the case then this would act as the previous example.
>
> "If this was not the case". That's fine, but it happens to _be_ the case.
> Therefore the docs should state: "There is an incongruence when it comes  
> to string literals. Because we want them to be compatible with C, it  
> means an empty string is not really empty.

It depends how you want to look at it. When I type "" I'm saying here  
exists a string containing nothing. In other words, it _exists_ but  
contains _nothing_ it's the very definition of a non-null data pointer  
with a 0 length.

> In other words, what should have been an empty array is really not.  
> Careful, folks!"

It _is_ empty, its length is 0. The trailing \0 is effectively outside
the length of the array, it exists past the end.
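
A small sketch of that claim, assuming a current D2 compiler (where the literal is a `string`, not the `char[]` of 2005) and the usual zero-terminated storage for string literals. `emptyLiteralExists` is a hypothetical name used only for illustration.

```d
// Returns true when the empty literal has no elements, yet still has
// storage whose first byte is the C-compatible terminator.
bool emptyLiteralExists()
{
    string s = "";
    return s.length == 0     // no elements inside the array
        && s.ptr !is null    // but the data pointer is not null
        && *s.ptr == '\0';   // and the terminator sits just past the end
}

unittest
{
    assert(emptyLiteralExists());
}
```

This only holds for literals; a zero-length slice of runtime data has no terminator guarantee.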

>>> But what about this:
>>> # string emptyString = null;
>>> # if (emptyString) writef("Empty, but now I don't exist");
>>> // The statement will print.
>>
>> Wrong, it will not print. The array is null, nothing exists.
>> (Nothing incongruous here)
>>
>>> Would you say the behaviour I showed above is consistent?
>
> If you agree with the previous statements, you'll concur that the  
> behaviour is not consistent. It calls for exceptions to be made and  
> explained.

As I said above, there are no exceptions in the rule for "if(x)". It  
simply and always checks the variable 'x' against null or 0. Nothing more,  
nothing less. You do however need to understand what other statements like  
the "new int[0]" do, in order to understand how they relate to "if(x)".  
That doesn't mean there is anything wrong with "if(x)".

> Once more gratuitously: static vs. dynamic, and string literals, and the  
> .length "bug," and the dynamic initializer problem.

Summary:
I agree there is a problem with static vs dynamic above.
I don't agree that there is anything wrong with the behaviour of "if(x)".

>>> You don't find it a tad, say, ambiguous?
>
> If you at least agree it's inconsistent, then we are getting somewhere.

The static vs dynamic example above shows inconsistency.

> The ambiguity results in not knowing when which is going to happen.

Specifically with statements like "new int[0]" and "int[0] a" and what
exactly _they_ do.

>>> You don't think people will be confused? I certainly was.
>>
>> That's because you're asking the wrong questions, and you didn't check
>> your answers.
>
> I did check my answers, and now I know.

Yeah, I didn't see your post correcting it till after I wrote this.

> I made the mistake, and by _chance_ one case didn't work early on, so I  
> started looking under the hood. But how many people will go to their  
> graves with bugs like that still coded? How many bugs like that exist as  
> we speak? Remember, for _most_ cases, it will not show up.
>
> Tell me this, do you agree with this statement:
> People (mistakenly) may use if (array) to test for the emptiness of an
> array.

No. My reasoning:

1. Most container classes use a length or size member for this. I haven't  
seen a single container class/object/thing in any language that lets you  
check the length or size of an object using "if(x)".

2. The statement "if(x)" is well known to mean check x vs null or 0. If you
assume an array is a struct you're writing something meaningless. If you
assume an array is a reference you're comparing the reference to null or
0. I cannot see how you would ever think it would silently call the
length member of x.

> What about this:
> Moreover, this test will work most of the time.

Sure. Most of the time you'll have an array with items, thus the data  
pointer will be non-null.

> And finally:
> The remaining times, they are bugs.

Yes. Assuming: you wrote "if(x)" and meant to check for length>0 then in  
the case of a non-null data pointer and a 0 length it would execute the  
code you had written for arrays with a length greater than 0.

> My proposal aims to prevent those bugs.

Sure, only you want to do it in such a way as to break existing code  
relying on "if(x)". You want to introduce inconsistent behaviour (making  
arrays behave differently to all other types in D). And lastly the bugs  
you're referring to are, IMO, unlikely to occur.

Essentially you have to generate a zero length non-null array. The 3 ways  
I know of doing this are:

char[0] p;            //1
char[] p = "";        //2

char[] tmp = "abc";
char[] p = tmp[0..0]; //3

You'd have to (incorrectly) attempt to compare the length of an array with  
"if(p)" and the outcome would have to be wrong in a subtle way for this to  
be a serious problem, a blatant bug is easy to find and you quickly learn  
not to use "if(p)" to check for length.
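
The literal and slice cases above can be sketched as follows, assuming a current D2 compiler rather than the 2005 one (so `string` stands in for `char[]`). `describe` is a hypothetical helper that asks the two questions explicitly instead of relying on "if(p)".

```d
// Classify an array by its data pointer first, then by its length.
string describe(const(char)[] a)
{
    if (a.ptr is null) return "null";
    if (a.length == 0) return "empty";
    return "has items";
}

unittest
{
    string tmp = "abc";
    assert(describe(null) == "null");          // no storage at all
    assert(describe("") == "empty");           // case 2: empty literal
    assert(describe(tmp[0 .. 0]) == "empty");  // case 3: zero-length slice
    assert(describe(tmp) == "has items");
}
```

The static-array case and "new char[0]" are left out of the checks on purpose, since what their data pointer holds is an implementation detail I would not want to assert here.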

In most cases I can imagine, the non-null zero-length array causes no
problems, because as Ilya mentioned things like "foreach" treat them the
same. This is part of the "treat them the same" that was Walter's initial
goal and is achieved mostly by array references never being null.

In short, I like it how it is, I can't see a significant problem, and I
totally dislike your suggested solution. But, like you say, thanks for
listening to my point of view, it's been fun. (I think we've exhausted our
ideas and I don't think we're agreeing.)

Regan.
July 21, 2005
Re: [Suggestion] Make if(array) illegal.
>>It might be easier to just live with the current behavior.
>
> That's just laziness speaking ;).

Maybe "easier" isn't the right word :-)
The last time this topic came up one suggestion was to encourage explicit 
.length or .ptr conditions but to keep the current implicit conversions. For 
example the C++string vs D string page 
http://www.digitalmars.com/d/cppstrings.html was changed to test for empty 
as:
if (!array.length) ...
It's in the section "Checking For Empty Strings". It used to just be "if 
(!array)", I think.
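
A sketch of what those explicit conditions look like, assuming a current D2 compiler. `hasNoItems` and `hasNoStorage` are hypothetical names; `empty` is the existing range primitive from `std.range.primitives`, which for arrays is a length test.

```d
import std.range.primitives : empty;

// The emptiness question: are there zero elements?
bool hasNoItems(const(char)[] a) { return a.length == 0; }

// The existence question: is the data pointer null?
bool hasNoStorage(const(char)[] a) { return a.ptr is null; }

unittest
{
    const(char)[] nothing = null;
    assert(hasNoItems(nothing) && hasNoStorage(nothing));
    assert(hasNoItems("") && !hasNoStorage(""));
    assert(nothing.empty && "".empty); // library spelling of the length test
}
```

Spelling out `.length` or `.ptr` makes it visible at the call site which of the two questions is being asked.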


>>Then again we already have 'if (x = y)' illegal so there is precedent for
>>filtering conditions - the good-old 'value does not give boolean result'
>>error.
>
> Yes! That's exactly what I was thinking. D even has its cake and eats it,
> because (x = y) is still legal with an additional explicit == true/false;
> this is great. It allows you to do it yet prevents the common missing =
> mistake.
>
> This is analogous to if (array). The pointer check can still be done via
> array.ptr, but D would error out when using the ambiguous form. So there
> is definitely precedent, and it's a good precedent.

In fact now that I think about the 'if (!array)' code if we made 'if 
(array)' illegal we'd also need a special check for 'if (!array)'. That's at 
least two more special cases for conditions.
July 21, 2005
Re: [Suggestion] Make if(array) illegal.
Hi,

>>>> Please
>>>> look at
>>>> my examples further below.
>>>
>>> It's true.
>>
>> Praise the lord, agreement. ;)
>
>We're both men of "distinction" ;)

Hehehe. I'll be requiring your testimony in court one day.

>In short, I like it how it is, I can't see a significant problem, and I  
>totally dislike your suggested solution. But, like you say thanks for  
>listening to my point of view, it's been fun. (I think we've exhausted our  
>ideas and I don't think we're agreeing)

Yes, I suppose we can agree to disagree.

One last couple of things I'd like to clarify, though: My idea is not
necessarily to make if (array) check length automatically. This is just one of
the three I mentioned. My general suggestion is to improve/clarify and document
the behaviour of the construct because I find it dangerous and leading to the
subtle bugs I mentioned.

You agreed that the bugs can at least happen. It'd be great to know how common
they could appear; alas, this wouldn't be easy. However, in all honesty, bugs
arising from using assignment as a boolean (if (x = y)) haven't happened to me
very much. Maybe once or twice (in years). Yet the construct was made partially
illegal, requiring a more explicit version. That's fine with me. It helps
prevent those subtle (if seldom) bugs.

In addition, IIRC, nowhere on the D site proper is there a mention of what the
correct behaviour is supposed to be. I have a feeling Walter left this construct
a little unfinished with regards to arrays. Maybe he's working on the empty/null
distinction thing and then he will revise it. Anyway, as I've said the lack of
documentation doesn't help.

And finally: Could you give me a concrete example of a useful application of if
(array) to test for the array pointer's nullness? Say, in a complete function? I
simply don't think dealing with ptrs (or checking them) should be necessary in D
except for C-compat. But perhaps you have a really good use for this construct
that I haven't considered.

Thanks,
--AJG.
July 21, 2005
Re: [Suggestion] Make if(array) illegal.
On Thu, 21 Jul 2005 13:35:36 +0000 (UTC), AJG <AJG_member@pathlink.com>  
wrote:
> And finally: Could you give me a concrete example of a useful  
> application of if (array) to test for the array pointer's nullness? Say,  
> in a complete function? I simply don't think dealing with ptrs (or  
> checking them) should be necessary in D except for C-compat. But perhaps  
> you have a really good use for this construct that I haven't considered.

Template programming is an example of where we rely on the logical  
consistency of types to achieve generic things, see:

import std.stdio;

class A
{
	char[] toString()
	{
		return "A";
	}
}

template doWrite(Type)
{
	void doWrite(Type p)
	{
		if (p) writef(p);
	}
}

alias doWrite!(A) doWriteA;
alias doWrite!(char[]) doWriteC;

void main()
{
	char[] a = "this is an ";
	
	doWriteC(null);
	doWriteC(a);
	doWriteA(null);
	doWriteA(new A());
}

Essentially anywhere you expect consistent behaviour of references (string
or otherwise) and want to test the reference is not null, i.e.
non-existent.

Regan
July 22, 2005
Re: [Suggestion] Make if(array) illegal.
"Regan Heath" <regan@netwin.co.nz> wrote in message 
news:opst6x8cje23k2f5@nrage.netwin.co.nz...
> On Wed, 20 Jul 2005 02:15:58 +0000 (UTC), AJG <AJG_member@pathlink.com> 
> wrote:
>> This is a suggestion based on a thread from a couple of weeks ago. What
>> about making if (array) illegal in D? I think it brings ambiguity and a
>> high potential for errors to the language. The main two uses for this
>> construct can already be done with a slightly more explicit syntax:
>>
>> if (array.ptr == null) // Check for a kind of "non-existence."
>> if (array.length == 0) // Check for explicit emptiness.
>>
>> On the other hand, one is not sure what if (array) by itself is supposed
>> to mean, since it's _not_ like C. In C, if (array), where array is
>> typically a pointer, means simply != NULL. The problem in D is that the
>> array ptr is tricky and IMHO it's best not to interface with it directly.
>>
>> I think it would be wise to remove this ambiguity. I propose two options:
>> 1) Make if (array) equal _always_ to if (array.length).
>> 2) Simply make it illegal.
>>
>> What do you guys think? Walter?
>
> I prefer the current behaviour (for all the reasons I mentioned in the 
> previous thread):
>   http://www.digitalmars.com/drn-bin/wwwnews?digitalmars.D/25804
>
> "if (array)" is the same as "if (array.ptr)" which acts just like it does 
> in C, comparing it to 0/null.
>
> Essentially the "if" statement is checking the not zero state of the 
> variable itself. In the case of value types it compares the value to 0. In 
> the case of pointers and references it compares them to null.
>
> In the case of an array, which (as explained in link above) is a 
> mix/pseudo value/reference type, it compares the data pointer to null.
>
> The reason this is the correct behaviour is that a null array has a null 
> data pointer, but, an empty array i.e. an existing set containing no 
> elements may have a non-null data pointer. In both cases they have a 0 
> length property.
>
> Of course we could change this, we could remove the case where an array 
> contains no items but has a non-null data pointer. This IMO would remove a 
> useful distinction, the "existing set containing no items" would be 
> un-representable with a single array variable. IMO that would be a bad 
> move, the current situation(*) is good.
>
> (*) there remains the problem where setting the length of an array sets 
> the data pointer to null. This can change an "existing set with no 
> elements" into a "non-existent set".
>
> Regan

I was poking around the Qt documentation and interestingly enough QString 
has a concept of null and empty. Here's what they say, though: "For 
historical reasons, QString distinguishes between a null string and an empty 
string. [snip] We recommend that you always use isEmpty() and avoid 
isNull()."

The exact doc is 
http://doc.trolltech.com/4.0/qstring.html#distinction-between-null-and-empty-strings
July 22, 2005
Re: [Suggestion] Make if(array) illegal.
On Thu, 21 Jul 2005 22:31:37 -0400, Ben Hinkle <ben.hinkle@gmail.com>  
wrote:
> I was poking around the Qt documentation and interestingly enough QString
> has a concept of null and empty. Here's what they say, though: "For
> historical reasons, QString distinguishes between a null string and an
> empty string. [snip] We recommend that you always use isEmpty() and avoid
> isNull()."
>
> The exact doc is
> http://doc.trolltech.com/4.0/qstring.html#distinction-between-null-and-empty-strings

That's not too surprising. A lot of people have never seen the need for  
the distinction, and it certainly can make life "simpler". However, I  
don't believe you can argue that it doesn't exist, at least logically.  
That is why you get situations like this (stolen from a post to the  
DMDScript group):

<quote>
For example, might it not be useful to return 'null' on EOF, thus allowing
this sort of construct:

    var line = readln();

    while (line != null)
    {
         ...
         line = readln();
    }
</quote>

which is an example where there is a desire to distinguish between
existence and emptiness.
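
The same pattern can be sketched in D, assuming a current D2 compiler and assuming (as discussed earlier in the thread) that an empty string literal keeps a non-null data pointer. `Lines` and `countLines` are hypothetical names, not part of any library.

```d
// Hypothetical line source: returns null once the input is exhausted,
// while a blank line comes back as a non-null, zero-length string.
struct Lines
{
    string[] pending;

    string next()
    {
        if (pending.length == 0) return null; // end of input
        string line = pending[0];
        pending = pending[1 .. $];
        return line;
    }
}

// Counts lines until the source reports end of input with null.
size_t countLines(string[] input)
{
    auto src = Lines(input);
    size_t count = 0;
    for (string line = src.next(); line !is null; line = src.next())
        ++count;
    return count;
}

unittest
{
    // The blank line in the middle does not end the loop.
    assert(countLines(["first", "", "last"]) == 3);
    assert(countLines(null) == 0);
}
```

Had the loop tested `line.length` instead, it would have stopped at the blank line, which is exactly the existence-versus-emptiness distinction being argued over.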

Sure, you can remove the distinction, lessen the expressiveness of arrays  
and force everyone to "work around" the deficiency in other ways, it's  
possible, it can make life simpler for the general case and more  
complicated for the rest.

I think arrays in D are nearly perfect(*). They allow you to ignore the  
distinction in the general case (thus life is pretty easy already) yet you  
can tell the difference if you require it.

(*) there are only 2 problems with them IMO:

1. length = 0; resets the data pointer to null, changing empty into
non-existent.
2. "int[0] a;" and "int[] a = new int[0];" produce different results when  
you'd expect the same thing.

Regan
July 22, 2005
Re: [Suggestion] Make if(array) illegal.
Derek Parnell schrieb:
>>Making difference between an empty array and a nonexistent one is flaky, 
>>if not directly ambiguous, thus D does not do it, as far as i can 
>>remember the statement of Walter. Thus if(array) is not ambiguous.
> 
> Maybe in your world, but not in mine.

[...]

> To repeat: Existence and Emptiness are not the same concept.

The matter of discussion is not your or my view of the real world, nor 
some other programming languages' realm. The matter is how arrays are 
implemented, or should be implemented, in D. Considering that D relies 
heavily on garbage collection with arrays anyway, the construct of an 
empty but existent array is unnecessary.

I believe that making this distinction, between empty and non-existent 
arrays, just provides the possibility for another misconception and bug.

Anyone who sees a real technical necessity to distinguish between the 
empty array and the non-existing one is invited to show it here.

-eye
July 22, 2005
Re: [Suggestion] Make if(array) illegal.
"Regan Heath" <regan@netwin.co.nz> wrote in message 
news:opsuaqfmcv23k2f5@nrage.netwin.co.nz...
> On Thu, 21 Jul 2005 22:31:37 -0400, Ben Hinkle <ben.hinkle@gmail.com> 
> wrote:
>> "Regan Heath" <regan@netwin.co.nz> wrote in message
>> news:opst6x8cje23k2f5@nrage.netwin.co.nz...
>>> On Wed, 20 Jul 2005 02:15:58 +0000 (UTC), AJG <AJG_member@pathlink.com>
>>> wrote:
>>>> This is a suggestion based on a thread from a couple of weeks ago. What
>>>> about
>>>> making if (array) illegal in D? I think it brings ambiguity and a high
>>>> potential
>>>> for errors to the language. The main two uses for this construct can
>>>> already be
>>>> done with a slightly more explicit syntax:
>>>>
>>>> if (array.ptr == null) // Check for a kind of "non-existence."
>>>> if (array.length == 0) // Check for explicit emptiness.
>>>>
>>>> On the other hand, one is not sure what if (array) by itself is 
>>>> supposed
>>>> to
>>>> mean, since it's _not_ like C. In C, if (array), where array is
>>>> typically a
>>>> pointer, means simply != NULL. The problem in D is that the array ptr 
>>>> is
>>>> tricky
>>>> and IMHO it's best not to interface with it directly.
>>>>
>>>> I think it would be wise to remove this ambiguity. I propose two 
>>>> options:
>>>> 1) Make if (array) equal _always_ to if (array.length).
>>>> 2) Simply make it illegal.
>>>>
>>>> What do you guys think? Walter?
>>>
>>> I prefer the current behaviour (for all the reasons I mentioned in the
>>> previous thread):
>>>   http://www.digitalmars.com/drn-bin/wwwnews?digitalmars.D/25804
>>>
>>> "if (array)" is the same as "if (array.ptr)" which acts just like it 
>>> does
>>> in C, comparing it to 0/null.
>>>
>>> Essentially the "if" statement is checking the not zero state of the
>>> variable itself. In the case of value types it compares the value to 0. 
>>> In
>>> the case of pointers and references it compares them to null.
>>>
>>> In the case of an array, which (as explained in link above) is a
>>> mix/pseudo value/reference type, it compares the data pointer to null.
>>>
>>> The reason this is the correct behaviour is that a null array has a null
>>> data pointer, but, an empty array i.e. an existing set containing no
>>> elements may have a non-null data pointer. In both cases they have a 0
>>> length property.
>>>
>>> Of course we could change this, we could remove the case where an array
>>> contains no items but has a non-null data pointer. This IMO would 
>>> remove a
>>> useful distinction, the "existing set containing no items" would be
>>> un-representable with a single array variable. IMO that would be a bad
>>> move, the current situation(*) is good.
>>>
>>> (*) there remains the problem where setting the length of an array sets
>>> the data pointer to null. This can change an "existing set with no
>>> elements" into a "non-existent set".
>>>
>>> Regan
>>
>> I was poking around the Qt documentation and interestingly enough QString
>> has a concept of null and empty. Here's what they say, though: "For
>> historical reasons, QString distinguishes between a null string and an 
>> empty
>> string. [snip] We recommend that you always use isEmpty() and avoid
>> isNull()."
>>
>> The exact doc is
>> http://doc.trolltech.com/4.0/qstring.html#distinction-between-null-and-empty-strings
>
> That's not too surprising. A lot of people have never seen the need for 
> the distinction, and it certainly can make life "simpler". However, I 
> don't believe you can argue that it doesn't exist, at least logically. 
> That is why you get situations like this (stolen from a post to the 
> DMDScript group):
>
> <quote>
> For example, might it not be useful to return 'null' on EOF, thus allowing
> this sort of construct:
>
>     var line = readln();
>
>     while (line != null)
>     {
>          ...
>          line = readln();
>     }
> </quote>
>
> which is an example where there is a desire to distinguish between 
> existence and emptiness.
>
> Sure, you can remove the distinction, lessen the expressiveness of arrays 
> and force everyone to "work around" the deficiency in other ways, it's 
> possible, it can make life simpler for the general case and more 
> complicated for the rest.
>
> I think arrays in D are nearly perfect(*). They allow you to ignore the 
> distinction in the general case (thus life is pretty easy already) yet you 
> can tell the difference if you require it.
>
> (*) there are only 2 problems with them IMO:
>
> 1. length = 0; resets the data pointer to null, changing empty into 
> non-existent.
> 2. "int[0] a;" and "int[] a = new int[0];" produce different results when 
> you'd expect the same thing.
>
> Regan

Sure, I agree special values can be useful and null is an easy special value 
to use. Note the same behavior can be obtained by returning a singleton 
empty string just for eof, if desired. The singleton approach could arguably 
make the code more readable, too, since the reader wouldn't have to know that 
a null line means eof. For example:

    char[] line = din.readLine();
    while (line !is din.eofLine()) { ... line = din.readLine(); }

where eofLine can return null or, if the stream author wishes, it can return 
some other unique empty string.
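
A rough sketch of how such a singleton could be built, assuming a current D compiler. LineSource is a hypothetical in-memory stand-in for the stream (it is not the class behind din), and its sentinel is simply a zero-length slice of storage that nothing else points into, so the identity test (!is) can pick it out from an ordinary blank line.

```d
import std.stdio;

/// Hypothetical in-memory line source; not a Phobos stream class.
class LineSource
{
    private string[] lines;
    private size_t next;
    private string eof;

    this(string[] lines)
    {
        this.lines = lines;
        // Zero-length slice of freshly allocated storage: empty, but with a
        // pointer no input line shares, so `is` can identify it.
        string storage = "x".idup;
        this.eof = storage[0 .. 0];
    }

    /// The unique empty string that marks end of input.
    string eofLine()
    {
        return eof;
    }

    /// Returns the next line, or the sentinel once the input is exhausted.
    string readLine()
    {
        return next < lines.length ? lines[next++] : eof;
    }
}

void main()
{
    auto din = new LineSource(["first", "", "last"]);
    string line = din.readLine();
    while (line !is din.eofLine())
    {
        writeln("[", line, "]");
        line = din.readLine();
    }
}
```

The blank second line does not end the loop, because it is empty but not identical to the sentinel.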
July 23, 2005
Re: [Suggestion] Make if(array) illegal.
On Fri, 22 Jul 2005 15:00:51 +0200, Ilya Minkov <minkov@cs.tum.edu> wrote:
> Derek Parnell wrote:
>>> Making difference between an empty array and a nonexistent one is  
>>> flaky, if not directly ambiguous, thus D does not do it, as far as i  
>>> can remember the statement of Walter. Thus if(array) is not ambiguous.
>>  Maybe in your world, but not in mine.
>
> [...]
>
>> To repeat: Existence and Emptiness are not the same concept.
>
> The matter of discussion is not your or my view of the real world, nor  
> some other programming languages' realm. The matter is how arrays are  
> implemented, or should be implemented in D.

Sure, however D exists in the real world. Programmers solve real world  
problems. IMO arrays should be implemented in D in a manner that best  
allows us to do that.

> Considering that D relies heavily on garbage collection with arrays  
> anyway, the construct of an empty but existent array is unnecessary.

I don't see your point. The concepts of existence, non-existence, empty and  
not-empty still exist with garbage collection as much as with any other  
memory management scheme. Garbage collection does not obviate the need to  
express non-existence, exists but empty, exists and not empty.

> I believe that making this distinction, between empty and non-existent  
> arrays, just provides the possibility for another misconception and bug.

You're correct in one respect: having the ability to express more (i.e.  
non-existence, exists but empty, exists and not empty) adds complexity,  
increasing the chance that someone will mistakenly use one when they mean  
the other.

However, as a concrete example, a very common bug in C/C++ is dereferencing  
a null pointer (a pointer is a good example of a type which can represent  
non-existence, exists but empty, exists and not empty).

Arrays in D do not share this problem: the array reference cannot be null.  
At the same time, the current array implementation retains the  
expressiveness that allows you to represent non-existence, exists but  
empty, exists and not empty.

My point is that D's arrays have the expressiveness without the  
complexity: you can ignore the non-existence case unless you want/need to  
consider it.

> Anyone who sees a real technical necessity to distinguish between the  
> empty array and the non-existing one is invited to show it here.

I'm not sure there is a "necessity", as in most cases you could probably  
"work around" the restriction (if it was added to D). Here is an example  
where the expressiveness of representing non-existence, exists but empty,  
exists and not empty is useful.

This comment was posted to the DMDScript NG recently:

<quote>
For example, might it not be useful to return 'null' on EOF, thus allowing
this sort of construct:

    var line = readln();

    while (line != null)
    {
         ...
         line = readln();
    }
</quote>

Of course you could implement this in another way, removing the need for  
the ability to represent non-existence. You would have to if your type  
couldn't represent non-existence; that is the price you pay for  
simplicity. The price paid for the current array's expressiveness is very  
little IMO.
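
For comparison, a minimal sketch of the same loop written in D with null as the EOF marker, assuming a current D compiler. nextLine is a hypothetical reader over an in-memory list, not a library function, and the sketch relies on blank lines coming from "" literals, which carry a non-null pointer.

```d
import std.stdio;

/// Hypothetical reader: hands out lines in order, then null once none remain.
string nextLine(ref string[] lines)
{
    if (lines.length == 0)
        return null;            // non-existent: nothing left to read
    string line = lines[0];
    lines = lines[1 .. $];
    return line;                // may be empty, but it exists
}

void main()
{
    string[] input = ["alpha", "", "omega"];
    string line = nextLine(input);
    // Identity test: only a truly null line ends the loop.
    while (line !is null)
    {
        writeln("read a line of length ", line.length);
        line = nextLine(input);
    }
}
```

Note that the loop uses !is rather than !=, since != compares contents in D and would treat the blank line as equal to null.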

Regan