July 25, 2007
Frits van Bommel wrote:
> 
> Since null arrays have length 0, they *are* empty arrays :P.
> 

But empty arrays are not null. You could even argue that null arrays don't have a length, thus they can't be empty.

-- 
Carlos Santander Bernal
July 25, 2007
Regan Heath wrote:
>>> The only thing that should compare equal to null is null.  Likewise an empty array should only compare equal to another empty array.
>>  >
>>  > My reasoning for this is consistency, see at end.
>>
>> Since null arrays have length 0, they *are* empty arrays :P.
> 
> I can't tell in which way you're joking so I'm just going to come out with...
> 
> The length of something, be it an array, a car, a <insert thing>, is totally independent of whether it exists (though a nonexistent item cannot have a length).
> 
> It either exists or it does not.  If it exists, it has a length which may or may not be zero.
> 
> Something which exists cannot be equal to something which doesn't.

I don't think that's really what's happening here.
Consider vectors. If a vector has a length of zero, the direction doesn't exist.
Take two arbitrary vectors with different directions, a and b.
a*0 == b*0, even though the direction of a is completely different to that of b.
This is the same model which is being used for arrays; if the .length is zero, the .ptr is irrelevant.
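
As a rough illustration of that model, the sketch below takes two zero-length slices at different offsets of one buffer and compares them. It assumes a current D2 compiler with Phobos; the function name `emptySlicesCompareEqual` is made up for this example.

```d
import std.stdio;

// Hypothetical helper: builds two empty slices with different .ptr values
// and reports whether '==' treats them as equal.
bool emptySlicesCompareEqual()
{
    int[] buf = [1, 2, 3, 4];
    int[] a = buf[0 .. 0]; // length 0, points at buf[0]
    int[] b = buf[2 .. 2]; // length 0, points at buf[2]
    assert(a.ptr !is b.ptr); // the "directions" differ
    return a == b;           // but '==' only looks at length and contents
}

void main()
{
    writeln(emptySlicesCompareEqual()); // true
}
```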
July 26, 2007
On Wed, 25 Jul 2007 15:05:25 +0200, Frits van Bommel wrote:

> Since both the null string and "" have .length == 0, that means they compare equal using those methods (having no contents to compare and equal length)
> 
> This is all perfectly consistent (and even useful) to me...

However,

   string x = "";

means that 'x' is not null because it has a pointer, and that pointer refers to a string with no content. Something that is null has no pointer, and therefore the length component is not significant. But of course, in order to represent something that really does have the address of zero, we should only consider 'x' to be null when x.ptr and x.length are both zero. In every other case it is not null.
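
A minimal sketch of that rule, assuming a current D2 compiler with Phobos. The helper name `isNullArray` is hypothetical and is not part of the language or library.

```d
import std.stdio;

// Hypothetical helper: "null" in the sense described above, i.e. both
// the pointer and the length are zero.
bool isNullArray(const(char)[] s)
{
    return s.ptr is null && s.length == 0;
}

void main()
{
    string x = "";   // empty literal: has a pointer, no content
    string y = null; // null array: no pointer

    writeln(isNullArray(x)); // false
    writeln(isNullArray(y)); // true
    writeln(x is null);      // false: identity looks at .ptr and .length
    writeln(x == y);         // true: '==' only looks at length and contents
}
```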

-- 
Derek Parnell
Melbourne, Australia
"Down with mediocrity!"
July 26, 2007
On Wed, 25 Jul 2007 14:29:47 +0100, Regan Heath wrote:

> Aside: If the location and length are identical you can short-circuit the compare, returning true and ignoring the content, this could save a bit of time on comparisons of large arrays.

I don't think this is such a good idea. How does one address the array of
four bytes at RAM location 4?


-- 
Derek Parnell
Melbourne, Australia
"Down with mediocrity!"
July 26, 2007
On Wed, 25 Jul 2007 14:31:28 +0100, Bruno Medeiros wrote:

> The .ptr of empty arrays may be different than the .ptr of null arrays, but they are conceptually the same, and thus not safely distinguishable.

No they are not! Conceptually they are different things. However, D sometimes implements them as the same thing.

> Example:
> 	writefln("" is null); // false
> 	writefln("".dup is null); // true
>
> "".ptr is not null, but "".dup.ptr is null. Such duplication is correct, because empty arrays are conceptually the same as null arrays, and trying to use .ptr to distinguish them is unsafe, implementation-dependent behavior (aka a program error).

But I believe that the implementation here is wrong. "".dup should create another empty string and not a null string.
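
For anyone who wants to check what their own compiler does, here is a small probe, assuming a current D2 compiler with Phobos. The helper name `dupEmpty` is made up for this example. Whether the duplicate comes back null is left to the runtime, so the last line prints the answer instead of assuming one.

```d
import std.stdio;

// Hypothetical helper: duplicates the empty string literal.
char[] dupEmpty()
{
    return "".dup;
}

void main()
{
    char[] copy = dupEmpty();

    writeln(copy.length); // 0
    writeln(copy == "");  // true: same length, no contents to differ
    // Implementation-dependent: may print true or false.
    writeln(copy is null);
}
```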

-- 
Derek Parnell
Melbourne, Australia
"Down with mediocrity!"
July 26, 2007
On Wed, 25 Jul 2007 19:01:57 +0200, Frits van Bommel wrote:

> Since null arrays have length 0, they *are* empty arrays :P.

Not in my world. I see that null arrays have no length. That is to say, they
do not have any length, which is different from saying they have a length
and that length is zero.


>> All that I would like changed is for the compare, in the case of length == 0, to check the data pointers, eg.
>> 
>>  > int opEquals(T)(T[] u, T[] v) {
>>  >     if (u.length != v.length) return false;
>>       if (u.length == 0) return (u.ptr == v.ptr);
>>  >     for (size_t i = 0; i < u.length; i++) {
>>  >         if (u[i] != v[i]) return false;
>>  >     }
>>  >     return true;
>>  > }
>> 
>> This should mean "" == "" but not "" == null, likewise null == null but not null == "".
> 
> Let's look at this code:
> ---
> import std.stdio;
> 
> void main()
> {
>      char[][] strings = ["hello world!", "", null];
> 
>      foreach (str; strings) {
>          auto str2 = str.dup;
>          if (str == str2)
>              writefln(`"%s" == "%s" (%s, %s)`, str, str2, str.ptr, str2.ptr);
>          else
>              writefln(`"%s" != "%s" (%s, %s)`, str, str2, str.ptr, str2.ptr);
>      }
> }
> ---
> The output is currently (on my machine):
> =====
> "hello world!" == "hello world!" (805BE60, F7CFBFE0)
> "" == "" (805BE78, 0000)
> "" == "" (0000, 0000)
> =====
> Your change would change the second line (even if it actually allocated
> a new empty string like you probably want instead of returning null).
> How would that be consistent in any way?

Your example is misleading for at least two reasons:
** The '==' operator compares the contents of the strings. A null string
has no content so there is nothing to compare. This should fail but it
doesn't in the current D. It should fail in the same manner that a null
object reference fails the '==' operator.
** The output is writefln's attempt at giving a string representation of the
data presented. It (aka Walter) has decided that the string representation
of a null array is an empty string. This does not mean that a null array is
an empty string but just that writefln represents it as such.


-- 
Derek Parnell
Melbourne, Australia
"Down with mediocrity!"
July 26, 2007
On Wed, 25 Jul 2007 22:07:15 +0200, Don Clugston wrote:

> Regan Heath wrote:
>>>> The only thing that should compare equal to null is null.  Likewise an empty array should only compare equal to another empty array.
>>>  >
>>>  > My reasoning for this is consistency, see at end.
>>>
>>> Since null arrays have length 0, they *are* empty arrays :P.
>> 
>> I can't tell in which way you're joking so I'm just going to come out with...
>> 
>> The length of something, be it an array, a car, a <insert thing>, is totally independent of whether it exists (though a nonexistent item cannot have a length).
>> 
>> It either exists or it does not.  If it exists, it has a length which may or may not be zero.
>> 
>> Something which exists cannot be equal to something which doesn't.
> 
> I don't think that's really what's happening here.
> Consider vectors. If a vector has a length of zero, the direction doesn't exist.
> Take two arbitrary vectors with different directions, a and b.
> a*0 == b*0, even though the direction of a is completely different to that of b.
> This is the same model which is being used for arrays; if the .length is zero,
> the .ptr is irrelevant.

But arrays are not vectors.

-- 
Derek Parnell
Melbourne, Australia
"Down with mediocrity!"
July 26, 2007
Derek Parnell wrote:
> On Wed, 25 Jul 2007 19:01:57 +0200, Frits van Bommel wrote:
> 
>> Since null arrays have length 0, they *are* empty arrays :P.
> 
> Not in my world. I see that null arrays have no length. That is to say, they
> do not have any length, which is different from saying they have a length
> and that length is zero.

But the fact of the matter is, 'T[] x = null;' reserves space for the .length and sets it to 0. If you have a suggestion for a different value to put there, by all means make it.
Or would you prefer a segfault or diagnostic when accessing (cast(T[])null).length? That'd introduce overhead on every .length access (unless the compiler can statically determine whether an array reference is null).
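
To make the layout point concrete, here is a small sketch assuming a current D2 compiler with Phobos. The function name `lengthOfNull` is made up for this example.

```d
import std.stdio;

// A dynamic array reference occupies two machine words: length and pointer.
static assert((int[]).sizeof == size_t.sizeof + (int*).sizeof);

// Hypothetical helper: reads .length from a null array reference.
size_t lengthOfNull()
{
    int[] x = null;
    return x.length; // reads the stored length; .ptr is never dereferenced
}

void main()
{
    writeln(lengthOfNull());            // 0
    writeln((cast(int[]) null).length); // 0, no segfault and no check needed
}
```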

>>> All that I would like changed is for the compare, in the case of length == 0, to check the data pointers, eg.
>>>
>>>  > int opEquals(T)(T[] u, T[] v) {
>>>  >     if (u.length != v.length) return false;
>>>       if (u.length == 0) return (u.ptr == v.ptr);
>>>  >     for (size_t i = 0; i < u.length; i++) {
>>>  >         if (u[i] != v[i]) return false;
>>>  >     }
>>>  >     return true;
>>>  > }
>>>
>>> This should mean "" == "" but not "" == null, likewise null == null but not null == "".
>> Let's look at this code:
>> ---
>> import std.stdio;
>>
>> void main()
>> {
>>      char[][] strings = ["hello world!", "", null];
>>
>>      foreach (str; strings) {
>>          auto str2 = str.dup;
>>          if (str == str2)
>>              writefln(`"%s" == "%s" (%s, %s)`, str, str2, str.ptr, str2.ptr);
>>          else
>>              writefln(`"%s" != "%s" (%s, %s)`, str, str2, str.ptr, str2.ptr);
>>      }
>> }
>> ---
>> The output is currently (on my machine):
>> =====
>> "hello world!" == "hello world!" (805BE60, F7CFBFE0)
>> "" == "" (805BE78, 0000)
>> "" == "" (0000, 0000)
>> =====
>> Your change would change the second line (even if it actually allocated a new empty string like you probably want instead of returning null). How would that be consistent in any way?
> 
> Your example is misleading for at least two reasons:
> ** The '==' operator compares the contents of the strings. A null string
> has no content so there is nothing to compare. This should fail but it
> doesn't in the current D. It should fail in the same manner that a null
> object reference fails the '==' operator.

This wasn't the point of the example. I could have left out the third element and changed the .dup in the second line to a different empty string (e.g. a 0-length slice of the first one) and the point would remain the same: the proposed change would break comparison by '==' for empty non-null strings.

> ** The output is writefln's attempt at giving a string representation of the
> data presented. It (aka Walter) has decided that the string representation
> of a null array is an empty string. This does not mean that a null array is
> an empty string but just that writefln represents it as such.

Like I said, the point of the example didn't actually have anything to do with null strings, but rather with a bug in a change Regan proposed to make null strings and non-null empty strings compare unequal, which resulted in non-null empty strings comparing unequal.
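
The problem can be reproduced without null or .dup at all. The sketch below assumes a current D2 compiler with Phobos and rewrites the proposed comparison as a free function named `proposedEquals` (a hypothetical name; the original was written as an `opEquals`).

```d
import std.stdio;

// Hypothetical free-function version of the proposed comparison:
// zero-length arrays are equal only if their pointers match.
bool proposedEquals(T)(T[] u, T[] v)
{
    if (u.length != v.length) return false;
    if (u.length == 0) return u.ptr is v.ptr;
    foreach (i; 0 .. u.length)
        if (u[i] != v[i]) return false;
    return true;
}

void main()
{
    string s = "hello world!";
    string e1 = s[0 .. 0]; // empty, non-null, points at 'h'
    string e2 = s[5 .. 5]; // empty, non-null, points at ' '
    string n = null;

    writeln(e1 == e2);               // true with the built-in '=='
    writeln(proposedEquals(e1, e2)); // false: two empty strings now differ
    writeln(proposedEquals(e1, n));  // false, which is what was intended
}
```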
July 26, 2007
Derek Parnell wrote:
> On Wed, 25 Jul 2007 14:29:47 +0100, Regan Heath wrote:
> 
>> Aside: If the location and length are identical you can short-circuit the compare, returning true and ignoring the content, this could save a bit of time on comparisons of large arrays.
> 
> I don't think this is such a good idea. How does one address the array of
> four bytes at RAM location 4?

I'm pretty sure the only way to obtain such an array would be to have already invoked Undefined Behavior (assuming 4 is an invalid memory location on the platform the program's running on) and as such it doesn't really matter whether or not two array references to it compare equal or not...
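
For reference, the short-circuit under discussion could look roughly like this. It is only a sketch, assuming a current D2 compiler with Phobos; `fastEquals` is a made-up name and not what the runtime actually does.

```d
import std.stdio;

// Hypothetical comparison with an identity short-circuit.
bool fastEquals(T)(T[] u, T[] v)
{
    if (u.length != v.length) return false;
    if (u.ptr is v.ptr) return true; // same location and length
    return u == v;                   // otherwise compare the contents
}

void main()
{
    int[] a = [1, 2, 3];
    writeln(fastEquals(a, a));         // true, without reading the elements
    writeln(fastEquals(a, a.dup));     // true, by content
    writeln(fastEquals(a, [1, 2, 4])); // false
}
```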
July 26, 2007
On Thu, 26 Jul 2007 07:47:03 +0200, Frits van Bommel wrote:

> Derek Parnell wrote:
>> On Wed, 25 Jul 2007 19:01:57 +0200, Frits van Bommel wrote:
>> 
>>> Since null arrays have length 0, they *are* empty arrays :P.
>> 
>> Not in my world. I see that null arrays have no length. That is to say, they do not have any length, which is different from saying they have a length and that length is zero.
> 
> But the fact of the matter is, 'T[] x = null;' reserves space for the .length and sets it to 0. If you have a suggestion for a different value to put there, by all means make it.

I'm trying not to set in concrete the ABI of variable-length arrays. So even though the current D definition is that a VL array consists of a two-element struct and zero or one block of RAM, conceptually a null array doesn't point to anything and does not have a length. So to me it doesn't matter that D allocates space for the .length and .ptr portions of the null VL array, because it still should not use the .length value. But, because theoretically every possible RAM address could be stored in the .ptr portion, including zero, I concede that in D the .ptr and .length both being zero is needed to indicate a null array, even though this disallows the conceptual empty array beginning at address zero.

> Or would you prefer a segfault or diagnostic when accessing (cast(T[])null).length? That'd introduce overhead on every .length access (unless the compiler can statically determine whether an array reference is null).

Yes I would. However, too many people are relying on this inconsistency so I'll live with that wart in the language.

>>>> All that I would like changed is for the compare, in the case of length == 0, to check the data pointers, eg.
>>>>
>>>>  > int opEquals(T)(T[] u, T[] v) {
>>>>  >     if (u.length != v.length) return false;
>>>>       if (u.length == 0) return (u.ptr == v.ptr);
>>>>  >     for (size_t i = 0; i < u.length; i++) {
>>>>  >         if (u[i] != v[i]) return false;
>>>>  >     }
>>>>  >     return true;
>>>>  > }
>>>>
>>>> This should mean "" == "" but not "" == null, likewise null == null but not null == "".
>>> Let's look at this code:
>>> ---
>>> import std.stdio;
>>>
>>> void main()
>>> {
>>>      char[][] strings = ["hello world!", "", null];
>>>
>>>      foreach (str; strings) {
>>>          auto str2 = str.dup;
>>>          if (str == str2)
>>>              writefln(`"%s" == "%s" (%s, %s)`, str, str2, str.ptr, str2.ptr);
>>>          else
>>>              writefln(`"%s" != "%s" (%s, %s)`, str, str2, str.ptr, str2.ptr);
>>>      }
>>> }
>>> ---
>>> The output is currently (on my machine):
>>> =====
>>> "hello world!" == "hello world!" (805BE60, F7CFBFE0)
>>> "" == "" (805BE78, 0000)
>>> "" == "" (0000, 0000)
>>> =====
>>> Your change would change the second line (even if it actually allocated
>>> a new empty string like you probably want instead of returning null).
>>> How would that be consistent in any way?
>> 
>> Your example is misleading for at least two reasons:
>> ** The '==' operator compares the contents of the strings. A null string
>> has no content so there is nothing to compare. This should fail but it
>> doesn't in the current D. It should fail in the same manner that a null
>> object reference fails the '==' operator.
> 
> This wasn't the point of the example.

Sorry for misunderstanding.

> I could have left out the third element and changed the .dup in the second line to a different empty string (e.g. a 0-length slice of the first one) and the point would remain the same: the proposed change would break comparison by '==' for empty non-null strings.

I agree with you. Two empty non-null strings should compare as equal because the equality test is against the contents of the array and not the addresses of the array. A null array has no content so one has nothing to compare it with; this is why I think that it is an illegal/meaningless operation.
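
Purely as a sketch of that idea, and not how D behaves: a user-level comparison that treats a null operand as an error while still comparing empty non-null strings by content. It assumes a current D2 compiler with Phobos; `strictEquals` is a hypothetical name.

```d
import std.exception : collectException, enforce;
import std.stdio;

// Hypothetical comparison: a null operand is rejected outright;
// everything else is compared by content.
bool strictEquals(T)(T[] u, T[] v)
{
    enforce(u !is null && v !is null, "cannot compare a null array");
    return u == v;
}

void main()
{
    string s = "hello world!";
    string n = null;

    // Two empty, non-null strings at different addresses are still equal.
    writeln(strictEquals(s[0 .. 0], s[5 .. 5])); // true

    // Comparing against a null array throws instead of answering.
    auto e = collectException(strictEquals(n, s[0 .. 0]));
    writeln(e !is null); // true
}
```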

-- 
Derek Parnell
Melbourne, Australia
"Down with mediocrity!"