June 30, 2004
Andy Friesen <andy@ikagames.com> wrote in news:cbt41t$i1n$1@digitaldaemon.com:

> Farmer wrote:
> 
>> Andy Friesen <andy@ikagames.com> wrote in news:cbpsi6$1u7d$1@digitaldaemon.com:
>> 
>> [snip]
>> 
>>>C++ containers cannot represent null either.  D will (and does) get along just fine if its array type works the same way.
>> 
>> [snip]
>> 
>> And probably that is one reason why programmers don't use std::vector.
> 
> They don't?  Do you have a source to back that up?  As far as I've ever noticed, bigwig C++ people have always made it clear that std::vector is preferable over an array and that std::string is preferable to a char*.

Sorry, my statement was badly expressed. I meant it more like "And probably that is another reason why programmers often refrain from using std::vector."

Of course, programmers use std::vector, otherwise I'd have to say that I am not a programmer ;-)


> 
> The concern for distinguishing empty vs null has quite honestly never even occurred to me until it was mentioned here.  Think about expressing the distinction a different way and move on.

I expect that this concern will rarely come up, and that's exactly why I
brought it up.
I would move on, but I see no compelling reason to express it in a different
way.


> 
> I do apologize if I sound naive, (I'll assume that comment was directed at me :) ) but I honestly can't comprehend a situation in which the distinction is going to have any measurable cost on clarity, let alone performance.
> 
>   -- andy

I was naive in believing that it was obvious which posts I was referring to.
I was thinking e.g. of post
    	http://www.digitalmars.com/drn-bin/wwwnews/23126
Btw, the author of this post happens to use the term naive, so he shouldn't
take offense.

But in fact, this post doesn't really advocate 'NaN' for ints, rather
    	http://www.digitalmars.com/drn-bin/wwwnews/23100
does so.

Sorry, andy and sorry Regan. You didn't suggest 'NaN' for ints. So no f(l)ame(s) for you...


Farmer.






June 30, 2004
I hope you're not referring to the quick hack I posted. It was meant to express the *conceptual* problem of returning a null value for a value type -- *not* a practical one. It was mentioned in the context of the ML option type.

ps. Both links are broken.


June 30, 2004
On Wed, 30 Jun 2004 22:57:02 +0000 (UTC), Farmer <itsFarmer.@freenet.de> wrote:
> Sean Kelly <sean@f4.ca> wrote in news:cbsqnf$547$1@digitaldaemon.com:
>
>> In article <Xns9517F3F654C29itsFarmer@63.105.9.61>, Farmer says...
>>>
>>> The .length parameter would still work with null-arrays (as they
>>> currently do).
>>> But why would you want to initialize an array to null/empty and then
>>> resize it, instead of 'newing' it with the correct size in first place?
>>
>> Consider the following:
>>
>> char[] str = new char[100];
>> str.length = 0; // A
>> str.length = 5; // B
>> str = new char[10]; // C
>>
>> In A, AFAIK it's legal for the compiler to retain the memory and merely
>> change the length parameter for the string.  B then just changes the
>> length parameter again, and no reallocation is performed.  C forces a
>> reallocation even if the array already has the (hidden) capacity in
>> place.  Lacking allocators, this is a feature I consider rather nice in
>> D.
> I agree with you that this feature is quite useful.
> The problem with (A) is that DMD doesn't do that; the function
> 'arraysetlength' explicitly checks whether the new length is zero, and if so
> destroys the data pointer.

Provably correct. :)

--[test.d]--
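// Mirrors the in-memory layout of a dynamic array: length, then data pointer.
struct array { int length; void *data; }
void main() {
	char[] p = new char[100];
	array *s = cast(array *)&p;	// view p's header through the struct

	// before: length and data pointer of the freshly allocated array
	printf("%d\n",s.length);
	printf("%08x\n",s.data);
	p.length = 0;			// shrink to zero length
	// after: print the header again to see what happened to the pointer
	printf("%d\n",s.length);
	printf("%08x\n",s.data);
}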

prints

100
007d2f80
0
00000000

> Furthermore it seems that it is not allowed to
> call the .length property for null-arrays.

I can go:

p.length = 0;
p.length = 0;
p.length = 0;
p.length = 0;

with no problem. Is that what you meant?

> How do I know? Well the function in the phobos file internal\gc.d
>     	byte[] _d_arraysetlength(uint newlength, uint sizeelem, Array *p)
> contains this assertion
>     	assert(!p.length || p.data);

Perhaps this function is not called when (p.length == 0 && newlength == 0), because that case is handled one level higher?

> Ironically, this assertion permits, that the data pointer is null, but the
> length is greater than 0.

which is technically impossible.

Regan

>>> Extra coding is not required if you don't need null-arrays: if some user
>>> passes a null-array, the user gets a nice access violation/array bounds
>>> exception and will quickly learn to not pass null-arrays to such
>>> functions. A quick check in the DbC section of your function would do
>>> the job, too. (But I suppose, the user might not adapt that fast that
>>> way :-)
>>
>> I originally thought D worked the way you describe and added DBC clauses
>> to all my functions to check for null array parameters.  After some
>> testing I realized I'd been mistaken and happily removed most of these
>> clauses.  The result IMO was tighter, cleaner code that was easier to
>> understand.  I suppose it's really a matter of opinion.  I like that
>> arrays work the same as the other primitive types.
>
> I always love it when this happens. Code that isn't written, is bug-free,
> maintainable, and super-fast ;-)
>
>
>
>>
>>> If your function should deal with both null-arrays and empty-arrays, no
>>> extra code is required, since the .length property can be accessed for
>>> both null-arrays and empty-arrays.
>>
>> Could it?  I suppose so, but the concept seems a tad odd.  I kind of
>> expect none of the parameters (besides sizeof, perhaps) to work for
>> dynamic types that have not been initialized.  Though perhaps that's the
>> C way of thinking.
>
> Yes, I think it is a bit odd, too. For reading the length property it makes
> sense, but for resizing it is more questionable. But I am definitely thinking
> the C way here.
>
>
> Farmer.



-- 
Using M2, Opera's revolutionary e-mail client: http://www.opera.com/m2/
July 01, 2004
On Wed, 30 Jun 2004 22:57:04 +0000 (UTC), Farmer <itsFarmer.@freenet.de> wrote:
> Andy Friesen <andy@ikagames.com> wrote in
> news:cbt41t$i1n$1@digitaldaemon.com:
>
>> Farmer wrote:
>>
>>> Andy Friesen <andy@ikagames.com> wrote in
>>> news:cbpsi6$1u7d$1@digitaldaemon.com:
>>>
>>> [snip]
>>>
>>>> C++ containers cannot represent null either.  D will (and does) get
>>>> along just fine if its array type works the same way.
>>>
>>> [snip]
>>>
>>> And probably that is one reason why programmers don't use std::vector.
>>
>> They don't?  Do you have a source to back that up?  As far as I've ever
>> noticed, bigwig C++ people have always made it clear that std::vector is
>> preferable over an array and that std::string is preferable to a char*.
>
> Sorry, my statement was badly expressed. I meant it more like "And probably
> that is another reason why programmers often refrain from using std::vector."
>
> Of course, programmers use std::vector, otherwise I'd have to say that I am not a
> programmer ;-)
>
>
>>
>> The concern for distinguishing empty vs null has quite honestly never
>> even occurred to me until it was mentioned here.  Think about expressing
>> the distinction a different way and move on.
>
> I expect that this concern will rarely come up, and that's exactly why I
> brought it up.
> I would move on, but I see no compelling reason to express it in a different
> way.
>
>
>>
>> I do apologize if I sound naive, (I'll assume that comment was directed
>> at me :) ) but I honestly can't comprehend a situation in which the
>> distinction is going to have any measurable cost on clarity, let alone
>> performance.
>>
>>   -- andy
>
> I was naive in believing that it is obvious what posts I referred to.
> I was thinking e.g. of post
>     	http://www.digitalmars.com/drn-bin/wwwnews/23126
> Btw, the author of this post, happens to use the term naive, so he shouldn't
> take offense.

Is it just me? These links don't work for me :(

> But in fact, this post doesn't really advocate 'NaN' for ints, rather
>     	http://www.digitalmars.com/drn-bin/wwwnews/23100
> does so.

linky no worky :(

> Sorry, andy and sorry Regan.
> You didn't suggest 'NaN' for ints. So no f(l)ame(s) for you...

Aww.. AFAIKS we either need a NaN value for all value types, OR, we use reference types instead.

Arrays in D act just like reference types (except for the inconsistencies you have shown), even though technically they aren't. What I want to know is: what effect would changing those inconsistencies actually have on people who do not need to tell a null array from an empty one?

Regan.

July 01, 2004
Regan Heath wrote:

> if ("a" in attribs) { ... }
> ...
> 
> you seem to have completely ignored the fact that, *if* we remove the ability to return null when an array type is expected (you suggested removing the ability to assign null to an array; it's the same thing), the above will cease to work altogether, as I imagine the above is simply doing
> 
> if (attribs["a"] != null)

I very much doubt this.  Associative arrays maintain an internal list of keys and values.  In all likelihood, the 'in' operator hashes the key ("a" in this case) and searches through the associative array's internal hash table for one that matches.
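
For readers following along, here is a minimal sketch of the lookup being described. It assumes a current D2 compiler rather than the 2004 DMD under discussion, and the `lookup` helper and the key names are hypothetical. In D2 the `in` operator yields a pointer to the stored value (null when the key is absent), so "key is missing" and "key maps to an empty string" stay distinct without comparing the value against null. The built-in `get` plays the role of the Python method mentioned further down.

```d
import std.stdio;

// Hypothetical helper: reports whether 'key' is present in 'attribs'.
string lookup(string[string] attribs, string key)
{
    // 'in' returns a pointer to the value, or null if the key is absent.
    if (auto p = key in attribs)
        return "present: \"" ~ *p ~ "\"";
    return "absent";
}

void main()
{
    string[string] attribs;
    attribs["a"] = "";   // the key exists; its value is an empty string

    writeln(lookup(attribs, "a")); // present: ""
    writeln(lookup(attribs, "b")); // absent

    // get() returns the supplied default when the key is missing.
    writeln(attribs.get("b", "fallback")); // fallback
}
```

The existence test never looks at the value itself, which is why an empty or null value can still be stored under a key.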

>> If nonexistence is an alias for some default, fill the array before parsing the file.  Attributes that are present will override those which are not.
>>
>> Python offers a get() method which takes two arguments: a key, and a default value which is returned should the key not exist.  I use this a lot.
> 
> but if there is no default, you're left doing the nadda thing below which is simply an ugly hack (explanation below)

Right.  I am an idiot. (below)

>> That's why you use 'is' and not ==.  'is' performs a pointer comparison.    The array has to point into that exact string literal for the comparison to be true.  The only catch is string pooling.  It'd be okay as long as the string literal "nadda" isn't declared anywhere in the source code.
> 
> ahh, gotcha, so basically you're creating null with another name. Why not just have null. :)

I was thinking about this, and the conclusion that I came to is that I am a complete idiot for not noticing what looked to be a completely arbitrary distinction with respect to comparing against null and comparing against any other pointer.

After a tiny bit of testing, I came to the conclusion that I am an even bigger idiot than I could have possibly imagined.  D already gets things pretty much bang on:

    T[] a, b;
    a = b;     // 'a == b' and 'a is b' will both be true. (even if b is
               // null)
    a = b.dup; // 'a == b' will be true.  'a is b' will be true iff b is
               // null. (null.dup is null, evidently.  funny that)

With respect to 'a == null', my mind is quite blown.  Farmer's tests reliably produce situations where zero-length strings compare false against null.  My own tests show that empty arrays are equivalent to null but do not share identity.  Don't test x==null, I guess. :)

Explicitly testing for an empty, non-null array requires that you write 'if (x !== null && x.length == 0)', which is probably okay: I can envision hordes of new programmers going postal because of 'name != ""' and 'name.length == 0' somehow both evaluating to true at the same time.
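
As a rough illustration of the identity versus equality distinction above, the following sketch assumes a current D2 compiler, where `!==` is spelled `!is`. The `isEmptyNonNull` helper is a hypothetical name, and the results shown are those of present-day D, not necessarily of the 2004 DMD tested in this thread.

```d
import std.stdio;

// Hypothetical helper: true only for an array that is empty but not null.
bool isEmptyNonNull(const(int)[] x)
{
    return x !is null && x.length == 0;
}

void main()
{
    int[] n;                          // default-initialized: null, length 0
    int[] e = (new int[3])[0 .. 0];   // zero-length slice of allocated memory

    writeln(n is null);         // true
    writeln(e is null);         // false: e still points at the allocation
    writeln(n == e);            // true: same length, no elements to differ
    writeln(n is e);            // false: the pointers differ
    writeln(isEmptyNonNull(e)); // true
    writeln(isEmptyNonNull(n)); // false
}
```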

 -- andy
July 01, 2004
On Wed, 30 Jun 2004 19:02:22 -0700, Andy Friesen <andy@ikagames.com> wrote:

> Regan Heath wrote:
>
>> if ("a" in attribs) { ... }
>> ...
>>
>> you seem to have completely ignored the fact that, *if* we remove the ability to return null when an array type is expected (you suggested removing the ability to assign null to an array; it's the same thing), the above will cease to work altogether, as I imagine the above is simply doing
>>
>> if (attribs["a"] != null)
>
> I very much doubt this.  Associative arrays maintain an internal list of keys and values.  In all likelihood, the 'in' operator hashes the key ("a" in this case) and searches through the associative array's internal hash table for one that matches.

I agree totally. I am not disputing how an associative array works. What I am saying is that, without the ability to compare an array to null, you cannot express 'does not exist' in terms of an associative array.

What does:
  if ("a" in attribs)

actually evaluate to, if not:
  if (attribs["a"] != null)

?

>>> If nonexistence is an alias for some default, fill the array before parsing the file.  Attributes that are present will override those which are not.
>>>
>>> Python offers a get() method which takes two arguments: a key, and a default value which is returned should the key not exist.  I use this a lot.
>>
>> but if there is no default, you're left doing the nadda thing below which is simply an ugly hack (explanation below)
>
> Right.  I am an idiot. (below)
>
>>> That's why you use 'is' and not ==.  'is' performs a pointer comparison.    The array has to point into that exact string literal for the comparison to be true.  The only catch is string pooling.  It'd be okay as long as the string literal "nadda" isn't declared anywhere in the source code.
>>
>> ahh, gotcha, so basically you're creating null with another name. Why not just have null. :)
>
> I was thinking about this, and the conclusion that I came to is that I am a complete idiot for not noticing what looked to be a completely arbitrary distinction with respect to comparing against null and comparing against any other pointer.
>
> After a tiny bit of testing, I came to the conclusion that I am an even bigger idiot than I could have possibly imagined.  D already gets things pretty much bang on:
>
>      T[] a, b;
>      a = b;     // 'a == b' and 'a is b' will both be true. (even if b is
>                 // null)
>      a = b.dup; // 'a == b' will be true.  'a is b' will be true iff b is
>                 // null. (null.dup is null, evidently.  funny that)
>
> With respect to 'a == null', my mind is quite blown.  Farmer's tests reliably produce situations where zero-length strings compare false against null. My own tests show that empty arrays are equivalent to null but do not share identity.  Don't test x==null, I guess. :)
>
> Explicitly testing for an empty, non-null array requires that you write 'if (x !== null && x.length == 0)', which is probably okay:

My tests, given:

char[] e = "";
char[] n;

output:

e is ""    (f)
n is ""    (f)
e is null  (f)
n is null  (t)
e is n     (f)

e == ""    (t)
n == ""    (t) incorrect?
e == null  (f)
n == null  (t)
e == n     (t) incorrect?

e === ""   (f)
n === ""   (f)
e === null (f)
n === null (t)
e === n    (f)

The != and !== tests were all the opposite of the above, so I have not included them.

== calls opEquals; perhaps it has a shortcut in it which says "if the lengths are both 0, return true"? That would explain the two cases above that I have marked "incorrect?". I think these two cases are inconsistent.

To reliably test for nullness I can use '===' or '!==' or 'is'.

> I can envision hordes of new programmers going postal because of 'name != ""' and 'name.length == 0' somehow both evaluating to true at the same time.

Yeah. To stop that, name.length would have to have a NaN (null) value, which neither 'int' nor 'uint' has.

Regan.

July 01, 2004
In article <Xns95199C928F73itsFarmer@63.105.9.61>, Farmer says...
>
>How do I know? Well the function in the phobos file internal\gc.d
>    	byte[] _d_arraysetlength(uint newlength, uint sizeelem, Array *p)
>contains this assertion
>    	assert(!p.length || p.data);
>
>Ironically, this assertion permits, that the data pointer is null, but the length is greater than 0.

I read it that the assertion requires either the length to be zero or the length to be nonzero and the data to be non-null.  This seems to correspond to my assumption that D allows for zero length arrays to retain allocated memory.
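
To make that reading concrete, here is a small sketch that evaluates the quoted condition for the four possible combinations. It assumes a current D2 compiler. The `Array` struct is a hypothetical stand-in for the runtime's internal header, and `invariantHolds` is an illustrative name, not something taken from Phobos.

```d
import std.stdio;

// Hypothetical stand-in for the runtime's internal array header.
struct Array
{
    size_t length;
    void*  data;
}

// Same condition as the quoted assert(!p.length || p.data),
// with the pointer test written out explicitly.
bool invariantHolds(Array a)
{
    return !a.length || a.data !is null;
}

void main()
{
    int dummy;
    void* p = &dummy;

    writeln(invariantHolds(Array(0, null))); // true:  zero length, null data
    writeln(invariantHolds(Array(0, p)));    // true:  zero length, data retained
    writeln(invariantHolds(Array(5, p)));    // true:  nonzero length, valid data
    writeln(invariantHolds(Array(5, null))); // false: nonzero length, null data
}
```

The only rejected combination is a nonzero length with a null data pointer. A zero length with a non-null pointer passes.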

Sean


July 01, 2004
On Thu, 1 Jul 2004 04:37:37 +0000 (UTC), Sean Kelly <sean@f4.ca> wrote:

> In article <Xns95199C928F73itsFarmer@63.105.9.61>, Farmer says...
>>
>> How do I know? Well the function in the phobos file internal\gc.d
>>    	byte[] _d_arraysetlength(uint newlength, uint sizeelem, Array *p)
>> contains this assertion
>>    	assert(!p.length || p.data);
>>
>> Ironically, this assertion permits, that the data pointer is null, but the
>> length is greater than 0.
>
> I read it that the assertion requires either the length to be zero or the length
> to be nonzero and the data to be non-null.  This seems to correspond to my
> assumption that D allows for zero length arrays to retain allocated memory.

It may very well allow it (in this code, at this level), but how do you do it?

Regan

July 01, 2004
Regan Heath wrote:
> I am not disputing how an associative array works, what I am saying is, without the ability to compare an array to null, you cannot express 'does not exist' in terms of an associative array.
> 
> What does:
>   if ("a" in attribs)
> 
> actually evaluate to, if not:
>   if (attribs["a"] != null)

This could never work anyway.  Types for which null does not make sense obviously can't use null to indicate nonexistence.  Types for which null does make sense can't do this either, as it makes perfect sense to store a null reference.

The fundamental idea is that you're trying to represent a "nonvalue", which is storable in the result variable, but not part of the variable's range.  This obviously won't work, as it requires two contradictory ideas to be simultaneously true.  Adding a 'special' value like null is sometimes close enough for specific application domains, but, in the end, all you're doing is making the range of allowable values bigger.

> == calls opEquals, perhaps it has a shortcut in it which says if the lengths are both 0 return true? this would explain the two cases above I have marked "incorrect?". I think these two cases are inconsistent.

Looking at internal/adi.d, it looks like it compares the lengths, then compares each element in succession:

    extern (C) int _adEq(Array a1, Array a2, TypeInfo ti)
    {
        if (a1.length != a2.length)
            return 0;		// not equal
        int sz = ti.tsize();
        //printf("sz = %d\n", sz);
        void *p1 = a1.ptr;
        void *p2 = a2.ptr;
        for (int i = 0; i < a1.length; i++)
        {
            if (!ti.equals(p1 + i * sz, p2 + i * sz))
                return 0;		// not equal
        }
        return 1;			// equal
    }

How on Earth ""!=null ever comes about is beyond me.
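
A simplified restatement of that logic, assuming a current D2 compiler and a fixed `int` element type, shows why a null array and an empty one should come out equal under it. The name `sameContents` is hypothetical, and this is a sketch of the quoted algorithm, not the runtime's actual code path for `==`.

```d
import std.stdio;

// Length check first, then element-by-element comparison,
// following the shape of the quoted _adEq.
bool sameContents(const(int)[] a1, const(int)[] a2)
{
    if (a1.length != a2.length)
        return false;
    foreach (i; 0 .. a1.length)
        if (a1[i] != a2[i])
            return false;
    // Also reached when both lengths are 0: the loop body never runs,
    // so the data pointers are never examined.
    return true;
}

void main()
{
    int[] n;                          // null
    int[] e = (new int[4])[0 .. 0];   // empty, non-null

    writeln(sameContents(n, e));           // true
    writeln(sameContents([1, 2], [1, 3])); // false
}
```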

 -- andy
July 01, 2004
On Wed, 30 Jun 2004 22:40:28 -0700, Andy Friesen <andy@ikagames.com> wrote:

> Regan Heath wrote:
>> I am not disputing how an associative array works, what I am saying is, without the ability to compare an array to null, you cannot express 'does not exist' in terms of an associative array.
>>
>> What does:
>>   if ("a" in attribs)
>>
>> actually evaluate to, if not:
>>   if (attribs["a"] != null)
>
> This could never work anyway.  Types for which null does not make sense obviously can't use null to indicate nonexistence.  Types for which null does make sense can't do this either, as it makes perfect sense to store a null reference.

Yeah... you're right.

> The fundamental idea is that you're trying to represent a "nonvalue", which is storable in the result variable, but not part of the variable's range.  This obviously won't work, as it requires two contradictory ideas to be simultaneously true.  Adding a 'special' value like null is sometimes close enough for specific application domains, but, in the end, all you're doing is making the range of allowable values bigger.

I think.. I agree. :)

>> == calls opEquals, perhaps it has a shortcut in it which says if the lengths are both 0 return true? this would explain the two cases above I have marked "incorrect?". I think these two cases are inconsistent.
>
> Looking at internal/adi.d, it looks like it compares the lengths, then compares each element in succession:

I went looking for that (not hard enough obviously)..

>      extern (C) int _adEq(Array a1, Array a2, TypeInfo ti)
>      {
>          if (a1.length != a2.length)
>              return 0;		// not equal
>          int sz = ti.tsize();
>          //printf("sz = %d\n", sz);
>          void *p1 = a1.ptr;
>          void *p2 = a2.ptr;
>          for (int i = 0; i < a1.length; i++)
>          {
>              if (!ti.equals(p1 + i * sz, p2 + i * sz))
>                  return 0;		// not equal
>          }
>          return 1;			// equal
>      }

> How on Earth ""!=null ever comes about is beyond me.

Below _adEq is:

extern (C) int _adCmp(Array a1, Array a2, TypeInfo ti)
{
    int len;

    //printf("adCmp()\n");
    len = a1.length;
    if (a2.length < len)
        len = a2.length;
    int sz = ti.tsize();
    void *p1 = a1.ptr;
    void *p2 = a2.ptr;
    for (int i = 0; i < len; i++)
    {
        int c;

        c = ti.compare(p1 + i * sz, p2 + i * sz);
        if (c)
            return c;
    }
    return cast(int)a1.length - cast(int)a2.length;
}

which would return 0 if both lengths were 0. "" and null both have a length of 0.
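
The same point as a runnable sketch, assuming a current D2 compiler and `int` elements. `compareContents` is a hypothetical name that mirrors the shape of the quoted _adCmp: compare the common prefix, then fall back to the difference in lengths.

```d
import std.stdio;

// Three-way comparison: negative, zero, or positive.
int compareContents(const(int)[] a1, const(int)[] a2)
{
    // Only the common prefix can be compared element by element.
    size_t len = a1.length < a2.length ? a1.length : a2.length;
    foreach (i; 0 .. len)
    {
        if (a1[i] != a2[i])
            return a1[i] < a2[i] ? -1 : 1;
    }
    // Prefixes match, so the lengths decide. Two zero-length arrays
    // give 0 whether or not either one is null.
    return cast(int) a1.length - cast(int) a2.length;
}

void main()
{
    int[] n;                          // null
    int[] e = (new int[4])[0 .. 0];   // empty, non-null

    writeln(compareContents(n, e));              // 0
    writeln(compareContents([1, 2], [1, 2, 3])); // -1
    writeln(compareContents([1, 5], [1, 2, 3])); // 1
}
```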

Regan.
