June 15, 2005
On Wed, 15 Jun 2005 01:59:07 +0200, Tom S <h3r3tic@remove.mat.uni.torun.pl> wrote:
> Regan Heath wrote:
>> On Tue, 14 Jun 2005 10:31:39 -0400, Ben Hinkle <bhinkle@mathworks.com>  wrote:
>>> In fact one could argue that Walter should change the
>>> implicit conversion of an array from returning the ptr to returning the
>>> length so that the test
>>>  if (str) {...}
>>> is equivalent to
>>>  if (str.length != 0) {...}
>>> instead of being equivalent to
>>>  if (str.ptr != null) {...}
>>> as happens today.
>> We could, but in my mind that would be counterintuitive. "if (a)" tests class references, pointers and intrinsics against null or 0; if we change arrays, we would have different behaviour for them compared to everything else.
>
> Yeah, but we already have this kind of 'inconsistency':
>
> # struct Foo
> # {
> # 	bool a = true;
> # }
> #
> # void main()
> # {
> # 	Foo f;
> # 	if (f) {}
> # }

A struct/union is the *only* type that this doesn't work for, because it is a value type, which cannot be implicitly compared to 0, 0.0, or null.

> Yields: 'expression f of type Foo does not have a boolean value'. I prefer to perceive D's arrays as structs

Regardless, arrays are references, not structs.

char[] a;

Declares a reference to a char array.

> , so having to check their members: 'ptr' and 'length' would be pretty logical, consistent and less error prone than the current approach IMHO.

An array is no different to a class with length and ptr properties. As with the class, you simply have to remember that fact. That said, the array opCmp operator accepts a null rhs and treats it the same as a "" rhs; this is also part of the problem, IMO.

Regan
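
A minimal sketch of the point under discussion, assuming a current D2 compiler (the 2005 compiler used in this thread may behave differently). The helper name asCondition is hypothetical and exists only for illustration. It shows that a bare "if (s)" on an array looks at the pointer rather than the length, so an empty string literal counts as true:

```d
// Hypothetical helper: reports what a bare `if (s)` sees for an array.
bool asCondition(const(char)[] s)
{
    if (s)          // for arrays this tests s.ptr, not s.length
        return true;
    return false;
}

unittest
{
    string unset;        // default-initialized: ptr is null, length is 0
    string empty = "";   // string literal: length is 0, ptr is non-null

    assert(!asCondition(unset));
    assert(asCondition(empty));   // true although it holds no characters
    assert(unset.length == 0 && empty.length == 0);
    assert(unset is null);
    assert(empty !is null);
}
```

Testing .length directly avoids the ambiguity when only emptiness matters.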
June 15, 2005
"Derek Parnell" <derek@psych.ward> wrote in message news:1lruav5hsvxn6$.uj97bpn5gk38$.dlg@40tude.net...
> Hmmm... then the code below produces unexpected results ...

Oh darn, looks like I've got work to do...

> Walter, these are direct questions: Do you believe there is a distinction between an empty array and a null array?

I've tried to eliminate the distinction.

> And if so, will D support the two
> concepts in a consistent manner?


June 15, 2005
On Tue, 14 Jun 2005 19:50:01 -0700, Walter wrote:

> "Derek Parnell" <derek@psych.ward> wrote in message news:1lruav5hsvxn6$.uj97bpn5gk38$.dlg@40tude.net...
>> Hmmm... then the code below produces unexpected results ...
> 
> Oh darn, looks like I've got work to do...

Fair enough.

>> Walter, these are direct questions: Do you believe there is a distinction between an empty array and a null array?
> 
> I've tried to eliminate the distinction.

Right, so there is a distinction but you are trying to remove it.

*DON'T* - please. It is a useful distinction and really ought not to be artificially removed.

-- 
Derek
Melbourne, Australia
15/06/2005 12:59:35 PM
June 15, 2005
Ben Hinkle wrote:
<snip>
> Agreed - the doc should be changed to use str.length == 0 to mean "empty" and/or that particular section in the cppstring.html should be removed. Testing the ptr isn't the same as testing for 0 length.
> OTOH since other containers will likely have an "empty" or "isEmpty" member maybe one should be added to the builtin arrays. 

Either way, we'll still need somewhere in the spec an indication of whether using an array reference as a boolean tests for non-null or non-empty, or if it's going to become illegal.

Stewart.

-- 
My e-mail is valid but not my primary mailbox.  Please keep replies on the 'group where everyone may benefit.
June 18, 2005
"Derek Parnell" <derek@psych.ward> wrote in message news:zbvlue18tuv7.h1nffudg5kvn$.dlg@40tude.net...
> Right, so there is a distinction but you are trying to remove it.
>
> *DON'T* - please. It is a useful distinction and really ought not to be artificially removed.

What exactly is the advantage to the distinction?


June 18, 2005
On Fri, 17 Jun 2005 18:31:33 -0700, Walter wrote:

> "Derek Parnell" <derek@psych.ward> wrote in message news:zbvlue18tuv7.h1nffudg5kvn$.dlg@40tude.net...
>> Right, so there is a distinction but you are trying to remove it.
>>
>> *DON'T* - please. It is a useful distinction and really ought not to be artificially removed.
> 
> What exactly is the advantage to the distinction?

For example, to be able to tell if something has been set to empty or has never been set at all.

To distinguish between the presence of emptiness and the absence of anything.


-- 
Derek Parnell
Melbourne, Australia
18/06/2005 12:21:21 PM
June 18, 2005
On Sat, 18 Jun 2005 12:23:09 +1000, Derek Parnell <derek@psych.ward> wrote:
> On Fri, 17 Jun 2005 18:31:33 -0700, Walter wrote:
>
>> "Derek Parnell" <derek@psych.ward> wrote in message
>> news:zbvlue18tuv7.h1nffudg5kvn$.dlg@40tude.net...
>>> Right, so there is a distinction but you are trying to remove it.
>>>
>>> *DON'T* - please. It is a useful distinction and really ought not to be
>>> artificially removed.
>>
>> What exactly is the advantage to the distinction?
>
> For example, to be able to tell if something has been set to empty or has
> never been set at all.
>
> To distinguish between the presence of emptiness and the absence of
> anything.

Exactly. For example, you have a web page which *may* contain settings A, B, C, D, ... When the page is submitted you want to know whether each setting was present and, if so, what it was set to. So there are 3 possible states for each setting:

1 - not present
2 - present, set to ""
3 - present, set to <anything>

Such a post might look like:

A=text&B=&C=other+text&&

Where:

A is case #3
B is case #2
C is case #3
D is case #1

If you remove the distinction between cases #1 and #2 then you cannot, for example, do the following to the previously stored setting values:

overwrite A with "text"
overwrite B with ""
overwrite C with "other text"
leave D as is

because, without the distinction between #1 and #2 you cannot tell the difference between B & D. (no pun intended, well maybe a little)

Compare D's arrays with C's pointers: a pointer can represent the 3 states:

char *p;
//p is set here
if (p == NULL) //case #1
else if (*p == '\0') //case #2
else //case #3

I think we want/need for:
 - arrays in a certain state to stay in that state until intentionally changed (i.e. not merely because length is set to 0)
 - a recommended/standard way of identifying the states (I think we have this), e.g.

char[] p = null;
//p is set here
if (p is null) //case #1
else if (p.length == 0) //case #2 (testing p[0] == '\0' is not an alternative: indexing a zero-length array is out of bounds)
else //case #3

I can't see a downside to ensuring the distinction remains; after all, you don't get a seg-fault calling:

char[] p = null;
if (p.length == 0)

already, and that doesn't need to change.

Regan

Note: There are workaround solutions (like using an AA and the 'in' operator) but these are seldom as intuitive (for me at least, and perhaps other C/pointer-style programmers) or as direct a solution to the problem as simply supporting the 3 states.
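
The three states described above can be sketched as follows, assuming a current D2 compiler. The names lookup, State and classify are hypothetical and not part of Phobos. A zero-length slice taken from the query string keeps a non-null pointer, so "present but empty" stays distinguishable from "not present". URL decoding (e.g. '+' to space) is deliberately left out:

```d
import std.array : split;
import std.string : indexOf;

enum State { absent, empty, set }

// Hypothetical helper: returns the value of `name`, or null if it is absent.
string lookup(string query, string name)
{
    foreach (pair; query.split("&"))
    {
        auto eq = pair.indexOf('=');
        if (eq < 0)
            continue;                  // skip empty or malformed segments
        if (pair[0 .. eq] == name)
            return pair[eq + 1 .. $];  // may be zero-length, but not null
    }
    return null;
}

// Hypothetical helper: maps a value onto cases #1, #2 and #3.
State classify(const(char)[] p)
{
    if (p is null)
        return State.absent;           // case #1
    if (p.length == 0)
        return State.empty;            // case #2
    return State.set;                  // case #3
}

unittest
{
    enum post = "A=text&B=&C=other+text&&";
    assert(classify(lookup(post, "A")) == State.set);
    assert(classify(lookup(post, "B")) == State.empty);
    assert(classify(lookup(post, "D")) == State.absent);
}
```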
June 18, 2005
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Derek Parnell schrieb am Wed, 15 Jun 2005 09:06:41 +1000:
> On Tue, 14 Jun 2005 15:14:08 -0700, Walter wrote:
>
>> "Derek Parnell" <derek@psych.ward> wrote in message news:2o7hnscxa0hj$.s1owhetd92i0$.dlg@40tude.net...
>>> This problem exists mainly because D sets .ptr to zero whenever the .len is set to zero,
>> 
>> No, it doesn't. The reason it doesn't is to support the 'reserve some space in advance' idiom:
>> 
>>     array.length = 100;
>>     array.length = 0;
>> 
>> Now, array.ptr points to the start of the buffer that is at least 100 long. Then, the array can be appended to without undergoing reallocation.
>
> Hmmm... then the code below produces unexpected results ...
>
><code>
> import std.stdio;
>
> void func(int w, char[] x, int y, bool z)
> {
>     char[] ok;
>     char[] addr;
>     char[] emp;
>     char[] nul;
>
>     ok = "Good";
>     if (x.length != y)
>         ok = "Bad1";
>     if (z && x.ptr == null)
>         ok = "Bad2";
>     if (!z && x.ptr != null)
>         ok = "Bad3";
>
>     if (z)
>         addr = "non-zero";
>     else
>         addr = "       0";
>
>     emp = (x.length == 0 ? " empty" : "!empty");
>     nul = (x is null ? " null"  : "!null");
>
>     writefln("Test# %d: Actual Len=%d Addr=%8x\n"
>              "       Expected Len=%d Addr=%s  (%s) %s %s\n",
>                 w, x.length, cast(ulong)x.ptr, y, addr, ok, emp, nul);
> }
>
> void main()
> {
>     char[] b;
>
>
>     func(1, "abc", 3, true);
>     func(2, "", 0, true);
>     func(3, "abc".dup, 3, true);
>     func(4, "".dup, 0, true);
>     func(5, b, 0, false);
>     b = "qwerty".dup;
>     func(6, b, 6, true);
>     b.length = 0;
>     func(7, b, 0, true);
>     b = null;
>     func(8, b, 0, false);
>
>     b = "poiuyt".dup;
>     writefln("Addr is %8x with %d elements", cast(ulong)b.ptr, b.length);
>     b.length = 0;
>     writefln("Addr is %8x with %d elements", cast(ulong)b.ptr, b.length);
>     b.length = 100;
>     writefln("Addr is %8x with %d elements", cast(ulong)b.ptr, b.length);
>     b.length = 0;
>     writefln("Addr is %8x with %d elements", cast(ulong)b.ptr, b.length);
>
> }
></code>
>
><results using v0.126>
> Test# 1: Actual Len=3 Addr=  413188
>        Expected Len=3 Addr=non-zero  (Good) !empty !null
>
> Test# 2: Actual Len=0 Addr=  413198
>        Expected Len=0 Addr=non-zero  (Good)  empty !null
>
> Test# 3: Actual Len=3 Addr=  870fe0
>        Expected Len=3 Addr=non-zero  (Good) !empty !null
>
> Test# 4: Actual Len=0 Addr=       0
>        Expected Len=0 Addr=non-zero  (Bad2)  empty  null
>
> Test# 5: Actual Len=0 Addr=       0
>        Expected Len=0 Addr=       0  (Good)  empty  null
>
> Test# 6: Actual Len=6 Addr=  870fd0
>        Expected Len=6 Addr=non-zero  (Good) !empty !null
>
> Test# 7: Actual Len=0 Addr=       0
>        Expected Len=0 Addr=non-zero  (Bad2)  empty  null
>
> Test# 8: Actual Len=0 Addr=       0
>        Expected Len=0 Addr=       0  (Good)  empty  null
>
> Addr is   870fc0 with 6 elements
> Addr is        0 with 0 elements
> Addr is   872f00 with 100 elements
> Addr is        0 with 0 elements
></results>

Added to DStress as
http://dstress.kuehne.cn/run/p/ptr_10_A.d
http://dstress.kuehne.cn/run/p/ptr_10_B.d
http://dstress.kuehne.cn/run/p/ptr_10_C.d
http://dstress.kuehne.cn/run/p/ptr_10_D.d
http://dstress.kuehne.cn/run/p/ptr_10_E.d
http://dstress.kuehne.cn/run/p/ptr_10_F.d
http://dstress.kuehne.cn/run/p/ptr_10_G.d
http://dstress.kuehne.cn/run/p/ptr_10_H.d

Thomas


-----BEGIN PGP SIGNATURE-----

iD8DBQFCs88d3w+/yD4P9tIRAu6JAJ4iXGnp1Gb4lXGFNZsTWbS6QqaDRQCdH8lM
CjBwYB1cqje3PFNEpM7Gzec=
=EDZq
-----END PGP SIGNATURE-----
June 18, 2005
In article <d8o53s$2bs0$1@digitaldaemon.com>, Walter says...
>
>> Hmmm... then the code below produces unexpected results ...
>
>Oh darn, looks like I've got work to do...

So you mean you weren't aware that it currently sets ptr=null when you set length=0? Or have you just changed your mind about it? :) I mean, some of the Phobos code has been written specifically to work around this feature. Look at these two excerpts from stream.d:

#  char ungetc(char c) {
#    if (c == c.init) return c;
#    // first byte is a dummy so that we never set length to 0
#    if (unget.length == 0)
#      unget.length = 1;
#    unget ~= c;
#    return c;
#  }

#  void flush() {
#    if (unget.length > 1)
#      unget.length = 1; // keep at least 1 so that data ptr stays
#  }

I think that such tricks are extremely ugly, though, so if you fix it I'll be very happy :)

Nick
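
For comparison, a sketch of how space can be reserved in advance without the dummy-element trick, assuming a current D2 compiler where the built-in reserve and capacity array properties exist. This is not what stream.d does and is not the length = 100; length = 0 idiom itself. The helper name preallocated is hypothetical:

```d
// Hypothetical helper: returns an empty int[] whose backing block
// has room for at least n elements.
int[] preallocated(size_t n)
{
    int[] buf;
    buf.reserve(n);   // built-in array property from druntime's object module
    return buf;
}

unittest
{
    auto buf = preallocated(100);
    assert(buf.length == 0);       // logically empty
    assert(buf.capacity >= 100);   // but space is already set aside

    foreach (i; 0 .. 100)
        buf ~= i;                  // appends fit within the reserved capacity
    assert(buf.length == 100);
}
```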


June 18, 2005
Regan Heath:
>I can't see a downside to ensuring the distinction remains

The downside IMHO is that it makes strings and arrays more complicated objects, having not two states (non-empty and empty) but three (non-empty, empty and null). This makes them more confusing, and I think the coding style that this encourages will be less readable and more prone to have hard-to-find bugs. But that's just my opinion.

Nick