Checking if a string is null
Jul 25, 2007
Max Samukha
July 25, 2007
Using '== null' and 'is null' with strings gives odd results (DMD
1.019):

void main()
{
	char[] s;

	if (s is null) writefln("s is null");
	if (s == null) writefln("s == null");
}

Output:
s is null
s == null

----

void main()
{
	char[] s = "";

	if (s is null) writefln("s is null");
	if (s == null) writefln("s == null");
}

Output:
s == null

----

Can anybody explain why s == null is true in the second example?





July 25, 2007
Max Samukha wrote:
> Using '== null' and 'is null' with strings gives odd results (DMD
> 1.019):
> 
> void main()
> {
> 	char[] s;
> 
> 	if (s is null) writefln("s is null");
> 	if (s == null) writefln("s == null");		
> }
> 
> Output:
> s is null
> s == null
> 
> ----
> 
> void main()
> {
> 	char[] s = "";
> 
> 	if (s is null) writefln("s is null");
> 	if (s == null) writefln("s == null");		
> }
> 
> Output:
> s == null
> 
> ----
> 
> Can anybody explain why s == null is true in the second example?
> 
Makes sense to me. 'is' compares the pointer and '==' the content, or something like that.
July 25, 2007
On Wed, 25 Jul 2007 08:32:52 +0200, Hoenir <mrmocool@gmx.de> wrote:

>Max Samukha wrote:
>> Using '== null' and 'is null' with strings gives odd results (DMD
>> 1.019):
>> 
>> void main()
>> {
>> 	char[] s;
>> 
>> 	if (s is null) writefln("s is null");
>> 	if (s == null) writefln("s == null");
>> }
>> 
>> Output:
>> s is null
>> s == null
>> 
>> ----
>> 
>> void main()
>> {
>> 	char[] s = "";
>> 
>> 	if (s is null) writefln("s is null");
>> 	if (s == null) writefln("s == null");
>> }
>> 
>> Output:
>> s == null
>> 
>> ----
>> 
>> Can anybody explain why s == null is true in the second example?
>> 
>Makes sense to me. 'is' compares the pointer and '==' the content, or something like that.

Then it's unclear what null content means. If it means the same as an empty string (ptr != null and length == 0), I remain confused. If it means a null string (ptr == null and length == 0), the second example should output nothing, since s.ptr != null.
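
For reference, the two fields in question can be printed directly. This is only a minimal sketch, assuming a current D2 compiler rather than the DMD 1.019 used above (so it uses string instead of char[]); describe is a hypothetical helper name, not something from Phobos.

```d
import std.stdio;
import std.format : format;

/// Hypothetical helper: reports the pointer and length parts of a string.
string describe(string s)
{
    return format("ptr null: %s, length: %s", s.ptr is null, s.length);
}

void main()
{
    writeln(describe(null)); // null string: ptr null: true, length: 0
    writeln(describe(""));   // empty literal: ptr null: false, length: 0
}
```

Both values have length 0; only the pointer part differs, which is the distinction 'is' picks up.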
July 25, 2007
Max Samukha wrote:
> Using '== null' and 'is null' with strings gives odd results (DMD
> 1.019):
> 
> void main()
> {
> 	char[] s;
> 
> 	if (s is null) writefln("s is null");
> 	if (s == null) writefln("s == null");		
> }
> 
> Output:
> s is null
> s == null
> 
> ----
> 
> void main()
> {
> 	char[] s = "";
> 
> 	if (s is null) writefln("s is null");
> 	if (s == null) writefln("s == null");		
> }
> 
> Output:
> s == null
> 
> ----
> 
> Can anybody explain why s == null is true in the second example?

Not I, it's inconsistent IMO and it gets worse:

import std.stdio;

void main()
{
	foo(null);
	foo("");	
}

void foo(string s)
{
	writefln(s.ptr, ", ", s.length);
	if (s is null) writefln("s is null");
	if (s == null) writefln("s == null");
	if (s < null)  writefln("s <  null");
	if (s > null)  writefln("s <  null");
	if (s <= null) writefln("s <= null");
	if (s >= null) writefln("s <  null");
	writefln("");
}

Output:
0000, 0
s is null
s == null
s <= null
s <  null

415080, 0
s == null
s <= null
s <  null

So, "" is < and == null!?
and <=,== but not >=!?


This all boils down to the empty vs null string debate where some people want to be able to distinguish between them and some see no point.

I'm in the 'distinguishable' camp.  I can see the merit.  At the very least it should be consistent!

Regan
July 25, 2007
Manfred Nowak wrote:
> Regan Heath wrote
> 
>> This all boils down to the empty vs null string debate where some
>> people want to be able to distinguish between them and some see no
>> point. 
> 
> I haven't seen such a debate.

There have been several, I did a brief search and came up with:

http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D&article_id=55270
(this one was my fault)

http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D&article_id=25804
http://www.digitalmars.com/d/archives/digitalmars/D/learn/3521.html
http://www.digitalmars.com/d/archives/21782.html
http://www.digitalmars.com/d/archives/digitalmars/D/27123.html
http://www.digitalmars.com/d/archives/16905.html
http://www.digitalmars.com/d/archives/digitalmars/D/bugs/Issue_1314_New_Dupping_an_empty_array_creates_a_null_array_11585.html
http://www.digitalmars.com/pnews/read.php?server=news.digitalmars.com&group=D&artnum=17083

Some of those go back a long, long way.

> Does it mean that it is not possible to implement a Kleene Algebra for strings in D because there is no neutral element for the alternative operator?

I have no idea. :)

Regan
July 25, 2007
On Wed, 25 Jul 2007 11:12:19 +0100, Regan Heath <regan@netmail.co.nz> wrote:

>Max Samukha wrote:
>> Using '== null' and 'is null' with strings gives odd results (DMD
>> 1.019):
>> 
>> void main()
>> {
>> 	char[] s;
>> 
>> 	if (s is null) writefln("s is null");
>> 	if (s == null) writefln("s == null");
>> }
>> 
>> Output:
>> s is null
>> s == null
>> 
>> ----
>> 
>> void main()
>> {
>> 	char[] s = "";
>> 
>> 	if (s is null) writefln("s is null");
>> 	if (s == null) writefln("s == null");
>> }
>> 
>> Output:
>> s == null
>> 
>> ----
>> 
>> Can anybody explain why s == null is true in the second example?
>
>Not I, it's inconsistent IMO and it gets worse:
>
>import std.stdio;
>
>void main()
>{
>	foo(null);
>	foo("");
>}
>
>void foo(string s)
>{
>	writefln(s.ptr, ", ", s.length);
>	if (s is null) writefln("s is null");
>	if (s == null) writefln("s == null");
>	if (s < null)  writefln("s <  null");
>	if (s > null)  writefln("s <  null");
>	if (s <= null) writefln("s <= null");
>	if (s >= null) writefln("s <  null");
>	writefln("");
>}
>
>Output:
>0000, 0
>s is null
>s == null
>s <= null
>s <  null
>
>415080, 0
>s == null
>s <= null
>s <  null
>
>So, "" is < and == null!?
>and <=,== but not >=!?
>

You didn't update all writefln's :)

>
>This all boils down to the empty vs null string debate where some people want to be able to distinguish between them and some see no point.
>
>I'm in the 'distinguishable' camp.  I can see the merit.  At the very least it should be consistent!
>
>Regan

Anyway, it feels like an undefined area in the language. Do the specs say anything about how exactly arrays/strings/delegates should compare to null? It seems to be more than comparing the pointer part of the structs.
July 25, 2007
I believe the manual says that, when comparing with ==, the compiler tries to call the opEquals() method, and calling that on a null reference yields undefined behavior. You should use the '!is null' construct instead.

Max Samukha Wrote:

July 25, 2007
>> So, "" is < and == null!?
>> and <=,== but not >=!?
>>
> 
> You didn't update all writefln's :)

<hangs head in shame> What can I say, I'm having a bad morning.

> Anyway, it feels like an undefined area in the language. Do the specs
> say anything about how exactly arrays/strings/delegates should compare
> to null? It seems to be more than comparing the pointer part of the
> structs.

Not that I can find.  The array page does say:

"Strings can be copied, compared, concatenated, and appended:"
..
"with the obvious semantics."

but not much more on the topic.  Under "Array Initialization" we see:

    * Pointers are initialized to null.
    ..
    * Dynamic arrays are initialized to having 0 elements.
    ..

Which does not state that an array will be initialised to "null" but rather to something with 0 elements.

To my mind something with 0 elements is 'empty' as opposed to being 'nonexistent', which is typically represented by 'null' or a similar value (like NaN for floats, 0xFF for char, etc).

So, it seems the spec is hinting/saying that arrays cannot be nonexistent, only empty (or not empty).

And yet in the current implementation there is clearly a difference between 'null' and "" when it comes to arrays.

I'm still firmly in favour of there being 3 distinct states for an array:
 * nonexistent (null)
 * empty       ("", length == 0)
 * not empty   (length > 0)

That said, I'm also firmly in favour of not getting a seg-fault when I have a reference to a nonexistent array (we currently have this behaviour and it's perfect).

All I think that needs 'fixing', and going back to your initial test case:

char[] s = "";

if (s is null) writefln("s is null");
if (s == null) writefln("s == null");		

neither of these tests should evaluate 'true'.

The fact that the latter does indicates to me that the array compare is first comparing length, seeing they're both 0 and assuming the arrays must be equal.

I think instead it should also check the data pointer because in the case of "" the data pointer is non-null.  The same is true for a zero length slice i.e. s[0..0], it exists (data pointer is non-null) but is empty (length is zero).

In short, the compare function should recognise the 3 states:
 * nonexistent (data pointer is null)
 * empty       (data pointer is non-null, length is zero)
 * not empty   (length is > zero)

and never make the mistake of calling an array in one state equal to an array in another state.
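
The three states above can be sketched as a small classifier. This is only an illustration under assumptions: it targets a current D2 compiler, and ArrayState and stateOf are hypothetical names, not anything in the language or Phobos.

```d
import std.stdio;

/// Hypothetical names for the three states listed above.
enum ArrayState { nonexistent, empty, notEmpty }

/// Classifies a character array by its data pointer and length.
ArrayState stateOf(const(char)[] a)
{
    if (a.length > 0) return ArrayState.notEmpty;
    // Length is zero: the data pointer decides between the other two states.
    return a.ptr is null ? ArrayState.nonexistent : ArrayState.empty;
}

void main()
{
    string text = "abc";
    writeln(stateOf(null));         // nonexistent
    writeln(stateOf(""));           // empty
    writeln(stateOf(text[0 .. 0])); // empty (zero-length slice keeps its pointer)
    writeln(stateOf(text));         // notEmpty
}
```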

Regan

p.s. I am cross-posting and setting followup to digitalmars.D as it has become more of a theory/discussion on D than a learning exercise :)

p.p.s Plus, I figure if Manfred cannot recall a discussion on this topic we probably need another one about now.
July 25, 2007
Regan Heath wrote:
> Max Samukha wrote:
>> Using '== null' and 'is null' with strings gives odd results (DMD
>> 1.019):
>>
>> void main()
>> {
>>     char[] s;
>>
>>     if (s is null) writefln("s is null");
>>     if (s == null) writefln("s == null");
>> }
>>
>> Output:
>> s is null
>> s == null
>>
>> ----
>>
>> void main()
>> {
>>     char[] s = "";
>>
>>     if (s is null) writefln("s is null");
>>     if (s == null) writefln("s == null");
>> }
>>
>> Output:
>> s == null
>>
>> ----
>>
>> Can anybody explain why s == null is true in the second example?
> 
> Not I, it's inconsistent IMO and it gets worse:
> 
> import std.stdio;
> 
> void main()
> {
>     foo(null);
>     foo("");
> }
> 
> void foo(string s)
> {
>     writefln(s.ptr, ", ", s.length);
>     if (s is null) writefln("s is null");
>     if (s == null) writefln("s == null");
>     if (s < null)  writefln("s <  null");
>     if (s > null)  writefln("s <  null");
>     if (s <= null) writefln("s <= null");
>     if (s >= null) writefln("s <  null");
>     writefln("");
> }
> 
> Output:
> 0000, 0
> s is null
> s == null
> s <= null
> s <  null
> 
> 415080, 0
> s == null
> s <= null
> s <  null
> 
> So, "" is < and == null!?
> and <=,== but not >=!?

As Max said, you forgot to update some writeflns. The output of the corrected version is:
===
0000, 0
s is null
s == null
s <= null
s >= null

805BEF0, 0
s == null
s <= null
s >= null
===

Seems perfectly consistent to me. Anything with an equality comparison (==, <=, >=) is true in both cases, and 'is' is only true when the pointer as well as the length is equal.

> This all boils down to the empty vs null string debate where some people want to be able to distinguish between them and some see no point.
> 
> I'm in the 'distinguishable' camp.  I can see the merit.  At the very least it should be consistent!

They *are* distinguishable. That's why above code returns different results for the 'is' comparison...

I for one am perfectly fine with "cast(char[]) null" meaning ".length == 0 && .ptr == null" and with comparisons of arrays using == and friends only inspecting the contents (not location) of the data.

Now, about comparisons: array comparisons basically operate like this:
---
int opEquals(T)(T[] u, T[] v) {              // bah to int return type
    if (u.length != v.length) return false;
    for (size_t i = 0; i < u.length; i++) {
        if (u[i] != v[i]) return false;
    }
    return true;
}

int opCmp(T)(T[] u, T[] v) {
    size_t len = min(u.length, v.length);
    for (size_t i = 0; i < len; i++) {
        if (auto diff = u[i].opCmp(v[i])) {
            return diff;
        }
    }
    return cast(int)u.length - cast(int)v.length;
}
---
(Taken from object.TypeInfo_Array and converted to templates instead of void*s + casting + element TypeInfo.{equals/compare} for readability)

Since both the null string and "" have .length == 0, that means they compare equal using those methods (having no contents to compare and equal length)

This is all perfectly consistent (and even useful) to me...
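
A minimal sketch of the behaviour described above, assuming a current D2 compiler (the output quoted earlier came from D1). holdsAgainstNull is a hypothetical helper, and a typed null variable is used for the comparisons to keep the example unambiguous.

```d
import std.stdio;

/// Hypothetical helper: lists which comparisons against a null string hold for s.
string[] holdsAgainstNull(string s)
{
    string none = null; // typed null: ptr is null, length is 0
    string[] hits;
    if (s is none) hits ~= "is"; // identity: pointer and length must match
    if (s == none) hits ~= "=="; // contents only
    if (s <  none) hits ~= "<";
    if (s >  none) hits ~= ">";
    if (s <= none) hits ~= "<=";
    if (s >= none) hits ~= ">=";
    return hits;
}

void main()
{
    writeln(holdsAgainstNull(null)); // ["is", "==", "<=", ">="]
    writeln(holdsAgainstNull(""));   // ["==", "<=", ">="]
}
```

Only the identity test separates the two cases; every content-based comparison sees two zero-length arrays.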
July 25, 2007
>> I'm in the 'distinguishable' camp.  I can see the merit.  At the very least it should be consistent!
> 
> They *are* distinguishable. That's why above code returns different results for the 'is' comparison...

True.  I guess what I meant to say was I'm in the '3 distinct states' camp (which may be a camp of 1 for all I know).  See my reply to digitalmars.D for a definition of the 3 states.

> I for one am perfectly fine with "cast(char[]) null" meaning ".length == 0 && .ptr == null" 

Same here.

> and with comparisons of arrays using == and friends
> only inspecting the contents (not location) of the data.

I don't think an empty string (non-null, length == 0) should compare equal to a nonexistent string (null, length == 0), or vice versa.

The only thing that should compare equal to null is null.  Likewise an empty array should only compare equal to another empty array.

My reasoning for this is consistency, see at end.

Aside: If the location and length are identical you can short-circuit the compare, returning true and ignoring the content; this could save a bit of time on comparisons of large arrays.

> Now, about comparisons: array comparisons basically operate like this:
> ---
> int opEquals(T)(T[] u, T[] v) {              // bah to int return type
>     if (u.length != v.length) return false;
>     for (size_t i = 0; i < u.length; i++) {
>         if (u[i] != v[i]) return false;
>     }
>     return true;
> }
> 
> int opCmp(T)(T[] u, T[] v) {
>     size_t len = min(u.length, v.length);
>     for (size_t i = 0; i < len; i++) {
>         if (auto diff = u[i].opCmp(v[i])) {
>             return diff;
>         }
>     }
>     return cast(int)u.length - cast(int)v.length;
> }
> ---
> (Taken from object.TypeInfo_Array and converted to templates instead of void*s + casting + element TypeInfo.{equals/compare} for readability)

Thanks.

> Since both the null string and "" have .length == 0, that means they compare equal using those methods (having no contents to compare and equal length)

This is the bit I don't like.

> This is all perfectly consistent (and even useful) to me...

It's not consistent with other reference types, types which can represent 'nonexistent', e.g.

  char *p = null;  // nonexistent

  if (p == null) writefln("p == null");
  if (p == "") writefln("p == \"\"");

Output:
  p == null

Compare that to:

  char[] p = null;

  if (p == null) writefln("p == null");
  if (p == "") writefln("p == \"\"");

Output:
  p == null
  p == ""

All that I would like changed is for the compare, in the case of length == 0, to check the data pointers, eg.

> int opEquals(T)(T[] u, T[] v) {
>     if (u.length != v.length) return false;
      if (u.length == 0) return (u.ptr == v.ptr);
>     for (size_t i = 0; i < u.length; i++) {
>         if (u[i] != v[i]) return false;
>     }
>     return true;
> }

This should mean "" == "" but not "" == null, likewise null == null but not null == "".
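
The proposed rule can be tried out as a free function without touching the built-in ==. This is a sketch only, assuming a current D2 compiler; strictEquals is a hypothetical name, and it is limited to character arrays to keep the example short.

```d
import std.stdio;

/// Hypothetical strictEquals: like ==, except that zero-length arrays
/// are equal only when their data pointers are identical.
bool strictEquals(const(char)[] u, const(char)[] v)
{
    if (u.length != v.length) return false;
    if (u.length == 0) return u.ptr is v.ptr; // the proposed extra check
    return u == v; // non-empty: ordinary element-wise comparison
}

void main()
{
    string none = null;
    string empty = "";
    writeln(strictEquals(none, none));        // true
    writeln(strictEquals(empty, empty));      // true (same pointer)
    writeln(strictEquals(empty, none));       // false
    writeln(strictEquals("abc", "abc".idup)); // true (contents match)
}
```

One consequence of comparing pointers this way: two empty slices taken from different positions have different data pointers, so they would also compare unequal under this rule.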

Regan