March 19, 2008 Empty Array is Null? | ||||
---|---|---|---|---|
| ||||
char[] array = "".dup;
assert(array !is null);

This will exit because the assert condition is false.

Why is that?

-- Brian
|
March 19, 2008 Re: Empty Array is Null? | ||||
---|---|---|---|---|
| ||||
Posted in reply to Brian White | "Brian White" wrote
> char[] array = "".dup;
> assert(array !is null);
>
> This will exit because the assert condition is false.
>
> Why is that?
Here is my guess:
The compiler does not allocate a piece of memory for "", and so the array struct for it looks like:
{ ptr = null, length = 0 }
If you dup this, it gives you the same thing (no need to allocate an array of size 0).
Now, here is the weird part. The compiler does some magic with arrays. If you are comparing an array with null, it changes the code to actually just compare the array pointer to null. So, the following code:
array !is null
is translated to:
array.ptr !is null
And this is why the program fails the assert.
The sucky part about all this is that if you have an empty array where the pointer is NOT null, then you get a different result (that array is not considered to be null)
So an array is null ONLY if the pointer is null. An array is empty if the length is 0. If you want to check for an empty array, just check that the length is 0. If you want to make sure that the pointer is null (which implies the length is 0), then check against null.
So code like this looks weird to people who are used to C# or Java:
array = null;
array.length = 5; // you would expect a segfault here
Because array is really a struct with some compiler magic, the variable array itself can never truly be null.
Anyways, hope this helps.
-Steve
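
A minimal sketch of the two checks described above, assuming a current D compiler. The helper names (isEmpty, isNullArray, growFromNull) are made up for illustration and are not part of any library.

```d
// Hypothetical helpers; the names are illustrative, not from any library.

// "Empty" only looks at the length, whatever .ptr happens to be.
bool isEmpty(const(char)[] a)
{
    return a.length == 0;
}

// 'is null' is an identity test against the null array.
bool isNullArray(const(char)[] a)
{
    return a is null;
}

// Setting .length on a null array allocates storage instead of faulting.
char[] growFromNull(size_t n)
{
    char[] a = null;
    a.length = n;
    return a;
}

unittest
{
    char[] none = null;
    assert(isEmpty(none) && isNullArray(none));

    // An empty slice of live memory: length 0, but .ptr is not null.
    char[] buf = new char[4];
    char[] head = buf[0 .. 0];
    assert(isEmpty(head) && !isNullArray(head));

    char[] grown = growFromNull(5);
    assert(grown.length == 5 && grown.ptr !is null);
}
```

Saved to a file, this can be exercised with `dmd -unittest -main -run file.d`. Checking .length is the test that does not depend on how the empty array was produced.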
|
March 19, 2008 Re: Empty Array is Null? | ||||
---|---|---|---|---|
| ||||
Posted in reply to Steven Schveighoffer | > Anyways, hope this helps.
It confirms I'm not going insane, and that's always helpful. :-)
-- Brian
|
March 19, 2008 Re: Empty Array is Null? | ||||
---|---|---|---|---|
| ||||
Posted in reply to Steven Schveighoffer | Steven Schveighoffer wrote:
> "Brian White" wrote
>> char[] array = "".dup;
>> assert(array !is null);
>>
>> This will exit because the assert condition is false.
>>
>> Why is that?
>
> Here is my guess:
>
> The compiler does not allocate a piece of memory for "", and so the array struct for it looks like:
>
> { ptr = null, length = 0 }

Sorry, but your guess is wrong:
---
urxae@urxae:~/tmp$ cat test.d
import std.stdio;

void main() {
    writefln("%s", "".ptr);
}
urxae@urxae:~/tmp$ dmd -run test.d
805C41C
---

> If you dup this, it gives you the same thing (no need to allocate an array of size 0).

Since, as I mentioned above, the input wasn't null, it's not "the same thing". Otherwise, this is correct (including the reason given; actually allocating 0 bytes is pretty useless).

The fact that empty_arr.dup returns null has been the topic of some discussion in the newsgroups IIRC, but the fact is that it's equivalent to allocating a zero-byte array on the heap in the most important aspects:
* The returned array has the correct length.
* All elements of the returned array are identical to the original array. [1]
* All of the returned array's elements can be freely modified without modifying the original array. [1]
* Changing any of the original elements doesn't change the returned array. [1]
* Appending anything to the returned value doesn't risk changing anything previously allocated (as the GC will allocate a new block of memory when appending to a non-gc-allocated array; which includes null arrays).

On top of all that, it's also very efficient since it doesn't require any allocation (at least, until anything is appended onto it).

The *only* property it doesn't have that 'normal' .dups do have is that normal .dups return unique non-null values. The only ways to even detect that are by 'is'-comparing to null (or a null-valued array) or (implicitly or explicitly) casting it to a boolean. All other behavior is completely consistent.

The discussion on the NGs was, IIRC, between those who considered 'null' to mean "no string" while considering other empty strings as "empty string" and those who just don't see any reason to explicitly distinguish between the two. In the end, I believe, it came down to "Walter is in the latter camp".

[1]: These are trivially true since having no elements that can be read or written means they don't actually require anything for empty arrays.

> Now, here is the weird part. The compiler does some magic with arrays. If you are comparing an array with null, it changes the code to actually just compare the array pointer to null. So, the following code:
>
> array !is null
>
> is translated to:
>
> array.ptr !is null
>
> And this is why the program fails the assert.

Actually, if you compare an array to null (using 'is') DMD performs an 'or' instruction on the .ptr and .length and tests for the flag that it sets if the result is zero. This is just an optimization; this is equivalent to checking if both .ptr and .length are 0 (though presumably faster, since it's a single instruction that doesn't even implement full comparison).

> The sucky part about all this is that if you have an empty array where the pointer is NOT null, then you get a different result (that array is not considered to be null)

Actually, 'array == null' should return true for any empty array. Testing arrays with 'is' explicitly requests comparing .ptr and .length directly, not paying any attention to the contents; 'is' checks for identity, '==' for equivalence.

> So an array is null ONLY if the pointer is null. An array is empty if the length is 0. If you want to check for an empty array, just check that the length is 0. If you want to make sure that the pointer is null (which implies the length is 0), then check against null.

Other ways to check for an empty array are 'arr == ""' or 'arr == null' (using '==' instead of 'is')
|
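
The identity/equivalence distinction can be sketched as follows, assuming a current D compiler. The helper names (sameSlice, sameContents) are hypothetical, and the sketch deliberately does not assert what pointer an empty .dup returns, since that is exactly the detail this thread says code should not rely on.

```d
// Hypothetical helpers; the names are illustrative only.

// Identity: both slices have the same .ptr and the same .length.
bool sameSlice(const(char)[] a, const(char)[] b)
{
    return a is b;
}

// Equivalence: same length and equal elements, wherever they live.
bool sameContents(const(char)[] a, const(char)[] b)
{
    return a == b;
}

unittest
{
    char[] buf = "abc".dup;
    char[] empty = buf[0 .. 0];       // length 0, .ptr points into buf

    assert(empty.ptr !is null);
    assert(sameContents(empty, null)); // equivalent to the null array
    assert(!sameSlice(empty, null));   // but not identical to it
    assert(sameContents(empty, ""));

    char[] copy = buf.dup;             // separate storage, same contents
    assert(sameContents(copy, buf) && !sameSlice(copy, buf));

    char[] dupOfEmpty = "".dup;
    assert(dupOfEmpty.length == 0 && sameContents(dupOfEmpty, ""));
}
```

Every property in the list above can be checked with == and .length alone; only the 'is' comparison can tell one empty array from another.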
March 19, 2008 Re: Empty Array is Null? | ||||
---|---|---|---|---|
| ||||
Posted in reply to Frits van Bommel | "Frits van Bommel" wrote
> Steven Schveighoffer wrote:
>> "Brian White" wrote
>>> char[] array = "".dup;
>>> assert(array !is null);
>>>
>>> This will exit because the assert condition is false.
>>>
>>> Why is that?
>>
>> Here is my guess:
>>
>> The compiler does not allocate a piece of memory for "", and so the array struct for it looks like:
>>
>> { ptr = null, length = 0 }
>
> Sorry, but your guess is wrong:
> ---
> urxae@urxae:~/tmp$ cat test.d
> import std.stdio;
>
> void main() {
>     writefln("%s", "".ptr);
> }
> urxae@urxae:~/tmp$ dmd -run test.d
> 805C41C

Hm... ok, like I said it was a guess :)

> On top of all that, it's also very efficient since it doesn't require any allocation (at least, until anything is appended onto it).
> The *only* property it doesn't have that 'normal' .dups do have is that normal .dups return unique non-null values. The only ways to even detect that are by 'is'-comparing to null (or a null-valued array) or (implicitly or explicitly) casting it to a boolean. All other behavior is completely consistent.
> The discussion on the NGs was, IIRC, between those who considered 'null' to mean "no string" while considering other empty strings as "empty string" and those who just don't see any reason to explicitly distinguish between the two. In the end, I believe, it came down to "Walter is in the latter camp".

My view is that 'array is null' should not compile, as array is not a pointer type. Having statements like this confuses new coders into thinking array is a pure pointer or reference type, when in fact it is a struct. This is especially confusing to Java or C# (and probably other) coders who are used to an array being a heap-allocated type.

But I seriously doubt my view is going to change anything like others before me :)

> Actually, if you compare an array to null (using 'is') DMD performs an 'or' instruction on the .ptr and .length and tests for the flag that it sets if the result is zero. This is just an optimization; this is equivalent to checking if both .ptr and .length are 0 (though presumably faster, since it's a single instruction that doesn't even implement full comparison).

Huh? Why does it do that? If you have a null pointer, then clearly the length should be 0. An optimization in my mind would be to just replace array is null to array.ptr is null. Is there a good reason to have a null pointer array with a non-zero length?

>> The sucky part about all this is that if you have an empty array where the pointer is NOT null, then you get a different result (that array is not considered to be null)
>
> Actually, 'array == null' should return true for any empty array. Testing arrays with 'is' explicitly requests comparing .ptr and .length directly, not paying any attention to the contents; 'is' checks for identity, '==' for equivalence.

I would guess that the newest D compiler would not allow that, since comparing to null is now an error except for using 'x is null'

Of course, this is another guess, since I haven't downloaded the new compiler yet :)

-Steve
|
March 19, 2008 Re: Empty Array is Null? | ||||
---|---|---|---|---|
| ||||
Posted in reply to Steven Schveighoffer | Steven Schveighoffer wrote:
> "Frits van Bommel" wrote
>> Actually, if you compare an array to null (using 'is') DMD performs an 'or' instruction on the .ptr and .length and tests for the flag that it sets if the result is zero. This is just an optimization; this is equivalent to checking if both .ptr and .length are 0 (though presumably faster, since it's a single instruction that doesn't even implement full comparison).
>
> Huh? Why does it do that? If you have a null pointer, then clearly the length should be 0. An optimization in my mind would be to just replace array is null to array.ptr is null. Is there a good reason to have a null pointer array with a non-zero length?

Indeed, no program should be able to get a non-empty array with .ptr == null. However, it appears the compiler currently doesn't use that as an optimization opportunity. Maybe even only because Walter didn't think of it, or just because it doesn't really save that much and it isn't worth the trouble of checking if one of the values is known to be null at compile time.

The 'or' is itself an optimization that only applies when comparing to a 0-length null array, but this optimization may well be implemented completely in the compiler backend which doesn't know that the length should always be null if the pointer is; it may only know that it needs to compare these two numbers against those other two numbers and jump based on the result...

>>> The sucky part about all this is that if you have an empty array where the pointer is NOT null, then you get a different result (that array is not considered to be null)
>> Actually, 'array == null' should return true for any empty array. Testing arrays with 'is' explicitly requests comparing .ptr and .length directly, not paying any attention to the contents; 'is' checks for identity, '==' for equivalence.
>
> I would guess that the newest D compiler would not allow that, since comparing to null is now an error except for using 'x is null'
>
> Of course, this is another guess, since I haven't downloaded the new compiler yet :)

I'm pretty sure it's only an error when comparing class instances. It shouldn't be an error to compare pointers or arrays against null. (There's no reason for it to be since they don't use vtables)
|
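
The 'or' trick can be written out by hand as a rough sketch. This is only an illustration of the idea, not a description of what any particular DMD version emits; the function name bothZero is hypothetical.

```d
// Hypothetical helper: OR the two fields of the slice together.
// The result is zero exactly when .ptr and .length are both zero,
// which is the same condition as 'a is null'.
bool bothZero(const(char)[] a)
{
    return ((cast(size_t) a.ptr) | a.length) == 0;
}

unittest
{
    char[] none = null;
    assert(bothZero(none) && none is null);

    char[] buf = new char[3];
    char[] empty = buf[0 .. 0];        // zero length, non-null .ptr
    assert(!bothZero(empty) && empty !is null);
    assert(!bothZero(buf) && buf !is null);
}
```

A single OR followed by a zero test replaces two separate comparisons, which is presumably why a backend would pick it without knowing anything about the "null pointer implies zero length" rule.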
March 19, 2008 Re: Empty Array is Null? | ||||
---|---|---|---|---|
| ||||
Posted in reply to Frits van Bommel | "Frits van Bommel" wrote
> Steven Schveighoffer wrote:
>> "Frits van Bommel" wrote
>>> Actually, if you compare an array to null (using 'is') DMD performs an 'or' instruction on the .ptr and .length and tests for the flag that it sets if the result is zero. This is just an optimization; this is equivalent to checking if both .ptr and .length are 0 (though presumably faster, since it's a single instruction that doesn't even implement full comparison).
>>
>> Huh? Why does it do that? If you have a null pointer, then clearly the length should be 0. An optimization in my mind would be to just replace array is null to array.ptr is null. Is there a good reason to have a null pointer array with a non-zero length?
>
> Indeed, no program should be able to get a non-empty array with .ptr == null. However, it appears the compiler currently doesn't use that as an optimization opportunity. Maybe even only because Walter didn't think of it, or just because it doesn't really save that much and it isn't worth the trouble of checking if one of the values is known to be null at compile time.
> The 'or' is itself an optimization that only applies when comparing to a 0-length null array, but this optimization may well be implemented completely in the compiler backend which doesn't know that the length should always be null if the pointer is; it may only know that it needs to compare these two numbers against those other two numbers and jump based on the result...

Good point. I wonder if comparing any struct to null is equivalent to comparing if all its values are 0...

>>>> The sucky part about all this is that if you have an empty array where the pointer is NOT null, then you get a different result (that array is not considered to be null)
>>> Actually, 'array == null' should return true for any empty array. Testing arrays with 'is' explicitly requests comparing .ptr and .length directly, not paying any attention to the contents; 'is' checks for identity, '==' for equivalence.
>>
>> I would guess that the newest D compiler would not allow that, since comparing to null is now an error except for using 'x is null'
>>
>> Of course, this is another guess, since I haven't downloaded the new compiler yet :)
>
> I'm pretty sure it's only an error when comparing class instances. It shouldn't be an error to compare pointers or arrays against null. (There's no reason for it to be since they don't use vtables)

I think you are right. Now that I look at Walter's message, he said specifically comparing class to null is invalid...

Thanks

-Steve
|
March 19, 2008 Re: Empty Array is Null? | ||||
---|---|---|---|---|
| ||||
Posted in reply to Steven Schveighoffer | Steven Schveighoffer wrote:
>
> Now, here is the weird part. The compiler does some magic with arrays. If you are comparing an array with null, it changes the code to actually just compare the array pointer to null. So, the following code:
>
no this is the weird part:
IIRC this passes.
char* cp = cast(char*)null;
char[] ca = ap[0..15];
assert(ca.ptr == null && ca.length == 15);
|
March 19, 2008 Re: Empty Array is Null? | ||||
---|---|---|---|---|
| ||||
Posted in reply to BCS | BCS wrote:
> Steven Schveighoffer wrote:
>>
>> Now, here is the weird part. The compiler does some magic with arrays. If you are comparing an array with null, it changes the code to actually just compare the array pointer to null. So, the following code:
>>
>
> no this is the weird part:
>
> IIRC this passes.
>
> char* cp = cast(char*)null;
> char[] ca = ap[0..15];
> assert(ca.ptr == null && ca.length == 15);
It does with DMD, if you s/ap/cp/, but I'm pretty sure what you're doing is invoking undefined behavior. Or at least it should be, but I can't seem to find any mention of it in the spec...
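
With the s/ap/cp/ fix applied, the snippet can be sketched as below. This assumes a current D compiler and @system code; whether the language actually defines this is the open question in this post, so treat it as a demonstration of what the slice holds, not as a recommended pattern. The function name sliceFrom is hypothetical.

```d
// Hypothetical helper: build a slice from a raw pointer and a length.
// Slicing a pointer is not bounds-checked, because a raw pointer
// carries no length to check against. No memory is read here.
char[] sliceFrom(char* p, size_t n)
{
    return p[0 .. n];
}

unittest
{
    char* cp = null;
    char[] ca = sliceFrom(cp, 15);

    assert(ca.ptr is null && ca.length == 15);
    // Identity with null needs both fields to match, so this slice
    // has a null .ptr and still is not the null array.
    assert(ca !is null);

    // Reading or writing ca[0] would dereference a null pointer.
}
```

The slice is only a (pointer, length) pair, so constructing it touches no memory; any problem would only show up when an element is accessed.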
|
March 19, 2008 Re: Empty Array is Null? | ||||
---|---|---|---|---|
| ||||
Posted in reply to Frits van Bommel | Frits van Bommel wrote:
> BCS wrote:
>> Steven Schveighoffer wrote:
>>>
>>> Now, here is the weird part. The compiler does some magic with arrays. If you are comparing an array with null, it changes the code to actually just compare the array pointer to null. So, the following code:
>>>
>>
>> no this is the weird part:
>>
>> IIRC this passes.
>>
>> char* cp = cast(char*)null;
>> char[] ca = ap[0..15];
>> assert(ca.ptr == null && ca.length == 15);
>
> It does with DMD, if you s/ap/cp/, but I'm pretty sure what you're doing is invoking undefined behavior. Or at least it should be, but I can't seem to find any mention of it in the spec...
Can't see why that would be undefined. It's pretty clear what it means.
Perhaps it should be an error when the compiler detects that you're setting .ptr to null but .length to nonzero. But the compiler can't be expected to detect that in the general case, so it would be of limited usefulness. A bit like disallowing comparing objects to 'null' with ==.
|