May 18, 2018 is == | ||||
---|---|---|---|---|
| ||||
Why does D complain when using == to compare with null? Is there really any technical reason? if one just defines == null to is null then there should be no problem. It seems like a pedantic move by who ever implemented it and I'm hoping there is actually a good technical reason for it. |
May 18, 2018 Re: is == | ||||
---|---|---|---|---|
| ||||
Posted in reply to IntegratedDimensions | On Friday, 18 May 2018 at 23:53:12 UTC, IntegratedDimensions wrote:
> Why does D complain when using == to compare with null? Is there really any technical reason? if one just defines == null to is null then there should be no problem. It seems like a pedantic move by who ever implemented it and I'm hoping there is actually a good technical reason for it.
D only complains about this when you use ref types (classes or AAs). For example:
--- test.d
void main()
{
    int* p;
    assert(p == null && p is null);

    class C
    {
        int x;
    }

    C c;
    assert(c is null);
    assert(c == null); // error: c is a reference, so there is confusion between opEquals and a null check
}
---
|
May 19, 2018 Re: is == | ||||
---|---|---|---|---|
| ||||
Posted in reply to Uknown | On Friday, 18 May 2018 at 23:58:18 UTC, Uknown wrote:
> On Friday, 18 May 2018 at 23:53:12 UTC, IntegratedDimensions wrote:
>> Why does D complain when using == to compare with null? Is there really any technical reason? if one just defines == null to is null then there should be no problem. It seems like a pedantic move by who ever implemented it and I'm hoping there is actually a good technical reason for it.
>
> D only complains of this when you use ref types (classes or AAs). For e.g:
> --- test.d
> void main()
> {
> int * p;
> assert (p == null && p is null);
> class C
> {
> int x;
> }
> C c;
> assert (c is null);
> assert (c == null); //error, c is a reference, so there is confusion between opEquals and null check
> }
> ---
or pointers.
|
May 19, 2018 Re: is == | ||||
---|---|---|---|---|
| ||||
Posted in reply to IntegratedDimensions | On Friday, 18 May 2018 at 23:53:12 UTC, IntegratedDimensions wrote:
> Why does D complain when using == to compare with null? Is there really any technical reason? if one just defines == null to is null then there should be no problem. It seems like a pedantic move by who ever implemented it and I'm hoping there is actually a good technical reason for it.
tldr: this error is outdated.
In the days of yore, "obj == null" would call "obj.opEquals(null)". Attempting to call a virtual method on a null object is a quick path to a segmentation fault. So "obj == null" would either yield false or crash your program.
Except it's worse than that; your opEquals method had to explicitly check for null. So if your class had a custom equality function, "obj == null" was probably going to segfault no matter what.
Because of this common source of errors, in DMD 2.012 (2008), we got an error only for the case of comparing with a literal null. (The compiler isn't a mind-reader; it doesn't know whether that variable will be null when that line of code executes.)
This still sucked, so in 2015 we got a runtime function to handle object equality:
https://github.com/dlang/druntime/blob/dff824eda422b1fcdde5f2fe53120fcd71733aaa/src/object.d#L140
But we haven't removed the error message.
It *is* faster to call "foo is null" than "foo == null", but I don't think that's particularly worth a compiler error. The compiler could just convert it to "is null" automatically in that case.
One casualty of the current state of affairs is that no object may compare equal to null.
|
May 18, 2018 Re: is == | ||||
---|---|---|---|---|
| ||||
Posted in reply to IntegratedDimensions | On Friday, May 18, 2018 23:53:12 IntegratedDimensions via Digitalmars-d-learn wrote:
> Why does D complain when using == to compare with null? Is there really any technical reason? if one just defines == null to is null then there should be no problem. It seems like a pedantic move by who ever implemented it and I'm hoping there is actually a good technical reason for it.
Because == is pretty much never what you want to do with null. How much it matters depends on the types involved, but if you really want to check for null, is is definitely the right thing to use.
In the case of pointers and references, is checks that they're pointing to the same thing. So,
foo is null
directly checks whether the reference or pointer is null. On the other hand, if you use ==, it's calling some form of opEquals. For pointers, that should generate identical code, but for class references, it means calling the free function opEquals. That function will check whether the references are null before calling opEquals on either of the class objects, but it does add unnecessary overhead (which, as I understand it, the compiler is unfortunately not currently able to optimize away) and provides no benefit over checking with is.
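The null handling described above can be sketched as a free function. This is a simplified illustration, not the actual druntime source: `nullSafeEquals` and `Point` are hypothetical names, and the real free function opEquals does additional work that is omitted here.

```d
// Hypothetical class with a custom equality function.
class Point
{
    int x;

    this(int x) { this.x = x; }

    override bool opEquals(Object o)
    {
        // A failed downcast yields null, so unrelated types compare unequal.
        auto rhs = cast(Point) o;
        return rhs !is null && x == rhs.x;
    }
}

// Simplified sketch of a null-safe equality helper (assumed name).
bool nullSafeEquals(Object lhs, Object rhs)
{
    if (lhs is rhs) return true;                   // same object, or both null
    if (lhs is null || rhs is null) return false;  // exactly one side is null
    return lhs.opEquals(rhs);                      // safe: lhs is non-null here
}

void main()
{
    Point a = new Point(1);
    Point b = new Point(1);
    Point n;                        // class references default to null

    assert(nullSafeEquals(a, b));   // equal contents
    assert(!nullSafeEquals(a, n));  // no segfault, just false
    assert(nullSafeEquals(n, n));   // both null
    assert(a == b);                 // == on class references goes through a free function
    assert(n is null);              // identity check, no opEquals involved
}
```

The two null checks come before the virtual call, which is the extra work that a plain `is null` avoids.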
Now, where is vs == _really_ matters (but unfortunately, the compiler does not complain about) is with dynamic arrays. If you do
arr is null
then the compiler will check whether the array's ptr is null. So, something like
"" is null
would be false. However, if you use ==, then it compares the length of the array and then only compares the ptrs if the length is non-zero. So,
"" == null
is true. So, with dynamic arrays, using == with null is a huge code smell. It _may_ be exactly what the programmer intends, but the odds are pretty high that they just don't properly understand the difference between is and ==, and they meant to be checking whether the array was actually null but just ended up checking whether its length was zero (which won't matter for some code but will cause subtle bugs in any code that treats null as special - e.g. if that is used to indicate that the array had not been given a value). Now, because of how == treats null like empty, it _is_ a bit risky to try and treat null as special with arrays, but anyone wanting to be clear in their code should either be checking null with is (in which case, they clearly care about null and not empty), or if they care about length == 0, they should either be calling empty on the array or explicitly checking the array's length, since that's what they care about. Much as having == work with null arrays avoids issues with segfaults due to an array being uninitialized as well as avoids needing to give memory to an array just to have it be empty, you pretty much never actually care whether an array == null. You either care that its ptr is null (in which case, is checks that), or you care about whether its length is 0 (in which case empty or directly checking length checks that). arr == null is just unclear and likely buggy.
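A small program makes the difference concrete. This is a minimal sketch of the behavior described above; the variable names are made up for illustration.

```d
void main()
{
    int[] unset;                         // never assigned: ptr is null, length is 0
    int[] backing = [1, 2, 3];
    int[] emptySlice = backing[0 .. 0];  // length 0, but ptr points into backing

    // `is null` tells the two apart.
    assert(unset is null);
    assert(emptySlice !is null);

    // `== null` does not: both have length 0, so both compare equal to null.
    assert(unset == null);
    assert(emptySlice == null);

    // The string literal case from the text above.
    assert(!("" is null));
    assert("" == null);
}
```

Code that uses null to mean "no value was given" would treat `emptySlice` as unset if it checked with ==.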
So really, there are _zero_ advantages to comparing null with ==. Using == with null risks adding extra overhead, and it often makes the code less clear. On the other hand, using is makes it crystal clear what you mean and then does exactly what you mean - check whether the variable is actually null. So, maybe the compiler is being a bit pedantic by insisting that you use is rather than ==, but you really should be using is and not == when checking for null.
- Jonathan M Davis
|
May 18, 2018 Re: is == | ||||
---|---|---|---|---|
| ||||
Posted in reply to Neia Neutuladh | On Saturday, May 19, 2018 01:27:59 Neia Neutuladh via Digitalmars-d-learn wrote:
> On Friday, 18 May 2018 at 23:53:12 UTC, IntegratedDimensions wrote:
>> Why does D complain when using == to compare with null? Is there really any technical reason? if one just defines == null to is null then there should be no problem. It seems like a pedantic move by who ever implemented it and I'm hoping there is actually a good technical reason for it.
>
> tldr: this error is outdated.
>
> In the days of yore, "obj == null" would call "obj.opEquals(null)". Attempting to call a virtual method on a null object is a quick path to a segmentation fault. So "obj == null" would either yield false or crash your program.
>
> Except it's worse than that; your opEquals method had to explicitly check for null. So if your class had a custom equality function, "obj == null" was probably going to segfault no matter what.
>
> Because of this common source of errors, in DMD 2.012 (2008), we got an error only for the case of comparing with a literal null. (The compiler isn't a mind-reader; it doesn't know whether that variable will be null when that line of code executes.)
>
> This still sucked, so in 2015 we got a runtime function to handle object equality:
> https://github.com/dlang/druntime/blob/dff824eda422b1fcdde5f2fe53120fcd71733aaa/src/object.d#L140
>
> But we haven't removed the error message.
Actually, that runtime function has existed since before TDPL came out in 2010. It even shows the implementation of the free function opEquals (which at the time was in object_.d rather than object.d). I'm not even sure that the error message was added before the free function version of opEquals was. Maybe when that error message was first introduced, it avoided a segfault, but if so, it has been a _long_ time since that was the case.
> It *is* faster to call "foo is null" than "foo == null", but I don't think that's particularly worth a compiler error. The compiler could just convert it to "is null" automatically in that case.
>
> One casualty of the current state of affairs is that no object may compare equal to null.
Honestly, while the compiler probably should just convert obj == null to obj is null, there still isn't really a good reason to ever use == with null. It's _never_ better than using is, and in some cases, it's worse.
Of course, the most notable case where using == with null is a terrible idea is dynamic arrays, and that's the case where the compiler _doesn't_ complain. Using == with null and arrays is always unclear about the programmer's intent and almost certainly wasn't what the programmer intended. If the programmer cares about null, they should use is. If they care about length, then that's what they should check. Checking null with == is just a huge code smell.
So, perhaps the compiler is being pedantic, but it's still telling you the right thing. It's just insisting about it in the case where it matters less while not complaining about it in the case where it really matters, which is dumb. So IMHO, if anything, adding an error message for the array case would make more sense than getting rid of the error with pointers and references.
- Jonathan M Davis
|
May 19, 2018 Re: is == | ||||
---|---|---|---|---|
| ||||
Posted in reply to Jonathan M Davis | On Saturday, 19 May 2018 at 01:48:38 UTC, Jonathan M Davis wrote:
> Actually, that runtime function has existed since before TDPL came out in 2010. It even shows the implementation of the free function opEquals (which at the time was in object_.d rather than object.d). I'm not even sure that the error message was added before the free function version of opEquals was. Maybe when that error message was first introduced, it avoided a segfault, but if so, it has been a _long_ time since that was the case.
Good catch. I overly trusted git blame. The opEquals(Object, Object) function was added in February 2010, while the error message was added in March 2008.
> Of course, the most notable case where using == with null is a terrible idea is dynamic arrays, and that's the case where the compiler _doesn't_ complain. Using == with null and arrays is always unclear about the programmer's intent and almost certainly wasn't what the programmer intended. If the programmer cares about null, they should use is. If they care about length, then that's what they should check. Checking null with == is just a huge code smell.
I feel like the array == null version is more explicit about not allocating memory. However, I'm paranoid about whether that's going to check the pointer instead, so I mostly use array.length == 0 instead.
|
May 18, 2018 Re: is == | ||||
---|---|---|---|---|
| ||||
Posted in reply to Neia Neutuladh | On Saturday, May 19, 2018 03:32:53 Neia Neutuladh via Digitalmars-d-learn wrote:
> > Of course, the most notable case where using == with null is a terrible idea is dynamic arrays, and that's the case where the compiler _doesn't_ complain. Using == with null and arrays is always unclear about the programmer's intent and almost certainly wasn't what the programmer intended. If the programmer cares about null, they should use is. If they care about length, then that's what they should check. Checking null with == is just a huge code smell.
>
> I feel like the array == null version is more explicit about not allocating memory. However, I'm paranoid about whether that's going to check the pointer instead, so I mostly use array.length == 0 instead.
I'm not sure what memory allocations you're worried about. Neither "" nor [] allocates memory, but regardless, if you're looking to check whether arr.ptr is null, then that's effectively what you get with
arr is null
- though IIRC, it still checks length in that case. It's just that the type system guarantees that a null dynamic array has a length of 0. You'd have to do some pretty screwy @system casting to have a null dynamic array with a length other than 0. But you can always do
arr.ptr is null
Regardless, if you're checking for null, then is does the job, and if what you care about is whether the array is empty, then that's what
arr.length == 0
and
arr.empty
do. arr == null is just risking confusion, because there's no way to know if the programmer meant
arr is null
or
arr.empty
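One way to make the intent explicit is to give each check its own name. The helpers below are hypothetical (`isUnset` and `hasNoElements` are not library functions); `empty` is the array overload from `std.range.primitives`.

```d
import std.range.primitives : empty;

// Hypothetical helper: true only if the array's ptr is null.
bool isUnset(T)(const(T)[] arr)
{
    return arr.ptr is null;
}

// Hypothetical helper: true if there are no elements, null or not.
bool hasNoElements(T)(const(T)[] arr)
{
    return arr.length == 0;
}

void main()
{
    int[] unset;
    int[] data = [1, 2, 3];
    int[] emptied = data[0 .. 0];

    assert(isUnset(unset) && hasNoElements(unset));
    assert(!isUnset(emptied) && hasNoElements(emptied));
    assert(!isUnset(data) && !hasNoElements(data));

    // arr.empty answers the same question as arr.length == 0.
    assert(unset.empty && emptied.empty && !data.empty);
}
```

A reader of the calling code can then tell which of the two questions was being asked, which `arr == null` does not convey.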
Regardless, it's absolutely guaranteed that
arr == null
is going to avoid checking the value of ptr just like
arr == arr2
won't check ptr if length == 0. == only cares that both arrays have the same number of elements and that they're equal with ==. If length is 0, there's no need to check the elements to verify that, and if the lengths don't match, there's no need to check the elements. If you actually used enough screwed up casts to get two dynamic arrays whose ptr values were null, and they had different lengths, you still wouldn't get a crash. The _only_ way to get a segfault from using == with a null dynamic array is if you did enough screwy @system casts to have two dynamic arrays with the same non-zero length, and one of them had a null ptr. It wouldn't even crash if they were both null, because it's going to check the ptrs before comparing the elements. Regardless, in no real program do you have to worry about segfaulting with == and dynamic arrays, and you don't have to worry about == ever allocating. The closest that you'd get to that would be if you compared against a non-null array literal. e.g.
arr == [1, 2, 3]
and if the compiler is smart enough, not even that should allocate (though I don't remember if it's that smart at the moment).
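The split between comparing contents and comparing identity can be shown directly. A minimal sketch, with made-up variable names:

```d
void main()
{
    int[] a = [1, 2, 3];
    int[] b = a.dup;         // separate allocation, same elements
    int[] c = a;             // same ptr and length as a

    assert(a == b);          // == compares lengths, then elements
    assert(a !is b);         // is compares ptr and length: different memory
    assert(a is c);          // same slice of the same memory

    assert(a == [1, 2, 3]);  // comparing against a literal, as in the example above
    assert(a[0 .. 2] != a);  // different lengths, so unequal
}
```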
Ultimately, the question of is vs == comes down to clarity of the programmer's intent.
- Jonathan M Davis
|
May 19, 2018 Re: is == | ||||
---|---|---|---|---|
| ||||
Posted in reply to Jonathan M Davis | On Saturday, 19 May 2018 at 04:30:24 UTC, Jonathan M Davis wrote:
> On Saturday, May 19, 2018 03:32:53 Neia Neutuladh via Digitalmars-d-learn wrote:
>> > Of course, the most notable case where using == with null is a terrible idea is dynamic arrays, and that's the case where the compiler _doesn't_ complain. Using == with null and arrays is always unclear about the programmer's intent and almost certainly wasn't what the programmer intended. If the programmer cares about null, they should use is. If they care about length, then that's what they should check. Checking null with == is just a huge code smell.
>>
>> I feel like the array == null version is more explicit about not allocating memory. However, I'm paranoid about whether that's going to check the pointer instead, so I mostly use array.length == 0 instead.
>
> I'm not sure what memory allocations you're worried about. Neither "" nor [] allocates memory
"" is syntax for compile-time constants and shouldn't ever allocate. [] is a specific case of [values...]; the general case allocates, but this one case does not. null is not even a compile-time constant; it's a value baked into the language and is guaranteed not to allocate.
> but regardless, if you're looking to check whether arr.ptr is null, then that's effectively what you get with
>
> arr is null
I don't think I've ever wanted to distinguish a zero-length slice of an array from a null array.
> Regardless, if you're checking for null, then is does the job, and if what you care about is whether the array is empty, then that's what
>
> arr.length == 0
>
> and
>
> arr.empty
>
> do.
As I already said, I use "array.length == 0". "array.empty" is part of that newfangled range business.
|
May 19, 2018 Re: is == | ||||
---|---|---|---|---|
| ||||
Posted in reply to Jonathan M Davis | On Saturday, 19 May 2018 at 01:31:38 UTC, Jonathan M Davis wrote:
> On Friday, May 18, 2018 23:53:12 IntegratedDimensions via Digitalmars-d-learn wrote:
>> Why does D complain when using == to compare with null? Is there really any technical reason? if one just defines == null to is null then there should be no problem. It seems like a pedantic move by who ever implemented it and I'm hoping there is actually a good technical reason for it.
>
> Because == is pretty much never what you want to do with null. How much it matters depends on the types involved, but if you really want to check for null, is is definitely the right thing to use.
>
> In the case of pointers and references, is checks that they're pointing to the same thing. So,
>
> foo is null
>
> directly checks whether the reference or pointer is null. On the other hand, if you use ==, it's calling some form of opEquals. For pointers, that should generate identical code, but for class references, it means calling the free function opEquals. That function will check whether the references are null before calling opEquals on either of the class objects, but it does add unnecessary overhead (which, as I understand it, the compiler is unfortunately not currently able to optimize away) and provides no benefit over checking with is.
>
> Now, where is vs == _really_ matters (but unfortunately, the compiler does not complain about) is with dynamic arrays. If you do
>
> arr is null
>
> then the compiler will check whether the array's ptr is null. So, something like
>
> "" is null
>
> would be false. However, if you use ==, then it compares the length of the array and then only compares the ptrs if the length is non-zero. So,
>
> "" == null
>
> is true. So, with dynamic arrays, using == with null is a huge code smell. It _may_ be exactly what the programmer intends, but the odds are pretty high that they just don't properly understand the difference between is and ==, and they meant to be checking whether the array was actually null but just ended up checking whether its length was zero (which won't matter for some code but will cause subtle bugs in any code that treats null as special - e.g. if that is used to indicate that the array had not been given a value). Now, because of how == treats null like empty, it _is_ a bit risky to try and treat null as special with arrays, but anyone wanting to be clear in their code should either be checking null with is (in which case, they clearly care about null and not empty), or if they care about length == 0, they should either be calling empty on the array or explicitly checking the array's length, since that's what they care about. Much as having == work with null arrays avoids issues with segfaults due to an array being uninitialized as well as avoids needing to give memory to an array just to have it be empty, you pretty much never actually care whether an array == null. You either care that its ptr is null (in which case, is checks that), or you care about whether its length is 0 (in which case empty or directly checking length checks that). arr == null is just unclear and likely buggy.
>
> So really, there are _zero_ advantages to comparing null with ==. Using == with null risks adding extra overhead, and it often makes the code less clear. On the other hand, using is makes it crystal clear what you mean and then does exactly what you mean - check whether the variable is actually null. So, maybe the compiler is being a bit pedantic by insisting that you use is rather than ==, but you really should be using is and not == when checking for null.
>
> - Jonathan M Davis
I don't see your point.
You claim that one should never use == null for whatever reason and that it is "wrong". So, why are you allowing wrong things in a language that can easily be fixed?
Just reinterpret == null as is null and be done with it! This fixes the wrong and everyone can live happily ever after.
Your logic is the same how people "ban" certain words like faggot. They don't like them for some reason, decide that no one should use it any more, and create a new word that essentially means the same thing... and it results in a loop where that new word then eventually gets "banned".
== vs is might not be quite as extreme, maybe is will be the final "word". But if == null is banned by the compiler why the hell not just reinterpret to mean is null internally and be done with it and allow the syntax since it is so common?
The only pitfall is pasting code from other languages that might have a different interpretation, but those problems always exist since the languages are different.
Your reasoning for arrays is not good enough. First, not all types are arrays, so you are banning a whole class of valid types for one case. That case, you say, is almost never meant anyway (that is, using == null is really meant as is null).
So, ultimately what it feels like is that you are actually arguing for == null to be interpreted as is null but you don't realize it yet.
|
Copyright © 1999-2021 by the D Language Foundation