is == (page 2) - D Programming Language Discussion Forum

May 19, 2018

Posted by Jonathan M Davis
in reply to IntegratedDimensions

On Saturday, May 19, 2018 17:50:50 IntegratedDimensions via Digitalmars-d-learn wrote:
> So, ultimately what it feels like is that you are actually arguing for == null to be interpreted as is null but you don't realize it yet.

Not really, no. Having

foo == null

be rewritten to

foo is null

in the non-dynamic array cases should be fine except for the fact that it's then a terrible habit to be in when you then have to deal with dynamic arrays. Using

foo == null

with dynamic arrays is an enormous code smell, because the odds are extremely high that the programmer thinks that they're checking if the dynamic array is null when that's not what they're doing at all. IMHO, it should definitely be an error to use == with null and dynamic arrays because it is such a big code smell. Either the code should be using is to check whether the array is null, or it should be checking length. It should never be using == with null.
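
A minimal sketch of the difference being described, assuming a current D compiler. The helper names identityNull and equalityNull are made up for illustration:

```d
import std.stdio : writeln;

// Hypothetical helpers that name the two checks being contrasted.
bool identityNull(int[] a) { return a is null; }  // compares ptr and length
bool equalityNull(int[] a) { return a == null; }  // compares length and elements

void main()
{
    int[] nullArr;                        // ptr is null, length is 0
    int[] backing = [1, 2, 3];
    int[] emptySlice = backing[0 .. 0];   // length is 0, ptr points into backing

    writeln(identityNull(nullArr));     // true
    writeln(identityNull(emptySlice));  // false
    writeln(equalityNull(nullArr));     // true
    writeln(equalityNull(emptySlice));  // true: any empty array == null
}
```

So == null only answers "is this array empty?", never "is this array null?".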

But unfortunately, the compiler is completely backwards about this and treats it as an error with pointers and references but allows it with dynamic arrays. If the compiler were improved to just replace == with is in the cases that it currently treats as illegal, then that would be fine if it then treated it as illegal with dynamic arrays. But as it stands, it is still more efficient to use is with class references, so encouraging the programmer to use is is beneficial, and it encourages the programmer to get in the habit of not using == with null, since it's a terrible habit to be in with dynamic arrays. But actually making it illegal for dynamic arrays would be a much better approach.

If it were up to me, it would just be illegal to use == with null in general, because that's really the way it should be with dynamic arrays, and then the language would be consistent about it. But instead, the compiler screams in the case that matters far less and allows it in the case that is clearly bad. So, it's inconsistent in a dumb way. At least if it were inconsistent by allowing it for pointers and references while disallowing it for arrays, it would be preventing it in the case that truly matters, but instead, what we have is just dumb.

- Jonathan M Davis

May 19, 2018

Posted by Jonathan M Davis
in reply to Neia Neutuladh

On Saturday, May 19, 2018 17:13:36 Neia Neutuladh via Digitalmars-d-learn wrote:
> I don't think I've ever wanted to distinguish a zero-length slice of an array from a null array.

It's safer if you don't, because it's so easy to end up with a dynamic array that is empty instead of null, and stuff like == doesn't care about the difference. But there is code that's written that cares (e.g. IIRC, std.experimental.allocator does in some cases).

if(arr)

is equivalent to

if(cast(bool)arr)

and casting a dynamic array to bool is equivalent to

arr !is null

which means that

if(arr)

means

if(arr !is null)

whereas it's not uncommon for folks to think that it means

if(arr.length != 0)

Similarly,

assert(arr);

is ultimately equivalent to

assert(arr !is null);

which surprises many folks and is rarely what folks want. So, there was a push at one point to make it illegal to use a dynamic array in an if statement or assertion directly, and it did briefly make it into the compiler. However, a few folks (Andrei and Vladimir in particular IIRC) had used arrays in if statements directly quite a bit, knowing full well what it meant. So, their code was right (albeit potentially confusing), and they pushed back. So, the change was reverted, and we're still stuck with the error-prone situation that we've had.
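
A small sketch of the surprise, assuming the behavior described above. The helper truthy is a hypothetical name that just wraps a bare if(arr):

```d
import std.stdio : writeln;

// Hypothetical helper: reports which branch a bare `if (arr)` takes.
bool truthy(int[] arr)
{
    if (arr)
        return true;
    return false;
}

void main()
{
    int[] nullArr;                        // null array
    int[] backing = [1, 2, 3];
    int[] emptySlice = backing[0 .. 0];   // empty, but not null

    writeln(truthy(nullArr));             // false
    writeln(truthy(emptySlice));          // true, although the slice has no elements
    writeln(emptySlice.length != 0);      // false: what many readers expect if(arr) to mean
}
```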

So, most of us would argue that it's risky to treat null dynamic arrays as special and that it should be done with caution, but programmers who know what they're doing definitely do it. Unfortunately, when you read code that's written that way, it's usually hard to tell whether it was written with that understanding or not, and a stray == in the code could easily break it.

> As I already said, I use "array.length == 0". "array.empty" is part of that newfangled range business.

LOL. Well, if you stay away from ranges, you're losing out on a lot of benefits - including large portions of the standard library, but checking for length works fine if you're dealing with code that's just using dynamic arrays and not ranges. The key thing is to avoid arr == null, because that's where the bugs lie.
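
For reference, a short sketch of the two spellings that state the intent directly. It assumes the empty property from Phobos' std.range.primitives; the variable names are illustrative:

```d
import std.range.primitives : empty;

void main()
{
    int[] nullArr;
    int[] backing = [1, 2, 3];
    int[] emptySlice = backing[0 .. 0];

    // Both checks agree, and neither one cares whether the array is null.
    assert(nullArr.length == 0 && emptySlice.length == 0);
    assert(nullArr.empty && emptySlice.empty);
    assert(!backing.empty);
}
```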

- Jonathan M Davis

May 20, 2018

Posted by IntegratedDimensions
in reply to Jonathan M Davis

On Sunday, 20 May 2018 at 00:19:28 UTC, Jonathan M Davis wrote:
> On Saturday, May 19, 2018 17:50:50 IntegratedDimensions via Digitalmars-d-learn wrote:
>> So, ultimately what it feels like is that you are actually arguing for == null to be interpreted as is null but you don't realize it yet.
>
> Not really, no. Having
>
> foo == null
>
> be rewritten to
>
> foo is null
>
> in the non-dynamic array cases should be fine except for the fact that it's then a terrible habit to be in when you then have to deal with dynamic arrays. Using
>
> foo == null
>
> with dynamic arrays is an enormous code smell, because the odds are extremely high that the programmer thinks that they're checking if the dynamic array is null when that's not what they're doing at all. IMHO, it should definitely be an error to use == with null and dynamic arrays because it is such a big code smell. Either the code should be using is to check whether the array is null, or it should be checking length. It should never be using == with null.
>
> But unfortunately, the compiler is completely backwards about this and treats it as an error with pointers and references but allows it with dynamic arrays. If the compiler were improved to just replace == with is in the cases that it currently treats as illegal, then that would be fine if it then treated it as illegal with dynamic arrays. But as it stands, it is still more efficient to use is with class references, so encouraging the programmer to use is is beneficial, and it encourages the programmer to get in the habit of not using == with null, since it's a terrible habit to be in with dynamic arrays. But actually making it illegal for dynamic arrays would be a much better approach.
>
> If it were up to me, it would just be illegal to use == with null in general, because that's really the way it should be with dynamic arrays, and then the language would be consistent about it. But instead, the compiler screams in the case that matters far less and allows it in the case that is clearly bad. So, it's inconsistent in a dumb way. At least if it were inconsistent by allowing it for pointers and references while disallowing it for arrays, it would be preventing it in the case that truly matters, but instead, what we have is just dumb.
>
> - Jonathan M Davis

Let D be a dynamic array, O a pointer or object:

            | Conceptually | In D
D == null   | Invalid      | Valid
D is null   | Valid        | Valid
O == null   | Valid        | Invalid
O is null   | Valid        | Valid

Right?

So what you are saying is you want to create 2 more invalids in the table to satisfy some weird logic which requires the programmer to remember special cases rather than make them all valid and easy to remember even though it can be done and make sense.

In fact, the 2nd invalid makes sense and should be allowed so really you want to create 3 invalids for the price of one.

Simply require == null as is null and be done with it.  You can't police programmers minds and get them to program correctly. If you had a kid, do you box them up in a bubble room and not let them play because they might hurt themselves? How people learn is by making mistakes. It is better to provide a logical foundation that is consistent rather than produce corner cases to handle some corner case that was created to handle another corner case because someone handled a corner case.

May 19, 2018

Posted by Jonathan M Davis
in reply to IntegratedDimensions

On Sunday, May 20, 2018 01:51:50 IntegratedDimensions via Digitalmars-d-learn wrote:
> Simply require == null as is null and be done with it.

That would be flat out wrong for dynamic arrays, because then

auto result = arr == null;

and

int[] nullArr;
auto result = arr == nullArr;

would have different semantics. The way that dynamic arrays are designed to work even if they're null mucks with this considerably here.
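
A sketch of the inconsistency being pointed out, assuming the proposed rewrite of == null to is null were in effect. The variable names are illustrative:

```d
void main()
{
    int[] backing = [1, 2, 3];
    int[] arr = backing[0 .. 0];   // empty, but not null
    int[] nullArr;                 // null

    // Today both comparisons use array equality, so they agree.
    assert(arr == null);
    assert(arr == nullArr);

    // Under the proposed rewrite, the literal comparison would behave like this
    // and yield false, while comparing against the null variable would stay true.
    assert(!(arr is null));
}
```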

> You can't police programmers minds and get them to program correctly.

That's true, but making things that are highly likely to be wrong illegal prevents bugs. e.g.

while(cond);

is illegal in D precisely because it's error-prone. There are cases where doing something like that would be perfectly correct. e.g.

while(++a != b);

but you can do the exact same thing with empty braces

while(++a != b) {}

and all of those bugs with accidentally closing a loop with a semicolon go away, and you don't lose any expressiveness. The compiler just forces you to write it in a way that's far less error-prone.
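
As a runnable sketch of the accepted form (the function name spinUntil is hypothetical):

```d
// Hypothetical helper: increments a until it reaches b and returns the result.
int spinUntil(int a, int b)
{
    // while (++a != b);   // the form described above as rejected; left as a comment
    while (++a != b) {}    // same loop, with an explicit empty block as the body
    return a;
}

void main()
{
    assert(spinUntil(0, 5) == 5);
}
```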

Making it illegal to compare the null literal with == also prevents bugs, and you don't lose any expressiveness doing it either. It's the same kind of logic. Making error-prone constructs illegal when there's a simple equivalent that isn't error-prone is good language design, because it prevents bugs without actually restricting the programmer. It's when the language starts disallowing things that aren't error-prone and/or don't have simple equivalents that you start running into problems with the compiler getting in your way and treating you like a kid. For simple stuff like this, it ultimately saves you time and effort without getting in your way. At most, you occasionally have to replace foo == null with foo is null or foo.length != 0, and it potentially saves you hours of effort tracking down a subtle bug.

- Jonathan M Davis

May 20, 2018

Posted by IntegratedDimensions
in reply to Jonathan M Davis

On Sunday, 20 May 2018 at 02:09:47 UTC, Jonathan M Davis wrote:
> On Sunday, May 20, 2018 01:51:50 IntegratedDimensions via Digitalmars-d-learn wrote:
>> Simply require == null as is null and be done with it.
>
> That would be flat out wrong for dynamic arrays, because then
>
> auto result = arr == null
>
> and
>
> int[] nullArr;
> auto result = arr == nullArr;
>
> would have different semantics. The way that dynamic arrays are designed to work even if they're null mucks with this considerably here.
>

Do you not see they are different?

You think

arr == nullArr

and

arr == null

are supposed to necessarily be the same semantics? That is patently false! You should rethink your position on that because it is wrong. null is a keyword in D and has a very special meaning and hence that meaning MUST be taken into account.

There is no harm in making them different. Your logic thinks that they should be the same but if you are wrong then your whole argument is wrong.

for example,

Object o = null;

then

o == null

should not be true even though "null == null" in some sense.

== null is a test of validity. One never checks if null == null and it is a meaningless case so allowing it as a possibility is meaningless.

You are treating null as if it is on the same level as objects and arrays and it is not. By doing so you lose the power of it being singled out as a keyword.

>> You can't police programmers minds and get them to program correctly.
>
> That's true, but making things that are highly likely to be wrong illegal prevents bugs. e.g.

Not necessarily, because you just create more bugs by doing that. Your son, the bubble boy, then does not develop an immune system that he should have developed, because you tried to protect him from hurting himself.

You should get out of the business of trying to prevent things that you don't even know are going to happen. It is a bad mindset to be in because, for all you know, those things will never happen. Time is better spent than trying to police everyone from doing anything wrong. 1. You can't do it. 2. You make things worse in the long run because who's policing you to keep you from screwing up?

Do you know how many "bugs" are produced by people who are fixing "bugs"? We can surely bet more than zero.

> while(cond);
>
> is illegal in D precisely because it's error-prone. There are cases where doing something like that would be perfectly correct. e.g.
>
> while(++a != b);
>
> but you can do the exact same thing with empty braces
>
> while(++a != b) {}
>
> and all of those bugs with accidentally closing a loop with a semicolon go away, and you don't lose any expressiveness. The compiler just forces you to write it in a way that's far less error-prone.

This is a different problem and therefore not applicable.

> Making it illegal to compare the null literal with == also prevents bugs, and you don't lose any expressiveness doing it either. It's the same kind of logic. Making error-prone constructs illegal when there's a simple equivalent that isn't error-prone is good language design, because it prevents bugs without actually restricting the programmer. It's when the language starts disallowing things that aren't error-prone and/or don't have simple equivalents that you start running into problems with the compiler getting in your way and treating you like a kid. For simple stuff like this, it ultimately saves you time and effort without getting in your way. At most, you occasionally have to replace foo == null with foo is null or foo.length != 0, and it potentially saves you hours of effort tracking down a subtle bug.
>
> - Jonathan M Davis

You certainly do lose expressiveness. You lose elegance because you cannot express logically related things in a logically related way.

The problem is that the WHOLE reason it is error prone comes from whoever decided that the dynamic array syntax of == null would not compare it the same way it does everything else.

Basically someone thought they were going to be fancy and treat == null as the same as an allocated 0 length array. That was the problem from the get go.

== null should have a very specific and consistent meaning and someone decided to change that in an irregular and inconsistent meaning and now we have less elegance in the language than we could.

The reason why you are saying it is buggy is PRECISELY because of what was done wrong. Programmers assume that == null means the same thing it does everywhere else, but LO AND BEHOLD! Not in that one special case and if they don't know about that special case they hit the "bug".

See, what you call bugs is really the programmers failing to know the special case that was created. The special case that really had no reason to be a special case. So, in fact, whoever decided on the rules here created more problems than they solved.

Any time you create special cases you create complexity and that is what creates bugs of the type here. These bugs are entirely preventable with properly thought out consistency, and these bugs are not the programmer's fault but bugs in the design of the language, which are far worse. These are not the same types of bugs as a syntax error or a logic bug.

May 20, 2018

Posted by IntegratedDimensions
in reply to IntegratedDimensions

Furthermore:

https://issues.dlang.org/show_bug.cgi?id=3889

Shows real problems. You argue from the side that the bug already exists so we must work around it because we can't go back and "fix things". Who says? D has had breaking changes in the past so it is not a deal breaker. It is also a relatively easy transition because == null is very easy to find and fix.

The mentality that one must always live with introduced logic bugs because fixing them would break old code is insane. The whole point of fixing bugs is to make things work *correctly*.

The fact is someone decided it was a good idea to conflate null with some dynamic array BS and that is where all the problems come from. It should have never been done and this issue will persist until someone gets the balls to fix it.

After all, how do you know it won't actually make a lot of "buggy" code better?

May 21, 2018

Posted by Steven Schveighoffer
in reply to Jonathan M Davis

On 5/18/18 9:48 PM, Jonathan M Davis wrote:
> On Saturday, May 19, 2018 01:27:59 Neia Neutuladh via Digitalmars-d-learn
> wrote:
>> On Friday, 18 May 2018 at 23:53:12 UTC, IntegratedDimensions
>>
>> wrote:
>>> Why does D complain when using == to compare with null? Is
>>> there really any technical reason? if one just defines == null
>>> to is null then there should be no problem. It seems like a
>>> pedantic move by who ever implemented it and I'm hoping there
>>> is actually a good technical reason for it.
>>
>> tldr: this error is outdated.
>>
>> In the days of yore, "obj == null" would call
>> "obj.opEquals(null)". Attempting to call a virtual method on a
>> null object is a quick path to a segmentation fault. So "obj ==
>> null" would either yield false or crash your program.

I remember this, and I remember arguing for the current behavior many times (after having many many crashes) :)

https://forum.dlang.org/post/fqlgah$15v2$1@digitalmars.com

Read that thread if you want to see the rationale.

However, I'd argue it's still good to keep the error, as there is literally no point in using == null on class references instead of is null. You would now get the same result, but is null is faster and cleaner.
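
For readers who have not seen it, here is a simplified sketch of how a free-function comparison avoids calling a virtual method on a null reference. It is modeled on the free function opEquals in druntime but is not a copy of it; the names sketchEquals and Point are hypothetical:

```d
// Hypothetical name; a simplified sketch modeled on druntime's free opEquals.
bool sketchEquals(Object lhs, Object rhs)
{
    if (lhs is rhs) return true;                     // same reference, or both null
    if (lhs is null || rhs is null) return false;    // exactly one side is null
    if (typeid(lhs) is typeid(rhs)) return lhs.opEquals(rhs);
    return lhs.opEquals(rhs) && rhs.opEquals(lhs);   // different types: ask both sides
}

// Hypothetical class used only to exercise the sketch.
class Point
{
    int x;
    this(int x) { this.x = x; }
    override bool opEquals(Object o)
    {
        auto p = cast(Point) o;
        return p !is null && p.x == x;
    }
}

void main()
{
    Point none = null;
    auto a = new Point(1);
    auto b = new Point(1);

    assert(sketchEquals(a, b));        // the virtual opEquals is reached
    assert(!sketchEquals(a, none));    // no virtual call on or with null
    assert(sketchEquals(none, none));  // two null references compare equal
    assert(a == b);                    // the built-in == agrees
}
```

The null checks happen before any virtual call, which is why the old segfault no longer applies, and why is null remains the cheaper way to ask the same question.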

> Actually, that runtime function has existed since before TDPL came out in
> 2010. It even shows the implementation of the free function opEquals (which
> at the time was in object_.d rather than object.d). I'm not even sure that
> the error message was added before the free function version of opEquals
> was. Maybe when that error message was first introduced, it avoided a
> segfault, but if so, it has been a _long_ time since that was the case.

Some things in TDPL were forward-thinking. I remember Andrei fleshing out some of how the language SHOULD behave in the forums or mailing lists for the purposes of writing TDPL even though it didn't yet behave that way. In fact, I'm almost positive the new object comparison function came as a result of TDPL (but I'm not 100% sure). Some of TDPL still has never been implemented.

Long story short, don't date the existence of features in TDPL based on the publication :)

In this case, for fun (what is wrong with me), I looked up the exact date it got added, and it was Feb 2010: https://github.com/dlang/druntime/commit/2dac6aa262309e75ad9b524cb4d1c3c1f0ecc2ae. TDPL came out in June 2010, so this feature does predate TDPL by a bit.

In fact, through this exercise, I just noticed that the reason it returns auto instead of bool is to make sure it gets into the now defunct "generated" object.di file (https://github.com/dlang/druntime/pull/2190).

>> It *is* faster to call "foo is null" than "foo == null", but I
>> don't think that's particularly worth a compiler error. The
>> compiler could just convert it to "is null" automatically in that
>> case.

It's not worth a compiler error if we didn't already have it, but I don't know that it's worth taking out. It's really what you should be doing, it's just that the penalty for not doing it isn't as severe as it used to be.

>> One casualty of the current state of affairs is that no object
>> may compare equal to null.

And let's keep it that way!

> Of
> course, the most notable case where using == with null is a terrible idea is
> dynamic arrays, and that's the case where the compiler _doesn't_ complain.

I use arr == null all the time. I'm perfectly fine with that, and understand what it means.

> Using == with null and arrays is always unclear about the programmer's
> intent and almost certainly wasn't what the programmer intended.

I beg to differ.

> If the
> programmer cares about null, they should use is. If they care about length,
> then that's what they should check. Checking null with == is just a huge
> code smell.

IMO, doing anything based on the pointer of an array being null is a huge code smell. In which case, == null is perfectly acceptable. I'm comparing my array to an empty array. What is confusing about that?

I actually hate using the pointer in any aspect -- an array is semantically equivalent to its elements, it's not important where it's allocated. The only place D forces me to care about the pointer is when I'm dealing with ranges.

> So, perhaps the compiler is being pedantic, but it's still telling you the
> right thing. It's just insisting about it in the case where it matters less
> while not complaining about it in the case where it really matters, which is
> dumb. So IMHO, if anything, adding an error message for the array case would
> make more sense than getting rid of the error with pointers and references.

I hope this never happens.

-Steve

May 21, 2018

Posted by Jonathan M Davis
in reply to Steven Schveighoffer

On Monday, May 21, 2018 10:01:15 Steven Schveighoffer via Digitalmars-d-learn wrote:
> On 5/18/18 9:48 PM, Jonathan M Davis wrote:
> > Of
> > course, the most notable case where using == with null is a terrible
> > idea is dynamic arrays, and that's the case where the compiler
> > _doesn't_ complain.
> I use arr == null all the time. I'm perfectly fine with that, and understand what it means.
>
> > Using == with null and arrays is always unclear about the programmer's intent and almost certainly wasn't what the programmer intended.
>
> I beg to differ.
>
> > If the
> > programmer cares about null, they should use is. If they care about
> > length, then that's what they should check. Checking null with == is
> > just a huge code smell.
>
> IMO, doing anything based on the pointer of an array being null is a huge code smell. In which case, == null is perfectly acceptable. I'm comparing my array to an empty array. What is confusing about that?
>
> I actually hate using the pointer in any aspect -- an array is semantically equivalent to its elements, it's not important where it's allocated. The only place D forces me to care about the pointer is when I'm dealing with ranges.

The core problem here is that no one reading a piece of code has any way of knowing whether the programmer knew what they were doing or not when using == null with an array, and the vast majority of newbies are not going to have understood the semantics properly. If I know that someone like you or Andrei wrote the code, then the odds are good that what the code does is exactly what you intended. But for the average D programmer? I don't think that it makes any sense to assume that, especially since anyone coming from another language is going to assume that == null is checking for null, when it's not.

It's the same reason that

if(arr)

was temporarily out of the language. The odds are very high that the programmer using it is using it wrong. Andrei and Vladimir were using it correctly in their code, so they didn't like the fact that it had then become illegal, but while they knew what they were doing and were using it correctly, plenty of other folks have been inserting bugs whenever they do that, and if I see if(arr) or assert(arr) in code, I'm going to consider it to be code smell just as much as I consider arr == null to be code smell.

And yes, trying to treat the ptr as being null as special with a dynamic array is risky, and most code shouldn't be doing it, but you're almost forced to in some cases when interacting with C code, and clearly there are folks that do (e.g. Andrei and Vladimir). But even if we could unequivocally say that no one should be doing it, you still have no way of knowing whether someone is attempting it or not when they do arr == null, and since caring whether an array is null or not is a very typical thing to do in other languages where there is a very clear distinction between a null array and an empty one, plenty of folks come to D expecting to be able to do the same. And they're going to write arr == null or arr != null, and any time I see code like that, I'm going to have to sit down and figure out whether they really meant arr is null, or whether they meant arr.length == 0, whereas if they had just written arr is null or arr.length == 0 (or arr.empty), their intent would have been perfectly clear. As such, I would strongly advise D programmers to use arr.empty or arr.length == 0 instead of arr == null, even if they know what they're doing, just like I would advise them to not treat null as special for arrays unless they really need to.

At this point, I'm honestly inclined to think that we never should have allowed null for arrays. We should have taken the abstraction a bit further and disallowed using null to represent dynamic arrays. It would then presumably still work to do arr.ptr is null, but arr is null wouldn't work, because null wouldn't be an array, and arr == null definitely wouldn't work. Then we could just use [] for empty arrays everywhere, and there would be no confusion, leaving null for actual pointers. And it would almost certainly kill off all of the cases where null was treated as special for dynamic arrays except maybe for when dealing with C code, but in that case, they'd have to use ptr directly. However, at this point, I expect that that's all water under the bridge, and we're stuck with it.
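
A rough sketch of what code would look like under that convention, using only constructs that already compile today. The variable names are illustrative:

```d
void main()
{
    int[] nullArr;
    int[] backing = [1, 2, 3];
    int[] emptySlice = backing[0 .. 0];

    // Emptiness is written against [] rather than null.
    assert(nullArr == []);
    assert(emptySlice == []);

    // Code that truly cares about the pointer (C interop, say) asks for it by name.
    assert(nullArr.ptr is null);
    assert(emptySlice.ptr !is null);
}
```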

- Jonathan M Davis

May 21, 2018

Posted by Steven Schveighoffer
in reply to Jonathan M Davis

On 5/21/18 2:05 PM, Jonathan M Davis wrote:
> The core problem here is that no one reading a piece of code has any way of
> knowing whether the programmer knew what they were doing or not when using
> == null with an array, and the vast majority of newbies are not going to
> have understood the semantics properly. If I know that someone like you or
> Andrei wrote the code, then the odds are good that what the code does is
> exactly what you intended. But for the average D programmer? I don't think
> that it makes any sense to assume that, especially since anyone coming from
> another language is going to assume that == null is checking for null, when
> it's not.

For me, the code smell is using arr is null (is it really necessary to check for a null pointer here?), for which I always have to look at more context to see if it's *really* right.

Even people who write == null may want to check for null thinking that it's how you check an array is empty, not realizing that it *doesn't* check for a null pointer, *AND* it still does exactly what they need it to do ;)

> 
> It's the same reason that
> 
> if(arr)
> 
> was temporarily out of the language.

It's similar, but I consider it a different reason. While the intent of == null may not be crystal clear, 99% of people don't care about the pointer, they just care whether it's empty. So the default case is usually good enough, even if you don't know the true details.

Whereas, if(arr) is checking that the pointer is null as well as the length is 0. Most people aren't expecting that, and those who are are like Andrei and Vladimir -- they know the quirks of the language here. For the longest time I thought it was just checking the pointer!

I think aside from the clout of the ones who wanted it, the biggest reason that change was reverted was that it became really difficult to use an array inside a conditional. One-liners had to be extracted out, temporary variables defined.

> At this point, I'm honestly inclined to think that we never should have
> allowed null for arrays. We should have taken the abstraction a bit further
> and disallowed using null to represent dynamic arrays. It would then
> presumably still work to do arr.ptr is null, but arr is null wouldn't work,
> because null wouldn't be an array, and arr == null definitely wouldn't work.
> Then we could just use [] for empty arrays everywhere, and there would be no
> confusion, leaving null for actual pointers. And it would almost certainly
> kill off all of the cases where null was treated as special for dynamic
> arrays except maybe for when dealing with C code, but in that case, they'd
> have to use ptr directly. However, at this point, I expect that that's all
> water under the bridge, and we're stuck with it.

If we never had null be the default value for an array, and used [] instead, I would be actually OK with that. I also feel one of the confusing things for people coming to the language is that arrays are NOT exactly reference types, even though null can be used as a value for assignment or comparison.
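
To illustrate the "not exactly reference types" point, a minimal sketch with two hypothetical helpers, shrink and setFirst. A slice is passed as a (pointer, length) pair by value, so the elements are shared but the length is not:

```d
// Hypothetical helper: reslices its parameter, which is a local copy of (ptr, length).
void shrink(int[] s)
{
    s = s[0 .. 1];
}

// Hypothetical helper: writes through the pointer shared with the caller.
void setFirst(int[] s, int value)
{
    s[0] = value;
}

void main()
{
    int[] a = [1, 2, 3];

    shrink(a);
    assert(a.length == 3);   // the caller's length is untouched

    setFirst(a, 42);
    assert(a[0] == 42);      // the elements are shared

    a = null;                // yet null is still assignable, like a reference
    assert(a.length == 0);
}
```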

But it still wouldn't change what most people write or mean, they just would write == [] instead of == null. I don't see how this would solve any of your concerns.

-Steve

May 21, 2018

Posted by Jonathan M Davis
in reply to Steven Schveighoffer

On Monday, May 21, 2018 14:40:24 Steven Schveighoffer via Digitalmars-d-learn wrote:
> On 5/21/18 2:05 PM, Jonathan M Davis wrote:
> > The core problem here is that no one reading a piece of code has any way of knowing whether the programmer knew what they were doing or not when using == null with an array, and the vast majority of newbies are not going to have understood the semantics properly. If I know that someone like you or Andrei wrote the code, then the odds are good that what the code does is exactly what you intended. But for the average D programmer? I don't think that it makes any sense to assume that, especially since anyone coming from another language is going to assume that == null is checking for null, when it's not.
>
> For me, the code smell is using arr is null (is it really necessary to check for a null pointer here?), for which I always have to look at more context to see if it's *really* right.

Really? I would never expect anyone to use is unless they really cared about whether array was null. I'd be concerned about whether the code in general was right, because treating null as special gets tricky, but that particular line wouldn't concern me.

> Even people who write == null may want to check for null thinking that it's how you check an array is empty, not realizing that it *doesn't* check for a null pointer, *AND* it still does exactly what they need it to do ;)

You honestly expect someone first coming to D to check whether an array is empty by checking null? That's a bizarre quirk of D that I have never seen anywhere else. I would never expect anyone to purposefully use == null to check for empty unless they were very familiar with D, and even then, I'd normally expect them to ask what they really mean, which is whether the array is empty.

> > It's the same reason that
> >
> > if(arr)
> >
> > was temporarily out of the language.
>
> It's similar, but I consider it a different reason. While the intent of == null may not be crystal clear, 99% of people don't care about the pointer, they just care whether it's empty. So the default case is usually good enough, even if you don't know the true details.

I think that that's the key point of disagreement here. I would never consider the intent of == null to be crystal clear based solely on the code, because it is so common outside of D to use == null to actually check for null, and there are better ways in D to check for empty if that's what you really mean. My immediate expectation on seeing arr == null is that the programmer does not properly understand arrays in D. If I knew that someone like you wrote the code, I'd probably decide that you knew what you were doing and didn't make a mistake, but I'm not going to assume that in general, and honestly, I would consider it bad coding practice (though we obviously disagree on that point).

I would consider the if(arr) and arr == null cases to be exactly the same. They both are red flags that the person in question does not understand how arrays in D work. Yes, someone who knows what they're doing may get it right, but I'd consider both to be code smells and I wouldn't purposefully do either in my own code. If I found either in my own code, I would expect that I'd just found a careless bug.

> > At this point, I'm honestly inclined to think that we never should have allowed null for arrays. We should have taken the abstraction a bit further and disallowed using null to represent dynamic arrays. It would then presumably still work to do arr.ptr is null, but arr is null wouldn't work, because null wouldn't be an array, and arr == null definitely wouldn't work. Then we could just use [] for empty arrays everywhere, and there would be no confusion, leaving null for actual pointers. And it would almost certainly kill off all of the cases where null was treated as special for dynamic arrays except maybe for when dealing with C code, but in that case, they'd have to use ptr directly. However, at this point, I expect that that's all water under the bridge, and we're stuck with it.
>
> If we never had null be the default value for an array, and used [] instead, I would be actually OK with that. I also feel one of the confusing things for people coming to the language is that arrays are NOT exactly reference types, even though null can be used as a value for assignment or comparison.
>
> But it still wouldn't change what most people write or mean, they just would write == [] instead of == null. I don't see how this would solve any of your concerns.

It would solve the concern, because no one is going to write arr == [] to check for null. They'd write it just like they'd write arr == "". They're clearly checking for empty, not null. The whole problem here is that pretty much everywhere other than D arrays, null and empty are two separate things, and pretty much anyone coming from another language will expect them to be different. It wouldn't surprise me at all to see a newbie D programmer doing something like

if(arr != null && arr == arr2)
{...}

I would fully expect someone coming from another language to use arr == null with the idea that it's actually checking for null, and given how confusing dynamic arrays are for many people, it wouldn't surprise me if someone who has programmed in D for a while still didn't properly understand the situation. At some point, they learn, but it's clearly one of those topics that confuses pretty much everyone at first.
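As a rough sketch of why that guard misleads, the following wraps the condition in a hypothetical function (newbieGuard is an illustrative name, not anything from a library) and feeds it two non-null but empty slices:

```d
// Hypothetical wrapper around the condition shown above.
bool newbieGuard(const int[] arr, const int[] arr2)
{
    return arr != null && arr == arr2;
}

void main()
{
    int[] data = [1, 2, 3];
    int[] e1 = data[0 .. 0]; // empty, ptr non-null
    int[] e2 = data[3 .. 3]; // empty, ptr non-null, different address

    assert(e1 !is null && e2 !is null);
    assert(e1 == e2); // two empty arrays compare equal

    // The "null check" rejects them anyway, because arr != null
    // really means arr.length != 0.
    assert(!newbieGuard(e1, e2));

    // Non-empty, equal arrays pass as expected.
    assert(newbieGuard(data, [1, 2, 3]));
}
```

The first half of the condition is an emptiness test, not a null test, so the code silently does something other than what its author most likely intended.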

And out of those who do understand how D dynamic arrays work, a number of them continue to distinguish between null and empty arrays in their code - e.g. folks like Andrei and Vladimir who write code that uses

if(arr)

and mean it the way the language means it. The core problem is that D treats null arrays as empty. If it either treated them as actually null (with all of the segfaults that go with that) or did not treat null as a dynamic array at all, then that whole problem would go away. So, if null were not a dynamic array in any shape or form, and you had to use [] to indicate an empty array, then that would solve my main concerns with null and dynamic arrays.
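For comparison, here is a small sketch of what if(arr) distinguishes, assuming the behaviour as I understand it (the condition is true for a non-null array even when it is empty). passesIf is a hypothetical helper used only to capture the result of the condition.

```d
// Hypothetical helper: reports whether if(a) takes the true branch.
bool passesIf(const int[] a)
{
    if (a)
        return true;
    return false;
}

void main()
{
    int[] nullArr = null;
    int[] data = [1, 2, 3];
    int[] emptySlice = data[0 .. 0]; // empty, ptr non-null

    assert(!passesIf(nullArr));
    assert(passesIf(data));

    // Empty but non-null: if(arr) is true even though there are no elements.
    assert(passesIf(emptySlice));
    assert(emptySlice.length == 0);

    // Going through ptr makes the null check explicit.
    assert(nullArr.ptr is null);
    assert(emptySlice.ptr !is null);
}
```

Code that uses if(arr) on purpose is therefore distinguishing null from empty, which is the opposite of what arr == null does.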

Now, that then leaves the issue of folks accessing ptr and treating null as special there, but if you had to actually access ptr to do that, I suspect that the practice of treating null arrays as special would go away. But even if it didn't, the cases where someone was trying to do that would then be clear, because they'd have to access ptr directly, so it would almost certainly be something that only folks who knew what they were doing would do much with, whereas arr == null is something that pretty much any D newbie is going to try and screw up.

- Jonathan M Davis
