Jonathan M Davis
Posted in reply to Walter Bright
On Monday, November 25, 2024 1:50:07 AM MST Walter Bright via Digitalmars-d wrote:
> C has an equivalent behavior distinguishing between a null pointer and a 0 length string:
>
> ```
> char *s; // string
> if (s) // pointer
> if (*s) // length
> ```
>
> ```
> char[] a; // array
> if (a) // pointer
> if (a.length) // length
> ```
Given that C arrays are pointers, there's definitely reason to care about whether they're null or not, but D arrays are not pointers. D arrays may have originally come from C arrays, but ultimately, they're fundamentally different from one another.
D arrays are designed in such a way that there is really no reason to care one whit whether they're null or not. They're not pointers, and you don't normally access their ptr member (and when you do, it's @system). When you access the individual elements, you get a RangeError if you attempt to access an element outside of the array, and if you want to know whether an index is within the array, you check its length. If its length isn't 0, then its ptr isn't null, and you don't have any reason to care about null. If its length is 0, then whether its ptr is null is also irrelevant, because you're not going to access non-existent elements.
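To make that concrete, here is a minimal sketch. `firstOr` is a hypothetical helper, not something from Phobos; the point is only that a length check is all it needs, so a null array and a non-null empty array behave identically.

```d
// Hypothetical helper: returns the first element, or `fallback` if there is none.
int firstOr(const int[] arr, int fallback)
{
    // Only the length is consulted; arr.ptr never comes into it.
    return arr.length ? arr[0] : fallback;
}

void main()
{
    int[] full = [1, 2, 3];
    int[] nullArr = null;          // ptr is null, length is 0
    int[] emptyArr = full[0 .. 0]; // ptr is non-null, length is 0

    assert(nullArr is null);
    assert(emptyArr !is null);

    // Both zero-length arrays take the same path through the helper.
    assert(firstOr(nullArr, -1) == -1);
    assert(firstOr(emptyArr, -1) == -1);
    assert(firstOr(full, -1) == 1);
}
```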
The result of this is that there's really no reason to care about whether a D array is null, and code that cares is almost certainly buggy. And what compounds that is that precisely because D code in general does not care about null, it's not hard to end up in a situation where you get a null array when you might have expected an empty non-null array - or in some cases, you might end up with a non-null empty array when you might have expected a null one (though the former is more common from what I've seen). For instance, "".idup will give you null, not a non-null empty string, which makes perfect sense from an efficiency perspective given that almost no D code cares about the difference between null and empty. But it's precisely because almost nothing cares about the difference that it becomes very error-prone to treat null as special even if you want to.
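A small sketch of the `"".idup` case, assuming a reasonably current druntime. The two `writeln` calls are there so that you can see the null-ness for yourself; everything that ordinary code relies on (length, `empty`, `==`) is identical for both strings.

```d
import std.range.primitives : empty;
import std.stdio : writeln;

void main()
{
    string lit = "";        // an empty string literal
    string copy = lit.idup; // duplicating it allocates nothing

    // Everything that normal code checks cannot tell the two apart.
    assert(lit.length == 0 && copy.length == 0);
    assert(lit.empty && copy.empty);
    assert(lit == copy); // == compares contents, not ptr

    // Only `is null` can distinguish them.
    writeln("literal is null: ", lit is null);
    writeln("idup'd copy is null: ", copy is null);
}
```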
For instance, a function could try to return null to indicate that it doesn't have a result and a non-null empty array to indicate that it has a result but that that result is empty (and of course a non-empty array when it has a result that isn't empty). However, while the null return might be clear and explicit and typically be checked immediately on return, it's really easy to get into a situation where you accidentally have a null array when you meant to have a non-null empty array, meaning that such code has a real risk of returning null when it wasn't intended - which is why such functions really should be returning something like a std.typecons.Nullable wrapping an array instead of trying to treat null arrays as special. Treating null arrays as special in D code is just begging for bugs.
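Something along these lines is what I mean. This is only a sketch: `findScores` and its data are made up for illustration, but `Nullable`, `nullable`, `isNull`, and `get` are the real std.typecons API. "No result" lives in the Nullable, so it doesn't matter whether the empty array inside happens to be null.

```d
import std.typecons : Nullable, nullable;

// Hypothetical lookup: the Nullable says whether there is a result,
// and the array only says what the result contains.
Nullable!(int[]) findScores(string name)
{
    if (name == "alice")
        return nullable([90, 85]);
    if (name == "bob")
    {
        int[] none; // null, but that no longer means anything special
        return nullable(none); // has a result, and the result is empty
    }
    return Nullable!(int[]).init; // no result at all
}

void main()
{
    auto a = findScores("alice");
    assert(!a.isNull && a.get == [90, 85]);

    auto b = findScores("bob");
    assert(!b.isNull && b.get.length == 0);

    auto c = findScores("carol");
    assert(c.isNull);
}
```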
As such, I would generally consider it a code smell to see an array in D checked for null instead of empty. It might make sense in some situations when dealing with extern(C) code, but even then, usually you're either passing a length along with it (in which case, a 0 length array shouldn't be dereferenced by C code either), or you're dealing with a string and need to pass a null-terminated string, which typically means allocating a string anyway rather than returning the ptr of a D string that might be null. But in the vast majority of D code, checking an array for null almost certainly means that the code is doing something wrong. Checking pointers for null makes sense, because you don't want to dereference a null pointer, but D arrays are not pointers. They contain pointers and will potentially dereference them if their length isn't 0, but they themselves are not pointers, and their ptr field isn't going to be dereferenced when it's null, because then their length is 0, and any attempt to index them would result in a RangeError.
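For the null-terminated string case, a sketch using `std.string.toStringz` (with `strlen` standing in for whatever C function you'd actually be calling). As far as I'm aware, `toStringz` hands back a valid pointer to a terminator even when the D string is null, so the D side never needs to check for null before the call.

```d
import core.stdc.string : strlen;
import std.string : toStringz;

void main()
{
    string nothing = null;
    string hello = "hello";

    // toStringz gives C code a NUL-terminated string either way.
    const(char)* p1 = nothing.toStringz;
    const(char)* p2 = hello.toStringz;

    assert(p1 !is null);
    assert(strlen(p1) == 0);
    assert(strlen(p2) == 5);
}
```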
And to make matters worse, precisely because there's really no reason to care about null with arrays, it's often the case that when someone checks an array implicitly in an if condition, they think that they're testing for non-empty when they're actually testing for non-null. So, while it's already a code smell to see `if(arr !is null)`, from what I've seen, the odds are extremely high that `if(arr)` is just wrong, because it's not doing what the programmer intended.
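A sketch of how the two checks come apart. `hasElements` is a hypothetical name for what the programmer usually means, and the `!is null` comparison is written out explicitly here to stand in for what the implicit condition tests.

```d
// Hypothetical helper: what the programmer almost always intends.
bool hasElements(const int[] arr)
{
    return arr.length != 0;
}

void main()
{
    int[] full = [1, 2, 3];
    int[] emptyArr = full[0 .. 0]; // empty, but its ptr is not null

    // The null check says "yes" for an array with nothing in it...
    assert(emptyArr !is null);
    // ...while the check that was actually wanted says "no".
    assert(!hasElements(emptyArr));

    // For a non-empty array the two checks agree, which is why the bug hides.
    assert(full !is null && hasElements(full));
}
```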
There's just no good reason to write `if(arr)`, because it's routinely misunderstood - and that's on top of the fact that `if(arr !is null)` is almost certainly the wrong behavior anyway, because outside of very rare cases, D code should not care whether an array is null or merely empty. There is normally no need to maintain that distinction, and even trying to maintain it in one section of code is likely to cause problems at some point - if nothing else, because none of the code interacting with it will make that distinction.
- Jonathan M Davis