Empty VS null array?

October 25, 2013

Re: Empty VS null array?

Posted by Kagamin
in reply to Regan Heath

On Monday, 21 October 2013 at 10:33:01 UTC, Regan Heath wrote:
> null strings are no different to null class references, they're not a special case.

True. That's an implementation detail which has no meaning for business logic. When implementation deviates from business logic, one ends up fixing the implementation details everywhere in order to implement business logic. That's why string.IsNullOrEmpty is used.

> People seem to have this odd idea that null is somehow an invalid state for a string /reference/ (c# strings are reference types), it's not.

That's the very problem: null and empty are valid states and must be treated equally as "no data", but they can't for purely technical reasons.

> People also seem to elevate empty strings to some sort of special status, that's like saying 0 has some special status for int - it doesn't it's just one of a number of possible values.
>
> In fact, int having no null like state is a "problem" causing solutions like boxing to elevate the value type to a reference in order to allow a null state for int.

You want to check ints for null everywhere too?

> Yet, in D we've decided to inconsistently remove that functionality from string for no gain.  If string could not actually be null then we'd gain something from the limitation, instead we lose functionality and gain nothing - you still have to check your strings for null in D.

Huh? Null slices work just like empty ones - that's why this topic was started in the first place. One doesn't have to check slices for nulls, only for length.
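To illustrate the point about checking only the length, here is a minimal sketch in D. The helper name `hasData` is hypothetical and not part of any library; it only shows that one length test covers both a null and an empty slice.

```d
// Hypothetical helper: treats null and empty strings alike as "no data".
bool hasData(const(char)[] s)
{
    // A null slice has length 0, so one length check covers both cases.
    return s.length != 0;
}

unittest
{
    string nullStr = null;
    string emptyStr = "";

    assert(!hasData(nullStr));
    assert(!hasData(emptyStr));
    assert(hasData("abc"));

    // Equality compares contents, so null and empty compare equal.
    assert(nullStr == emptyStr);

    // Iterating a null slice is safe: the loop body simply never runs.
    size_t count = 0;
    foreach (c; nullStr)
        ++count;
    assert(count == 0);
}
```

The unittest block runs with `dmd -unittest -main`.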

If you want clear nullable semantics, you have Nullable, it works for everything, including strings and ints. You would want this feature only in rare cases, so it doesn't make sense to make it default, or it will be a nuisance.
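A short sketch of what opting in to nullable semantics might look like with `std.typecons.Nullable`. The `describe` helper and the variable names are made up for illustration; only `Nullable`, `isNull`, `get` and `nullify` come from Phobos.

```d
import std.typecons : Nullable;

// Hypothetical helper: reports which of the three states a field is in.
string describe(Nullable!string field)
{
    if (field.isNull)
        return "not specified";
    if (field.get.length == 0)
        return "empty";
    return "value";
}

unittest
{
    Nullable!string middleName;          // starts out in the null state
    assert(describe(middleName) == "not specified");

    middleName = "";                     // specified, but empty
    assert(describe(middleName) == "empty");

    middleName = "Ann";
    assert(describe(middleName) == "value");

    middleName.nullify();                // back to "not specified"
    assert(middleName.isNull);

    Nullable!int age;                    // works for value types too
    assert(age.isNull);
    age = 42;
    assert(age.get == 42);
}
```

Here the three-state distinction is requested explicitly by the type, so plain `string` and `int` fields elsewhere need no null handling.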

>> both of them are just "no data", so you end up typing if(string.IsNullOrEmpty(mystr)) every time everywhere.
>
> I only have to code like this when I use 3rd party code which has conflated empty and null.  In my code when it's null it means not specified, and empty is just one type of value - for which I do no special handling.

Equivalence between null and empty is a business logic's requirement, that's why it's done.

>> And, yeah, only one small feature in this big mess ever needs to differentiate between null and empty.
>
> Untrue, null allows many alternate and IMO more direct/obvious designs.

The need for those designs is rare and trivially implementable for all value types.

>> I found this one case trivially implementable, but nulls still plague all remaining code.
>
> Which one case?  The readline() one below?

No, it was an authentication system in third-party code for one special case. I also had to specify this null value explicitly in app.config (guess how), rather than letting a missing parameter fall back to a default.

Another possibility for readline is to return a tuple
{bool eof, string line(non-null)} - this way you have easy check for eof and don't have to check for null when you don't need it.
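The tuple idea could be sketched in D roughly as follows. `LineResult` and `LineSource` are hypothetical names, and the reader works on an in-memory array only to keep the example self-contained; it is not how any real stream API is defined.

```d
import std.typecons : Tuple;

// Result of one read: `eof` is true when no line was available;
// `line` is then just an empty string that callers can ignore.
alias LineResult = Tuple!(bool, "eof", string, "line");

// Hypothetical in-memory reader, used only for illustration.
struct LineSource
{
    string[] lines;

    LineResult readLine()
    {
        if (lines.length == 0)
            return LineResult(true, "");
        auto line = lines[0];
        lines = lines[1 .. $];
        return LineResult(false, line);
    }
}

unittest
{
    auto src = LineSource(["header", ""]);

    auto first = src.readLine();
    assert(!first.eof && first.line == "header");

    // A blank line is data, not end of input.
    auto second = src.readLine();
    assert(!second.eof && second.line.length == 0);

    auto third = src.readLine();
    assert(third.eof);
}
```

A caller that needs to detect end of input checks `eof`; a caller parsing a fixed multiline record can use `line` directly without a null check.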

> I use this all the time:
> http://msdn.microsoft.com/en-us/library/system.io.streamreader.readline.aspx
>
> It has never caused me any issues.  It explicitly states that null is a possible output, and so I check for it - doing anything less is simply bad programming.
>
>> It works if you read one line per loop cycle, but if you read several lines and assume they're not null (some multiline data format),
>
> There is your problem, never "assume" - the documentation is very clear on the issue.
>
>> you're screwed or your code becomes littered with null checks, but who accounts for all alternative scenarios from the start?
>
> Me, and IMO any competent programmer.  It is misguided to think you can ignore valid states, null is a valid state in C, C++, C#, and D.. You should be thinking about and handling it.

Here null is a valid state for readline, not for the caller: if the caller parses a multiline data format, unexpected end of file is an invalid state.

And what do you gain by littering your code with those null checks? Just making the runtime happy and adding noise to the code? You could use that time to improve the code or add features or even relax. Nullable strings gain you nothing but wasted time.

> You don't have to check for it on every access to the variable, but you do need to check for it once where the variable is assigned, or passed (in private functions you can skip this).  From that point onward you can assume non-null, valid, job done.

You just said "never assume". The assumption may fail, because the string type is still nullable, compiler doesn't save you here, this sucks. And in order to check for everything everywhere on a level near that of the compiler, you must be not just competent, but perfect.

>> I believe there's no problem domain, which would like to differentiate between null and empty string instead of treating them as "no data".
>
> null means not specified, non existent, was not there.
> empty means, present but set to empty/blank.
>
> Databases have this distinction for a reason.

Oracle makes no distinction between null and empty string. For a reason?
A database is an implementation detail of a data storage, it doesn't implement business logic, it only provides features, which can be used with more or less success to implement business logic. Ever heard of advantages of OO databases over relational ones? That's an illustration of technical details, which don't precisely map to business logic.

> If you get input from a user a field called "foo" may be:
>  - not specified
>  - specified
>
> and if specified, may be:
>  - empty
>  - not empty

If the user doesn't fill a text box, it's both empty and not specified - there's just no difference. And it doesn't matter how you store it in the database - as null or as empty string - both are presented in the same way. Heck, we use these optional text boxes everywhere - can you tell if their content is empty or not specified?

And what if the value is required? Would you accept an empty value? And if your database treats an empty string as not null, would you allow registering a user with an empty login name? And how would you express this constraint in the database? In SQL "not null" means "required value", but it's not equivalent to the business logic's notion of a required value. I wouldn't be surprised if Oracle did that in order to reject empty strings in not null fields.

Let's consider a process of specifying user's data. What text fields do we have?
1. Login. No difference between null and empty - both invalid - "no data", must enter something.
2. First name. No difference between null and empty - both are "no data" and are presented as empty text box.
3. Middle name. ditto.
4. Last name. ditto.
5. Country. ditto.
6. State. ditto.
7. City. ditto.
8. Address. ditto.
9. Building. ditto.
10. Flat. ditto.
11. Zip code. ditto.
12. Phone. ditto.
13. Fax. ditto.
14. E-mail. ditto.
15. Site. ditto.
16. Passport number. ditto.
17. Birth place. ditto.
18. Comment. Hell! Comment!
See? Not a single field in the list requires distinction between null and empty. And slices don't differentiate between them. Just as planned.

> If we have null, let's use it, if we want to remove null then let's remove it, but can we get out of this horrid middle ground please.

*sigh* people just don't buy the KISS principle...

On Friday, 25 October 2013 at 11:41:38 UTC, Kagamin wrote:
> That's an implementation detail which has no meaning for business logic.

I've no real truck in this, but I do find it pretty bizarre to see _anyone_ using "business logic" as justification for anything here when D's own documentation is pretty explicit about not catering exclusively to that domain.

-Wyatt

On Friday, 25 October 2013 at 11:41:38 UTC, Kagamin wrote:
> On Monday, 21 October 2013 at 10:33:01 UTC, Regan Heath wrote:
>> null strings are no different to null class references, they're not a special case.
>
> True. That's an implementation detail which has no meaning for business logic. When implementation deviates from business logic, one ends up fixing the implementation details everywhere in order to implement business logic. That's why string.IsNullOrEmpty is used.

That's not an implementation detail. Whether "null" is in the set of values of a string type and whether it is identical to "empty" are fundamental properties of that type. If you define the string type to include "null", then "null" should be either identical to "empty" in *all cases* or distinct from that in all cases. D chose to fuse "null" and "empty" together in an inconsistent manner, which is a mistake. If we include "null" in the set, then either the [] literal should be non-null (and "null" and "empty" properly disjoint), or "null" and "empty" should always represent the same value. If we exclude it - *then* "null" becomes an implementation detail and should be dealt with only via .ptr.

>> People seem to have this odd idea that null is somehow an invalid state for a string /reference/ (c# strings are reference types), it's not.
>
> That's the very problem: null and empty are valid states and must be treated equally as "no data", but they can't for purely technical reasons.

Whether they are valid states is irrelevant. What matters is whether they represent identical values. In D, they are unhealthily mixed.
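For readers who have not run into the mixing described above, a small D sketch of where equality and identity diverge. The helper `isNullSlice` is a hypothetical name introduced only for this example.

```d
// Hypothetical helper: true when the slice has no backing storage at all.
bool isNullSlice(T)(T[] a)
{
    return a.ptr is null;
}

unittest
{
    string fromNull = null;
    string fromLiteral = "";
    int[] fromBrackets = [];

    // Equality compares contents, so every empty slice equals every other.
    assert(fromNull == fromLiteral);

    // Identity looks at pointer and length, and here the cases diverge.
    assert(fromNull is null);
    assert(fromLiteral !is null);    // "" points at real storage
    assert(fromBrackets is null);    // the [] literal is null

    // An empty slice taken from a non-empty string is empty but not null.
    string tail = "abc"[3 .. $];
    assert(tail.length == 0);
    assert(!isNullSlice(tail));
    assert(isNullSlice(fromNull));
}
```

Code that tests only `.length` or `==` never observes the difference; code that tests `is null` does.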

On Friday, 25 October 2013 at 12:35:44 UTC, Wyatt wrote:
> On Friday, 25 October 2013 at 11:41:38 UTC, Kagamin wrote:
>> That's an implementation detail which has no meaning for business logic.
>
> I've no real truck in this, but I do find it pretty bizarre to see _anyone_ using "business logic" as justification for anything here when D's own documentation is pretty explicit about not catering exclusively to that domain.

Dunno about D documentation, I use tools to get shit done. If they help, that's good, if they don't, that's bad. And by "shit" I mean a product, not a heap of text files.

On Friday, 25 October 2013 at 16:31:54 UTC, Max Samukha wrote:
> D chose to fuse "null" and "empty" together in an inconsistent manner, which is a mistake.

Slices are reasonably consistent and perfectly working with reasonable code, so I see no merit in fixing them, but you can try, why not.

On 2013-10-18 17:32:58 +0000, Jonathan M Davis said:
> On Friday, October 18, 2013 09:55:46 Andrei Alexandrescu wrote:
>> On 10/18/13 9:26 AM, Max Samukha wrote:
>>> *That's* bad API design. readln should be symmetrical to writeln, not
>>> write. And about preserving the exact representation of new lines,
>>> readln/writeln shouldn't preserve that, pure and simple.
>>
>> Fair point. I just gave one possible alternative out of many. Thing is,
>> relying on client code to distinguish subtleties between empty and null
>> strings is fraught with dangers.
>
> Yeah, but the primary reason that it's bad design is the fact that D tries to
> conflate null and empty instead of keeping them distinct (which is essentially
> the complaint that was made). Whether that's ultimately good or bad is up for
> debate, but the side effect is that relying on the difference between null and
> empty ends up being very bug-prone, whereas in other languages which don't
> conflate the two, it isn't problematic in the same way, and it's much more
> reasonable to have the API treat them differently.
>
> - Jonathan M Davis

Null and the Empty Set are different entities. A set containing exactly nothing, vs undefined. However, null is not handled properly in D or any other systems language since it's simply a pointer with value = 0. if (null == 0) is a true statement in C, C++, and D, but is not in fact true. Null is neither equal to zero, nor not equal to zero.

On 2013-10-25 11:41:36 +0000, Kagamin said:
> Oracle makes no distinction between null and empty string. For a reason?
> A database is an implementation detail of a data storage, it doesn't implement business logic, it only provides features, which can be used with more or less success to implement business logic. Ever heard of advantages of OO databases over relational ones? That's an illustration of technical details, which don't precisely map to business logic.

That's poor friggin design, and it's for a bad reason. Oracle is not the example you want to be following. Sql Server does *NOT* follow their example for GOOD reason. My middle name is not null, it is NOTHING. There are lots of places where Oracle made bad design decisions and they cannot escape them due to requiring backwards compatibility.

As the OP of this thread I want to say that I think Nullable is the solution (http://dlang.org/phobos/std_typecons.html), but I dislike that I can't pass 5 or null directly to a parameter declared as Nullable!int or Nullable!string.
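One possible workaround, sketched under the assumption that a few extra overloads are acceptable. The function `show` and its overloads are hypothetical; `Nullable` and `nullable` are the Phobos names from std.typecons.

```d
import std.conv : to;
import std.typecons : Nullable, nullable;

// Hypothetical function taking an optional integer.
string show(Nullable!int value)
{
    return value.isNull ? "none" : value.get.to!string;
}

// Hypothetical convenience overloads so call sites can pass a plain int or null.
string show(int value)
{
    return show(nullable(value));
}

string show(typeof(null))
{
    return show(Nullable!int.init);
}

unittest
{
    // Without the overloads, the value has to be wrapped by hand.
    assert(show(nullable(5)) == "5");
    assert(show(Nullable!int.init) == "none");

    // With the overloads, both spellings from the post compile.
    assert(show(5) == "5");
    assert(show(null) == "none");
}
```

The cost is one forwarding overload per accepted form, which is tolerable for a handful of functions but does not scale to every Nullable parameter in an API.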
