September 27, 2009
downs:

> Basically, anything that may fill it with nulls.
> 
> The only two allowed instructions would be ~= NonNullable and ~= NonNullableArray. And it's good that way.

I agree.
In such a situation I'd also like a default method to insert one or more non-null items at any point in the array (see the insert method of Python lists, which can also be expressed as s[i:i] = [x]). Having a few basic default methods will help keep such safe arrays flexible.
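
For example, something like this (just a sketch; NonNullableArray and its insert method are hypothetical names, not an existing type):

struct NonNullableArray(T)
{
    private T[] data;

    // Insert one non-null item at index i, like Python's s[i:i] = [x].
    void insert(size_t i, T item)
    {
        assert(item !is null, "cannot insert null into a non-nullable array");
        data = data[0 .. i] ~ item ~ data[i .. $];
    }
}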

Bye,
bearophile
September 27, 2009
Jeremie Pelletier wrote:
> Walter Bright wrote:
>> Yigal Chripun wrote:
>>> An exception trace is *far* better than a segfault and that does not require null values.
>>
>> Seg faults are exceptions, too. You can even catch them (on windows)!
> 
> Walter, check the crash handler I submitted to D.announce; it has signal handlers on Linux to convert segfaults into D exception objects and throw them, so the code can unwind properly and even catch them.
> 
> It has made my life so much easier; I barely need to run within a debugger anymore for most crashes. I don't know enough of Phobos and druntime to port it, but it's under a public domain license, so anyone is free to do it!
> 
> </shameless plug>

I think that's great. Walter, Sean, please let's look into this.

Andrei
September 27, 2009
Michel Fortin wrote:
> On 2009-09-26 23:28:30 -0400, Michel Fortin <michel.fortin@michelf.com> said:
> 
>> On 2009-09-26 22:07:00 -0400, Walter Bright <newshound1@digitalmars.com> said:
>>
>>> [...] The facilities in D enable one to construct a non-nullable type, and they are appropriate for many designs. I just don't see them as a replacement for *all* reference types.
>>
>> As far as I understand this thread, no one here is arguing that non-nullable references/pointers should replace *all* reference/pointer types. The argument made is that non-nullable should be the default and nullable can be specified explicitly any time you need it.
>>
>> So if you need a reference you use "Object" as the type, and if you want that reference to be nullable you write "Object?". The static analysis can then verify that your code properly checks for null prior to dereferencing a nullable type, and issue a compilation error if not.
> 
> I just want to add: some people here are suggesting that the compiler add code to check for null and throw exceptions... I believe, like you, that this is the wrong approach because, as you said, it makes people add dummy try/catch statements to ignore the error. What you want a programmer to do is check for null and properly handle the situation before the error occurs, and this is exactly what the static analysis approach I suggest enforces.
> 
> Take this example where "a" is non-nullable and "b" is nullable:
> 
> string test(Object a, Object? b)
> {
>     auto x = a.toString();
>     auto y = b.toString();
>     return x ~ y;
> }
> 
> This should result in a compiler error on line 4 with a message telling you that "b" needs to be checked for null prior to use. The programmer must then fix his error with an if (or some other control structure), like this:
> 
> string test(Object a, Object? b)
> {
>     auto result = a.toString();
>     if (b)
>         result ~= b.toString();
> 
>     return result;
> }
> 
> And now the compiler will let it pass. This is what I'd like to see. What do you think?
> 
> I'm not totally against throwing exceptions in some cases, but the above approach would be much more useful. Unfortunately, throwing exceptions is the best you can do with a library-type approach.
> 

I don't think this would fly. One good thing about nullable references is that they are dynamically checked for validity at virtually zero cost. Non-nullable references, therefore, would not add value in that respect, but would add value by reducing the cases where programmers forget to initialize references properly.

Andrei
September 27, 2009
downs wrote:
> Walter Bright wrote:
>> Nick Sabalausky wrote:
>>
>> I agree with you that if the compiler can detect null dereferences at
>> compile time, it should.
>>
>>
>>>> Also, by "safe" I presume you mean "memory safe" which means free of
>>>> memory corruption. Null pointer exceptions are memory safe. A null
>>>> pointer could be caused by memory corruption, but it cannot *cause*
>>>> memory corruption.
>>> No, he's using the real meaning of "safe", not the
>>> misleadingly-limited "SafeD" version of "safe" (which I'm still
>>> convinced is going to get some poor soul into serious trouble from
>>> mistakingly thinking their SafeD program is much safer than it really
>>> is). Out here in reality, "safe" also means a lack of ability to
>>> crash, or at least some level of protection against it. 
>> Memory safety is something that can be guaranteed (presuming the
>> compiler is correctly implemented). There is no way to guarantee that a
>> non-trivial program cannot crash. It's the old halting problem.
>>
> 
> Okay, I'm gonna have to call you out on this one because it's simply incorrect.
> 
> The halting problem deals with a valid program state - halting.
> 
> We cannot check if every program halts because halting is an instruction that must be allowed at almost any point in the program.
> 
> Why do crashes have to be allowed? They're not an allowed instruction!
> 
> A compiler can be turing complete and still not allow crashes. There is nothing wrong with this, and it has *nothing* to do with the halting problem.
> 
>>> You seem to be under the impression that nothing can be made
>>> uncrashable without introducing the possibility of corrupted state.
>>> That's hogwash.
>> I read that statement several times and I still don't understand what it
>> means.
>>
>> BTW, hardware null pointer checking is a safety feature, just like array
>> bounds checking is.
> 
> PS: You can't convert segfaults into exceptions under Linux, as far as I know.

How did Jeremie do that?

Andrei
September 27, 2009
Andrei Alexandrescu:

> One good thing about nullable references is that they are dynamically checked for validity at virtually zero cost. Non-nullable references, therefore, would not add value in that respect, but would add value by reducing the cases where programmers forget to initialize references properly.

Non-nullable references can also reduce the total amount of code a little, because you don't need to write null tests as often (there are more points where you use objects than points where you instantiate them).
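
For example (hypothetical names; the second version assumes the proposed non-nullable default, so it is only a sketch):

class Foo { void print() {} }

// nullable by default: each of the many use sites has to guard
void showToday(Foo f)
{
    if (f is null) return;
    f.print();
}

// non-nullable by default: no test at the use sites; only the few places
// that start from a nullable Foo? have to check, once, when converting
void showNonNull(Foo f)   // f statically known to be non-null
{
    f.print();
}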

Bye,
bearophile
September 27, 2009
"Walter Bright" <newshound1@digitalmars.com> wrote in message news:h9n3k5$2eu9$1@digitalmars.com...
> Jason House wrote:
>>> Also, by "safe" I presume you mean "memory safe" which means free of memory corruption. Null pointer exceptions are memory safe. A null pointer could be caused by memory corruption, but it cannot *cause* memory corruption.
>>
>> I reject this argument too :( To me, code isn't safe if it crashes.
>
> Well, we can't discuss this if we cannot agree on terms. The conventional definition of memory safe means no memory corruption.

He keeps saying "safe", and every time he does you turn it into "memory safe". If he meant "memory safe" he probably would have said something like "memory safe". He already made it perfectly clear he's talking about crashes, so continuing to put the words "memory safe" into his mouth doesn't help the discussion.

> Boeing, Boeing, Boeing, Boeing, Boeing...

Straw man. No one's arguing against designing systems to survive failure, and no one's arguing against forcing errors to be exposed.

Your point seems to be: A good system is designed to handle a crash/failure without corruption, so let's allow things to crash/fail all they want.

Our point is: A good system is designed to handle a crash/failure without corruption, but let's also do what we can to minimize the amount of crashes/failures in the first place.

You're acting as if handling failures safely and minimizing failures were mutually exclusive.

> It's not designed to segfault. It's designed to expose errors, not hide them.

Right. And some of these errors can be exposed at compile time...and you want to just leave them as runtime segfaults instead? And you want this because exposing an error at compile time somehow causes it to become a hidden error?


September 27, 2009
Nick Sabalausky:

> He keeps saying "safe", and every time he does you turn it into "memory safe". If he meant "memory safe" he probably would have said something like "memory safe". He already made it perfectly clear he's talking about crashes, so continuing to put the words "memory safe" into his mouth doesn't help the discussion.

Likewise, I think the name of SafeD modules is misleading; they are MemorySafeD :-)

Bye,
bearophile
September 27, 2009
Nick Sabalausky wrote:

> "Walter Bright" <newshound1@digitalmars.com> wrote in message
...
> You're acting as if handling failures safely and minimizing failures were mutually exclusive.

Not that I have an opinion on this either way, but if I understand Walter right, that is exactly his point (although you exaggerate it a bit); see below.

>> It's not designed to segfault. It's designed to expose errors, not hide them.
> 
> Right. And some of these errors can be exposed at compile time...and you want to just leave them as runtime segfaults instead? And you want this because exposing an error at compile time somehow causes it to become a hidden error?

somehow -> encourages a practice where programmers get annoyed by the 'exposing of errors' to the point that they hide them

This is what it's about, I think: are non-nullable references *by default* so annoying as to cause programmers to initialize them with wrong values (or to circumvent them in other ways)? The answer may depend on the details of the feature, the quality of the implementation, and the habits of the 'programmers' in question; I don't know.




September 27, 2009
Lutger Wrote:

> This is what it's about, I think: are non-nullable references *by default* so annoying as to cause programmers to initialize them with wrong values (or to circumvent them in other ways)? The answer may depend on the details of the feature, the quality of the implementation, and the habits of the 'programmers' in question; I don't know.

In reality, the issue becomes what programmers will do to bypass compiler errors. This is one area where syntactic sugar is worth its weight in gold. I'm envisioning the syntax of Fan, or a very C#-like syntax:

SomeType x; // Not nullable
SomeType? y; // Nullable

If the developer is too lazy to add the question mark and prefers to do
SomeType x = cast(SomeType) null;
then it's their own fault when they get a runtime segfault instead of a compile-time error.

September 27, 2009
On Sat, 26 Sep 2009 17:08:32 -0400, Walter Bright <newshound1@digitalmars.com> wrote:

> Denis Koroskin wrote:
>  > On Sat, 26 Sep 2009 22:30:58 +0400, Walter Bright
>  > <newshound1@digitalmars.com> wrote:
>  >> D has borrowed ideas from many different languages. The trick is to
>  >> take the good stuff and avoid their mistakes <g>.
>  >
>  > How about this one:
>  > http://sadekdrobi.com/2008/12/22/null-references-the-billion-dollar-mistake/
>  >
>  > :)
>
> I think he's wrong.
>

Analogies aside, we have 2 distinct problems here, with several solutions for each.  I jotted down what I think are the solutions being discussed and what the Pros and Cons of each are.

Problem 1. Developer of a function wants to ensure non-null values are passed into his function.

Solution 1:

  Rely on the hardware feature to do the checking for you.

  Pros: Easy to do, simple to implement, optimal performance (hardware's going to do this anyways).
  Cons: Runtime error instead of compile-time, Error doesn't always occur close to the problem, not always easy to get a stack trace.

Solution 2:

  Check for null once the values come into the function, throw an exception.

  Pros: Works with the exception system.
  Cons: Manual implementation required, performance hit for every function call, Runtime error instead of compile-time, Error doesn't always occur close to the problem.

Solution 3:

  Build the non-null requirement into the function signature (note: the requirement is optional, it's still possible to use null references if you want; see the sketch after this list).

  Pros: Easy to implement, Compile-time error, hard to "work around" by putting in a dummy value, sometimes no performance hit, most times very little performance hit, allows solutions 1 and 2 if you want, runtime errors occur AT THE POINT things went wrong, not later.
  Cons: Non-zero performance hit (you have to check for null sometimes before assignment!)

Solution 4:

  Perform a null check for every dereference (The Java/C# solution).

  Pros: Works with the exception system, easy to implement.
  Cons: Huge performance hit (except in OSes where the segfault can be hooked), Error doesn't always occur close to the problem.
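
To make Solutions 2 and 3 above concrete, here are rough sketches (Widget and the NonNull wrapper are hypothetical names, not existing library types):

class Widget { void draw() {} }

// Solution 2: a manual check at the top of every function
void renderChecked(Widget w)
{
    if (w is null)
        throw new Exception("renderChecked: w must not be null");
    w.draw();
}

// Solution 3: the requirement lives in the signature; any checking happens
// once, where a possibly-null Widget is converted to a NonNull!Widget
struct NonNull(T)
{
    private T value;
    this(T v) { assert(v !is null, "null passed where non-null is required"); value = v; }
    T get() { return value; }
    alias get this;   // lets a NonNull!T be used wherever a T is expected
}

void renderNonNull(NonNull!Widget w)
{
    w.draw();   // no check needed here; w can't be null
}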

-----------------------

Problem 2. Developer forgets to initialize a declared reference type, but uses it.

Solution 1:

  Assign a default value of null.  Rely on hardware to tell you when you use it later that you screwed up.

  Pros: Easy to do, simple to implement, optimal performance (hardware's going to do this anyways).
  Cons: Runtime error instead of compile-time, Error doesn't always occur close to the problem, not always easy to get a stack trace.

Solution 2:

  Require assignment, even if assignment to null. (The "simple" solution)

  Pros: Easy to implement, forces the developer to clarify his requirements -- reminding him that there may be a problem.
  Cons: May be unnecessary, forces the developer to make a decision, may result in a dummy value being assigned, reducing to solution 1.

Solution 3:

  Build into the type the requirement that it can't be null, therefore checking for non-null on assignment.  A default value isn't allowed.  A nullable type is still allowed, which reduces to solution 1.  (A rough sketch follows the *NOTE* below.)

  Pros: Easy to implement, solution 1 is still possible, compile-time error on misuse, error occurs at the point things went wrong, no performance hit (except when you convert a nullable type to a non-nullable type), allows solution 3 for the first problem.
  Cons: Non-zero performance hit when assigning a nullable to a non-nullable type.

Solution 4:

  Compiler performs flow analysis, giving an error when an unassigned variable is used. (The C# solution)

  Pros: Compile-time error, with good flow analysis allows correct code even when assignment isn't done on declaration.
  Cons: Difficult to implement, sometimes can incorrectly require assignment if flow is too complex, can force developer to manually assign null or dummy value.

*NOTE* for solution 3 I purposely did NOT include the con that it makes people assign a dummy value.  I believe this argument to be invalid, since it's much easier to just declare the variable as a nullable equivalent type (as other people have pointed out).  That problem is more a factor of solutions 2 and 4.
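
To illustrate Solution 3 for this second problem, using the same hypothetical NonNull wrapper sketched above:

Widget maybe;                       // nullable, defaults to null
auto sure = NonNull!Widget(maybe);  // fails right here, at the point of the mistake

// The catch, and why this really wants language support rather than a library
// type: a default-initialized NonNull!Widget still silently wraps a null,
// because the struct's .init value is always available.
NonNull!Widget forgotten;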

----------------------

Anything I missed?

After looking at all the arguments, and brainstorming myself, I think I prefer the non-nullable defaults (I didn't have a position on this concept before this thread, though I had given it some thought).

I completely agree with Ary and some others who say "use C# for a while, and see how much it helps."  I wrote C# code for a while; I got those errors frequently, and usually it was something I forgot to initialize or return.  It definitely does not cause the "assign dummy value" syndrome as Walter has suggested.  Experience with languages that do a good job of letting the programmer know when he made an actual mistake makes a huge difference.

I think the non-nullable default will result in even less of a temptation to assign a dummy value.

-Steve