November 04, 2008
On Tue, 04 Nov 2008 12:32:59 -0800, Walter Bright <newshound1@digitalmars.com> wrote:

> Brendan Miller wrote:
>> This is obviously a problem. Everyone knows that null pointer
>> exceptions in Java/C#, or segmentation faults in C and C++ are one of
>> the biggest sources of runtime errors.
>
> Yes, but those are neither type safe errors nor memory safe errors. A null pointer is neither mistyped nor can it cause memory corruption.

null pointers DO cause memory corruption:

   byte* foo = null;   // NULL!
   foo[1244916] = 5;   // WORKS; CORRUPTS!
November 04, 2008
Jarrett Billingsley:
> Have you looked at Delight at all?

Besides the topic of nullable types you are discussing, Delight's look is designed to appeal mostly to Python programmers (despite being just D2 with a little sugar), and to people who care a lot about having a cleaner syntax, so C/C++ programmers may be less interested...

If Delight becomes refined and debugged enough, I hope to see it bundled by default with the LDC compiler, as well as Tango, a GUI toolkit like GTK for D, and a few other goodies, like an editor/almost-IDE. I think it can become a way to "sell" D2 to other kinds of programmers.

Bye,
bearophile
November 05, 2008
Jarrett Billingsley wrote:
> On Tue, Nov 4, 2008 at 5:31 PM, Walter Bright
> <newshound1@digitalmars.com> wrote:
>>> Don't you think that eliminating something that's
>>> always a bug at compile time is a worthwhile investment?
>> Not always. There's a commensurate increase in complexity that may not make
>> it worth while.
> 
> Have you looked at Delight at all?  I wouldn't call the impact of
> nullable types on D "commensurate."  It's probably far less than
> const, invariant, pure, and escape analysis.

Sorry, I have not looked at Delight.


>> My focus is on eliminating bugs that cannot be reliably detected even at run
>> time. This will be a big win for D.
> 
> Can you expand upon this a bit?  What exactly are some bugs that can't
> be reliably detected at runtime other than memory corruption?

Memory corruption is a big one. Others are sequential consistency bugs and function hijacking.
November 05, 2008
cemiller wrote:
> null pointers DO cause memory corruption:
> 
>    byte* foo = null;   // NULL!
>    foo[1244916] = 5;   // WORKS; CORRUPTS!

Yes, but so will any pointer that you index out of bounds. That's why safe D will not allow arithmetic on pointers.
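
To make the point concrete outside D, here is a rough C++ sketch of a bounds-checked store rejecting the same index that the raw pointer write above accepts. The helper names `checked_store` and `store_is_rejected` are made up for illustration and are not from any D or C++ library. The idea it shows is that the unchecked index is what corrupts memory, whatever the base pointer happens to be.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: store `value` at `index`, refusing out-of-range
// indices instead of writing to whatever memory lies past the buffer.
void checked_store(std::vector<unsigned char>& buf, std::size_t index,
                   unsigned char value) {
    buf.at(index) = value;  // std::vector::at throws std::out_of_range
}

// Returns true if the store was rejected by the bounds check.
bool store_is_rejected(std::vector<unsigned char>& buf, std::size_t index) {
    try {
        checked_store(buf, index, 5);
        return false;
    } catch (const std::out_of_range&) {
        return true;
    }
}
```

With a 16-byte buffer, a store at index 3 goes through, while a store at index 1244916 is rejected rather than landing somewhere else in the address space.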
November 05, 2008
Walter Bright Wrote:

> Brendan Miller wrote:
> > This is obviously a problem. Everyone knows that null pointer exceptions in Java/C#, or segmentation faults in C and C++ are one of the biggest sources of runtime errors.
> 
> Yes, but those are neither type safe errors nor memory safe errors. A null pointer is neither mistyped nor can it cause memory corruption.

Well... I can't speak for null pointers in D, but they can definitely cause memory corruption in C++. Not all OSes have memory protection. *remembers the good old days of Mac OS System 7*

Back to the important point!

A couple of times in this thread I've seen people suggest that null pointers are type safe. I don't see how that statement is justifiable. People accept null because it's always been there for those of us who are long-time C coders. What you have to remember is that C was not type safe in any way, shape, or form.

First off, let's clarify that we're talking about *static* type safety. Languages like Python are dynamically type safe because at runtime you will see an exception thrown if you try to perform an operation on a type that does not support it. If you have a reference in Python, you can point it to whatever the hell you want and the runtime will prevent you from performing the wrong operation on the wrong data. It's a more limited form of type checking than static type checking, but many people find this acceptable.

In a statically typed language, it is *impossible* to perform an operation on a type that does not support it, because at compile time you know the types of the objects.

Concretely, null is a pointer to address zero. For some type T, there is never any T at address zero. Therefore a statically typed language will prevent you from assigning a pointer to an object that is not of type T to a pointer declared to be of type T. That's *the entire point* of static typing. T* means "that which I point to is in the set of T". T sans the star means "I am in the set of T". Not sometimes. Not maybe. Always.

Yes, you can also get performance benefits from type annotations... but that doesn't make the language statically type *safe*.

Now of course, sometimes we do want a pointer to type T to be null... but what does that *mean*? It means you have a variable that sometimes you want to hold a pointer to T... and sometimes you don't want to hold a pointer to T.

This is called a variant. Different languages implement variants in different ways and have different names for them. In C, they are called unions. C, again, is not type *safe*, so if you try to treat a union as the wrong type, it will let you. However, in most languages, variants are dynamically type checked, and thus offer the lesser form of type safety.

C and C++ pointers to T are variants of type T and the type of NULL. Except, of course, like unions they aren't type safe even dynamically, because the runtime won't stop you from dereferencing null. The operating system *will* stop you by killing your process if you are on a system with protected memory, because address zero is not accessible to userspace on most systems. *most* systems, not all.

Think about this in terms of set theory and the idea should become clear. Null should not be assignable to a pointer to T because the object it points to at address zero does not lie within the set of T's. If it did lie within the set of T's, then this should be valid:

T myObject;
myObject = *NULL;

It shouldn't even require a type cast because type casts are ways of breaking out of static typing. But it does in C++. In fact, this code generates:

error: invalid type argument of `unary *'

Damn right.

Now, really, what's so hard about adding a statically type safe pointer? C++ already did it, and they are called references. My complaint here, after all, was that D is apparently less type safe than C++.

Now, I have other problems with C++ references. That they have value semantics is just stupid (especially since they are *called* references!). Type safety and value vs. reference semantics have nothing to do with one another. Indeed, sometimes you might even want a variant to have value semantics. That's why C# added nullable value types.

Brendan
November 05, 2008
Brendan Miller wrote:
> Well.. I can't speak for null pointers in D, but they can definitely
> cause memory corruption in C++. Not all OS's have memory protection.
> *remembers the good old days of Mac OS system 7*

Those machines are obsolete, for excellent reasons <g>. If, for some reason, D needs to be implemented for such a machine, the solution is to optionally insert a runtime check, analogous to array bounds checking.
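
As a rough illustration of what such an inserted check might amount to, here is a sketch in C++, which has no built-in null check of its own. `checked_deref` and `deref_is_rejected` are hypothetical names, and nothing here is claimed to match what a D compiler actually emits.

```cpp
#include <stdexcept>

// Hypothetical helper: the test a compiler could insert before each
// dereference on a target without memory protection, in the same spirit
// as an array bounds check.
template <typename T>
T& checked_deref(T* p) {
    if (p == nullptr) {
        throw std::runtime_error("null pointer dereference");
    }
    return *p;
}

// Returns true if writing through `p` is caught by the check.
bool deref_is_rejected(int* p) {
    try {
        checked_deref(p) = 5;
        return false;
    } catch (const std::runtime_error&) {
        return true;
    }
}
```

The check turns a null dereference into a reportable runtime error on any platform, at the cost of one comparison per access, which is presumably why it would be optional.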

> Concretely, null is a pointer to address zero. For some type T, there
> is never any T at address zero. Therefore a statically typed language
> will prevent you from assigning a pointer to an object that is not of
> type T to a pointer declared to be of type T. That's *the entire point*
> of static typing. T* means "that which I point to is in the set of
> T". T sans the star means "I am in the set of T". Not sometimes. Not
> maybe. Always.

I understand your point, and it sounds right technically. But practically, I'm not convinced.

For example, consider a linked list. How do you know you've reached the end of the list? By the pointer being null or pointing to some "impossible" object. If you pick the latter, what really have you gained over a null pointer?
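
The two options can be put side by side in a small C++ sketch. The `Node` type and both function names are invented for illustration. The traversal loop has the same shape either way, and only the end-of-list test differs.

```cpp
#include <cstddef>

struct Node {
    int value;
    Node* next;
};

// End of list signalled by a null `next` pointer.
std::size_t length_null_terminated(const Node* head) {
    std::size_t n = 0;
    for (const Node* p = head; p != nullptr; p = p->next) {
        ++n;
    }
    return n;
}

// End of list signalled by a dedicated sentinel node that carries no data.
std::size_t length_sentinel_terminated(const Node* head, const Node* end) {
    std::size_t n = 0;
    for (const Node* p = head; p != end; p = p->next) {
        ++n;
    }
    return n;
}
```

In both versions, forgetting the end test walks off the list. The difference is only in what is found there: address zero in the first case, a real but meaningless node in the second.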
November 05, 2008
On Tue, Nov 4, 2008 at 11:40 PM, Walter Bright <newshound1@digitalmars.com> wrote:
> I understand your point, and it sounds right technically. But practically, I'm not convinced.
>
> For example, consider a linked list. How do you know you've reached the end of the list? By the pointer being null or pointing to some "impossible" object. If you pick the latter, what really have you gained over a null pointer?
>

The implication of non-nullable types isn't that nullable types disappear; quite the opposite, in fact.  Nullable types have obvious use for exactly the reason you explain.  The problem arises when nullable types are used in situations where it makes _no sense_ for null to appear.  This is where bugs show up.  In a system that has both nullable and non-null types, nullable types act as a sort of container, preventing you from accessing anything through them as it cannot be statically proven that the access will be legal at runtime. In order to access something from a nullable type, you have to convert it to a non-null type.  Delight uses D's "declare a variable in the condition of an if or while" to great effect here:

if(auto f = someFuncThatReturnsNullableFoo()) // f is declared as non-null
{
    // f is known not to be null.
}
else
{
    // something else happened.  Handle it.
}

Null still has a purpose.  It's just that its purpose is really only to signal a special case.
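
For comparison, the same shape can be approximated in C++, with a pointer standing in for the nullable type and a reference for the non-null one. This is only a sketch: `Foo`, `find_foo`, `name_length` and `lookup_name_length` are hypothetical, and unlike the scheme described above, C++ does not stop anyone from dereferencing the pointer outside the `if`.

```cpp
#include <cstddef>
#include <string>

struct Foo {
    std::string name;
};

// Hypothetical lookup that may fail: the pointer return type plays the
// role of the nullable type.
Foo* find_foo(Foo* table, int count, const std::string& name) {
    for (int i = 0; i < count; ++i) {
        if (table[i].name == name) {
            return &table[i];
        }
    }
    return nullptr;
}

// Takes a reference, so the caller must already hold a real Foo.
std::size_t name_length(const Foo& foo) {
    return foo.name.size();
}

// Returns the length of the matching name, or -1 for the special case.
long lookup_name_length(Foo* table, int count, const std::string& name) {
    if (Foo* f = find_foo(table, count, name)) {
        // f is known not to be null here, so binding a reference is safe.
        return static_cast<long>(name_length(*f));
    } else {
        // Nothing was found. Handle it.
        return -1;
    }
}
```

The null test happens once, at the boundary, and everything downstream of `name_length` works with a type that cannot express "no Foo".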
November 05, 2008
Walter Bright wrote:
> Jarrett Billingsley wrote:
>> Dereferencing a null pointer is *always* a bug, it doesn't matter how
>> "safe" it is.
> 
> Sure. But I'm interested in creating a safe subset of D, and so the more correct interpretation of what constitutes "safety" is important.
> 
>> Don't you think that eliminating something that's
>> always a bug at compile time is a worthwhile investment?
> 
> Not always. There's a commensurate increase in complexity that may not make it worth while.
> 
> My focus is on eliminating bugs that cannot be reliably detected even at run time. This will be a big win for D.

FWIW, I've _never_ run into a bug const could have prevented. OTOH, I've often run into bugs that non-nullable types could have prevented (including one on a production system... well, there was another bug that raised an exception causing something else to be uninitialized and the system came crashing down).
November 05, 2008
"Robert Fraser" <fraserofthenight@gmail.com> wrote in message news:geregs$sj1$1@digitalmars.com...
> FWIW, I've _never_ run into a bug const could have prevented. OTOH, I've often run into bugs that non-nullable types could have prevented (including one on a production system... well, there was another bug that raised an exception causing something else to be uninitialized and the system came crashing down).

Hear hear!

Nullness should have nothing to do with a type having reference or value semantics. These two concepts are orthogonal.

L. 

November 05, 2008
Jarrett Billingsley wrote:
> The implication of non-nullable types isn't that nullable types
> disappear; quite the opposite, in fact.  Nullable types have obvious
> use for exactly the reason you explain.  The problem arises when
> nullable types are used in situations where it makes _no sense_ for
> null to appear.  This is where bugs show up.  In a system that has
> both nullable and non-null types, nullable types act as a sort of
> container, preventing you from accessing anything through them as it
> cannot be statically proven that the access will be legal at runtime.
> In order to access something from a nullable type, you have to convert
> it to a non-null type.  Delight uses D's "declare a variable in the
> condition of an if or while" to great effect here:
> 
> if(auto f = someFuncThatReturnsNullableFoo()) // f is declared as non-null
> {
>     // f is known not to be null.
> }
> else
> {
>     // something else happened.  Handle it.
> }

I don't see what you've gained here. The compiler certainly can do flow analysis in some cases to know that a pointer isn't null, but that isn't generalizable. If a function takes a pointer parameter, no flow analysis will tell you if it is null or not.