Null references redux (page 2) - D Programming Language Discussion Forum

September 26, 2009

Re: Null references redux

Posted by Walter Bright
in reply to grauzone

grauzone wrote:
> Walter Bright wrote:
>> It is exactly analogous to a null pointer exception. And it's darned useful.
> 
> On Linux, it just generates a segfault. And then you have no idea where the program went wrong. dmd outputting incorrect debugging information (so you have trouble using gdb or even addr2line) doesn't really help here.

Then the problem is incorrect DWARF output, not null pointers.


> Not so useful.

It's *still* far more useful than generating corrupt output and pretending all is ok.

September 26, 2009

Re: Null references redux

Posted by Jeremie Pelletier
in reply to Jarrett Billingsley

Jarrett Billingsley wrote:
> On Sat, Sep 26, 2009 at 5:29 PM, Jeremie Pelletier <jeremiep@gmail.com> wrote:
> 
>> I actually side with Walter here. I much prefer my programs to crash on
>> using a null reference and fix the issue than add runtime overhead that does
>> the same thing. In most cases a simple backtrace is enough to pinpoint the
>> location of the bug.
> 
> There is NO RUNTIME OVERHEAD in implementing nonnull reference types.
> None. It's handled entirely by the type system. Can we please move
> past this?
> 
>> Null references are useful to implement optional arguments without any
>> overhead by an Optional!T wrapper. If you disallow null references what
>> would "Object foo;" initialize to then?
> 
> It wouldn't. The compiler wouldn't allow it. It would force you to
> initialize it. That is the entire point of nonnull references.

How would you do this then?

void foo(int a) {
	Object foo;
	if(a == 1) foo = new Object1;
	else if(a == 2) foo = new Object2;
	else foo = new Object3;
	foo.doSomething();
}

The compiler would just die on the first line of the method where foo is null.

What about "int a;" should this throw an error too? Or "float f;".

What about standard pointers? I can think of so many algorithms that rely on pointers possibly being null.

Maybe this could be a case to add in SafeD but leave out in standard D. I wouldn't want a nonnull reference type, I use nullables just too often.

September 26, 2009

Re: Null references redux

Posted by Walter Bright
in reply to Denis Koroskin

Denis Koroskin wrote:
> I don't understand you. You say you prefer 1, but describe the path D currently takes, which is 2!
> 
> dchar d; // not initialized
> writeln(d); // Soldier on and silently produce garbage output

d is initialized to the "invalid" Unicode bit pattern of 0xFFFF. You'll see this if you put a printf in. The bug here is in writeln failing to recognize the invalid value.

http://d.puremagic.com/issues/show_bug.cgi?id=3347
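
As a small illustration of the default values under discussion, the following D sketch checks what a few built-in types hold when declared without an initializer. The variable names are only for illustration.

```d
import std.math : isNaN;

void main()
{
    dchar d;    // no explicit initializer on any of these
    float f;
    int i;
    Object o;

    assert(d == 0xFFFF);      // the "invalid" bit pattern mentioned above
    assert(d == dchar.init);
    assert(isNaN(f));         // floating-point types default to NaN
    assert(i == 0);           // int has no spare "invalid" value, so it defaults to 0
    assert(o is null);        // class references default to null
}
```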

> I don't see at all how it is related to a non-null default.

Both are attempts to use invalid values.

> Non-null default is all about avoiding erroneous situations, enforcing program correctness and stability. You solve an entire class of problems: NullPointerException.

No, it just papers over the problem. The actual problem is the user failed to initialize it to a value that makes sense for his program. Setting it to a default value does not solve the problem.

Let's say the language is changed so that:

   int i;

is now illegal, and generates a compile time error message. What do you suggest the user do?

   int i = 0;

The compiler now accepts the code. But is 0 the correct value for the program? I guarantee you that programmers will simply insert "= 0" to get it to pass compilation, even if 0 is an invalid value for i for the logic of the program. (I guarantee it because I've seen it over and over, and the bugs that result.)

The point is, there really is no correct answer to the question "what should variables be default initialized to that will work correctly"? The best we can do is default initialize it to a NaN value, and then we can track usages of NaNs and know then that we have a program logic bug. A null reference is an ideal NaN value because attempts to use it will cause an immediate program halt with a findable indication of where the program logic went wrong. There's no avoiding it or pretending it didn't happen. There's no silently corrupt program output.
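
The "track usages of NaNs" idea can be seen with floating point, where D already defaults to NaN. The following is a minimal sketch; the `average` helper is hypothetical and exists only to show the value propagating.

```d
import std.math : isNaN;

// Hypothetical helper used only to show propagation.
double average(double a, double b)
{
    return (a + b) / 2;
}

void main()
{
    double unset;                        // default-initialized to NaN
    double result = average(unset, 4.0);
    assert(isNaN(result));               // the missing initialization is still visible

    int count;                           // default-initialized to 0
    assert(count + 4 == 4);              // nothing marks this value as "never set"
}
```

A floating-point result computed from an unset input stays recognizable as NaN, while the integer result looks like any other valid number.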

September 26, 2009

Re: Null references redux

Posted by Walter Bright
in reply to Jarrett Billingsley

Jarrett Billingsley wrote:
> It wouldn't. The compiler wouldn't allow it. It would force you to
> initialize it. That is the entire point of nonnull references.

Initialize it to what?

A user-defined default object? What should happen if that default object is accessed? Throw an exception? <g>

How would you define an "empty" slot in a data structure?

September 26, 2009

Re: Null references redux

Posted by Denis Koroskin
in reply to Jeremie Pelletier

On Sun, 27 Sep 2009 01:59:45 +0400, Jeremie Pelletier <jeremiep@gmail.com> wrote:

> Jarrett Billingsley wrote:
>> On Sat, Sep 26, 2009 at 5:29 PM, Jeremie Pelletier <jeremiep@gmail.com> wrote:
>>
>>> I actually side with Walter here. I much prefer my programs to crash on
>>> using a null reference and fix the issue than add runtime overhead that does
>>> the same thing. In most cases a simple backtrace is enough to pinpoint the
>>> location of the bug.
>>  There is NO RUNTIME OVERHEAD in implementing nonnull reference types.
>> None. It's handled entirely by the type system. Can we please move
>> past this?
>>
>>> Null references are useful to implement optional arguments without any
>>> overhead by an Optional!T wrapper. If you disallow null references what
>>> would "Object foo;" initialize to then?
>>  It wouldn't. The compiler wouldn't allow it. It would force you to
>> initialize it. That is the entire point of nonnull references.
>
> How would you do this then?
>
> void foo(int a) {
> 	Object foo;
> 	if(a == 1) foo = new Object1;
> 	else if(a == 2) foo = new Object2;
> 	else foo = new Object3;
> 	foo.doSomething();
> }
>

Let's first consider the following example:

void foo(int a) {
	Object foo;
	if (a == 1) foo = new Object1;
	else if(a == 2) foo = new Object2;
	else if(a == 3) foo = new Object3;

	foo.doSomething();
}

Do you agree that this program has a bug? It is buggy because one of the paths skips the initialization of "foo".

Now back to your question. My answer is that the compiler should be smart enough to differentiate between the two cases and raise a compile-time error in the latter one. That's what the C# compiler does: the first case compiles successfully while the second one doesn't.

Until then, non-nullable references are too hard to use to become useful, because you'll end up with a lot of initializer functions:

void foo(int a) {
	Object initializeFoo() {
		if (a == 1) return new Object1();
		if (a == 2) return new Object2();
		return new Object3();
	}

	Object foo = initializeFoo();
	foo.doSomething();
}

I actually believe the code is clearer that way, but there are cases where you can't do it (initializing several variables at once, for example).
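
For the several-variables case, one possible workaround (a sketch only, not something proposed in this thread) is to return all the values together from a single initializer, for example with `std.typecons.Tuple`. The function name `pick` and the values it returns are hypothetical.

```d
import std.typecons : Tuple, tuple;

// Hypothetical initializer that produces two related values at once,
// so no code path can leave either of them unset.
Tuple!(string, int) pick(int a)
{
    if (a == 1) return tuple("one", 10);
    if (a == 2) return tuple("two", 20);
    return tuple("other", 0);
}

void main()
{
    auto r = pick(2);
    string label = r[0];   // both variables are initialized from one result
    int scale = r[1];
    assert(label == "two");
    assert(scale == 20);
}
```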

September 26, 2009

Re: Null references redux

Posted by language_fan
in reply to Walter Bright

Sat, 26 Sep 2009 14:49:45 -0700, Walter Bright thusly wrote:

> The problem with non-nullable references is what do they default to? Some "nan" object? When you use a "nan" object, what should it do? Throw an exception?

Well, typically, if your type system supports algebraic types, you can define a higher order Optional type as follows:

  type Optional T = Some T | Nothing

Now a safe nullable reference type would look like

  Optional (T*)

The whole point is to make null pointer tests explicit. You can pass around the optional type freely, and only at the actual use site do you need to pattern match it to see if it's a null pointer:

  void foo(SafeRef[int] a) {
    match(a) {
      case Nothing => // handle null pointer
      case Some(b) => return b + 2;
    }
  }

The default initialization of this type is Nothing.

Some data structures can be initialized in a way that null pointers don't exist. In these cases you can use a type that does not have the 'Nothing' form. This can lead to nice optimizations. There is no default value, because default initialization can never occur.
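
The Optional idea above can be approximated in D itself. The following is a rough sketch, assuming a Phobos recent enough to ship `std.sumtype`; the names `Nothing`, `Some` and `Optional` are defined here for illustration and are not standard library types.

```d
import std.sumtype : SumType, match;

struct Nothing {}
struct Some(T) { T value; }

// Nothing is listed first so that a default-initialized Optional holds Nothing.
alias Optional(T) = SumType!(Nothing, Some!T);

int foo(Optional!int a)
{
    // Every alternative must be handled, so the "null" case cannot be forgotten.
    return a.match!(
        (Nothing _)  => -1,
        (Some!int b) => b.value + 2
    );
}

void main()
{
    Optional!int empty;                              // default initialization
    assert(foo(empty) == -1);
    assert(foo(Optional!int(Some!int(40))) == 42);
}
```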

September 26, 2009

Re: Null references redux

Posted by Denis Koroskin
in reply to Walter Bright

On Sun, 27 Sep 2009 01:49:45 +0400, Walter Bright <newshound1@digitalmars.com> wrote:

> The problem with non-nullable references is what do they default to? Some "nan" object? When you use a "nan" object, what should it do? Throw an exception?
>

Oh, my! You don't even know what a non-null default is!

There is a Null Object pattern (http://en.wikipedia.org/wiki/Null_Object_pattern) - I guess that's what you are talking about when you say "nan object" - but it has little to do with non-null references.

With non-null references, you don't have "wrong values" that throw an exception upon use (although that's clearly possible); you get a correct value.

If an object may or may not have a valid value, you mark it as nullable. The only difference is that nullability is no longer the default behavior, that's it. And the user is now warned that the object may not be initialized.
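
Opt-in nullability of this kind can be sketched in D with `std.typecons.Nullable`, where the "may be missing" state is visible in the type. The helper `lengthOrZero` is hypothetical.

```d
import std.typecons : Nullable;

// The parameter type itself tells the caller and the callee that
// the value may be absent, so the check is expected here.
size_t lengthOrZero(Nullable!string s)
{
    return s.isNull ? 0 : s.get.length;
}

void main()
{
    Nullable!string missing;             // starts out in the null state
    assert(missing.isNull);
    assert(lengthOrZero(missing) == 0);
    assert(lengthOrZero(Nullable!string("abc")) == 3);
}
```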

September 26, 2009

Re: Null references redux

Posted by Jarrett Billingsley
in reply to Jeremie Pelletier

On Sat, Sep 26, 2009 at 5:59 PM, Jeremie Pelletier <jeremiep@gmail.com> wrote:
>
> How would you do this then?
>
> void foo(int a) {
>        Object foo;
>        if(a == 1) foo = new Object1;
>        else if(a == 2) foo = new Object2;
>        else foo = new Object3;
>        foo.doSomething();
> }
>
> The compiler would just die on the first line of the method where foo is null.

Either use Object? (a nullable reference), or factor out the object creation - use a separate method or something.

> What about "int a;" should this throw an error too? Or "float f;".

Those are not reference types. But actually, the D spec says it's an error to use an uninitialized variable, so a compliant D compiler wouldn't be out of line in diagnosing such variables as errors if they are used before they're initialized. Such a compiler would break a lot of existing D code, but that's what you get for not following the spec.

> What about standard pointers? I can think of so many algorithms that rely on pointers possibly being null.

Again, you have both nonnull (void*) and nullable (void*?) types.

> Maybe this could be a case to add in SafeD but leave out in standard D. I wouldn't want a nonnull reference type, I use nullables just too often.

You probably use them far less than you'd think.

September 26, 2009

Re: Null references redux

Posted by grauzone
in reply to Walter Bright

Walter Bright wrote:
> Jarrett Billingsley wrote:
>> It wouldn't. The compiler wouldn't allow it. It would force you to
>> initialize it. That is the entire point of nonnull references.
> 
> Initialize it to what?
> 
> A user-defined default object? What should happen if that default object is accessed? Throw an exception? <g>
> 
> How would you define an "empty" slot in a data structure?

You can allow a non-nullable reference to be null, just like you allow an immutable object to be mutable during construction.

You just have to make sure the non-nullable reference is definitely assigned.
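
Current D can approximate forced initialization in library code. The sketch below uses `@disable this();` so that declaring the wrapper without an initializer is rejected at compile time. It is only an approximation of the proposal: the wrapper name `NonNull` and the class `Greeter` are hypothetical, the null check happens at run time in the constructor rather than in the type system, and `NonNull!T.init` still exists, so it is not airtight.

```d
struct NonNull(T) if (is(T == class))
{
    private T payload;

    @disable this();   // no default construction: an initializer is required

    this(T value)
    {
        if (value is null)
            throw new Exception("NonNull constructed from null");
        payload = value;
    }

    inout(T) get() inout { return payload; }
    alias get this;    // lets the wrapper be used where a T is expected
}

class Greeter
{
    string greet() { return "hello"; }
}

void main()
{
    // Declaring the wrapper without an initializer does not compile.
    static assert(!__traits(compiles, { NonNull!Greeter tmp; }));

    auto g = NonNull!Greeter(new Greeter);
    assert(g.greet() == "hello");
}
```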

September 26, 2009

Re: Null references redux

Posted by Walter Bright
in reply to Denis Koroskin

Denis Koroskin wrote:
>> If you disallow null references what would "Object foo;" initialize to then?
> Nothing. It's a compile-time error.

Should:

   int a;

be disallowed, too? If not (and explain why it should behave differently), what about:

   T a;

in generic code?

Copyright © 1999-2021 by the D Language Foundation