September 29, 2009
I don't know if what I am about to rant about has already been discussed and I just haven't noticed, but sometimes I feel like sticking my opinions in, and this seems to be one of those times :-) So bear with me and we'll see if I am a crazy man blabbing on about crap or not :-).

language_fan wrote:

> Mon, 28 Sep 2009 15:35:07 -0400, Jesse Phillips thusly wrote:
> 
>> language_fan Wrote:
>> 
>>> > Now if you really want to throw some sticks into the spokes, you would say that if the program crashes due to a null pointer, it is still likely that the programmer will just initialize/set the value to a "default" that still isn't valid just to get the program to continue to run.
>>> 
>>> Why should it crash in the first place? I hate crashes. Do you like them? I can prove by structural induction that you do not like them when you can avoid crashes with static checking.
>> 
>> No one likes programs that crash, but doesn't that mean it is incorrect behavior?
>> 
>>> Have you ever used functional languages? When you develop in Haskell or SML, how often do you feel there is a good chance something will be initialized to the wrong value? Can you show some statistics that show how unsafe this practice is?
>> 
>> So isn't that the question? Does/can "default" (by human or machine) initialization create an incorrect state?
Yes, but this is possible now anyway. Consider:

Foo obj;   // Machine default of null, right?
obj.bar(); // Null pointer exception, since null is a bad state for the app

Now in steps the moron programmer, who would put garbage data into non-nullable vars just to initialize them, to "fix" the issue:

Foo obj = new Foo("bleh"); // "Fix" to avoid the null pointer exception
                           // (and yes, I have seen people do this)
obj.bar();                 // Logic error, but the application soldiers on.
>> If it does, do we continue to
>> work as if nothing was wrong or crash?
Depends on the application's specification/purpose/design? See [1] a few lines down for what I mean, but I don't see how this is pertinent to the discussion.
>> I don't know how often the
>> initialization would be incorrect, but I don't think Walter is concerned
>> with its frequency, but that it is possible.
Not sure what you're getting at, but with my moron programmer example I have shown it's possible to insert garbage into classes; maybe we should drop those too? The default machine initialization seemed to screw up as well. I think there's a point where you have to trust the human factor to do its job correctly. If the feature were so ridiculously complex (like depending on planetary alignment) that it forced programmers into stupid practices, then fair enough: even if most would get it right, that's a problem with the feature, not the programmer (although I would personally say this isn't the case, pending my having understood the feature correctly :-P). So any technical issues, e.g. I believe someone mentioned the difficulty of enforcing this in structs allocated with malloc, are good points that I am just not technical enough to comment on; consider me the casual hobby reader who has an interest in, but not a good background in, systems languages. However, the previous discussions as I remember them seem to assert that the programmer is an idiot who will initialize with crap, which I think is simply outside the language's control.
> ...
> 
> It really depends on your subjective opinion whether you want a program to segfault or spot a set of errors statically, and have illegally behaving non-crashing programs. I say FFFFFFFFFFUUUUUUUUUUU every time I experience a segfault. My hobby programs at home are not that critical, and at work the critical code is *proven* to be correct so no need to worry there.
[1] I think the above touches on an important point when it comes to whether an app should crash or continue: without more in-depth knowledge of the app it is hard to say. In some applications it may be possible to crash a "process" within the application itself (i.e. just throw an exception) and return the application to a reasonable state from which it can continue (like crashing back to the application's main menu and letting you start again). Other apps you may want to kill there and then (but dying gracefully, rolling back transactions etc.) before they do more harm.
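
To sketch what I mean (runSession, showMainMenu and rollbackTransactions are all names I just made up for illustration):

void mainLoop()
{
   while (true)
   {
      try
      {
         runSession();            // one unit of work that may hit bad state
      }
      catch (Exception e)
      {
         showMainMenu(e.msg);     // crash "within" the app: report the
                                  // error and fall back to a sane state
      }
      catch (Error e)
      {
         rollbackTransactions();  // unrecoverable: die gracefully...
         throw e;                 // ...by cleaning up, then going down
      }
   }
}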

Regardless, the above two arguments, crashing vs continuing and the incompetence of some developers, seem to have no bearing on non-nullable. Ignoring the fact that a function with all non-nullable variables could still crash with something other than a null pointer exception, it seems to me that, if anything, non-nullables just make the application crash earlier, when it first receives a null where one is not expected. Consider the version below, implemented "normally".

void FuncOne(Foo foo)
{
   ....
   foo = null;   // The bug
   ....
   FuncTwo(foo);
   ....
}

// Does not expect null
void FuncTwo(Foo foo)
{
   foo.bar();   //null pointer exception
}

A trivial example, but consider that there are chunks of code you can't see that may also use foo, each of which you would need to investigate to see whether it sets foo to null: plenty of code paths to search. Now consider the version with non-nullables.

void FuncOne(Foo? foo)
{
   ....
   foo = null;   // The bug
   ....
   FuncTwo(enforce(foo));   // [2]
   ....
}

// Does not expect null
void FuncTwo(Foo foo)
{
   foo.bar();
}

[2] Here I am guessing at what people mean by enforce. My assumption is that it checks whether foo is null and throws a null pointer exception if so; otherwise it lets the application continue executing, and it also satisfies the compiler check that we are passing a nullable into a non-nullable.
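
In other words I imagine enforce looking something like this (purely my guess at the semantics; NullPointerException and the Foo? syntax come from the proposal being discussed, not from current D):

// My guess at enforce: die at the boundary rather than further down.
Foo enforce(Foo? maybe)
{
   if (maybe is null)
      throw new NullPointerException();   // assumed exception type
   return cast(Foo) maybe;   // assumed nullable-to-non-nullable conversion
}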

So first off, without enforce, [2] would have been a compile error that would have made me investigate this potential bug anyway, whether that means adding an alternate code path or taking a more in-depth look at the design. But let's assume I think execution should never get to that state, because it is not valid for foo to be null at [2] (though it is elsewhere in FuncOne). On execution we get an exception at [2], so we died earlier than we did in the nullable form. Not only do we have less to search (a lot less: the trace gives us less to look at, and any other functions that take only non-nullables can be removed from the code paths to check, making the search area much smaller; sounds productive), but we also killed the application earlier, before it did even more damage (like putting a plane into a dive, maybe?).

I wanted to point that out because I am sure Walter noticed that this just moves the error from one place to another, but I don't think anyone has pointed out that it is a way to identify bad application state earlier (which seems to be the focus of one argument) rather than later. Eventually, code paths that allow nulls turn into ones that don't, because otherwise the variables are pointless; so by telling the compiler where the code turns into a non-null path, you get it to trigger the exceptions early, which I would think is in line with crashing the application as soon as it has bad state.

I also see non-nullables helping track down potential bugs when changing variables from nullable to non-nullable, and letting you remove now-unnecessary null checks when going the other way.

Seems to me non-nullable is in the same sort of area as const/immutable. Whereas immutable or const prevents any data changes from happening, to stop bad application state, non-nullable prevents null from going where it is not allowed, again to stop bad application state, with numerous other productivity benefits.
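
A little sketch of the parallel (the non-nullable behaviour is of course hypothetical):

immutable int limit = 100;   // data cannot change after this point
// limit = 200;              // compile error: bad state via mutation stopped

Foo obj = new Foo("real");   // non-nullable under the proposal
// obj = null;               // compile error: bad state via null stopped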

Are these the ramblings of a sleep-deprived madman getting involved with things he doesn't understand? You decide :-).
September 29, 2009
On Mon, 28 Sep 2009 16:01:10 -0400, Steven Schveighoffer wrote:

> On Mon, 28 Sep 2009 15:35:07 -0400, Jesse Phillips <jesse.k.phillips+d@gmail.com> wrote:
> 
>> language_fan Wrote:
>>
>>> Have you ever used functional languages? When you develop in Haskell or SML, how often do you feel there is a good chance something will be initialized to the wrong value? Can you show some statistics that show how unsafe this practice is?
>>
>> So isn't that the question? Does/can "default" (by human or machine) initialization create an incorrect state? If it does, do we continue to work as if nothing was wrong or crash? I don't know how often the initialization would be incorrect, but I don't think Walter is concerned with its frequency, but that it is possible.
> 
> It creates an invalid, non-compiling program.

No it doesn't; I'm not referring to null as the invalid state.

float a;

In this program it is invalid for 'a' to equal zero. If the compiler complains that it is not initialized, the programmer could fulfill the requirement like so:

float a = 0;

Hopefully the programmer knows that it shouldn't be 0, but a correction like this is still possible: the compiler won't complain and the program won't crash. Depending on what 'a' is controlling, this could be very bad.
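
For instance (a purely illustrative sketch; the names are invented):

float a = 0;                   // satisfies the compiler, violates the spec
float thrust = 100.0f;
float correction = thrust / a; // division by zero: infinity quietly
                               // propagates instead of crashing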

I'm really not arguing either way; I'm just trying to make things clear, since no one seems to be getting Walter's position.

BTW, what is it with people writing

SomeObject foo;

if they believe the compiler should enforce explicit initialization? If you think an object should always be initialized at declaration, don't write a statement that only declares, and don't set references to null.
September 29, 2009
On 2009-09-28 15:36:05 -0400, bearophile <bearophileHUGS@lycos.com> said:

> Compiled with DMD the running time seems about unchanged. I have no idea why. Maybe some of you can tell me.

If I recall correctly, implementing an interface adds a variable to a class which contains a pointer to that interface's vtable implementation for that particular class. An interface pointer points at that variable inside the object (not at the beginning of the object's allocated space), and calling a function on it involves dereferencing the interface's vtable and calling the right function. Obtaining the real "this" pointer for calling the function involves looking at the first value in the interface's vtable, which contains an offset you can subtract from the interface pointer to get the object pointer.

So basically, if I recall well how it works, calling a function on an interface reference involves one more subtraction than calling a member function on a class reference, which is pretty marginal.
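
To sketch the call sequence as I understand it (this is only my mental model, not actual compiler output):

// Assumed layout for a class C implementing interface IA:
//
//   C instance: [ C vtbl ptr | ...fields... | IA vtbl ptr ]
//                                             ^-- an IA reference
//                                                 points here
//
// A call through an IA reference 'ia' then amounts to roughly:
//   void** vtbl   = *cast(void***) ia;        // load IA's vtable for C
//   size_t offset = cast(size_t) vtbl[0];     // first slot: this adjustment
//   void*  self   = cast(void*) ia - offset;  // recover the object pointer
//   // ...followed by an ordinary indirect call through the right vtable
//   // slot, passing 'self' as the real "this".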


-- 
Michel Fortin
michel.fortin@michelf.com
http://michelf.com/

September 29, 2009
language_fan wrote:
> Mon, 28 Sep 2009 15:35:07 -0400, Jesse Phillips thusly wrote:
> 
>> language_fan Wrote:
>>
>>>> Now if you really want to throw some sticks into the spokes, you
>>>> would say that if the program crashes due to a null pointer, it is
>>>> still likely that the programmer will just initialize/set the value
>>>> to a "default" that still isn't valid just to get the program to
>>>> continue to run.
>>> Why should it crash in the first place? I hate crashes. Do you like them?
>>> I can prove by structural induction that you do not like them when you
>>> can avoid crashes with static checking.
>> No one likes programs that crash, but doesn't that mean it is incorrect
>> behavior?
>>
>>> Have you ever used functional languages? When you develop in Haskell or
>>> SML, how often do you feel there is a good chance something will be
>>> initialized to the wrong value? Can you show some statistics that show
>>> how unsafe this practice is?
>> So isn't that the question? Does/can "default" (by human or machine)
>> initialization create an incorrect state? If it does, do we continue to
>> work as if nothing was wrong or crash? I don't know how often the
>> initialization would be incorrect, but I don't think Walter is concerned
>> with its frequency, but that it is possible.
> 
> Value types can be incorrectly initialized and nobody notices. E.g.
> 
>   int min;
> 
>   foreach(int value; list)
>     if (value < min) min = value;
> 
> Oops, you forgot to define a flag variable or initialize to int.min

You mean int.max :o).

Andrei
September 29, 2009
Andrei Alexandrescu wrote:
> Thanks for posting these interesting numbers. I seem to recall that interface dispatch in D does a linear search in the interfaces list, so you may want to repeat your tests with a variable number of interfaces, and a variable position of the interface being used.

No, it is done with one indirection.

interface IA { void foo(); }

interface IB : IA { }

class C : IA { void foo() { } }

void test(C c)
{
    c.foo();
}

========================================

test:
                enter   4,0
                mov     ECX,[EAX]
                call    dword ptr 014h[ECX]
                leave
                ret
September 29, 2009
Andrei Alexandrescu wrote:
> language_fan wrote:
>> Mon, 28 Sep 2009 15:35:07 -0400, Jesse Phillips thusly wrote:
>>
>>> language_fan Wrote:
>>>
>>>>> Now if you really want to throw some sticks into the spokes, you
>>>>> would say that if the program crashes due to a null pointer, it is
>>>>> still likely that the programmer will just initialize/set the value
>>>>> to a "default" that still isn't valid just to get the program to
>>>>> continue to run.
>>>> Why should it crash in the first place? I hate crashes. Do you like them?
>>>> I can prove by structural induction that you do not like them when you
>>>> can avoid crashes with static checking.
>>> No one likes programs that crash, but doesn't that mean it is incorrect
>>> behavior?
>>>
>>>> Have you ever used functional languages? When you develop in Haskell or
>>>> SML, how often do you feel there is a good chance something will be
>>>> initialized to the wrong value? Can you show some statistics that show
>>>> how unsafe this practice is?
>>> So isn't that the question? Does/can "default" (by human or machine)
>>> initialization create an incorrect state? If it does, do we continue to
>>> work as if nothing was wrong or crash? I don't know how often the
>>> initialization would be incorrect, but I don't think Walter is concerned
>>> with its frequency, but that it is possible.
>>
>> Value types can be incorrectly initialized and nobody notices. E.g.
>>
>>   int min;
>>
>>   foreach(int value; list)
>>     if (value < min) min = value;
>>
>> Oops, you forgot to define a flag variable or initialize to int.min
> 
> You mean int.max :o).
> 
> Andrei

He just proved how enforcing initializers can still cause errors! I didn't even think of that one!

:o)
September 29, 2009
Jesse Phillips wrote:
Yeah, it was brought to my attention by a friend that "type safety" could be another form. bearophile also brings up a good example.

<snip>

> I think that is what Walter is getting at: you're not dealing with memory that is correct, and when this happens the program should halt and be dealt with from outside the program.

Type errors and null pointer errors both belong to the same class of errors, namely variables containing bogus contents.  Some languages like Python detect both at runtime.  That's fine for those languages. However, I prefer to detect as many errors as possible at compile time, especially for larger projects.

Nullable types turn compile-time errors into runtime errors which may or may not be detected during testing.  In the worst case, nullable types lead to silent data corruption.  Consider what happens when a bogus null field is serialized.
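
For example, a contrived sketch (the class and serializer are invented for illustration):

// 'owner' silently defaults to null; the compiler never complains.
class Account
{
    string owner;
}

string serialize(Account a)
{
    // A null owner quietly becomes an empty field here. Nothing crashes;
    // the corruption is only discovered when the record is read back.
    return "owner=" ~ a.owner;
}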


-- 
Rainer Deyke - rainerd@eldwood.com
September 29, 2009
On Mon, 28 Sep 2009 19:27:03 -0500, Andrei Alexandrescu wrote:

> language_fan wrote:
>> 
>>   int min;
>> 
>>   foreach(int value; list)
>>     if (value < min) min = value;
>> 
>> Oops, you forgot to define a flag variable or initialize to int.min
> 
> You mean int.max :o).

  if (list.length == 0)
     throw new Exception("empty list"); // An empty or null list has no minimum
  int min = list[0];
  foreach(int value; list[1..$])
    if (value < min) min = value;


I'm still surprised by Walter's stance.

For the purposes of this discussion...
* Null only applies to the memory address portion of reference types and
not to value types. The discussion is not about non-nullable value types.
* There are two types of reference types:
  (1) Those that can be initialized on declaration because the coder knows
what to initialize them to; a.k.a. non-nullable. If the coder does not know
what to initialize them to at declaration time, then either the design is
wrong, the coder doesn't understand the algorithm or application, or it is
truly a complex run-time decision.
  (2) Those that aren't in set (1); a.k.a. nullable.
* The standard declaration should imply non-nullable. And if not
initialized the compiler should complain. This encourages protection, but
does not guarantee it, of course.
* To declare a nullable type, use a special syntax to denote that the coder
is deliberately choosing to declare a nullable reference.
* The compiler will prevent non-nullable types being simply set to null. As
D is a systems language too, there will be rare cases that need to subvert
this compiler protection, so there will need to be a method to explicitly
set a non-nullable type to null; a sketch follows this list. The point is
that such a method should be a visible warning beacon to maintenance coders.
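
A sketch of the above in code; every bit of syntax here is illustrative
only, not an actual proposal:

  Foo a = new Foo();  // (1) standard declaration: non-nullable; the
                      //     compiler complains if it is not initialized
  Foo? b;             // (2) special syntax: deliberately nullable
  a = null;           // compile-time error
  a = cast(Foo) b;    // the rare, explicit subversion: a visible
                      //     warning beacon to maintenance coders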

Priority should be given to coders that prefer safe coding. If a coder, for whatever reason, chooses to use nullable references or to initialize non-nullable references with rubbish data, then the responsibility is on them to ensure safe applications. Safe coding practices should not be penalized.

The C and C++ programming languages are inherently "unsafe" in this regard, and that is not news to anyone. The D programming language does not have to follow this paradigm.

I'm still not ready to use D for anything, but I watch it in hope.

-- 
Derek Parnell
Melbourne, Australia
skype: derek.j.parnell
September 29, 2009
On 29/09/2009 00:31, Nick Sabalausky wrote:
> "Yigal Chripun"<yigal100@gmail.com>  wrote in message
> news:h9r37i$tgl$1@digitalmars.com...
>>
>>>
>>> These aren't just marginal performance gains, they can easily be up to
>>> 15-30% improvements, sometimes 50% and more. If this is too complex or
>>> the risk is too high for you then don't use a systems language :)
>>
>> Your approach makes sense if you are implementing, say, a calculator.
>> It doesn't scale to larger projects. Even C++ has overhead compared to
>> assembly, yet you are writing performance-critical code in C++, right?
>>
>
> It's *most* important on larger projects, because it's only on big systems
> where small inefficiencies actually add up to a large performance drain.
>
> Try writing a competitive real-time graphics renderer or physics simulator
> (especially for a game console where you're severely limited in your choice
> of compiler - if you even have a choice), or something like Pixar's renderer
> without *ever* diving into asm, or at least low-level "unsafe" code. And
> when it inevitably hits some missing optimization in the compiler and runs
> like shit, try explaining to the dev lead why it's better to beg the
> compiler vendor to add the optimization you want and wait around hoping they
> finally do so, instead of just throwing in that inner optimization in the
> meantime.
>
> You can still leave the safe/portable version in there for platforms for
> which you haven't provided a hand-optimization. And unless you didn't know
> what you were doing, that inner optimization will still be small and highly
> isolated. And since it's so small and isolated, not only can you still throw
> in tests for it, but it's not as much harder as you would think to veryify
> correctness. And if/when your compiler finally does get the optimization you
> want, you can just rip out the hand-optimization and revert back to that
> "safe/portable" version that you had still left in anyway as a fallback.
>
>

I think you took my post to an extreme; I actually do agree with the above description.

What you just said was basically:
1. write the portable/safe version;
2. profile to find bottlenecks that the tools can't optimize, and optimize only those, while still keeping the portable version.

My objection was to what I feel was Jeremie's description of writing code from the get-go in a low-level, hand-optimized way, instead of what you described in your own words:

> And unless you didn't know
> what you were doing, that inner optimization will still be small and highly
> isolated.
September 29, 2009
Walter Bright wrote:
> Denis Koroskin wrote:
>  > On Sat, 26 Sep 2009 22:30:58 +0400, Walter Bright
>  > <newshound1@digitalmars.com> wrote:
>  >> D has borrowed ideas from many different languages. The trick is to
>  >> take the good stuff and avoid their mistakes <g>.
>  >
>  > How about this one:
>  > http://sadekdrobi.com/2008/12/22/null-references-the-billion-dollar-mistake/ 
> 
>  >
>  >
>  > :)
> 
> I think he's wrong.
> 
> Getting rid of null references is like solving the problem of dead canaries in the coal mines by replacing them with stuffed toys.

Let's go back a step. The problem being addressed is this: inadvertent null references are an EXTREMELY common bug in D. For example, it's a bug which *every* C++ refugee gets hit by. I have experienced it ridiculously often in D.

*** The problem of null references is an order of magnitude worse in D than in C++, because classes in D use reference semantics. ***

Eliminating that category of bug at compile time would have a huge benefit for code quality. "Non-nullable references by default" is just a proposed solution. Maybe if D had better flow analysis, the demand for non-nullable references wouldn't be so great.
(Neither is a pure subset of the other: flow analysis works for all variables, while non-nullable references catch more complex logic errors. But there is a very significant overlap.)
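
For example (a sketch: 'cond' is a placeholder, and the non-nullable behaviour is the proposed one, not current D):

// Flow analysis alone could catch this use of a possibly uninitialized
// reference:
Foo f;
if (cond)
    f = new Foo;
f.bar();          // error on the path where cond is false

// But only a non-nullable type catches null flowing in at runtime:
void g(Foo f)     // f non-nullable by default under the proposal
{
    f.bar();      // safe: g(null) is rejected at the call site
}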

Interestingly, while working on CTFE, I noticed that the CTFE code has a lot in common with flow analysis. I can easily imagine the same code being reused.