September 27, 2009
Re: Null references redux
Andrei Alexandrescu wrote:
> downs wrote:
>> Walter Bright wrote:
>>> Nick Sabalausky wrote:
>>>
>>> I agree with you that if the compiler can detect null dereferences at
>>> compile time, it should.
>>>
>>>
>>>>> Also, by "safe" I presume you mean "memory safe" which means free of
>>>>> memory corruption. Null pointer exceptions are memory safe. A null
>>>>> pointer could be caused by memory corruption, but it cannot *cause*
>>>>> memory corruption.
>>>> No, he's using the real meaning of "safe", not the
>>>> misleadingly-limited "SafeD" version of "safe" (which I'm still
>>>> convinced is going to get some poor soul into serious trouble from
>>>> mistakenly thinking their SafeD program is much safer than it really
>>>> is). Out here in reality, "safe" also means a lack of ability to
>>>> crash, or at least some level of protection against it. 
>>> Memory safety is something that can be guaranteed (presuming the
>>> compiler is correctly implemented). There is no way to guarantee that a
>>> non-trivial program cannot crash. It's the old halting problem.
>>>
>>
>> Okay, I'm gonna have to call you out on this one because it's simply 
>> incorrect.
>>
>> The halting problem deals with a valid program state - halting.
>>
>> We cannot check if every program halts because halting is an 
>> instruction that must be allowed at almost any point in the program.
>>
>> Why do crashes have to be allowed? They're not an allowed instruction!
>>
>> A compiler can be Turing complete and still not allow crashes. There 
>> is nothing wrong with this, and it has *nothing* to do with the 
>> halting problem.
>>
>>>> You seem to be under the impression that nothing can be made
>>>> uncrashable without introducing the possibility of corrupted state.
>>>> That's hogwash.
>>> I read that statement several times and I still don't understand what it
>>> means.
>>>
>>> BTW, hardware null pointer checking is a safety feature, just like array
>>> bounds checking is.
>>
>> PS: You can't convert segfaults into exceptions under Linux, as far as 
>> I know.
> 
> How did Jeremie do that?
> 
> Andrei

A signal handler with the undocumented kernel parameters attaches the 
signal context to the exception object, repairs the stack frame forged 
by the kernel to make us believe we called the handler ourselves, does a 
backtrace right away and attaches it to the exception object, and then 
throws it.

The error handling code then unwinds down to the runtime's main(), where a 
catch clause is waiting for any Throwables and sends them back into the 
unhandled exception handler; a crash window appears with the backtrace, 
all finally blocks are executed, and the program shuts down gracefully.

All I still need is an ELF/DWARF reader to extract symbolic debug info 
under Linux; it's already working for PE/CodeView on Windows.

Jeremie
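
A minimal sketch of the general approach (not Jeremie's actual runtime 
code), assuming druntime's core.sys.posix.signal bindings: install a 
SIGSEGV handler and rethrow the fault as a D Error. For brevity it throws 
straight out of the handler and ignores the GC-allocation caveats; the 
technique Jeremie describes instead repairs the kernel-forged frame so the 
faulting code appears to call a throwing helper, and captures a backtrace 
before throwing.

import core.sys.posix.signal;

class SegfaultError : Error
{
    void* faultAddress;  // address that triggered the fault, from siginfo

    this(void* addr)
    {
        super("segmentation fault");
        faultAddress = addr;
    }
}

extern (C) void handleSegv(int sig, siginfo_t* info, void* context)
{
    // A production version would patch the ucontext_t in `context` and
    // attach a backtrace here, as described above.
    throw new SegfaultError(info ? info.si_addr : null);
}

void installSegfaultHandler()
{
    sigaction_t sa;
    sa.sa_sigaction = &handleSegv;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGSEGV, &sa, null);
}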
September 27, 2009
Re: Null references redux
Jesse Phillips:

>The thing is that memory safety is the only safety with code.<

Nope. For example in Delphi and C# you can have runtime integer overflow errors. That's another kind of safety.
If you look at safety-critical code, the kind Walter was talking about, you see people test code (and compile-time check it) very thoroughly, looking for an enormous number of possible errors. Doing this increases code safety. So you can have ABS brakes, CT (TAC) machines in hospitals, automatic pilots and so on.

Bye,
bearophile
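
A concrete illustration of that kind of safety, as a hedged sketch in D 
rather than Delphi or C#: druntime's core.checkedint (a later addition to 
the runtime, used here only to show the idea) reports overflow instead of 
silently wrapping, much like C#'s checked blocks or Delphi's 
overflow-checking switch do implicitly.

import core.checkedint : adds;
import std.stdio : writeln;

int checkedAdd(int x, int y)
{
    bool overflow = false;
    immutable sum = adds(x, y, overflow); // sets `overflow` instead of wrapping
    if (overflow)
        throw new Exception("integer overflow in checkedAdd");
    return sum;
}

void main()
{
    writeln(checkedAdd(1, 2));       // 3
    writeln(checkedAdd(int.max, 1)); // throws instead of wrapping to int.min
}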
September 27, 2009
Re: Null references redux
Sun, 27 Sep 2009 12:35:23 -0400, Jeremie Pelletier thusly wrote:

> language_fan wrote:
>> Sun, 27 Sep 2009 00:08:50 -0400, Jeremie Pelletier thusly wrote:
>> 
>>> Ary Borenszweig wrote:
>>>> Just out of curiosity: have you ever programmed in Java or C#?
>>> Nope, never got interested in these to tell the truth. I only did C,
>>> C++, D and x86 assembly in systems programming, I have quite a
>>> background in PHP and JavaScript also.
>> 
>> So you only know imperative procedural programming + some features of
>> hybrid OOP languages that are not even proper OOP languages.
> 
> This is what I know best, yeah. I did a lot of work in functional
> programming too, but not enough to add them to the above list.
> 
> What is proper OOP anyways? It's a feature offered by the language, not
> a critical design that must obey some strict standard rules.  Be it
> class based or prototype based, supporting single or multiple
> inheritance, using abstract base classes or interfaces, having funny
> syntax for ctors and whatnot or using the class name or even 'this', it's
> still OOP. If you want to call me on not knowing 15 languages like you
> do, I have to call you on not knowing the differences in OOP models.

I must say I have not studied languages that much, only the concepts and 
theory - starting from formal definitions like operational or 
denotational semantics, and some more informal ones. I can professionally 
write code in only about half a dozen languages, but learning new ones is 
trivial if the task requires it.

Generally the common thing for proper pure OOP languages is 'everything 
is an object' mentality. Because of this property there is no strict 
distinction between primitive non-OOP types and OOP types in pure OOP 
languages. In some languages e.g. number values are objects. In others 
there are no static members and even classes are objects, so called meta-
objects. In some way you can see this purity even in UML. If we go into 
details, various OOP languages have major differences in their semantics.

What I meant above is that I know a lot of developers who have a similar 
background as you do. It is really easy to use all of those languages 
without actually using the OOP features in them, at least properly (for 
instance PHP does not even have a real OOP system, it is a cheap rip-off 
of mainstream languages - just look at the scoping rules). I have seen 
Java code where the developer never constructs new objects and only uses 
static methods because he fears the heap allocation is expensive. 
Discussing OOP and language concepts is really hard if you lack the 
theoretical underpinning. It is sad to say this, but the best sources for 
this knowledge are academic CS books; nowadays even Wikipedia is 
starting to have good articles on the subject.
September 27, 2009
Re: Null references redux
language_fan wrote:
> Sun, 27 Sep 2009 12:35:23 -0400, Jeremie Pelletier thusly wrote:
> 
>> language_fan wrote:
>>> Sun, 27 Sep 2009 00:08:50 -0400, Jeremie Pelletier thusly wrote:
>>>
>>>> Ary Borenszweig wrote:
>>>>> Just out of curiosity: have you ever programmed in Java or C#?
>>>> Nope, never got interested in these to tell the truth. I only did C,
>>>> C++, D and x86 assembly in systems programming, I have quite a
>>>> background in PHP and JavaScript also.
>>> So you only know imperative procedural programming + some features of
>>> hybrid OOP languages that are not even proper OOP languages.
>> This is what I know best, yeah. I did a lot of work in functional
>> programming too, but not enough to add them to the above list.
>>
>> What is proper OOP anyways? It's a feature offered by the language, not
>> a critical design that must obey some strict standard rules.  Be it
>> class based or prototype based, supporting single or multiple
>> inheritance, using abstract base classes or interfaces, having funny
>> syntax for ctors and whatnot or using the class name or even 'this', it's
>> still OOP. If you want to call me on not knowing 15 languages like you
>> do, I have to call you on not knowing the differences in OOP models.
> 
> I must say I have not studied languages that much, only the concepts and 
> theory - starting from formal definitions like operational or 
> denotational semantics, and some more informal ones. I can professionally 
> write code in only about half a dozen languages, but learning new ones is 
> trivial if the task requires it.
> 
> Generally the common thing for proper pure OOP languages is 'everything 
> is an object' mentality. Because of this property there is no strict 
> distinction between primitive non-OOP types and OOP types in pure OOP 
> languages. In some languages e.g. number values are objects. In others 
> there are no static members and even classes are objects, so called meta-
> objects. In some way you can see this purity even in UML. If we go into 
> details, various OOP languages have major differences in their semantics.
> 
> What I meant above is that I know a lot of developers who have a similar 
> background as you do. It is really easy to use all of those languages 
> without actually using the OOP features in them, at least properly (for 
> instance PHP does not even have a real OOP system, it is a cheap rip-off 
> of mainstream languages - just look at the scoping rules). I have seen 
> Java code where the developer never constructs new objects and only uses 
> static methods because he fears the heap allocation is expensive. 
> Discussing OOP and language concepts is really hard if you lack the 
> theoretical underpinning. It is sad to say this, but the best sources for 
> this knowledge are academic CS books; nowadays even Wikipedia is 
> starting to have good articles on the subject.

I agree, Wikipedia is often the first source I check to learn about 
different concepts; then I search for online papers and documentation, 
dig into source code (Google's code search is a gem), and finally books.

I'm not most programmers, and I'm sure you aren't either. I like to 
learn as much of the semantics and implementation details behind a 
language as I can; only then do I feel I know the language. I like to 
make the best out of everything in the languages I use, not specialize 
in a subset of them.

I don't believe in a perfect programming model. I believe in many 
different models, each having their pros and cons, that can live in the 
same language and form an all-around solution. That's why I usually stay 
away from 'pure' languages: they impose a single point of view of 
the world. That doesn't mean it's a bad one, I just like to look at the 
world from different angles at the same time.
September 27, 2009
Re: Null references redux
Michel Fortin wrote:
> On 2009-09-27 07:38:59 -0400, Christopher Wright <dhasenan@gmail.com> said:
> 
>> I dislike these forced checks.
>>
>> Let's say you're dealing with a compiler frontend. You have a semantic 
>> node that just went through some semantic pass and is guaranteed, by 
>> flow control and contracts, to have a certain property initialized 
>> that was not initialized prior to that point.
>>
>> The programmer knows the value isn't null. The compiler shouldn't 
>> force checks. At most, it should have automated checks that disappear 
>> with -release.
> 
> If the programmer knows a value isn't null, why not put the value in a 
> non-nullable reference in the first place?

It may not be nonnull for the entire lifetime of the reference.

>> Also, it introduces more nesting.
> 
> Yes and no. It introduces an "if" statement for null checking, but only 
> for nullable references. If you know your reference can't be null it 
> should be non-nullable, and then you don't need to check.

I much prefer explicit null checks to implicit ones I can't control.

>> Also, unless the compiler's flow analysis is great, it's a nuisance -- 
>> you can see that the error is bogus and have to insert extra checks.
> 
> First you're right, if the feature is implemented it should be well 
> implemented. Second, if in a few place you don't want an "if" clause, 
> you can always cast your nullable reference to a non-nullable one, 
> explicitly bypassing the safeties. If you write a cast, you are making a 
> conscious decision of not checking for null, which is much better than 
> the current situation where it's very easy to forget to check for null.

That's just adding useless verbosity to the language.

>> It should be fine to provide a requireNotNull template and leave it at 
>> that.
> 
> It's fine to have such a template. But it's not nearly as useful.

It definitely is; the whole point is about reference initializations, 
not what they can or can't initialize to.

What about non-NaN floats? Or non-invalid characters? I fear nonnull 
references are a first step in the wrong direction. The focus should be 
on implementing variable initialization checks in the compiler, since 
this solves the issue with any variable, not just references. The flow 
analysis can also be reused for many other optimizations.
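
For illustration, a minimal sketch in D of the two things being debated: a 
hypothetical NotNull wrapper (checked once, at construction, never again) 
and the requireNotNull-style helper Christopher mentioned. Neither name 
exists in D; this is only to make the trade-off concrete.

import std.exception : enforce;

struct NotNull(T) if (is(T == class))
{
    private T payload;

    @disable this();          // no way to default-construct a null payload

    this(T value)
    {
        enforce(value !is null, "null passed to NotNull");
        payload = value;
    }

    // Forward member access to the wrapped reference.
    @property inout(T) get() inout { return payload; }
    alias get this;
}

// The "requireNotNull template": an explicit, opt-in check at the use site.
T requireNotNull(T)(T value, string msg = "unexpected null reference")
    if (is(T == class))
{
    enforce(value !is null, msg);
    return value;
}

unittest
{
    auto obj = new Object;
    auto nn  = NotNull!Object(obj);     // checked here, never again
    assert(nn.toHash() == obj.toHash());
    assert(requireNotNull(obj) is obj); // would throw if obj were null
}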
September 27, 2009
Re: Null references redux
Rainer Deyke wrote:
> OT, but declaring the variable at the top of the function increases
> stack size.
> 
> Example with changed variable names:
> 
>   void bar(bool foo) {
>     if (foo) {
>       int a = 1;
>     } else {
>       int b = 2;
>     }
>     int c = 3;
>   }
> 
> In this example, there are clearly three different (and differently
> named) variables, but their lifetimes do not overlap.  Only one variable
> can exist at a time, therefore the compiler only needs to allocate space
> for one variable.  Now, if you move your declaration to the top:
> 
>   void bar(bool foo) {
>     int a = void;
>     if (foo) {
>       a = 1;
>     } else {
>       a = 2; // Reuse variable.
>     }
>     int c = 3;
>   }
> 
> You now only have two variables, but both of them coexist at the end of
> the function.  Unless the compiler applies a clever optimization, the
> compiler is now forced to allocate space for two variables on the stack.

Not necessarily. The optimizer uses a technique called "live range 
analysis" to determine if two variables have non-overlapping ranges. It 
uses this for register assignment, but it could just as well be used for 
minimizing stack usage.
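
A toy illustration of the idea, not dmd's actual implementation: 
approximate each variable's live range as the span from its first 
definition to its last use; if two ranges never overlap, the variables can 
share a register or stack slot.

struct LiveRange
{
    size_t firstDef; // instruction index of the first definition
    size_t lastUse;  // instruction index of the last use
}

bool overlaps(LiveRange x, LiveRange y)
{
    return x.firstDef <= y.lastUse && y.firstDef <= x.lastUse;
}

unittest
{
    // Rough instruction numbering for the rewritten bar(bool) above:
    //   0: a = 1 (then branch)   1: a = 2 (else branch)
    //   2: c = 3                 3: last use of c
    auto a = LiveRange(0, 1);
    auto c = LiveRange(2, 3);
    assert(!overlaps(a, c)); // disjoint ranges: one stack slot can hold both
}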
September 27, 2009
Re: Null references redux
On Sun, Sep 27, 2009 at 2:07 PM, Jeremie Pelletier <jeremiep@gmail.com> wrote:

>> Yes and no. It introduces an "if" statement for null checking, but only
>> for nullable references. If you know your reference can't be null it should
>> be non-nullable, and then you don't need to check.
>
> I much prefer explicit null checks to implicit ones I can't control.

Nonnull types do not create implicit null checks. Nonnull types DO NOT
need to be checked. And nullable types WOULD force explicit null
checks.

> What about non-NaN floats? Or non-invalid characters? I fear nonnull
> references are a first step in the wrong direction. The focus should be
> on implementing variable initialization checks in the compiler, since
> this solves the issue with any variable, not just references. The flow
> analysis can also be reused for many other optimizations.

hash_t foo(Object o) { return o.toHash(); }
foo(null); // bamf, I just killed your function.

Forcing initialization of locals does NOT solve all the problems that
nonnull references would.
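
A self-contained toy showing that point, assuming a hypothetical NotNull 
wrapper like the sketch earlier in the thread (again, not a real D type): 
the crashing call no longer compiles, and foo itself needs no runtime null 
test.

struct NotNull(T) if (is(T == class))
{
    T payload;
    this(T value) { assert(value !is null); payload = value; }
    alias payload this;   // forward member access to the wrapped reference
}

size_t foo(NotNull!Object o) { return o.toHash(); } // no null check needed

void example()
{
    // foo(null);                    // rejected: null does not convert to NotNull!Object
    foo(NotNull!Object(new Object)); // the only way in is through the checked ctor
}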
September 27, 2009
Re: Null references redux
On 27/09/2009 19:29, Jeremie Pelletier wrote:
> Andrei Alexandrescu wrote:
>> downs wrote:
>>> Walter Bright wrote:
>>>> Nick Sabalausky wrote:
>>>>
>>>> I agree with you that if the compiler can detect null dereferences at
>>>> compile time, it should.
>>>>
>>>>
>>>>>> Also, by "safe" I presume you mean "memory safe" which means free of
>>>>>> memory corruption. Null pointer exceptions are memory safe. A null
>>>>>> pointer could be caused by memory corruption, but it cannot *cause*
>>>>>> memory corruption.
>>>>> No, he's using the real meaning of "safe", not the
>>>>> misleadingly-limited "SafeD" version of "safe" (which I'm still
>>>>> convinced is going to get some poor soul into serious trouble from
>>>>> mistakenly thinking their SafeD program is much safer than it really
>>>>> is). Out here in reality, "safe" also means a lack of ability to
>>>>> crash, or at least some level of protection against it.
>>>> Memory safety is something that can be guaranteed (presuming the
>>>> compiler is correctly implemented). There is no way to guarantee that a
>>>> non-trivial program cannot crash. It's the old halting problem.
>>>>
>>>
>>> Okay, I'm gonna have to call you out on this one because it's simply
>>> incorrect.
>>>
>>> The halting problem deals with a valid program state - halting.
>>>
>>> We cannot check if every program halts because halting is an
>>> instruction that must be allowed at almost any point in the program.
>>>
>>> Why do crashes have to be allowed? They're not an allowed instruction!
>>>
>>> A compiler can be Turing complete and still not allow crashes. There
>>> is nothing wrong with this, and it has *nothing* to do with the
>>> halting problem.
>>>
>>>>> You seem to be under the impression that nothing can be made
>>>>> uncrashable without introducing the possibility of corrupted state.
>>>>> That's hogwash.
>>>> I read that statement several times and I still don't understand
>>>> what it
>>>> means.
>>>>
>>>> BTW, hardware null pointer checking is a safety feature, just like
>>>> array
>>>> bounds checking is.
>>>
>>> PS: You can't convert segfaults into exceptions under Linux, as far
>>> as I know.
>>
>> How did Jeremie do that?
>>
>> Andrei
>
> A signal handler with the undocumented kernel parameters attaches the
> signal context to the exception object, repairs the stack frame forged
> by the kernel to make us believe we called the handler ourselves, does a
> backtrace right away and attaches it to the exception object, and then
> throws it.
>
> The error handling code then unwinds down to the runtime's main(), where a
> catch clause is waiting for any Throwables and sends them back into the
> unhandled exception handler; a crash window appears with the backtrace,
> all finally blocks are executed, and the program shuts down gracefully.
>
> All I still need is an ELF/DWARF reader to extract symbolic debug info
> under Linux; it's already working for PE/CodeView on Windows.
>
> Jeremie

Is this Linux-specific? What about other *nix systems, like BSD and 
Solaris?
September 27, 2009
Re: Null references redux
Walter Bright wrote:
>>   void bar(bool foo) {
>>     int a = void;
>>     if (foo) {
>>       a = 1;
>>     } else {
>>       a = 2; // Reuse variable.
>>     }
>>     int c = 3;
>>   }
>>
>> You now only have two variables, but both of them coexist at the end of
>> the function.  Unless the compiler applies a clever optimization, the
>> compiler is now forced to allocate space for two variables on the stack.
> 
> Not necessarily. The optimizer uses a technique called "live range
> analysis" to determine if two variables have non-overlapping ranges. It
> uses this for register assignment, but it could just as well be used for
> minimizing stack usage.

That's the optimization I was referring to.  It works for ints, but not
for RAII types.  It also doesn't (necessarily) work if you reorder the
function:

  void bar(bool foo) {
    int a = void;
    int c = 3;
    if (foo) {
      a = 1;
    } else {
      a = 2; // Reuse variable.
    }
  }

Of course, a good optimizer can still reorder the declarations in this
case, or even eliminate the whole function body (since it doesn't do
anything).


-- 
Rainer Deyke - rainerd@eldwood.com
September 27, 2009
Re: Null references redux
Jeremie Pelletier wrote:
> Walter Bright wrote:
>> They are completely independent variables. One may get assigned to a
>> register, and not the other.
> 
> Ok, that's what I thought, so the good old C way of declaring variables
> at the top is not a bad thing yet :)

Strange how you can look at the evidence and arrive at exactly the wrong
conclusion.  Declaring variables as close as possible to where they are
used can reduce stack usage, and never increases it.

-- 
Rainer Deyke - rainerd@eldwood.com