September 27, 2009
Jesse Phillips wrote:
> The thing is that memory safety is the only safety with code.

That is such bullshit.  For example, this:

  class A {
  }

  class B {
  }

  A x = new B;

No memory access violation (yet).  Clearly incorrect.  Detecting this at compile time is clearly a safety feature, and a good one.

You could argue that assigning a 'B' to a variable that is declared to hold an 'A' is already a memory safety violation.  If so, then the exact same argument also applies to assigning 'null' to the same variable.
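
For comparison, the compiler already rejects the assignment above at compile time, yet it accepts this without complaint:

  A y = null;   // compiles fine, even though 'y' holds no valid A either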


-- 
Rainer Deyke - rainerd@eldwood.com
September 27, 2009
Rainer Deyke:

> Of course, a good optimizer can still reorder the declarations in this case, or even eliminate the whole function body (since it doesn't do anything).

LLVM has a good optimizer. If you try the LLVM demo on C code with LTO activated: http://llvm.org/demo/index.cgi

This C code:

   void bar(int foo) {
     int a;
     int c = 3;
     if (foo) {
       a = 1;
     } else {
       a = 2;
     }
   }

Produces a useful warning:
/tmp/webcompile/_16254_0.c:3: warning: unused variable 'c'

And an empty function:

define void @bar(i32 %foo) nounwind readnone {
entry:
	ret void
}

Bye,
bearophile
September 27, 2009
Jeremie Pelletier:

> The focus should be on implementing variable initialization checks in the compiler, since this solves the issue with any variable, not just references. The flow analysis can also be reused for many other optimizations.

Are you willing to help implement about 5-10% of this feature? :-)

Bye,
bearophile
September 27, 2009
Hello Lutger,

> The answer may
> depend on [...]
> the habits of the 'programmers' in question, I don't know.
> 

If you can't trust the programmer to write good code, replace them with someone you can trust. There will never be a usable language that can take in garbage and spit out correct programs.


September 27, 2009
Yigal Chripun wrote:
> On 27/09/2009 19:29, Jeremie Pelletier wrote:
>> Andrei Alexandrescu wrote:
>>> downs wrote:
>>>> Walter Bright wrote:
>>>>> Nick Sabalausky wrote:
>>>>>
>>>>> I agree with you that if the compiler can detect null dereferences at
>>>>> compile time, it should.
>>>>>
>>>>>
>>>>>>> Also, by "safe" I presume you mean "memory safe" which means free of
>>>>>>> memory corruption. Null pointer exceptions are memory safe. A null
>>>>>>> pointer could be caused by memory corruption, but it cannot *cause*
>>>>>>> memory corruption.
>>>>>> No, he's using the real meaning of "safe", not the
>>>>>> misleadingly-limited "SafeD" version of "safe" (which I'm still
>>>>>> convinced is going to get some poor soul into serious trouble from
>>>>>> mistakenly thinking their SafeD program is much safer than it really
>>>>>> is). Out here in reality, "safe" also means a lack of ability to
>>>>>> crash, or at least some level of protection against it.
>>>>> Memory safety is something that can be guaranteed (presuming the
>>>>> compiler is correctly implemented). There is no way to guarantee that a
>>>>> non-trivial program cannot crash. It's the old halting problem.
>>>>>
>>>>
>>>> Okay, I'm gonna have to call you out on this one because it's simply
>>>> incorrect.
>>>>
>>>> The halting problem deals with a valid program state - halting.
>>>>
>>>> We cannot check if every program halts because halting is an
>>>> instruction that must be allowed at almost any point in the program.
>>>>
>>>> Why do crashes have to be allowed? They're not an allowed instruction!
>>>>
>>>> A compiler can be turing complete and still not allow crashes. There
>>>> is nothing wrong with this, and it has *nothing* to do with the
>>>> halting problem.
>>>>
>>>>>> You seem to be under the impression that nothing can be made
>>>>>> uncrashable without introducing the possibility of corrupted state.
>>>>>> That's hogwash.
>>>>> I read that statement several times and I still don't understand
>>>>> what it
>>>>> means.
>>>>>
>>>>> BTW, hardware null pointer checking is a safety feature, just like
>>>>> array
>>>>> bounds checking is.
>>>>
>>>> PS: You can't convert segfaults into exceptions under Linux, as far
>>>> as I know.
>>>
>>> How did Jeremie do that?
>>>
>>> Andrei
>>
>> A signal handler with the undocumented kernel parameters attaches the
>> signal context to the exception object, repairs the stack frame forged
>> by the kernel to make us believe we called the handler ourselves, does a
>> backtrace right away and attaches it to the exception object, and then
>> throws it.
>>
>> The error handling code will unwind down to the runtime's main() where a
>> catch clause is waiting for any Throwables, sending them back into the
>> unhandled exception handler, and a crash window appears with the
>> backtrace, all finally blocks executed, and gracefully shutting down.
>>
>> All I need to do is an ELF/DWARF reader to extract symbolic debug info
>> under linux; it's already working for PE/CodeView on windows.
>>
>> Jeremie
> 
> Is this Linux specific? What about other *nix systems, like BSD and Solaris?

Signal handlers are standard on most *nix platforms since they're part of the POSIX C standard library; some platforms may require special handling, but nothing impossible to do.
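
For the curious, the core of it is just a sigaction handler. Here is a heavily simplified sketch using druntime's POSIX bindings (names are made up; the real code also captures the signal context and repairs the interrupted stack frame before throwing, as described above):

import core.stdc.signal;       // SIGSEGV
import core.sys.posix.signal;  // sigaction, sigaction_t, siginfo_t, SA_SIGINFO

// Simplified: the real handler also fixes up the frame forged by the
// kernel so the exception can unwind out of the handler safely.
extern (C) void onSegfault(int sig, siginfo_t* info, void* context)
{
    throw new Error("Access violation");
}

void installSegfaultHandler()
{
    sigaction_t action;
    action.sa_flags = SA_SIGINFO;
    action.sa_sigaction = &onSegfault;
    sigaction(SIGSEGV, &action, null);
}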
September 27, 2009
Jarrett Billingsley wrote:
> On Sun, Sep 27, 2009 at 2:07 PM, Jeremie Pelletier <jeremiep@gmail.com> wrote:
> 
>>> Yes and no. It introduces an "if" statement for null checking, but only
>>> for nullable references. If you know your reference can't be null it should
>>> be non-nullable, and then you don't need to check.
>> I much prefer explicit null checks than implicit ones I can't control.
> 
> Nonnull types do not create implicit null checks. Nonnull types DO NOT
> need to be checked. And nullable types WOULD force explicit null
> checks.

Forcing checks on nullables is just as bad; not all nullables need to be checked every time they're used.

>> What about non-nan floats? Or non-invalid characters? I fear nonnull
>> references are a first step in the wrong direction. The focus should be
>> on implementing variable initialization checks in the compiler, since
>> this solves the issue with any variable, not just references. The flow
>> analysis can also be reused for many other optimizations.
> 
> hash_t foo(Object o) { return o.toHash(); }
> foo(null); // bamf, I just killed your function.
> 
> Forcing initialization of locals does NOT solve all the problems that
> nonnull references would.

You didn't kill my function; you shot yourself in the foot. Something trivial to debug.
September 27, 2009
bearophile wrote:
> Jeremie Pelletier:
> 
>> The focus should be on implementing variable initialization checks in the compiler, since this solves the issue with any variable, not just references. The flow analysis can also be reused for many other optimizations.
> 
> Are you willing to help implement about 5-10% of this feature? :-)
> 
> Bye,
> bearophile

Sure, I would love to help implement flow analysis. I don't know enough about the current dmd semantic analysis internals yet, but I'm slowly getting there.
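
For instance, the kind of code such a check would have to flag (made-up example; this compiles without complaint today):

void example(bool cond)
{
    Object obj;              // declared, never initialized on one path
    if (cond)
        obj = new Object;
    obj.toString();          // flagged under the proposed check: 'obj'
                             // may be used before being assigned
}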

Jeremie
September 27, 2009
On Sun, Sep 27, 2009 at 3:42 PM, Jeremie Pelletier <jeremiep@gmail.com> wrote:
> Jarrett Billingsley wrote:
>> Nonnull types do not create implicit null checks. Nonnull types DO NOT need to be checked. And nullable types WOULD force explicit null checks.
>
> Forcing checks on nullables is just as bad; not all nullables need to be checked every time they're used.

You don't get it, do you. If you have a reference that doesn't need to be checked every time it's used, you make it a *nonnull reference*. You *only* use nullable variables for references where the nullness of the reference should change the program logic.

And if you're talking about things like:

Foo? x = someFunc();

if(x is null)
{
    // one path
}
else
{
    // use x here
}

and you're expecting the "use x here" clause to force you to do (cast(Foo)x) every time you want to use x? That's not the case. The condition of the if has *proven* x to be nonnull in the else clause, so no null checks - at compile time or at runtime - have to be performed there, nor does it have to be cast to a nonnull reference.

And if you have a nullable reference that you know is not null for the rest of the function? Just put "assert(x !is null)" and everything that follows will assume it's not null.
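
In the same hypothetical syntax as above:

Foo? y = someFunc();
assert(y !is null);
y.toString(); // no null check needed past the assert, at compile time or runtime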

>> hash_t foo(Object o) { return o.toHash(); }
>> foo(null); // bamf, I just killed your function.
>>
>> Forcing initialization of locals does NOT solve all the problems that nonnull references would.
>
> You didn't kill my function; you shot yourself in the foot. Something trivial to debug.

You're dodging. You claim that forcing variable initialization solves the same problem that nonnull references do. It doesn't.
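
To spell it out, with a made-up maybeGetObject() that may return null:

Object o = maybeGetObject(); // 'o' is explicitly initialized...
foo(o);                      // ...and the call can still blow up at runtime

Mandatory initialization is satisfied; only a nonnull parameter type on foo would reject this at compile time.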
September 27, 2009
Jeremie Pelletier wrote:
> Andrei Alexandrescu wrote:
>> downs wrote:
>>> Walter Bright wrote:
>>>> Nick Sabalausky wrote:
>>>>
>>>> I agree with you that if the compiler can detect null dereferences at compile time, it should.
>>>>
>>>>
>>>>>> Also, by "safe" I presume you mean "memory safe" which means free of memory corruption. Null pointer exceptions are memory safe. A null pointer could be caused by memory corruption, but it cannot *cause* memory corruption.
>>>>> No, he's using the real meaning of "safe", not the misleadingly-limited "SafeD" version of "safe" (which I'm still convinced is going to get some poor soul into serious trouble from mistakenly thinking their SafeD program is much safer than it really is). Out here in reality, "safe" also means a lack of ability to crash, or at least some level of protection against it.
>>>> Memory safety is something that can be guaranteed (presuming the compiler is correctly implemented). There is no way to guarantee that a non-trivial program cannot crash. It's the old halting problem.
>>>>
>>>
>>> Okay, I'm gonna have to call you out on this one because it's simply incorrect.
>>>
>>> The halting problem deals with a valid program state - halting.
>>>
>>> We cannot check if every program halts because halting is an instruction that must be allowed at almost any point in the program.
>>>
>>> Why do crashes have to be allowed? They're not an allowed instruction!
>>>
>>> A compiler can be turing complete and still not allow crashes. There is nothing wrong with this, and it has *nothing* to do with the halting problem.
>>>
>>>>> You seem to be under the impression that nothing can be made uncrashable without introducing the possibility of corrupted state. That's hogwash.
>>>> I read that statement several times and I still don't understand
>>>> what it
>>>> means.
>>>>
>>>> BTW, hardware null pointer checking is a safety feature, just like
>>>> array
>>>> bounds checking is.
>>>
>>> PS: You can't convert segfaults into exceptions under Linux, as far as I know.
>>
>> How did Jeremie do that?
>>
>> Andrei
> 
> A signal handler with the undocumented kernel parameters attaches the signal context to the exception object, repairs the stack frame forged by the kernel to make us believe we called the handler ourselves, does a backtrace right away and attaches it to the exception object, and then throws it.
> 
> The error handling code will unwind down to the runtime's main() where a catch clause is waiting for any Throwables, sending them back into the unhandled exception handler, and a crash window appears with the backtrace, all finally blocks executed, and gracefully shutting down.
> 
> All I need to do is an ELF/DWARF reader to extract symbolic debug info under linux; it's already working for PE/CodeView on windows.
> 
> Jeremie


Woah, nice. I stand corrected. Is this in druntime already?
September 27, 2009
Jeremie Pelletier wrote:
> downs wrote:
>> Jeremie Pelletier wrote:
>>> Christopher Wright wrote:
>>>> Jeremie Pelletier wrote:
>>>>> What if using 'Object obj;' raises a warning "uninitialized variable" and makes everyone wanting non-null references happy, and 'Object obj = null;' raises no warning and makes everyone wanting to keep the current system (all two of us!) happy.
>>>>>
>>>>> I believe it's a fair compromise.
>>>> It's a large improvement, but only for local variables. If your segfault has to do with a local variable, unless your function is monstrously large, it should be easy to fix, without changing the type system.
>>>>
>>>> The larger use case is when you have an aggregate member that cannot be null. This can be solved via contracts, but they are tedious to write and ubiquitous.
>>> But how would you enforce a nonnull type over an aggregate in the first place? If you can, you could also apply the same initializer semantics I suggested earlier.
>>>
>>> Look at this for example:
>>>
>>> struct A {
>>>     Object cannotBeNull;
>>> }
>>>
>>> void main() {
>>>     A* a = new A;
>>> }
>>>
>>> Memory gets initialized to zero, and you have a broken non-null type. You could have the compiler throw an error here, but the compiler cannot possibly know about all data creation methods such as malloc, calloc or any other external allocator.
>>>
>>> You could even do something like:
>>>
>>> Object* foo = calloc(Object.sizeof);
>>>
>>> and the compiler would let you dereference foo resulting in yet another broken nonnull variable.
>>>
>>> Non-nulls are a cute idea when you have a type system that is much stricter than D's, but there are just way too many workarounds to make it crash in D.
>>
>> "Here are some cases you haven't mentioned yet. This proves that the compiler can't possibly be smart enough. "
>>
>> Yeeeeeah.
> 
> I allocate most structs on the gc, unless I need them only for the scope of a function (that includes RVO). All objects are on the gc already, so it's a pretty major case. The argument was to protect aggregate fields; I'm just pointing out that their usage usually prevents an easy implementation. I'm not saying it's impossible.
> 
> Besides, what I said was, if it's possible to enforce these fields to be null/non-null, you can enforce them to be properly initialized in such a case, making nulls/non-nulls nearly useless.
> 
>> In the above case, why not implicitly put the cannotBeNull check into the struct invariant? That's where it belongs, imho.
> 
> Exactly, what's the need for null/non-null types then?
> 

You're twisting my words.

Checking for null in the struct invariant would be an _implementation_ of non-nullable types in structs.

Isn't the whole point of defaulting to non-nullable types that we don't have to check for it manually, i.e. in the user-defined invariant?

I think we should avoid having to build recursive checks for null-ness for every type we define.
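
Roughly, the boilerplate in question would look like this for every such type (a sketch, reusing Jeremie's struct; use() is only there to give the invariant a place to run):

struct A {
    Object cannotBeNull;

    invariant() {
        // exactly the check a non-nullable field type would generate for us
        assert(cannotBeNull !is null);
    }

    // note: a struct invariant only runs around public member functions and
    // explicit asserts, so a plain 'new A' still slips through unchecked
    void use() { }
}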

>> Regarding your example, it's calloc(size_t.sizeof). And a) we probably can't catch that case except with in/out null checks on every method, but then again, how often have you done that? I don't think it's relevant enough to be relevant to this thread. :)
> 
> Actually, sizeof currently returns the size of the reference, so it's always going to be the same as size_t.sizeof.

Weird. I remembered that differently. Thanks.