March 07, 2012
Re: dereferencing null
On 03/07/2012 02:40 AM, Chad J wrote:
> But to initialize non-null fields, I suspect we would need to be able to
> do stuff like this:
>
> class Foo
> {
> int dummy;
> }
>
> class Bar
> {
> Foo foo = new Foo();
>
> this() { foo.dummy = 5; }
> }
>
> Which would be lowered by the compiler into this:
>
> class Bar
> {
> // Assume we've already checked for bogus assignments.
> // It is now safe to make this nullable.
> Nullable!(Foo) foo;
>
> this()
> {
> // Member initialization is done first.
> foo = new Foo();
>
> // Then programmer-supplied ctor code runs after.
> foo.dummy = 5;
> }
> }
>
> I remember C# being able to do this. I never understood why D doesn't
> allow this. Without it, I have to repeat myself a lot, and that is just
> wrong ;).

It is not sufficient.

class Bar{
    Foo foo = new Foo(this);
}
class Foo{
    this(Bar bar){bar.foo.method();} // bar.foo has not been assigned yet
    void method(){...}
}
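
To make the counterexample concrete, here is a minimal sketch in which the proposed lowering is written out by hand. The `sawNull` field is an assumption added purely for illustration: it records what Foo's constructor observed, instead of crashing on the dereference.

```d
import std.stdio : writeln;

class Foo
{
    bool sawNull; // hypothetical field: records what the constructor observed

    this(Bar bar)
    {
        // `bar` is only partially constructed here: bar.foo is not assigned yet.
        sawNull = bar.foo is null;
    }
}

class Bar
{
    Foo foo; // the field that was meant to be non-null

    this()
    {
        // Hand-written form of the lowering: member initialization runs first,
        // but `this` escapes to Foo's constructor before the assignment completes.
        foo = new Foo(this);
    }
}

void main()
{
    auto bar = new Bar;
    writeln(bar.foo.sawNull); // prints "true"
}
```

So even with the lowering, code that receives `this` during member initialization can still see the field as null.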
March 07, 2012
Re: dereferencing null
On Mon, 05 Mar 2012 22:51:28 -0500, Jonathan M Davis <jmdavisProg@gmx.com>  
wrote:

> On Monday, March 05, 2012 21:04:20 Steven Schveighoffer wrote:
>> On Mon, 05 Mar 2012 20:17:32 -0500, Michel Fortin
>>
>> <michel.fortin@michelf.com> wrote:
>> > That said, throwing an exception might not be a better response all  
>> the
>> > time. On my operating system (Mac OS X) when a program crashes I get a
>> > nice crash log with the date, a stack trace for each thread with named
>> > functions, the list of all loaded libraries, and the list of VM  
>> regions
>> > dumped into ~/Library/Logs/CrashReporter/. That's very useful when you
>> > have a customer experiencing a crash with your software, as you can  
>> ask
>> > for the crash log. Can't you do the same on other operating systems?
>>
>> It depends on the OS facilities and the installed libraries for such
>> features.  It's eminently possible, and I think on Windows, you can  
>> catch
>> such exceptions too in external programs to do the same sort of dumping.
>> On Linux, you get a "Segmentation Fault" message (or nothing if you have
>> no terminal showing the output), and the program goes away.  That's the
>> default behavior.  I think it's better in any case to do *something*  
>> other
>> than just print "Segmentation Fault" by default.  If someone has a way  
>> to
>> hook this in a better fashion, we can include that, but I hazard to  
>> guess
>> it will not be on stock Linux boxes.
>
> All you have to do is add a signal handler which handles SIGSEGV and have  
> it
> print out a stacktrace. It's pretty easy to do. It _is_ the sort of  
> thing that
> programs may want to override (to handle other signals), so I'm not  
> quite sure
> what the best way to handle that is without causing problems for them  
> (e.g.
> initialization order could affect which handler is added last and is  
> therefore
> the one used). Maybe a function should be added to druntime which wraps  
> the
> glibc function so that programs can add their signal handler through  
> _it_, and
> if that happens, the default one won't be used.

Install the default (stack-trace printing) handler before calling any of  
the static constructors.  Any call to signal after that will override the  
installed handler.

-Steve
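
A rough sketch of that ordering, assuming Linux with glibc. The names `printTraceAndAbort` and `installDefaultSegvHandler` are hypothetical, and in the scheme described above the runtime itself would make the install call before running any static constructors. The glibc `backtrace` functions are declared by hand here so that their attributes are explicit. Note that `backtrace` is not guaranteed to be async-signal-safe, so this is a best-effort diagnostic only.

```d
import core.stdc.signal : signal, SIGSEGV, SIG_DFL;
import core.stdc.stdlib : abort;

// glibc's backtrace API, declared by hand (assumption: linking against glibc).
extern (C) nothrow @nogc
{
    int backtrace(void** buffer, int size);
    void backtrace_symbols_fd(const(void*)* buffer, int size, int fd);
}

// Hypothetical default handler: dump the stack to stderr, then terminate.
extern (C) void printTraceAndAbort(int sig) nothrow @nogc
{
    void*[64] frames;
    immutable count = backtrace(frames.ptr, cast(int) frames.length);
    backtrace_symbols_fd(frames.ptr, count, 2); // 2 is the stderr descriptor
    signal(SIGSEGV, SIG_DFL);                   // avoid re-entering this handler
    abort();                                    // terminate; no recovery attempt
}

// Hypothetical hook that the runtime would call before static constructors.
void installDefaultSegvHandler() nothrow @nogc
{
    signal(SIGSEGV, &printTraceAndAbort);
}

void main()
{
    installDefaultSegvHandler();

    // A later call to signal() replaces the default handler.
    // signal() returns whichever handler was installed before the call.
    auto previous = signal(SIGSEGV, SIG_DFL);
    assert(previous is &printTraceAndAbort);
}
```

Because the default is installed first, a program that registers its own handler later simply wins, and no extra registration API is needed.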
March 07, 2012
Re: dereferencing null
On Tue, 06 Mar 2012 23:07:24 -0500, Walter Bright  
<newshound2@digitalmars.com> wrote:

> On 3/6/2012 8:05 PM, Walter Bright wrote:
>> What I'm talking about is the idea that one can recover from seg faults
>> resulting from program bugs.
>
> I've written about this before, but I want to emphasize that attempting  
> to recover from program BUGS is the absolutely WRONG way to go about  
> writing fail-safe, critical, fault-tolerant software.

100% agree.  I just want as much information about the bug as possible  
before the program exits.

-Steve
March 07, 2012
Re: dereferencing null
On Mon, 05 Mar 2012 23:58:48 -0500, Chad J  
<chadjoan@__spam.is.bad__gmail.com> wrote:

> On 03/05/2012 11:27 PM, Jonathan M Davis wrote:
>> On Tuesday, March 06, 2012 05:11:30 Martin Nowak wrote:
>>> There are two independent discussions being conflated here. One about
>>> getting more
>>> information out of crashes even in release mode and the other about
>>> adding runtime checks to prevent crashing merely in debug builds.
>>
>> A segfault should _always_ terminate a program - as should  
>> dereferencing a
>> null pointer. Those are fatal errors. If we had extra checks, they  
>> would have
>> to result in NullPointerErrors, not NullPointerExceptions. It's horribly
>> broken to try and recover from dereferencing a null pointer. So, the  
>> question
>> then becomes whether adding the checks and getting an Error thrown is  
>> worth
>> doing as opposed to simply detecting it and printing out a stack trace.  
>> And
>> throwing an Error is arguably _worse_, because it means that you can't  
>> get a
>> useful core dump.
>>
>> Really, I think that checking for null when dereferencing is out of the
>> question. What we need is to detect it and print out a stacktrace. That  
>> will
>> maximize the debug information without costing performance.
>>
>> - Jonathan M Davis
>
> Why is it fatal?

A segmentation fault indicates that a program tried to access memory that  
is not available.  Since the 0 page is never allocated, any null pointer  
dereferencing results in a seg fault.

However, there are several causes of seg faults:

1. You forgot to initialize a variable.
2. Your memory has been corrupted, and some corrupted pointer now points  
into no-mem land.
3. You are accessing memory that has been deallocated.

Only 1 is benign.  2 and 3 are fatal.  Since you cannot know which of  
these three happened, the only valid choice is to terminate.

I think the correct option is to print a stack trace, and abort the  
program.

> I'd like to be able to catch these.  I tend to run into a lot of fairly  
> benign sources of these, and they should be try-caught so that the user  
> doesn't get the boot unnecessarily.  Unnecessary crashing can lose user  
> data.  Maybe a warning message is sufficient: "hey that last thing you  
> did didn't turn out so well; please don't do that again." followed by  
> some automatic emailing of admins.  And the email would contain a nice  
> stack trace with line numbers and stack values and... I can dream huh.

You cannot be sure if your program is in a sane state.

> I might be convinced that things like segfaults in the /general case/  
> are fatal.  It could be writing to memory outside the bounds of an array  
> which is both not bounds-checked and may or may not live on the stack.  
> Yuck, huh.  But this is not the same as a null-dereference:
>
> Foo f = null;
> f.bar = 4;  // This is exception worthy, yes,
>              // but how does it affect unrelated parts of the program?

Again, this is a simple case.  There is also this case:

Foo f = new Foo();
... // some code that corrupts f so that it is now null
f.bar = 4;

This is not a "continue execution" case, and cannot be distinguished from  
the simple case by compiler or library code.

Philosophically, any null pointer access is a program error, not a user  
error, and should not be considered for "normal" execution.  Terminating  
execution is the only right choice.

-Steve
March 07, 2012
Re: dereferencing null
On 03/07/2012 07:57 AM, Steven Schveighoffer wrote:
> On Mon, 05 Mar 2012 23:58:48 -0500, Chad J
> <chadjoan@__spam.is.bad__gmail.com> wrote:
>
>>
>> Why is it fatal?
>
> A segmentation fault indicates that a program tried to access memory
> that is not available. Since the 0 page is never allocated, any null
> pointer dereferencing results in a seg fault.
>
> However, there are several causes of seg faults:
>
> 1. You forgot to initialize a variable.
> 2. Your memory has been corrupted, and some corrupted pointer now points
> into no-mem land.
> 3. You are accessing memory that has been deallocated.
>
> Only 1 is benign. 2 and 3 are fatal. Since you cannot know which of
> these three happened, the only valid choice is to terminate.
>
> I think the correct option is to print a stack trace, and abort the
> program.
>

Alright, I think I see where the misunderstanding is coming from.

I have only ever encountered (1).  And I've encountered it a lot.

I didn't even consider (2) and (3) as possibilities.  Those are far from 
my mind.

I still have a nagging doubt though: since the dereference in question 
is null, then there is no way for that particular dereference to corrupt 
other memory.  The only way this happens in (2) and (3) is that related 
code tries to write to invalid memory.  But if we have other measures in 
place to prevent that (bounds checking, other hardware signals, etc), 
then how is it still possible to corrupt memory?

>
> [...]
>
> -Steve
March 07, 2012
Re: dereferencing null
On Wed, 07 Mar 2012 09:22:27 -0500, Chad J  
<chadjoan@__spam.is.bad__gmail.com> wrote:

> On 03/07/2012 07:57 AM, Steven Schveighoffer wrote:
>> On Mon, 05 Mar 2012 23:58:48 -0500, Chad J
>> <chadjoan@__spam.is.bad__gmail.com> wrote:
>>
>>>
>>> Why is it fatal?
>>
>> A segmentation fault indicates that a program tried to access memory
>> that is not available. Since the 0 page is never allocated, any null
>> pointer dereferencing results in a seg fault.
>>
>> However, there are several causes of seg faults:
>>
>> 1. You forgot to initialize a variable.
>> 2. Your memory has been corrupted, and some corrupted pointer now points
>> into no-mem land.
>> 3. You are accessing memory that has been deallocated.
>>
>> Only 1 is benign. 2 and 3 are fatal. Since you cannot know which of
>> these three happened, the only valid choice is to terminate.
>>
>> I think the correct option is to print a stack trace, and abort the
>> program.
>>
>
> Alright, I think I see where the misunderstanding is coming from.
>
> I have only ever encountered (1).  And I've encountered it a lot.

(1) occurs a lot, and in most cases, happens reliably.  Most QA cycles  
should find them.  There should be no case in which this is not a program  
error, to be fixed.
(2) and (3) are sinister because errors that occur are generally far away  
from the root cause, and the memory you are using is compromised.  For  
example, a memory corruption can cause an error several hours later when  
you try to use the corrupted memory.

If allowed to continue, programs with corrupted memory can cause lots of  
problems, e.g. corrupt your saved data, or run malicious code (buffer  
overflow attack).  It's not worth saving anything.

> I didn't even consider (2) and (3) as possibilities.  Those are far from  
> my mind.
>
> I still have a nagging doubt though: since the dereference in question  
> is null, then there is no way for that particular dereference to corrupt  
> other memory.  The only way this happens in (2) and (3) is that related  
> code tries to write to invalid memory.  But if we have other measures in  
> place to prevent that (bounds checking, other hardware signals, etc),  
> then how is it still possible to corrupt memory?

The null dereference may be a *result* of memory corruption.

example:

class Foo {void foo(){}}

void main()
{
   int[2] x = [1, 2];
   Foo f = new Foo;

   x.ptr[2] = 0; // oops killed f
   f.foo(); // segfault
}

Again, this one is benign, but it doesn't have to be.  I could have just  
nullified my return stack pointer, etc. along with f.

The larger point is, a SEGV means memory is not what you expect it to be.  Once  
you don't trust your memory, you might as well stop.

-Steve
March 07, 2012
Re: dereferencing null
On Wednesday, 7 March 2012 at 14:23:18 UTC, Chad J wrote:
> On 03/07/2012 07:57 AM, Steven Schveighoffer wrote:
>> On Mon, 05 Mar 2012 23:58:48 -0500, Chad J
>> <chadjoan@__spam.is.bad__gmail.com> wrote:
>>
>>>
>>> Why is it fatal?
>>
>> A segmentation fault indicates that a program tried to access 
>> memory
>> that is not available. Since the 0 page is never allocated, 
>> any null
>> pointer dereferencing results in a seg fault.
>>
>> However, there are several causes of seg faults:
>>
>> 1. You forgot to initialize a variable.
>> 2. Your memory has been corrupted, and some corrupted pointer 
>> now points
>> into no-mem land.
>> 3. You are accessing memory that has been deallocated.
>>
>> Only 1 is benign. 2 and 3 are fatal. Since you cannot know 
>> which of
>> these three happened, the only valid choice is to terminate.
>>
>> I think the correct option is to print a stack trace, and 
>> abort the
>> program.
>>
>
> Alright, I think I see where the misunderstanding is coming 
> from.
>
> I have only ever encountered (1).  And I've encountered it a 
> lot.
>
> I didn't even consider (2) and (3) as possibilities.  Those are 
> far from my mind.
>
> I still have a nagging doubt though: since the dereference in 
> question is null, then there is no way for that particular 
> dereference to corrupt other memory.  The only way this happens 
> in (2) and (3) is that related code tries to write to invalid 
> memory.  But if we have other measures in place to prevent that 
> (bounds checking, other hardware signals, etc), then how is it 
> still possible to corrupt memory?
>
>>
>> [...]
>>
>> -Steve

I spoke too soon!
We missed one:

1. You forgot to initialize a variable.
2. Your memory has been corrupted, and some corrupted pointer
 now points into no-mem land.
3. You are accessing memory that has been deallocated.
4. null was being used as a sentinel value, and it snuck into
 a place where the value should not be a sentinel anymore.

I will now change what I said to reflect this:

I think I see where the misunderstanding is coming from.

I encounter (1) from time to time.  It isn't a huge problem 
because usually if I declare something the next thing on my mind 
is initializing it.  Even if I forget, I'll catch it in early 
testing.  It tends to never make it to anyone else's desk, unless 
it's a regression.  Regressions like this aren't terribly common 
though.  If you make my program crash from (1), I'll live.

I didn't even consider (2) and (3) as possibilities.  Those are 
far from my mind.  I think I'm used to VM languages at this point 
(C#, Java, Actionscript 3, Haxe, Synergy/DE|DBL, etc).  In the 
VM, (2) and (3) can't happen.  I never worry about those.  Feel 
free to crash these in D.

I encounter (4) a lot.  I really don't want my programs crashed 
when (4) happens.  Such crashes would be super annoying, and they 
can happen at very bad times.

------

Now then, I have 2 things to say about this:

- Why can't we distinguish between these?  As I said in my 
previous thoughts, we should have ways of ruling out (2) and (3), 
thus ensuring that our NullDerefException was caused by only (1) 
or (4).  It's possible in VM languages, but given that the VM is 
merely a cheesy abstraction, I believe that it's always possible 
to accomplish the same things in D 100% of the time.  Usually 
this requires isolating the system bits from the abstractions.  
Saying it can't be done would be giving up way too easily, and 
you can miss the hidden treasure that way.

- If I'm given some sensible way of handling sentinel values then 
(4) will become a non-issue.  Then that leaves (1-3), and I am OK 
if those cause mandatory crashing.  I know I'm probably opening 
an old can of worms, but D is quite powerful and I think we 
should be able to solve this stuff.  My instincts tell me that 
managing sentinel values with special patterns in memory (ex: 
null values or separate boolean flags) all have pitfalls 
(null-derefs or SSOT violations that lead to desync).  Perhaps 
D's uber-powerful type system can rescue us?

The only other problem with this is... what if our list is not 
exhaustive, and (5) exists?
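
One way the type system might help with (4) is a wrapper that cannot hold null, so a sentinel can only exist where the plain, nullable type is still used. The following is a minimal sketch, not an existing library type: `NonNull` and its members are hypothetical names, and `NonNull!Foo.init` can still bypass the check.

```d
// Hypothetical wrapper: a class reference that is checked once, at construction.
struct NonNull(T) if (is(T == class))
{
    private T payload;

    @disable this(); // no default construction, so no implicit null

    this(T value)
    {
        // A null here is a program bug, so throw an Error, not an Exception.
        if (value is null)
            throw new Error("NonNull constructed from null");
        payload = value;
    }

    inout(T) get() inout { return payload; }
    alias get this; // lets a NonNull!T be used wherever a T is expected
}

class Foo { int dummy; }

void main()
{
    auto foo = NonNull!Foo(new Foo);
    foo.dummy = 5; // forwarded to the wrapped reference via alias this
    assert(foo.dummy == 5);

    // Declaring one without an initializer is rejected at compile time.
    static assert(!__traits(compiles, { NonNull!Foo unset; }));
}
```

With such a wrapper, the null check happens once at the boundary where a possibly-null reference is converted, instead of at every dereference.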
March 07, 2012
Re: dereferencing null
On Wed, 07 Mar 2012 10:10:32 -0500, Chad J  
<chadjoan@__spam.is.bad__gmail.com> wrote:

> On Wednesday, 7 March 2012 at 14:23:18 UTC, Chad J wrote:
>> On 03/07/2012 07:57 AM, Steven Schveighoffer wrote:
>>> On Mon, 05 Mar 2012 23:58:48 -0500, Chad J
>>> <chadjoan@__spam.is.bad__gmail.com> wrote:
>>>
>>>>
>>>> Why is it fatal?
>>>
>>> A segmentation fault indicates that a program tried to access memory
>>> that is not available. Since the 0 page is never allocated, any null
>>> pointer dereferencing results in a seg fault.
>>>
>>> However, there are several causes of seg faults:
>>>
>>> 1. You forgot to initialize a variable.
>>> 2. Your memory has been corrupted, and some corrupted pointer now  
>>> points
>>> into no-mem land.
>>> 3. You are accessing memory that has been deallocated.
>>>
>>> Only 1 is benign. 2 and 3 are fatal. Since you cannot know which of
>>> these three happened, the only valid choice is to terminate.
>>>
>>> I think the correct option is to print a stack trace, and abort the
>>> program.
>>>
>>
>> Alright, I think I see where the misunderstanding is coming from.
>>
>> I have only ever encountered (1).  And I've encountered it a lot.
>>
>> I didn't even consider (2) and (3) as possibilities.  Those are far  
>> from my mind.
>>
>> I still have a nagging doubt though: since the dereference in question  
>> is null, then there is no way for that particular dereference to  
>> corrupt other memory.  The only way this happens in (2) and (3) is that  
>> related code tries to write to invalid memory.  But if we have other  
>> measures in place to prevent that (bounds checking, other hardware  
>> signals, etc), then how is it still possible to corrupt memory?
>>
>>>
>>> [...]
>>>
>>> -Steve
>
> I spoke too soon!
> We missed one:
>
> 1. You forgot to initialize a variable.
> 2. Your memory has been corrupted, and some corrupted pointer
>   now points into no-mem land.
> 3. You are accessing memory that has been deallocated.
> 4. null was being used as a sentinel value, and it snuck into
>   a place where the value should not be a sentinel anymore.
>
> I will now change what I said to reflect this:
>
> I think I see where the misunderstanding is coming from.
>
> I encounter (1) from time to time.  It isn't a huge problem because  
> usually if I declare something the next thing on my mind is initializing  
> it.  Even if I forget, I'll catch it in early testing.  It tends to  
> never make it to anyone else's desk, unless it's a regression.   
> Regressions like this aren't terribly common though.  If you make my  
> program crash from (1), I'll live.
>
> I didn't even consider (2) and (3) as possibilities.  Those are far from  
> my mind.  I think I'm used to VM languages at this point (C#, Java,  
> Actionscript 3, Haxe, Synergy/DE|DBL, etc).  In the VM, (2) and (3)  
> can't happen.  I never worry about those.  Feel free to crash these in D.
>
> I encounter (4) a lot.  I really don't want my programs crashed when (4)  
> happens.  Such crashes would be super annoying, and they can happen at  
> very bad times.

You can use sentinels other than null.

-Steve
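
As an illustration of a non-null sentinel, here is a minimal sketch using a dedicated "missing" object. All names (`Account`, `MissingAccount`, `missingAccount`, `findAccount`) are hypothetical and chosen only for this example.

```d
import std.stdio : writeln;

class Account
{
    string owner;
    this(string owner) { this.owner = owner; }
    bool isMissing() const { return false; }
    string describe() const { return "account of " ~ owner; }
}

// The sentinel: a real object, so calling methods on it cannot segfault.
final class MissingAccount : Account
{
    this() { super("<nobody>"); }
    override bool isMissing() const { return true; }
    override string describe() const { return "no such account"; }
}

Account missingAccount()
{
    static Account instance; // thread-local; created on first use
    if (instance is null)
        instance = new MissingAccount;
    return instance;
}

// Returns the sentinel instead of null when nothing matches.
Account findAccount(Account[] accounts, string owner)
{
    foreach (a; accounts)
        if (a.owner == owner)
            return a;
    return missingAccount();
}

void main()
{
    auto accounts = [new Account("alice")];
    writeln(findAccount(accounts, "alice").describe()); // prints "account of alice"
    writeln(findAccount(accounts, "bob").describe());   // prints "no such account"
}
```

If the sentinel sneaks into a place that did not expect it, the result is a wrong but well-defined answer that can be tested with `isMissing`, not a segfault.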
March 07, 2012
Re: dereferencing null
On Wed, Mar 07, 2012 at 09:22:27AM -0500, Chad J wrote:
> On 03/07/2012 07:57 AM, Steven Schveighoffer wrote:
[...]
> >However, there are several causes of seg faults:
> >
> >1. You forgot to initialize a variable.
> >2. Your memory has been corrupted, and some corrupted pointer now
> >points into no-mem land.
> >3. You are accessing memory that has been deallocated.
> >
> >Only 1 is benign. 2 and 3 are fatal. Since you cannot know which of
> >these three happened, the only valid choice is to terminate.
> >
> >I think the correct option is to print a stack trace, and abort the
> >program.
> >
> 
> Alright, I think I see where the misunderstanding is coming from.
> 
> I have only ever encountered (1).  And I've encountered it a lot.
> 
> I didn't even consider (2) and (3) as possibilities.  Those are far
> from my mind.
> 
> I still have a nagging doubt though: since the dereference in
> question is null, then there is no way for that particular
> dereference to corrupt other memory.  The only way this happens in
> (2) and (3) is that related code tries to write to invalid memory.
> But if we have other measures in place to prevent that (bounds
> checking, other hardware signals, etc), then how is it still
> possible to corrupt memory?
[...]

It's not that the null pointer itself corrupts memory. It's that the
null pointer is a sign that something may have corrupted memory *before*
you got to that point.

The point is, it's impossible to tell whether the null pointer was
merely the result of forgetting to initialize something, or it's a
symptom of a far more sinister problem. The source of the problem could
potentially be very far away, in unrelated code, and only when you tried
to access the pointer, you discover that something is wrong.

At that point, it may very well be the case that the null pointer isn't
just a benign uninitialized pointer, but the result of a memory
corruption, perhaps an exploit in the process of taking over your
application, or some internal consistency error that is in the process
of destroying user data. Trying to continue is a bad idea, since you'd
be letting the exploit take over, or allowing user data to get even more
corrupted than it already is.


T

-- 
Be in denial for long enough, and one day you'll deny yourself of things you wish you hadn't.