3 days ago

I've been looking once again at having an exception being thrown on null pointer dereferencing.
However the following can be extended to other hardware level exceptions.

I do not like the conclusion, but it is based upon facts that we do not control.

Conclusion

We cannot rely upon hardware or kernel support for throwing of a null pointer exception.
To do this we have to use read barriers, like we do for array bounds check.

There are three levels of support needed:

  1. Language support: it does not alter code generation and is available everywhere.
    Can be tuned to the user's threshold of pain and need for guarantees.
  2. Altering codegen, throws an exception via a read barrier just like a bounds check does.
  3. Something has gone really wrong and CPU has said NOPE and a signal has fired to kill the process.

Read barriers can be optimized out thanks to language support via data flow analysis and they can be turned on/off like bounds checks are.
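As a rough sketch of what such a compiler-inserted barrier could amount to (the helper function and the use of a plain `Exception` here are assumptions for illustration; the concrete throwable type is an open design question):

```d
// Hypothetical sketch: conceptually what the compiler would emit in front of
// a pointer dereference when the null-check barrier is enabled.
int readThrough(int* p)
{
    // The read barrier, analogous to the bounds check emitted for arr[i].
    // (Plain Exception used purely for the sketch; the real type is TBD.)
    if (p is null)
        throw new Exception("null pointer dereference");
    return *p; // only reached once the barrier has passed
}

void main()
{
    int value = 7;
    assert(readThrough(&value) == 7);
}
```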

Analysis

Platform support

For POSIX we can handle any errors that occur and throw an exception; it's tricky, but it is possible.
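For reference, a minimal sketch of the easy half of the POSIX side, assuming Linux and druntime's core.sys.posix.signal bindings; it only installs a SIGSEGV handler and bails out, and deliberately does not attempt the genuinely tricky part of turning the fault into a thrown D exception:

```d
import core.stdc.stdio : fprintf, stderr;
import core.stdc.stdlib : _Exit;
import core.sys.posix.signal;

// Keep the handler async-signal-safe: log and terminate, nothing more.
extern (C) void onSegv(int, siginfo_t*, void*) nothrow @nogc
{
    fprintf(stderr, "fatal: SIGSEGV (null or wild pointer dereference)\n");
    _Exit(1); // recovering here, let alone throwing, is the hard part
}

void installSegvHandler()
{
    sigaction_t sa;
    sa.sa_flags = SA_SIGINFO;
    sa.sa_sigaction = &onSegv;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGSEGV, &sa, null);
}

void main()
{
    installSegvHandler();
    int* p = null;
    // *p = 1; // uncommenting this would trigger the handler above
}
```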

For Windows it'll result in an exception being thrown, which can be caught and handled... except we do not support the exception mechanism on Win64 for cleanup routines (dmd only), let alone catching.
I've asked about this recently; it is not a bug, nor is it guaranteed by the language to work.

Even if this were to work, signal handlers can be changed, and they can be a bit touchy at times.
We cannot rely on kernel-level or CPU support to catch these errors.

Read barriers

To have a 100% solution within D code there is really only one option: read barriers.
We already have them for bounds checks.
They can throw a D exception without any problems, and they bypass the CPU/kernel mechanisms, which can otherwise result in infinite loops.

This catches logic problems, but not program corruption where pointers point to something that they shouldn't.

There is one major problem with a read barrier on pointers: how do you disable it?
With slices you can access the pointer directly and do a dereference that bypasses the check.
Sadly we'd be stuck with either a storage class or an attribute to turn it off.
I know Walter would hate the proposal of a storage class, so that is a no-go.
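For comparison, this is the per-site escape hatch that slices already have via `.ptr`, as mentioned above; plain pointer dereferences have no equivalent:

```d
void main() @system
{
    int[] arr = [1, 2, 3];

    // Checked access: the compiler inserts a bounds check (a read barrier)
    // and a RangeError is thrown if the index is out of range.
    auto a = arr[2];

    // Unchecked access: going through .ptr bypasses the barrier entirely.
    auto b = arr.ptr[2];
}
```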

.net

So how does .net handle it?
As another Microsoft-owned project, .net is a very good thing to study; it has the exact same problems that we have here.

The .net exceptions are split into managed and unmanaged exceptions.
Unmanaged exceptions are the ones comparable to ours (though we don't support their mechanism).
These are not meant to be caught by .net, including stuff like null dereference exceptions; they kill the process.

The managed exceptions include ones like null dereference and are allowed to be caught.
Quite importantly, in frameworks like asp.net this guarantees that non-framework code cannot crash the process, even in the most extreme cases.
This is possible because null is a valid pointer value, and a pointer cannot point into unmapped memory.

The .net guarantee that you cannot corrupt a pointer dovetails with what a process-crashing signal is good at handling: corrupted pointers.

Application VM languages

In application VM languages like C#, nullability is now part of the type system, with the assistance of data flow analysis to prevent you from doing bad things at compile time.

It is a very involved process to upgrade existing code to it, and from what I have seen, many people in the D community would be appalled at the notion of having to explicitly state a pointer as being non-null or nullable.

Worse, the typing of a pointer as non-null or nullable tends to live in the type system, but uses data flow analysis to infect other variables.

C++

In C++, nullability is handled via lint-level analysis without any language help.

It is compiler-specific analysis that can require not only opting into it, but also turning on optimizations.

We can do better than this.
It is nowhere near a desirable solution.

Plan for three failures

A known-good strategy for handling errors in a system is to have three layers.
For pointer-related issues we could have three solutions in play:

  1. CPU/kernel kills the process via signal/exception handling
  2. Read barriers throw an exception; used when a null dereference occurs, not when pointer corruption occurs, i.e. a dereference of unmapped memory
  3. Language-level support

All read barriers in D are optional, with differing levels of action when they error.
It should be the same here also.
I do not care about the default.
Although if the default is not on, it could cause problems in practice with, say, PhobosV3.
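For reference, this is how the existing bounds-check barrier is toggled today; the `-boundscheck` values are dmd's current ones, while any equivalent switch for a null-dereference barrier is hypothetical:

```d
// Built normally (dmd app.d):
//   arr[i] fails the inserted bounds check and a core.exception.RangeError
//   is thrown (the barrier fires).
// Built with dmd -boundscheck=off app.d:
//   the barrier is removed and the out-of-bounds access is undefined behaviour.
void main()
{
    int[] arr = [1, 2, 3];
    size_t i = 5;
    auto x = arr[i]; // same source, different failure mode per build flag
}
```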

Language support

As already stated, forced typing is likely to annoy too many people, and any pointer typing that could solve that is already out, as it amounts to the classic managed vs unmanaged pointer typing.

There must be a way for the programmer to acknowledge pointer nullability status without it being required or being part of a type.
This makes us fairly unique.

Since we cannot require attribution by default, we cannot have a 100% solution with just language-level support.
But we also do not want a common error to sit in the code and affect things at runtime, if at all possible.
Or at least in the cases Adam Wilson and I care about regarding event loops.

Which means we are limited to local-only information, similar to C++, except... we can store the results into the type system.
The result is that more code is checked than in C++, but also less than in, say, C# by default.

It is possible to opt into more advanced analysis that errors when it cannot model your code.
Which gets us to the same level of support as C# (more or less).
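A hedged sketch of the kind of purely local, flow-based checking described above; the flagged lines are shown commented out, and the diagnostics themselves are hypothetical:

```d
void use(int* p)
{
    // auto a = *p;   // would be flagged: p not yet proven non-null here

    if (p is null)
        return;

    auto b = *p;      // fine: the guard above proves p non-null locally

    int* q;           // default-initialised, so known to be null
    // auto c = *q;   // would be flagged: definitely-null dereference
}

void main()
{
    int v = 1;
    use(&v);
}
```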

2 days ago

On Saturday, 12 April 2025 at 23:11:41 UTC, Richard (Rikki) Andrew Cattermole wrote:

> I've been looking once again at having an exception being thrown on null pointer dereferencing.

Why `Exception` and not `Error`?

2 days ago

On Saturday, 12 April 2025 at 23:11:41 UTC, Richard (Rikki) Andrew Cattermole wrote:

> I've been looking once again at having an exception being thrown on null pointer dereferencing.
> However the following can be extended to other hardware level exceptions.
>
> I do not like the conclusion, but it is based upon facts that we do not control.
>
> Conclusion
>
> We cannot rely upon hardware or kernel support for throwing of a null pointer exception.
> To do this we have to use read barriers, like we do for array bounds check.
>
> There are three levels of support needed:
>
> 1. Language support: it does not alter code generation and is available everywhere.
>    Can be tuned to the user's threshold of pain and need for guarantees.
> 2. Altering codegen, throws an exception via a read barrier just like a bounds check does.
> 3. Something has gone really wrong and CPU has said NOPE and a signal has fired to kill the process.

From the unix/posix perspective, I'd say don't even try. Just allow the signal (SIGSEGV and/or SIGBUS) to be raised, not caught, and have the process crash. Debugging then depends upon examining the corefile, or using a debugger.

The only gain from catching it is to generate some pretty form of back trace before dying, and that is just as well (or better) handled by a debugger, or an external crash handling and backtrace generating corset/monitor/supervision process.

Once one catches either of these signals, one has to be very careful in handling if any processing is to continue beyond the signal handler. With a complex runtime, and/or a multi-threaded application, it often isn't worth the effort.

2 days ago

On Saturday, 12 April 2025 at 23:11:41 UTC, Richard (Rikki) Andrew Cattermole wrote:

> I've been looking once again at having an exception being thrown on null pointer dereferencing.
>
> [...]
>
> Read barriers
>
> To have a 100% solution within D code there is really only one option: read barriers.
> We already have them for bounds checks.
> They can throw a D exception without any problems, and they bypass the CPU/kernel mechanisms, which can otherwise result in infinite loops.
>
> This catches logic problems, but not program corruption where pointers point to something that they shouldn't.
>
> There is one major problem with a read barrier on pointers: how do you disable it?
> With slices you can access the pointer directly and do a dereference that bypasses the check.
> Sadly we'd be stuck with either a storage class or an attribute to turn it off.

I think that you actually don't need any language addition at all.

The read barriers can be considered implicit, "opt-in" contracts, and whether they are codegened can be controlled with a simple command-line switch.

At first glance this system may appear costly. That is true, but there are actually many cases for which the compiler can determine that it does not have to add a barrier.
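As a sketch of the kind of elision being alluded to (the barrier is written out explicitly here for illustration; in the proposal it would be compiler-inserted and controlled by a switch):

```d
int deref(int* p)
{
    // Nothing is known about p here, so an implicit barrier would be emitted,
    // conceptually: if (p is null) throw ...;
    return *p;
}

int derefGuarded(int* p)
{
    if (p is null)
        return -1;
    // The branch above already proves p non-null, so the implicit
    // barrier can be elided entirely.
    return *p;
}

int derefLocal()
{
    int value = 42;
    int* p = &value;
    // p provably points at a local and is never null: no barrier needed.
    return *p;
}

void main()
{
    int v = 3;
    assert(deref(&v) == 3);
    assert(derefGuarded(null) == -1);
    assert(derefLocal() == 42);
}
```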

2 days ago

On Saturday, 12 April 2025 at 23:11:41 UTC, Richard (Rikki) Andrew Cattermole wrote:

> Language support
>
> As already stated, forced typing is likely to annoy too many people, and any pointer typing that could solve that is already out, as it amounts to the classic managed vs unmanaged pointer typing.
>
> There must be a way for the programmer to acknowledge pointer nullability status without it being required or being part of a type.
> This makes us fairly unique.
>
> Since we cannot require attribution by default, we cannot have a 100% solution with just language-level support.
> But we also do not want a common error to sit in the code and affect things at runtime, if at all possible.
> Or at least in the cases Adam Wilson and I care about regarding event loops.
>
> Which means we are limited to local-only information, similar to C++, except... we can store the results into the type system.
> The result is that more code is checked than in C++, but also less than in, say, C# by default.
>
> It is possible to opt into more advanced analysis that errors when it cannot model your code.
> Which gets us to the same level of support as C# (more or less).

As to language level support for nullable vs non-nullable pointers, without having used it yet, I believe I'd like to have such. Picking a default is an issue.

I probably need to play (in C) with the clang _Nullable and _Nonnull markers to see how well they work. From reading the GCC docs I can't see a benefit from its mechanisms, as they serve to guide optimisation rather than checks / assertions at compile time and/or checks at runtime.

I think I really want something like Cyclone offered, with forms of non-null pointers and nullable pointers. Or maybe something like Odin/Zig offer, with default non-null pointers and optional nullable pointers, the latter requiring source guards (plus data flow analysis).

As to how that translates to D, I'm not yet sure.

However references alone are not the answer, as I want an explicit annotation at function call sites to indicate that a pointer/reference may be passed. Hence I have a quibble with D's @safe mode not allowing passing pointers to locals; only mitigated by the `scope` pointer annotation when that preview flag (-preview=dip1000) is enabled.
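For context on that last quibble, a sketch of how it looks today: in `@safe` code, passing the address of a local across a call is only accepted once the parameter is marked `scope` and the dip1000 preview is enabled (exact diagnostics vary by compiler version):

```d
// Compile with: dmd -preview=dip1000 app.d
@safe void take(scope int* p)
{
    // p may be read here, but `scope` forbids it escaping the call.
    auto x = *p;
}

@safe void main()
{
    int local = 123;
    take(&local); // only @safe because the parameter is `scope`
}
```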

2 days ago
On 14/04/2025 1:07 AM, Derek Fawcus wrote:
> On Saturday, 12 April 2025 at 23:11:41 UTC, Richard (Rikki) Andrew Cattermole wrote:
>> I've been looking once again at having an exception being thrown on null pointer dereferencing.
>> However the following can be extended to other hardware level exceptions.
>>
>> I do not like the conclusion, but it is based upon facts that we do not control.
>>
>> ## Conclusion
>>
>> We cannot rely upon hardware or kernel support for throwing of a null pointer exception.
>> To do this we have to use read barriers, like we do for array bounds check.
>>
>> There are three levels of support needed:
>>
>> 1. Language support: it does not alter code generation and is available everywhere.
>>     Can be tuned to the user's threshold of pain and need for guarantees.
>> 2. Altering codegen, throws an exception via a read barrier just like a bounds check does.
>> 3. Something has gone really wrong and CPU has said NOPE and a signal has fired to kill the process.
> 
>  From the unix/posix perspective, I'd say don't even try.  Just allow the signal (SIGSEGV and/or SIGBUS) to be raised, not caught, and have the process crash.  Debugging then depends upon examining the corefile, or using a debugger.
> 
> The only gain from catching it is to generate some pretty form of back trace before dying, and that is just as well (or better) handled by a debugger, or an external crash handling and backtrace generating corset/monitor/supervision process.
> 
> Once one catches either of these signals, one has to be very careful in handling if any processing is to continue beyond the signal handler.  With a complex runtime, and/or a multi-threaded application, it often isn't worth the effort.

This is how it is currently implemented.

Which is to say: don't touch it. And it matches my analysis.

2 days ago
On 13/04/2025 9:48 PM, Ogion wrote:
> On Saturday, 12 April 2025 at 23:11:41 UTC, Richard (Rikki) Andrew Cattermole wrote:
>> I've been looking once again at having an exception being thrown on null pointer dereferencing.
> 
> Why `Exception` and not `Error`?

Neither of them.

The former has language specific behavior, and the latter has implementation specific behavior that isn't suitable for catching by read barriers.
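To make the "neither of them" concrete, a hypothetical sketch only (the name and exact shape are illustrative, not an existing druntime type): the thrown type could derive from Throwable directly, sidestepping both the `Exception` and `Error` semantics:

```d
// Hypothetical: derives from Throwable directly, so it is neither an
// Exception nor an Error.
class NullDereference : Throwable
{
    this(string file = __FILE__, size_t line = __LINE__)
    {
        super("null pointer dereference", file, line);
    }
}
```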

2 days ago
On 14/04/2025 9:45 AM, Richard (Rikki) Andrew Cattermole wrote:
> On 13/04/2025 9:48 PM, Ogion wrote:
>> On Saturday, 12 April 2025 at 23:11:41 UTC, Richard (Rikki) Andrew Cattermole wrote:
>>> I've been looking once again at having an exception being thrown on null pointer dereferencing.
>>
>> Why `Exception` and not `Error`?
> 
> Neither of them.
> 
> The former has language specific behavior, and the latter has implementation specific behavior that isn't suitable for catching by read barriers.

err, thrown by read barriers and then caught.

1 day ago

On Saturday, 12 April 2025 at 23:11:41 UTC, Richard (Rikki) Andrew Cattermole wrote:

> I've been looking once again at having an exception being thrown on null pointer dereferencing.

A null pointer on x86 is, in practice, any pointer less than 0x00010000.

Also, what about successfully dereferencing 0x23a7b63c41704h827? It depends: is that memory reserved by the process, committed, mmapped, etc.?

Failure for any "wrong" pointer should be the same as for 0x0000....000 (pure NULL).

So a read barrier is the wrong option (maybe good for DEBUG builds only, in simple cases).

Silently killing the process via the OS is also the wrong option; not every programmer is a kernel debugger/developer or a WinDbg guru.

IMO we need to dig into the .NET source and borrow their approach.

1 day ago

On Monday, 14 April 2025 at 09:49:27 UTC, a11e99z wrote:

> On Saturday, 12 April 2025 at 23:11:41 UTC, Richard (Rikki) Andrew Cattermole wrote:
>> I've been looking once again at having an exception being thrown on null pointer dereferencing.
>
> A null pointer on x86 is, in practice, any pointer less than 0x00010000.
>
> IMO we need to dig into the .NET source and borrow their approach.

An almost identical problem is misaligned access:
x86/x64 allows it, except for a few instructions;
ARM prohibits it (as far as I know).

It seems there are no other options except handling kernel signals and Windows SEH.
