[RFC] Throwing an exception with null pointers (page 4)

April 17

Re: [RFC] Throwing an exception with null pointers

Posted by Richard (Rikki) Andrew Cattermole
in reply to Derek Fawcus

On 17/04/2025 1:41 AM, Derek Fawcus wrote:
> On Wednesday, 16 April 2025 at 08:49:39 UTC, Richard (Rikki) Andrew Cattermole wrote:
>> On 16/04/2025 8:18 PM, Atila Neves wrote:
>>
>>> * Use a nullable/option type.
>>
>> While valid to box pointers, we would then need to disallow them in business logic functions.
> 
> I'm not sure what you have in mind, what I have in mind is something like this:
> 
>    https://discourse.llvm.org/t/rfc-nullability-qualifiers/35672
> https://clang.llvm.org/docs/analyzer/developer-docs/nullability.html
> 
> The checks here are performed in a distinct SA tool, not in the main compiler.  However it catches the main erroneous cases - first two listed checks of second link:

``clang --analyze -Xanalyzer -analyzer-output=text``

"While it’s somewhat exceptional for us to introduce new type qualifiers that don’t produce semantically distinct types, we feel that this is the only plausible design and implementation strategy for this feature: pushing nullability qualifiers into the type system semantically would cause significant changes to the language (e.g., overloading, partial specialization) and break ABI (due to name mangling) that would drastically reduce the number of potential users, and we feel that Clang’s support for maintaining type sugar throughout semantic analysis is generally good enough [6] to get the benefits of nullability annotations in our tools."

It's available straight from clang: it annotates variables and is part of the type system, but doesn't affect symbol lookup or introspection. It's part of the frontend, not the backend.

Exactly what I want also.

I'm swearing right now. I knew we were 20 years behind; I didn't realize that they are a stone's throw away from the end game. We can't get ahead of them at this point.

The attributes are different to what I want in D, however. For D I want us to solve all of type state analysis, not just nullability.

>> If a pointer p has a nullable annotation and no explicit null check or assert, we should warn in the following cases:
>>
>> -    p gets implicitly converted into nonnull pointer, for example, we are passing it to a function that takes a nonnull parameter.
>>
>> -    p gets dereferenced
> 
> Given how individual variable / fields have to be annotated, it probably does not need complete DFA, but only function local analysis for loads/ stores/compares.

Fields, no. It'll be hell if we were to start annotating them; Walter balked at that idea ages ago and he was right to.

I want stuff like this to work, without needing annotation:

```d
bool isNull(int* ptr) => ptr is null;

int* ptr;

if (!isNull(ptr))
	int v = *ptr; // ok
else
	int v = *ptr; // error

```

No annotations should be needed when things are not virtual or, in the default case, not complex (such as backwards gotos, which need a full CFG as part of DFA).

```d
void main() {
	func(new int); // ok
	func(null); // error
}

void func(/*?nonnull*/ int* ptr) {
	int v = *ptr;
}
```

April 16

Re: [RFC] Throwing an exception with null pointers

Posted by Walter Bright
in reply to Richard (Rikki) Andrew Cattermole

On 4/12/2025 4:11 PM, Richard (Rikki) Andrew Cattermole wrote:
> The .net exceptions are split into managed and unmanaged exceptions.

Sounds equivalent to D's Exception and Error hierarchies.

April 17

Re: [RFC] Throwing an exception with null pointers

Posted by Richard (Rikki) Andrew Cattermole
in reply to Walter Bright

On 17/04/2025 4:55 AM, Walter Bright wrote:
> On 4/12/2025 4:11 PM, Richard (Rikki) Andrew Cattermole wrote:
>> The .net exceptions are split into managed and unmanaged exceptions.
> 
> Sounds equivalent to D's Exception and Error hierarchies.

It's not.

The unmanaged exceptions come from native code and are then wrapped by .net so they can be caught, including for null dereference.

As far as I'm aware cleanup routines are not messed with.

We've lately been discussing what to do with Error on Discord, and so far the discussion seems to be heading toward it either doing what assert does (killing the process, with a function pointer in the middle to allow configurability) or throwing what I've dubbed a framework exception.

A framework exception sits in the middle of the existing hierarchy and does cleanup, but doesn't affect nothrow.

Manu wanted something like this recently, for the same reasons that Adam and I do.
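
For reference, a minimal sketch of how the two existing hierarchies interact with `nothrow` today: a `nothrow` function may not let an `Exception` escape, but it may still throw an `Error`. The function names `failHard` and `failSoft` are made up for illustration, and the proposed framework exception itself is not shown, since it does not exist yet.

```d
import std.stdio;

// Allowed: Error and its subclasses are not tracked by nothrow.
void failHard() nothrow
{
    throw new Error("logic bug detected");
}

// Also nothrow: the Exception is handled before it can escape.
void failSoft() nothrow
{
    try
    {
        throw new Exception("recoverable problem");
    }
    catch (Exception)
    {
        // swallowed here so the function keeps its nothrow promise
    }
}

void main()
{
    failSoft();

    try
    {
        failHard();
    }
    catch (Error e)
    {
        // Catching Error is allowed, though cleanup on the way up is not guaranteed.
        writeln("caught Error: ", e.msg);
    }
}
```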

April 16

Re: [RFC] Throwing an exception with null pointers

Posted by Walter Bright
in reply to Richard (Rikki) Andrew Cattermole

I confess I don't understand the fear behind a null pointer.

A null pointer is a NaN (Not a Number) value for a pointer. It's similar (but not exactly the same behavior) as 0xFF is a NaN value for a character and NaN is a NaN value for a floating point value.

It means the pointer is not pointing to a valid object. Therefore, it should not be dereferenced. To dereference a null pointer is:

A BUG IN THE PROGRAM

When a bug in the program is detected, the only correct course of action is:

GO DIRECTLY TO JAIL, DO NOT PASS GO, DO NOT COLLECT $200

It's the same thing as `assert(condition)`. When the condition evaluates to `false`, there's a bug in the program.

A bug in the program means the program has entered an unanticipated state. The notion that one can recover from this and continue running the program is only for toy programs. There is NO WAY to determine if continuing to run the program is safe or not.

I did a lot of programming on MS-DOS. There is no memory protection there. Writing through a null pointer would scramble the operating system tables, which meant the operating system would do something terrible. There were many times when it literally scrambled my hard disk. (I made lots of backups.)

If you haven't had this pleasure, it may be hard to realize what a godsend protected memory is. A null pointer no longer requires reinstalling the operating system. Your program simply quits with a stack trace.

With the advent of protected mode, I immediately ceased all program development in real mode DOS. Instead, I'd fully debug it in protected mode, and then as the very last step I'd test it in real mode.

Protected mode is the greatest invention ever for computer programs. When the hardware detects a null pointer dereference, it produces a seg fault, the program stops running and you get a stack trace which gives you the best chance ever of finding the cause of the seg fault.

A lovely characteristic of seg faults is they come FOR FREE! There is zero cost to them. They don't slow your program down at all. They do not add bloat. It's all under the hood.

The idea that a null pointer is a billion dollar mistake is just ludicrous to me. The real mistake is having unchecked arrays, which don't get hardware protection, and are the #1 source of malware injection problems.

Being unhappy about a null pointer seg fault is like complaining that the seatbelt left a bruise on your body as it saved you from your body being broken (this has happened to me, I always always wear that seatbelt!).

Of course, it is better to detect a seg fault at compile time. Data Flow Analysis can help:

```d
int x = 1;
void main()
{
     int* p;
     if (x) *p = 3;
}
```
Compiling with `-O`, which enables Data Flow Analysis:
```
dmd -O test.d
Error: null dereference in function _Dmain
```
Unfortunately, DFA has its limitations that nobody has managed to solve (the halting problem), hence the need for runtime checks, which the hardware does nicely for you.
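
An illustration of the kind of case meant here, as a sketch only: whether the pointer is null depends on a value known only at run time, so no compile-time analysis can settle it and the caller has to check. The `lookup` helper is made up for this example.

```d
import std.stdio;

/// Returns a pointer to the first matching element, or null if there is none.
int* lookup(int[] data, int key)
{
    foreach (ref item; data)
        if (item == key)
            return &item;
    return null; // not found
}

void main(string[] args)
{
    int[] data = [1, 2, 3];
    int key = cast(int) args.length; // only known at run time

    int* p = lookup(data, key);
    if (p !is null)
        writeln("found ", *p);
    else
        writeln("not found");
}
```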

Fortunately, D is powerful enough so you can make a non-nullable type.

In summary, the notion that one can recover from an unanticipated null pointer dereference and continue running the program is a seriously bad idea. There are far better ways to make failsafe systems. Complaining about a seg fault is like complaining that a seatbelt left a bruise while saving you from being maimed.

April 16

Re: [RFC] Throwing an exception with null pointers

Posted by Walter Bright
in reply to Steven Schveighoffer

Thank you, Steven. This is correct.

April 16

Re: [RFC] Throwing an exception with null pointers

Posted by Walter Bright
in reply to Richard (Rikki) Andrew Cattermole

The correct solution is to restart the process.

The null pointer dereference could be a symptom of a wild pointer writing all over the process space.

April 16

Re: [RFC] Throwing an exception with null pointers

Posted by Derek Fawcus
in reply to Walter Bright

On Wednesday, 16 April 2025 at 18:19:58 UTC, Walter Bright wrote:
> Thank you, Steven. This is correct.

Yup - I like the crash...

However, I do have an interest in being able to write code with distinct nullable and nonnull pointers, such that the compiler (or an SA tool) can complain when they're incorrectly confused.

So passing (or assigning) a nullable pointer to a nonnull one without a prior check should generate a compile error or warning.  That should only require function local DFA.

The reason to want it is simply that test cases may not exercise complete coverage for various paths when one only has the C style pointer, and so it should allow for easy latent bug detection and fixes when one is not bypassing the type system.

If one is bypassing the type system, then one takes the risks, but the SIGSEGV is still there to catch the bug.

(Yes, I've programmed under DOS. I also took advantage of a protected mode OS (FlexOS) when available to prove and debug the code first. The TurboC style 'detected a null pointer write' at program exit, while occasionally useful, was grossly inadequate.)

April 17

Re: [RFC] Throwing an exception with null pointers

Posted by Richard (Rikki) Andrew Cattermole
in reply to Walter Bright

On 17/04/2025 6:38 AM, Walter Bright wrote:
> The correct solution is to restart the process.
> 
> The null pointer dereference could be a symptom of a wild pointer writing all over the process space.

Yes, we are in agreement on this situation.

.net has a very strong guarantee that a pointer can only point to null or a valid instance of that type.

The restrictions that @safe places on a function do remove this as a possibility in D also.

And this is where we diverge: there is a subset which does have this guarantee, where a null dereference does indicate a logic error and not program corruption.

This is heavily present in web development, but rather rare in other types of projects by comparison.

April 16

Re: [RFC] Throwing an exception with null pointers

Posted by Walter Bright
in reply to Derek Fawcus

On 4/16/2025 11:43 AM, Derek Fawcus wrote:
> However I do have an interest in being able to write code with distinct nullable and nonnull pointers.  That such that the compiler (or an SA tool) can complain when they're incorrectly confused.

That's what templates are for!
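
A minimal sketch of what such a template could look like, assuming a hypothetical `NonNull` wrapper (this is not a Phobos type). The null check happens once at construction, and a function that takes a `NonNull!int` cannot be handed a raw `null`. Note that the `assert` is compiled out with `-release`.

```d
import std.stdio;

/// Hypothetical wrapper: holds a pointer that was checked once at construction.
struct NonNull(T)
{
    private T* ptr;

    @disable this(); // no default (null) state

    this(T* p)
    {
        assert(p !is null, "NonNull constructed from a null pointer");
        ptr = p;
    }

    /// Dereference without a further check.
    ref T get() { return *ptr; }
}

/// The callee states its requirement in the signature.
int readValue(NonNull!int p)
{
    return p.get;
}

void main()
{
    int x = 42;
    auto p = NonNull!int(&x);
    writeln(readValue(p)); // prints 42

    // readValue(null);  // does not compile: null is not a NonNull!int
    // NonNull!int q;    // does not compile: default construction is disabled
}
```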

April 17

Re: [RFC] Throwing an exception with null pointers

Posted by Richard (Rikki) Andrew Cattermole
in reply to Walter Bright

On 17/04/2025 6:18 AM, Walter Bright wrote:
> I confess I don't understand the fear behind a null pointer.
> 
> A null pointer is a NaN (Not a Number) value for a pointer. It's similar (but not exactly the same behavior) as 0xFF is a NaN value for a character and NaN is a NaN value for a floating point value.

Agreed.

But unlike floating point, pointer issues kill the process.

They invalidate the task at hand.

> It means the pointer is not pointing to a valid object. Therefore, it should not be dereferenced.

If you write purely @safe code that isn't possible.

Just like what .net guarantees.

> To dereference a null pointer is:
> 
> A BUG IN THE PROGRAM

Agreed, the task has not got the ability to continue and must stop.

A task is not the same thing as a process.

> When a bug in the program is detected, the only correct course of action is:
> 
> GO DIRECTLY TO JAIL, DO NOT PASS GO, DO NOT COLLECT $200
> 
> It's the same thing as `assert(condition)`. When the condition evaluates to `false`, there's a bug in the program.

You are not going to like what the unittest runner is doing then.

https://github.com/dlang/dmd/blob/d6602a6b0f658e8ec24005dc7f4bf51f037c2b18/druntime/src/core/runtime.d#L561

> A bug in the program means the program has entered an unanticipated state. The notion that one can recover from this and continue running the program is only for toy programs. There is NO WAY to determine if continuing to run the program is safe or not.

Yes, that is certainly possible in a lot of cases.

We are in total agreement that the default should always be to kill the process.

The problem lies in a very specific scenario where @safe is being used heavily, where logic errors are extremely common but memory errors are not.

I want us to be 100% certain that a read barrier cannot function as a backup plan to DFA language features. If it can, it will give a better user experience than DFA alone. We've seen what happens when you try to solve these kinds of problems exclusively with DFA: it shows up as DIP1000 not being able to be turned on by default.
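
To make the term concrete: a read barrier here amounts to a check placed before a dereference, turning a null into something throwable rather than a hardware fault. A hand-written sketch follows; `NullDereferenceError` and `checkedDeref` are hypothetical names, and an actual language feature would have the compiler insert the check rather than the programmer.

```d
import std.stdio;

/// Hypothetical throwable raised when the check fails.
class NullDereferenceError : Error
{
    this(string file = __FILE__, size_t line = __LINE__)
    {
        super("null pointer dereference", file, line);
    }
}

/// Stand-in for the check a compiler-inserted read barrier would perform.
ref T checkedDeref(T)(T* p, string file = __FILE__, size_t line = __LINE__)
{
    if (p is null)
        throw new NullDereferenceError(file, line);
    return *p;
}

void main()
{
    int value = 5;
    int* good = &value;
    int* bad = null;

    writeln(checkedDeref(good)); // prints 5

    try
    {
        writeln(checkedDeref(bad));
    }
    catch (NullDereferenceError e)
    {
        writeln("caught: ", e.msg, " at line ", e.line);
    }
}
```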

If the end result is that we have to recommend the slow DFA exclusively for production code then so be it. I want us to be certain that we have no other options.

> I did a lot of programming on MS-DOS. There is no memory protection there. Writing through a null pointer would scramble the operating system tables, which meant the operating system would do something terrible. There were many times when it literally scrambled my hard disk. (I made lots of backups.)

As you know I'm into retro computers, so yeah I'm familiar with not having memory protection and the consequences thereof.

> If you haven't had this pleasure, it may be hard to realize what a godsend protected memory is. A null pointer no longer requires reinstalling the operating system. Your program simply quits with a stack trace.
> 
> With the advent of protected mode, I immediately ceased all program development in real mode DOS. Instead, I'd fully debug it in protected mode, and then as the very last step I'd test it in real mode.

I've read your story on this in the past and believed you the first time.

> Protected mode is the greatest invention ever for computer programs. When the hardware detects a null pointer dereference, it produces a seg fault, the program stops running and you get a stack trace which gives you the best chance ever of finding the cause of the seg fault.

You don't always get a stack trace.

Nor does it allow you to fully report to a reporting daemon what went wrong for diagnostics.

Instead of a signal, Windows has it throw an exception that then gets caught right at the top. This triggers the reporting daemon, and allows for catching, filtering and adding more information to the report. Naturally we can't support it due to exceptions...

At the OS level things have progressed from simply segfaulting out, even in the native world.

https://learn.microsoft.com/en-us/windows/win32/api/werapi/nf-werapi-werregisterruntimeexceptionmodule

> A lovely characteristic of seg faults is they come FOR FREE! There is zero cost to them. They don't slow your program down at all. They do not add bloat. It's all under the hood.
> 
> The idea that a null pointer is a billion dollar mistake is just ludicrous to me. The real mistake is having unchecked arrays, which don't get hardware protection, and are the #1 source of malware injection problems.

While I don't agree that it was a mistake (token values are just as bad), that is his name for it.

I view it the same way as I view coroutine coloring.

It's a feature to keep operating environments sane. But by doing so it causes pain and forces you to deal with the problem rather than let it go unnoticed.

Have a read of the show notes: https://www.infoq.com/presentations/Null-References-The-Billion-Dollar-Mistake-Tony-Hoare/

"27:40 This led me to suggest that the null value is a member of every type, and a null check is required on every use of that reference variable, and it may be perhaps a billion dollar mistake."

None of this is new! :)

> Being unhappy about a null pointer seg fault is like complaining that the seatbelt left a bruise on your body as it saved you from your body being broken (this has happened to me, I always always wear that seatbelt!).

Never happened to me, and I still wear it.

That doesn't mean I want to be in a car driven with hard stops that are within the driver's control to avoid.

> Of course, it is better to detect a seg fault at compile time. Data Flow Analysis can help:
> 
> ```d
> int x = 1;
> void main()
> {
>      int* p;
>      if (x) *p = 3;
> }
> ```
> Compiling with `-O`, which enables Data Flow Analysis:
> ```
> dmd -O test.d
> Error: null dereference in function _Dmain
> ```

Right, local information only.

Turns out even the C++ folks are messing around with frontend DFA for this :/ With cross-procedural information in AST.

> Unfortunately, DFA has its limitations that nobody has managed to solve (the halting problem), hence the need for runtime checks, which the hardware does nicely for you.
> 
> Fortunately, D is powerful enough so you can make a non-nullable type.

I've considered the possibility of explicit boxing.

With and without compiler forcing it (by disallowing raw pointers and slices).

Everything we can do with boxing using library types, can be done better with the compiler. Including making sure that it actually happens.

I've seen in my own stuff what happens if we force boxing rather than doing something in the language. The number of errors I have with my @mustuse error type is staggering. We gotta get a -betterC compatible solution to exceptions that isn't heap allocated or using unwinding tables etc.

It would be absolutely poor engineering to try to convince anyone to box raw pointers, let alone make that the recommended or required solution as part of PhobosV3. There has to be a better way.

> In summary, the notion that one can recover from an unanticipated null pointer dereference and continue running the program is a seriously bad idea. There are far better ways to make failsafe systems. Complaining about a seg fault is like complaining that a seatbelt left a bruise while saving you from being maimed.

Program != task.

No one wants the task to continue after a null dereference occurs. We are not in disagreement. It must attempt to clean up (if the segfault handler fires then straight to death the process goes) and die.
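
A rough sketch of the task/process distinction, under the assumption that the failure is surfaced as something catchable: the failing task cleans up and dies at its boundary, while the process carries on with the next task. `TaskFailed`, `handle` and `runTask` are hypothetical names, and an explicit null check stands in for whatever mechanism would detect the dereference.

```d
import std.stdio;

/// Hypothetical failure type raised when a task hits a logic error.
class TaskFailed : Exception
{
    this(string msg) { super(msg); }
}

/// One unit of work, e.g. handling a single request.
int handle(int* input)
{
    scope (exit) writeln("  cleanup for this task ran");
    if (input is null)
        throw new TaskFailed("input was null");
    return *input * 2;
}

/// Task boundary: a failed task is dropped, the process keeps serving others.
bool runTask(int* input)
{
    try
    {
        writeln("  result: ", handle(input));
        return true;
    }
    catch (TaskFailed e)
    {
        writeln("  task aborted: ", e.msg);
        return false;
    }
}

void main()
{
    int a = 21;
    int*[] inputs = [&a, null, &a];
    foreach (i, p; inputs)
    {
        writeln("task ", i);
        runTask(p);
    }
}
```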

We are not as far off as it might appear.
