July 28
On 28/07/2024 6:33 AM, Walter Bright wrote:
> On 7/27/2024 2:30 AM, Richard (Rikki) Andrew Cattermole wrote:
>> There are two solutions for a native language that isn't just ignoring it like we do today. Check it at CT and warn/error if you do not handle null, or inject read barriers like we do for bounds checks to cause the runtime exception.
> 
> How is `assert(p != null)` better?

Two reasons.

1. You made the decision. You had to consider what would happen in this situation. The question was asked: what happens should this be null? If you want to assert, that's fine. What is not fine is making an assumption without ever stating it or having it proven to be true.

2. It throws an exception (in D), which can be caught safely. It is only when exception chaining occurs that the state may not have been cleaned up correctly.

An even better solution than using an assert is to use an if statement instead.

```d
if (int* ptr = var.field) {
	// success
} else {
	// fail, you can gracefully degrade/log here
}
```

With the help of the ``?.`` operator, we can also make this very nice to use for loads and stores, AND be temporally safe!

```d
if (S1* ptr1 = var.field1) {
	if (S2* ptr2 = ptr1.field2) {
		// success
		goto AfterFailure;
	}
}

// failure

AfterFailure:
```

This would be the lowering of:

```d
if (int* ptr = var?.field1?.field2) {
	// success
} else {
	// failure
}
```

>> Note: this is solved in C/C++ world. They have such analysis in their compilers: analyzer-null-dereference
>>
>> https://gcc.gnu.org/onlinedocs/gcc-10.1.0/gcc/Static-Analyzer-Options.html
> 
> -Wanalyzer-possible-null-argument
> 
> That's not solving it.

Yes and no.

It cannot solve it completely in C or C++ because there is no language integration.

For temporal safety, you have to make sure you do a load into a variable, and only then do the test.

Which means this:

```d
void func(int** ptr) {
	assert(*ptr !is null);

	int i = **ptr; // ERROR: `**ptr` is in an unknown type state, it could be null
}
```

> Many simple cases can indeed be found by DFA. But not all. There's also a lot of DFA trying to eliminate array bounds errors, but such errors remain the #1 security problem with C and C++.
> 
> D does insert checks for array bounds overflow, because the hardware is of no help.

I agree about bounds checks. However, as read barriers they are something we do control, and they throw an exception that we can catch.

For this reason, bounds checks do not need CT analysis. They do not have to bring down an application if caught.
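
For comparison, here is a minimal sketch (mine, not from the post above) of what "catchable" means here: with default dmd flags, a failed bounds check throws `core.exception.RangeError`, which a caller can technically catch, even though catching `Error`s is generally discouraged.

```d
import core.exception : RangeError;

void main()
{
    int[] arr = [1, 2, 3];
    bool caught = false;
    try
    {
        int x = arr[5]; // bounds check fails here and throws RangeError
    }
    catch (RangeError e)
    {
        caught = true; // the process can log and refuse the request instead of dying
    }
    assert(caught);
}
```

Note that `-release` or `-boundscheck=off` removes these checks in @system code, so this behavior depends on compiler flags.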

On the other hand, a null dereference requires signal handling to catch. If you don't own the thread and process, you cannot use that mechanism to throw an exception. For this reason it's not comparable, and it requires CT analysis to prevent bringing down the process.

On the other hand, if you were to propose read barriers that throw exceptions for null dereference, I would agree that they are comparable.

>> It only takes one major outage where a business loses money before they consider dumping a D company's solution. No client wants to hear: "We did this in a known unsafe language for this particular error, when a more mainstream language has solutions to it and D doesn't."
> 
> How is this different from `assert(p != null)` ?

See at top of my reply.




A joke I've been making at BeerConf atm (more or less):

User: Press F5
User: Press F5
User: Press F5, WHY ISN'T THE ABOUT PAGE SHOWING??????
User: Press F5 * 1000

Developer: Why are all the web server instances down?
Developer: Oh a segfault due to null dereference

User: Press F5 * 100000000000

Owner: Why did you use D! If you just used <insert mainstream application VM language> we wouldn't have lost millions of dollars!!!
July 27
BTW,

```
void func()
{
    int* p = null;
    *p = 3;
}
```

```
dmd -c null.d -vasm
_D4null4funcFZv:
0000:   31 C0                    xor       EAX,EAX
0002:   C7 00 03 00 00 00        mov       dword ptr [RAX],3
0008:   C3                       ret
```

```
./cc -c null.d -O
Error: null dereference in function _D4null4funcFZv
```

What is happening here? D's optimizer, as a result of doing a Data Flow Analysis called "constant propagation", discovers an attempt to dereference a null pointer.

It does intra-procedural DFA, not inter-procedural, unless the called functions get inlined. (The optimizer runs after inlining.)
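
To illustrate the inlining point (my sketch, not from the post): intra-procedural DFA cannot see through the call below, but once the callee is inlined, constant propagation can discover the null store.

```d
// Illustrative helper: returns null through an opaque call boundary.
int* getNull() { return null; }

void func()
{
    int* p = getNull(); // without inlining, DFA knows nothing about p's value
    *p = 3;             // after inlining + constant propagation, a detectable null store
}
```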

----------------

Let's look at gcc:

```c
void func()
{
    int* p = (int*)0;
    *p = 3;
}
```

```
> gcc -c null.c
> gcc -c null.c -O
>
```


July 28
It would help if you turned on the analysis ;)

```c
int main(int argc, char** argv)
{
     int* p = (int*)0;
     *p = 3;
     return 0;
}
```

GCC args: ``-fanalyzer``

GCC Output:

```
<source>: In function 'main':
<source>:4:9: warning: dereference of NULL 'p' [CWE-476] [-Wanalyzer-null-dereference]
    4 |      *p = 3;
      |      ~~~^~~
  'main': events 1-2
    |
    |    3 |      int* p = (int*)0;
    |      |           ^
    |      |           |
    |      |           (1) 'p' is NULL
    |    4 |      *p = 3;
    |      |      ~~~~~~
    |      |         |
    |      |         (2) dereference of NULL 'p'
    |
ASM generation compiler returned: 0
<source>: In function 'main':
<source>:4:9: warning: dereference of NULL 'p' [CWE-476] [-Wanalyzer-null-dereference]
    4 |      *p = 3;
      |      ~~~^~~
  'main': events 1-2
    |
    |    3 |      int* p = (int*)0;
    |      |           ^
    |      |           |
    |      |           (1) 'p' is NULL
    |    4 |      *p = 3;
    |      |      ~~~~~~
    |      |         |
    |      |         (2) dereference of NULL 'p'
    |
Execution build compiler returned: 0
Program returned: 139
Program terminated with signal: SIGSEGV
```

CLANG args: ``--analyze``

CLANG Output:

```
clang: warning: -Wl,-rpath,./lib: 'linker' input unused [-Wunused-command-line-argument]
clang: warning: -Wl,-rpath,/opt/compiler-explorer/gcc-13.2.0/lib64: 'linker' input unused [-Wunused-command-line-argument]
clang: warning: -Wl,-rpath,/opt/compiler-explorer/gcc-13.2.0/lib32: 'linker' input unused [-Wunused-command-line-argument]
clang: warning: argument unused during compilation: '-L./lib' [-Wunused-command-line-argument]
<source>:4:9: warning: Dereference of null pointer (loaded from variable 'p') [core.NullDereference]
    4 |      *p = 3;
      |       ~ ^
1 warning generated.
ASM generation compiler returned: 0
clang: warning: -Wl,-rpath,./lib: 'linker' input unused [-Wunused-command-line-argument]
clang: warning: -Wl,-rpath,/opt/compiler-explorer/gcc-13.2.0/lib64: 'linker' input unused [-Wunused-command-line-argument]
clang: warning: -Wl,-rpath,/opt/compiler-explorer/gcc-13.2.0/lib32: 'linker' input unused [-Wunused-command-line-argument]
clang: warning: argument unused during compilation: '-L./lib' [-Wunused-command-line-argument]
<source>:4:9: warning: Dereference of null pointer (loaded from variable 'p') [core.NullDereference]
    4 |      *p = 3;
      |       ~ ^
1 warning generated.
Execution build compiler returned: 0
Program returned: 255
[F][2024-07-27T19:10:38+0000][1] runChild():487 Launching child process failed
```
July 27
On Saturday, 27 July 2024 at 18:23:28 UTC, Walter Bright wrote:
> It's true that many algorithms depend on a null pointer being a "sentinel", and people sometimes forget to check for it. That means:
>
> 1. if they forgot to check for the null special case, then the seg fault tells them where the error is
>
> 2. if null was supposed not ever happen, then the seg fault tells where the error is
>
> Dereferencing a null pointer is always a bug.

The dereferencing is not the bug; it is at best one of its symptoms.

In D (in contrast to C++), the null pointer dereference is implementation defined.[1] I.e. if you need a test program that dereferences a null pointer, you can write one in D (you cannot in C++). If that program does what is expected ("[The] program will be aborted" [1]), one can hardly say that it contains a bug.

[1] https://dlang.org/spec/type.html#pointers
July 27
On Saturday, 27 July 2024 at 19:02:19 UTC, Walter Bright wrote:
> [...]
> Let's look at gcc:
>
> ```
> void func()
> {
>     int* p = (int*)0;
>     *p = 3;
> }
> ```
>
> ```
> > gcc -c null.c
> > gcc -c null.c -O
> >
> ```

$ gcc -c -O -Wnull-dereference null.c
null.c: In function 'func':
null.c:4:8: warning: null pointer dereference [-Wnull-dereference]
    4 |     *p = 3;
      |     ~~~^~~

It seems that at least -O1 is required to trigger the detection, and -Wnull-dereference does not appear to be included in -Wall -pedantic. Works since GCC 6.5.0.

| -Wnull-dereference warns if the compiler detects paths that
| trigger erroneous or undefined behavior due to dereferencing a null
| pointer. This option is only active when -fdelete-null-pointer-checks
| is active, which is enabled by optimizations in most targets. The
| precision of the warnings depends on the optimization options used.

[1] https://gcc.gnu.org/gcc-6/changes.html
July 27
On 7/27/2024 12:00 PM, Richard (Rikki) Andrew Cattermole wrote:
> 1. You made the decision. You had to consider what would happen in this situation. The question was asked: what happens should this be null? If you want to assert, that's fine. What is not fine is making an assumption without ever stating it or having it proven to be true.

?? Dereferencing a null pointer is always a bug, whether you decided to check for it or not.


> 2. It throws an exception (in D), which can be caught safely. It is only when exception chaining occurs that the state may not have been cleaned up correctly.

Exceptions in D are the same as the ones used for seg faults (except on Win64, where I couldn't figure out how the system exceptions worked).


> An even better solution than using an assert is to use an if statement instead.
> 
> ```d
> if (int* ptr = var.field) {
>      // success
> } else {
>      // fail, you can gracefully degrade/log here
> }
> ```

The reason exceptions were invented was because such code for every pointer dereference made reasonable code look quite ugly.

Of course, you can still write such code if you like.


> Which means this:
> 
> ```d
> void func(int** ptr) {
>      assert(*ptr !is null);
> 
>      int i = **ptr; // ERROR: `**ptr` is in an unknown type state, it could be null
> }
> ```

Which means the language will require you to manually insert assert()s everywhere.


> For this reason, bounds checks do not need CT analysis.

Whether it is needed or not, the compiler can't do it.

> They do not have to bring down an application if caught.

Array overflows are fatal programming bugs.

A huge discussion about this raged in the n.g. many years ago. There was a camp that maintained that assert failures should be recoverable to the point that the program could continue.

The other camp maintained that when a program enters a state unanticipated by the programmer, then the program is not recoverable, because one no longer has any idea what will happen. The only path forward is to stop the program, gracefully or not.

Obviously, I'm in the latter camp. Obviously, one can write programs in the first camp (D lets you do whatever you want) but I cannot endorse it.


> On the other hand, null dereference requires signal handling to catch. If you don't own thread and process you cannot use that to throw an exception. For this reason it's not comparable and requires CT analysis to prevent bringing down the process.

Whoever catches the signal will then stop the buggy, broken program from doing more damage. It's kinda the point of a computer with memory protection.


> On the other hand, if you were to propose read barriers that throw exceptions for null dereference, I would agree to the statement that they are comparable.

That'll turn D into a very poorly performing language. Besides, throwing an exception is just what seg faults are. It's the same mechanism.


> Owner: Why did you use D! If you just used <insert mainstream application VM language> we wouldn't have lost millions of dollars!!!

D is often unfairly accused of flaws.
July 28
On 28/07/2024 9:19 AM, Walter Bright wrote:
> On 7/27/2024 12:00 PM, Richard (Rikki) Andrew Cattermole wrote:
>> 1. You made the decision. You had to consider what would happen in this situation. The question was asked: what happens should this be null? If you want to assert, that's fine. What is not fine is making an assumption without ever stating it or having it proven to be true.
> 
> ?? Dereferencing a null pointer is always a bug, whether you decided to check for it or not.

The point is to make it not possible for you to dereference a null pointer to begin with.

The compiler won't let you do it without dropping to inline assembly or doing some unsafe casts.

If the compiler forces a check to occur, and it becomes null after, there is something very wrong going on. Likely stack corruption. At which point a segfault is absolutely the right tool for the job!

```d
import std.stdio : writeln;

void func(int* ptr) {
	if (ptr !is null) {
		writeln(*ptr); // ok pointer is known to be good
		writeln(*ptr); // IF ptr is null this needs to segfault!!! STACK CORRUPTION???
	}

	writeln(*ptr); // Error: ptr could be null!!! This should not compile
}
```

>> 2. It throws an exception (in D), which can be caught safely. It is only when exception chaining occurs that the state may not have been cleaned up correctly.
> 
> Exceptions in D are the same as the ones used for seg faults (except for on Win64, where I couldn't figure out how the system exceptions worked).

Unless I can do:

```d
try {
	...
} catch (NullPointerError) {
	...
}
```

It is not the same system from an "I have to write code to handle it" standpoint.

>> An even better solution than using an assert is to use an if statement instead.
>>
>> ```d
>> if (int* ptr = var.field) {
>>      // success
>> } else {
>>      // fail, you can gracefully degrade/log here
>> }
>> ```
> 
> The reason exceptions were invented was because such code for every pointer dereference made reasonable code look quite ugly.
> 
> Of course, you can still write such code if you like.

It can be improved greatly by using the ``?.`` operator for long chains.

Yes, exceptions are how application VM languages do this and that's a good solution for them.

But we can't. We can't set up the signal handler to throw the exception.
We won't use a read barrier.
So we have to force the check upon the user at CT.

>> Which means this:
>>
>> ```d
>> void func(int** ptr) {
>>      assert(*ptr !is null);
>>
>>      int i = **ptr; // ERROR: `**ptr` is in an unknown type state, it could be null
>> }
>> ```
> 
> Which means the language will require you to manually insert assert()s everywhere.

Have another look at the example. The assert is useless, and that is the whole point of that example.

You should be using if statements more often than not, AND you should only need to do this for variables that are loaded from an external source.

```d
import std.stdio : writeln;

int* global;

void func() {
	if (int* ptr = global) {
		// ok, a null check has occurred on the loaded pointer
	}

	writeln(*global); // Error: no check has occurred

	assert(global !is null); // Useless: global could change before the next statement (not temporally safe).
	writeln(*global); // Error: no check has occurred
}
```

Asserts are not as useful as one may think. They can only check variables in a function body. You have to perform a load into a variable before you can do the test, and an if statement is far better at that.

>> For this reason, bounds checks do not need CT analysis.
> 
> Whether it is needed or not, the compiler can't do it.

For what we'd want, agreed.

However, there is DFA to do this verification, so I thought it was worth mentioning.

>> They do not have to bring down an application if caught.
> 
> Array overflows are fatal programming bugs.
> 
> A huge discussion about this raged in the n.g. many years ago. There was a camp that maintained that assert failures should be recoverable to the point that the program could continue.
> 
> The other camp maintained that when a program enters a state unanticipated by the programmer, then the program is not recoverable, because one no longer has any idea what will happen. The only path forward is to stop the program, gracefully or not.
> 
> Obviously, I'm in the latter camp. Obviously, one can write programs in the first camp (D lets you do whatever you want) but I cannot endorse it.

There is a third camp, the one that I and a few others are in. It is the one application VM languages use.

It is also the only camp that is practical for long lived applications.

You need to distinguish between "cannot clean up" and "cannot continue the request".

If you cannot clean up, then we align: shut down the process. There is no way to know what state the process is in, or what it could become. It could infect other processes and, with that, the user.

A good example of this is the stack corruption above. If you load a pointer into a variable, check that it is not null, but then find that it is null when you go to dereference it: absolutely segfault out. You have stack corruption.

Same situation with unmapped memory. That should not be possible: crash time!

On the other hand, a bounds check failing just means your business logic is probably faulty. You cannot fulfill the request, BUT the data itself could still be cleaned up; it hasn't been corrupted.

This is where exception chaining comes in: if you attempt cleanup and it throws another exception that isn't caught, that means cleanup failed. The program is once again in an unknown state and should crash.

```d
struct Foo {
	~this() {
		throw new Exception("cleanup failed");
	}
}

void callee() {
	try {
		called();
	} catch (Exception e) {
		// e.next !is null: a chained exception, oh noes, cleanup failed!!!!
	}
}

void called() {
	Foo foo;
	throw new Exception("original error");
}
```

It is important to note that requests have fixed known entry points (i.e. they continue execution of a coroutine). This isn't any random place in a code base. If you threw in the eventloop you'd still crash out.
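
To make the "fixed entry point" idea concrete, here is a hedged sketch (the names are mine, not from any real framework): the loop catches `Exception` at the request boundary, while `Error`s deliberately propagate and crash the process.

```d
import std.stdio : writeln;

// Hypothetical request boundary: each handler is a known entry point.
void runRequests(void delegate()[] handlers)
{
    foreach (i, handler; handlers)
    {
        try
        {
            handler(); // resume/execute one request
        }
        catch (Exception e)
        {
            // this request failed; log it and keep serving the others
            writeln("request ", i, " failed: ", e.msg);
        }
        // Errors are intentionally not caught here: they bring the process down.
    }
}

void main()
{
    int served = 0;
    runRequests([
        delegate() { served++; },
        delegate() { throw new Exception("bad input"); },
        delegate() { served++; },
    ]);
    assert(served == 2); // the faulty request did not take down the loop
}
```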

>> On the other hand, null dereference requires signal handling to catch. If you don't own thread and process you cannot use that to throw an exception. For this reason it's not comparable and requires CT analysis to prevent bringing down the process.
> 
> Whoever catches the signal will then stop the buggy, broken program from doing more damage. It's kinda the point of a computer with memory protection.

Agreed, that is the behavior you want.

HOWEVER, if you need that behavior, it had better be because the compiler could not have helped you in any way to avoid triggering it.

This should not compile:

```d
func(null); // should be rejected at CT: null passed to a non-null parameter

void func(?nonnull int* ptr) { // hypothetical non-null annotation
	int val = *ptr;
}
```

This sort of thing is the norm in application VM languages now. It isn't a hypothetical improvement.

https://kotlinlang.org/docs/generics.html#definitely-non-nullable-types

https://kotlinlang.org/docs/null-safety.html

https://developer.apple.com/documentation/swift/designating-nullability-in-objective-c-apis#Annotate-Nullability-of-Individual-Declarations

https://learn.microsoft.com/en-us/dotnet/csharp/language-reference/attributes/nullable-analysis#postconditions-maybenull-and-notnull

>> On the other hand, if you were to propose read barriers that throw exceptions for null dereference, I would agree to the statement that they are comparable.
> 
> That'll turn D into a very poorly performing language. Besides, throwing an exception is just what seg faults are. It's the same mechanism.

Agreed, I would never propose it, hence the hypothetical "you".

I used the idea for a comparison to bounds checks.

>> Owner: Why did you use D! If you just used <insert mainstream application VM language> we wouldn't have lost millions of dollars!!!
> 
> D is often unfairly accused of flaws.

As an issue, this one has had CT analysis in application VM languages for around 15 years, so the accusation is kinda fair. Especially with D's Java heritage.
July 29

On Saturday, 27 July 2024 at 18:23:28 UTC, Walter Bright wrote:

> It's true that many algorithms depend on a null pointer being a "sentinel", and people sometimes forget to check for it.

Or they aren't supposed to have a sentinel but they accidentally got passed a null value because the type system allows it.

> That means:
>
> 1. if they forgot to check for the null special case, then the seg fault tells them where the error is
>
> 2. if null was supposed to not ever happen, then the seg fault tells where the error is

You don't get a segfault if your tests weren't run or don't (or can't) cover every case in development. Then your users get the segfault.

Do you accept that the developer detecting those bugs at compile time is preferable to the user having their program abort? The user might not even know how to file a bug, and it could cost them money, time or worse.

July 29

On Saturday, 27 July 2024 at 01:12:09 UTC, Walter Bright wrote:

> You're right that ref's can be null, too. C++ says they can't be null, but it's trivial to make one in C++, so a fat lot of good that does.
>
> Let's say we have a linked list. It will have a next pointer:
>
> ```d
> struct List {
>     int payload;
>     List* next;
> }
> ```
>
> A null value is perfectly valid for next, as that is how the end is found. At least in my coding, I use null values all the time to signify there is nothing there.

Yes; in your example, by virtue of List being a linked list, it can be readily guessed that null is a valid value for next.

My suggestion would be that such a pointer would be defined as List*? to indicate to the type system that null is a valid value for it. This forces any function operating on lists to at least consider the possibility of a null pointer for next.
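
To make the suggestion concrete, here is a hypothetical sketch (the `*?` suffix and the flow-based narrowing shown here do not exist in D today):

```d
struct List {
    int payload;
    List*? next;   // hypothetical: null is a valid value, marking the end
}

int sum(List*? head) {
    int total = 0;
    while (List* node = head) { // the test narrows `List*?` to non-null `List*`
        total += node.payload;
        head = node.next;
    }
    return total;
}
```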

> If it wasn't null, some other value would have to be there to signify "not a valid item". If there must be something there, then the value would have to be checked against the "not a valid item" value. What should I then do if it unexpectedly is the "not a valid item"? The only sensible thing is to abort the program, as it's a program bug.
>
> But with null, I don't have to check, because the hardware does it for free.

I understand the hardware check. I really do. Most of us do.

TL;DR: Let me rephrase it in terms of implicit conversions: If you have a reference-type object (e.g. a pointer) that may be null according to the type system (which, currently, is every reference type except slices), using it in a way that requires it not to be null is an implicit conversion to the non-null version of its type. (The fact that it has zero run-time cost is irrelevant.)

We, the many on this thread, want this implicit conversion to be an error. That’s because it’s a wrong implicit conversion the same way calling a mutable member function on a const object would be a wrong implicit conversion from const to mutable. This does not bar anyone from using an explicit cast to assert the programmer’s wit over the rules of the type system.

The other direction, non-null to nullable, is of course valid and should not require an explicit cast, same as mutable to const does not require an explicit cast.

(End of TL;DR.)
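
The const analogy can be spelled out in today's D (my illustration, assuming nothing beyond the current language): mutable to const converts implicitly, while the reverse direction is rejected unless an explicit cast asserts the programmer's wit.

```d
void takesConst(const(int)* p) {}
void takesMutable(int* p) {}

void main()
{
    int value = 1;
    int* m = &value;
    const(int)* c = m;          // fine: mutable -> const, implicit and safe

    takesConst(m);              // fine for the same reason
    // takesMutable(c);         // error: cannot implicitly convert const(int)* to int*
    takesMutable(cast(int*) c); // explicit cast: the programmer overrides the type system
    assert(*m == 1);
}
```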

What we want is the type system reminding us to mind the null value where it could be a bug to ignore it. To take an analogy from the London Underground, you want to mind the gap, right? But what if you had to mind the gap on every possible step? Would you really do it? Of course you wouldn’t; it’s exhausting and wasted attention on almost every step. But sometimes, you’d trip because you actually should have minded the gap.

What I’m saying is: if programmers can specify which pointers are expected to be null and which are expected not to be null, and the type system ensures that a non-nullable pointer is never assigned a possibly-null value without an explicit cast (a cast which may even have zero runtime cost with the right compiler switches, giving you core dumps or – on WebAssembly – UB), then we can mind the null where it is to be minded, and rest assured that there won’t be nulls where we don’t expect them.

> The only time a null pointer dereference is an actual problem is when running on a machine that does not have memory protection, which are decades in obsolescence.

I don’t know myself, but people consistently point out that it’s not true. As far as I’m told, we’re here:

  • D can target WebAssembly and adds nullable/non-nullable annotations.
  • D’s null is @safe.

Choose one.

> The real issue with null pointer seg faults is the screen dump of mysterious numbers and letters.

What you’re saying is that null dereferences, which are bugs, can rather easily be debugged. First, that depends on the experience of the programmer; I wouldn’t bet my life on being able to understand a core dump, let alone that of a link-time-optimized program. Second, I’d rather have a compile error telling me I’m risking a null dereference bug here than having to test and hopefully discover the bug before the code goes to production. What I do with the error depends on circumstance. I have the following options, probably more:

  • Mark the left-hand side nullable.
  • Mark the origin of the right-hand side as non-nullable.
  • Handle the null case.
  • Insert a cast(!null), risking a bug if I’m wrong.

Only in the “handle the null case” does it incur a run-time cost, which I decided was actually needed and had merely forgotten to do.

My suggestion is, adding two type suffixes: T? for indicating that null is a valid value and T! for indicating that null is not a valid value. Together with module defaults, that makes for a rather seamless transition.

In the current state, the language default is ?, i.e. every reference type is as if suffixed by ?. A module default of !null changes that to !, so explicit ? are needed. A module default only affects what is lexically in the module, not e.g. imported stuff.

```d
// D tomorrow:
module m;
int*  f(); // as if: `int*? f();`
int*? g(); // same type as `f`
int*! h(); // explicitly non-nullable result
```

```d
// D tomorrow:
default(!null)
module m;
int*  f(); // as if: `int*! f();`: explicitly non-nullable result
int*? g(); // explicitly nullable result
int*! h(); // same type as `f`
```

```d
// Possible future D edition where non-null is the language default:
module m;
int*  f(); // as if: `int*! f();`
int*? g(); // explicitly nullable result
int*! h(); // same type as `f`
```

```d
// Possible future D edition where non-null is the language default:
default(null)
module m;
int*  f(); // as if: `int*? f();`: explicitly nullable result
int*? g(); // same type as `f`
int*! h(); // explicitly non-nullable result
```

This is how D could introduce non-nullable reference types (pointers, delegates, class handles, …) to the language. Only for ref would I immediately go with the stricter rule: ref? is allowed to be null, but ordinary ref isn’t, because practically, almost all ref function parameters are expected to be non-null and are bound to arguments that are obviously not null.

As pointed out, if D targets WebAssembly, the cast(!null) can’t be @safe. This isn’t even controversial. D can only target the WebAssembly that exists with all its flaws.

> The real biggest mistake in C is the eager decay of arrays to pointers, and the tragedy of C is nobody has any interest in fixing it.

I don’t disagree, but it’s unrelated.

July 30
On Tuesday, 23 July 2024 at 11:53:24 UTC, Quirin Schroll wrote:
> On Saturday, 20 July 2024 at 12:57:10 UTC, ryuukk_ wrote:
>> On Saturday, 20 July 2024 at 05:58:19 UTC, Richard (Rikki) Andrew Cattermole wrote:
>>> We either get the DFA I'm building, something like it or we're toast in commercial usage.
>>
>> What is DFA?
>
> I always read “deterministic finite automaton.” (I hate acronyms on the forums.)

And I read that as Data Flow Analysis.

As an ESL (English as a Second Language) speaker, I wish we had some sort of reference for this on this forum.

Matheus.