[RFC] Throwing an exception with null pointers (page 10)

2 days ago

Re: [RFC] Throwing an exception with null pointers

Posted by GrimMaple
in reply to Richard (Rikki) Andrew Cattermole

On Saturday, 12 April 2025 at 23:11:41 UTC, Richard (Rikki) Andrew Cattermole wrote:

> ## Language support

FYI, some time ago I designed https://github.com/GrimMaple/mud/blob/master/source/mud/nullable.d to work as a sort of C#-like nullable check (the kind you get with `<Nullable>enable</Nullable>`). My intention was to include it in OpenD so that it works like this:

When a compiler flag is enabled (e.g. -nullcheck), the code below:

```d
Object o = new Object();
```

would then be silently rewritten by the compiler as

```d
NotNull!Object o = new Object();
```

thus enabling compile-time null checks.

The solution is not perfect and needs some further compiler work (e.g. checking whether a field is indirectly initialized by a function called in a constructor). Also, I lack the dmd knowledge to implement this myself, so it didn't go anywhere in terms of actual inclusion, but I might give it a go some time in the future.
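
For readers who have not opened the linked file, here is a minimal sketch of what a `NotNull!T` wrapper could look like. It is not the code from the mud repository; the member names (`payload`, `get`) and the choice of a runtime `assert` at the construction boundary are assumptions made for illustration only.

```d
// Hypothetical sketch of a non-null wrapper for class references.
struct NotNull(T)
    if (is(T == class))
{
    private T payload;

    // No default construction, so a NotNull!T can never start out as null.
    @disable this();

    // Reject the null literal at compile time.
    @disable this(typeof(null));
    @disable void opAssign(typeof(null));

    // A reference of unknown origin is checked once, at the boundary.
    this(T value)
    {
        assert(value !is null, "NotNull constructed from a null reference");
        payload = value;
    }

    // Read-only access; alias this lets the wrapper be used where a T is expected.
    inout(T) get() inout { return payload; }
    alias get this;
}
```

With this sketch, `NotNull!Object n;` and `NotNull!Object(null)` are rejected by the compiler, while `NotNull!Object(new Object)` is accepted and converts implicitly back to `Object`.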

2 days ago

Re: [RFC] Throwing an exception with null pointers

Posted by Richard (Rikki) Andrew Cattermole
in reply to GrimMaple

On 21/04/2025 10:34 PM, GrimMaple wrote:
> On Saturday, 12 April 2025 at 23:11:41 UTC, Richard (Rikki) Andrew Cattermole wrote:
>> ## Language support
> 
> FYI, some time ago I designed this https://github.com/GrimMaple/mud/blob/master/source/mud/nullable.d to work as sort-of C#-like Nullable checks (that work if you `<Nullable>enable</Nullable>`). My intention was to include it in OpenD to work like this:
> 
> When a compiler flag is enabled (eg -nullcheck), the code below:
> ```d
> Object o = new Object();
> ```
> Would be then silently rewritten by compiler as
> ```d
> NotNull!Object o = new Object();
> ```
> thus enabling compile-time null checks.
> 
> The solution is not perfect and needs some further compiler work (eg checking if some field is indirectly initialized by some func called in a constructor). Also, I lack dmd knowledge to insert this myself, so this didn't go anywhere in terms of actual inclusion, but I might give it a go some time in the future.

That is the Swift solution to the problem, more or less.

2 days ago

Re: NonNull template

Posted by kdevel
in reply to Jonathan M Davis

On Sunday, 20 April 2025 at 22:19:39 UTC, Jonathan M Davis wrote:
>> I consider nonconforming generally unacceptable.
>
> Writing a program which doesn't behave properly is always a problem and should be considered unacceptable.

The problematic word is "behave". Only recently there was a
thread on reddit where the user Zde-G pinpointed the
problem while discussing a "new name" for undefined
behavior (UB) [5]:

    '90% of confusion about UB comes from the simple fact
    that something is called behavior. Defined, undefined,
    it doesn't matter: layman observes world behavior,
    layman starts thinking about what kind of behavior can
    there be.

    The mental model every programmer which observes that
    term for the first time is “some secret behavior which
    is too complex to write in the description of the
    language… but surely I can glean it from the compiler
    with some experiments”.

    This is entirely wrong mental model even for C and doubly
    so for Rust or Zig. And it takes insane amount of effort
    to teach **every single newcomer** that it's wrong model.
    I have seen **zero** exceptions.

    New name should talk about code, not about behavior.
    “Invalid code” or “forbidden code” or maybe “erroneous
    construct”, but something, anything which is not related
    to what happens in runtime.

    There are no runtime after UB, it's as simple as that.
    The only option if your code have UB is to go and fix
    the code… and yet the name doesn't include anything
    related to code at all and concentrates on entirely wrong
    thing.'

> [...]
>
> If you create a reference from a null pointer, you have a bug whether the program is written in C++ or D.

That is not true. A D program like this:

    void main ()
    {
        int *p = null;
        ref int i = *p; // DMD v2.111.0
    }

is a valid program and there is no UB, no crash and no bug.
I already pointed this out earlier with reference to the D spec [3].

[3] https://dlang.org/spec/type.html#pointers
    "When a pointer to T is dereferenced, it must either contain a null
    value, or point to a valid object of type T."

[5] Zde-G comments on Blog Post: UB Might Be a Wrong Term for Newer Languages:
    https://old.reddit.com/r/rust/comments/129mz8z/blog_post_ub_might_be_a_wrong_term_for_newer/jep231f/

2 days ago

Re: [RFC] Throwing an exception with null pointers

Posted by GrimMaple
in reply to Richard (Rikki) Andrew Cattermole

On Monday, 21 April 2025 at 10:45:15 UTC, Richard (Rikki) Andrew Cattermole wrote:

> That is the Swift solution to the problem, more or less.

My take here is that null pointers (references, whatever) just shouldn't exist, period. The compiler should not let me create a null pointer unless I specifically ask for one, e.g. with Nullable!MyType or, even better, Optional!MyType -- then we can finally ditch the "null pointer" semantics and use "correct" terminology, because most often a null pointer is just used as a quasi-optional type. Any memory allocation that would end up yielding a null pointer, e.g. new running out of memory, should result in an Exception/Error instead of returning null.

I think that having NotNull!T provides a soft transition from having null to not having null at all -- it's fairly trivial to prepend MaybeNull! to any variable type that needs it, and then it's fairly easy to just rename it to Optional c:

2 days ago

Re: NonNull template

Posted by Atila Neves
in reply to Jonathan M Davis

On Thursday, 17 April 2025 at 22:12:22 UTC, Jonathan M Davis wrote:
> On Thursday, April 17, 2025 11:36:49 AM MDT Dave P. via Digitalmars-d wrote:
>> On Thursday, 17 April 2025 at 16:39:28 UTC, Walter Bright wrote:
> That being said, I honestly think that the concern over null pointers is completely overblown. I can't even remember the last time that I encountered one being dereferenced.

I can, last week. The process crashed, I ran `coredumpctl gdb`, immediately fixed the issue and carried on with my day. By which I mean I agree with you that I don't think it's a big deal either.

> And when I have, it's usually because I used a class and forgot to initialize it, which blows up very quickly in testing rather than it being a random bug that occurs during execution.

That's exactly what I did last week.

2 days ago

Re: NonNull template

Posted by Johan
in reply to Jonathan M Davis

On Saturday, 19 April 2025 at 22:49:19 UTC, Jonathan M Davis wrote:
> On Thursday, April 17, 2025 8:39:27 PM MDT Walter Bright via Digitalmars-d wrote:
>> I'd like to know what those gdc and ldc transformations are, and whether they are controllable with a switch to their optimizers.
>>
>> I know there's a problem with WASM not faulting on a null dereference, but in another post I suggested a way to deal with it.
>
> Unfortunately, my understanding isn't good enough to explain those details. I discussed it with Johan in the past, but I've never worked on ldc or with llvm (or on gdc/gcc), so I really don't know what is or isn't possible. However, from what I recall of what Johan said, we were kind of stuck, and llvm considered dereferencing null to be undefined behavior.

There is a way now to tell LLVM that dereferencing null is defined (nota bene) behavior.

> It may be the case that there's some sort of way to control that (and llvm may have more capabilities in that regard since I last discussed it with Johan), but someone who actually knows llvm is going to have to answer those questions. And I don't know how gdc's situation differs either.

I have not responded in this thread so far because I feel it is an old discussion, with old misunderstandings.

There is confusion between dereferencing in the language and dereferencing by the CPU. What I think C and C++ do very well is to separate language behavior from implementation/CPU behavior, and to prescribe only language behavior, with no (or very little) implementation behavior. I feel D should do the same.

Non-virtual method example, where (in my opinion) the dereference happens at call site, not inside the function:

```d
class A {
   int a;
   final void foo() { // non-virtual
      a = 1; // no dereference here
   }
}

A a;
a.foo(); // <-- DEREFERENCE
```

During program execution, with the current D implementation of classes and non-virtual methods, the CPU will only "dereference" the `this` pointer when it does the assignment to `a`. But that is only the case for our current implementation. For the D language behavior, it does not matter what the implementation does: the same behavior should happen on any architecture/platform/execution model.

If you want to fault on null-dereference, I believe you have to add a null-check at every dereference at language level (regardless of implementation details). Perhaps it does not impact performance very much (with optimizer enabled); I vaguely remember a paper from Microsoft where they tried this and did not see a big perf impact (if any).
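
To make the call-site point concrete, here is a rough sketch of a language-level check written out by hand. The helper `checked`, the exception type `NullDereference` and the wrapper `callFoo` are hypothetical names invented for this illustration; a compiler would insert the equivalent check itself, and whether it should throw, assert or abort is left open.

```d
class A
{
    int a;
    final void foo() { a = 1; }   // non-virtual, as in the example above
}

// Hypothetical exception type for this sketch.
class NullDereference : Exception
{
    this(string file, size_t line)
    {
        super("null dereference", file, line);
    }
}

// Hypothetical stand-in for a compiler-inserted check at the dereference.
T checked(T)(T r, string file = __FILE__, size_t line = __LINE__)
    if (is(T == class))
{
    if (r is null)
        throw new NullDereference(file, line);
    return r;
}

// Returns true if the call site rejected a null reference.
bool callFoo(A obj)
{
    try
    {
        checked(obj).foo();   // the check runs before foo's body is entered
        return false;
    }
    catch (NullDereference)
    {
        return true;
    }
}
```

The check fires at `checked(obj).foo()`, not at `a = 1` inside `foo`, so the outcome does not depend on whether a particular CPU or runtime happens to fault on address zero.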

Some notes to trigger you to think about distinguishing language behavior from CPU/implementation details:

- You don't have to implement classes and virtual functions using a vptr/vtable, there are other options!
- There does not need to be a "stack" (implementation detail vocabulary). Some "CPUs" don't have a "stack", and instead do "local storage" (language vocabulary) in an alternative way. In fact, even on CPUs with a stack, it can help to not use it! (read about Address Sanitizer detection of stack-use-after-scope and ASan's "fake stack")
- Pointers don't have to be memory addresses (you probably already know that they are not physical addresses on common CPUs), but could probably be implemented as hashes/keys into a database as well. C does not define ordered comparison (e.g. > and <) for pointers (it's implementation defined, IIRC), except when they point into the same object (e.g. an array or struct). Why? Because what does it mean on segmented memory architectures (e.g. x86)?
- Distinguishing language from implementation behavior means that correct programs work the same on all kinds of different implementations (e.g. you can run your C++ program in a REPL, or run it in your browser through WASM).

cheers,
Johan

2 days ago

Re: NonNull template

Posted by Richard (Rikki) Andrew Cattermole
in reply to Johan

On 22/04/2025 5:29 AM, Johan wrote:
> If you want to fault on null-dereference, I believe you /have/ to add a null-check at every dereference at /language/ level (regardless of implementation details). Perhaps it does not impact performance very much (with optimizer enabled); I vaguely remember a paper from Microsoft where they tried this and did not see a big perf impact (if any).

I agree with what you're saying here, but I want to refine it a little bit.

Every language dereference must have an _associated_ read barrier.

What this means is:

```d
T* ptr;
readbarrier(ptr);
ptr.field1;
ptr.field2;

ptr = ...;
readbarrier(ptr);
ptr.field3;
```

A very simple bit of object tracking when inserting the checks will eliminate a ton of these. To be fair, we should be doing the same for array bounds checking, if we are not already.

The fast DFA that this would be used with would also eliminate a ton of them, so performance should be a complete non-issue, given how OK we are with array bounds checks.
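
A runnable version of the snippet, as a sketch only: `readbarrier` is the name used above, but its body here (a plain null test that throws) is an assumption, and `Node`, `sumFields` and the `barriersRun` counter are made up to show how many checks actually execute.

```d
struct Node { int field1, field2, field3; }

int barriersRun;   // counts executed checks, for illustration only

// Assumed body: a plain null test. A real barrier might abort instead.
void readbarrier(T)(T* ptr)
{
    ++barriersRun;
    if (ptr is null)
        throw new Exception("null dereference");
}

int sumFields(Node* first, Node* second)
{
    Node* ptr = first;
    readbarrier(ptr);        // one check covers both reads below
    int total = ptr.field1;
    total += ptr.field2;     // no second check: ptr has not changed

    ptr = second;            // reassignment discards what was known about ptr
    readbarrier(ptr);
    total += ptr.field3;
    return total;
}
```

Three field reads cost two checks here; the tracking described above is what lets the check before `ptr.field2` be dropped.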
