Null-checked reference types

August 06

Posted by Quirin Schroll

Proposal for types

Add the following type suffixes to the language: ? and !.

For every reference type T (the definition excludes slices, see below), T! means “non-nullable T”, T? means “nullable T”, and T without a suffix means either T? or T!, depending on context. For every non-reference type, T! is a synonym for T, and T? is added as an optional type with the same values as T plus a dedicated null value.

Naturally, T! converts to T? implicitly, but for T? to T!, an explicit cast is required and that cast is @system.
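
As a rough library-level approximation in today's D, the conversion rules above could look like the sketch below. This is only an illustration: `NonNull`, `Widget` and `idOrZero` are hypothetical names, and unlike the proposed @system cast, this wrapper checks for null at run time.

```d
// Hypothetical approximation of `T!` for class types; not part of the proposal.
struct NonNull(T)
if (is(T == class))
{
    private T payload;

    @disable this(); // forbid default construction, which would hold null

    this(T value)
    {
        assert(value !is null, "NonNull constructed from null");
        payload = value;
    }

    // Reading the payload yields an ordinary (nullable) reference.
    inout(T) get() inout { return payload; }

    alias get this; // NonNull!T converts implicitly to T
}

class Widget { int id = 42; }

// Accepts a possibly null reference and handles the null case.
int idOrZero(Widget w) { return w is null ? 0 : w.id; }

void main()
{
    auto w = NonNull!Widget(new Widget);

    // Non-null to nullable: implicit.
    assert(idOrZero(w) == 42);
    static assert(is(NonNull!Widget : Widget));

    // Nullable to non-null: no implicit conversion exists.
    static assert(!is(Widget : NonNull!Widget));
}
```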

Multiple suffixes are allowed: T?! is T! and T!? is T?. That is, later ! or ? override any previous ones.

Every lexical use of a reference type without ? or ! appended is equivalent to one of them, depending on the module’s default. The module’s default is either specified (default null module m; or default !null module m;) or is the language’s default (which depends on the Edition).

In class member functions, this has ! type.

Using a value of type T? in any operation that requires it to be non-null is a compile-time error.

Add operators for null-respecting access: ?., ?(…) (call if not null), ?[…] (index if not null), ?= (assign if null).

For if (auto x = expr) and if (T! x = expr), if expr is of type T?, x is inferred to have type T!. Contrary to normal variable definitions, in an if or while condition, T? implicitly converts to T!.
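
For comparison, today's D already accepts a declaration in an if condition and tests it for null, but the declared variable keeps its ordinary nullable type. A minimal sketch using the `in` operator on an associative array (the variable names are illustrative):

```d
void main()
{
    int[string] ages = ["alice": 30];

    // `in` yields an int* that is null when the key is missing.
    if (auto p = "alice" in ages)
    {
        // p is known to be non-null here, but its type is still plain int*.
        assert(*p == 30);
    }
    else
    {
        assert(false, "key should be present");
    }

    if (auto p = "bob" in ages)
        assert(false, "key should be absent");
}
```

Under the proposal, the variable declared in the condition would carry the non-null type rather than the nullable one.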

To wrap the rest of the function in the then block of such an if statement, add if (auto x = expr) ... else … to the language. The ... is part of the core syntax and is intended to be read as “whatever follows next.” The else branch is mandatory, and its … stands for any statement or a possibly empty block. However, for an else block that would be { bool f = false; assert(f); }, add assert(auto x !is expr) to the language. (Note that assert(0) has special semantics and isn’t equivalent to a failed assertion.) An assert with a declaration enforces non-null for a possibly null value.

No data flow analysis is proposed. Null checking is local and done by tracking ? and ! by the type system.

Proposal for `ref`

The most difficult one is ref. ref parameters and variables are assumed to be non-null, i.e. for ref x, &x should not be null. To allow for null references, add ref?.

Non-null enforcement of ref should be done even in the current edition to some degree. My bet is that not a single D program has ever correctly expected and handled a ref-returning function returning a null reference. One would have to take the address of the result and test that pointer for null. No one does that, except some people toying around with the edges of the language who intentionally used ref with null.

In the current Edition, because T* is T*?, a dereferenced pointer is a possibly null reference. Binding one by ref would be an error. For this special case, I propose to allow it instead, as some programs would be full of errors (or deprecation warnings) otherwise.

Reference types

In this DIP Idea, reference types are:

Pointer types
Class / interface types
Associative array types
Function pointer types
Delegate types

Slice types are not reference types in this logic because null slices are equivalent to empty slices for the most part.
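
A small sketch of that near-equivalence in current D (variable names are illustrative): a null slice and an empty non-null slice agree on length, comparison, iteration, and appending, and differ only under `is`.

```d
void main()
{
    int[] none = null;
    int[] backing = [1, 2, 3];
    int[] emptySlice = backing[0 .. 0]; // empty, but points into backing

    // Both behave as empty slices.
    assert(none.length == 0 && emptySlice.length == 0);
    assert(none == emptySlice);        // element-wise comparison
    foreach (x; none) assert(false);   // body never executes

    // Appending to a null slice simply allocates.
    none ~= 4;
    assert(none == [4]);

    // The difference only shows up under identity comparison.
    int[] stillNull = null;
    assert(stillNull is null);
    assert(emptySlice !is null);
}
```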

August 06

Re: Null-checked reference types

Posted by Tim
in reply to Quirin Schroll

On Tuesday, 6 August 2024 at 14:55:46 UTC, Quirin Schroll wrote:

> Add the following type suffixes to the language: ? and !.

Using ! seems to be ambiguous with templates. For example:

void f(T!x)
{}

Is T!x a template instance with template argument x or the new type T! and parameter name x?

August 06

Re: Null-checked reference types

Posted by Quirin Schroll
in reply to Tim

On Tuesday, 6 August 2024 at 15:21:58 UTC, Tim wrote:

> On Tuesday, 6 August 2024 at 14:55:46 UTC, Quirin Schroll wrote:
>> Add the following type suffixes to the language: ? and !.
>
> Using ! seems to be ambiguous with templates. For example:
>
>     void f(T!x)
>     {}
>
> Is T!x a template instance with template argument x or the new type T! and parameter name x?

I totally missed that. It would definitely be parsed as a template instance in that case for backward compatibility. With Primary Type Syntax, you could write (T!) and it’s clear. With the proposed changes, you could be cheeky and write T?! because T? can’t be a template to be instantiated. But… I don’t like any of these.

August 07

Re: Null-checked reference types

Posted by Richard (Rikki) Andrew Cattermole
in reply to Quirin Schroll

Lots here to talk about.

> Add the following type suffixes to the language: ? and !.

I'll ignore the syntax since that is already covered.



> Add operators for null-respecting access: ?., ?(…) (call if not null), ?[…] (index if not null), ?= (assign if null).

This should be split out into a separate DIP. They are things I already want for similar reasons.

However, they must decompose to loads and stores inside if statements. This makes it temporally safe.

``var1.var2?.field = 2;``

```d
if (auto var2 = var1.var2) {
	var2.field = 2;
}
```

This allows you to do both loads and stores and do something if it failed transitively.

```d
if (var1.var2?.var3?.field = 3) {
	// success
} else {
	// failure
}
```



> No data flow analysis is proposed. Null checking is local and done by tracking ? and ! by the type system.

DFA is only required if you want the type state to change as the function is interpreted. So that's fine. That is a me thing to figure out.

However, you do not need to annotate function body variables with this approach.

Look at the initializer of a function variable declaration, it'll tell you if it has the non-null type state.

```d
int* ptr1;
int* ptr2 = ptr1;
```

Function parameters, (including return and this) need to be annotateable with their type state.

```d
int* func(?nonnull return, ?nonnull this, ?nonnull int* ptr) {
	return ptr;
}
```

They can also be inferred by first usage.

```d
void func(int* ptr) {
	int v = *ptr;
}
```

Clearly ``ptr`` has the type state non-null.

However, what caused me problems in the past is tracking variables outside of a function. You cannot do it.

Variables outside a function change type state during their lifespan. They have the full life cycle, starting at reachable, into non-null and then back to reachable. If you tried to force it to be non-null, the language would force you to have an .init value that is non-null. This is a known issue with classes already. It WILL produce logic errors that are undetectable.
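
A minimal illustration of that life cycle in current D, using a hypothetical module-level `Logger`: the variable's .init is null, it is non-null only between setup and teardown, and nothing in its type records which phase the program is in.

```d
class Logger
{
    string[] lines;
    void log(string msg) { lines ~= msg; }
}

Logger logger; // module-level (thread-local) variable; its .init is null

void main()
{
    assert(logger is null);      // before setup
    logger = new Logger;
    logger.log("started");       // usable only in this window
    assert(logger.lines == ["started"]);
    logger = null;               // after teardown
    assert(logger is null);
}
```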

August 07

Re: Null-checked reference types

Posted by Richard (Rikki) Andrew Cattermole
in reply to Richard (Rikki) Andrew Cattermole

Something else: for ``ref``, ``out`` and return values, you need to be able to express not just the input type state, but also the output type state and the output-on-exception type state.

I.e.

``void destroy(?nonnull,initialized,initialized ref T*)``

Otherwise you cannot catch a change of type state.

August 07

Re: Null-checked reference types

Posted by IchorDev
in reply to Quirin Schroll

On Tuesday, 6 August 2024 at 14:55:46 UTC, Quirin Schroll wrote:

> Add the following type suffixes to the language: ? and !.

Reference types are already nullable. Having a way to force them to be non-null would be nice (although you could just use contracts?), but I don’t think having to explicitly mark reference types as nullable makes sense. They’re reference types, of course they’re nullable! Having nullability for value types might be nice too, but again it’s something you can already achieve in other ways.

August 07

Re: Null-checked reference types

Posted by Quirin Schroll
in reply to IchorDev

On Wednesday, 7 August 2024 at 04:17:42 UTC, IchorDev wrote:

> On Tuesday, 6 August 2024 at 14:55:46 UTC, Quirin Schroll wrote:
>> Add the following type suffixes to the language: ? and !.
>
> Reference types are already nullable.

Yes, that’s the issue.

> Having a way to force them to be non-null would be nice (although you could just use contracts?), but I don’t think having to explicitly mark reference types as nullable makes sense. They’re reference types, of course they’re nullable!

Reference types are nullable, yet most of the time, actual references aren’t null and are expected to be non-null. Prime example: ref. It can be null, is never expected to be in practice, and when it happens to be null, it’s a bug. A bug not by the language semantics, but in practical code. 100% of the time.

Instead of a contract or documentation saying they have to be non-null, the best way is to have the type system enforce it at compile-time. Just to mention two, Kotlin and Zig default to non-nullable references / pointers. You have to annotate nullable ones and handle the null case.
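
For reference, the contract-based alternative mentioned above looks roughly like this in current D (`Widget` and `idOf` are hypothetical names). The call with null is accepted by the compiler and only rejected at run time, and only in builds that keep contracts:

```d
class Widget { int id = 42; }

// The in-contract documents and checks the non-null expectation at run time.
int idOf(Widget w)
in (w !is null, "w must not be null")
{
    return w.id;
}

void main()
{
    assert(idOf(new Widget) == 42);
    // idOf(null) would compile; the contract fails only when the call executes.
}
```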

If your module has a default of reference types being non-nullable (because almost all the time, they’re expected not to be null), of course you have to mark them as nullable in case they are. If your module default is reference types being nullable, you have to annotate non-nullable ones.

> Having nullability for value types might be nice too, but again it’s something you can already achieve in other ways.

Yes, optionals, which aren’t great to use. Having worked with C#, which has core-language nullable value types, I can tell you that it makes them really nice to work with. If an indexOf function returns size_t? (or even better: some index_t? which hooks into the null semantics so that it reserves size_t.max for its null state), it’s clear that the case where whatever you’re seeking isn’t there has to be accounted for.
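
Something close to the index_t idea can be sketched in today's D with Phobos' `Nullable!(T, nullValue)`, which reserves one value of T as the null state. `Index` and `findIndex` below are hypothetical names, not Phobos functions:

```d
import std.typecons : Nullable;

// Hypothetical alias: size_t.max is reserved as the null state.
alias Index = Nullable!(size_t, size_t.max);

// Hypothetical helper, not a Phobos function.
Index findIndex(const int[] haystack, int needle)
{
    foreach (i, x; haystack)
        if (x == needle)
            return Index(i);
    return Index.init; // null state
}

void main()
{
    auto hit = findIndex([10, 20, 30], 20);
    assert(!hit.isNull && hit.get == 1);

    auto miss = findIndex([10, 20, 30], 99);
    assert(miss.isNull); // the caller has to check before calling get
}
```

The library version still relies on the caller remembering to test isNull; the proposal would have the type system require it.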

August 07

Re: Null-checked reference types

Posted by Quirin Schroll
in reply to Richard (Rikki) Andrew Cattermole

On Wednesday, 7 August 2024 at 01:39:29 UTC, Richard (Rikki) Andrew Cattermole wrote:

>> Add operators for null-respecting access: ?., ?(…) (call if not null), ?[…] (index if not null), ?= (assign if null).
>
> This should be split out into a separate DIP. They are things I already want for similar reasons.

Right. They’re useful anyways.

> However, they must decompose to loads and stores inside if statements. This makes it temporally safe.
>
>     var1.var2?.field = 2;
>
>     if (auto var2 = var1.var2) {
>         var2.field = 2;
>     }

That seems like a little too much magic. var?.x = 0 can be two things, a field assignment or a property setter call. For the latter, the property setter call wouldn’t be executed if var is null.

It may sound harsh, but it’s probably best not to allow assignments like that and to require the programmer to write it out:

if (Field* field = &var1.var2?.field) *field = 2;
if (void delegate(Field) setter = &var1.var2?.field) setter(2);

> This allows you to do both loads and stores and do something if it failed transitively.
>
>     if (var1.var2?.var3?.field = 3) {
>         // success
>     } else {
>         // failure
>     }

I somehow don’t like if (… = …) when it’s not a declaration. At first sight, I thought you intended … == 3.

>> No data flow analysis is proposed. Null checking is local and done by tracking ? and ! by the type system.
>
> DFA is only required if you want the type state to change as the function is interpreted. So that's fine. That is a me thing to figure out.

If I understand correctly, by “type state” you mean something like value range propagation. It basically is value range propagation, except that the ranges in question are null and all non-null values. You don’t suggest that the typeof type of a variable or expression changes, correct? (I think that would be very weird.)

> However, you do not need to annotate function body variables with this approach.
>
> Look at the initializer of a function variable declaration, it'll tell you if it has the non-null type state.
>
>     int* ptr1;
>     int* ptr2 = ptr1;

The only issue is, just because e.g. a pointer is initialized with something non-null (e.g. the address of a variable), that doesn’t mean some logic later won’t assign null to it.

> Function parameters, (including return and this) need to be annotateable with their type state.
>
>     int* func(?nonnull return, ?nonnull this, ?nonnull int* ptr) {
>         return ptr;
>     }

So, using my syntax, that would be:

int*! func(int*! ptr) => ptr;

If we want null-annotations for member functions’ this, my suggestion would be to use null as a function attribute.

int*! func(int*! ptr) null => ptr;

It’s similar to how scope and return work as member function attributes.

> They can also be inferred by first usage.
>
>     void func(int* ptr) {
>         int v = *ptr;
>     }
>
> Clearly ptr has the type state non-null.

We can only do this when stuff is to be inferred. Otherwise, it would be weird changing a precisely given signature because of what’s going on inside the function.

In a context where int* is nullable, the usage of *ptr could suggest declaring it non-nullable.

> However, what caused me problems in the past is tracking variables outside of a function. You cannot do it.
>
> Variables outside a function change type state during their lifespan. They have the full life cycle, starting at reachable, into non-null and then back to reachable. If you tried to force it to be non-null, the language would force you to have an .init value that is non-null. This is a known issue with classes already. It WILL produce logic errors that are undetectable.

I don’t care much about tracking. Probably, with if (auto) ..., you can just rename the variable, but typed non-nullable:

void f(int*? p)
{
    if (int* q = p) ... else return;
    int v = *q; // no error, q isn’t nullable, not by analysis, just by type
}

August 07

Re: Null-checked reference types

Posted by Richard (Rikki) Andrew Cattermole
in reply to Quirin Schroll

On 07/08/2024 11:22 PM, Quirin Schroll wrote:
> On Wednesday, 7 August 2024 at 01:39:29 UTC, Richard (Rikki) Andrew Cattermole wrote:
>> This allows you to do both loads and stores and do something if it failed transitively.
>>
>> ```d
>> if (var1.var2?.var3?.field = 3) {
>>     // success
>> } else {
>>     // failure
>> }
>> ```
> 
> I somehow don’t like `if (… = …)` when it’s not a declaration. At first sight, I thought you intended `… == 3`.

It's going to be valid regardless, due to AssignExpression.

>> > No data flow analysis is proposed. Null checking is local and
>> done by tracking ? and ! by the type system.
>>
>> DFA is only required if you want the type state to change as the function is interpreted. So that's fine. That is a me thing to figure out.
> 
> If I understand correctly, by “type state” you mean something like value range propagation. It basically *is* value range propagation, except that the ranges in question are `null` and all non-null values. You don’t suggest that the `typeof` type of a variable or expression changes, correct? (I think that would be very weird.)

No, I meant type state.

https://en.wikipedia.org/wiki/Typestate_analysis

unreachable < reachable < initialized < default-initialized < non-null < user

>> However, you do not need to annotate function body variables with this approach.
>>
>> Look at the initializer of a function variable declaration, it'll tell you if it has the non-null type state.
>>
>> ```d
>> int* ptr1;
>> int* ptr2 = ptr1;
>> ```
> 
> The only issue is, just because e.g. a pointer is initialized with something non-null (e.g. the address of a variable), that doesn’t mean some logic later won’t assign `null` to it.

Right, that would have to be disallowed without DFA, since the type state must not change throughout a function body.

>> However, what caused me problems in the past is tracking variables outside of a function. You cannot do it.
>>
>> Variables outside a function change type state during their lifespan. They have the full life cycle, starting at reachable, into non-null and then back to reachable. If you tried to force it to be non-null, the language would force you to have an .init value that is non-null. This is a known issue with classes already. It WILL produce logic errors that are undetectable.
> 
> I don’t care much about tracking. Probably, with `if (auto) ...`, you can just rename the variable, but typed non-nullable:
> 
> ```d
> void f(int*? p)
> {
>      if (int* q = p) ... else return;
>      int v = *q; // no error, q isn’t nullable, not by analysis, just by type
> }
> ```

What matters here is that you do not need to add an annotation to the type itself. It only needs to exist within the function signature. Anywhere else, it's useless information.

August 07

Re: Null-checked reference types

Posted by IchorDev
in reply to Quirin Schroll

On Wednesday, 7 August 2024 at 10:13:05 UTC, Quirin Schroll wrote:

> Reference types are nullable, yet most of the time, actual references aren’t null and are expected to be non-null.

Well that’s why associative arrays implicitly allocate themselves. I don’t think that would work for classes though…

> Prime example: ref. Can be null, never is expected to be in practice, and when it happens to be null, it’s a bug. A bug not by the language semantics, but in practical code. 100% of the time.

Uh yeah… we should be preventing that.

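> Just to mention two, Kotlin and Zig default to non-nullable references / pointers. You have to annotate nullable ones and handle the null case.

Interesting. Do people who use those languages actually like that?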

>> Having nullability for value types might be nice too, but again it’s something you can already achieve in other ways.

Well there you go, maybe for value types we need something like the range interface but for nullability? And then for reference types we can just do is null.
