Sum Types - first draft (page 2) - D Programming Language Discussion Forum

September 11

Re: Sum Types - first draft

Posted by Richard (Rikki) Andrew Cattermole
in reply to Paul Backus

On 11/09/2024 5:23 AM, Paul Backus wrote:
> On Tuesday, 10 September 2024 at 17:05:49 UTC, Walter Bright wrote:
>> Thanks for your detailed response. Let me address just one for the moment:
>>
>> On 9/10/2024 9:20 AM, Paul Backus wrote:
>>>> * std.sumtype cannot optimize the tag out of existence, for example, when having:
>>>>
>>>>       enum Option { None, int* Ptr }
>>>
>>> A built-in sum type would not be able to do this either, because in D, every possible sequence of 4 bytes is a potentially-valid int* value.
>>>
>>> The reason Rust is able to perform this optimization is that Rust has non-nullable reference types [2]. If D had non-nullable pointer types, then std.sumtype could perform the same optimization using reflection and `static if`.
>>
>> I was approaching it from the other way around. Isn't a non-nullable pointer a sumtype? Why have both non-nullable types and sumtypes?
> 
> You have it exactly backwards. A _nullable_ pointer type is the sum of a non-nullable pointer type and typeof(null).
> 
> A non-nullable pointer type is a pointer type with its range of valid values restricted. You could think of it as a "difference type"--if you take T*, and _subtract_ typeof(null) from it (i.e., take the set difference [1] of their values), you get a non-nullable pointer type.
> 
> [1] https://en.wikipedia.org/wiki/Complement_(set_theory)#Relative_complement

Yes, Paul is correct.

A non-null pointer is guaranteed by the compiler (at compile time), that it may ONLY point to a valid value that is directly usable.

A nullable pointer replaces guarantees with UNCERTAINTY.

It could point to unmapped memory (i.e. null), junk, or something completely different. The only guarantee is that the pointer itself exists; what it holds is entirely unknown.

To resolve this you introduce type state analysis to make guarantees of non-null, which I have wanted for quite a while now. The amount of uncertainty for pointers in D right now is not good enough if you want to guarantee memory safety.

An interesting paper on this subject is [Blame for Null](https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ECOOP.2020.3).

It reviews a number of languages, and proves using lambda calculus that only nullable pointers can introduce runtime errors (null dereferencing).

Of note is that sum types are only discussed as an implementation detail of Scala.

"Keeping λnull simple. We could reduce the number of function types and avoid the need for safe applications through a combination of sum types and case analysis. For example, in Scala nullable values are represented with sum types (e.g. a nullable string has type String | Null). The case analysis in turn requires support for flow-typing:"

September 10

Re: Sum Types - first draft

Posted by Paul Backus
in reply to Richard (Rikki) Andrew Cattermole

On Tuesday, 10 September 2024 at 18:03:32 UTC, Richard (Rikki) Andrew Cattermole wrote:

> A non-null pointer is guaranteed by the compiler (at compile time), that it may ONLY point to a valid value that is directly usable.

This is actually not true. You are conflating two separate properties: whether a pointer is null, and whether it is valid to dereference.

In D, we have @safe to guard against invalid pointers, but we do not have any way to guard against null pointers. (Indeed, because of this, @safe relies on null being valid to dereference!)
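
A minimal sketch of that distinction, assuming a hypothetical helper `readOr` (not something from this thread): the compiler accepts the function as @safe even though nothing in the type `int*` says whether `p` is null, so the null check has to be written by hand.

```d
// Sketch only: readOr is a hypothetical helper used for illustration.
@safe int readOr(int* p, int fallback)
{
    // int* admits null, and @safe does not reject that;
    // the signature records nothing about nullness, so check manually.
    return p is null ? fallback : *p;
}

@safe void main()
{
    int* valid = new int(42); // GC allocation is allowed in @safe code
    assert(readOr(valid, 0) == 42);
    assert(readOr(null, -1) == -1);
}
```

With a non-nullable pointer type, the check inside `readOr` could in principle move into the parameter type instead.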

September 10

Re: Sum Types - first draft

Posted by jmh530
in reply to Paul Backus

On Tuesday, 10 September 2024 at 16:20:55 UTC, Paul Backus wrote:
> On Tuesday, 10 September 2024 at 04:06:16 UTC, Walter Bright wrote:
>> https://github.com/WalterBright/documents/blob/96bca2f9f3520cf53ed5c4dec8e5e2d855e64e66/sumtype.md
>
> Summary of comments
>
> * Special cases are bad.
> * New capabilities should ideally be general-purpose, not sumtype-specific.
> * Sumtype syntax should be modeled after unions, not enums.
>
> [snip]

Given your involvement with sumtype, it may be useful to provide specific recommendations about what enhancements to the language would be valuable.

September 10

Re: Sum Types - first draft

Posted by Walter Bright
in reply to Dennis

Great! Seems like it has improved substantially since I wrote the proposal a couple years ago.

What are its shortcomings presently?

September 10

Re: Sum Types - first draft

Posted by Walter Bright
in reply to Paul Backus

On 9/10/2024 10:23 AM, Paul Backus wrote:
>> I was approaching it from the other way around. Isn't a non-nullable pointer a sumtype? Why have both non-nullable types and sumtypes?
> 
> You have it exactly backwards. A _nullable_ pointer type is the sum of a non-nullable pointer type and typeof(null).

"the sum of ..." doesn't make it a sumtype?

> A non-nullable pointer type is a pointer type with its range of valid values restricted. You could think of it as a "difference type"--if you take T*, and _subtract_ typeof(null) from it (i.e., take the set difference [1] of their values), you get a non-nullable pointer type.
> 
> [1] https://en.wikipedia.org/wiki/Complement_(set_theory)#Relative_complement

How is that different from a sumtype?

September 10

Re: Sum Types - first draft

Posted by Paul Backus
in reply to Walter Bright

On Tuesday, 10 September 2024 at 19:13:33 UTC, Walter Bright wrote:

> On 9/10/2024 10:23 AM, Paul Backus wrote:
>>> I was approaching it from the other way around. Isn't a non-nullable pointer a sumtype? Why have both non-nullable types and sumtypes?
>>
>> You have it exactly backwards. A _nullable_ pointer type is the sum of a non-nullable pointer type and typeof(null).
>
> "the sum of ..." doesn't make it a sumtype?
>
>> A non-nullable pointer type is a pointer type with its range of valid values restricted. You could think of it as a "difference type"--if you take T*, and _subtract_ typeof(null) from it (i.e., take the set difference [1] of their values), you get a non-nullable pointer type.
>>
>> [1] https://en.wikipedia.org/wiki/Complement_(set_theory)#Relative_complement
>
> How is that different from a sumtype?

Let's walk through this very slowly, one step at a time.

One way to define a type is as a set of possible values.

For example:

bool = {false, true}
ubyte = {0, 1, 2, ..., 255}

Suppose we define a sum type of these two types:

Example = bool + ubyte
// Equivalent to:
// alias Example = SumType!(bool, ubyte);

What is the set of possible values for the Example type? It's the set that contains all the elements of the bool set, and all the elements of the ubyte set--in other words, the set union. [1]

Example = {false, true} ∪ {0, ..., 255} = {false, true, 0, ..., 255}
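
As a rough illustration with today's library type, the `Example` alias from the comment above can be built with std.sumtype. The `describe` function is a hypothetical helper added here for the sketch; the `bool` handler is listed first because `match` picks the first handler that accepts a member type, and `bool` implicitly converts to `ubyte`.

```d
import std.conv : to;
import std.sumtype : SumType, match;

alias Example = SumType!(bool, ubyte);

// Hypothetical helper: reports which member of the sum is held.
string describe(Example e)
{
    return e.match!(
        (bool b) => "bool " ~ b.to!string,
        (ubyte u) => "ubyte " ~ u.to!string
    );
}

void main()
{
    // Every bool value and every ubyte value is a value of Example.
    assert(describe(Example(true)) == "bool true");
    assert(describe(Example(cast(ubyte) 200)) == "ubyte 200");
}
```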

Now, what are the possible values of a normal, nullable pointer type, like the ones we have in D today? Let's say we're on a 32-bit architecture, just to make the numbers easier to write. In that case:

ubyte* = {null, 0x00000001, 0x00000002, ..., 0xFFFFFFFF}

Each 32-bit integer corresponds to a distinct pointer value, with 0 corresponding to null.

Now, let's say we add non-nullable pointers to D, so that for every pointer type T*, there's a new type nonnull(T*) which is just like T*, except that it can't be null. What's the set of values for nonnull(ubyte*)?

nonnull(ubyte*) = {0x00000001, 0x00000002, ..., 0xFFFFFFFF}

Naturally, it's the set of values for ubyte* with the value null removed. In set theory terms, we could write it as the difference between the ubyte* set, and the set that contains the single value null:

nonnull(ubyte*) = ubyte* - {null}

As it turns out, there is actually a type in D that corresponds to the set {null}--it's called typeof(null):

typeof(null) = {null}

So, by substitution, we can write:

nonnull(ubyte*) = ubyte* - typeof(null)

Or, in English: the type nonnull(ubyte*) is the difference between the types ubyte* and typeof(null).

Finally, what happens if we create a sum type of nonnull(ubyte*) and typeof(null)?

Sum = nonnull(ubyte*) + typeof(null)
Sum = {0x00000001, ..., 0xFFFFFFFF} ∪ {null}
Sum = {null, 0x00000001, ..., 0xFFFFFFFF}

Wait a minute...we've seen that set before! It's the set for ubyte*!

So, once again, by substitution, we can write:

Sum = ubyte*
ubyte* = nonnull(ubyte*) + typeof(null)

Or, in English: the type ubyte* is the sum of the types nonnull(ubyte*) and typeof(null).

[1] https://en.wikipedia.org/wiki/Union_(set_theory)
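
The last identity can be sketched in library code, under loud assumptions: D has no `nonnull(T*)`, so `NonNull` below is a hypothetical wrapper that only enforces its invariant with a runtime assert, and the empty struct `Null` is a stand-in for typeof(null). The round trip through `classify` and `flatten` (both hypothetical names) shows that the sum carries exactly the same information as a plain `ubyte*`.

```d
import std.sumtype : SumType, match;

// Hypothetical stand-in for typeof(null).
struct Null {}

// Hypothetical stand-in for nonnull(T*); checked at run time only.
struct NonNull(T)
{
    private T* ptr;

    this(T* p)
    {
        assert(p !is null, "NonNull constructed from null");
        ptr = p;
    }
}

// Sum = nonnull(ubyte*) + typeof(null)
alias MaybePtr = SumType!(Null, NonNull!ubyte);

// ubyte* -> Sum
MaybePtr classify(ubyte* p)
{
    return p is null ? MaybePtr(Null()) : MaybePtr(NonNull!ubyte(p));
}

// Sum -> ubyte*
ubyte* flatten(MaybePtr m)
{
    return m.match!(
        (Null _) => cast(ubyte*) null,
        (NonNull!ubyte nn) => nn.ptr
    );
}

void main()
{
    ubyte b = 7;
    assert(flatten(classify(&b)) is &b);
    assert(flatten(classify(null)) is null);
}
```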

September 10

Re: Sum Types - first draft

Posted by monkyyy
in reply to Paul Backus

On Tuesday, 10 September 2024 at 16:20:55 UTC, Paul Backus wrote:
> On Tuesday, 10 September 2024 at 04:06:16 UTC, Walter Bright wrote:
>> std.sumtype cannot optimize the tag out of existence, for example, when having:
>> ```
>> enum Option { None, int* Ptr }
>> ```
>
> A built-in sum type would not be able to do this either, because in D, every possible sequence of 4 bytes is a potentially-valid int* value.

Isn't the int*=cast()0 labeled as invalid by the type theory inherited from C?

You shouldn't do such a thing if you were designing from scratch, but I think Walter's correct here: you could, even if you shouldn't.

Likewise you could collapse nullable!float into a wrapper of float and use NaN as an invalid state; you shouldn't, and NaN should be destroyed; but you could.

September 10

Re: Sum Types - first draft

Posted by Walter Bright
in reply to Paul Backus

On 9/10/2024 10:23 AM, Paul Backus wrote:
>> I was approaching it from the other way around. Isn't a non-nullable pointer a sumtype? Why have both non-nullable types and sumtypes?
> 
> You have it exactly backwards. A _nullable_ pointer type is the sum of a non-nullable pointer type and typeof(null).

"is the sum of..." makes it a sum type, doesn't it?

September 11

Re: Sum Types - first draft

Posted by Paul Backus
in reply to Walter Bright

On Tuesday, 10 September 2024 at 22:57:05 UTC, Walter Bright wrote:
> On 9/10/2024 10:23 AM, Paul Backus wrote:
>>> I was approaching it from the other way around. Isn't a non-nullable pointer a sumtype? Why have both non-nullable types and sumtypes?
>> 
>> You have it exactly backwards. A _nullable_ pointer type is the sum of a non-nullable pointer type and typeof(null).
>
> "is the sum of..." makes it a sum type, doesn't it?

Let's compare and contrast your statement and my statement side by side:

    Yours: Isn't a non-nullable pointer    a sum type?
    Mine:        A     nullable pointer is a sum type.

Do you see the difference now?

I'm happy to explain anything that's unclear, but at this point, I'm beginning to suspect that you are only skimming my messages, not reading them all the way through.

September 11

Re: Sum Types - first draft

Posted by Dukc
in reply to Walter Bright

On Tuesday, 10 September 2024 at 04:06:16 UTC, Walter Bright wrote:
> https://github.com/WalterBright/documents/blob/96bca2f9f3520cf53ed5c4dec8e5e2d855e64e66/sumtype.md
>
> I wrote that some time ago back in November 2022. The idea is to have a sumtypes proposal, followed by a match proposal.
>
> Previous discussions:
>
> https://www.digitalmars.com/d/archives/digitalmars/D/sumtypes_for_D_366242.html
>
> https://www.digitalmars.com/d/archives/digitalmars/D/Sum_type_the_D_way_366389.html
>
> https://www.digitalmars.com/d/archives/digitalmars/D/draft_proposal_for_Sum_Types_for_D_366307.html

Reviewing without having looked at other replies first.

Please study std.sumtype a bit more. You list many shortcomings that aren't actually there. It very much can provide a compile-time error if not all arms are accounted for. It is safe to use with an int and a pointer. And it can provide regular enum members, albeit in a bit roundabout way.
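
For example, the exhaustiveness check can be seen in a small sketch like the following, where `Value` and `weight` are hypothetical names chosen for illustration. A `match` call that leaves out the string arm does not compile, which the `static assert` verifies.

```d
import std.sumtype : SumType, match;

alias Value = SumType!(int, string);

// Hypothetical helper: every member type has a handler, so this compiles.
size_t weight(Value v)
{
    return v.match!(
        (int i) => cast(size_t) i,
        (string s) => s.length
    );
}

void main()
{
    auto v = Value("abc");
    assert(weight(v) == 3);

    // Leaving out the string handler is rejected at compile time.
    static assert(!__traits(compiles, v.match!((int i) => i)));
}
```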

Also, everything I wrote here still applies. Pasting here for reference.

We don't want this special case for pointers - or at least it needs to be much, much more refined before it carries its weight. If I have sumtype S { a, int* b }, S.a == S.b(null);, right? Well, why doesn't the DIP say the same should happen with sumtype S { a, Object b } ? Even more interesting case, sumtype S { a, b, c, d, bool e}. A boolean has 254 illegal bit patterns - shouldn't they be used for the tag in this case? And what happens with sumtype S {a, int* b, int* c}? Since we need space for a separate tag anyway, does it make sense for null b to be equal to a?

The proposed special case doesn't help much. If one wants a pointer and a special null value, one can simply use a pointer. On the other hand, one might want a pointer AND a separate tag value. To accomplish that, the user will have to either put the 0 value to the end or do something like sumtype S {int[0] a, int* b}. Certainly doable, but it's a special case with no good reason.
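
For comparison, here is a sketch of the "pointer AND a separate tag value" case with std.sumtype as it exists today. `None`, `Option` and `isNone` are hypothetical names for this example. The separate tag makes the sum larger than a bare pointer, and it also keeps a null pointer distinct from the `None` member.

```d
import std.sumtype : SumType, match;

// Hypothetical empty member playing the role of the tag-only value.
struct None {}

alias Option = SumType!(None, int*);

// The tag is stored next to the pointer, so the sum is larger than int*.
static assert(Option.sizeof > (int*).sizeof);

// Hypothetical helper: true only for the None member.
bool isNone(Option o)
{
    return o.match!(
        (None _) => true,
        (int* _) => false
    );
}

void main()
{
    int* nothing = null;
    assert(isNone(Option(None())));
    assert(!isNone(Option(nothing))); // a null pointer is still the int* member
}
```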

The query expression is not a good idea. This introduces new syntax that isn't consistent with the rest of the language. Instead, I propose that each sumtype has a member function has, which returns a DRuntime-defined nested struct with an opDispatch defined for querying the tag:

sumtype Sum {int a, float b, dchar c}

auto sum = Sum.b(2.5);

assert(!sum.has.a);
assert(sum.has.b);
assert(!sum.has.c);

Alternatively, we can settle for simply providing a way for the user to get the tag of the sumtype. Then he can use that tag as he'd use it in case of a regular enum. In fact we will want to provide tag access in any case, because the sum type is otherwise too hard to use in switch statements.
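
Something close to the `has` idea can be approximated in library code today, keyed by type rather than by member name. This is only a sketch: `holds` is a hypothetical helper, and it compares types exactly inside a generic handler because typed handlers would also accept implicit conversions such as int to float.

```d
import std.sumtype : SumType, match, isSumType;

alias Sum = SumType!(int, float, dchar);

// Hypothetical helper: true if `sum` currently holds exactly a T.
bool holds(T, S)(S sum) if (isSumType!S)
{
    // One generic handler is instantiated per member type;
    // the exact type comparison avoids implicit-conversion matches.
    return sum.match!(value => is(typeof(value) == T));
}

void main()
{
    auto sum = Sum(2.5f);

    assert(!sum.holds!int);
    assert(sum.holds!float);
    assert(!sum.holds!dchar);
}
```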
