draft proposal for Sum Types for D (page 7)

November 29, 2022

Re: draft proposal for Sum Types for D

Posted by Steven Schveighoffer
in reply to Walter Bright

On 11/29/22 3:04 PM, Walter Bright wrote:

> On 11/29/2022 9:14 AM, Steven Schveighoffer wrote:
>> In the DIP, it says that "Member functions of field declarations are restricted the same way union member functions are." What does this mean? I can't find any information on this in the spec.
>>
>> An example of what is not allowed would be helpful.
>
> I thought it mentioned that copy constructors, postblits, and destructors are not allowed.

It mentions postblit and destructors on the union itself are not allowed. It says nothing on field declarations.

These make sense given that a union itself cannot know what data it actually stores (unless you embed the type somehow in the data itself). However, a sumtype would not have that problem.

Honestly, the trickiest parts of making a library sumtype are the lifetime issues. I would hope the builtin thing would solve this for us.

-Steve

November 30, 2022

Re: draft proposal for Sum Types for D

Posted by Andrey Zherikov
in reply to Walter Bright

On Tuesday, 29 November 2022 at 06:26:20 UTC, Walter Bright wrote:

> Go ahead, Make My Day! Destroy!
>
> https://github.com/WalterBright/DIPs/blob/sumtypes/DIPs/1NNN-(wgb).md
>
> cannot produce compile time error if not all the arms are accounted for in a pattern match rather than a thrown exception

It does:

```d
SumType!(int, string) s;
s.match!((int i) => true);  // Error: static assert:  "No matching handler for types `(string)`"
```
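
For reference, a minimal sketch of that check with `std.sumtype` from Phobos. The names `IntOrString` and `describe` are made up for illustration; the `__traits(compiles, ...)` test only confirms that a non-exhaustive `match!` is rejected at compile time instead of throwing at run time.

```d
import std.sumtype;

// Hypothetical alias, used only in this sketch.
alias IntOrString = SumType!(int, string);

// Exhaustive match: one handler per member type.
string describe(IntOrString s)
{
    return s.match!(
        (int i) => "int",
        (string str) => "string"
    );
}

// Leaving out the string handler does not compile.
static assert(!__traits(compiles,
    IntOrString.init.match!((int i) => true)));

unittest
{
    assert(describe(IntOrString(42)) == "int");
    assert(describe(IntOrString("hi")) == "string");
}
```

Compiling with `-unittest` runs the two assertions; the `static assert` is checked on every build.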

> if (?x.busy)

Please consider moving ? to be after identifier so the following can be allowed:

```d
sumtype C
{
    Default,
    int value
}
sumtype B
{
    Default,
    C c
}
sumtype A
{
    Default,
    B b
}

A a;
if(a.b?.c?.value?)
  ...
```

Compare with if(?a.b && ?a.b.c && ?a.b.c.value)

This would also allow a natural improvement of the ternary expression: a.b?:123 === (a.b?) ? (a.b) : (123). (Or maybe just a.b ? 123?)

Also, if you ever think about adding an optional type, a ?-suffix would fit perfectly here:

```d
int? optional;
if(optional?)
  writeln("it's set to ", optional);

string? s;
writeln("the value is ", s?:"not set");
```

November 30, 2022

Re: draft proposal for Sum Types for D

Posted by Walter Bright
in reply to Per Nordlöw

On 11/29/2022 12:37 PM, Per Nordlöw wrote:
> ```d
> sumtype Result
> {
>      Error,
>      int Value
> }
> ```
> 
> , is
> 
> ```d
> Result res;
> res = 25;
> ```
> 
> supposed to be supported as well?

At the moment, no. It should be:

     res.Value = 25;

> If so, why not

I'd reframe that as why do it?

On a pragmatic note, there may be multiple matches for an int in the sumtype.

November 30, 2022

Re: draft proposal for Sum Types for D

Posted by Zealot
in reply to Andrey Zherikov

On Wednesday, 30 November 2022 at 08:49:00 UTC, Andrey Zherikov wrote:

> On Tuesday, 29 November 2022 at 06:26:20 UTC, Walter Bright wrote:
>> [...]
>
> It does:
>
> ```d
> SumType!(int, string) s;
> s.match!((int i) => true);  // Error: static assert:  "No matching handler for types `(string)`"
> ```
>
> [...]

+1
this would actually reduce a lot of clutter.

November 30, 2022

Re: draft proposal for Sum Types for D

Posted by deadalnix
in reply to Paul Backus

On Wednesday, 30 November 2022 at 04:09:10 UTC, Paul Backus wrote:
> Off the top of my head, I can tell you that most of the code volume comes from having to copy and paste each function 4 times to handle different mutability qualifiers (mutable, const, immutable, inout). Which is tedious (and a symptom of a missing language feature IMO), but not exactly complex.

It's amazing how many of these this discussion has surfaced. These are the actual things that we need to fix.

And when they are fixed, if we still feel like the need for builtin sum types remains, then yes, go for it. In the meantime, it is just adding to the pile of things that don't really work.

November 30, 2022

Re: draft proposal for Sum Types for D

Posted by Quirin Schroll
in reply to Walter Bright

On Tuesday, 29 November 2022 at 06:26:20 UTC, Walter Bright wrote:

> Go ahead, Make My Day! Destroy!
>
> https://github.com/WalterBright/DIPs/blob/sumtypes/DIPs/1NNN-(wgb).md

Feedback

In Alternative Syntax the following is not supported by the grammar you provided:

sumtype Option(T) { None, Some(T) }

It should probably be

sumtype Option(T) { None, T Some }

Maybe you should mention anonymous sumtype (cf. anonymous struct and anonymous union).
As for a keyword, you could circumvent the problem by re-using existing keywords. enum union would be a candidate.
“Members of sumtypes cannot have copy constructors, postblits, or destructors.” This limits their application severely. Andrei gave a talk explaining how destruction of std::variant in C++ could be done efficiently using static foreach, if C++ had it. I don’t see how a copy constructor/postblit is evil.
“Sumtypes can be copied if all its constituent types can be copied.”

Discussion

It seems it tries to be too many things at once and that is confusing. It tries to be:

1. An algebraic union with a tagged union as its special case
2. An optional type
3. Some form of non-nullable annotation.

Let’s talk about 3. first: The recognition of a magical pattern is bad; it is backwards and leads to surprises. Were reference types non-nullable from day one, enum union { typeof(null), int* } would be a great candidate for a nullable pointer to int.

The same way typeof(return) is special-cased in the grammar (the token return not being an expression), you could at least special-case null (the keyword) as a possible case of a D sum type. However, this is still a confusing design because it looks like you’re adding a case, but actually it removes a value from the overall type. I’ve heard of negative types in type theory before, but to be honest, I don’t know much about them. Better than seemingly adding null would be to add a (visually) negative null: Allow the pseudo-member !null or -null. It’s still weird and maybe even hard to teach, but at least we can tell people: “The same way you can add something to 2 to make it 1, namely −1, you can add -typeof(null) to int*, and you get an int* that cannot be null. For your convenience, instead of -typeof(null) you can write -null.”

A sum type consisting of at least one reference type (pointers, classes, AAs) may include -null as a special member that makes the nullable options non-nullable.

We can even add syntax for simple non-nullable reference types: int*-null or int*\null or int*! or whatever you like. Likely, it’d be a lowering to a template in object.d named nonnull or something similar.

As for 1. and 2., those make sense. Adding a single, distinct value to a type makes this an optional type. A template optional in Phobos would become a vocabulary type, so people would no longer implement their own optionals, and it could itself be implemented using a sum type. This could even be added to object.d with syntax.

I still think that D would be a better and safer language if it had non-null reference types: In such a hypothetical D, an int* always points to an int, but an int*? might be null. For value types, something like int? can be done in two ways: Either reserve an otherwise invalid value as the null value (possibly −2³¹, i.e. int.min) or make int? an alias for enum union NullableInt { typeof(null), int }. The first option is a non-starter because of backward compatibility, but an entirely new language could do that. User-defined types can have a compile-time constant opNull that tells the compiler: “Not all combinations of values for my members signify a valid object; use this as the ‘null’ object instead of adding a boolean tag when I’m combined with ?.” It’s probably worth exploring how much of that can be done and at what cost.

Why all this talk about optional types? Because they play well with sum types. Asking for a possible variant of a sum type is inherently returning an optional value. The first step to getting sum types right is getting optional types right. That doesn’t mean optional types cannot be a special case of sum types. In that sense, optional types are the most relevant application of sum types.

```diff
  EnumUnionDeclaration:
      `enum` `union` Identifier EnumUnionBody
      EnumUnionTemplateDeclaration
+     AnonymousEnumUnionDeclaration
+
+ AnonymousEnumUnionDeclaration:
+     `enum` `union` EnumUnionBody

  EnumUnionTemplateDeclaration:
      `enum` `union` Identifier TemplateParameters Constraint (opt) EnumUnionBody

  EnumUnionBody:
      `{` EnumUnionMembers `}`

  EnumUnionMembers:
      EnumUnionMember
      EnumUnionMember `,`
      EnumUnionMember `,` EnumUnionMembers

  EnumUnionMember:
+     'null'
+     '!null'
      EnumMemberAttributes EnumUnionMember
      EnumMember
      FieldDeclaration

  EnumMember:
-     Identifier
-     Identifier = AssignExpression
+     case Identifier
+     case Identifier = AssignExpression

  FieldDeclaration:
-     Type Identifier
-     Type Identifier = AssignExpression
+     Type Identifier(opt)
+     Type Identifier(opt) = AssignExpression

  QueryExpression:
      `?` PostfixExpression `.` Identifier
```

I mentioned null and !null above. null is a shorthand for typeof(null). Omitting the Identifier is allowed only when the EnumMember is the only one with its type and the declaration is not anonymous.

Basically we have these categories of sum types:

Optionals: 1 member (usually unnamed) plus null.
Non-nulls: 1 member (usually unnamed) of reference type minus null (= plus !null).
Commutative: ≥ 2 members (usually unnamed) with different types.
Homogeneous: ≥ 2 named members with the same type.
Algebraic: All members named, potentially with repeated types, potentially self-referential.
Non-null commutative: ≥ 2 members (usually unnamed) with different types among which are reference types plus !null
Non-null algebraic: All members named, potentially with repeated types among which are reference types (potentially self-referential) plus !null.

A member can be an untyped named constant. Those are equivalent to members of unique unit type. But if types are optional and identifiers are optional, how do we know which is which? We don’t. We need something to disambiguate. I went with case for something that’s maybe acceptable. Here, case can be read as a type: “Make a unique case type for this.” The case in case x = init is Case_x defined as struct Case_x { typeof(init) value = init; }. Because the type is a unit type, its value need not be stored in the instance.
If all members are case members, the type is effectively a plain-old enum. The union (not the tag field) can be elided as the tag field suffices to retrieve the values. The only difference to regular enum I see is that the DIP proposes that for the tag field, the smallest available integral type be used, whereas enum by default has int values.

I call the (usually unnamed) type-distinguished sum types commutative because there’s no reason whatsoever to treat int + string and string + int differently. There is the int member and the string member. It’s the same type spelled differently, in the same way that the ordering of type constructors or function attributes (on e.g. function pointer types) not only makes no difference, but creates the very same type. One does not simply reorder a struct, but one can reorder a union, and commutative sum types are close enough. Algebraic sum types are too struct-like to be reordered (naming matters; knowing the type to extract does not suffice).

The member querying syntax is okay, though I guess it can be improved. While value.member? reads best, it meaningfully conflicts with the ternary operator, so the second best would be value.?member. If a sum type is an optional type, i.e. it is enum union X {null, T} for some T, it’d be great to have some C#-esque ?. and ?? operators as well: optionalcat?.name returns an optional string: null if there is no cat and the cat’s name if there is a cat. This conflicts with the ternary operator as well, but a ? followed by the module-scope . without a space should be virtually non-existent. For a nullable value v, v ?? d returns v if it’s not null (but typed as not null) and d (the default, lazily evaluated) if it is. A . could implicitly be .? if the member after it is followed up by ?? or ?., i.e. value.member?.toString() is actually value.?member?.toString() and yourpet.cat ?? mycat is actually yourpet.?cat ?? mycat because you might not have a cat.

Querying for a commutative sum type would usually be done via the type: The sum type converts implicitly to an optional of any of its constituent types:

```d
// object.d
enum union SumType(Ts...) { Ts }
enum union NotNull(Ts...) if (/* any is reference type */) { !null, Ts }
enum union Nullable(T) if (/* T is not reference type and has no opNull */) { null, T }
// main.d
alias StringOrInt = SumType!(string, int);
StringOrInt stringOrInt;
if (string? s = stringOrInt) { }
else if (int? i = stringOrInt) { }
else assert(0);
```

Also, stringOrInt is int and stringOrInt !is int would work.

For an associative array, aa[key] can return an optional value instead of throwing a RangeError. That would enable aa[key] ?? d; and aa[key] ??= value can set the value only if key is not present already.
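
For comparison, a rough sketch of how the same two use cases can be written in today's D. `get` and `require` are the existing built-in AA functions from `object.d`, and `in` is the existing lookup operator; `lookupOr` and `setIfAbsent` are hypothetical helper names used only here.

```d
// Roughly `aa[key] ?? d`: the stored value if present, otherwise the
// (lazily evaluated) default.
int lookupOr(int[string] aa, string key, lazy int d)
{
    return aa.get(key, d);
}

// Roughly `aa[key] ??= value`: insert only if the key is absent,
// then return whatever is stored under the key.
int setIfAbsent(ref int[string] aa, string key, int value)
{
    return aa.require(key, value);
}

unittest
{
    int[string] aa = ["a": 1];
    assert(lookupOr(aa, "a", 0) == 1);
    assert(lookupOr(aa, "b", 0) == 0);
    assert(setIfAbsent(aa, "a", 5) == 1); // existing value is kept
    assert(setIfAbsent(aa, "b", 5) == 5); // key was absent, so it is inserted
    assert(("b" in aa) !is null);
}
```

An optional-returning `aa[key]` would fold these helpers into operators, but as a sketch this suggests the semantics are already expressible.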

For a (nullable) pointer ptr, the dereference expression *ptr can be treated like an optional value if it is followed by ?? or ?., and ptr?.member

I see one problem with ?. vs. .?, but the rule is easy: The question mark is where the optional is. For .?, the left operand is a concrete value and the member on the right side might or might not be present. If optional sum types are a common thing, maybe we need ?.?: lhs?.?member is: “Give me the member’s value if lhs isn’t null and member is there; otherwise give me null.”

Sorry for the long post.

November 30, 2022

Re: draft proposal for Sum Types for D

Posted by Basile.B
in reply to Andrey Zherikov

On Wednesday, 30 November 2022 at 08:49:00 UTC, Andrey Zherikov wrote:

> On Tuesday, 29 November 2022 at 06:26:20 UTC, Walter Bright wrote:
>> Go ahead, Make My Day! Destroy!
>>
>> https://github.com/WalterBright/DIPs/blob/sumtypes/DIPs/1NNN-(wgb).md
>>
>> cannot produce compile time error if not all the arms are accounted for in a pattern match rather than a thrown exception
>
> It does:
>
> ```d
> SumType!(int, string) s;
> s.match!((int i) => true);  // Error: static assert:  "No matching handler for types `(string)`"
> ```
>
>> if (?x.busy)
>
> Please consider moving ? to be after identifier so the following can be allowed:
>
> ```d
> sumtype C
> {
>     Default,
>     int value
> }
> sumtype B
> {
>     Default,
>     C c
> }
> sumtype A
> {
>     Default,
>     B b
> }
>
> A a;
> if(a.b?.c?.value?)
>   ...
> ```
>
> Compare with if(?a.b && ?a.b.c && ?a.b.c.value)
>
> This would also allow a natural improvement of the ternary expression: a.b?:123 === (a.b?) ? (a.b) : (123). (Or maybe just a.b ? 123?)
>
> Also, if you ever think about adding an optional type, a ?-suffix would fit perfectly here:
>
> ```d
> int? optional;
> if(optional?)
>   writeln("it's set to ", optional);
>
> string? s;
> writeln("the value is ", s?:"not set");
> ```

That would not generally work, or that would not be efficient. In the background, "optional access" allocates a default value at the use site, meaning an alloca, and then you select the default or the right one if evaluation has succeeded. Optional access is more tied to the language. Optional access does not generate good unoptimized code; let's not make its D version worse than what already exists.

You see a thing like

```d
a?.b = c
```

is like

```d
typeof(a.b) fallback;
(a.b ? a.b : fallback) = c;
```

where the ternary yields an lvalue.
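
A small sketch of that lowering in current D, under the assumption that the optional is a nullable pointer and that the null test is on `a` itself. `Inner` and `assignOptional` are hypothetical names. It relies on D's conditional expression being an lvalue when both arms are lvalues of the same type.

```d
struct Inner { int b; }

// Hypothetical hand-written lowering of `a?.b = c`.
// Returns the fallback so the null case can be observed from outside.
int assignOptional(Inner* a, int c)
{
    int fallback;                 // default value allocated at the use site
    (a ? a.b : fallback) = c;     // both arms are int lvalues, so this assigns
    return fallback;
}

unittest
{
    Inner x;
    assert(assignOptional(&x, 7) == 0); // fallback untouched
    assert(x.b == 7);                   // write went to the real member
    assert(assignOptional(null, 7) == 7); // write was absorbed by the fallback
}
```

In the null case the store still happens, only into a temporary that is then discarded, which is the cost being pointed out.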

This causes problems with members that will be eventually used in optional accesses. The default value must be related to that particular expression.

Using a sum type for optional access is still a hack.

November 30, 2022

Re: draft proposal for Sum Types for D

Posted by Basile.B
in reply to Basile.B

On Wednesday, 30 November 2022 at 14:22:08 UTC, Basile.B wrote:

> On Wednesday, 30 November 2022 at 08:49:00 UTC, Andrey Zherikov wrote:
>> [...]
>
> You see a thing like
>
> ```d
> a?.b = c
> ```
>
> is like
>
> ```d
> typeof(a.b) fallback;
> (a.b ? a.b : fallback) = c;
> ```
>
> where the ternary yields an lvalue.
>
> This causes problems with members that will be eventually used in optional accesses. The default value must be related to that particular expression.
>
> Using a sum type for optional access is still a hack.

TLDR; I mean the member of an aggregate is not necessarily a sumtype, but you want optional access on it. Because of that I think that optional access should be a well defined expression in a language. Let's not merge two things because they partially intersect.

November 30, 2022

Re: draft proposal for Sum Types for D

Posted by Quirin Schroll
in reply to Timon Gehr

On Tuesday, 29 November 2022 at 14:46:35 UTC, Timon Gehr wrote:

> On 11/29/22 07:26, Walter Bright wrote:
>> Go ahead, Make My Day! Destroy!
>>
>> https://github.com/WalterBright/DIPs/blob/sumtypes/DIPs/1NNN-(wgb).md
>
> Maybe consider changing the syntax to something like:
>
> ```d
> sumtype ST{
>     a;
>     int* b;
> }
> ```
>
> The reason is that with comma-separated values, metaprogramming is hobbled.
>
> [snip]
>
> Similar for static foreach. The fact that this does not work for enums is among the most annoying limitations of enums.

I wanted to write something like this in my post, but it really didn’t fit. As for enum, can’t semicolon syntax just be added? Can’t we just both allow enum X { A; B; } and enum { A, B }? (We don’t need to deprecate the comma syntax, it works fine except when it’s a hindrance. It’s not evil.)

November 30, 2022

Re: draft proposal for Sum Types for D

Posted by Adam D Ruppe
in reply to Walter Bright

On Wednesday, 30 November 2022 at 03:23:42 UTC, Walter Bright wrote:
> As I expected, the threads on sum types have a lot of posts, and each reply I make spawns many more. I apologize if I don't respond in depth to all of them.

This phenomenon is what prompted my newest blog post:
http://dpldocs.info/this-week-in-d/Blog.Posted_2022_11_28.html#dip-dip,-part-2

I'd formalize some of the feedback stuff to ensure discussions happen before feedback and to remove duplicates.
