On Thursday, 8 February 2024 at 15:42:25 UTC, Richard (Rikki) Andrew Cattermole wrote:
>Yesterday I mentioned that I wasn't very happy with Walter's design of sum types, at least as per his write-up in his DIP repository.
>I have finally after two years written up an alternative to it, that should cover everything you would expect from such a language feature.
>There are also a couple of key differences with regards to the tag and ABI that will make value type exceptions aka zero cost exceptions work fairly fast.
Thanks for the writeup. I read both DIPs. Honestly, both of them need improvement IMO. In their present state, I prefer Walter's DIP, mainly because the details there are better nailed down.
Problems in Walter's DIP
We don't want this special case for pointers - or at least it needs to be much, much more refined before it carries its weight. If I have `sumtype S { a, int* b }`, then `S.a == S.b(null)`, right? Well, why doesn't the DIP say the same should happen with `sumtype S { a, Object b }`? An even more interesting case: `sumtype S { a, b, c, d, bool e }`. A boolean has 254 illegal bit patterns - shouldn't they be used for the tag in this case? And what happens with `sumtype S { a, int* b, int* c }`? Since we need space for a separate tag anyway, does it make sense for a null `b` to be equal to `a`?
The proposed special case doesn't help much. If one wants a pointer and a special null value, one can simply use a pointer. On the other hand, one might want a pointer AND a separate tag value. To accomplish that, the user will have to either put the 0 value at the end or do something like `sumtype S { int[0] a, int* b }`. Certainly doable, but it's a special case with no good reason.
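For comparison, Phobos's library-level `std.sumtype` always stores an explicit tag, so a null value in a pointer member stays distinct from a separate "empty" member - a rough sketch of exactly the distinction the DIP's special case would collapse:

```d
import std.sumtype;

// Phobos's SumType keeps an explicit tag, so a member holding a null
// int* is a different value from a separate typeof(null) member.
alias S = SumType!(typeof(null), int*);

void main()
{
    S empty = S(null);              // the typeof(null) member
    S nullPtr = S(cast(int*) null); // the int* member, holding null
    assert(empty.has!(typeof(null)));
    assert(nullPtr.has!(int*));
    assert(!nullPtr.has!(typeof(null)));
}
```

A built-in sumtype that folds a null pointer into another member's tag could not represent these two states separately.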
The query expression is not a good idea. It introduces new syntax that isn't consistent with the rest of the language. Instead, I propose that each sumtype has a member function `has` that returns a DRuntime-defined nested struct with an `opDispatch` defined for querying the tag:
```d
sumtype Sum { int a, float b, dchar c }
auto sum = Sum.b(2.5);
assert(!sum.has.a);
assert(sum.has.b);
assert(!sum.has.c);
```
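For reference, Phobos's `std.sumtype` already offers a type-keyed version of this query, `has!T`. A sketch of the analogous check, using the library type rather than the proposed built-in:

```d
import std.sumtype;

// std.sumtype queries by type rather than by member name.
alias Sum = SumType!(int, float, dchar);

void main()
{
    auto sum = Sum(2.5f);
    assert(!sum.has!int);   // not currently holding the int member
    assert(sum.has!float);  // holds the float member
    assert(!sum.has!dchar);
}
```

The proposed `opDispatch` form would mainly add name-based rather than type-based lookup on top of this.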
Alternatively, we can settle for simply providing a way for the user to get the tag of the sumtype. Then he can use that tag as he'd use a regular enum. In fact, we will want to provide tag access in any case, because the sum type is otherwise too hard to use in `switch` statements.
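To illustrate why raw tag access matters for `switch`, here is a minimal hand-rolled tagged union (the names are hypothetical, not from either DIP) dispatched with `final switch` - the pattern a built-in sumtype would need to support:

```d
// Minimal hand-rolled tagged union; a built-in sumtype would need to
// expose something equivalent to `tag` for final switch to work.
struct Sum
{
    enum Tag { a, b, c }
    Tag tag;
    union { int a; float b; dchar c; }
}

void main()
{
    Sum s;
    s.tag = Sum.Tag.b;
    s.b = 2.5;
    final switch (s.tag)
    {
        case Sum.Tag.a: assert(false); break;
        case Sum.Tag.b: assert(s.b == 2.5f); break;
        case Sum.Tag.c: assert(false); break;
    }
}
```

With `final switch` the compiler checks that every tag value is handled, which is much of the appeal of exposing the tag as an ordinary enum.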
Problems in Rikki's DIP
Like Timon said, the types proposed don't seem to know whether they are supposed to be a unique type. Consider that any tuple can be used to initialise part of another tuple: `Tuple!(int, int, char, char)` can be initialised with `tuple(5, tuple(10, 'x').expand, '\n')`. It makes sense - tuples are defined by their contents and beyond that have no identity of their own. However, there are excellent reasons why you can't do `std.datetime.StopWatch(999l.nullable.expand, 9082l)`. You aren't supposed to just declare any random bool and two longs as stopwatches just because their internal representation happens to be that. Structs are not just names for tuples; they're independent types that shouldn't be implicitly mixable unless the struct author explicitly declares so.
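The structural behaviour described above can be checked with today's `std.typecons.Tuple`:

```d
import std.typecons;

void main()
{
    // A tuple's contents can be spliced into another tuple via .expand:
    // tuples are structural, identified only by their element types.
    auto inner = tuple(10, 'x');
    Tuple!(int, int, char, char) t = tuple(5, inner.expand, '\n');
    assert(t[0] == 5);
    assert(t[1] == 10);
    assert(t[2] == 'x');
    assert(t[3] == '\n');
}
```

No equivalent splicing is allowed into a `struct` without the author opting in, which is the nominal-vs-structural distinction at issue.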
By saying that a sumtype is always implicitly convertible to another sumtype that can structurally hold the same values, you're making it the tuple of sum types. If the user wants to protect the details, he must put it inside a struct or a union. But this feels wrong:
```d
struct MySumType
{
    sumtype Impl = int a | float b | dchar c;
    Impl impl;
}
```
Why do I need to invent three names for this? If I want to define a tuple type that doesn't mix/match freely, I need just one name for the struct I use for that.
If you insist on this implicit conversion thing, I propose that sum types don't have names by default. Instead, they would become part of the type declaration syntax. `void` would be the type for members with no values besides the tag, and array indexes would be used for getting the members:
```d
double | float sumTypeInstance = 3.4;
alias SumTypeMixable = int | float | dchar;

struct SumTypeUnmixable
{
    short | wchar | ubyte[2] members;
    alias asShort = members[0];
    alias asWchar = members[1];
    alias asBytePair = members[2];
}
```
Then again, the problem would be: how do you name the members this way? Maybe it could work with UDAs. `double a | float b | :c sumTypeInstance` could be rewritten to `@memberName(0, "a") @memberName(1, "b") @memberName(2, "c") double | float | void sumTypeInstance`. The compiler would check for those UDAs on the symbol when accessing members via name, and also propagate the UDAs of an alias to any declaration done using it. I suspect this rabbit hole goes a bit too deep, though:
```d
alias Type1 = int a | float b;
alias Type2 = int b | float a;
// What would be the member names of this? Sigh.
auto sumtype = [Type1.init, Type2.init];
```
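For what it's worth, the UDA-lookup half of the idea can be prototyped today with `std.traits.getUDAs`; `memberName` here is the hypothetical UDA sketched above, not an existing symbol:

```d
import std.traits : getUDAs;

// Hypothetical UDA carrying the member index/name mapping.
struct memberName { size_t index; string name; }

@memberName(0, "a") @memberName(1, "b")
struct Example {}

void main()
{
    // getUDAs yields the attached UDA values at compile time.
    enum udas = [getUDAs!(Example, memberName)];
    assert(udas[0] == memberName(0, "a"));
    assert(udas[1] == memberName(1, "b"));
}
```

The hard part is not reading the UDAs but deciding which declaration's UDAs win when aliases are combined, as in the snippet above.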
So okay, I don't have very good ideas. Maybe we should just require putting the sum type inside another type if naming is desired.
There is more I could say on both of these DIPs, but I've used a good deal of time on this post already. Maybe I'll do some more another time.