On Thursday, 8 February 2024 at 15:42:25 UTC, Richard (Rikki) Andrew Cattermole wrote:
>Yesterday I mentioned that I wasn't very happy with Walter's design of sum types, at least as per his write-up in his DIP repository.
>I have finally after two years written up an alternative to it, that should cover everything you would expect from such a language feature.
>There are also a couple of key differences with regards to the tag and ABI that will make value type exceptions aka zero cost exceptions work fairly fast.
Thanks for the writeup. I read both DIPs. Honestly, both of them need improvement IMO. In their present state, I prefer Walter's DIP, mainly because the details there are better nailed down.
Problems in Walter's DIP
We don't want this special case for pointers - or at least it needs to be much, much more refined before it carries its weight. If I have `sumtype S { a, int* b }`, then `S.a == S.b(null)`, right? Well, why doesn't the DIP say the same should happen with `sumtype S { a, Object b }`? An even more interesting case: `sumtype S { a, b, c, d, bool e }`. A boolean has 254 illegal bit patterns - shouldn't they be used for the tag in this case? And what happens with `sumtype S { a, int* b, int* c }`? Since we need space for a separate tag anyway, does it make sense for a null `b` to be equal to `a`?
The proposed special case doesn't help much. If one wants a pointer and a special null value, one can simply use a pointer. On the other hand, one might want a pointer AND a separate tag value. To accomplish that, the user will have to either put the 0 value at the end or do something like `sumtype S { int[0] a, int* b }`. Certainly doable, but it's a special case with no good reason.
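For comparison, Phobos's library-level `std.sumtype` always stores an explicit tag, so a null value in a pointer member stays distinct from a separate "empty" member - a rough sketch of exactly the distinction the DIP's special case would collapse:

```d
import std.sumtype;

// Phobos's SumType keeps an explicit tag, so a member holding a null
// int* is a different value from a separate typeof(null) member.
alias S = SumType!(typeof(null), int*);

void main()
{
    S empty = S(null);              // the typeof(null) member
    S nullPtr = S(cast(int*) null); // the int* member, holding null
    assert(empty.has!(typeof(null)));
    assert(nullPtr.has!(int*));
    assert(!nullPtr.has!(typeof(null)));
}
```

A built-in sumtype that folds a null pointer into another member's tag could not represent these two states separately.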
The query expression is not a good idea. It introduces new syntax that isn't consistent with the rest of the language. Instead, I propose that each sumtype has a member function `has` that returns a DRuntime-defined nested struct with an `opDispatch` defined for querying the tag:
```d
sumtype Sum { int a, float b, dchar c }
auto sum = Sum.b(2.5);
assert(!sum.has.a);
assert(sum.has.b);
assert(!sum.has.c);
```
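For reference, Phobos's `std.sumtype` already offers a type-keyed version of this query, `has!T`. A sketch of the analogous check, using the library type rather than the proposed built-in:

```d
import std.sumtype;

// std.sumtype queries by type rather than by member name.
alias Sum = SumType!(int, float, dchar);

void main()
{
    auto sum = Sum(2.5f);
    assert(!sum.has!int);   // not currently holding the int member
    assert(sum.has!float);  // holds the float member
    assert(!sum.has!dchar);
}
```

The proposed `opDispatch` form would mainly add name-based rather than type-based lookup on top of this.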
Alternatively, we can settle for simply providing a way for the user to get the tag of the sumtype. Then he can use that tag as he'd use a regular enum. In fact, we will want to provide tag access in any case, because the sum type is otherwise too hard to use in `switch` statements.
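To illustrate why raw tag access matters for `switch`, here is a minimal hand-rolled tagged union (the names are hypothetical, not from either DIP) dispatched with `final switch` - the pattern a built-in sumtype would need to support:

```d
// Minimal hand-rolled tagged union; a built-in sumtype would need to
// expose something equivalent to `tag` for final switch to work.
struct Sum
{
    enum Tag { a, b, c }
    Tag tag;
    union { int a; float b; dchar c; }
}

void main()
{
    Sum s;
    s.tag = Sum.Tag.b;
    s.b = 2.5;
    final switch (s.tag)
    {
        case Sum.Tag.a: assert(false); break;
        case Sum.Tag.b: assert(s.b == 2.5f); break;
        case Sum.Tag.c: assert(false); break;
    }
}
```

With `final switch` the compiler checks that every tag value is handled, which is much of the appeal of exposing the tag as an ordinary enum.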
Problems in Rikki's DIP
Like Timon said, the types proposed don't seem to know whether they are supposed to be a unique type. Consider that any tuple can be used to initialise part of another tuple: `Tuple!(int, int, char, char)` can be initialised with `tuple(5, tuple(10, 'x').expand, '\n')`. It makes sense - tuples are defined by their contents and beyond that have no identity of their own. However, there are excellent reasons why you can't do `std.datetime.StopWatch(999l.nullable.expand, 9082l)`. You aren't supposed to just declare any random bool and two longs as stopwatches just because their internal representation happens to be that. Structs are not just names for tuples; they're independent types that shouldn't be implicitly mixable unless the struct author explicitly declares so.
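The structural behaviour described above can be checked with today's `std.typecons.Tuple`:

```d
import std.typecons;

void main()
{
    // A tuple's contents can be spliced into another tuple via .expand:
    // tuples are structural, identified only by their element types.
    auto inner = tuple(10, 'x');
    Tuple!(int, int, char, char) t = tuple(5, inner.expand, '\n');
    assert(t[0] == 5);
    assert(t[1] == 10);
    assert(t[2] == 'x');
    assert(t[3] == '\n');
}
```

No equivalent splicing is allowed into a `struct` without the author opting in, which is the nominal-vs-structural distinction at issue.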
By saying that a sumtype is always implicitly convertible to another sumtype that can structurally hold the same values, you're making it the tuple of sum types. If the user wants to protect the details, he must put it inside a struct or a union. But this feels wrong:
```d
struct MySumType
{
    sumtype Impl = int a | float b | dchar c;
    Impl impl;
}
```
Why do I need to invent three names for this? If I want to define a tuple type that doesn't mix/match freely, I need just one name for the struct I use for that.
If you insist on this implicit conversion thing, I propose that sum types don't have names by default. Instead, they would become part of the type declaration syntax. `void` would be the type for members with no values besides the tag, and array indexes would be used for getting the members:
```d
double | float sumTypeInstance = 3.4;
alias SumTypeMixable = int | float | dchar;

struct SumTypeUnmixable
{
    short | wchar | ubyte[2] members;
    alias asShort = members[0];
    alias asWchar = members[1];
    alias asBytePair = members[2];
}
```
Then again, the problem would be: how do you name the members this way? Maybe it could work with UDAs. `double a | float b | :c sumTypeInstance` could be rewritten to `@memberName(0, "a") @memberName(1, "b") @memberName(2, "c") double | float | void sumTypeInstance`. The compiler would check for those UDAs on the symbol when accessing members via name, and also propagate the UDAs of an alias to any declaration done using it. I suspect this rabbit hole goes a bit too deep, though:
```d
alias Type1 = int a | float b;
alias Type2 = int b | float a;
// What would be the member names of this? Sigh.
auto sumtype = [Type1.init, Type2.init];
```
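For what it's worth, the UDA-lookup half of the idea can be prototyped today with `std.traits.getUDAs`; `memberName` here is the hypothetical UDA sketched above, not an existing symbol:

```d
import std.traits : getUDAs;

// Hypothetical UDA carrying the member index/name mapping.
struct memberName { size_t index; string name; }

@memberName(0, "a") @memberName(1, "b")
struct Example {}

void main()
{
    // getUDAs yields the attached UDA values at compile time.
    enum udas = [getUDAs!(Example, memberName)];
    assert(udas[0] == memberName(0, "a"));
    assert(udas[1] == memberName(1, "b"));
}
```

The hard part is not reading the UDAs but deciding which declaration's UDAs win when aliases are combined, as in the snippet above.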
So okay, I don't have very good ideas. Maybe we should just require putting the sum type inside another type if naming is desired.
There is more I could say on both of these DIPs, but I've used a good deal of time on this post already. Maybe I'll do some more another time.