On Saturday, 22 June 2024 at 21:02:34 UTC, Richard (Rikki) Andrew Cattermole wrote:
> This proposal is a subset of matching capabilities to allow for tagged unions to safely access with language support its values and handle each tag.
> Some minor things have been changed from the ideas thread, I have changed the match block to be a declaration block to allow for `static foreach` and other conditional compilation language features. So it is now using semicolon instead of colon.
>
> alias MTU = MyTaggedUnion!(int, float, string);
> MTU mtu = MTU(1.5);
> mtu.match {
>     (float v) => writeln("a float! ", v);
>     v => writeln("catch all! ", v);
> };
>
> Ideas thread: https://forum.dlang.org/post/chzxzjiwsxmvnkthbdyy@forum.dlang.org
> Latest: https://gist.github.com/rikkimax/79cbe199618b3f99104f7df2fc2a9681
> Permanent: https://gist.github.com/rikkimax/79cbe199618b3f99104f7df2fc2a9681/95ae646da1ebb079a522b0c993e3408e5a1c0d78
I guess I have implemented something like that: https://d.godbolt.org/z/ePv4ndxeE
If I understand you correctly, we share the vision of a tagged union (I call them enum unions) as a type with certain members (duck typing), not the instance of a particular template.
But that’s where it seems our views diverge. In my implementation, the tag also allows distinct same-type options. (Options are discerned by tag, not by type.)
A type with the appropriate members is (usually) generated by mixing in a given mixin template (`EnumUnion`) which takes one parameter of struct type (usually a small private struct named `Impl`) and uses its data members (types and names) for types and tags. (I used to have `EnumUnion` take an array of string for names and a type tuple for types, but those get really long really fast and error messages become incomprehensible name–type gibberish.)
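How a mixin template might read tags off such a struct can be sketched with `std.traits`. This is not the actual `EnumUnion`; `TagsOf` and `Demo` are hypothetical names, and the sketch only derives a tag enum from the field names, assuming a compiler that supports multi-argument `mixin`.

```d
import std.array : join;
import std.traits : Fields, FieldNameTuple;

// Hypothetical helper, not the real EnumUnion: it only turns the field
// names of Impl into the members of an enum called Tag.
mixin template TagsOf(Impl)
{
    mixin("enum Tag { ", [FieldNameTuple!Impl].join(", "), " }");
}

struct Demo
{
    private static struct Impl
    {
        int constant;
        string variable;
    }

    mixin TagsOf!Impl;
}

// Field names become tags, in declaration order.
static assert(Demo.Tag.constant == 0);
static assert(Demo.Tag.variable == 1);
// Field types stay available for the payload of each option.
static assert(is(Fields!(Demo.Impl)[1] == string));

void main() {}
```

Passing one struct keeps each tag next to its type, which is presumably why the error messages stay readable compared to parallel name and type lists.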
Example time! Let’s say we want simple expression parsing where an expression is a constant, a variable, a unary minus expression, or a binary plus or times expression.
class Expr
{
    struct Binary { Expr lhs, rhs; }

    private static struct Impl
    {
        int constant;
        string variable;
        Expr minus;
        Binary plus, times;
    }

    mixin EnumUnion!Impl;
    // Provides: Constructors, a destructor if needed (not this case),
    // eponymous accessors (@safe get and @system set), @system re-assignment,
    // and some other stuff with two underscores in front.
}
Accessors:
- `constant`, `variable`, etc. getters return the constant/variable/… if the option is active, otherwise `assert(0)` with error message.
- `constant`, `variable`, etc. setters make the constant/variable/… option active and assign a value. (@system)

Among the other stuff:
- `__is_constant`, `__is_variable`, etc. return a boolean telling whether the option is active.
- `__as_constant`, `__as_variable`, etc. return a pointer to the mentioned option if it’s active, or `null`. Essentially a safe cast. Similar to `key in aa` for associative array lookup.
- `__unsafe_constant`, `__unsafe_variable`, etc. return a reference, checked by an `in` contract. (@system)
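To make the accessor list concrete, here is a hand-written stand-in for what a generated two-option type might contain. It is only a sketch: `IntOrString` and all member names are hypothetical, `isNumber`/`asNumber`/`asText` stand in for the `__is_…`/`__as_…` members, and the `@safe`/`@system` split is left out.

```d
// Hand-written stand-in for a generated two-option enum union.
// All names are hypothetical; the real members are produced by EnumUnion.
struct IntOrString
{
    private enum Tag { number, text }
    private Tag tag;
    private union
    {
        int number_;
        string text_;
    }

    this(int v)    { tag = Tag.number; number_ = v; }
    this(string v) { tag = Tag.text;   text_ = v; }

    // Getter: returns the payload if the option is active, otherwise halts.
    @property int number()
    {
        if (tag != Tag.number)
            assert(0, "number is not the active option");
        return number_;
    }

    // Setter: makes the option active and assigns a value.
    @property void number(int v) { tag = Tag.number; number_ = v; }

    // Counterpart of __is_number: is this option active?
    bool isNumber() { return tag == Tag.number; }

    // Counterparts of __as_number / __as_text: pointer if active, else null.
    int* asNumber()  { return tag == Tag.number ? &number_ : null; }
    string* asText() { return tag == Tag.text ? &text_ : null; }
}

void main()
{
    auto u = IntOrString("x");
    assert(!u.isNumber());
    assert(u.asNumber() is null);   // inactive option: null, like a failed `key in aa`
    assert(*u.asText() == "x");

    u.number = 7;                   // setter switches the active option
    assert(u.isNumber());
    assert(u.number == 7);
}
```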
We’re not done! Because enum unions aren’t simply instances of a template, but just duck-typed stuff, enum union types can be classes or structs depending on your needs and can have additional members!
class Expr
{
    …
    int eval(int[string] context) => this.matchOrdered!(
        (constant) => constant,
        (variable) => context[variable],
        (minus) => -minus.eval(context),
        (plus) => plus.lhs.eval(context) + plus.rhs.eval(context),
        (times) => times.lhs.eval(context) * times.rhs.eval(context),
    );
}
What is `matchOrdered`? A template defined in the same module as `EnumUnion`. It requires that all cases be handled (no default/catch-all) and in order of tags, that is, if you swap `(constant) => constant,` and `(variable) => context[variable],` you get an error. You get that error because the parameter and tag names don’t line up, not because of a coincidental type mismatch.
There is also `match`, which also requires that all cases be handled, but in any order. Handlers are inspected for the names of their parameters, get reordered, and passed to `matchOrdered`. Generally, prefer `matchOrdered`, as you get better diagnostics.
There are also `matchOrderedDefault` and `matchDefault`, which consider their last argument a default/catch-all handler.
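The core of an ordered match can be sketched as a `final switch` over the tag that calls the handler at the same position. This is a simplified illustration, not the actual `matchOrdered`: `IntOrString` is a hypothetical two-option type, and unlike the real template this sketch dispatches purely by position and does not check parameter names against tags.

```d
// Hypothetical two-option tagged union used only for this sketch.
struct IntOrString
{
    enum Tag { number, text }
    Tag tag;
    union
    {
        int number;
        string text;
    }

    this(int v)    { tag = Tag.number; number = v; }
    this(string v) { tag = Tag.text;   text = v; }
}

// One handler per tag, in tag order. final switch makes the compiler
// reject the function if a tag is left unhandled.
auto matchOrdered(handlers...)(IntOrString u)
{
    static assert(handlers.length == 2, "one handler per tag, in tag order");
    final switch (u.tag)
    {
        case IntOrString.Tag.number: return handlers[0](u.number);
        case IntOrString.Tag.text:   return handlers[1](u.text);
    }
}

void main()
{
    auto a = IntOrString(21);
    auto b = IntOrString("hello");

    // Both handlers return int, so the result type is unambiguous.
    assert(a.matchOrdered!((number) => number * 2,
                           (text) => cast(int) text.length)() == 42);
    assert(b.matchOrdered!((number) => number * 2,
                           (text) => cast(int) text.length)() == 5);
}
```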
Tags are also used for construction (named parameters). If, by types, construction is ambiguous, a tag can be used to clarify:
void main() @safe
{
    // Build (-2) * 1 + (-x)
    immutable Expr expr = new Expr(plus: Expr.Binary(
        new Expr(times: Expr.Binary(
            new Expr(-2),
            new Expr(1)
        )),
        new Expr(minus: new Expr("x"))
    ));
    import std.stdio;
    writeln(expr, " = ", expr.eval(["x": 1]));
}
For `plus` and `times`, tags are required as they’re indistinguishable otherwise. For `minus`, the tag is optional, but helps understanding what’s built. For variables and constants, tags aren’t used in the example.
You could use enum unions to back sum types:
struct SumType(Ts...)
{
    private static struct Impl
    {
        static foreach (i, alias T; Ts)
            mixin("T field", cast(int) i, ";");
    }

    mixin EnumUnion!Impl;
}
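To see what that generation loop produces, it can be isolated from `EnumUnion` and inspected with `std.traits`. `GeneratedImpl` is a hypothetical helper name, and the loop uses a plain `T` loop variable instead of `alias T`; otherwise it is the same idea.

```d
import std.traits : Fields, FieldNameTuple;

// Hypothetical helper: only the field-generating loop, without EnumUnion,
// so the resulting members can be checked at compile time.
struct GeneratedImpl(Ts...)
{
    static foreach (i, T; Ts)
        mixin("T field", cast(int) i, ";"); // declares field0, field1, ...
}

alias Impl = GeneratedImpl!(int, float, string);

// One numbered field per type, in the order the types were given.
static assert([FieldNameTuple!Impl] == ["field0", "field1", "field2"]);
static assert(is(Fields!Impl[0] == int));
static assert(is(Fields!Impl[2] == string));

void main() {}
```

The tags of such a sum type would then be `field0`, `field1`, and so on, so matching by type has to be layered on top of the name-based tags.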
From what I see, you want to make `match` an intrinsic, and TBH, the value of

x.match {
    // handlers
}

over

x.match!(
    // handlers
)

is negligible.
The value of being a first-class language construct is similar to the `foreach` → `opApply` lowering: `return` and other control-flow statements in the handlers could get lowered some way. Allowing that for arbitrary lambdas would be powerful and essentially allow programmers to implement custom control-flow statements.
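For reference, this is what the existing `foreach` → `opApply` lowering already does with `return` in a loop body. `Countdown` and `firstMultipleOf` are made-up names for this sketch; only `opApply` itself is a language feature.

```d
// A type that drives iteration through opApply instead of range primitives.
struct Countdown
{
    int from;

    int opApply(scope int delegate(int) dg)
    {
        for (int i = from; i >= 1; --i)
        {
            // The loop body is passed in as dg. A non-zero result means
            // the body wants to leave the loop (break, return, goto).
            if (auto result = dg(i))
                return result;
        }
        return 0;
    }
}

int firstMultipleOf(int n, int start)
{
    foreach (i; Countdown(start))
    {
        if (i % n == 0)
            return i; // leaves firstMultipleOf, not just the delegate
    }
    return -1;
}

void main()
{
    assert(firstMultipleOf(4, 10) == 8);
    assert(firstMultipleOf(7, 5) == -1);
}
```

The `return i` inside the loop body compiles to a delegate that records the value and returns non-zero, which is the kind of lowering a first-class `match` could apply to its handlers.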