Type Inference for Struct/Enum Literals
July 05

With language editions on the horizon, I thought it might be worth taking another look at this idea, perhaps with a slightly broadened scope. Let me know what you think.

What & why?

For those not familiar, let's say you have a situation like this:

enum Font{ normal, heading, caption }
struct Colour{ float r,g,b,a=1f; }
class TextStyle{
	private Font font_;
	private Colour fg, bg;
	TextStyle font(Font val){ font_ = val; return this; }
	TextStyle foregroundColour(Colour val){ fg = val; return this; }
	TextStyle backgroundColour(Colour val){ bg = val; return this; }
}

void main(){
	auto textStyle = (new TextStyle)
		.font(Font.heading)
		.foregroundColour(Colour(1f, 0.5f, 0.5f))
		.backgroundColour(Colour(0f, 0f, 1f));
}

If you look at main, you'll see that we're repeating ourselves a lot to call those member functions. What if we weren't forced to write the types each time? Then it could be more like this:

void main(){
	auto textStyle = (new TextStyle)
		.font(.heading)
		.foregroundColour(.(1f, 0.5f, 0.5f))
		.backgroundColour(.(0f, 0f, 1f));
}

Boom, that's it. For any context where an enum literal or struct literal is being assigned/passed/etc. to something with a known type, allow us to omit the type from the literal. I'm not really set on any particular syntax, but here are some thoughts:

  • The . syntax feels natural for enum literals, but not for struct literals. This would have to mean changing the module scope operator (perhaps to ./, like how ./ represents 'here' in a terminal), because otherwise module-level symbols would interfere.
  • My original proposal used $, which I still like for struct literals, but it's a bit random.
  • I don't like the C initialiser syntax ({}) but it could be re-purposed for type-inferred struct literals, which would mean we can basically merge the two.

Implementation

The original enum literal inference DIP was basically fully implemented in the end and could easily be resurrected.

Essentially, you should be able to use inference in any situation where the compiler can trivially infer the type from the context of the literal. Here are a few examples with enums:

enum Enum { foo,bar }
enum Mune { foo,far }

auto x = .bar; //ERR: can't use type inference when there is no type specified anywhere!
auto x = Enum.foo;
x = .bar; //OK: `typeof(x)` is an enum with member `bar`
x = .car; //ERR: no member `car` in enum type `Enum`!

void myFn(Enum x){}

myFn(.bar); //OK: the type of the parameter is an enum with member `bar`
myFn(.car); //ERR: no member `car` in enum type `Enum`!

void myOverload(Enum x){}
void myOverload(Mune x){}

myOverload(.foo); //ERR: `foo` is a member of both `Enum` and `Mune`!
myOverload(.bar); //OK: first overload selected based on `Enum` having member `bar`
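
As a rough illustration of the overload rule above, the "which enum types have this member?" check can be emulated in today's D with __traits(hasMember, ...). This is only a sketch and says nothing about how the DIP implementation worked: the Candidates helper is a hypothetical name made up for this example, while Enum and Mune are the enums from the snippet above.

```d
import std.meta : Filter;

enum Enum { foo, bar }
enum Mune { foo, far }

// Hypothetical helper (not part of the proposal): keeps only the enum
// types that declare a member with the given name.
template Candidates(string member, Types...)
{
    enum hasIt(T) = __traits(hasMember, T, member);
    alias Candidates = Filter!(hasIt, Types);
}

// `.bar` would be unambiguous: only Enum has a member `bar`.
static assert(Candidates!("bar", Enum, Mune).length == 1);
static assert(is(Candidates!("bar", Enum, Mune)[0] == Enum));

// `.foo` would be ambiguous: both enums have a member `foo`.
static assert(Candidates!("foo", Enum, Mune).length == 2);

// `.car` would match nothing.
static assert(Candidates!("car", Enum, Mune).length == 0);

void main() {}
```

Exactly one candidate corresponds to the OK case, two candidates to the ambiguity error, and zero to the missing-member error.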

The specifics of how this interacts with existing features (e.g. array literals) very much depend on how those features infer type. Here's a good example:

Enum getEnum() => .bar; //OK: return type is an enum with member `bar`
Enum x = (() => .bar)(); //it's probably not worth trying to make this work
auto y = (int num){
	switch(num){
		case 0: return Enum.foo;
		case 1: return .bar; //this will only work if D's compiler frontend knows the return type of this lambda based on the first return statement.
		default: assert(0);
	}
}(1);

Other languages

Enum literals having their type contextually inferred is a feature in many modern programming languages like Zig, V, Swift, and Odin. Java has something similar, but it's useless because it only applies to switch statements.
For struct literals, however, the closest thing I can think of is C's struct initialisers.

July 06

On Friday, 5 July 2024 at 13:19:44 UTC, IchorDev wrote:

>
  • The . syntax feels natural for enum literals, but not for struct literals. This would have to mean changing the module scope operator (perhaps to ./, like how ./ represents 'here' in a terminal), because otherwise module-level symbols would interfere.
  • My original proposal used $, which I still like for struct literals, but it's a bit random.
  • I don't like the C initialiser syntax ({}) but it could be re-purposed for type-inferred struct literals, which would mean we can basically merge the two.

OK, so in fact, we have an interesting possibility here. With editions we could potentially overtake the . prefix, and use something else for global scope.

The biggest problem with this is -- it breaks all existing knowledge of what this does. It is, for sure, the most natural syntax for this feature. One might argue that $.symbol could mean "global symbol" instead (or something else).

It's a pretty big lift for this, and I don't see it being accepted. Already there is pushback from the language maintainers on the concept itself (from the original review).

However, I would love to see this inference piece get into the language, whatever the syntax. I very much enjoyed it in my Swift programs. I would guess that the best possibility to be accepted would be to have syntax that doesn't conflict with existing syntax.

-Steve

July 06

On Saturday, 6 July 2024 at 02:49:32 UTC, Steven Schveighoffer wrote:

>

On Friday, 5 July 2024 at 13:19:44 UTC, IchorDev wrote:

>
  • The . syntax feels natural for enum literals, but not for struct literals. This would have to mean changing the module scope operator (perhaps to ./, like how ./ represents 'here' in a terminal), because otherwise module-level symbols would interfere.
  • My original proposal used $, which I still like for struct literals, but it's a bit random.
  • I don't like the C initialiser syntax ({}) but it could be re-purposed for type-inferred struct literals, which would mean we can basically merge the two.

OK, so in fact, we have an interesting possibility here. With editions we could potentially overtake the . prefix, and use something else for global scope.

The biggest problem with this is -- it breaks all existing knowledge of what this does. It is, for sure, the most natural syntax for this feature. One might argue that $.symbol could mean "global symbol" instead (or something else).

It's a pretty big lift for this, and I don't see it being accepted. Already there is pushback from the language maintainers on the concept itself (from the original review).

However, I would love to see this inference piece get into the language, whatever the syntax. I very much enjoyed it in my Swift programs. I would guess that the best possibility to be accepted would be to have syntax that doesn't conflict with existing syntax.

-Steve

The pushback is from people who never got to learn new languages, they believe this is what everyone wants to write

MySelfExplanatoryType flag = MySelfExplanatoryType.A | MySelfExplanatoryType.B | MySelfExplanatoryType.C;

And when they tell you that "you can use an alias", it's when you know they are being dishonest, the point is not to make things unreadable or obfuscated, it's to avoid repetition and to make code more concise in places that are relevant

I use many languages, and the ones I want to stick with the most are the ones that let me write concise code; recently it's been Odin for me

change_state(State.IDLE);

There is no reason to repeat "state" in that line

same here:

switch (player_state)
{
    case State.IDLE: /* .. */ break;
}

import std.stdio;
import module_a.module_b.module_c;

void main() {
    module_a.module_b.module_c.Data data;
    data.value = int(42);
    print_data(data);
    module_a.module_b.module_c.this_is_a_function(module_a.module_b.module_c.MyEnum.A);
}

void print_data(module_a.module_b.module_c.Data data) {
    std.stdio.writeln(data);
}

Now you understand the value of concise code ;)

I think rikki proposed :, perhaps that's the way

July 06

On Saturday, 6 July 2024 at 02:49:32 UTC, Steven Schveighoffer wrote:

>

One might argue that $.symbol could mean "global symbol" instead (or something else).

Hey, not a bad idea! I think that could work for struct literals like $(a, b, c), and then they could both use the same prefix. Probably not with $ though—Atila said it was ‘unsightly’.
That said, the plain .symbol does look rather more 'sightly' than having any prefix for enum literals in my opinion. I was very into the idea of using \ during my original proposal, but I remember there being some really stupid issue with it. Might be worth trying it again though.

>

Already there is pushback from the language maintainers on the concept itself (from the original review).

On the contrary, Walter was pretty much on-board with enum literal type inference, he just didn't like it working with function overloads because he was concerned about performance (which I'm pretty sure UplinkCoder already solved), and he didn't want there to be a prefix; you'd just write the enum member like you would in C. Like so:

enum A{ a,b,c,d }
struct S{ A one, two; }

void main(){
	A    myA1 = b;      // myA1 = A.b
	A    myA2 = b | c;  // myA2 = A.d
	auto myA3 = b;      // error, b is undefined
	
	S myS;
	myS.one = c; // myS.one = A.c
	myS.two = d; // myS.two = A.d
}

Back then I was strongly against this syntax due to its ambiguity, breaking existing code, and also because it means you always have to check if the type of an assignment/parameter is an enum (so the best-case performance is much worse). I think if it was the only choice it'd be better than nothing, but I think it'd cause people a lot of headaches compared to a more explicit syntax. For struct literals though, I think implementing the same prefix-less syntax would be a complete and utter nightmare.
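
To make the ambiguity concern concrete, here is a minimal sketch in today's D. The enum A is the one from the snippet above; the copyLocal function and the local variable named b are made up for this illustration. Today the marked line can only refer to the local. With a prefix-less syntax the same spelling would also be a candidate for A.b, and how such a clash would be resolved is not stated in this thread, so treat that part as an assumption.

```d
enum A { a, b, c, d }

// Hypothetical function for illustration only.
A copyLocal()
{
    A b = A.c; // a local that happens to share its name with member A.b
    A x = b;   // today: unambiguously the local, so x is A.c
    return x;
}

void main()
{
    // Under current D rules the local wins because there is nothing else
    // the bare identifier `b` could refer to.
    assert(copyLocal() == A.c);
}
```

An explicit prefix avoids the question entirely, since a prefixed literal can never be confused with an ordinary identifier.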

>

I very much enjoyed it in my Swift programs.

Ah yes, and Swift enums are already sumtypes. I guess there might be potential for sumtype literal type inference down the road.

>

I would guess that the best possibility to be accepted would be to have syntax that doesn't conflict with existing syntax.

I doubt it. My original proposal didn't conflict with any existing syntax, whereas Walter's counter-proposal would've broken a fair bit of code.
A lot of people said they'd only be on board if I replaced the module scope operator, which I was against at the time because I didn't want the feature to break anyone's code. With editions on the horizon, I'm down for pretty much whatever syntax Walter and Atila could agree on.

July 06

On Saturday, 6 July 2024 at 08:04:36 UTC, ryuukk_ wrote:

>

The pushback is from people who never got to learn new languages, they believe this is what everyone wants to write

MySelfExplanatoryType flag = MySelfExplanatoryType.A | MySelfExplanatoryType.B | MySelfExplanatoryType.C;

And when they tell you that "you can use an alias", it's when you know they are being dishonest, the point is not to make things unreadable or obfuscated, it's to avoid repetition and to make code more concise in places that are relevant

Hey, precisely. :)

>
import std.stdio;
import module_a.module_b.module_c;

void main() {
    module_a.module_b.module_c.Data data;
    data.value = int(42);
    print_data(data);
    module_a.module_b.module_c.this_is_a_function(module_a.module_b.module_c.MyEnum.A);
}

void print_data(module_a.module_b.module_c.Data data) {
    std.stdio.writeln(data);
}

That's a very amusing example. I never thought of how much more verbose D could be without its syntactic sugar. You could've added some range copying and eponymous templates in there for good measure. ;)
With D being a reasonably modern language and having a lot of shortcuts for writing otherwise cumbersome things, I really always expected it to have a shortcut for enum literals, since enum literals are the sort of thing so often used in places where specifying the type doesn't help readability at all.

>

I think rikki proposed :, perhaps that's the way

I'm not sure if it's the most readable thing when compared to just . for enum literals, but it's certainly a decent option if we're against replacing the module scope operator. For struct literals it could work too. One thing is now that we have named function parameters it might result in a bit of a colon headache: myFn(x: :a, y: cond ? :c : :d);

July 07

On Saturday, 6 July 2024 at 08:04:36 UTC, ryuukk_ wrote:

>

The pushback is from people who never got to learn new languages, they believe this is what everyone wants to write

They do not believe that.

The proposal is fine except where it breaks existing code. I don't think editions should redefine syntax without a very good reason. There is a limited budget for breakage per edition. (An example of such a reason is breaking bug-prone code).

>

MySelfExplanatoryType flag = MySelfExplanatoryType.A | MySelfExplanatoryType.B | MySelfExplanatoryType.C;

You could write:

auto flag = { with (MySelfExplanatoryType) return A | B | C; }();
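
For anyone who wants to try the with workaround, here is a self-contained sketch in today's D. The enum definition and its member values are assumptions chosen for this example (the thread never shows them), and combinedFlags is a hypothetical helper name.

```d
// Stand-in flag enum; the member values are assumptions for this sketch.
enum MySelfExplanatoryType { A = 1, B = 2, C = 4 }

// Hypothetical helper showing `with` in an ordinary function body.
MySelfExplanatoryType combinedFlags()
{
    // `with` brings the enum's members into scope for the statement,
    // so the type name is written once instead of three times.
    with (MySelfExplanatoryType)
        return A | B | C;
}

void main()
{
    // The one-liner from the post: an immediately-invoked function literal.
    auto flag = { with (MySelfExplanatoryType) return A | B | C; }();

    assert(flag == combinedFlags());
    assert(cast(int) flag == 7); // 1 | 2 | 4
}
```

Both forms produce the same value; the difference from the proposed literal inference is that the enum type still has to be named once per scope.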
>

And when they tell you that "you can use an alias", it's when you know they are being dishonest, the point is not to make

You don't know they are being dishonest (and it's unhelpful to assume that).

It is valid (and necessary) when considering language changes to compare them against uses of existing features.

>

things unreadable or obfuscated, it's to avoid repetition and to make code more concise in places that are relevant

Do you think using very short names for variables is obfuscation?

July 07

On Friday, 5 July 2024 at 13:19:44 UTC, IchorDev wrote:

>

The specifics of how this interacts with existing features (e.g. array literals) very much depend on how those features infer type. Here's a good example:

Enum getEnum() => .bar; //OK: return type is an enum with member `bar`
Enum x = (() => .bar)(); //it's probably not worth trying to make this work
auto y = (int num){
	switch(num){
		case 0: return Enum.foo;
		case 1: return .bar; //this will only work if D's compiler frontend knows the return type of this lambda based on the first return statement.
		default: assert(0);
	}
}(1);

Just to mention that Vladimir's idea of making an identifier expression have its own type should work for all the cases above, with minimal compiler complexity.
https://forum.dlang.org/post/ysbmglpkvygrpjsggqco@forum.dlang.org

However, that doesn't seem to help for an implicit struct construction expression.

July 07

On Saturday, 6 July 2024 at 11:04:16 UTC, IchorDev wrote:

>

On Saturday, 6 July 2024 at 02:49:32 UTC, Steven Schveighoffer wrote:

>

Already there is pushback from the language maintainers on the concept itself (from the original review).

On the contrary, Walter was pretty much on-board with enum literal type inference, he just didn't like it working with function overloads because he was concerned about performance (which I'm pretty sure UplinkCoder already solved), and he didn't want there to be a prefix; you'd just write the enum member like you would in C.

My memory of it might be flawed, but I felt like the response was "just use with".

There is also a fundamental misunderstanding of the proposal, if there is some question of performance here. The decision of whether the thing matches or not is not any more expensive than the current overloads check.

-Steve

July 08

On Sunday, 7 July 2024 at 19:42:25 UTC, Nick Treleaven wrote:

>

Just to mention that Vladimir's idea of making an identifier expression have its own type should work for all the cases above, with minimal compiler complexity.
https://forum.dlang.org/post/ysbmglpkvygrpjsggqco@forum.dlang.org

That’s clever actually. From memory I think Uplink’s implementation works/worked something like that? It would error with ‘cannot convert from void’ if the type inference failed (at least in the early versions).

Not why it was brought up, but I might as well address that their argument for copying Ruby and Lisp using : is rather weak.
Ruby doesn’t have enums as a language feature, just an enumeration class, so its : is just for generic symbol lookup.
As for suggesting to copy Lisp syntax at all, Lisp’s syntax is fundamentally different from D’s ALGOL-based syntax. We don’t use-kebab-case or have (nested lists with no commas).
Besides that, neither language appears to use : for much else, whereas D uses colons in ways that might conflict with its legibility as a prefix. (See: ternary operators, named parameters)
I still think : is a good option, it’s just not my first pick. People coming from other ALGOL-style languages would naturally expect . so that would be my preference, and failing that I’d like to give a more contextually legible prefix a shot.

>

However, that doesn't seem to help for an implicit struct construction expression.

I don’t see why a similar solution couldn’t work?

July 08

On Sunday, 7 July 2024 at 19:07:10 UTC, Nick Treleaven wrote:

>

On Saturday, 6 July 2024 at 08:04:36 UTC, ryuukk_ wrote:

>

The pushback is from people who never got to learn new languages, they believe this is what everyone wants to write

They do not believe that.

The proposal is fine except where it breaks existing code. I don't think editions should redefine syntax without a very good reason. There is a limited budget for breakage per edition. (An example of such a reason is breaking bug-prone code).

>

MySelfExplanatoryType flag = MySelfExplanatoryType.A | MySelfExplanatoryType.B | MySelfExplanatoryType.C;

You could write:

auto flag = { with (MySelfExplanatoryType) return A | B | C; }();

No, this syntax is just bad: it's unreadable and over-engineered.
