Third and Hopefully Last Draft: Primary Type Syntax

September 21

Posted by Quirin Schroll

The obligatory permalink and latest draft

September 22

Re: Third and Hopefully Last Draft: Primary Type Syntax

Posted by Richard (Rikki) Andrew Cattermole
in reply to Quirin Schroll

I recommend that you put it through Grammarly prior to Mike getting it, it'll lessen his workload.

I.e. ``excpetion``

Otherwise it is looking pretty good, and good job on doing the implementation!

September 21

Re: Third and Hopefully Last Draft: Primary Type Syntax

Posted by Quirin Schroll
in reply to Richard (Rikki) Andrew Cattermole

On Saturday, 21 September 2024 at 13:29:05 UTC, Richard (Rikki) Andrew Cattermole wrote:
> I recommend that you put it through Grammarly prior to Mike getting it, it'll lessen his workload.
>
> I.e. ``excpetion``
>
> Otherwise it is looking pretty good, and good job on doing the implementation!

I gave it to two people to proofread; one probably just didn't do it (he said it's good), and the other sent me a revised version, which did contain some style suggestions. It's not like I didn't try something.

I'll try Grammarly. Haven't used it in ages.

The implementation has some workarounds that I'd hope won't make it into the compiler. But as Walter pointed out in the Monthly Meeting, it's not obvious the grammar changes won't lead to weird parsings. Therefore, I hope the implementation can give people like you, Paul Backus, and Timon Gehr the opportunity to find holes or, hopefully, find none, which might be enough for Walter to dispel his concerns.

September 22

Re: Third and Hopefully Last Draft: Primary Type Syntax

Posted by Richard (Rikki) Andrew Cattermole
in reply to Quirin Schroll

On 22/09/2024 4:01 AM, Quirin Schroll wrote:
> On Saturday, 21 September 2024 at 13:29:05 UTC, Richard (Rikki) Andrew Cattermole wrote:
>> I recommend that you put it through Grammarly prior to Mike getting it, it'll lessen his workload.
>>
>> I.e. ``excpetion``
>>
>> Otherwise it is looking pretty good, and good job on doing the implementation!
> 
> I gave it to two people to proofread; one probably just didn't do it (he said it's good), and the other sent me a revised version, which did contain some style suggestions. It's not like I didn't try something.

Yeah, you did good; it's just that tools are guaranteed to catch stuff like this :)

> The implementation has some workarounds that I'd hope won't make it into the compiler. But as Walter pointed out in the Monthly Meeting, it's not obvious the grammar changes won't lead to weird parsings. Therefore, I hope the implementation can give people like you, Paul Backus, and Timon Gehr the opportunity to find holes or, hopefully, find none, which might be enough for Walter to dispel his concerns.

Grammar stuff like this isn't where I shine, as long as it passes buildkite I'm happy. The text shows you've done your research and put in the effort.

Ideally we'd throw a fuzzer at the parser to verify that it works as expected.

https://llvm.org/docs/LibFuzzer.html

https://johanengelen.github.io/ldc/2018/01/14/Fuzzing-with-LDC.html

September 22

Re: Third and Hopefully Last Draft: Primary Type Syntax

Posted by IchorDev
in reply to Quirin Schroll

On Saturday, 21 September 2024 at 01:01:22 UTC, Quirin Schroll wrote:
> The obligatory permalink and latest draft

Not sure what was wrong with the other two drafts, but this one seems equally great. This feature would represent a massive improvement to string mixin code generation, and general language cohesion. Much like how you’re always allowed to use trailing commas in comma-separated lists.

September 22

Re: Third and Hopefully Last Draft: Primary Type Syntax

Posted by Tim
in reply to Quirin Schroll

On Saturday, 21 September 2024 at 01:01:22 UTC, Quirin Schroll wrote:
> The obligatory permalink and latest draft

The grammar changes look good. I found some new ambiguities, but the implementation seems to always prefer the old meaning, so it should be no problem.

Attributes with optional parens

// deprecated (size_t) x1 = 1; // Syntax error
// align (size_t) x2 = 1; // Syntax error
// package (size_t) x3 = 1; // Syntax error
// extern (size_t) x4 = 1; // Syntax error
struct UDA{}
// @UDA (size_t) x5 = 1; // Syntax error

The attributes deprecated, align, package and extern as well as
UDAs can be followed with optional arguments in parens, like the
deprecation message. These parens are now ambiguous with a basic type in
parens.

The implementation seems to always try to parse the parens as arguments
for the attribute, so it remains backward compatible.

Maybe this could be confusing for the user, when a declaration uses a
type in parens and later an attribute is added.

Scope guards

alias exit = Object;
Object x1;
void main()
{
    scope (exit) x1 = new Object(); // Still a scope guard
    // scope (Object) x2 = new Object(); // Syntax error
    // scope (int) x3 = 3; // Syntax error
    @0 scope (exit) x4 = new Object(); // Declares variable with type exit
}

The first statement is a scope guard with the current grammar. With the
new grammar it could also be a variable declaration of type exit and
storage class scope. The implementation still parses it as a scope
guard, so it remains backward compatible.

The next line could also be a variable declaration, but it is still
parsed as a scope guard. DMD then prints an error, because Object
is not a valid scope identifier. The line with x3 is a syntax error
for the same reason.

The last statement is parsed as a variable declaration, because scope
guards can't have UDAs.

Function literals

auto test1 = function (float){return 0;};
// auto test2 = function (float)(int){return 0;}; // Syntax error

Function literals have an optional return type and optional parameters.
The type float for test1 could be a parameter or a return type in
parens. The implementation always parses the parens as parameters,
so it remains backward compatible.

The second function literal has both a return type and parameters, but
it results in a syntax error, because the parens are parsed as
parameters and no other parens are expected after that.
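
The backward-compatible reading of `test1` can be checked with a current D compiler, without the proposed grammar. This is only a minimal sketch: the helper name `callTest1` is made up for illustration, and it can be run with something like `dmd -unittest -main -run`.

```d
// Compiles with a current D compiler; no Primary Type Syntax required.
int callTest1()
{
    // `(float)` is parsed as the parameter list; the return type is inferred as int.
    auto test1 = function (float) { return 0; };

    // The literal converts to a pointer to a function taking one float.
    static assert(is(typeof(test1) : int function(float)));

    return test1(1.5f);
}

unittest
{
    assert(callTest1() == 0);
}
```

If `(float)` were read as a parenthesized return type instead, the `static assert` above would not hold, which is why the old meaning has to win here.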

Anonymous classes

void main()
{
    auto o1 = new class (Object) {};
}

The parens could be constructor arguments or a basic type in
AnonBaseClassList?. The implementation always tries to parse
constructor arguments, which should be fine.

September 23

Re: Third and Hopefully Last Draft: Primary Type Syntax

Posted by Quirin Schroll
in reply to Tim

On Sunday, 22 September 2024 at 10:58:55 UTC, Tim wrote:
> On Saturday, 21 September 2024 at 01:01:22 UTC, Quirin Schroll wrote:
>> The obligatory permalink and latest draft
>
> The grammar changes look good. I found some new ambiguities, but the implementation seems to always prefer the old meaning, so it should be no problem.

In general, ambiguities are resolved considering Maximum Munch: If the next token can be parsed as part of the entity that the grammar suggests, it will be; only if it can’t, the entity is closed or it’s an error.

> Attributes with optional parens
>
> // deprecated (size_t) x1 = 1; // Syntax error
> // align (size_t) x2 = 1; // Syntax error
> // package (size_t) x3 = 1; // Syntax error
> // extern (size_t) x4 = 1; // Syntax error
> struct UDA{}
> // @UDA (size_t) x5 = 1; // Syntax error

Those all fall under Maximum Munch: A parenthesis following any of these attributes constitutes their optional arguments. Attributes with optional arguments are greedy.

I wasn’t even aware of align without argument.

The biggest one is extern because it’s realistically used with the new parsing. If you have a class C, extern (C) is ambiguous – except for Maximum Munch.
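
As a small illustration of the `extern (C)` case, the following compiles with a current D compiler; per the paragraph above, the draft keeps this reading. The names `C` and `counter` are only illustrative.

```d
// Valid in current D: a class named C does not change what `extern (C)` means.
class C {}

// The parentheses belong to `extern`, so this declares an int with C linkage,
// not a variable whose type is the class C.
extern (C) int counter;

static assert(is(typeof(counter) == int));
static assert(__traits(getLinkage, counter) == "C");
```

Both checks run at compile time, so the file only needs to compile (for example with `dmd -main`).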

> The implementation seems to always try to parse the parens as arguments
> for the attribute, so it remains backward compatible.

Yes, and it follows MM, which is generally something programmers can rely on.

What can be done about those? For one:

attribute
{
    declaration;
}

Always works at declaration scope, but for statement scope, that’s not possible. Here, I thought one could use an empty UDA list @(), but those are expressly illegal, so one has to resort to using a dummy UDA like @(""). Not nice, but if you insist on expressing something at statement scope in one swath, I guess we can ask the programmer for some concessions.

> Maybe this could be confusing for the user, when a declaration uses a type in parens and later an attribute is added.

There’s unfortunately little that can be done about it. A better implementation can possibly backtrack and re-interpret what used to be an attribute’s argument as a basic type, but to be honest, that is a lot of work.

> Scope guards
>
> alias exit = Object;
> Object x1;
> void main()
> {
>     scope (exit) x1 = new Object(); // Still a scope guard
>     // scope (Object) x2 = new Object(); // Syntax error
>     // scope (int) x3 = 3; // Syntax error
>     @0 scope (exit) x4 = new Object(); // Declares variable with type exit
> }

The big issue with these is, basically, that IMO this must work:

scope (ref void function())* fpp = null;

And it doesn’t.

IIRC, I ran into this and implemented a look-ahead to handle scope guards correctly. Scope guards use magic identifiers, and unlike __traits or pragma, scope can also appear without arguments.

I just fixed that because it was fairly easy to do so. My implementation now looks ahead to see if it’s scope(exit/success/failure) and if it’s not, it tries to parse it as scope attribute.
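
The look-ahead described above might be sketched roughly as follows. This is a hypothetical token-level model, not the code from the implementation: `ScopeKind`, `classifyScope`, and the use of plain strings as tokens are assumptions made for illustration.

```d
// Hypothetical sketch; the real parser works on its own token type, not strings.
enum ScopeKind { guard, attribute }

ScopeKind classifyScope(const string[] tokens)
{
    // Assumes tokens[0] == "scope".
    // Only the exact shape `scope ( exit|success|failure )` is a scope guard.
    if (tokens.length >= 4 && tokens[1] == "(" && tokens[3] == ")")
    {
        switch (tokens[2])
        {
            case "exit", "success", "failure":
                return ScopeKind.guard;
            default:
                break;
        }
    }
    // Everything else is treated as the `scope` attribute.
    return ScopeKind.attribute;
}

unittest
{
    assert(classifyScope(["scope", "(", "exit", ")", "x1"]) == ScopeKind.guard);
    assert(classifyScope(["scope", "(", "Object", ")", "x2"]) == ScopeKind.attribute);
    assert(classifyScope(["scope", "(", "ref", "void", "function", "(", ")", ")", "*", "fpp"])
        == ScopeKind.attribute);
}
```

Under this model, `scope (exit) x1 = ...` stays a scope guard even when `exit` is an alias for a type, which matches the backward-compatible behavior discussed in this thread.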

> The last statement is parsed as a variable declaration, because scope
> guards can't have UDAs.

This is interesting. It’s unlikely that something like that is going to be a real-world problem, though, as it requires two unlikely things: Someone naming a type exit and putting parentheses around it and using a UDA on statement scope.

My fix from above doesn’t change that, but again, it’s really unlikely to be in code anyways.

> Function literals
>
> auto test1 = function (float){return 0;};
> // auto test2 = function (float)(int){return 0;}; // Syntax error

Yes, for backwards compatibility, it must be done that way. However, this is a MM violation and must be mentioned in the DIP.

> The second function literal has both a return type and parameters, but
> it results in a syntax error, because the parens are parsed as
> parameters and no other parens are expected after that.

The second one should be allowed; otherwise some things aren’t expressible. This should work because there’s no valid reason why it can’t:

auto fp = function (ref int function()) () => null;

However, this currently works and must keep behavior:

auto fp = function (ref int function()) => null;
static assert(is(typeof(fp) : typeof(null) function(ref int function())));

The implementation will do a look-ahead to figure out if it’s seeing (Params) FunctionLiteralBody or (Type)(params) FunctionLiteralBody.

It might be noteworthy that this is not a MM violation. There is no other way to parse (Type)(Parameters) FunctionLiteralBody.
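
One way to picture that look-ahead is to skip the first parenthesis group and check whether a second one follows directly. The sketch below is a hypothetical model over string tokens; `LiteralForm`, `skipParens`, and `classifyLiteral` are invented names, and the actual implementation may work differently.

```d
// Hypothetical sketch; not taken from the implementation.
enum LiteralForm { paramsOnly, typeThenParams }

// Returns the index just past the parenthesis group that opens at tokens[open].
// Assumes tokens[open] == "(" and balanced parentheses.
size_t skipParens(const string[] tokens, size_t open)
{
    size_t depth = 0;
    foreach (i; open .. tokens.length)
    {
        if (tokens[i] == "(")
            ++depth;
        else if (tokens[i] == ")" && --depth == 0)
            return i + 1;
    }
    return tokens.length; // unbalanced input: nothing follows
}

LiteralForm classifyLiteral(const string[] tokens)
{
    // Assumes tokens[0] is "function" or "delegate" and tokens[1] == "(".
    const after = skipParens(tokens, 1);
    // A second group right after the first means (Type)(Params) Body.
    if (after < tokens.length && tokens[after] == "(")
        return LiteralForm.typeThenParams;
    return LiteralForm.paramsOnly;
}

unittest
{
    // function (ref int function()) () => null
    assert(classifyLiteral(["function", "(", "ref", "int", "function", "(", ")", ")",
        "(", ")", "=>", "null"]) == LiteralForm.typeThenParams);
    // function (ref int function()) => null
    assert(classifyLiteral(["function", "(", "ref", "int", "function", "(", ")", ")",
        "=>", "null"]) == LiteralForm.paramsOnly);
}
```

Because a function literal body cannot start with `(`, a second parenthesis group can only be a parameter list, which is the reason given above for this not being a Maximum Munch violation.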

> Anonymous classes
>
> void main()
> {
>     auto o1 = new class (Object) {};
> }
>
> The parens could be constructor arguments or a basic type in
> AnonBaseClassList?. The implementation always tries to parse
> constructor arguments, which should be fine.

I'm going to look into this. Probably this is low-priority because a base class or interface name following new class never requires parens. But it should not be an error either. Probably I’ll do the same as with function literals: Look ahead and see if there’s another set of parens. If yes, it’s new class (Type)(Arguments) {}. If not, it’s new class /*implicit Object*/(Arguments) {} because of backwards compatibility.

I’ll commit my stuff probably tomorrow. I can’t do it now, unfortunately.

September 24

Re: Third and Hopefully Last Draft: Primary Type Syntax

Posted by Quirin Schroll
in reply to Quirin Schroll

On Monday, 23 September 2024 at 19:03:47 UTC, Quirin Schroll wrote:
> I’ll commit my stuff probably tomorrow. I can’t do it now, unfortunately.

Done. And I updated the DIP draft to include the new Maximum Munch exceptions.

I did everything as suggested in my post, except for the anonymous class stuff. There, I was mistaken. The constructor arguments go first, then the base class / interfaces follow:

new class ConstructorArgs? AnonBaseClassList? AggregateBody

This means there is no real issue. If someone writes new class (Object), that’s a compile error today (if Object refers to a type, which it usually does) as parsing takes (Object) as the argument list, and it will stay one. Someone who wants to surround the first base class / interface with parentheses has to use an explicit empty argument list, e.g. new class () (Object) {}.
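
For reference, the existing ordering (constructor arguments first, then the base class list) looks like this in current D. It is a minimal sketch with made-up names (`Base`, `anonymousValue`), runnable with something like `dmd -unittest -main -run`.

```d
class Base
{
    int value;
    this(int value) { this.value = value; }
}

int anonymousValue()
{
    // `(42)` is the constructor argument list; `Base` is the base class list.
    auto o = new class (42) Base
    {
        this(int v) { super(v); }
    };
    return o.value;
}

unittest
{
    assert(anonymousValue() == 42);
}
```

The first parenthesis group after `new class` is always the argument list here, which is why a parenthesized base class needs the explicit empty `()` in front of it.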

Please review the latest draft here.

February 14

Re: Third and Hopefully Last Draft: Primary Type Syntax