Meaningful identifiers and other multi-token keywords

September 24

Posted by Quirin Schroll

D has four places in its grammar where meaningful identifiers are used instead of keywords:

Pragmas
Traits
Linkages
Scope guards

For pragmas and traits, this is a total non-issue, as each has a special, dedicated keyword. For linkages and scope guards, there will be rough edges if we make (Type) a well-formed BasicType. The reason is that extern(C) could mean extern plus the basic type (C), where C denotes e.g. a dummy class; likewise, scope (exit) x = 10; could be written with the intention not to assign x, but to declare x as a scope variable of type exit. In general, you could ask, “Why would one write such code?”, and you’d be right to.

The issue is that the arguments to extern(...) and scope(...) are identifiers. For linkages, which ones are supported is implementation-defined, and they’re not all plain identifiers (e.g. C++ and Objective-C); for scope guards, however, there are only exit, success, and failure.
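
For readers less familiar with the construct, here is a minimal sketch of the three scope guards in use. The function name `run` and the logged strings are made up for illustration; the only behaviour relied on is that guards run in reverse order of declaration when the enclosing scope is left.

```d
// Hypothetical demo function: records the order in which things happen.
string[] run(bool fail)
{
    string[] log;
    try
    {
        scope (exit) log ~= "exit";       // runs however the scope is left
        scope (success) log ~= "success"; // runs only on normal exit
        scope (failure) log ~= "failure"; // runs only if an exception escapes
        log ~= "body";
        if (fail)
            throw new Exception("boom");
    }
    catch (Exception e)
    {
        log ~= "caught";
    }
    return log;
}

unittest
{
    assert(run(false) == ["body", "success", "exit"]);
    assert(run(true) == ["body", "failure", "exit", "caught"]);
}
```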

I want to suggest moving the parsing of scope guards and linkages to the lexer, i.e., if the lexer sees scope, (, any one of the identifiers exit, success, or failure, and ), that is a scope guard and is treated as a single token.

The same with extern(C) – it will never be seen as anything but a linkage. It’s a multi-token keyword.
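
As a rough illustration of the fusion step for scope guards, here is a toy sketch that works on a list of already-split token strings. It is not DMD's lexer; `fuseScopeGuards` and `isGuardName` are hypothetical names, and a real implementation would operate on the lexer's token type rather than on strings.

```d
// Hypothetical helper: true for the three scope guard identifiers.
bool isGuardName(string s)
{
    return s == "exit" || s == "success" || s == "failure";
}

// Hypothetical helper: replaces each run `scope` `(` guard `)` with one fused token.
string[] fuseScopeGuards(string[] tokens)
{
    string[] result;
    size_t i = 0;
    while (i < tokens.length)
    {
        // Four tokens are needed: scope, (, guard name, ).
        const isGuard = i + 3 < tokens.length
            && tokens[i] == "scope"
            && tokens[i + 1] == "("
            && isGuardName(tokens[i + 2])
            && tokens[i + 3] == ")";
        if (isGuard)
        {
            result ~= "scope(" ~ tokens[i + 2] ~ ")";
            i += 4;
        }
        else
        {
            result ~= tokens[i];
            i += 1;
        }
    }
    return result;
}

unittest
{
    // A scope guard becomes a single token ...
    assert(fuseScopeGuards(["scope", "(", "exit", ")", "x", "=", "10", ";"])
        == ["scope(exit)", "x", "=", "10", ";"]);
    // ... while scope followed by any other parenthesized name is left alone.
    assert(fuseScopeGuards(["scope", "(", "C", ")", "x", ";"])
        == ["scope", "(", "C", ")", "x", ";"]);
}
```

With this split, `scope (C) x;` would still reach the parser as separate tokens, so only the three guard names are taken away from the type grammar.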

Possibly, we can handle other cases alike, e.g. static assert, static foreach, and auto ref. By all accounts, their meaning isn’t derived from composing the semantics of the parts.

What do you think?

September 24

Re: Meaningful identifiers and other multi-token keywords

Posted by Dom DiSc
in reply to Quirin Schroll

On Tuesday, 24 September 2024 at 20:37:36 UTC, Quirin Schroll wrote:

> The same with extern(C) – it will never be seen as anything but a linkage. It’s a multi-token keyword.
>
> Possibly, we can handle other cases alike, e.g. static assert, static foreach, and auto ref. By all accounts, their meaning isn’t derived from composing the semantics of the parts.
>
> What do you think?

I think this is a good idea. They are spelled as multiple tokens only to avoid reserving more words as keywords, but they could in fact be treated as single entities.

September 25

Re: Meaningful identifiers and other multi-token keywords

Posted by Richard (Rikki) Andrew Cattermole
in reply to Quirin Schroll

On 25/09/2024 8:37 AM, Quirin Schroll wrote:
> The same with extern(C) – it will never be seen as anything but a linkage. It’s a multi-token keyword.

I'm not sure this one is a good idea.

Not all linkages can be handled this way; e.g., C++ linkage can carry a namespace.

So it is moving one behavior that has no special casing, into another place that would require special casing and will slow things down.

Overall, I'm convinced that, given how the lexer works, this isn't a path we should be going down. It's done the way it is for a reason.

I would expect any changes down this path to slow down the lexing of all identifiers for very little value.

September 25

Re: Meaningful identifiers and other multi-token keywords

Posted by Tim
in reply to Quirin Schroll

On Tuesday, 24 September 2024 at 20:37:36 UTC, Quirin Schroll wrote:

> The same with extern(C) – it will never be seen as anything but a linkage. It’s a multi-token keyword.

I don't think the lexer would be the right place, because the constructs are still multiple tokens. For example, whitespace and comments are allowed in extern ( C ++ /*comment*/ ).
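
A minimal sketch of that point. The function `answer` is made up for illustration, and the example assumes a D compiler with C++ linkage support, such as DMD:

```d
// Whitespace and a comment inside the linkage attribute: this is still
// parsed as the ordinary attribute extern(C++).
extern ( C ++ /*comment*/ ) int answer()
{
    return 42;
}

unittest
{
    assert(answer() == 42);
    // __traits(getLinkage, ...) reports the linkage as a string.
    static assert(__traits(getLinkage, answer) == "C++");
}
```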

Unknown languages in extern(...) attributes should also produce errors, so future compilers can add them without breaking code. Consider this example:

extern(X) x = 0;

Currently X is a normal identifier, but in the future it could be another language supported by the compiler. If (X) is interpreted as a type, then adding extern(X) to the compiler would be a breaking change. For forward compatibility, it would be best if extern(...) and scope(...) were always parsed as whole attributes and not as attributes followed by types in parentheses. Unknown languages or scope guard identifiers would then produce errors, so future compilers could add them without breaking code.

September 25

Re: Meaningful identifiers and other multi-token keywords

Posted by ryuukk_
in reply to Quirin Schroll

On Tuesday, 24 September 2024 at 20:37:36 UTC, Quirin Schroll wrote:

> D has four places in its grammar where meaningful identifiers are used instead of keywords:
>
> Pragmas
> Traits
> Linkages
> Scope guards
>
> The same with extern(C) – it will never be seen as anything but a linkage. It’s a multi-token keyword.
>
> Possibly, we can handle other cases alike, e.g. static assert, static foreach, and auto ref. By all accounts, their meaning isn’t derived from composing the semantics of the parts.
>
> What do you think?

I love scope guards and use them all the time; however, they are both painful to type and make code ugly to read.

Perhaps scope(exit) and scope(failure) should be renamed to defer and errDefer.

Solves your problem, and mine

September 26

Re: Meaningful identifiers and other multi-token keywords

Posted by Quirin Schroll
in reply to Tim

On Wednesday, 25 September 2024 at 15:50:20 UTC, Tim wrote:

> On Tuesday, 24 September 2024 at 20:37:36 UTC, Quirin Schroll wrote:
>
>> The same with extern(C) – it will never be seen as anything but a linkage. It’s a multi-token keyword.
>
> I don't think the lexer would be the right place, because the constructs are still multiple tokens. For example, whitespace and comments are allowed in extern ( C ++ /*comment*/ ).

Whitespace is not an issue. Comments might be. But even if they were, one option would be to just ban comments in linkage attributes and scope guards and not deal with the problem. I mean, who would do that, except for a QA tester?

> Unknown languages in extern(...) attributes should also produce errors, so future compilers can add them without breaking code.

Officially, it’s implementation defined what’s supported beyond D and C, see here. The fact that DMD supports C++, Objective-C, System, and Windows is already an extension.

Considering C++ namespaces, the syntax is quite flexible. Essentially, any token soup with balanced parentheses is allowed. Maybe C++ was right, there it’s extern "C".

> Consider this example:
>
> extern(X) x = 0;

With Primary Type Syntax, extern (Type) can happen by accident, yes.

Then, Type could happen to be a valid linkage, but even in that case, there’s a high likelihood that there’s a parse error down the line. (In fact, it might be guaranteed; I couldn’t find a case where it isn’t.) That is because linkage attributes are not storage classes. Unlike static, ref, etc., extern(C) cannot be used instead of auto.

// Current behavior:
alias C = int;

extern (C) x = 0; // Error: basic type expected
extern (C) auto x = 0; // Good, and `C` can’t be the type of `x`

static extern (C) x = 0; // Error: basic type expected
extern (C) static x = 0; // Good, and `C` can’t be the type of `x`, even if it denotes a type

alias Type = int;

extern (Type) x = 0; // Error: Type is not a linkage
extern (Type) auto x = 0; // Error: Type is not a linkage

static extern (Type) x = 0; // Error: Type is not a linkage
extern (Type) static x = 0; // Error: Type is not a linkage

// My implementation:
alias C = int;

extern (C) x = 0; // Error: basic type expected
extern (C) auto x = 0; // Good, and `C` can’t be the type of `x`

static extern (C) x = 0; // Error: basic type expected
extern (C) static x = 0; // Good, but `C` can’t be the type of `x`, even if it denotes a type

alias Type = int;

extern (Type) x = 0; // Error: `Type` is not a linkage
extern (Type) auto x = 0; // Error: `Type` is not a linkage

static extern (Type) x = 0; // Error: `Type` is not a linkage
extern (Type) static x = 0; // Error: `Type` is not a linkage

From what I’ve understood from the attribute spec, extern marks a symbol as a declaration, whereas without it, the symbol would be a definition. It comes with or implies export, or at least implies static. So, if needed, one can just put one of those (or any nothingburger) between extern and (Type) and be good.

// My implementation
extern export (Type) x; // ok
extern static (Type) y; // ok
extern @0 (Type) z; // ok

September 27

Re: Meaningful identifiers and other multi-token keywords

Posted by Walter Bright
in reply to Quirin Schroll

On 9/24/2024 1:37 PM, Quirin Schroll wrote:
> What do you think?

It isn't clear what problem is being solved by this.

September 27

Re: Meaningful identifiers and other multi-token keywords

Posted by Imperatorn
in reply to Quirin Schroll

On Tuesday, 24 September 2024 at 20:37:36 UTC, Quirin Schroll wrote:

> D has four places in its grammar where meaningful identifiers are used instead of keywords:
>
> Pragmas
> Traits
> Linkages
> Scope guards
>
> [...]

Just a question, what would that mean for backwards compatibility and potential loss of flexibility?
