Meaningful identifiers and other multi-token keywords
September 24

D has 4 places in the grammar where meaningful identifiers are used instead of keywords:

  • Pragmas
  • Traits
  • Linkages
  • Scope guards

For pragmas and traits, this is a total non-issue as they have special and dedicated keywords. For linkages and scope guards, there will be rough edges if we make (Type) be a well-formed BasicType. The reason is that extern(C) could mean extern plus the basic type (C), where C denotes e.g. a dummy class; or scope (exit) x = 10; could be written with the intention not to assign x, but to declare x as a scope variable of type exit. In general, you could ask “Why would one write such code?” and you’d be correct.

The issue is with the argument to extern and scope being identifiers. For linkage, it’s implementation defined which ones are supported, and they’re not just identifiers (e.g. C++ and Objective-C). With scope guards, however, there are only exit, success, and failure.

I want to suggest moving the parsing of scope guards and linkages to the lexer, i.e., if the lexer sees scope, (, any one of the identifiers exit, success, or failure, and ), that is a scope guard and is treated as a single token.

The same with extern(C) – it will never be seen as anything but a linkage. It’s a multi-token keyword.
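
To make the idea concrete, here is a minimal sketch in D. It is not taken from DMD or any real lexer: it assumes a hypothetical pass over an array of token spellings, and the names `mergeScopeGuards` and `isGuardName` are made up for illustration.

```d
/// Hypothetical helper: true for the three scope guard identifiers.
bool isGuardName(string s)
{
    switch (s)
    {
        case "exit", "success", "failure":
            return true;
        default:
            return false;
    }
}

/// Hypothetical pass: fuses `scope ( exit|success|failure )` into one token.
/// Any other use of `scope` is passed through unchanged.
string[] mergeScopeGuards(string[] tokens)
{
    string[] result;
    for (size_t i = 0; i < tokens.length; ++i)
    {
        if (i + 3 < tokens.length
            && tokens[i] == "scope"
            && tokens[i + 1] == "("
            && isGuardName(tokens[i + 2])
            && tokens[i + 3] == ")")
        {
            result ~= "scope(" ~ tokens[i + 2] ~ ")";
            i += 3; // skip the three tokens that were just fused
        }
        else
        {
            result ~= tokens[i];
        }
    }
    return result;
}

unittest
{
    // `scope (exit)` becomes a single token ...
    assert(mergeScopeGuards(["scope", "(", "exit", ")", "x", "=", "10", ";"])
        == ["scope(exit)", "x", "=", "10", ";"]);
    // ... while `scope (T)` stays as four tokens for the parser.
    assert(mergeScopeGuards(["scope", "(", "T", ")", "x", ";"])
        == ["scope", "(", "T", ")", "x", ";"]);
}
```

Because this sketch works on tokens rather than characters, whitespace between the parts plays no role in it. A character-level implementation inside the lexer would have to decide how to treat it.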

Possibly, we can handle other cases alike, e.g. static assert, static foreach, and auto ref. By all accounts, their meaning isn’t derived from composing the semantics of the parts.

What do you think?

September 24

On Tuesday, 24 September 2024 at 20:37:36 UTC, Quirin Schroll wrote:

> I want to suggest moving the parsing of scope guards and linkages to the lexer, i.e., if the lexer sees scope, (, any one of the identifiers exit, success, or failure, and ), that is a scope guard and is treated as a single token.
>
> The same with extern(C) – it will never be seen as anything but a linkage. It’s a multi-token keyword.
>
> Possibly, we can handle other cases alike, e.g. static assert, static foreach, and auto ref. By all accounts, their meaning isn’t derived from composing the semantics of the parts.
>
> What do you think?

I think this is a good idea. They are multi-token keywords only to avoid reserving more words as keywords, but in fact they could be treated as single entities.

September 25
On 25/09/2024 8:37 AM, Quirin Schroll wrote:
> The same with extern(C) – it will never be seen as anything but a linkage. It’s a multi-token keyword.

I'm not sure this one is a good idea.

Not all linkages can be handled this way, e.g. C++ linkage can carry a namespace.

So it is moving one behavior that has no special casing, into another place that would require special casing and will slow things down.

Overall I'm convinced that, given how the lexer works, this isn't a path we should be going down. It's done the way it is for a reason.

I would expect any changes down this path to slow down all identifiers for very little value.
September 25

On Tuesday, 24 September 2024 at 20:37:36 UTC, Quirin Schroll wrote:

> I want to suggest moving the parsing of scope guards and linkages to the lexer, i.e., if the lexer sees scope, (, any one of the identifiers exit, success, or failure, and ), that is a scope guard and is treated as a single token.
>
> The same with extern(C) – it will never be seen as anything but a linkage. It’s a multi-token keyword.

I don't think the lexer would be the right place, because the constructs are still multiple tokens. For example, whitespace and comments are allowed in extern ( C ++ /*comment*/ ).
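
A small illustration of that point, as a sketch rather than a statement about any particular compiler version: the function names `twice` and `guarded` are made up, and the code assumes the parts of both constructs are lexed as separate tokens, as described above.

```d
// Whitespace and a comment between the parts of the linkage attribute.
extern ( C ++ /* comment */ ) int twice(int x)
{
    return 2 * x;
}

int guarded()
{
    int n = 1;
    {
        // The same spacing freedom applies to a scope guard.
        scope /* comment */ ( exit ) n += 1;
        n *= 10; // runs first; the guard fires when this block ends
    }
    return n;
}

unittest
{
    assert(twice(21) == 42);
    assert(guarded() == 11);
}
```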

Unknown languages in extern(...) attributes should also produce errors, so future compilers can add them without breaking code. Consider this example:

extern(X) x = 0;

Currently X is a normal identifier, but in the future it could be another language supported by the compiler. If (X) is interpreted as a type, then adding extern(X) to the compiler would be a breaking change. For forward compatibility it would be best if extern(...) and scope(...) are always parsed as whole attributes and not attributes with types in parens. Unknown languages or scope guard identifiers would then produce errors, so future compilers could add them without breaking code.

September 25

On Tuesday, 24 September 2024 at 20:37:36 UTC, Quirin Schroll wrote:

> D has 4 places in the grammar where meaningful identifiers are used instead of keywords:
>
>   • Pragmas
>   • Traits
>   • Linkages
>   • Scope guards
>
> For pragmas and traits, this is a total non-issue as they have special and dedicated keywords. For linkages and scope guards, there will be rough edges if we make (Type) be a well-formed BasicType. The reason is that extern(C) could mean extern plus the basic type (C), where C denotes e.g. a dummy class; or scope (exit) x = 10; could be written with the intention not to assign x, but to declare x as a scope variable of type exit. In general, you could ask “Why would one write such code?” and you’d be correct.
>
> The issue is with the argument to extern and scope being identifiers. For linkage, it’s implementation defined which ones are supported, and they’re not just identifiers (e.g. C++ and Objective-C). With scope guards, however, there are only exit, success, and failure.
>
> I want to suggest moving the parsing of scope guards and linkages to the lexer, i.e., if the lexer sees scope, (, any one of the identifiers exit, success, or failure, and ), that is a scope guard and is treated as a single token.
>
> The same with extern(C) – it will never be seen as anything but a linkage. It’s a multi-token keyword.
>
> Possibly, we can handle other cases alike, e.g. static assert, static foreach, and auto ref. By all accounts, their meaning isn’t derived from composing the semantics of the parts.
>
> What do you think?

I love scope guards and use them all the time. However, they are both painful to type and make code ugly to read.

Perhaps scope(exit) and scope(failure) should be renamed to defer and errDefer.

Solves your problem, and mine
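
For readers less familiar with the three guards, here is a short sketch of what they do today. The function name `trace` and the event strings are made up for illustration. Guards run in reverse order of declaration when their scope ends.

```d
/// Records the order in which the guards fire.
string[] trace(bool fail)
{
    string[] events;
    try
    {
        scope (exit) events ~= "exit";       // always runs
        scope (success) events ~= "success"; // only on normal exit
        scope (failure) events ~= "failure"; // only when an exception escapes
        events ~= "body";
        if (fail)
            throw new Exception("boom");
    }
    catch (Exception e)
    {
        events ~= "caught";
    }
    return events;
}

unittest
{
    assert(trace(false) == ["body", "success", "exit"]);
    assert(trace(true) == ["body", "failure", "exit", "caught"]);
}
```

Under the suggested renaming, scope (exit) would correspond to defer and scope (failure) to errDefer; the sketch leaves open what scope (success) would become.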

September 26

On Wednesday, 25 September 2024 at 15:50:20 UTC, Tim wrote:

> On Tuesday, 24 September 2024 at 20:37:36 UTC, Quirin Schroll wrote:
>
> > I want to suggest moving the parsing of scope guards and linkages to the lexer, i.e., if the lexer sees scope, (, any one of the identifiers exit, success, or failure, and ), that is a scope guard and is treated as a single token.
> >
> > The same with extern(C) – it will never be seen as anything but a linkage. It’s a multi-token keyword.
>
> I don't think the lexer would be the right place, because the constructs are still multiple tokens. For example, whitespace and comments are allowed in extern ( C ++ /*comment*/ ).

The whitespace is not an issue. The comments might be. But even if they were, one option would be to just ban comments in linkage attributes and scope guards and not deal with the problem. I mean, who would do that, except for a QA tester?

> Unknown languages in extern(...) attributes should also produce errors, so future compilers can add them without breaking code.

Officially, it’s implementation defined what’s supported beyond D and C, see here. The fact that DMD supports C++, Objective-C, System, and Windows is already an extension.

Considering C++ namespaces, the syntax is quite flexible. Essentially, any token soup with balanced parentheses is allowed. Maybe C++ was right, there it’s extern "C".
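
As a point of reference, here is a sketch of two namespace forms that, to my knowledge, the D spec documents for extern(C++): an identifier form and a string form. The names `demo`, `other`, `answer`, and `another` are made up for illustration, and the functions have D bodies only so that the example is self-contained.

```d
// Identifier form: also introduces a D scope named `demo`.
extern (C++, demo) int answer()
{
    return 42;
}

// String form: only affects the C++ name, no D scope is introduced.
extern (C++, "other") int another()
{
    return 7;
}

unittest
{
    assert(demo.answer() == 42);
    assert(another() == 7);
}
```

Neither form is a plain identifier after extern, which is why a lexer-level rule for linkages would need more than a fixed list of names.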

> Consider this example:
>
> extern(X) x = 0;
>
> Currently X is a normal identifier, but in the future it could be another language supported by the compiler. If (X) is interpreted as a type, then adding extern(X) to the compiler would be a breaking change. For forward compatibility it would be best if extern(...) and scope(...) are always parsed as whole attributes and not attributes with types in parens. Unknown languages or scope guard identifiers would then produce errors, so future compilers could add them without breaking code.

With Primary Type Syntax, extern (Type) can happen by accident, yes.

Then, Type could happen to be a valid linkage, but even in that case, there’s a high likelihood that there’s a parse error down the line. (In fact, it might be guaranteed; I couldn’t find a way for it not to be.) That is because linkage attributes are not storage classes. Unlike static, ref, etc., extern(C) cannot be used instead of auto.

// Current behavior:
alias C = int;

extern (C) x = 0; // Error: basic type expected
extern (C) auto x = 0; // Good, and `C` can’t be the type of `x`

static extern (C) x = 0; // Error: basic type expected
extern (C) static x = 0; // Good, and `C` can’t be the type of `x`, even if it denotes a type

alias Type = int;

extern (Type) x = 0; // Error: Type is not a linkage
extern (Type) auto x = 0; // Error: Type is not a linkage

static extern (Type) x = 0; // Error: Type is not a linkage
extern (Type) static x = 0; // Error: Type is not a linkage

// My implementation:
alias C = int;

extern (C) x = 0; // Error: basic type expected
extern (C) auto x = 0; // Good, and `C` can’t be the type of `x`

static extern (C) x = 0; // Error: basic type expected
extern (C) static x = 0; // Good, but `C` can’t be the type of `x`, even if it denotes a type

alias Type = int;

extern (Type) x = 0; // Error: `Type` is not a linkage
extern (Type) auto x = 0; // Error: `Type` is not a linkage

static extern (Type) x = 0; // Error: `Type` is not a linkage
extern (Type) static x = 0; // Error: `Type` is not a linkage

From what I’ve understood in the attribute spec, extern marks a symbol as a declaration, whereas without it, the symbol would be a definition. It comes with or implies export, or at least implies static. So, if needed, one can just put that (or any nothingburger) between extern and (Type) and be good.

// My implementation
extern export (Type) x; // ok
extern static (Type) y; // ok
extern @0 (Type) z; // ok
September 27
On 9/24/2024 1:37 PM, Quirin Schroll wrote:
> What do you think?

It isn't clear what problem is being solved by this.
September 27

On Tuesday, 24 September 2024 at 20:37:36 UTC, Quirin Schroll wrote:

> D has 4 places in the grammar where meaningful identifiers are used instead of keywords:
>
>   • Pragmas
>   • Traits
>   • Linkages
>   • Scope guards
>
> [...]

Just a question, what would that mean for backwards compatibility and potential loss of flexibility?