September 06, 2006
The D grammar, as given in the spec, has a number of holes, typos, inconsistencies with the implementation and bits that simply ought to be different.  Many of these problems are in the statement grammar.  I hereby propose the following grammar to replace it, thereby fixing some of these problems and introducing one or two checks for coding absurdities in the process.

Statement:
        BasicStatement
        BlockStatement
        Declaration
        ScopeStatement

BasicStatement:
        LabelledStatement
        ExpressionStatement
        IfStatement
        WhileStatement
        DoStatement
        ForStatement
        ForeachStatement
        SwitchStatement
        CaseStatement
        DefaultStatement
        ContinueStatement
        BreakStatement
        ReturnStatement
        GotoStatement
        WithStatement
        SynchronizeStatement
        TryStatement
        ThrowStatement
        VolatileStatement
        AsmStatement
        PragmaStatement
        CCStatement

IfStatement:
        'if' '(' IfCondition ')' ControlledStatement
        'if' '(' IfCondition ')' ControlledStatement 'else' ControlledStatement

IfCondition:
        Expression
        'auto' Identifier '=' Expression
        Declarator '=' Expression

WhileStatement:
        'while' '(' Expression ')' ControlledStatement

DoStatement:
        'do' ControlledStatement 'while' '(' Expression ')' ';'

ForStatement:
        'for' '(' Initialize ';' Test ';' Increment ')' ControlledStatement

ForeachStatement:
        'foreach' '(' ForeachTypeList ';' Expression ')' ControlledStatement

WithStatement:
        'with' '(' Expression ')' ControlledStatement
        'with' '(' Symbol ')' ControlledStatement
        'with' '(' TemplateInstance ')' ControlledStatement

SynchronizeStatement:
        'synchronized' ControlledStatement
        'synchronized' '(' Expression ')' ControlledStatement

ScopeStatement:
        'scope' '(' 'exit' ')' ControlledStatement
        'scope' '(' 'success' ')' ControlledStatement
        'scope' '(' 'failure' ')' ControlledStatement

TryStatement:
        'try' ControlledStatement Catches
        'try' ControlledStatement Catches FinallyStatement
        'try' ControlledStatement FinallyStatement

Catches:
        LastCatch
        Catch
        Catch Catches

LastCatch:
        'catch' BlockStatement

Catch:
        'catch' '(' CatchParameter ')' ControlledStatement

FinallyStatement:
        'finally' ControlledStatement

PragmaStatement:
        Pragma ';'
        Pragma CCedStatement

ControlledStatement:
        BasicStatement
        BlockStatement

BlockStatement:
        '{' '}'
        '{' StatementList '}'

CCStatement:
        CompileCondition CCedStatement
        CompileCondition CCedStatement 'else' CCedStatement

CCedStatement:
        BasicStatement
        Declaration
        ScopeStatement
        CCedBlockStatement

CCedBlockStatement:
        '{' '}'
        '{' StatementList '}'

StatementList:
        Statement
        Statement StatementList


The idea:

- The syntax distinguishes between control statements and conditional compilation statements.  BlockStatement and CCedBlockStatement are syntactically the same, but differ semantically in that BlockStatement opens a new scope and CCedBlockStatement doesn't.

- The current spec of DeclarationStatement is far from complete.  We might as well get rid of it and just use Declaration (defined in declaration.html).  It's what DMD seems to do anyway.

- Because a Declaration isn't a valid form of ControlledStatement, the highly debated question of whether a control statement opens a new scope even without {...} becomes mostly irrelevant.  It doesn't make sense to allow a declaration here anyway.  This also eliminates the confusing ambiguity between a SynchronizeStatement and a nested function with the synchronized attribute.

I say _mostly_ irrelevant because there are cases such as this:

    if (qwert) static if (yuiop) {
        int asdfg;
        ...
    }
    writefln(asdfg);

The {...} applies to the static if and therefore doesn't introduce a scope.  However, it wouldn't make sense for the compile-time legality of the statement after the IfStatement to depend on the runtime value of qwert.  It follows that control statements must still be defined to create a scope even if they don't have {...} attached.
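
To make the BlockStatement/CCedBlockStatement distinction concrete, here is a minimal sketch.  The variable names are made up, and version (all) is used only as a condition that is always satisfied.  The commented-out lines show what I'd expect to be rejected, the last of them only under the grammar proposed here.

```d
import std.stdio;

void main()
{
    // CCedBlockStatement: the braces belong to the version condition,
    // so no scope is opened and `a` remains visible afterwards.
    version (all)
    {
        int a = 1;
    }
    writefln("%s", a);

    // BlockStatement: the braces open a scope, so `b` is local to it.
    {
        int b = 2;
        writefln("%s", b);
    }
    // writefln("%s", b);   // error: b is not in scope here

    // Under the proposed grammar a Declaration is not a
    // ControlledStatement, so the parser would reject this:
    // if (a) int c = 3;
}
```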


Notice also:

- ConditionalStatement and Condition (currently defined in version.html) have been renamed to the more descriptive CCStatement (conditional compilation statement) and CompileCondition.

- WithStatement and TryStatement are defined in the current spec to take only BlockStatement bodies, but the compiler accepts any statement here (except in LastCatch).  I've therefore made them use ControlledStatement.

- PragmaStatement, previously undefined, now has a definition.  Of course, a given pragma may or may not introduce a new scope.  Generally, if a particular pragma (recognised by the compiler) creates a scope, then the compiler should disallow a naked declaration as the body - this could be implemented either during parsing or during semantic analysis.

- ThenStatement and ElseStatement are unnecessary aliases and have been removed.

- I have used several nonterminals in the above BNF without defining them.  These will retain their current meanings, whether currently defined on statement.html or elsewhere.
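
As an illustration of the WithStatement and TryStatement point above, here is a sketch of bodies that are ControlledStatements but not BlockStatements.  Point and its fields are hypothetical, and I'm assuming a compiler that accepts these forms, as DMD is described as doing.

```d
import std.stdio;

// Hypothetical type, used only to give `with` something to open.
struct Point
{
    int x, y;
}

void main()
{
    Point p;

    // WithStatement whose body is an ExpressionStatement, not a block.
    with (p) x = 3;

    // TryStatement, Catch and FinallyStatement, none of them with braces.
    try
        throw new Exception("boom");
    catch (Exception e)
        writefln("caught: %s", e.msg);
    finally
        writefln("done");
}
```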


A few further points:

- I thought about building the solution to the dangling else problem directly into the BNF.  But with the number of cases to consider, I now realise it would probably be simpler to expect the parser to resolve the conflict via a disambiguation rule.

- The current spec can't make up its mind whether it's LabelledStatement or LabeledStatement.  Indeed, the spec is dotted with both American and British spellings.  Which is it supposed to be written in?

- The BNF grammars on module.html and class.html could also do with some cleaning up.  But I'll have a go at that another day (unless someone else gets there first).
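
For reference, this is the dangling else case mentioned in the first point.  It's only a sketch with made-up variables.  The BNF above permits either binding, and the usual disambiguation rule, which I'm assuming here, attaches the else to the nearest unmatched if.  Some compilers may warn about code written this way.

```d
import std.stdio;

void main()
{
    int a = 1, b = 0;

    // The grammar alone doesn't say which `if` owns the `else`.
    // With the nearest-if rule it belongs to `if (b)`.
    if (a)
        if (b)
            writefln("a and b");
        else
            writefln("a but not b");
}
```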
