Thread overview
[Bug?] Current D Grammar not context free?
Mar 26, 2004
Manfred Nowak
Mar 26, 2004
Andy Friesen
Mar 26, 2004
Stewart Gordon
Apr 18, 2004
Walter
March 26, 2004
In the current spec there are the following grammar rules for expressions:

	UnaryExpression:
		PostfixExpression
		( Type ) UnaryExpression
		( Type ) . Identifier
	PostfixExpression:
		PrimaryExpression
	PrimaryExpression:
		.Identifier

As one can see, for the code "(int).hello" there are two different derivations from the grammar to the corresponding token sequence: one directly from "UnaryExpression" and one that goes through "PostfixExpression" and "PrimaryExpression".

To me, this suggests that the current grammar is not context free.

In any case, I do not understand what the semantic difference between these two derivations is.
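
The two derivations can also be counted mechanically. What follows is only a sketch, not anything taken from the spec or the compiler: the rule "Type: int" is an assumption added so that the example is complete, the token "hello" is written as its category "Identifier", and the names GRAMMAR and count_derivations are made up for illustration.

```python
from functools import lru_cache

# The rules quoted above, plus one assumed rule for Type (not in the quote).
GRAMMAR = {
    "UnaryExpression": [
        ["PostfixExpression"],
        ["(", "Type", ")", "UnaryExpression"],
        ["(", "Type", ")", ".", "Identifier"],
    ],
    "PostfixExpression": [["PrimaryExpression"]],
    "PrimaryExpression": [[".", "Identifier"]],
    "Type": [["int"]],  # assumption, only to make the example complete
}


def count_derivations(tokens, start="UnaryExpression"):
    """Count the distinct parse trees that derive `tokens` from `start`.

    Works for grammars without empty productions and without left
    recursion, which holds for the fragment above.
    """
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count_symbol(symbol, i, j):
        if symbol not in GRAMMAR:  # terminal: must match exactly one token
            return 1 if j == i + 1 and tokens[i] == symbol else 0
        return sum(count_sequence(tuple(rhs), i, j) for rhs in GRAMMAR[symbol])

    @lru_cache(maxsize=None)
    def count_sequence(rhs, i, j):
        if not rhs:
            return 1 if i == j else 0
        first, rest = rhs[0], rhs[1:]
        total = 0
        # Every symbol covers at least one token, so leave room for the rest.
        for k in range(i + 1, j - len(rest) + 1):
            left = count_symbol(first, i, k)
            if left:
                total += left * count_sequence(rest, k, j)
        return total

    return count_symbol(start, 0, len(tokens))


print(count_derivations(["(", "int", ")", ".", "Identifier"]))  # prints 2
```

A count greater than one for a single token sequence means that this grammar fragment is ambiguous for that input.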

So long!

March 26, 2004
Manfred Nowak wrote:
> In the current spec there are the following grammar rules for expressions:
> 
> 	UnaryExpression:
> 		PostfixExpression
> 		( Type ) UnaryExpression
> 		( Type ) . Identifier
> 	PostfixExpression:
> 		PrimaryExpression
> 	PrimaryExpression:
> 		.Identifier
> 
> As one can see, for the code "(int).hello" there are two different
> derivations from the grammar to the corresponding token sequence: one
> directly from "UnaryExpression" and one that goes through
> "PostfixExpression" and "PrimaryExpression".
> 
> To me, this suggests that the current grammar is not context free.
> 
> In any case, I do not understand what the semantic difference between
> these two derivations is.

It looks like the difference is between casting a global-scope symbol and accessing a type property, i.e. (wchar[]).toString(i) vs. (int).sizeof

Dumping C-style cast syntax would solve the problem.

 -- andy
March 26, 2004
Andy Friesen wrote:

> Manfred Nowak wrote:
> 
>> In the current spec there are the following grammar rules for expressions:
>>
>>     UnaryExpression:
>>         PostfixExpression
>>         ( Type ) UnaryExpression
>>         ( Type ) . Identifier
>>     PostfixExpression:
>>         PrimaryExpression
>>     PrimaryExpression:
>>         .Identifier
<snip>

The way that's written, there are quite a few ambiguities.  Like, is

	(Qwert) - 3

a cast or a binary subtraction?

But if you look down at the Cast Expressions subsection of that page, you'll see that it can't make up its mind whether the syntax is

	UnaryExpression ::= ( Type ) UnaryExpression

or

	UnaryExpression ::= cast ( Type ) UnaryExpression

> Dumping C-style cast syntax would solve the problem.

That would solve only that little bit of it.  Even then, here's another ambiguity:

	PostfixExpression	::= PostfixExpression . Identifier
	PrimaryExpression	::= Type . Identifier

Given

	Identifier . Identifier

two possible parse trees:

	PostfixExpression
		PrimaryExpression
			Type
				Identifier
			.
			Identifier

	PostfixExpression
		PostfixExpression
			PrimaryExpression
				Identifier
		.
		Identifier
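
To check that both trees really come out of the rules and spell the same tokens, here is a small sketch. It rests on assumptions: "PostfixExpression ::= PrimaryExpression" is taken from the quoted spec fragment, while "PrimaryExpression ::= Identifier" and "Type ::= Identifier" are only implied by the trees above. The names RULES, is_valid and leaves are made up for illustration.

```python
# Productions used by the two trees. The rules marked "assumed" are implied
# by the trees but not written out in the post.
RULES = {
    "PostfixExpression": [
        ("PrimaryExpression",),
        ("PostfixExpression", ".", "Identifier"),
    ],
    "PrimaryExpression": [
        ("Identifier",),             # assumed
        ("Type", ".", "Identifier"),
    ],
    "Type": [("Identifier",)],       # assumed
}

# A tree is either a terminal (str) or a pair (nonterminal, list of children).
tree_a = ("PostfixExpression", [
    ("PrimaryExpression", [("Type", ["Identifier"]), ".", "Identifier"]),
])
tree_b = ("PostfixExpression", [
    ("PostfixExpression", [("PrimaryExpression", ["Identifier"])]),
    ".",
    "Identifier",
])


def label(node):
    """Return the grammar symbol at the root of a node."""
    return node if isinstance(node, str) else node[0]


def is_valid(node):
    """True if every inner node of the tree applies one of RULES."""
    if isinstance(node, str):
        return node not in RULES  # a leaf has to be a terminal
    name, children = node
    shape = tuple(label(child) for child in children)
    return shape in RULES.get(name, []) and all(is_valid(c) for c in children)


def leaves(node):
    """Return the token sequence a tree derives, left to right."""
    if isinstance(node, str):
        return [node]
    return [tok for child in node[1] for tok in leaves(child)]


print(is_valid(tree_a), is_valid(tree_b))  # prints True True
print(leaves(tree_a) == leaves(tree_b))    # prints True
```

Two different valid trees with identical leaves is exactly what makes the pair of rules ambiguous for this input.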

Stewart.

-- 
My e-mail is valid but not my primary mailbox, aside from its being the unfortunate victim of intensive mail-bombing at the moment.  Please keep replies on the 'group where everyone may benefit.
April 18, 2004
"Stewart Gordon" <smjg_1998@yahoo.com> wrote in message news:c41tnd$14d$1@digitaldaemon.com...
> Andy Friesen wrote:
>
> > Manfred Nowak wrote:
> >
> >> In the current spec there are the following grammar rules for expressions:
> >>
> >>     UnaryExpression:
> >>         PostfixExpression
> >>         ( Type ) UnaryExpression
> >>         ( Type ) . Identifier
> >>     PostfixExpression:
> >>         PrimaryExpression
> >>     PrimaryExpression:
> >>         .Identifier
> <snip>
>
> The way that's written, there are quite a few ambiguities.  Like, is
>
> (Qwert) - 3
>
> a cast or a binary subtraction?
>
> But if you look down at the Cast Expressions subsection of that page, you'll see that it can't make up its mind whether the syntax is
>
> UnaryExpression ::= ( Type ) UnaryExpression
>
> or
>
> UnaryExpression ::= cast ( Type ) UnaryExpression
>

The ambiguity is dealt with as follows: if the parentheses are pointless,
then it is treated as a (Type) rather than an (Expression). I.e., in order
for it to be a (Type), it must not be parseable as an expression. For example:
    (int)    => (Type)
    (T)    => (Expression)
    (T*)    => (Type)
Look on line 3746 of parse.c to see how it works.
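
As a rough illustration of the rule only (this is not the code in parse.c, whose logic is not reproduced here), the "expression first, type only as a fallback" order can be sketched like this. Everything in it is an assumption made for the example: the tiny BASIC_TYPES set, the toy notion of an expression as operands separated by binary operators, and the function names.

```python
# Toy model of the rule: try to read the tokens between the parentheses as
# an expression first, and call them a type only if that fails.
BASIC_TYPES = {"int", "char", "wchar", "float"}  # small illustrative subset
BINARY_OPERATORS = {"+", "-", "*", "/"}


def is_operand(tok):
    """An identifier that is not a basic type keyword, or an integer literal."""
    return (tok.isidentifier() and tok not in BASIC_TYPES) or tok.isdigit()


def parses_as_expression(tokens):
    """Toy check: operand (operator operand)*, nothing more."""
    expect_operand = True
    for tok in tokens:
        if expect_operand and not is_operand(tok):
            return False
        if not expect_operand and tok not in BINARY_OPERATORS:
            return False
        expect_operand = not expect_operand
    return bool(tokens) and not expect_operand  # has to end on an operand


def parses_as_type(tokens):
    """Toy check: a name followed by zero or more '*'."""
    return (bool(tokens) and tokens[0].isidentifier()
            and all(tok == "*" for tok in tokens[1:]))


def classify_parenthesized(tokens):
    """Classify the tokens found between '(' and ')'."""
    if parses_as_expression(tokens):
        return "(Expression)"
    if parses_as_type(tokens):
        return "(Type)"
    return "error"


for inner in (["int"], ["T"], ["T", "*"]):
    print(" ".join(inner), "=>", classify_parenthesized(inner))
```

With these toy definitions the three examples come out as in the table above: "int" is a keyword and so cannot be an operand, "T" alone is a perfectly good expression, and "T *" is an incomplete multiplication, so it can only be read as a type.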

> > Dumping C-style cast syntax would solve the problem.

Yes, that is correct. That's probably the right way to go.

> Only that little bit of it.  Even after it, here's another ambiguity:
>
> PostfixExpression ::= PostfixExpression . Identifier
> PrimaryExpression ::= Type . Identifier
>
> Given
>
> Identifier . Identifier
>
> two possible parse trees:
>
> 	PostfixExpression
> 		PrimaryExpression
> 			Type
> 				Identifier
> 			.
> 			Identifier
>
> 	PostfixExpression
> 		PostfixExpression
> 			PrimaryExpression
> 				Identifier
> 		.
> 		Identifier

That should be done a bit better; the 'Type' really should be BasicType.