View mode: basic / threaded / horizontal-split · Log in · Help
February 11, 2011
Context-Free Grammars? What about arrays?
Hi,

I think I'm having a little trouble understanding what's meant by context-free
grammar. I've read that D is context-free, but is it really? What about an
expression like:

int[U] s;

? You can't tell -- without looking at the context -- whether U is a data type
or a number, and so because associative arrays and regular arrays are
syntactically different elements of the language, the syntax of D is tied in
with its semantics, just like in C++.

So is D really context-free? Or am I misunderstanding the meaning of the term?

Thanks!
February 11, 2011
Re: Context-Free Grammars? What about arrays?
On Friday 11 February 2011 01:03:21 %u wrote:
> Hi,
> 
> I think I'm having a little trouble understanding what's meant by
> context-free grammar. I've read that D is context-free, but is it really?
> What about an expression like:
> 
> int[U] s;
> 
> ? You can't tell -- without looking at the context -- whether U is a data
> type or a number, and so because associative arrays and regular arrays are
> syntactically different elements of the language, the syntax of D is tied
> in with its semantics, just like in C++.
> 
> So is D really context-free? Or am I misunderstanding the meaning of the
> term?

Yes. You're misunderstanding. It's actually quite difficult to parse a language if 
its grammar isn't context-free. You really should read up in context-free 
grammars if you want to understand it properly.

Context-free means that it's unambiguous as to which grammar rule is to be 
applied at any point during the parsing process. In this case, U is a symbol. 
The parser probably doesn't care about much more than that. That symbol could 
end up being a number or a type. And actually, it probably isn't even that hard. 
U is a class, struct, alias, or template argument. The template argument and 
alias would be substituted with a real type or value before the compiler had to 
determine what type int[U] was.

Not to mention, the context-free grammar is entirely for parsing tokens into an 
abstract syntax tree. The compiler then has to do whatever it does with the AST 
to actually generate assembly from it. So, as long as the parser can parse it 
into the AST, it could be the next stage's problem to determine what that really 
means.

Regardless, I suggest that you read a compiler book if you really want to 
understand context-free grammars. I have "Compiler Construction: Principles and 
Practices" by Kenneth C. Louden, which is quite good, but the most popular one 
is the so-called "dragon" book. I'm not sure what it's actual title is though.

- Jonathan M Davis
February 11, 2011
Re: Context-Free Grammars? What about arrays?
On 02/11/2011 10:03 AM, %u wrote:
> Hi,
>
> I think I'm having a little trouble understanding what's meant by context-free
> grammar. I've read that D is context-free, but is it really? What about an
> expression like:
>
> int[U] s;
>
> ? You can't tell -- without looking at the context -- whether U is a data type
> or a number, and so because associative arrays and regular arrays are
> syntactically different elements of the language, the syntax of D is tied in
> with its semantics, just like in C++.
>
> So is D really context-free? Or am I misunderstanding the meaning of the term?

Have you tried it? int[U] cannot be a plain array because the size must be 
constant. But I agree there is some context in play for the compiler (or rather 
the linker?) to determine /that/:
--> some error if U undefined
--> some other error if defined, but neither uint nore type
--> some other error if uint but not constant
--> ass array if type
(I guess)

Denis
-- 
_________________
vita es estrany
spir.wikidot.com
February 11, 2011
Re: Context-Free Grammars? What about arrays?
On Friday 11 February 2011 03:22:54 spir wrote:
> On 02/11/2011 10:03 AM, %u wrote:
> > Hi,
> > 
> > I think I'm having a little trouble understanding what's meant by
> > context-free grammar. I've read that D is context-free, but is it
> > really? What about an expression like:
> > 
> > int[U] s;
> > 
> > ? You can't tell -- without looking at the context -- whether U is a data
> > type or a number, and so because associative arrays and regular arrays
> > are syntactically different elements of the language, the syntax of D is
> > tied in with its semantics, just like in C++.
> > 
> > So is D really context-free? Or am I misunderstanding the meaning of the
> > term?
> 
> Have you tried it? int[U] cannot be a plain array because the size must be
> constant. But I agree there is some context in play for the compiler (or
> rather the linker?) to determine /that/:
> --> some error if U undefined
> --> some other error if defined, but neither uint nore type
> --> some other error if uint but not constant
> --> ass array if type
> (I guess)

Actually, it _can_ be a plain, static array. U could be a template argument 
which is a number.

- Jonathan M Davis
February 11, 2011
Re: Context-Free Grammars? What about arrays?
On 2/11/11 6:03 AM, %u wrote:
> Hi,
>
> I think I'm having a little trouble understanding what's meant by context-free
> grammar. I've read that D is context-free, but is it really? What about an
> expression like:
>
> int[U] s;
>
> ? You can't tell -- without looking at the context -- whether U is a data type
> or a number, and so because associative arrays and regular arrays are
> syntactically different elements of the language, the syntax of D is tied in
> with its semantics, just like in C++.
>
> So is D really context-free? Or am I misunderstanding the meaning of the term?
>
> Thanks!

That will always parse to an associative array. Then in the semantic 
pass, if U is a constant expression that turns out to be an integer it 
is reinterpreted as a static array.

Take a look: 
https://github.com/D-Programming-Language/dmd/blob/master/src/mtype.c#L3924
February 11, 2011
Re: Context-Free Grammars? What about arrays?
> That will always parse to an associative array. Then in the semantic pass, if U
is a constant expression that turns out to be an integer it is reinterpreted as a
static array.

Ah, interesting. But that describes the implementation (i.e. how the compiler
copes with the ambiguity) and not the language itself.

> Context-free means that it's unambiguous as to which grammar rule is to be
applied at any point during the parsing process.

The problem is that it _is_ ambiguous what rule to apply. To me, just because
static arrays and associative arrays happen to have similar _looks_ doesn't make
parsing them context-free -- they're defined with completely separate syntaxes,
and they just happen to look identical. The fact that the compiler has to do a
semantic analysis in order to figure this out (which was originally a syntax
problem) really makes me wonder why you mention that D is still context-free.

> Not to mention, the context-free grammar is entirely for parsing tokens into an
abstract syntax tree.

Again, just because the AST's _happen_ to _look_ the same for static and
associative arrays, does that mean that this makes D context-free? The meaning of
the expression is still ambiguous, whether or not it fits well within the tree.

Furthermore, I would say that a struct like this:

struct Temp(T...) { int[T] arr; }

should work perfectly if T.length == 0, since it would then be a dynamic array.
But the parser is forced to treat it as an associative array, and so _both_ of
these break:

Temp!() a;
Temp!(2) b;      //Why does this break?

Why should the second one break, in any case?
February 11, 2011
Re: Context-Free Grammars? What about arrays?
Sorry, please ignore the very last line:
  Temp!(2) b;      //Why does this break?

I totally wasn't thinking about the fact that T is a tuple in "int[T] arr;". (I
would intuitively think that it should still work, given that a tuple is just a
bunch of types, but I can see why it wouldn't.)

However, why the previous statement breaks:
   Temp!() a;
is still puzzling me, since giving nothing to the tuple should be equivalent to
nothing being in the array.

Thanks.
February 11, 2011
Re: Context-Free Grammars? What about arrays?
On 2011-02-11 10:37, Jonathan M Davis wrote:
> On Friday 11 February 2011 01:03:21 %u wrote:
>> Hi,
>>
>> I think I'm having a little trouble understanding what's meant by
>> context-free grammar. I've read that D is context-free, but is it really?
>> What about an expression like:
>>
>> int[U] s;
>>
>> ? You can't tell -- without looking at the context -- whether U is a data
>> type or a number, and so because associative arrays and regular arrays are
>> syntactically different elements of the language, the syntax of D is tied
>> in with its semantics, just like in C++.
>>
>> So is D really context-free? Or am I misunderstanding the meaning of the
>> term?
>
> Yes. You're misunderstanding. It's actually quite difficult to parse a language if
> its grammar isn't context-free. You really should read up in context-free
> grammars if you want to understand it properly.
>
> Context-free means that it's unambiguous as to which grammar rule is to be
> applied at any point during the parsing process. In this case, U is a symbol.
> The parser probably doesn't care about much more than that. That symbol could
> end up being a number or a type. And actually, it probably isn't even that hard.
> U is a class, struct, alias, or template argument. The template argument and
> alias would be substituted with a real type or value before the compiler had to
> determine what type int[U] was.
>
> Not to mention, the context-free grammar is entirely for parsing tokens into an
> abstract syntax tree. The compiler then has to do whatever it does with the AST
> to actually generate assembly from it. So, as long as the parser can parse it
> into the AST, it could be the next stage's problem to determine what that really
> means.
>
> Regardless, I suggest that you read a compiler book if you really want to
> understand context-free grammars. I have "Compiler Construction: Principles and
> Practices" by Kenneth C. Louden, which is quite good, but the most popular one
> is the so-called "dragon" book. I'm not sure what it's actual title is though.
>
> - Jonathan M Davis

The title of the so called "dragon" book is "Compilers: Principles, 
Techniques, and Tools" and can be found here:

http://www.amazon.com/Compilers-Principles-Techniques-Tools-2nd/dp/0321486811/ref=sr_1_1?s=books&ie=UTF8&qid=1297440057&sr=1-1

It's also listed on the bookshelf page on the DigitalMars site.

-- 
/Jacob Carlborg
February 14, 2011
Re: Context-Free Grammars? What about arrays?
"%u" <wfunction@hotmail.com> wrote in message 
news:ij3la9$t53$1@digitalmars.com...
>
> The problem is that it _is_ ambiguous what rule to apply. To me, just 
> because
> static arrays and associative arrays happen to have similar _looks_ 
> doesn't make
> parsing them context-free -- they're defined with completely separate 
> syntaxes,
> and they just happen to look identical.

Yes, they just happen to look identical: but that's the crux: How it "looks" 
is the *only* thing that parser/grammar are concerned with. The fact that it 
could be one of two different *types* is irrelevent since type systems are 
purely a semantic matter and (at least in D) what type something is never 
affects how the code's *structure* is interpreted.

True, there are times when different types are distinguished via different 
syntaxes (for instance, function vs integer vs array), but that's 
incidental. The important thing is that that semantic-level type information 
never actually needs to feed back into the parser.

In the case of "int[U] s;", the *only* thing the parser is concerned with is 
"Ok, this is a variable declaration, and a variable declaration is a 
statement, and a statement can occur in x, y and z places". All of that 
stays exactly the same no matter what "U" is and no matter what type 
"int[U]" is. Anything beyond that is not the parser's job.

Consider this:

double a;
bool b;

Two different variables that are treated very differently. But the 
*structure* is exactly the same: {type} {identifier} {semicolon}. In all 
cases, that means "variable declaration". And a variable declaration is a 
statement, etc. For both variables, that structure is identical, and that's 
all that the parser is concerned with. In a grammar-definition that would 
look something like this:

<Var Decl> ::= <Type> <Ident> ';'
<Statement> ::= <Var Decl>
                      |  any other type of statement here
                      |  etc

Likewise, in the case of "int[U] s;" we have:

<Var Decl> ::= <Type> '[' <Ident> ']' <Ident> ';'

Note that everything about that is true no matter what the ident inside the 
brackets is. Also note that there is *nothing* in there the pulls in any 
information from the semantic phase. If it did, it wouldn't be context-free.

If it worked like this:

<Array Decl> ::= <Type> '[' <Ident> ']' <Ident> ';'
<Assoc Array Decl> ::= <Type> '[' <Ident> ']' <Ident> ';'

...*Then* it would no longer be context-free because the parser would have 
to rely on the semantics of 'U' to determine how to parse it. But D doesn't 
do that. It doesn't even bother distinguishing between array declarations 
and assoc array declarations until the semantic phase. Therefore the grammar 
(ie, the parsing) is still context-free.

One other thing that would prevent it from being context-free would be if 
the left-side of a production rule had more than one token:

<A> <B> ::= etc...


> The fact that the compiler has to do a
> semantic analysis in order to figure this out (which was originally a 
> syntax
> problem) really makes me wonder why you mention that D is still 
> context-free.
>

It's not a syntax problem unless the grammar defines it to be a syntax 
problem. Anything that's defined in the grammar is syntax; anything that 
isn't defined in the grammar is not syntax. Also, remember that grammar only 
apples to lexing/parsing and does not specify anything about semantics (and 
"sementics" effectively means "anything that isn't handled by the lexer or 
the parser").
February 14, 2011
Re: Context-Free Grammars? What about arrays?
"%u" <wfunction@hotmail.com> wrote in message 
news:ij3la9$t53$1@digitalmars.com...
>
> Again, just because the AST's _happen_ to _look_ the same for static and
> associative arrays, does that mean that this makes D context-free?

Grammar is *just* about how it *looks*, not what it means.

> The meaning of
> the expression is still ambiguous, whether or not it fits well within the 
> tree.
>

"Meaning" is semanitcs, not syntax. Therefore, whether or not the *meaning* 
is ambiguous is irrelevent to the grammar. The parser/grammar does *not* 
deal with meaning, just structure.
« First   ‹ Prev
1 2
Top | Discussion index | About this forum | D home