Thread overview
D grammar (yacc format)
Jan 28, 2004
Sammy
Jan 28, 2004
C
Jan 28, 2004
Ben Hinkle
Jan 29, 2004
Sammy
Jan 29, 2004
Sammy
Jan 29, 2004
Sammy
January 28, 2004
I began to write a yacc style grammar for D.

I am using the parse.c from the dmd source.
Each function is being translated to a piece of
the grammar. Each piece is added to the source
file using some tag ( /**@Grammar )

Eg:

/**@Grammar
 *
 * %token "^" TOKxor
 *
 * XorExp : AndExp
 *           | XorExp "^" AndExp
 *           ;
 *
 **/

Expression *Parser::parseXorExp()
{   Expression *e;
    Expression *e2;
    Loc loc = this->loc;

    e = parseAndExp();
    while (token.value == TOKxor)
    {
 nextToken();
 e2 = parseAndExp();
 e = new XorExp(loc, e, e2);
    }
    return e;
}

I dont know if the final grammar will be LALR, nor
if it will be unabiguous.

If everything is ok, i will use bison to process the grammar.
if not, there are other parser generators who can
handle anbiguous and/or non-lalr grammars
( eg: btyacc, accent, msta, D'Parser )

I would like to see the grammar maintained ( i can do that )
and distributed with DMD ( i cant do that :-) .

I also have a lexer (made with flex), who needs to be
updated.

What do you think ?


January 28, 2004
Wow, this would be awesome!  Think you'll end up using somethign besides bison though , D'Parser looks really cool.

C
"Sammy" <none@here.com> wrote in message
news:bv941u$1e1e$1@digitaldaemon.com...
> I began to write a yacc style grammar for D.
>
> I am using the parse.c from the dmd source.
> Each function is being translated to a piece of
> the grammar. Each piece is added to the source
> file using some tag ( /**@Grammar )
>
> Eg:
>
> /**@Grammar
>  *
>  * %token "^" TOKxor
>  *
>  * XorExp : AndExp
>  *           | XorExp "^" AndExp
>  *           ;
>  *
>  **/
>
> Expression *Parser::parseXorExp()
> {   Expression *e;
>     Expression *e2;
>     Loc loc = this->loc;
>
>     e = parseAndExp();
>     while (token.value == TOKxor)
>     {
>  nextToken();
>  e2 = parseAndExp();
>  e = new XorExp(loc, e, e2);
>     }
>     return e;
> }
>
> I dont know if the final grammar will be LALR, nor
> if it will be unabiguous.
>
> If everything is ok, i will use bison to process the grammar.
> if not, there are other parser generators who can
> handle anbiguous and/or non-lalr grammars
> ( eg: btyacc, accent, msta, D'Parser )
>
> I would like to see the grammar maintained ( i can do that )
> and distributed with DMD ( i cant do that :-) .
>
> I also have a lexer (made with flex), who needs to be
> updated.
>
> What do you think ?
>
>


January 28, 2004
Interestingly GCC's C++ frontend switched from using a yacc generated parser to a hand-written parser in GCC-3.4. I don't know all the reasons why but you could probably Google around to find out why. Personally I find the 3.4 hand-written parser is easier to understand.

-Ben

"Sammy" <none@here.com> wrote in message news:bv941u$1e1e$1@digitaldaemon.com...
> I began to write a yacc style grammar for D.
>
> I am using the parse.c from the dmd source.
> Each function is being translated to a piece of
> the grammar. Each piece is added to the source
> file using some tag ( /**@Grammar )
>
> Eg:
>
> /**@Grammar
>  *
>  * %token "^" TOKxor
>  *
>  * XorExp : AndExp
>  *           | XorExp "^" AndExp
>  *           ;
>  *
>  **/
>
> Expression *Parser::parseXorExp()
> {   Expression *e;
>     Expression *e2;
>     Loc loc = this->loc;
>
>     e = parseAndExp();
>     while (token.value == TOKxor)
>     {
>  nextToken();
>  e2 = parseAndExp();
>  e = new XorExp(loc, e, e2);
>     }
>     return e;
> }
>
> I dont know if the final grammar will be LALR, nor
> if it will be unabiguous.
>
> If everything is ok, i will use bison to process the grammar.
> if not, there are other parser generators who can
> handle anbiguous and/or non-lalr grammars
> ( eg: btyacc, accent, msta, D'Parser )
>
> I would like to see the grammar maintained ( i can do that )
> and distributed with DMD ( i cant do that :-) .
>
> I also have a lexer (made with flex), who needs to be
> updated.
>
> What do you think ?
>
>


January 29, 2004
"Ben Hinkle" <bhinkle4@juno.com> wrote in message news:bv9etc$21ib$1@digitaldaemon.com...
> Interestingly GCC's C++ frontend switched from using a yacc generated
parser
> to a hand-written parser in GCC-3.4. I don't know all the reasons why but you could probably Google around to find out why. Personally I find the
3.4
> hand-written parser is easier to understand.
>
> -Ben
>

I am not sure, and i dont remember where i read this,
but I think its because the parser is faster, simpler and
as you said, easier to understand ( and possibly to implement ).

Implement the parser, any way you want.
But if you want to have a clear picture of the language,
i think you need a grammar.

If you use EBNF, instead of pure BNF, you will end up with a grammar fitting on 2 or 3 pages instead of the 67 pages of the file parser.c

So you are right, but i still find its still and interesting exercice.

- S


January 29, 2004
"Ben Hinkle" <bhinkle4@juno.com> wrote in message news:bv9etc$21ib$1@digitaldaemon.com...
> Interestingly GCC's C++ frontend switched from using a yacc generated
parser
> to a hand-written parser in GCC-3.4. I don't know all the reasons why but you could probably Google around to find out why. Personally I find the
3.4
> hand-written parser is easier to understand.
>
> -Ben
>

Found this short explaination on :

     http://gnu.essentkabel.com/software/gcc/gcc-3.4/changes.html

GCC 3.4 Release Series. Changes, New Features, and Fixes

C++
A hand-written recursive-descent C++ parser has replaced the YACC-derived
C++ parser from previous GCC releases. The new parser contains much improved
infrastructure needed for better parsing of C++ source codes, handling of
extensions, and clean separation (where possible) between proper semantics
analysis and parsing. The new parser fixes many bugs that were found in the
old parser.

- S


January 29, 2004
"Ben Hinkle" <bhinkle4@juno.com> wrote in message news:bv9etc$21ib$1@digitaldaemon.com...
> Interestingly GCC's C++ frontend switched from using a yacc generated
parser
> to a hand-written parser in GCC-3.4. I don't know all the reasons why but you could probably Google around to find out why. Personally I find the
3.4
> hand-written parser is easier to understand.
>
> -Ben
>

Sorry, to post again.
All those explanation tells that its a very good choice
to use a hand written parser...

From: http://compilers.iecc.com/comparch/article/02-01-022
----------------------------------------------------------------------
Re: C++ parsing : what's new ?
From comp.compilers

From:  Zack Weinberg <zackw@panix.com>
Newsgroups:  comp.compilers
Date:  4 Jan 2002 00:09:35 -0500
Organization:  PANIX -- Public Access Networks Corp.
References:  01-12-179
Keywords:  C++, parse
Posted-Date:  04 Jan 2002 00:09:35 EST

Peter H. Froehlich <pfroehli@ics.uci.edu> writes:
>On Saturday, December 29, 2001, at 10:30 , Travis Nixon wrote:
>
>> "Martin von Loewis" <loewis@informatik.hu-berlin.de> wrote in message
>>> The gcc grammar is currently being rewritten, from a bison-based one to a hand-written recursive-descent parser.
>>
>> Is there a discussion online anywhere about the reasons for doing this?
>
>As I recall the motivation is performance. The people doing the rewrite expect the recursive descent parser to be way faster than the table-driven one they have now. However, I sure would *not* want to write a parser like that for a grammar as hairy as C/C++. :-)


The motivations are correctness, performance, and better error recovery, in that order.


The existing gcc grammar for C++ does not interpret all language constructs correctly; this has been judged unfixable while continuing to use yacc. There are already horrible kludges to get certain constructs mostly right (look at cp/spew.c, if you want to know) that become unnecessary with a recursive descent parser.


A hand-rolled recursive descent parser has a lot more information accessible to it. This should permit it to make better backtracking decisions. Bad backtracking is the major cause of bad performance in the existing parser. It will certainly permit better error recovery.


Note that this approach is known to work - the Edison Design Group (EDG)'s C/C++ front end uses a hand-written recursive descent parser, and it does all the above things better than g++. As someone who's worked on both front ends, I can also say that I found it _easier_ to work with the recursive descent parser.


zw