Thread overview
An idea for an extensible, Lexer/Parser/Framework for compilers.
Mar 06, 2008
Ryan Bloomfield
Mar 06, 2008
BCS
Mar 06, 2008
Ryan Bloomfield
Mar 07, 2008
BCS
March 06, 2008
    I was brainstorming over a project, one that may not amount to much, but I wanted to give it a go.  I am curious what the newsgroup thinks of it.  I would love any ideas, no matter how crazy. The best brainstorming works with a thousand ideas thrown on the table and 90% of them ending up in the trash.

    I was thinking it might be possible to implement a Lexer/Parser Generator whose source would essentially be an extension of the D language and would then generate D modules.

	Each class could describe an individual language feature, with the grammar written directly in the class definition, complete with unit tests to ensure compatibility with standards.  The generator could then scan all the classes and generate a grammar description that could be used to make the lexer and parser.
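	A minimal sketch of that scanning step, written in Python rather than D only to keep it short and runnable. The names `LanguageFeature`, `rule`, `grammar`, and `collect_grammar` are hypothetical; nothing here comes from an existing tool.

```python
class LanguageFeature:
    """Hypothetical base class: each subclass describes one language feature."""
    rule = None      # name of the grammar rule this feature defines
    grammar = None   # right-hand side of the rule, in BNF-like notation


class IfStatement(LanguageFeature):
    rule = "if_stmt"
    grammar = '"if" "(" expr ")" stmt'


class WhileStatement(LanguageFeature):
    rule = "while_stmt"
    grammar = '"while" "(" expr ")" stmt'


def collect_grammar(base=LanguageFeature):
    """Scan the direct subclasses of `base` and emit one grammar line per feature."""
    lines = [f"{feature.rule} ::= {feature.grammar}"
             for feature in base.__subclasses__()]
    # Sort so the output does not depend on class definition order.
    return "\n".join(sorted(lines))


print(collect_grammar())
```

	A real generator would feed the collected rules to a lexer/parser builder instead of printing them, and the per-feature unit tests would live next to each class.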

	Those classes would then be extended to implement the backend, whether it be a compiler, a language translator, or whatever else.  It would be more of a framework than a generator, but it could be useful for finding mistakes and bugs, and for guaranteeing language compatibility, while the code is being generated.
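	To illustrate the "extend the class to add a backend" idea, here is a small Python sketch (again not D). `IfStatement`, `IfStatementToC`, and `generate` are made-up names, and the C-like output is just one possible target.

```python
class IfStatement:
    """Hypothetical front-end class: holds the grammar for one feature."""
    grammar = 'if_stmt ::= "if" "(" expr ")" stmt'

    def __init__(self, condition, body):
        self.condition = condition
        self.body = body


class IfStatementToC(IfStatement):
    """Hypothetical back end: inherits the grammar, adds code generation."""

    def generate(self):
        # Emit C-like text for the already-parsed condition and body.
        return f"if ({self.condition}) {{ {self.body} }}"


node = IfStatementToC("x > 0", "x--;")
print(node.generate())  # prints: if (x > 0) { x--; }
```

	The grammar stays in the front-end class, so a translator and a compiler could share it and differ only in their subclasses.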

	The goal would primarily be to make a D parser, but I think it can be designed to be flexible enough (or easy enough to modify) for other languages.

	It wouldn't be as easy to create a language from scratch using this tool, but it might make it easier to modify an existing language definition with custom features.  For example, a D compiler written with this framework would make it easy for a small group to create their own flavor of D while still being standards compliant.

	Perhaps an in-house group wants to make sure their programmers only use safe, high-level features of the language to minimize bugs.

	Or maybe a programmer wants to implement a subset of the class design when performance and size are at a premium.

	Perhaps someone will get energetic and create a full implementation of extern(C++) (won't hold my breath though :-).

	Maybe the creator of the latest and greatest programming philosophy, like the now popular (I think) contracts and unit testing, needs a compiler to try it out in.

	This ought to help test out new language features that can later be added to the standard language.  It would also let those who need a language for a special application consider using an extended D instead of writing their own.

	My personal goal is to extend D with an extern(gobject) to do something like what Vala (http://live.gnome.org/Vala) is trying to do.

	So, what do you think?  Sound like a useful project?

March 06, 2008
Ryan Bloomfield wrote:

> The generator could then scan all the classes and
> generate a grammar description that could be used
> to make the lexer and parser.

With respect to the above, take a look at this:

http://www.dsource.org/projects/scrapple/browser/trunk/dparser/dparse.d

It doesn't do lexing, but it is a fully functional parser generator that does its processing in-language using templates and string processing. The input is a BNF text string.
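For readers unfamiliar with the approach, here is a rough Python sketch of what "the input is a BNF text string" can mean. It is not how dparse.d works (that runs at compile time using D templates); `parse_bnf` and the toy grammar are hypothetical and only show the string being turned into a rule table at run time.

```python
def parse_bnf(text):
    """Turn 'name ::= a b | c' lines into {name: [[a, b], [c]]}.

    Simplification: a literal containing '|' or spaces is not supported.
    """
    rules = {}
    for line in text.strip().splitlines():
        if not line.strip():
            continue
        name, _, body = line.partition("::=")
        # Each '|' separates one alternative; each alternative is a symbol list.
        rules[name.strip()] = [alt.split() for alt in body.split("|")]
    return rules


GRAMMAR = """
expr ::= term "+" expr | term
term ::= NUMBER
"""

rules = parse_bnf(GRAMMAR)
print(rules["expr"])  # prints: [['term', '"+"', 'expr'], ['term']]
```

A generator would then walk such a table to produce parsing code, one function (or template instance) per rule.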

It's very limited at this time because it generates huge symbols (it takes 700MB of ram to compile a ~50 rule grammar) but I have some ideas on how to fix much of that and another idea on how to make it easier to build in common actions.

Is that something along the lines of what you are thinking of?
March 06, 2008
BCS Wrote:
> Ryan Bloomfield wrote:
> 
> > The generator could then scan all the classes and
> > generate a grammar description that could be used
> > to make the lexer and parser.
> 
> with respect to the above, take a look at this:
> 
> http://www.dsource.org/projects/scrapple/browser/trunk/dparser/dparse.d

Wow, that's some pretty meta-programming. :)  I thought of doing some simple meta-program just to see what it looked like compared to C++ templates.  I did enough C++ template programming, including a pseudo if statement, to really get excited about D's stuff.


> It doesn't do lexing, but it is a fully functional parser generator that does its processing in-language using templates and string processing. The input is a BNF text string.
>
> 
> It's very limited at this time because it generates huge symbols (it takes 700MB of ram to compile a ~50 rule grammar) but I have some ideas on how to fix much of that and another idea on how to make it easier to build in common actions.
> 
> Is that something along the lines of what you are thinking of?

I like it; it would definitely simplify the generated parser.  I'm thinking it might be a good idea to make the parser generator very high level (i.e. specific to D-like languages). I haven't tried yet, but maybe there is a way to make it work much like a hand-written recursive descent parser.  I'm going to use this as part of my research. Thank you!
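For reference, this is the shape of hand-written recursive descent parser I have in mind, sketched in Python (not D) for a made-up toy grammar of integer sums. Each grammar rule becomes one method; all names here are hypothetical.

```python
import re


def tokenize(source):
    """Split the input into integer and '+' tokens.

    Simplification: any other character is silently skipped.
    """
    return re.findall(r"\d+|\+", source)


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def expr(self):
        # expr ::= term ("+" term)*
        value = self.term()
        while self.peek() == "+":
            self.pos += 1          # consume "+"
            value += self.term()
        return value

    def term(self):
        # term ::= NUMBER
        token = self.peek()
        if token is None or not token.isdigit():
            raise SyntaxError(f"expected a number, got {token!r}")
        self.pos += 1
        return int(token)


def parse(source):
    parser = Parser(tokenize(source))
    value = parser.expr()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return value


print(parse("1 + 2 + 39"))  # prints: 42
```

A generator that emitted one such method per rule would produce code that reads like something written by hand, which is the appeal.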

Ryan
March 07, 2008
Reply to Ryan,

> BCS Wrote:
> 
>> Ryan Bloomfield wrote:
>> 
>>> The generator could then scan all the classes and generate a grammar
>>> description that could be used to make the lexer and parser.
>>> 
>> with respect to the above, take a look at this:
>> 
>> http://www.dsource.org/projects/scrapple/browser/trunk/dparser/dparse.d
>> 
> Wow, that's some pretty meta-programming. :)  I thought of doing some
> simple meta-program just to see what it looked like compared to C++
> templates.  I did enough C++ template programming, including a pseudo
> if statement, to really get excited about D's stuff.
> 
>> It doesn't do lexing, but it is a fully functional parser generator
>> that does its processing in-language using templates and string
>> processing. The input is a BNF text string.
>> 
>> It's very limited at this time because it generates huge symbols (it
>> takes 700MB of ram to compile a ~50 rule grammar) but I have some
>> ideas on how to fix much of that and another idea on how to make it
>> easier to build in common actions.
>> 
>> Is that something along the lines of what you are thinking of?
>> 
> I like it; it would definitely simplify the generated parser.  I'm
> thinking it might be a good idea to make the parser generator very
> high level (i.e. specific to D-like languages). I haven't tried yet,
> but maybe there is a way to make it work much like a hand-written
> recursive descent parser.  I'm going to use this as part of my
> research. Thank you!
> 
> Ryan
> 

Let me know if you run into problems. Knowing that someone will benefit from something helps me find time to work on it.