Writing a Parser

Here you can find some parser generators: http://www.prowiki.org/wiki4d/wiki.cgi?GrammarParsers

I use lemonde as parser generator and re2c as lexer. Using re2c, in most cases you only have to replace 'unsigned int' with 'uint' in the resulting code.

regards
cs

Dan wrote:
> I've been messing with how to write a parser, and so far I've played with numerous patterns before eventually wanting to cry.
>
> At the moment, I'm trying recursive descent parsing.
>
> The problem is that I've realized I'm duplicating huge volumes of code to cope with the tristate decision of { unexpected, allow, require } for any given token.
>
> For example, to consume a for loop, you consume something similar to
> /for\s*\((.*?)\)\s*\{(.*?)\}/
>
> I have it doing that, but my soul feels heavy with the masses of looped switches it's doing. Is there any way to ease the pain?
>
> Regards,
> Dan
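One common way to tame the { unexpected, allow, require } tristate in a hand-written recursive descent parser is to fold it into two small helpers. The sketch below is in Python rather than D, purely for illustration, and every name in it (`tokenize`, `Parser`, `accept`, `expect`, `take`, `parse`) is hypothetical, as is the toy grammar (a `for (...) { ... }` loop plus `name;` statements). `accept` covers "allow", `expect` covers "require", and "unexpected" is simply whatever neither helper consumed.

```python
import re

# Toy lexer: identifiers and a handful of punctuation characters.
TOKEN_RE = re.compile(r"\s*([A-Za-z_]\w*|[(){};])")

def tokenize(text):
    """Return the list of token strings in `text`."""
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise SyntaxError(f"bad character at offset {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens

class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        """Look at the next token without consuming it (None at end)."""
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def accept(self, value):
        """'allow': consume `value` if it is next, otherwise do nothing."""
        if self.peek() == value:
            self.pos += 1
            return True
        return False

    def expect(self, value):
        """'require': consume `value` or report a syntax error."""
        if not self.accept(value):
            raise SyntaxError(f"expected {value!r}, got {self.peek()!r}")

    def take(self):
        """Consume and return the next token, whatever it is."""
        value = self.peek()
        if value is None:
            raise SyntaxError("unexpected end of input")
        self.pos += 1
        return value

    def parse_statement(self):
        if self.accept("for"):
            self.expect("(")
            header = []
            while not self.accept(")"):
                header.append(self.take())
            self.expect("{")
            body = []
            while not self.accept("}"):
                body.append(self.parse_statement())
            return ("for", header, body)
        name = self.take()
        if not name.isidentifier():
            # 'unexpected': nothing above wanted this token.
            raise SyntaxError(f"unexpected token {name!r}")
        self.expect(";")
        return ("stmt", name)

def parse(text):
    return Parser(tokenize(text)).parse_statement()
```

With these helpers each grammar rule reads as a straight sequence of `accept` and `expect` calls, so the per-token switch logic lives in one place instead of being repeated in every rule.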

January 10, 2008

Re: Writing a Parser

Posted by bearophile
in reply to Christoph Singewald

Christoph Singewald:
> I use lemonde as parser generator and re2c as lexer. Using re2c, in most
> cases you only have to replace 'unsigned int' with 'uint' in the resulting code.

I have seen re2c and it looks nice. I think its source code could be modified without much effort to make it produce D code instead of C.

I don't think C is the right language for writing such a tool: its sources are about 200 KB of C code (plus some code generated by itself), and they could probably be replaced by a Python module of 50 KB or less that generates the C/D code.

A tiny but really fast C compiler such as TinyCC (http://fabrice.bellard.free.fr/tcc/) could be used to compile the generated C code on the fly, in memory, and execute it. This could become the starting point for giving D run-time compiled regular expressions :-) There are probably other, smarter ways to do similar things.
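To make the "generate code, compile it in memory, execute it" idea concrete, here is a minimal sketch in Python, with Python's built-in `compile` and `exec` standing in for TinyCC and generated C. It is only an analogy under loud assumptions: the helper name `compile_literal_matcher` is hypothetical, and it specializes a matcher for a literal string, not for a full regular expression.

```python
def compile_literal_matcher(literal):
    """Generate source for a matcher specialized to `literal`,
    compile it in memory, and return (function, generated source)."""
    lines = ["def match(text, pos=0):"]
    # Bail out early if there is not enough input left.
    lines.append(f"    if len(text) - pos < {len(literal)}: return False")
    # Unroll one comparison per character of the literal.
    for i, ch in enumerate(literal):
        lines.append(f"    if text[pos + {i}] != {ch!r}: return False")
    lines.append("    return True")
    source = "\n".join(lines)

    namespace = {}
    # Compile the generated source and run it to define `match`.
    exec(compile(source, "<generated>", "exec"), namespace)
    return namespace["match"], source

match_for, generated_source = compile_literal_matcher("for")
```

Printing `generated_source` shows the unrolled comparisons; a real implementation would emit a state machine for the whole pattern, in the way re2c does for C.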

Bye,
bearophile
