>> > On Wednesday, February 29, 2012 02:16:12 Christopher Bergqvist wrote:
>> > > I agree that the current direction of D in this area is impressive.
>> > > However, I fail to see a killer-feature in generating a lexer-parser
>> > > generator at compile-time instead of run-time.
>> > >
>
>
> CTFE parsing is especially useful for DSEL (Domain Specific Embedded Languages) or internal DSLs. The advantages are:
>
> 1. Syntactic errors (in the parsed constructs) are reported at compile time.
> 2. D reflections are available only at compile time. Referencing the variables/identifiers in the parsed subset of DSL with the mainstream D code is impossible without reflections in place.

One of my goals while writing a CT grammar generator was to get a compile-time parse tree. Since it contains strings, it's easy to walk the tree, assembling strings as you go and generating the code you want (if/when you want to write code, that is).

Strings are D's way of representing code, so any way to get structured strings at compile-time opens whole vistas for code generation.
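
To make that concrete, here is a minimal sketch of the idea, not the grammar generator itself. The Node struct and the toStructs function are hypothetical names invented for this illustration; the only D features relied on are CTFE and string mixins. A small tree is walked at compile time, a string of D declarations is assembled along the way, and the result is mixed in:

```d
import std.stdio : writeln;

// Hypothetical stand-in for a compile-time parse tree:
// a node name plus its children.
struct Node
{
    string name;
    Node[] children;
}

// Walk the tree and assemble D source as a string:
// one (nested) struct declaration per node.
string toStructs(Node n)
{
    string code = "struct " ~ n.name ~ " {\n";
    foreach (child; n.children)
        code ~= toStructs(child);
    code ~= "}\n";
    return code;
}

// Both the tree and the generated source are computed by CTFE.
enum tree = Node("Doc", [Node("Head"), Node("Body", [Node("Para")])]);
enum generated = toStructs(tree);

// Inject the generated declarations into the module.
mixin(generated);

// The generated types are visible to ordinary D code.
static assert(is(Doc.Body.Para == struct));

void main()
{
    writeln(generated); // show the source that was mixed in
}
```

A real parse tree would carry matches and positions as well, but the walk-and-concatenate pattern stays the same.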

As for semantic actions, I added them to my code yesterday. I had hoped to use D's new anonymous function syntax (p => p), but because lambdas are anonymous, they cannot easily be inserted into string mixins (other modules do not know about __lambda1 and co).

Anyway, I now have semantic actions at compile-time. I used them to write a small (as in, veeery simple) XML parser: semantic actions push node names as they are encountered and pop the last tag when a closing tag is encountered. It seems to work OK.

That looks a bit like this (sorry, writing on a pad):

mixin(Grammar!("Doc <- Node*",
                "Node <- OpeningTag (Text / Node)* ClosingTag", NodeAction,
                "OpeningTag <- '<' Identifier '>'", OpeningAction,
                "ClosingTag <-  `</` Identifier '>'", ClosingAction,
                "Text <- (!(OpeningTag / ClosingTag) _)+"));

The PEG for Text just means: any char, as long as it's neither an OpeningTag nor a ClosingTag. PEGs use '.' to say 'any char', but I wanted to be able to deal with qualified names, so I chose '_' instead.
When there is no action, it defaults to NoOp, as is the case for Doc, Node and Text.
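
For readers who want to see the push/pop idea in isolation, here is a hand-rolled sketch. It does not use the Grammar template above; wellNested is a hypothetical helper written only for this illustration, and it skips identifier validation entirely. The comments mark where the OpeningAction and ClosingAction roles would sit:

```d
// Checks that tags such as <a><b></b></a> are properly nested,
// using a stack of tag names. Usable at compile time through CTFE.
bool wellNested(string input)
{
    string[] stack;   // names of the currently open tags
    size_t i = 0;
    while (i < input.length)
    {
        if (input[i] != '<') { ++i; continue; }   // plain text: skip it

        // Find the '>' that ends this tag.
        size_t end = i + 1;
        while (end < input.length && input[end] != '>') ++end;
        if (end == input.length) return false;    // unterminated tag

        if (i + 1 < end && input[i + 1] == '/')
        {
            // Closing tag: pop and compare (the ClosingAction role).
            string name = input[i + 2 .. end];
            if (stack.length == 0 || stack[$ - 1] != name) return false;
            stack = stack[0 .. $ - 1];
        }
        else
        {
            // Opening tag: push the name (the OpeningAction role).
            stack ~= input[i + 1 .. end];
        }
        i = end + 1;
    }
    return stack.length == 0;   // every opened tag must have been closed
}

// Evaluated by the compiler, so a badly nested literal fails the build.
static assert( wellNested("<a>text<b></b></a>"));
static assert(!wellNested("<a><b></a></b>"));

void main() {}
```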

I also added named captures (and capture comparison), to be able to say: "I want a sequence of equal chars":

"Equal <- _@first (_=first)*"

That is: any char, store it as "first", then take any number of chars, as long as their match is equal to first's match.
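
Written out by hand, that rule behaves roughly like the function below. This is only a sketch of the rule's meaning, not code produced by the generator; matchEqual is a hypothetical name, and it compares UTF-8 code units rather than full code points:

```d
// Length of the prefix matched by "Equal <- _@first (_=first)*":
// one char captured as `first`, then every following char equal to it.
// Returns 0 when the rule fails (empty input, so `_` cannot match).
size_t matchEqual(string input)
{
    if (input.length == 0) return 0;
    immutable first = input[0];   // the named capture
    size_t n = 1;
    while (n < input.length && input[n] == first) ++n;
    return n;
}

// Checked at compile time through CTFE.
static assert(matchEqual("aaab") == 3);
static assert(matchEqual("abc") == 1);

void main() {}
```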

All this works at CT.

I'm afraid being on holiday right now means I do not have easy access to GitHub (no git on a pad, and the computer I use to code right now does not have any network connection). I'll put all this online in a few days, because this must seem like the ramblings of a madman right now...

Philippe