Help with porting grammar from PEGjs to D for dustjs project!

Aug 03, 2014

Uranuz

Aug 03, 2014

Philippe Sigaud


I want to try to implement the web template engine dustjs: http://akdubya.github.io/dustjs/ As a first step, I need to implement a parser for its grammar. Since the parsing code was generated with the PEGjs parser generator, the resulting code is quite long (about 4200 lines). I don't think porting it manually would be productive: it is long, the source JavaScript code could change, and the port would not be easy to maintain. So I remembered the PEGGED project for D. As far as I can see, the grammar formats of the two systems differ, and I have no experience with grammar generators, so I need some help with rewriting the grammar from PEGjs to PEGGED. I also don't understand how to generate some logic from the AST in PEGGED (I have not tried it yet). Where should I describe it? Or should I walk over all the nodes somehow and generate code for them? The goal is to use the dust template system as a server-side template engine. I also considered handlebars.js, but I can't evaluate which is more suitable for my purposes. The choice is very subjective.

Uranuz:
> http://akdubya.github.io/dustjs/
> So I need some help with rewriting grammar from PEGjs into PEGGED.

Is this the grammar? https://github.com/akdubya/dustjs/blob/master/src/dust.pegjs

If so, then I think I can provide some help. But I don't get what output you want (see below).

> Also I don't understand in PEGGED (I have not tried to use it yet) how to generate some logic from AST. Where should I describe it or should I walk around all nodes for somehow and generate code for them.

You can put semantic actions in the grammar (code between curly brackets). dust.pegjs seems to have that in its grammar definition as well (all these { return something } blocks). Or you can walk the parse tree afterwards.

See the Pegged tutorial here: https://github.com/PhilippeSigaud/Pegged/wiki/Pegged-Tutorial
More particularly: https://github.com/PhilippeSigaud/Pegged/wiki/Using-the-Parse-Tree

The example explains (I hope) how to use a wiki grammar to parse wiki text and output LaTeX code.

> Goal of this is to use dust template system as template engine at server side.

More concretely, what's the goal? A template as input and... what should be output? If I understand correctly, dustjs produces Javascript code. Is that what you want? Or do you want D code?

Also, did you have a look at vibe.d and its Diet templates?

I am a real noob with grammar description languages, so I need some explanation. As far as I understand, the expressions in curly braces are used to modify the syntax tree during parsing instead of modifying it afterwards? How could I use PEGGED to map some code to these parsed expressions, to generate code that will perform the operations defined by this grammar? Should I walk the whole syntax tree, append code to a string and then mix it in, or are there some features for code generation? Something else I was thinking about is comparing the resulting syntax trees to check that the grammar was ported correctly. It would be great if different grammar parsers could output their result in a common format (JSON or XML, for example), so that the results could be compared for equality. But different parsers have different internal tree formats, so maybe some transformation could be created. With this feature it would be possible to tell whether the parser is working correctly.

August 04, 2014

Re: Help with porting grammar from PEGjs to D for dustjs project!

Posted by Philippe Sigaud
in reply to Uranuz


On Mon, Aug 4, 2014 at 7:13 AM, Uranuz via Digitalmars-d-learn <digitalmars-d-learn@puremagic.com> wrote:
> I am a real noob with grammar description languages, so I need some explanation. As far as I understand, the expressions in curly braces are used to modify the syntax tree during parsing instead of modifying it afterwards?

Yes, that's it. Or at least that's the way Pegged does it. I'm not such a specialist myself, I just dabbled in it to write Pegged. As I understand it, many parsers do not produce a parse tree; they only 'return' what their embedded actions tell them to.

Personally, I have a slight preference for using the parse tree once it's complete: a) because that way I have access to all the information (parent nodes, sibling nodes, even far away ones), whereas doing it with an action means you only have the local, current node context, and b) because it decouples steps that are separate in my mind anyway: parsing, then producing a value out of the parse tree. A bit like ranges in D, and Walter's talk on component programming.

> How could I use PEGGED to map some code to these parsed expressions, to generate code that will perform the operations defined by this grammar? Should I walk the whole syntax tree, append code to a string and then mix it in, or are there some features for code generation?

You can insert semantic actions inside the grammar definition, as is
done by Dustjs, or you can have a function walking the tree
afterwards.
For D, since Pegged works at compile time, you can have your parse
tree at compile time. Use the walking function to generate D code (a
string) and mix it in.

enum parseTree = Grammar(input); // CT parsing
string codeMaker(ParseTree pt) { ... } // recursive walker

mixin(codeMaker(parseTree));

See https://github.com/PhilippeSigaud/Pegged/wiki/Using-the-Parse-Tree
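
To make the walk-then-mixin idea concrete, here is a minimal sketch that does not depend on Pegged at all. The Node struct is a hypothetical stand-in for Pegged's ParseTree (which has more fields), the tree is built by hand instead of being parsed, and the rule names "Num", "Add" and "Mul" are invented for the example. Only the mechanism is the point: a recursive function builds a D source string at compile time, and mixin turns it into code.

```d
import std.stdio : writeln;

// Hypothetical stand-in for a parse tree node (not Pegged's ParseTree).
struct Node
{
    string name;      // rule name, e.g. "Add" or "Num"
    string[] matches; // matched text, used here for leaves only
    Node[] children;  // sub-trees
}

// Recursive walker: turns a tree into a D expression, as a string.
string codeMaker(Node n)
{
    switch (n.name)
    {
        case "Num":
            return n.matches[0];
        case "Add":
            return "(" ~ codeMaker(n.children[0]) ~ " + "
                       ~ codeMaker(n.children[1]) ~ ")";
        case "Mul":
            return "(" ~ codeMaker(n.children[0]) ~ " * "
                       ~ codeMaker(n.children[1]) ~ ")";
        default:
            assert(0, "unknown node: " ~ n.name);
    }
}

// Built by hand; with a real grammar this would be the result of CT parsing.
enum tree = Node("Add", [], [
    Node("Num", ["1"]),
    Node("Mul", [], [Node("Num", ["2"]), Node("Num", ["3"])])
]);

// codeMaker runs at compile time and its result is mixed in as an expression.
enum int result = mixin(codeMaker(tree));
static assert(result == 7);

void main()
{
    writeln(codeMaker(tree)); // the generated source: (1 + (2 * 3))
    writeln(result);
}
```

The same codeMaker can also be called at runtime, which is all you need if the output is a string in another language.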

If what you need is generating Javascript code, no need to do that at compile-time: you can assemble the JS code as a string at runtime and then write it to a file somewhere, I guess.

> Something else I was thinking about is comparing the resulting syntax trees to check that the grammar was ported correctly. It would be great if different grammar parsers could output their result in a common format (JSON or XML, for example), so that the results could be compared for equality. But different parsers have different internal tree formats, so maybe some transformation could be created. With this feature it would be possible to tell whether the parser is working correctly.

Different formats and also different languages. I don't see how you can compare a parse tree that's a D object and another tree made by dustjs: you never see the AST produced by dust, you only see the resulting JS code.
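
That said, the D half of the "common format" idea is easy to sketch. The following assumes that the other parser's tree could somehow be dumped as JSON with the same field names, which is exactly the part that is not a given. Node and the field names "name", "matches" and "children" are hypothetical choices for this example, not a format used by Pegged or dustjs; the JSON handling uses std.json from Phobos.

```d
import std.json : JSONValue, parseJSON;
import std.stdio : writeln;

// Hypothetical stand-in for a parse tree node (not Pegged's ParseTree).
struct Node
{
    string name;
    string[] matches;
    Node[] children;
}

// Convert a tree to a JSON value, recursively.
JSONValue toJson(Node n)
{
    JSONValue[] kids;
    foreach (c; n.children)
        kids ~= toJson(c);

    JSONValue[string] obj;
    obj["name"] = JSONValue(n.name);
    obj["matches"] = JSONValue(n.matches);
    obj["children"] = JSONValue(kids);
    return JSONValue(obj);
}

void main()
{
    auto tree = Node("Body", [], [Node("Text", ["hello"])]);

    // A reference dump, as it might come from another tool (made up here).
    auto reference = parseJSON(
        `{"name":"Body","matches":[],"children":[
            {"name":"Text","matches":["hello"],"children":[]}]}`);

    writeln(toJson(tree).toPrettyString());
    writeln(toJson(tree) == reference); // structural comparison
}
```

Comparing JSONValue objects rather than strings keeps the check independent of whitespace and key order in the dump.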

> Different formats and also different languages. I don't see how you can compare a parse tree that's a D object and another tree made by dustjs: you never see the AST produced by dust, you only see the resulting JS code.

Yes. That's a point. Thanks for all the explanations. I'll try to make something useful of it.

On Tuesday, 5 August 2014 at 08:13:25 UTC, Uranuz wrote:
>> Different formats and also different languages. I don't see how you can compare a parse tree that's a D object and another tree made by dustjs: you never see the AST produced by dust, you only see the resulting JS code.
>
> Yes. That's a point. Thanks for all the explanations. I'll try to make something useful of it.

Are multiline comments available inside a PEGGED grammar? As far as I understand, inline comments are introduced with the # sign.

> Is there multiline comments available inside PEGGED template? As far as I understand inline comments are set via # sign.

No, no multiline comment. That's based on the original PEG grammar, which allows only #-comments.

What I was thinking about is the possibility of replacing the ParseTree struct with a user-defined version of it. I was thinking about passing the tree type as a template parameter to the grammar:

grammar!(MyParseTree)("
Arithmetic:
...
");

Or something like this. I think changing the source code of the library in order to change the tree type is not good; it should be set as a parameter. If it's already implemented, please let me know, because I couldn't find it. Also, some minimal interface for ParseTree needs to be described in the documentation (maybe the ability to make it a class would be good, in order to have polymorphic nodes with different methods and properties).

Of course I can transform the PEGGED syntax tree into another form of tree specific to the usage domain. But if that tree doesn't differ significantly from the PEGGED tree (for example, a node just has a couple of additional properties), the transformation only costs additional memory and CPU time. If the domain-specific tree differs a lot, then of course we need some useful way to transform trees.

On Wed, Aug 6, 2014 at 9:09 AM, Uranuz via Digitalmars-d-learn <digitalmars-d-learn@puremagic.com> wrote:
> What I was thinking about is possibility to change ParseTree struct with user-defined version of it. And I was thinking about setting tree type as template parameter to grammar:
>
> grammar!(MyParseTree)("
> Arithmetic:
> ...
> ");

That's already the case, look at https://github.com/PhilippeSigaud/Pegged/blob/master/pegged/parser.d on line 134, for example.

struct GenericPegged(TParseTree) { ... }

And then, the library aliases a standard version (line 1531):

alias GenericPegged!(ParseTree).Pegged Pegged;

Which means the customer-facing Pegged parser is in fact a generic parser specialized on ParseTree. But you can substitute your own parse tree.

That's of course the same for all parsers: GenericXXX(TParseTree) {...} and then alias XXX = GenericXXX!(ParseTree);

But you're right, it's not really documented and I don't have a template checking whether TParseTree respects some static interface. This parameterization was asked for by someone on github, but I don't think they used it in the end.

ParseTree is described here: https://github.com/PhilippeSigaud/Pegged/wiki/Parse-Trees

Maybe we can continue this thread by private mail? I'm not sure people on the D list are that interested in the internals of a library.
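
For readers following along, the GenericXXX(TParseTree) pattern can be illustrated in isolation. This is only a sketch of the technique, not Pegged's actual code: GenericParser, the literal rule, both tree structs and the isParseTree check are all hypothetical names made up here, and the set of required fields is a guess at what a "static interface" for a tree type might contain.

```d
import std.algorithm.searching : startsWith;
import std.stdio : writeln;

// Default tree type (hypothetical, far smaller than Pegged's ParseTree).
struct ParseTree
{
    string name;
    bool successful;
    string[] matches;
    ParseTree[] children;
}

// A user-defined tree with one extra, domain-specific property.
struct MyParseTree
{
    string name;
    bool successful;
    string[] matches;
    MyParseTree[] children;
    int line;
}

// Hypothetical static-interface check for tree types.
enum isParseTree(T) = is(typeof(T.init.name) : string)
    && is(typeof(T.init.successful) : bool)
    && is(typeof(T.init.matches) : string[])
    && is(typeof(T.init.children) : T[]);

// Generic parser, parameterized on the tree type it produces.
struct GenericParser(TParseTree) if (isParseTree!TParseTree)
{
    // Toy rule: succeeds if the input starts with the given literal.
    static TParseTree literal(string lit)(string input)
    {
        TParseTree result;
        result.name = "literal";
        result.successful = input.startsWith(lit);
        if (result.successful)
            result.matches = [lit];
        return result;
    }
}

// The "standard" parser is the generic one specialized on ParseTree.
alias Parser = GenericParser!(ParseTree);
// A user can specialize it on another tree type instead.
alias MyParser = GenericParser!(MyParseTree);

static assert(!isParseTree!int);

void main()
{
    writeln(Parser.literal!"{"("{name}"));
    writeln(MyParser.literal!"{"("{name}"));
}
```

With a constraint like isParseTree on the template, passing an unsuitable type fails at instantiation with a readable error rather than somewhere deep inside the parser.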
