October 18, 2006 Anyone interested in a Spirit for D?
Along the lines of Don's regexp template metaprograms, is anyone interested in a Spirit-like parser generator capability in D?

http://spirit.sourceforge.net/

Apparently, someone has gotten Spirit to work with C#:

http://www.codeproject.com/useritems/spart.asp
October 18, 2006 Re: Anyone interested in a Spirit for D?
Posted in reply to Walter Bright | Walter Bright wrote:
> Along the lines of Don's regexp template metaprograms, is anyone interested in a Spirit-like parser generator capability in D?
>
> http://spirit.sourceforge.net/
>
> Apparently, someone has gotten Spirit to work with C#:
>
> http://www.codeproject.com/useritems/spart.asp

Now there's an idea!

Words of caution to follow:

FWIW, I looked into doing this years ago, and didn't get too far. The biggest hurdle, aside from the limitations of templates at the time, was a lack of unary operators to override. In particular, not being able to override unary '*' and '!' caused some cosmetic problems.

The only other major hangup I had was not having IFTI so I could instantiate templates transparently. This feature alone could close the gap on most of Spirit's usage of C++ templates. At a minimum, it means that a D programmer could get very close to the cosmetic appeal of Spirit (operator problems aside).

Don't get me wrong: I'm not a nay-sayer here. I think this is a very doable and worthwhile suggestion by Walter. Folks should take it seriously. But it will require some design compromises and changes from the original - IMO, it'll probably require more of a re-write than a port.

--
- EricAnderton at yahoo
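For readers who have not met the term, IFTI is implicit function template instantiation: the compiler deduces template arguments from the call's arguments. A minimal sketch in present-day D of why that matters for a Spirit-like library follows. The names `Ch`, `Sequence` and `seq` are hypothetical, made up for this illustration, and are not taken from Spirit or from any existing D library.

```d
// Hypothetical combinator: matches one specific character.
struct Ch
{
    char c;

    // Returns the number of characters matched at the front of s, or -1.
    ptrdiff_t match(string s) const
    {
        return (s.length > 0 && s[0] == c) ? 1 : -1;
    }
}

// Hypothetical combinator: matches `first`, then `second`.
struct Sequence(A, B)
{
    A first;
    B second;

    ptrdiff_t match(string s) const
    {
        immutable n = first.match(s);
        if (n < 0)
            return -1;
        immutable m = second.match(s[cast(size_t) n .. $]);
        return m < 0 ? -1 : n + m;
    }
}

// With IFTI, A and B are deduced from the arguments, so the caller
// never has to spell out Sequence!(Sequence!(Ch, Ch), Ch) by hand.
Sequence!(A, B) seq(A, B)(A a, B b)
{
    return Sequence!(A, B)(a, b);
}

unittest
{
    auto abc = seq(seq(Ch('a'), Ch('b')), Ch('c'));
    static assert(is(typeof(abc) == Sequence!(Sequence!(Ch, Ch), Ch)));
    assert(abc.match("abcd") == 3);
    assert(abc.match("abx") == -1);
}
```

Without deduction, every nested combinator would have to be written with its full template argument list, which is the cosmetic gap the post describes.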
October 18, 2006 Re: Anyone interested in a Spirit for D?
Posted in reply to Pragma | Pragma wrote:
> Walter Bright wrote:
>> Along the lines of Don's regexp template metaprograms, is anyone interested in a Spirit-like parser generator capability in D?
>>
>> http://spirit.sourceforge.net/
>>
>> Apparently, someone has gotten Spirit to work with C#:
>>
>> http://www.codeproject.com/useritems/spart.asp
>
> Now there's an idea!
>
> Words of caution to follow:
>
> FWIW, I looked into doing this years ago, and didn't get too far. The biggest hurdle, aside from the limitations of templates at the time, was a lack of unary operators to override. In particular, not being able to override unary '*' and '!' caused some cosmetic problems.
>
> The only other major hangup I had was not having IFTI so I could instantiate templates transparently. This feature alone could close the gap on most of Spirit's usage of C++ templates. At a minimum, it means that a D programmer could get very close to the cosmetic appeal of Spirit (operator problems aside).
>
> Don't get me wrong: I'm not a nay-sayer here. I think this is a very doable and worthwhile suggestion by Walter. Folks should take it seriously. But it will require some design compromises and changes from the original - IMO, it'll probably require more of a re-write than a port.
>
Yeah, it's a good idea, but my first thought was "is that even possible?" It won't be Spirit, but a lexer in the, uh, spirit of Spirit :)
October 18, 2006 Re: Anyone interested in a Spirit for D?
Posted in reply to Pragma | Pragma wrote:
> Words of caution to follow:
>
> FWIW, I looked into doing this years ago, and didn't get too far. The biggest hurdle, aside from the limitations of templates at the time, was a lack of unary operators to override. In particular, not being able to override unary '*' and '!' caused some cosmetic problems.

I think the operator overloading aspect of Spirit is only a minor part of the implementation - in fact, just a pretty shell around it. It could all be done using functional notation.

> The only other major hangup I had was not having IFTI so I could instantiate templates transparently. This feature alone could close the gap on most of Spirit's usage of C++ templates. At a minimum, it means that a D programmer could get very close to the cosmetic appeal of Spirit (operator problems aside).

I think it would be worth looking at again. The C# version of it doesn't use operator overloading or even templates.

> Don't get me wrong: I'm not a nay-sayer here. I think this is a very doable and worthwhile suggestion by Walter. Folks should take it seriously. But it will require some design compromises and changes from the original - IMO, it'll probably require more of a re-write than a port.

I think it would be a complete rewrite.

The reason I'm interested in it for D is that:

1) it's a pretty cool library
2) it's one of Boost's most popular ones
3) it's been touted as a reason why D is no good and C++ roolz
4) it's popular enough to have been a driving force behind improvements in C++ compilers
5) it would surely improve D
6) and last, and most importantly, it's very useful
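To make the "functional notation" remark concrete, here is a rough sketch in present-day D of combinators written as plain functions, with no operator overloading at all. Everything in it (the `Parser` alias, `ch`, `digit`, `seq`, `star`, and the convention of returning -1 on failure) is an assumption chosen for this illustration, not Spirit's design or an existing D API.

```d
import std.ascii : isDigit;

// A parser takes the input and a start index and returns the index just
// past the match, or -1 when it does not match (convention assumed here).
alias Parser = ptrdiff_t delegate(string input, size_t pos);

// Matches one specific character.
Parser ch(char c)
{
    return delegate ptrdiff_t(string s, size_t i) {
        return (i < s.length && s[i] == c) ? cast(ptrdiff_t)(i + 1) : -1;
    };
}

// Matches a single decimal digit.
Parser digit()
{
    return delegate ptrdiff_t(string s, size_t i) {
        return (i < s.length && isDigit(s[i])) ? cast(ptrdiff_t)(i + 1) : -1;
    };
}

// Matches a, then b (the role Spirit gives to '>>').
Parser seq(Parser a, Parser b)
{
    return delegate ptrdiff_t(string s, size_t i) {
        immutable mid = a(s, i);
        return mid < 0 ? -1 : b(s, cast(size_t) mid);
    };
}

// Matches p zero or more times (the role Spirit gives to prefix '*').
Parser star(Parser p)
{
    return delegate ptrdiff_t(string s, size_t i) {
        auto next = p(s, i);
        // Stop on failure (-1) or when p matched without consuming input.
        while (next > cast(ptrdiff_t) i)
        {
            i = cast(size_t) next;
            next = p(s, i);
        }
        return cast(ptrdiff_t) i;
    };
}

// digit >> *(ch(',') >> digit), written as function calls.
Parser digitList()
{
    return seq(digit(), star(seq(ch(','), digit())));
}

unittest
{
    assert(digitList()("1,2,3", 0) == 5);
    assert(digitList()("1,,3", 0) == 1); // only the leading "1" matches
    assert(digitList()("", 0) == -1);
}
```

The grammar reads a little less like EBNF than the operator form, but every combinator has a searchable name.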
October 18, 2006 Re: Anyone interested in a Spirit for D?
Posted in reply to Walter Bright | Walter Bright wrote:
> Along the lines of Don's regexp template metaprograms, is anyone interested in a Spirit-like parser generator capability in D?
>
> http://spirit.sourceforge.net/

Now that would be useful I think.

Take this example from the Spirit intro of code to make a parser for a list of real numbers:

    r = real_p >> *(ch_p(',') >> real_p);

In EBNF that's just:

    real_number ("," real_number)*

In C++ you have to get creative with the operator overloading there (prefix '*' used to denote the regexp Kleene star, '>>' used to separate tokens).

But given Don's experiments with compile-time text parsing in D, it's conceivable that in D the above parser could just be created with:

    r = make_parser("real_number (',' real_number)*");

I.e. use the EBNF version directly in a string literal that gets parsed at compile time.
That would be pretty cool.

Though, you know, even thinking about Boost::Spirit, I have to wonder if it really is necessary. From the intro it says that its primary use is "extremely small micro-parsers", not a full blown language processor. But if that's the target then the runtime overhead of translating the EBNF description to a parser would be pretty trivial. So I guess the real benefit of a compile-time parser-generator is that your grammar can be _verified_ at compile-time.

I wonder if it would be any easier to make a compile-time grammar verifier than a full blown parser generator? Then just do the parser-generating at runtime.

---

heh heh, this is fun. From one of the code examples:

    typedef
        alternative<alternative<space_parser, sequence<sequence<
        strlit<const char*>, kleene_star<difference<anychar_parser,
        chlit<char> > > >, chlit<char> > >, sequence<sequence<
        strlit<const char*>, kleene_star<difference<anychar_parser,
        strlit<const char*> > > >, strlit<const char*> > >
    skip_t;
    skip_t skip;

That monster type signature was determined by deliberately forcing a compiler error and then copy-pasting the type from the resulting error message. Too funny.

(Note that this is given not as the main way to use the library but as a way to eliminate some of the code bloat all the templates lead to -- another reason to not try to generate the parser at compile-time, but just verify it.)

At any rate the Spirit documentation seems to be rife with juicy comments of the form "yes it looks funky, but we're stuck with C++ here". So it's a good place to get ideas for how to make things better.

--bb
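The "verify at compile time, generate at runtime" idea can be sketched in present-day D, which can run ordinary functions during compilation. The example below is only a well-formedness check (balanced parentheses, closed quotes, a '*' that has something to repeat), not a real grammar verifier, and `checkGrammar`, `Grammar` and `makeParser` are hypothetical names invented for the sketch.

```d
// Returns null when the grammar text passes the checks, else a message.
string checkGrammar(string g)
{
    int depth = 0;
    bool inQuote = false;
    bool haveOperand = false; // something a postfix '*' could apply to

    foreach (c; g)
    {
        if (inQuote)
        {
            if (c == '\'')
            {
                inQuote = false;
                haveOperand = true;
            }
            continue;
        }
        switch (c)
        {
            case '\'':
                inQuote = true;
                break;
            case '(':
                ++depth;
                haveOperand = false;
                break;
            case ')':
                if (depth == 0) return "unmatched ')'";
                if (!haveOperand) return "empty group";
                --depth;
                haveOperand = true;
                break;
            case '*':
                if (!haveOperand) return "'*' has nothing to repeat";
                break;
            case ' ':
                break;
            default:
                haveOperand = true;
                break;
        }
    }
    if (inQuote) return "unterminated quote";
    if (depth != 0) return "unmatched '('";
    return null;
}

// Placeholder for whatever a runtime parser generator would consume.
struct Grammar
{
    string text;
}

// The grammar is a template argument, so the check runs during compilation
// and a malformed grammar is rejected before the program ever runs.
template makeParser(string g)
{
    static assert(checkGrammar(g).length == 0,
                  "bad grammar: " ~ checkGrammar(g));
    enum makeParser = Grammar(g);
}

unittest
{
    auto r = makeParser!"real_number (',' real_number)*";
    assert(r.text == "real_number (',' real_number)*");

    static assert(checkGrammar("real_number (',' real_number") == "unmatched '('");
    static assert(!__traits(compiles, makeParser!"real_number (',' real_number"));
}
```

Turning the accepted string into a working parser would still happen at runtime in this arrangement, which is the split the post proposes.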
October 18, 2006 Re: Anyone interested in a Spirit for D?
Posted in reply to Walter Bright | Walter Bright wrote:
> I think it would be worth looking at again. The C# version of it doesn't use operator overloading or even templates.

Huh. Very interesting. Here's the example:

    // spirit:
    num_p >> *( ch_p(',') >> num_p)

    // C#
    Ops.Seq( Prims.Digit, Ops.Start( Ops.Seq(Prims.Ch(','), Prims.Digit)))

Though it's definitely not as easy to read, I think I might actually prefer the C# version. Part of the annoyance with Boost's super-clever use of operator-overloading is that it can be a real pain to discover things because they don't have real names.

I bet the C# version could be compacted with some aliases or imports (assuming C# has these):

    Seq( Digit, Start( Seq(Ch(','), Digit)))

That doesn't look too bad to me.

Still it would rock the world if you could just do:

    parser("digit (',' digit)*");

and have the grammar be verified at compile-time.

> I think it would be a complete rewrite.
>
> The reason I'm interested in it for D is that:
>
> 1) it's a pretty cool library
> 2) it's one of Boost's most popular ones
> 3) it's been touted as a reason why D is no good and C++ roolz
> 4) it's popular enough to have been a driving force behind improvements in C++ compilers
> 5) it would surely improve D
> 6) and last, and most importantly, it's very useful

Excellent reasons.

--bb
October 18, 2006 Re: Anyone interested in a Spirit for D?
Posted in reply to Bill Baxter | Bill Baxter wrote:
> Walter Bright wrote:
>> I think it would be worth looking at again. The C# version of it doesn't use operator overloading or even templates.
>
> Huh. Very interesting. Here's the example:
>
> // spirit:
> num_p >> *( ch_p(',') >> num_p)
>
> // C#
> Ops.Seq( Prims.Digit, Ops.Start( Ops.Seq(Prims.Ch(','), Prims.Digit)))
>
> Though it's definitely not as easy to read, I think I might actually prefer the C# version. Part of the annoyance with Boost's super-clever use of operator-overloading is that it can be a real pain to discover things because they don't have real names.
>
> I bet the C# version could be compacted with some aliases or imports (assuming C# has these):
> Seq( Digit, Start( Seq(Ch(','), Digit)))
>
> That doesn't look too bad to me.
>
> Still it would rock the world if you could just do:
> parser("digit (',' digit)*");
> and have the grammar be verified at compile-time.
>
>> I think it would be a complete rewrite.
>>
>> The reason I'm interested in it for D is that:
>>
>> 1) it's a pretty cool library
>> 2) it's one of Boost's most popular ones
>> 3) it's been touted as a reason why D is no good and C++ roolz
>> 4) it's popular enough to have been a driving force behind improvements in C++ compilers
>> 5) it would surely improve D
>> 6) and last, and most importantly, it's very useful
>
> Excellent reasons.
>
> --bb
All that is cool, but (I know I am the dummy here) readability as in BNF is something that eludes me. Better to go for Coco?
richard
October 18, 2006 Re: Anyone interested in a Spirit for D?
Posted in reply to Bill Baxter | Bill Baxter wrote:
> But given Don's experiments with compile-time text parsing in D, it's conceivable that in D the above parser could just be created with:
>
> r = make_parser("real_number (',' real_number)*");
>
> I.e. use the EBNF version directly in a string literal that gets parsed at compile time.
> That would be pretty cool.

Yes, it would be. But there's a catastrophic problem with it. Spirit enables code snippets to be attached to terminals by overloading the [] operator. If the EBNF was all in a string literal, this would be impossible.

> Though, you know, even thinking about Boost::Spirit, I have to wonder if it really is necessary. From the intro it says that its primary use is "extremely small micro-parsers", not a full blown language processor. But if that's the target then the runtime overhead of translating the EBNF description to a parser would be pretty trivial. So I guess the real benefit of a compile-time parser-generator is that your grammar can be _verified_ at compile-time.

I disagree. I think the real benefit is avoiding reliance on an add-on tool. Such tools are a nuisance; making archival, maintenance, etc., clumsy.

> At any rate the Spirit documentation seems to be rife with juicy comments of the form "yes it looks funky, but we're stuck with C++ here". So it's a good place to get ideas for how to make things better.

Yup.
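For reference, the `rule[action]` form mentioned above can be imitated in D, because `opIndex` accepts an argument of any type, including a delegate. This is a small sketch under that assumption only; the `Digits` rule and its `parse` method are hypothetical and much simpler than anything in Spirit.

```d
import std.ascii : isDigit;

// Hypothetical rule: matches a run of digits and optionally reports it.
struct Digits
{
    void delegate(string matched) action; // semantic action, may be null

    // rule[action] returns a copy of the rule with the action attached,
    // imitating the way Spirit uses operator[].
    Digits opIndex(void delegate(string matched) act)
    {
        return Digits(act);
    }

    // Returns the number of characters consumed from the front of input.
    size_t parse(string input)
    {
        size_t n = 0;
        while (n < input.length && isDigit(input[n]))
            ++n;
        if (n > 0 && action !is null)
            action(input[0 .. n]);
        return n;
    }
}

unittest
{
    string seen;
    Digits digits;
    auto rule = digits[(string m) { seen = m; }];

    assert(rule.parse("2006-10-18") == 4);
    assert(seen == "2006");
}
```

The action is ordinary code sitting next to the rule it belongs to, which is exactly what a grammar held entirely inside a string literal cannot offer directly.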
October 18, 2006 Re: Anyone interested in a Spirit for D?
Posted in reply to Bill Baxter | Bill Baxter wrote:
> [snip]
>
> Though, you know, even thinking about Boost::Spirit, I have to wonder if it really is necessary. From the intro it says that its primary use is "extremely small micro-parsers", not a full blown language processor. But if that's the target then the runtime overhead of translating the EBNF description to a parser would be pretty trivial. So I guess the real benefit of a compile-time parser-generator is that your grammar can be _verified_ at compile-time.

From what I gather, that's the major benefit, other than a "self-documenting design". All the "prettiness" of using a near EBNF syntax in C++ code gets you close enough to actual EBNF that it's apparent what it does and how it functions.

However, the only problem with composing this as an EBNF compile-time parser is that you can't attach actions to arbitrary terminals without some sort of binding lookup. I'm not saying it's impossible, but it'll be a little odd to use until we get some stronger reflection support.

But what you're suggesting could just as easily be a Compile-Time rendition of Enki. It's quite possible to pull off. Especially if you digest the grammar one production at a time so as to side-step any recursion depth limitations when processing the parser templates. :)

    auto grammar = new Parser(
        Production!("Number ::= NumberPart {NumberPart}",
            // binding attached to production ('all' is supplied by default?)
            function void(char[] all){ writefln("Parsed Number: %s",all); }
        ),
        Production!("NumberPart ::= Sep | Digit "),
        Production!("Digit ::= 0|1|2|3|4|5|6|7|8|9"),
        Production!("Sep ::= '_' | ','")
    );

    // call specifying start production
    grammar.parse("Number",myInput);

Depending on how you'd like the call bindings to go, you could probably go about as complex as what Enki lets you get away with. But you'll have to accept a 'soft' binding in there someplace, hence you lose the type/name checking benefits of being at compile time.

> I wonder if it would be any easier to make a compile-time grammar verifier than a full blown parser generator? Then just do the parser-generating at runtime.

Maybe I don't fully understand, but I don't think there's a gain there. If you've already gone through the gyrations of parsing the BNF expression, it's hardly any extra trouble to do something at each step of the resulting parse tree*.

(* of course template-based parsers use the call-tree as a parse-tree but that's beside the point)

--
- EricAnderton at yahoo
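On the worry about losing name checking with a 'soft' binding: in present-day D the production name can be passed as a template argument, so a misspelled name can be rejected during compilation. The sketch below assumes productions of the form "Name ::= body" and uses hypothetical names throughout (`productionNames`, `contains`, `Grammar`, `bind`). It checks names only and does not parse anything.

```d
// Collects the left-hand names from productions shaped like "Name ::= body".
string[] productionNames(string[] productions)
{
    string[] names;
    foreach (p; productions)
    {
        size_t end = 0;
        while (end < p.length && p[end] != ' ' && p[end] != ':')
            ++end;
        names ~= p[0 .. end];
    }
    return names;
}

bool contains(string[] names, string wanted)
{
    foreach (n; names)
        if (n == wanted)
            return true;
    return false;
}

// Hypothetical grammar holder; assumes at least one production is given.
struct Grammar(productions...)
{
    // Evaluated during compilation from the production strings.
    enum names = productionNames([productions]);

    void delegate(string)[string] bindings;

    // The production name is a compile-time argument, so binding to a
    // name that is not in the grammar fails to compile.
    void bind(string name)(void delegate(string) action)
    {
        static assert(contains(names, name), "no production named " ~ name);
        bindings[name] = action;
    }
}

unittest
{
    alias G = Grammar!("Number ::= Digit {Digit}",
                       "Digit ::= 0|1|2|3|4|5|6|7|8|9");
    static assert(G.names == ["Number", "Digit"]);
    static assert(!contains(G.names, "Numbr"));

    G g;
    string seen;
    g.bind!"Number"((string all) { seen = all; });
    g.bindings["Number"]("42");
    assert(seen == "42");
}
```

The argument types of the action are still fixed by hand here, so this recovers the name check but not full type checking of what each production passes to its binding.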
October 18, 2006 Re: Anyone interested in a Spirit for D?
Posted in reply to Walter Bright | Walter Bright wrote:
> Bill Baxter wrote:
>> But given Don's experiments with compile-time text parsing in D, it's conceivable that in D the above parser could just be created with:
>>
>> r = make_parser("real_number (',' real_number)*");
>>
>> I.e. use the EBNF version directly in a string literal that gets parsed at compile time.
>> That would be pretty cool.
>
> Yes, it would be. But there's a catastrophic problem with it. Spirit enables code snippets to be attached to terminals by overloading the [] operator. If the EBNF was all in a string literal, this would be impossible.

But maybe you could allow the user to access those terminals via strings:

    r.lookup_terminal("real_number").add_action(&func);

or just

    r.add_action("real_number", &func);

>> So I guess the real benefit of a compile-time parser-generator is that your grammar can be _verified_ at compile-time.
>
> I disagree. I think the real benefit is avoiding reliance on an add-on tool. Such tools are a nuisance; making archival, maintenance, etc., clumsy.

Hmm. Well if no external tools is the main benefit, then simply making Lex/Yacc (or more appropriately, Enki) into a library should be sufficient. I guess you do need some way to attach code to terminals at runtime, but that's doable via various existing callback mechanisms. The machinery needed is basically the same as signals/slots. You just need to be able to do something like

    connect(ASTreeNode.accept(), mycode);

at runtime.

Then you should be able to get this kind of thing to work:

    auto r = make_parser_node("real_number (',' real_number)*");
    r.add_action("real_number", &func);

using nothing but runtime parsing of the grammar to build your AST. No fancy templates needed, except perhaps in adding the callback to &func. That kind of thing could be done in C++ too.

--bb
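The by-name callback idea can be sketched with nothing more than an associative array of delegates. In this illustration the grammar `real_number (',' real_number)*` is hard-coded rather than built from the string, the number recognizer is deliberately crude, and `ListParser` and `addAction` are hypothetical names standing in for the `make_parser_node` / `add_action` calls in the post.

```d
import std.ascii : isDigit;

class ListParser
{
    // terminal name -> callbacks to run whenever that terminal matches
    private void delegate(string)[][string] actions;

    void addAction(string terminal, void delegate(string) act)
    {
        actions[terminal] ~= act;
    }

    private void fire(string terminal, string text)
    {
        if (auto list = terminal in actions)
            foreach (act; *list)
                act(text);
    }

    // Crude real_number: a non-empty run of digits and dots.
    private bool number(string s, ref size_t i)
    {
        immutable start = i;
        while (i < s.length && (isDigit(s[i]) || s[i] == '.'))
            ++i;
        if (i == start)
            return false;
        fire("real_number", s[start .. i]);
        return true;
    }

    // Hard-coded recognizer for: real_number (',' real_number)*
    bool parse(string s)
    {
        size_t i = 0;
        if (!number(s, i))
            return false;
        while (i < s.length && s[i] == ',')
        {
            ++i;
            if (!number(s, i))
                return false;
        }
        return i == s.length;
    }
}

unittest
{
    string[] seen;
    void record(string text) { seen ~= text; }

    auto r = new ListParser;
    r.addAction("real_number", &record);

    assert(r.parse("1.5,2,3.25"));
    assert(seen == ["1.5", "2", "3.25"]);
}
```

Note that a misspelled terminal name in `addAction` is silently accepted here and the callback simply never fires, which is the cost of doing the lookup at runtime.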