October 25, 2018 Pegged: spaces | ||||
---|---|---|---|---|
| ||||
Is Pegged supposed to consume whitespace automatically? I have some text like "abvdfs dfddf" and I have made some rules to divide the two parts at the space. The sub-rules are complex, but none of them match a space (there is no ' ' literal; the definitions do contain spaces, but only to separate the sub-rules). The parser, though, essentially ignores the space. Sometimes it seems to work with certain rule constructions and other times it doesn't. Basically, none of my sub-rules match a space, and the main rule is A ' '+ B, where A attempts to parse the first half and B the second half. But A consumes the whole string! A does not consume any spaces, though (no . usage or ' ' in the rule definitions that A uses)! I'd like to be able to limit the application of a rule to a substring, which would sort of emulate splitting the string. (!' ':A) ' '+ B hypothetically would parse each char for A but terminate the rule when it encounters a space, before A gets to see the space. Is this possible with Pegged? It's sort of a lookahead, but it has to be done for each character, rather than (!' ' A), which would only check the first character and then continue on with A. |
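The per-character lookahead asked about here is usually written in PEG notation as (!' ' .)+, which re-checks the lookahead before every single character. The sketch below emulates that expression in plain Python to show the intended behaviour. It is not Pegged code and does not use Pegged's API: the function names `word`, `spaces` and `split_line` are hypothetical, and the complex rule A is simplified to "any character", so a real A would need the same guard in front of each of its own terminals.

```python
def word(text, pos):
    """Emulate the PEG expression (!' ' .)+ starting at pos.

    Returns the position after the match, or None if not even one
    character could be consumed.
    """
    start = pos
    while pos < len(text):
        if text.startswith(" ", pos):  # !' ' : lookahead sees a space, stop
            break
        pos += 1                       # .    : consume any one character
    return pos if pos > start else None


def spaces(text, pos):
    """Emulate ' '+ : one or more literal spaces."""
    start = pos
    while text.startswith(" ", pos):
        pos += 1
    return pos if pos > start else None


def split_line(text):
    """Emulate Line <- Word ' '+ Word, requiring the whole input to match."""
    end_a = word(text, 0)
    if end_a is None:
        return None
    end_sp = spaces(text, end_a)
    if end_sp is None:
        return None
    end_b = word(text, end_sp)
    if end_b is None or end_b != len(text):
        return None
    return text[:end_a], text[end_sp:end_b]


print(split_line("abvdfs dfddf"))
```

Because the lookahead runs before each character, the first part can never run past a space, which is the "terminate before A gets to see the space" behaviour described in the post.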
October 25, 2018 Re: Pegged: spaces | ||||
---|---|---|---|---|
| ||||
Posted in reply to Michelle Long | Ignores spaces: <- Doesn't: < Concatenates results: <~ |
October 26, 2018 Re: Pegged: spaces | ||||
---|---|---|---|---|
| ||||
Posted in reply to Michelle Long | 25.10.2018 23:34, Michelle Long wrote:
> Ignores spaces: <-
>
> Doesn't: <
>
> Concatenates results: <~
>
>
Thank you for sharing your results!
|
October 27, 2018 Re: Pegged: spaces | ||||
---|---|---|---|---|
| ||||
Posted in reply to drug | On Friday, 26 October 2018 at 07:36:50 UTC, drug wrote:
> 25.10.2018 23:34, Michelle Long wrote:
>> Ignores spaces: <-
>>
>> Doesn't: <
>>
>> Concatenates results: <~
>>
>>
> Thank you for sharing your results!
I got it backwards when posting:
/*
<   (space arrow) consumes spaces before, between and after elements
<-  (standard arrow) does not consume spaces
<~  (squiggly arrow) concatenates the captures on the right-hand side of the arrow
<:  (colon arrow) drops the entire rule result (useful to ignore comments, for example)
<^  (keep arrow) applies the 'keep' operator to all subelements in a rule
/   binary operator - ordered choice (matches the first rule; if it fails, matches the next)
|   binary operator - longest-match alternation (tries the alternatives and keeps the longest match)
:   prefix operator - discards the match from the result but still requires it to succeed
*/
List is not complete, maybe I will update.
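To make the difference between the arrows and between the two choice operators concrete, here is a small Python emulation of the behaviour described in the list. It is only a sketch under stated assumptions: it is not Pegged's implementation or API, and the helpers `literal`, `skip_blanks`, `sequence`, `ordered_choice` and `longest_match` are hypothetical names invented for this illustration.

```python
def literal(s):
    """Parser for a literal string: returns the end position or None."""
    def parse(text, pos):
        return pos + len(s) if text.startswith(s, pos) else None
    return parse


def skip_blanks(text, pos):
    """Advance past any run of spaces (stand-in for implicit spacing)."""
    while text.startswith(" ", pos):
        pos += 1
    return pos


def sequence(*parts, space_arrow=False):
    """Match parts in order; space_arrow=True skips blanks around each part."""
    def parse(text, pos):
        for part in parts:
            if space_arrow:
                pos = skip_blanks(text, pos)
            pos = part(text, pos)
            if pos is None:
                return None
        return skip_blanks(text, pos) if space_arrow else pos
    return parse


def ordered_choice(*alts):
    """Emulate '/': the first alternative that matches wins."""
    def parse(text, pos):
        for alt in alts:
            end = alt(text, pos)
            if end is not None:
                return end
        return None
    return parse


def longest_match(*alts):
    """Emulate '|': try every alternative and keep the longest match."""
    def parse(text, pos):
        ends = [e for e in (alt(text, pos) for alt in alts) if e is not None]
        return max(ends) if ends else None
    return parse


a, b = literal("a"), literal("b")
print(sequence(a, b)("a b", 0))                    # standard-arrow style
print(sequence(a, b, space_arrow=True)("a b", 0))  # space-arrow style
print(ordered_choice(literal("ab"), literal("abc"))("abcd", 0))
print(longest_match(literal("ab"), literal("abc"))("abcd", 0))
```

One consequence in this emulation: with `space_arrow=True`, an explicit `literal(" ")` placed between `a` and `b` never matches, because the blanks have already been skipped by the time it runs. If Pegged's space arrow behaves as the list describes, a rule like A ' '+ B would presumably need the standard arrow for the same reason.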
What would be really cool is an autogrammar generator! Somehow it would look at text and figure out the grammar. It might require some human interaction, but it could figure out the rules that generate the specific grammar. Maybe a neural net could do it? Train it enough and it could be fairly accurate, and a human would just have to fix up small cases.
E.g., take a few million lines of C++ source code, pass them to the generator, and it pops out a grammar for it! It should be possible since it's usually 1-to-1 (for PEG grammars at least).
|
October 27, 2018 Re: Pegged: spaces | ||||
---|---|---|---|---|
| ||||
Posted in reply to Michelle Long | On Saturday, 27 October 2018 at 14:21:51 UTC, Michelle Long wrote:
> What would be really cool is an autogrammar generator! Somehow it would look at text and figure out the grammar. It might require some human interaction, but it could figure out the rules that generate the specific grammar. Maybe a neural net could do it? Train it enough and it could be fairly accurate, and a human would just have to fix up small cases.
>
> E.g., take a few million lines of C++ source code, pass them to the generator, and it pops out a grammar for it! It should be possible since it's usually 1-to-1 (for PEG grammars at least).
Something like Eclipse's Xtext would be nice; there, parts of the grammar are attached to OOP features in the code.
|