Jump to page: 1 2
Thread overview
Pry v0.3.1 is out!
Jan 15, 2017
Dmitry Olshansky
Jan 15, 2017
Dicebot
Jan 15, 2017
Dmitry Olshansky
Jan 15, 2017
Dicebot
Jan 15, 2017
Dmitry Olshansky
Jan 16, 2017
Bastiaan Veelo
Jan 16, 2017
Dmitry Olshansky
Jan 17, 2017
Bastiaan Veelo
Jan 17, 2017
Dmitry Olshansky
Jan 17, 2017
Bastiaan Veelo
Jan 17, 2017
Bastiaan Veelo
Jan 20, 2017
Dmitry Olshansky
January 15, 2017
Pry is a new pragmatic parser combinators library.

https://github.com/DmitryOlshansky/pry

(also available on Dub)

It's still in the early stages but I think it might be a good time to gauge some interest.

Two key areas of focus are (compared to say Pegged):
- performance, on par with hand-written code or die
- versatility, generating some goofy parse tree is not a goal, the goal is extraction of data the way the user specifies

For now it contains a modest example of a calculator app and a benchmark inspired by it. The interesting tidbit is that this version already includes an optimization that typically prevents recursive descent parser from falling into exponential behavior.

Future directions:
- actually add grammar layer on top of combinators, it is fancy but useful
- more built-in parsers to play with, including some stuff inspired by std.regex
- some goodies for usability, e.g. shortcuts to avoid constructing Stream out of string etc.

---
Dmitry Olshansky
January 15, 2017
Sounds intriguing!

On 01/15/2017 01:26 AM, Dmitry Olshansky wrote:
> - versatility, generating some goofy parse tree is not a goal, the goal is extraction of data the way the user specifies

Can you show an example of what you have in mind for this?




January 15, 2017
On 1/15/17 12:43 PM, Dicebot wrote:
> Sounds intriguing!
>
> On 01/15/2017 01:26 AM, Dmitry Olshansky wrote:
>> - versatility, generating some goofy parse tree is not a goal, the goal
>> is extraction of data the way the user specifies
>
> Can you show an example of what you have in mind for this?
>

Each parser has a value type that it extracts from the stream. It can be an AST node but doesn't have to. For instance, taking a page form the calculator example:

auto expr = dynamic!int;
auto primary = any(
	range!('0', '9').rep.map!(x => x.to!int),
	seq(tk!'(', expr, tk!')').map!(x => x[1])
);


Here 'expr' is a parser that produces an int that will be defined later. Then we see how digits are converted to string: 'range' has value of parsed digit, rep tells to apply it repeatedly and use the slice of input as the value. Finally map converts parser producing slice to parser producing int.

Similarly the result of 'seq' is a tuple and map takes the middle one ('expr'), that produces an int.


I could have wasted time by creating nodes and assigning values in the map functions but if you just want the result of calculation it's all moot.

---
Dmitry Olshansky
January 15, 2017
On 1/15/17 2:26 AM, Dmitry Olshansky wrote:
> Pry is a new pragmatic parser combinators library.
>
[snip]
> Two key areas of focus are (compared to say Pegged):
> - performance, on par with hand-written code or die

Actually testing the latest version with LDC I found out that
handwritten code is a bit *slower*. Beats me, as I spent quite some time
laying out that handwritten stuff.

All in all, this makes me confident that I soon will never have to
write parsers by hand, the last nebulous reason is out.

---
Dmitry Olshansky

January 15, 2017
On Sunday, 15 January 2017 at 13:14:45 UTC, Dmitry Olshansky wrote:
> I could have wasted time by creating nodes and assigning values in the map functions but if you just want the result of calculation it's all moot.

Thanks for explanation! This is indeed very promising and much in spirit of D selling point of zero-overhead convenience abstractions.
January 16, 2017
On Sunday, 15 January 2017 at 01:26:07 UTC, Dmitry Olshansky wrote:
> Pry is a new pragmatic parser combinators library.
>
> https://github.com/DmitryOlshansky/pry

Interesting. How about left-recursion? (I added support for left-recursive grammars to Pegged.)
January 16, 2017
On 1/16/17 1:29 AM, Bastiaan Veelo wrote:
> On Sunday, 15 January 2017 at 01:26:07 UTC, Dmitry Olshansky wrote:
>> Pry is a new pragmatic parser combinators library.
>>
>> https://github.com/DmitryOlshansky/pry
>
> Interesting. How about left-recursion? (I added support for
> left-recursive grammars to Pegged.)

I think left-recursion is better handled at the grammar level.
What I currently have is parser combinators level where adding
this transformation is awkward and too much magic IMO.

However I plan to add a grammar on top of combinators, and yes handling left-recursive grammars is going to be an interesting challenge.

---
Dmitry Olshansky
January 17, 2017
On Monday, 16 January 2017 at 22:29:01 UTC, Dmitry Olshansky wrote:
> I think left-recursion is better handled at the grammar level.
> What I currently have is parser combinators level where adding
> this transformation is awkward and too much magic IMO.

Handling left-recursion by grammar transformation often has unwanted side-effects (operator precedence) and eliminating indirect left-recursion this way can be impossible in practice. Depending on the complexity of the grammar, even identifying the recursive loop can be a challenge.

The trouble is that one can be happily implementing a parser or designing a grammar when suddenly for some input the parser hangs indefinitely. Users are likely quick to blame the parser lib, while in fact it is the grammar that has left-recursion. Hitting that roadblock is a real bummer.

In some cases the grammar is a given (by a standard for example) and transforming it to combat left-recursion can obfuscate it beyond recognition. Hardening Pegged to deal with the various kinds of left-recursion was very challenging, but easier than transforming the grammar I was dealing with (ISO 10206).
January 17, 2017
On 1/17/17 1:16 PM, Bastiaan Veelo wrote:
> On Monday, 16 January 2017 at 22:29:01 UTC, Dmitry Olshansky wrote:
>> I think left-recursion is better handled at the grammar level.
>> What I currently have is parser combinators level where adding
>> this transformation is awkward and too much magic IMO.
>
> Handling left-recursion by grammar transformation often has unwanted
> side-effects (operator precedence) and eliminating indirect
> left-recursion this way can be impossible in practice. Depending on the
> complexity of the grammar, even identifying the recursive loop can be a
> challenge.

I do not suggest to change the grammar itself, I think that processing of the grammar may perform hidden transformations internally.

>
> The trouble is that one can be happily implementing a parser or
> designing a grammar when suddenly for some input the parser hangs
> indefinitely. Users are likely quick to blame the parser lib, while in
> fact it is the grammar that has left-recursion. Hitting that roadblock
> is a real bummer.
>
> In some cases the grammar is a given (by a standard for example) and
> transforming it to combat left-recursion can obfuscate it beyond
> recognition. Hardening Pegged to deal with the various kinds of
> left-recursion was very challenging, but easier than transforming the
> grammar I was dealing with (ISO 10206).

Interesting, what kind of hardening?

---
Dmitry Olshansky
January 17, 2017
On Tuesday, 17 January 2017 at 21:10:17 UTC, Dmitry Olshansky wrote:
> On 1/17/17 1:16 PM, Bastiaan Veelo wrote:
>> On Monday, 16 January 2017 at 22:29:01 UTC, Dmitry Olshansky wrote:
>>> I think left-recursion is better handled at the grammar level.
>>> What I currently have is parser combinators level where adding
>>> this transformation is awkward and too much magic IMO.
>>
>> Handling left-recursion by grammar transformation often has unwanted
>> side-effects (operator precedence) and eliminating indirect
>> left-recursion this way can be impossible in practice. Depending on the
>> complexity of the grammar, even identifying the recursive loop can be a
>> challenge.
>
> I do not suggest to change the grammar itself, I think that processing of the grammar may perform hidden transformations internally.

Interesting. Would that work with indirect left-recursion? That would be daunting, I fear. It'l still mess with order of operation / associativity, won't it?

>> Hardening Pegged to deal with the various kinds of
>> left-recursion was very challenging, but easier than transforming the
>> grammar I was dealing with (ISO 10206).
>
> Interesting, what kind of hardening?

The trick is to recurse, but knowing when to stop. At a certain point, recursing once more will reduce the length of the matched string. So if you can break recursion when the match stops growing, you are good. This is implemented by memoizing previous recursions.
There is an explanation along with literature references in my first PR on left-recursion [1]. However, the PR itself was flawed [2] and received several makeovers [3-5]. In the end, the solution was considerably more complex than the initial PR, involving introspection of the grammar to identify left-recursive loops and the appropriate rules to memoize. We now have complete support for direct, indirect and hidden left-recursion, including multiple intersecting loops and mixed left- and right-recursion. And it does not interfere with general memoization, which is important to speed up PEG parsers [6]. There are some illustrative unit tests in [7] and further.

Bastiaan.

[1] https://github.com/PhilippeSigaud/Pegged/pull/164
[2] https://github.com/PhilippeSigaud/Pegged/pull/165#issuecomment-158914006
[3] https://github.com/PhilippeSigaud/Pegged/pull/168
[4] https://github.com/PhilippeSigaud/Pegged/pull/186
[5] https://github.com/PhilippeSigaud/Pegged/pull/196
[6] Pegged does not memoize at CT though, at the moment.
[7] https://github.com/PhilippeSigaud/Pegged/blob/master/pegged/grammar.d#L2965
« First   ‹ Prev
1 2