July 06, 2012
Re: Let's stop parser Hell
On Thu, 5 Jul 2012 21:54:29 +0200
Philippe Sigaud <philippe.sigaud@gmail.com> wrote:
> 
> (Hesitating between 'The Art of the Metaobject Protocol' and
> 'Compilers: Principles, Techniques, and Tools', right now)

FWIW, I've found this one to be pretty helpful:

http://www.pearsonhighered.com/educator/product/Crafting-A-Compiler/9780136067054.page

It's the closest one I've seen to being more for programmers than
mathematicians.
July 06, 2012
Re: Let's stop parser Hell
On Fri, Jul 6, 2012 at 7:26 AM, Nick Sabalausky
<SeeWebsiteToContactMe@semitwist.com> wrote:
> On Thu, 5 Jul 2012 21:54:29 +0200
> Philippe Sigaud <philippe.sigaud@gmail.com> wrote:
>>
>> (Hesitating between 'The Art of the Metaobject Protocol' and
>> 'Compilers: Principles, Techniques, and Tools', right now)
>
> FWIW, I've found this one to be pretty helpful:
>
> http://www.pearsonhighered.com/educator/product/Crafting-A-Compiler/9780136067054.page
>
> It's the closest one I've seen to being more for programmers than
> mathematicians.

Thanks Nick, I'll add it to my 'will buy one day' list (which grows
quite rapidly).
$122? Ouch.
July 06, 2012
Re: Let's stop parser Hell
On Fri, 6 Jul 2012 09:50:26 +0200
Philippe Sigaud <philippe.sigaud@gmail.com> wrote:

> On Fri, Jul 6, 2012 at 7:26 AM, Nick Sabalausky
> <SeeWebsiteToContactMe@semitwist.com> wrote:
> > On Thu, 5 Jul 2012 21:54:29 +0200
> > Philippe Sigaud <philippe.sigaud@gmail.com> wrote:
> >>
> >> (Hesitating between 'The Art of the Metaobject Protocol' and
> >> 'Compilers: Principles, Techniques, and Tools', right now)
> >
> > FWIW, I've found this one to be pretty helpful:
> >
> > http://www.pearsonhighered.com/educator/product/Crafting-A-Compiler/9780136067054.page
> >
> > It's the closest one I've seen to being more for programmers than
> > mathematicians.
> 
> Thanks Nick, I'll add it to my 'will buy one day' list (which grows
> quite rapidly)
> $122? Ouch.

Yea, it's from an academic publisher unfortunately, so $$$. I got ahold
of it through the local library systems (All of Ohio's college libraries
are connected in a system called OhioLINK, and books from it can be
put on hold and shipped to most of the public libraries - or at least
most of the public libraries around the Cleveland area.) Maybe there's
something similar out your way?
July 07, 2012
Re: Let's stop parser Hell
>> Resume: everybody is welcome to join effort of translating DMD 
>> front end, and improving Pegged.
>>
>> Also I would like to invite those interested in DCT project to 
>> help me with it. Right now I'm trying to understand whether it 
>> is possible to incorporate Pegged inside without losing 
>> anything critical (and I think it is very likely possible), 
>> and how exactly to do that.
>>
>> Dmitry proposed to help improve Pegged (or some other 
>> compiler's) speed.
>>
>> Anyone else?
>
> I'd really want to create a task force on this, it is of 
> strategic importance to D. In Walter's own words, no new 
> feature is going to push us forward since we're not really 
> using the great goodies we've got, and CTFE technology is the 
> most important.

Hi everybody! My name's David, and I've been dreaming since around 
1999 of making my own computer language, but have never found the 
time for it. When I first looked at D, around 2004 or so, it just 
looked like a "moderately better C++", which I forgot about, having 
loftier ideas. When I found out about D2's metaprogramming 
facilities I instantly became much more interested, although I 
still wish to accomplish more than is possible ATM.

I've been talking to my boss about reducing my working hours, 
mainly in order to have time to work on something related to D. 
My goal is to popularize a language that is efficient (as in 
runtime speed and size), expressive, safe, concise, readable, 
well-documented, easy-to-use, and good at finding errors in your 
code.  In other words, I want a language that is literally all 
things to all people, a language that is effective for any task. 
I want to kill off this preconceived notion that most programmers 
seem to have, that fast code requires a language like C++ that is 
hard to use. The notion that Rapid Application Development is 
incompatible with an efficient executable is nonsense and I want 
to kill it :)

To be honest I have some reservations about D, but of all the 
languages I know, D is currently closest to my ideal.

This work on parsers might be a good place for me to dive in. I 
have an interest in parsers and familiarity with LL, LALR, PEGs, 
and even Pratt parsers (fun!), but I am still inexperienced.

I also like writing documentation and articles, but I always find 
it hard to figure out how other people's code works well enough 
to document it.

I'm having some trouble following this thread due to the 
acronyms: CTFE, DCT, AA. At least I managed to figure out that 
CTFE is Compile Time Function Execution. DCT and AA I already 
know as Discrete Cosine Transform and Anti-Aliasing, of 
course.... but what's it mean to you?

One thing that has always concerned me about PEGs is that people 
always say PEGs are slower than traditional two-phase LALR(1) or 
LL(k) parsers. However, I have never seen any benchmarks. Does 
anyone know exactly how much performance you lose in an 
(optimized) PEG compared to an (optimized) LALR/LL parser + 
LL/regex lexer?
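
For context, the memoization that makes a PEG parser "packrat" fits 
in a short sketch. Below is a minimal memoized PEG-style parser in 
Python (all names are mine, not from Pegged or any real library); 
the memo table keyed on (rule, position) is what guarantees linear 
time, at the cost of memory:

```python
# Minimal packrat-style PEG sketch for sums like "1+2+3".
# All names are illustrative; this is not Pegged or DCT code.

def parse_sum(text):
    # memo maps (rule, position) -> (matched?, position after match);
    # this table is what makes the parser "packrat": each rule is
    # tried at each position at most once, giving linear time.
    memo = {}

    def digit(pos):
        key = ("digit", pos)
        if key not in memo:
            ok = pos < len(text) and text[pos].isdigit()
            memo[key] = (ok, pos + 1 if ok else pos)
        return memo[key]

    def expr(pos):
        # PEG rule: expr <- digit ('+' digit)*  (greedy repetition)
        key = ("expr", pos)
        if key not in memo:
            ok, end = digit(pos)
            while ok and end < len(text) and text[end] == "+":
                ok2, end2 = digit(end + 1)
                if not ok2:
                    break
                end = end2
            memo[key] = (ok, end)
        return memo[key]

    ok, end = expr(0)
    return ok and end == len(text)
```

On real grammars that memo table dominates memory use, which is the 
usual explanation for packrat's larger constant factors.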

Anyway, it's the weekend, during which I hope I can find a place 
to fit in with you guys.
July 07, 2012
Re: Let's stop parser Hell
On Thursday, July 05, 2012 14:22:35 Andrei Alexandrescu wrote:
> I also am actively opposed to a project of just translating D's 
> front-end to D and dropping it into Phobos because it would smother (a) 
> work on generic parser generators, and (b) strong, dependable 
> formalization of D's syntax.

I'm not even vaguely convinced that having a lexer/parser specifically for D in 
Phobos (regardless of whether it comes from dmd or not) will have a negative 
effect on work for making generic parser generators. People are going to want a 
good parser generator _regardless_ of what the situation with parsing D is. 
And I'd be very surprised if you couldn't make a hand-written parser for D 
which was better than one that you can generate. Parser generation is great, 
because it allows you to quickly and easily put together a parser from a 
grammar, but it's unlikely to be as fast as a hand-written one optimized for a 
particular language. However, as the recent discussion on popFront shows, only 
benchmarking of actual solutions would show that for sure.

Now, the issue of a "strong, dependable formalization of D's syntax" is 
another thing entirely. Porting dmd's lexer and parser to Phobos would keep 
the Phobos implementation in line with dmd much more easily and avoid 
inconsistencies in the language definition and the like. However, if we write a 
new lexer and parser specifically for Phobos which _doesn't_ port the lexer or 
parser from dmd, then that _would_ help drive making the spec match the 
compiler (or vice versa). So, I agree that could be a definite argument for 
writing a lexer and parser from scratch rather than porting the one from dmd, 
but I don't buy the bit about it smothering parser generators at all. I think 
that the use cases are completely different.

- Jonathan M Davis
July 07, 2012
Re: Let's stop parser Hell
On Sat, Jul 07, 2012 at 02:45:35AM +0200, David Piepgrass wrote:
[...]
> I'm having some trouble following this thread due to the acronyms:
> CTFE, DCT, AA. At least I managed to figure out that CTFE is Compile
> Time Function Execution. DCT and AA I already know as Discrete Cosine
> Transform and Anti-Aliasing, of course.... but what's it mean to you?

Frankly, I don't know what DCT stands for (anyone?), but AA stands for
Associative Array (aka hash, dictionary, etc.).


[...]
> Anyway, it's the weekend, during which I hope I can find a place to
> fit in with you guys.

Welcome aboard!!!

D still has some ways to go (there are still warts in some places), but
I also have to admit that it's closest to my ideal language too. I'm so
sick and fed up of the tedium of C and the convolutions of C++, that I
just can't see myself doing any more C/C++ except for code maintenance.
New projects I would do in D. (Personal projects, anyway... work
projects, I don't get to choose.)

Anyway, D still has some ways to go, meaning that more manpower is
always more than welcome.


T

-- 
Elegant or ugly code as well as fine or rude sentences have something in common: they don't depend on the language. -- Luca De Vitis
July 07, 2012
Re: Let's stop parser Hell
On Saturday, 7 July 2012 at 01:47:21 UTC, H. S. Teoh wrote:
> Frankly, I don't know what DCT stands for (anyone?)
That's the working (not final) name of my project: the D 
Compiler Tools. I've used this acronym in previous threads but 
didn't explain it this time. It is currently very far from 
completion, still in the research phase. Link: 
https://github.com/roman-d-boiko/dct, but I'd suggest contacting 
me at rb@d-coding.com if anyone is interested.
July 07, 2012
Re: Let's stop parser Hell
On Saturday, 7 July 2012 at 00:45:36 UTC, David Piepgrass wrote:
> This work on parsers might be a good place for me to dive in. I 
> have an interest in parsers and familiarity with LL, LALR, 
> PEGs, and even Pratt parsers (fun!), but I am still 
> inexperienced.
...
> One thing that has always concerned me about PEGs is that they 
> always say PEGs are slower than traditional two-phase LALR(1) 
> or LL(k) parsers. However, I have never seen any benchmarks. 
> Does anyone know exactly how much performance you lose in an 
> (optimized) PEG compared to an (optimized) LALR/LL parser + 
> LL/regex lexer?

I decided to ask a question about this:

http://stackoverflow.com/questions/11373644/performance-of-parsers-peg-vs-lalr1-or-llk

Don't hesitate to edit it if you would like to clarify some 
aspect.
July 07, 2012
Re: Let's stop parser Hell
On 07-Jul-12 13:06, Roman D. Boiko wrote:
> On Saturday, 7 July 2012 at 00:45:36 UTC, David Piepgrass wrote:
>> This work on parsers might be a good place for me to dive in. I have
>> an interest in parsers and familiarity with LL, LALR, PEGs, and even
>> Pratt parsers (fun!), but I am still inexperienced.
> ...
>> One thing that has always concerned me about PEGs is that they always
>> say PEGs are slower than traditional two-phase LALR(1) or LL(k)
>> parsers. However, I have never seen any benchmarks. Does anyone know
>> exactly how much performance you lose in an (optimized) PEG compared
>> to an (optimized) LALR/LL parser + LL/regex lexer?
>
> I decided to ask a question about this:
>
> http://stackoverflow.com/questions/11373644/performance-of-parsers-peg-vs-lalr1-or-llk
>
>
> Don't hesitate to edit it if you would like to clarify some aspect.

I can tell you that neither is slower than the other in principle. 
Quality of implementation trumps every theoretical aspect here. LALR 
parsers are usually fast even when implemented straight from the 
book, but they are hard to optimize further and quite restrictive 
on "semantic extensions".

Deterministic CFG parsers (LL, LR) are all linear-time anyway.

-- 
Dmitry Olshansky
July 07, 2012
Re: Let's stop parser Hell
On Saturday, 7 July 2012 at 09:06:57 UTC, Roman D. Boiko wrote:
> http://stackoverflow.com/questions/11373644/performance-of-parsers-peg-vs-lalr1-or-llk

So far it looks like LALR parsers may have lower constant factors 
than Packrat.

The difference could be minimized by paying attention to how 
terminal symbols are parsed, which was already in my plans. It is 
not necessary to follow the Packrat parsing algorithm strictly.
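
For instance (a sketch under my own assumptions, in Python with 
invented names; not actual DCT or Pegged code), the terminals can 
be recognized once by a plain regex lexer, so that any packrat 
memoization above it is keyed on token indices rather than 
character positions:

```python
import re

# One regex alternative per terminal; the matched group index
# identifies the token kind. Illustrative only, not DCT/Pegged code.
TOKEN_RE = re.compile(r"\s*(?:(\d+)|(\+)|(\()|(\)))")

def tokenize(text):
    """Split text into terminal tokens; a parser layer above can then
    memoize on token positions (O(tokens)) instead of characters."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise ValueError("unexpected input at position %d" % pos)
        tokens.append(m.group(m.lastindex))
        pos = m.end()
    return tokens
```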

The benefits of Pegged, in my view, are its support of Parsing 
Expression Grammars (PEGs) and compile-time evaluation. It is 
easily extensible and modifiable.

When I implemented a recursive-descent parser by hand in one of the 
early drafts of DCT, I strongly felt the need to generalize the 
code in a way which, in retrospect, I would call PEG-like. The 
structure of my hand-written recursive-descent parser was a 
one-to-one mapping to the implemented subset of the D 
specification, which I considered problematic because the same 
structure had to be duplicated in the resulting AST.

PEG is basically a language that describes both the implementation 
of the parser and the language syntax. It greatly reduces implicit 
code duplication.
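
To illustrate that correspondence (a hedged sketch in Python with 
invented names, not from any real tool): each PEG rule transcribes 
to exactly one function, so the grammar text and the parser 
implementation share one shape:

```python
# PEG rule:  Parens <- '(' Parens ')' Parens / ''
# The function below is a direct transcription of that rule: try the
# first alternative, and fall back to the empty alternative
# ('' matches without consuming input) if it fails.

def parens(text, pos=0):
    """Return the position after the longest Parens match at pos."""
    if pos < len(text) and text[pos] == "(":          # '('
        inner = parens(text, pos + 1)                 # Parens
        if inner < len(text) and text[inner] == ")":  # ')'
            return parens(text, inner + 1)            # trailing Parens
    return pos                                        # '' alternative

def is_balanced(text):
    return parens(text) == len(text)
```

A real parser would also build an AST node per rule invocation, 
which is exactly where the structure duplication mentioned above 
comes from.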

I think that generated code can be made almost as fast as a 
hand-written parser for a particular language (probably only a few 
percent slower), especially if that language is similar to D 
(context-free, with a fine-grained hierarchical grammar). 
Optimizations might require abandoning strict adherence to any 
theoretical algorithm, but that should be OK.