Re: Pegged: Syntax Highlighting
March 14, 2012
On 3/14/12, Andrej Mitrovic <andrej.mitrovich@gmail.com> wrote:
> how would one use a parser like Pegged for syntax highlighting?

Ok, typically one would use a lexer and not a parser. But using a parser might be more interesting for creating more complex syntax highlighting. :)
March 14, 2012
On 3/14/12, Andrej Mitrovic <andrej.mitrovich@gmail.com> wrote:
> On 3/14/12, Andrej Mitrovic <andrej.mitrovich@gmail.com> wrote:
>> how would one use a parser like Pegged for syntax highlighting?
>
> Ok, typically one would use a lexer and not a parser. But using a parser might be more interesting for creating more complex syntax highlighting. :)
>

Actually I think I can use the new ddmd-clean port for just this purpose. Sorry for the noise.
March 17, 2012
On Wed, Mar 14, 2012 at 21:03, Andrej Mitrovic <andrej.mitrovich@gmail.com> wrote:

>>> how would one use a parser like Pegged for syntax highlighting?
>>
>> Ok, typically one would use a lexer and not a parser. But using a parser might be more interesting for creating more complex syntax highlighting. :)
>>
>
> Actually I think I can use the new ddmd-clean port for just this purpose. Sorry for the noise.

Sorry for the late reply, I was away for a few days, in a Net-forsaken place ;)

If ddmd-clean is OK for you, that's cool. Keep us informed how that went.
If you want to use Pegged, you'd need to enter the entire D grammar to
get a correct parse tree.
I just finished writing it, but I'm afraid to try and compile it :)
It's one huge monster.
March 17, 2012
On 17.03.2012 08:01, Philippe Sigaud wrote:
> On Wed, Mar 14, 2012 at 21:03, Andrej Mitrovic
> <andrej.mitrovich@gmail.com>  wrote:
>
>>>> how would one use a parser like Pegged for syntax
>>>> highlighting?
>>>
>>> Ok, typically one would use a lexer and not a parser. But using a
>>> parser might be more interesting for creating more complex syntax
>>> highlighting. :)
>>>
>>
>> Actually I think I can use the new ddmd-clean port for just this
>> purpose. Sorry for the noise.
>
> Sorry for the late reply, I was away for a few days, in a Net-forsaken place ;)
>
> If ddmd-clean is OK for you, that's cool. Keep us informed how that went.
> If you want to use Pegged, you'd need to enter the entire D grammar to
> get a correct parse tree.
> I just finished writing it, but I'm afraid to try and compile it :)
> It's one huge monster.

I want to use Pegged for that purpose. So go ahead and commit the D grammar ;)
Would be so awesome if Pegged would be able to parse D.

~Extrawurst
March 17, 2012
> I want to use Pegged for that purpose. So go ahead and commit the D grammar
> ;)
> Would be so awesome if Pegged would be able to parse D.
>
> ~Extrawurst

The D grammar is a 1000-line / hundreds of rules monster. I finished
writing it and am now crushing bugs.
God, that generates a 10_000 line module to parse it. I should
simplify the code generator somewhat.
March 17, 2012
On 17.03.2012 15:13, Philippe Sigaud wrote:
> The D grammar is a 1000-line / hundreds of rules monster. I finished
> writing it and am now crushing bugs.

Any ETA when u gonna commit it for the public ? Wouldn't mind getting my hands dirty on it and looking for bugs too ;)
March 17, 2012
On 3/17/12 9:13 AM, Philippe Sigaud wrote:
>> I want to use Pegged for that purpose. So go ahead and commit the D grammar
>> ;)
>> Would be so awesome if Pegged would be able to parse D.
>>
>> ~Extrawurst
>
> The D grammar is a 1000-line / hundreds of rules monster. I finished
> writing it and am now crushing bugs.
> God, that generates a 10_000 line module to parse it. I should
> simplify the code generator somewhat.

Science is done. Welcome to implementation :o).

I can't say how excited I am about this direction. I have this vision of having a D grammar published on the website that is actually "it", i.e. the same exact grammar is used by a validator that goes through all of our test suite. (The validator wouldn't do any semantic checking.) The parser generator _and_ the reference D grammar would be available in Phobos, so for anyone it would be dirt cheap to parse some D code and wander through the generated AST. The availability of a reference grammar and parser would be golden to a variety of D toolchain creators.

Just to gauge interest:

1. Would you consider submitting your work to Phobos?

2. Do you think your approach can generate parsers competitive with hand-written ones? If not, why?


Andrei
March 17, 2012
On Sat, Mar 17, 2012 at 15:44, Extrawurst <spam@extrawurst.org> wrote:
> On 17.03.2012 15:13, Philippe Sigaud wrote:
>>
>> The D grammar is a 1000-line / hundreds of rules monster. I finished writing it and am now crushing bugs.
>
>
> Any ETA when u gonna commit it for the public ? Wouldn't mind getting my hands dirty on it and looking for bugs too ;)

I just pushed it on Github.

pegged/examples/dgrammar.d just contains the D grammar as a string. pegged/examples/ddump.d is the generated parser family.

There are no more syntax bugs: Pegged accepts the string as a correct
grammar and DMD compiles the resulting classes.
I tested the generated parser on microscopic D files and... it
sometimes works :)

I made many mistakes and typos while writing the grammar. I corrected a few, but there are many more, without a doubt.

I'll write a wiki page on how to generate the grammar anew, if need be.

Btw, the D grammar comes from the website (I didn't find the time to compare it to the grammar Rainer uses for Mono-D), and it's horribly BNF-like: almost no + or * operators, etc. I tried to factor some expressions and simplify some, but it could be a bit shorter (not much, but still).
March 17, 2012
On Sat, Mar 17, 2012 at 18:11, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> wrote:

>> The D grammar is a 1000-line / hundreds of rules monster. I finished
>> writing it and am now crushing bugs.
>> God, that generates a 10_000 line module to parse it. I should
>> simplify the code generator somewhat.
>
>
> Science is done. Welcome to implementation :o).

Hey, it's only 3,000 lines now :) Coming from a thousand-line grammar, it's not that much of an inflation.


> I can't say how excited I am about this direction. I have this vision of having a D grammar published on the website that is actually "it", i.e. the same exact grammar is used by a validator that goes through all of our test suite. (The validator wouldn't do any semantic checking.) The parser generator _and_ the reference D grammar would be available in Phobos, so for anyone it would be dirt cheap to parse some D code and wander through the generated AST. The availability of a reference grammar and parser would be golden to a variety of D toolchain creators.

Indeed, but I fear the D grammar is a bit too complex to be easily
walked. Now that I read it, I realize that '1' is parsed as a
10-level-deep leaf!
Compared to lisp, it's... not in the same league, to say the least. I
will see to drastically simplify the parse tree.
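
One way such a simplification could look, as a rough sketch only: collapse every chain of single-child nodes down to the innermost node. The `ParseTree` struct, the `collapse`, `depth` and `wrap` helpers, and the rule names used here are all hypothetical stand-ins for illustration, not Pegged's actual types or API.

```d
import std.algorithm : max;
import std.stdio : writeln;

// Hypothetical, minimal parse-tree node (not Pegged's real structure).
struct ParseTree
{
    string name;          // rule that produced the node
    string match;         // matched text
    ParseTree[] children; // sub-nodes
}

// Replace every chain of single-child nodes by the innermost node of the chain.
ParseTree collapse(ParseTree t)
{
    while (t.children.length == 1)
        t = t.children[0];
    ParseTree[] kids;
    foreach (c; t.children)
        kids ~= collapse(c);
    return ParseTree(t.name, t.match, kids);
}

// Number of levels from this node down to its deepest leaf.
size_t depth(ParseTree t)
{
    size_t d = 0;
    foreach (c; t.children)
        d = max(d, depth(c));
    return d + 1;
}

// Helper for the demo: wrap a leaf in a chain of pass-through rules.
ParseTree wrap(string[] names, ParseTree leaf)
{
    foreach_reverse (n; names)
        leaf = ParseTree(n, leaf.match, [leaf]);
    return leaf;
}

void main()
{
    // Illustrative rule names; the real chain for '1' is much longer.
    auto one = wrap(["Expression", "AssignExpression", "ConditionalExpression"],
                    ParseTree("IntegerLiteral", "1"));
    writeln(depth(one));                  // 4 levels before collapsing
    auto flat = collapse(one);
    writeln(flat.name, " ", depth(flat)); // only the literal node remains
}
```

The trade-off is that the intermediate rule names are lost, so a tool that cares about, say, `AssignExpression` nodes would need a variant that keeps selected rules.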

Does anyone have experience with other languages similar to D and that
offer AST-walking? Doesn't C# have something like this?
(I'll have a look at Scala macros)

> Just to gauge interest:
>
> 1. Would you consider submitting your work to Phobos?

Yes, of course. It's already Boost-licensed.
Seeing the review processes for other modules, it'd most certainly put
the code in great shape. But then, it's far from being submittable
right now.


> 2. Do you think your approach can generate parsers competitive with hand-written ones? If not, why?

Right now, no, if only because I didn't take any step in making it
fast or in limiting its RAM consumption.
After applying some ideas I have, I don't know. There are many people
here that are parser-aware and could help make the code faster. But at
the core, to allow mutually recursive rules, the design uses classes:

class A : someParserCombinationThatMayUseA { ... }

Which means A.parse (a static method) is just typeof(super).parse
(also static, and so on). Does that entail any crippling disadvantage
compared to a hand-written parser?
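
For readers who want to see the shape of that design, here is a minimal sketch under stated assumptions. The `Lit`, `Seq`, `Or` and `Result` names are invented for this illustration and are not Pegged's real combinators; only the idea of a rule class inheriting from a combinator that mentions the rule itself, with a static `parse` forwarding to `typeof(super).parse`, comes from the description above.

```d
import std.stdio : writeln;

// Hypothetical result type: success flag plus the position after the match.
struct Result { bool ok; size_t pos; }

// Matches a fixed string.
class Lit(string s)
{
    static Result parse(string input, size_t pos)
    {
        if (pos + s.length <= input.length && input[pos .. pos + s.length] == s)
            return Result(true, pos + s.length);
        return Result(false, pos);
    }
}

// Matches every rule in order, or fails without consuming anything.
class Seq(Rules...)
{
    static Result parse(string input, size_t pos)
    {
        size_t cur = pos;
        foreach (R; Rules) // unrolled at compile time over the rule types
        {
            auto r = R.parse(input, cur);
            if (!r.ok)
                return Result(false, pos);
            cur = r.pos;
        }
        return Result(true, cur);
    }
}

// Ordered choice: the first rule that matches wins.
class Or(Rules...)
{
    static Result parse(string input, size_t pos)
    {
        foreach (R; Rules)
        {
            auto r = R.parse(input, pos);
            if (r.ok)
                return r;
        }
        return Result(false, pos);
    }
}

// Parens <- "(" Parens ")" / "x"
// The rule names itself inside its own base class, which is what makes
// recursion possible.
class Parens : Or!(Seq!(Lit!"(", Parens, Lit!")"), Lit!"x")
{
    static Result parse(string input, size_t pos)
    {
        return typeof(super).parse(input, pos);
    }
}

void main()
{
    writeln(Parens.parse("((x))", 0)); // succeeds, consuming all 5 characters
    writeln(Parens.parse("((x)", 0));  // fails: the closing paren is missing
}
```

Since every call in this sketch is static and resolved at compile time, there is no virtual dispatch involved; the remaining costs would be the depth of the call chain and whatever the result type allocates.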


Philippe
March 17, 2012
On 03/17/2012 01:53 PM, Philippe Sigaud wrote:
> Does anyone have experience with other languages similar to D and that
> offer AST-walking? Doesn't C# have something like this?
> (I'll have a look at Scala macros)
>

Hi Philippe.
Of course the visitor pattern comes to mind.

Eclipse (Java) uses a specialized visitor pattern called "hierarchical visitor pattern" to traverse the AST.

The classic visitor pattern has the following disadvantages :

-- hierarchical navigation -- the traditional Visitor Pattern has no concept of depth. As a result, a visitor cannot determine if one composite is within another composite or beside it.

-- conditional navigation -- the traditional Visitor Pattern does not allow branches to be skipped. As a result, a visitor cannot stop, filter, or optimize traversal based on some condition.

Interesting stuff at :

http://c2.com/cgi/wiki?HierarchicalVisitorPattern
You'll find some implementation details at the bottom of the doc.
hth Bjoern
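
A minimal sketch of the hierarchical variant in D, to make the two points above concrete. All names here (`HierarchicalVisitor`, `Node`, `Leaf`, `Composite`, `Outline`) are made up for the illustration and are neither Eclipse's nor Pegged's API. The visitor gets enter/leave callbacks so it can track depth, and the boolean return values let it skip a subtree.

```d
import std.array : replicate;
import std.stdio : writeln;

interface HierarchicalVisitor
{
    bool visitEnter(Composite node); // return false to skip the children
    bool visitLeave(Composite node); // return false to stop visiting siblings
    bool visitLeaf(Leaf node);       // return false to stop visiting siblings
}

abstract class Node
{
    string name;
    this(string name) { this.name = name; }
    abstract bool accept(HierarchicalVisitor v);
}

class Leaf : Node
{
    this(string name) { super(name); }
    override bool accept(HierarchicalVisitor v) { return v.visitLeaf(this); }
}

class Composite : Node
{
    Node[] children;
    this(string name) { super(name); }

    override bool accept(HierarchicalVisitor v)
    {
        if (v.visitEnter(this))
        {
            foreach (child; children)
                if (!child.accept(v))
                    break;
        }
        return v.visitLeave(this);
    }
}

// Records an indented outline and does not descend into the composite
// whose name equals `skip`.
class Outline : HierarchicalVisitor
{
    string[] lines;
    private string skip;
    private size_t depth;

    this(string skip) { this.skip = skip; }

    bool visitEnter(Composite node)
    {
        lines ~= replicate("  ", depth) ~ node.name;
        ++depth;
        return node.name != skip; // conditional navigation
    }

    bool visitLeave(Composite node)
    {
        --depth;
        return true;
    }

    bool visitLeaf(Leaf node)
    {
        lines ~= replicate("  ", depth) ~ node.name;
        return true;
    }
}

void main()
{
    auto args = new Composite("Args");
    args.children ~= new Leaf("1");
    args.children ~= new Leaf("2");
    auto call = new Composite("Call");
    call.children ~= new Leaf("foo");
    call.children ~= args;

    auto v = new Outline("Args");
    call.accept(v);
    foreach (line; v.lines)
        writeln(line);
}
```

With `"Args"` as the skipped name, the outline lists `Call`, `foo` and `Args` but none of the argument leaves, and the indentation reflects the nesting level at which each node was seen.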