May 18, 2011
Goldie Parsing System v0.5 - Speed
Goldie Parsing System v0.5 is now out. This version focuses mainly on speed 
improvements.

== Links: ==

Homepage and Documentation:
   http://www.semitwist.com/goldie/

Prepackaged Downloads:
   http://www.dsource.org/projects/goldie/browser/downloads

== New in v0.5: ==

   - Improved lexing/parsing speed by about 5x-6x.

   - Small additional speedup when lexing languages with large character 
sets (such as Unicode).

   - GRMC: Grammar Compiler: Supports {All Valid} character set.

   - GRMC: Grammar Compiler: Complex grammars are compiled to CGT up to 
about 4x-8x faster.

   - GRMC: Grammar Compiler: Verbose (-v) flag shows each step and amount 
of time taken.

   - Parse Anything: No more unhandled exception when parsing a source with 
an error.

   - Fixed to work with DMD 2.053 (still works with 2.052, too).

There are still more optimizations that can be done, but I felt this was 
enough to warrant a new release.
May 18, 2011
Re: Goldie Parsing System v0.5 - Speed
On 18.05.2011 05:47, Nick Sabalausky wrote:
> Goldie Parsing System v0.5 is now out. This version focuses mainly on speed
> improvements.

Great work.

Is it possible to generate a parser for D with this?

Regards,
Stephan
May 18, 2011
Re: Goldie Parsing System v0.5 - Speed
"Stephan" <spam@extrawurst.org> wrote in message 
news:ir05te$tbd$1@digitalmars.com...
> On 18.05.2011 05:47, Nick Sabalausky wrote:
>> Goldie Parsing System v0.5 is now out. This version focuses mainly on 
>> speed
>> improvements.
>>
>
> Great work.
>

Thanks :)

> Is it possible to generate a parser for D with this?
>

It should be possible to write a grammar that handles most of D. But there 
would be some awkwardness and corner cases that, to really be handled right, 
would need some enhancements I haven't put in yet.

For example:

- Nested comments aren't yet officially supported. GOLD (which Goldie is 
based on) will support them in the currently-in-beta v4.2 
(http://www.devincook.com/goldparser/v4.2.htm). I intend to make Goldie 
fully compatible with all the new GOLD v4.2 features, but just haven't 
gotten to them yet. In the meantime, what you can do is lex the D source 
first, then go through the resulting token array removing everything from a 
"/+" token to its matching "+/" token (there will be a bunch of junk in 
between, including some error tokens; you can just rip it all out), and then 
send the remaining tokens through the parser.

- Another comment-related thing that'll be fixed with the v4.2 enhancements: 
Currently, GOLD and Goldie handle (non-nested) block comments by actually 
lexing what's inside the comment (and ignoring any errors). Normally this 
works out fine, but it does lead to some occasional edge cases where the 
"*/" isn't handled right.

- D relies on certain disambiguation rules. For instance: "a*b" could be 
either a multiplication expression or a pointer declaration. D handles this 
by saying "if something can be either an expression or a declaration, then 
always interpret it as (umm...actually I forget which one it always chooses, 
but it's always that same one)". Goldie (and GOLD) currently don't have 
any conflict resolution. If you try to create a grammar that has such an 
ambiguity, you'll just get a "reduce-reduce conflict" error, or 
"shift-reduce" problems. The way to work around this is to design the 
grammar to completely conflate the two notions, so instead of having 
<Expression> and <Declaration>, you'd just have something like <ExprOrDecl>. 
Unfortunately, this isn't always easy: it tends to obfuscate the grammar, 
it makes the nonterminals less meaningful, and it'll create much more work 
for your semantics pass. I do intend to solve this, but it'll probably be 
a very non-trivial matter. More discussion (possibly a bit technical) on 
this issue is here: 
http://groups.google.com/group/gold-parsing-system/browse_thread/thread/5959e0cfef76ce68
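
To make the conflated-nonterminal workaround a bit more concrete, here is a 
rough sketch in Python. It is not Goldie's API: `ExprOrDecl`, `resolve`, and 
the `prefer` parameter are hypothetical names made up for this illustration. 
The idea is only that the parser emits one neutral node for input shaped like 
"a*b", and a later semantic pass applies a fixed rule to decide what the node 
means. The preferred interpretation is left as a parameter here.

```python
from dataclasses import dataclass


@dataclass
class ExprOrDecl:
    """Neutral parse node for input shaped like `a * b` (hypothetical)."""
    left: str
    right: str


def resolve(node, prefer):
    """Semantic pass: apply one fixed disambiguation rule to the node.

    `prefer` is an assumption of this sketch; a real front end would
    hard-code whichever interpretation the language specifies.
    """
    if prefer == "declaration":
        # Read `a* b` as: declare `b` with type pointer-to-`a`.
        return ("declaration", node.left + "*", node.right)
    if prefer == "expression":
        # Read `a * b` as: multiply `a` by `b`.
        return ("multiplication", node.left, node.right)
    raise ValueError("prefer must be 'declaration' or 'expression'")


node = ExprOrDecl("a", "b")
as_decl = resolve(node, prefer="declaration")
as_expr = resolve(node, prefer="expression")
```

This also shows where the extra cost lands: the grammar no longer tells you 
which construct you have, so every consumer of the parse tree has to go 
through something like `resolve` first.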

FWIW, Goldie does include a lex-only grammar for D2, which could be used as 
a starting point. It's possible I got some edge cases wrong regarding the 
decimal literals, and the grammar is currently ASCII-only, but that can 
easily be changed:

http://www.dsource.org/projects/goldie/browser/tags/v0.5/lang/dlex.grm
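
With a lex-only pass as the starting point, the nested-comment workaround 
described earlier in this post (drop everything from a "/+" token to its 
matching "+/") could look roughly like the Python sketch below. It is an 
illustration under assumptions, not Goldie code: tokens are modeled as plain 
strings, `strip_nested_comments` is a hypothetical helper, and it assumes the 
lexer emits "/+" and "+/" as tokens of their own.

```python
def strip_nested_comments(tokens, open_tok="/+", close_tok="+/"):
    """Return the tokens with every /+ ... +/ region removed.

    Nesting is tracked with a depth counter, so an inner "+/" does not
    end the outer comment. Anything inside a comment is dropped,
    including error tokens the lexer produced there.
    """
    kept = []
    depth = 0  # number of "/+" tokens currently open
    for tok in tokens:
        if tok == open_tok:
            depth += 1
        elif tok == close_tok and depth > 0:
            depth -= 1
        elif depth == 0:
            # Outside any comment. A stray "+/" also ends up here and is
            # kept, so the parser can report it.
            kept.append(tok)
    return kept


tokens = ["int", "x", ";",
          "/+", "outer", "/+", "inner", "+/", "<error>", "+/",
          "int", "y", ";"]
cleaned = strip_nested_comments(tokens)
```

The filtered list is what would then be handed to the parser. An unterminated 
"/+" silently swallows the rest of the input in this sketch, so a real 
version would probably want to report that case.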