July 09, 2012
Re: Let's stop parser Hell
"Jonathan M Davis" <jmdavisProg@gmx.com> wrote in message 
news:mailman.190.1341818983.31962.digitalmars-d@puremagic.com...
>>
>> I'm pretty sure UFCS affects lexing or parsing. How else would this be
>> legal:
>>
>> 4.foo();
>
> That definitely wouldn't affect lexing, because it doesn't affect the
> tokens at all.

Not true.  This used to be lexed as '4.f' 'oo'. (I think)
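
To illustrate why the float-literal rule matters here, a rough sketch in Python with made-up token rules (this is not the actual D lexer; OLD_FLOAT, NEW_FLOAT and lex are names invented for the example). A greedy lexer whose float rule accepts a trailing dot plus an 'f' suffix splits 4.foo() as '4.f' 'oo', while a rule that requires a digit after the dot leaves '4', '.', 'foo' for the parser:

```python
import re

# Hypothetical token rules for illustration only, not the real D lexer.
# OLD_FLOAT accepts a trailing-dot float with an optional suffix: "4.", "4.f".
OLD_FLOAT = re.compile(r"\d+\.(\d+)?[fF]?")
# NEW_FLOAT requires a digit after the dot, so "4." alone is never a float.
NEW_FLOAT = re.compile(r"\d+\.\d+[fF]?")
INT = re.compile(r"\d+")
IDENT = re.compile(r"[A-Za-z_]\w*")
PUNCT = re.compile(r"[.();]")


def lex(src, float_rule):
    """Greedy lexer: at each position, take the first rule that matches.

    Whitespace is not handled; the sketch only needs whitespace-free input.
    """
    tokens, pos = [], 0
    while pos < len(src):
        for rule in (float_rule, INT, IDENT, PUNCT):
            m = rule.match(src, pos)
            if m:
                tokens.append(m.group())
                pos = m.end()
                break
        else:
            raise ValueError(f"unexpected character at {pos}: {src[pos]!r}")
    return tokens


old_tokens = lex("4.foo();", OLD_FLOAT)
new_tokens = lex("4.foo();", NEW_FLOAT)
```

With the first rule the identifier is cut in half before the parser ever sees it, so in this sketch the fix has to happen in the lexer, not in the parser.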
July 10, 2012
Re: Let's stop parser Hell
On Saturday, 7 July 2012 at 16:37:56 UTC, Roman D. Boiko wrote:
>> Note that PEG does not require packrat parsing, even 
>> though it was developed to be used with it. I think it's a historical 
>> 'accident' that put the two together: Bryan Ford's thesis used 
>> the two together.
>>
>> Note that many PEG parsers do not rely on packrat (Pegged does 
>> not).
>> There are a bunch of articles on Bryan Ford's website by a guy
>> writing a PEG parser for Java, who found that storing the 
>> last rules was enough to get a slight speed improvement, but 
>> that doing any more storage was detrimental to the parser's 
>> overall efficiency.
>
> That's great! Anyway I want to understand the advantages and 
> limitations of both Pegged and ANTLR, and probably study some 
> more techniques. Such research consumes a lot of time but can 
> be done incrementally along with development.

One disadvantage of Packrat parsers I mentioned was problematic 
error recovery (according to the article on the ANTLR website). 
After some additional research, I found that it is not a critical 
problem. To find the exact place of an error (from the parser's 
perspective, not the user's), one only needs to remember the 
farthest successfully parsed position (among several backtracking 
attempts) and the reason that parsing failed there.
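
A minimal sketch of that bookkeeping, in Python, for a toy backtracking parser. The grammar, the Parser class and its method names are hypothetical and are not taken from Pegged or ANTLR; the only point is the fail() method, which keeps the farthest failure position and what was expected there:

```python
class Parser:
    def __init__(self, text):
        self.text = text
        self.farthest = 0      # farthest position at which a match failed
        self.expected = set()  # what was expected at that position

    def fail(self, pos, what):
        """Record a failure, keeping only the ones at the farthest position."""
        if pos > self.farthest:
            self.farthest, self.expected = pos, {what}
        elif pos == self.farthest:
            self.expected.add(what)
        return None

    def lit(self, pos, s):
        """Match the literal s at pos; return the new position or None."""
        if self.text.startswith(s, pos):
            return pos + len(s)
        return self.fail(pos, repr(s))

    def digits(self, pos):
        end = pos
        while end < len(self.text) and self.text[end].isdigit():
            end += 1
        return end if end > pos else self.fail(pos, "digit")

    def binary(self, pos, op):
        """Sequence: "(" expr op expr ")"."""
        p = self.lit(pos, "(")
        if p is None:
            return None
        p = self.expr(p)
        if p is None:
            return None
        p = self.lit(p, op)
        if p is None:
            return None
        p = self.expr(p)
        if p is None:
            return None
        return self.lit(p, ")")

    def expr(self, pos):
        """Ordered choice: "(" expr "+" expr ")" / "(" expr "*" expr ")" / digits."""
        for op in ("+", "*"):
            end = self.binary(pos, op)  # a failed alternative backtracks to pos
            if end is not None:
                return end
        return self.digits(pos)


def parse(text):
    """Return None on success, else (farthest position, expected items)."""
    p = Parser(text)
    end = p.expr(0)
    if end == len(text):
        return None
    if end is not None:
        p.fail(end, "end of input")
    return p.farthest, sorted(p.expected)
```

For the input "(1+(2*3)" the failed alternatives also record failures at positions 1, 2, 3 and 5, but only the one at position 8 (expected ')') survives, which is the spot where the parser actually got stuck.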

It is also possible to rerun parsing with some additional 
heuristics after failing, thus enabling advanced error repair 
scenarios.
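
One possible shape for such a second pass, sketched in Python under simplifying assumptions: a toy grammar for nested lists of numbers, a plain recursive-descent parser without backtracking, and a single heuristic (try inserting one token at the error position). All names here are invented for the example. The repairs are only proposed, never applied silently:

```python
class ParseError(Exception):
    def __init__(self, pos, expected):
        super().__init__(f"at {pos}: expected {expected}")
        self.pos = pos


def expect(text, pos, s):
    if not text.startswith(s, pos):
        raise ParseError(pos, repr(s))
    return pos + len(s)


def parse_item(text, pos):
    """item <- list / digits"""
    if text.startswith("[", pos):
        return parse_list(text, pos)
    end = pos
    while end < len(text) and text[end].isdigit():
        end += 1
    if end == pos:
        raise ParseError(pos, "digit or '['")
    return end


def parse_list(text, pos=0):
    """list <- "[" (item ("," item)*)? "]"; returns the position after it."""
    pos = expect(text, pos, "[")
    if text.startswith("]", pos):
        return pos + 1
    pos = parse_item(text, pos)
    while text.startswith(",", pos):
        pos = parse_item(text, pos + 1)
    return expect(text, pos, "]")


def check(text):
    """Return None if text is exactly one list, else the error position."""
    try:
        end = parse_list(text)
    except ParseError as e:
        return e.pos
    return None if end == len(text) else end


def suggest_repairs(text, candidates=("]", ",", "0")):
    """Second pass: rerun the parser with one candidate token inserted."""
    pos = check(text)
    if pos is None:
        return []
    repaired = (text[:pos] + c + text[pos:] for c in candidates)
    return [r for r in repaired if check(r) is None]
```

Here suggest_repairs("[1,[2,3]") proposes "[1,[2,3]]". A real tool would need better heuristics (deletions, multi-token repairs, ranking), which this sketch does not attempt.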

Since Pegged doesn't use the Packrat algorithm, this solution might 
be either not relevant or not applicable, but I doubt that there 
will be any fundamental problem with error recovery.

An unpleasant debugging experience, however, is likely for 
any parser that uses backtracking heavily.
July 10, 2012
Re: Let's stop parser Hell
On Tue, Jul 10, 2012 at 12:41 PM, Roman D. Boiko <rb@d-coding.com> wrote:


> One disadvantage of Packrat parsers I mentioned was problematic error
> recovery (according to the article from ANTLR website). After some
> additional research, I found that it is not a critical problem. To find the
> exact place of error (from parser's perspective, not user's) one only needs
> to remember the farthest successfully parsed position (among several
> backtracking attempts) and the reason that it failed.

IIRC, that's what I encoded in Pegged's (admittedly limited) error
reporting: remember the farthest error.

> It is also possible to rerun parsing with some additional heuristics after
> failing, thus enabling advanced error repair scenarios.

Do people really want error-repairing parsers? I want my parsers to
tell me something is bad and, optionally, to suggest a possible
repair, but definitely *not* to automatically repair an inferred error
and continue happily.
July 10, 2012
Re: Let's stop parser Hell
On 07/10/2012 09:14 PM, Philippe Sigaud wrote:
> Tue, Jul 10, 2012 at 12:41 PM, Roman D. Boiko<rb@d-coding.com>  wrote:
>
>
>> One disadvantage of Packrat parsers I mentioned was problematic error
>> recovery (according to the article from ANTLR website). After some
>> additional research, I found that it is not a critical problem. To find the
>> exact place of error (from parser's perspective, not user's) one only needs
>> to remember the farthest successfully parsed position (among several
>> backtracking attempts) and the reason that it failed.
>
> IIRC, that's what I encoded in Pegged (admittedly limited) error
> reporting: remember the farthest error.
>
>> It is also possible to rerun parsing with some additional heuristics after
>> failing, thus enabling advanced error repair scenarios.
>
> Do people really want error-repairing parsers? I want my parsers to
> tell me something is bad, and, optionally to advance a possible
> repair, but definitely *not* to automatically repair an inferred error
> and continue happily.

FWIW, this is what most HTML parsers are doing.
July 10, 2012
Re: Let's stop parser Hell
On Tue, Jul 10, 2012 at 9:25 PM, Timon Gehr <timon.gehr@gmx.ch> wrote:

>> Do people really want error-repairing parsers? I want my parsers to
>> tell me something is bad, and, optionally to advance a possible
>> repair, but definitely *not* to automatically repair an inferred error
>> and continue happily.
>
>
> FWIW, this is what most HTML parsers are doing.

Ah, right. I can get it for HTML/XML. JSON also, maybe.
I was thinking of parsing a programming language (C, D, etc.).

Consider me half-convinced :)
July 10, 2012
Re: Let's stop parser Hell
On Tuesday, 10 July 2012 at 19:41:29 UTC, Philippe Sigaud wrote:
> On Tue, Jul 10, 2012 at 9:25 PM, Timon Gehr <timon.gehr@gmx.ch> wrote:
>
>>> Do people really want error-repairing parsers? I want my parsers to
>>> tell me something is bad, and, optionally to advance a possible
>>> repair, but definitely *not* to automatically repair an inferred error
>>> and continue happily.
>>
>>
>> FWIW, this is what most HTML parsers are doing.
>
> Ah, right. I can get it for HTML/XML. JSON also, maybe.
> I was thinking of parsing a programming language (C, D, etc)
>
> Consider me half-convinced :)

It would still generate errors, but it would enable a lot of useful 
functionality: autocompletion, refactoring, symbol documentation 
in a tooltip, displaying method overloads with parameters 
as you type, go to definition, etc.
July 10, 2012
Re: Let's stop parser Hell
On Tuesday, July 10, 2012 21:25:52 Timon Gehr wrote:
> On 07/10/2012 09:14 PM, Philippe Sigaud wrote:
> > Tue, Jul 10, 2012 at 12:41 PM, Roman D. Boiko<rb@d-coding.com> wrote:
> >> One disadvantage of Packrat parsers I mentioned was problematic error
> >> recovery (according to the article from ANTLR website). After some
> >> additional research, I found that it is not a critical problem. To find
> >> the exact place of error (from parser's perspective, not user's) one
> >> only needs to remember the farthest successfully parsed position (among
> >> several backtracking attempts) and the reason that it failed.
> > 
> > IIRC, that's what I encoded in Pegged (admittedly limited) error
> > reporting: remember the farthest error.
> > 
> >> It is also possible to rerun parsing with some additional heuristics
> >> after failing, thus enabling advanced error repair scenarios.
> > 
> > Do people really want error-repairing parsers? I want my parsers to
> > tell me something is bad, and, optionally to advance a possible
> > repair, but definitely *not* to automatically repair an inferred error
> > and continue happily.
> 
> FWIW, this is what most HTML parsers are doing.

Which is horrible. You pretty much have to with HTML because of the horrid 
decision that it should be parsed so laxly by browsers, but pretty much 
nothing else should do that. Either it's correct or it's not. Having the 
compiler "fix" your code would cause far more problems than it would ever fix.

- Jonathan M Davis
July 10, 2012
Re: Let's stop parser Hell
On Tuesday, 10 July 2012 at 20:25:12 UTC, Jonathan M Davis wrote:
> On Tuesday, July 10, 2012 21:25:52 Timon Gehr wrote:
>> FWIW, this is what most HTML parsers are doing.
>
> Which is horrible. You pretty much have to with HTML because of the
> horrid decision that it should be parsed so laxly by browsers, but
> pretty much nothing else should do that. Either it's correct or it's
> not. Having the compiler "fix" your code would cause far more problems
> than it would ever fix.

Not having control over the parser or the source code causes problems. 
The ability to deliver useful functionality (see my post above) is a 
different use case.
July 10, 2012
Re: Let's stop parser Hell
On 2012-07-10 22:25, Jonathan M Davis wrote:

> Which is horrible. You pretty much have to with HTML because of the horrid
> decision that it should be parsed so laxly by browsers, but pretty much
> nothing else should do that. Either it's correct or it's not. Having the
> compiler "fix" your code would cause far more problems than it would ever fix.

I'm not sure but I think he was referring to a kind of error reporting 
technique used by compilers. Example:

int foo ()
{
   int a = 3 // note the missing semicolon
   return a;
}

Instead of going completely mad because of the missing semicolon, 
the parser will basically insert a semicolon, report the error and 
then happily continue parsing. I think this makes it easier to find 
later errors and less likely to report incorrect errors caused by a 
previous error.
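
A rough sketch of that kind of recovery, in Python rather than inside a real compiler. The two statement forms, tokenize and parse_statements are all invented for the illustration, and the statement shapes are not validated beyond their length. When the ';' is missing, the parser records an error and carries on as if it had been there:

```python
import re


def tokenize(src):
    """Split source into words/numbers and the punctuation '=' and ';'."""
    return re.findall(r"\w+|[=;]", src)


def parse_statements(tokens):
    """Parse `int NAME = VALUE ;` and `return NAME ;` statements.

    Returns (statements, errors). A missing ';' is reported once and then
    treated as if it were present, so later statements still parse.
    """
    statements, errors, i = [], [], 0
    while i < len(tokens):
        if tokens[i] == "int":
            length = 4  # int NAME = VALUE
        elif tokens[i] == "return":
            length = 2  # return NAME
        else:
            errors.append(f"token {i}: unexpected {tokens[i]!r}")
            i += 1
            continue
        stmt = tokens[i:i + length]
        i += length
        if i < len(tokens) and tokens[i] == ";":
            i += 1  # normal case: consume the terminator
        else:
            # Recovery: report the error, then behave as if ';' were there.
            errors.append(f"token {i}: expected ';' after {' '.join(stmt)!r}")
        statements.append(stmt)
    return statements, errors


# Same mistake as in the example above: no ';' after `int a = 3`.
statements, errors = parse_statements(tokenize("int a = 3 return a;"))
```

Exactly one error is reported (the missing ';' after 'int a = 3') and the return statement is still parsed, so it does not produce a bogus follow-up error.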

-- 
/Jacob Carlborg
July 10, 2012
Re: Let's stop parser Hell
On 11-Jul-12 00:25, Jonathan M Davis wrote:
> On Tuesday, July 10, 2012 21:25:52 Timon Gehr wrote:
>> On 07/10/2012 09:14 PM, Philippe Sigaud wrote:
>>> Tue, Jul 10, 2012 at 12:41 PM, Roman D. Boiko<rb@d-coding.com> wrote:
>>>> One disadvantage of Packrat parsers I mentioned was problematic error
>>>> recovery (according to the article from ANTLR website). After some
>>>> additional research, I found that it is not a critical problem. To find
>>>> the exact place of error (from parser's perspective, not user's) one
>>>> only needs to remember the farthest successfully parsed position (among
>>>> several backtracking attempts) and the reason that it failed.
>>>
>>> IIRC, that's what I encoded in Pegged (admittedly limited) error
>>> reporting: remember the farthest error.
>>>
>>>> It is also possible to rerun parsing with some additional heuristics
>>>> after failing, thus enabling advanced error repair scenarios.
>>>
>>> Do people really want error-repairing parsers? I want my parsers to
>>> tell me something is bad, and, optionally to advance a possible
>>> repair, but definitely *not* to automatically repair an inferred error
>>> and continue happily.
>>
>> FWIW, this is what most HTML parsers are doing.
>
> Which is horrible. You pretty much have to with HTML because of the horrid
> decision that it should be parsed so laxly by browsers, but pretty much
> nothing else should do that. Either it's correct or it's not. Having the
> compiler "fix" your code would cause far more problems than it would ever fix.
>

BTW, clang does this, and even more on the semantic level. It's known 
to have won legions of users because of that (well, not only that, but 
good diagnostics in general).


-- 
Dmitry Olshansky