July 10, 2012
Re: Let's stop parser Hell
On Tuesday, July 10, 2012 22:40:17 Jacob Carlborg wrote:
> On 2012-07-10 22:25, Jonathan M Davis wrote:
> > Which is horrible. You pretty much have to with HTML because of the horrid
> > decision that it should be parsed so laxly by browsers, but pretty much
> > nothing else should do that. Either it's correct or it's not. Having the
> > compiler "fix" your code would cause far more problems than it would ever
> > fix.
> I'm not sure but I think he was referring to a kind of error reporting
> technique used by compilers. Example:
> 
> int foo ()
> {
> int a = 3 // note the missing semicolon
> return a;
> }
> 
> Instead of the parser going completely mad because of the missing
> semicolon, it will basically insert a semicolon, report the error and
> then happily continue parsing. I think this will make it easier to find
> later errors and less likely to report incorrect errors due to a
> previous error.

Well, giving an error, continuing to parse, and giving a partial result can be 
useful (and you give a prime example of that), but "fixing" the problem (e.g. by 
inserting the semicolon) and not considering it to be an error would be a 
_huge_ mistake IMHO. And that's essentially what happens with HTML.

- Jonathan M Davis
July 10, 2012
Re: Let's stop parser Hell
On 07/10/2012 10:53 PM, Jonathan M Davis wrote:
> On Tuesday, July 10, 2012 22:40:17 Jacob Carlborg wrote:
>> On 2012-07-10 22:25, Jonathan M Davis wrote:
>>> Which is horrible. You pretty much have to with HTML because of the horrid
>>> decision that it should be parsed so laxly by browsers, but pretty much
>>> nothing else should do that. Either it's correct or it's not. Having the
>>> compiler "fix" your code would cause far more problems than it would ever
>>> fix.
>> I'm not sure but I think he was referring to a kind of error reporting
>> technique used by compilers. Example:
>>
>> int foo ()
>> {
>>     int a = 3 // note the missing semicolon
>>     return a;
>> }
>>
>> Instead of the parser going completely mad because of the missing
>> semicolon, it will basically insert a semicolon, report the error and
>> then happily continue parsing. I think this will make it easier to find
>> later errors and less likely to report incorrect errors due to a
>> previous error.
>
> Well, giving an error, continuing to parse, and giving a partial result can be
> useful (and you give a prime example of that), but "fixing" the problem (e.g. by
> inserting the semicolon) and not considering it to be an error would be a
> _huge_ mistake IMHO.

This is actually precisely what many of the more recent curly-brace-
and-semicolon languages have been doing with regard to semicolons.
July 10, 2012
Re: Let's stop parser Hell
On 09/07/2012 10:14, Christophe Travert wrote:
> deadalnix, in message (digitalmars.D:171330), wrote:
>> D isn't 100% CFG. But it is close.
>
> What makes D fail to be a CFG?

type[something] <= something can be a type or an expression.
typeid(something) <= same here
identifier!(something) <= again
July 10, 2012
Re: Let's stop parser Hell
On 07/11/2012 01:16 AM, deadalnix wrote:
> On 09/07/2012 10:14, Christophe Travert wrote:
>> deadalnix, in message (digitalmars.D:171330), wrote:
>>> D isn't 100% CFG. But it is close.
>>
>> What makes D fail to be a CFG?
>
> type[something] <= something can be a type or an expression.
> typeid(something) <= same here
> identifier!(something) <= again

'something' is context-free:

something ::= type | expression.
July 11, 2012
Re: Let's stop parser Hell
On 2012-07-10 22:53, Jonathan M Davis wrote:

> Well, giving an error, continuing to parse, and giving a partial result can be
> useful (and you give a prime example of that), but "fixing" the problem (e.g. by
> inserting the semicolon) and not considering it to be an error would be a
> _huge_ mistake IMHO. And that's essentially what happens with HTML.

No, that is _not_ what happens with HTML. With HTML, the browser does 
_not_ output the error and continues as if the code were valid. As far as 
I know, up until HTML 5, the spec didn't say what should happen 
with invalid code.

This is just an error-handling strategy that is an implementation detail. 
It will not change what is and what isn't valid code. Would you prefer 
to get just the first error when compiling? You fix the error, compile 
again, get a new error, and so on.

-- 
/Jacob Carlborg
July 11, 2012
Re: Let's stop parser Hell
On Wednesday, July 11, 2012 08:41:53 Jacob Carlborg wrote:
> On 2012-07-10 22:53, Jonathan M Davis wrote:
> > Well, giving an error, continuing to parse, and giving a partial result
> > can be useful (and you give a prime example of that), but "fixing" the
> > problem (e.g. by inserting the semicolon) and not considering it to be an
> > error would be a _huge_ mistake IMHO. And that's essentially what happens
> > with HTML.
> No, that is _not_ what happens with HTML. With HTML, the browser does
> _not_ output the error and continues as if the code were valid. As far as
> I know, up until HTML 5, the spec didn't say what should happen
> with invalid code.
> 
> This is just an error-handling strategy that is an implementation detail.
> It will not change what is and what isn't valid code. Would you prefer
> to get just the first error when compiling? You fix the error, compile
> again, get a new error, and so on.

??? I guess that I wasn't clear. I mean that with HTML, it ignores errors. The 
browser doesn't spit out errors. It just guesses at what you really meant and 
displays that. It "fixes" the error for you, which is a horrible design IMHO. 
Obviously, we're stuck with it for HTML, but it should not be replicated with 
anything else.

This is in contrast to your example of outputting an error and continuing to 
parse as best it can in order to provide more detail and more error messages 
but _not_ ultimately considering the parsing successful. _That_ is useful. 
HTML's behavior is not.
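
A minimal sketch of that strategy in Python, under stated assumptions: the toy grammar (only `int x = N;` and `return x;` statements inside a function body) and the names `tokenize`, `parse_body`, `ParseResult` and `STARTERS` are hypothetical and not taken from any real compiler. The parser records the missing semicolon as an error, keeps going so that later statements are still checked, and never reports the parse as successful once an error has been seen.

```python
import re
from dataclasses import dataclass, field

# Hypothetical toy grammar: identifiers, integers and a few punctuation marks.
TOKEN = re.compile(r"[A-Za-z_]\w*|\d+|[=;{}()]")
# Keywords that can only begin a statement in this toy grammar.
STARTERS = {"int", "return"}


def tokenize(src):
    """Return (text, line) pairs; '//' comments are dropped."""
    tokens = []
    for lineno, line in enumerate(src.splitlines(), start=1):
        code = line.split("//", 1)[0]
        tokens += [(m.group(), lineno) for m in TOKEN.finditer(code)]
    return tokens


@dataclass
class ParseResult:
    statements: list = field(default_factory=list)
    errors: list = field(default_factory=list)

    @property
    def ok(self):
        # Recovery never turns a failed parse into a successful one.
        return not self.errors


def parse_body(src):
    """Parse statements, recovering from a missing ';' instead of stopping."""
    toks = tokenize(src)
    result = ParseResult()
    pos = 0
    while pos < len(toks):
        start = pos
        # A statement runs up to ';'. Meeting the start of the next
        # statement first means the semicolon is missing.
        while pos < len(toks) and toks[pos][0] != ";":
            if pos > start and toks[pos][0] in STARTERS:
                break
            pos += 1
        result.statements.append([text for text, _ in toks[start:pos]])
        if pos < len(toks) and toks[pos][0] == ";":
            pos += 1  # consume the real semicolon
        else:
            # Behave as if ';' were present, but record the error.
            line = toks[pos - 1][1]
            result.errors.append(f"line {line}: expected ';'")
    return result


body = "int a = 3 // note the missing semicolon\nreturn a;"
result = parse_body(body)
print(result.ok, result.errors, result.statements)
```

Both statements end up in `result.statements`, so later checks can still run on the second one, while `result.ok` stays false because of the recorded error.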

- Jonathan M Davis
July 11, 2012
Re: Let's stop parser Hell
Timon Gehr, in message (digitalmars.D:171814), wrote:
> On 07/11/2012 01:16 AM, deadalnix wrote:
>> On 09/07/2012 10:14, Christophe Travert wrote:
>>> deadalnix, in message (digitalmars.D:171330), wrote:
>>>> D isn't 100% CFG. But it is close.
>>>
>>> What makes D fail to be a CFG?
>>
>> type[something] <= something can be a type or an expression.
>> typeid(something) <= same here
>> identifier!(something) <= again
> 
> 'something' is context-free:
> 
> something ::= type | expression.

Do you have to know whether something is a type or an expression for 
simple parsing? The language had better not require this, otherwise simple 
parsing is not possible without looking at all forward references and 
imported files.
July 11, 2012
Re: Let's stop parser Hell
On 2012-07-11 08:52, Jonathan M Davis wrote:

> ??? I guess that I wasn't clear. I mean that with HTML, it ignores errors. The
> browser doesn't spit out errors. It just guesses at what you really meant and
> displays that. It "fixes" the error for you, which is a horrible design IMHO.
> Obviously, we're stuck with it for HTML, but it should not be replicated with
> anything else.
>
> This is in contrast to your example of outputting an error and continuing to
> parse as best it can in order to provide more detail and more error messages
> but _not_ ultimately considering the parsing successful. _That_ is useful.
> HTML's behavior is not.

Ok, I see. It seems we mean the same thing.

-- 
/Jacob Carlborg
July 11, 2012
Re: Let's stop parser Hell
On 07/11/2012 10:23 AM, Christophe Travert wrote:
> Timon Gehr, in message (digitalmars.D:171814), wrote:
>> On 07/11/2012 01:16 AM, deadalnix wrote:
>>> On 09/07/2012 10:14, Christophe Travert wrote:
>>>> deadalnix, in message (digitalmars.D:171330), wrote:
>>>>> D isn't 100% CFG. But it is close.
>>>>
>>>> What makes D fail to be a CFG?
>>>
>>> type[something] <= something can be a type or an expression.
>>> typeid(something) <= same here
>>> identifier!(something) <= again
>>
>> 'something' is context-free:
>>
>> something ::= type | expression.
>
> Do you have to know whether something is a type or an expression for
> simple parsing?

No. Some token sequences can be both a type and an expression based on
the context (the CFG is ambiguous), but that is the analyser's business.
Parsing D code does not require any kind of analysis.
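
As a rough illustration of that split, here is a small Python sketch. Everything in it is an assumption made for the example: the node names `TypeOrExpr` and `Index`, the functions `parse_index` and `analyse`, and the restriction to the single form `base[arg]`. The parser builds the same tree whatever `arg` turns out to be; only the later analysis step, which is handed the set of declared type names, decides between the two readings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TypeOrExpr:
    """Parser output for tokens that are valid both as a type and as an expression."""
    name: str


@dataclass(frozen=True)
class Index:
    """Syntax node for `base[arg]`; the parser leaves `arg` unclassified."""
    base: str
    arg: TypeOrExpr


def parse_index(src):
    """Parse `base[arg]` from the text alone, with no symbol lookup."""
    base, rest = src.split("[", 1)
    if not rest.endswith("]"):
        raise SyntaxError("expected ']'")
    return Index(base.strip(), TypeOrExpr(rest[:-1].strip()))


def analyse(node, type_names):
    """Semantic step: classify the argument using the declared type names."""
    if node.arg.name in type_names:
        return f"associative array of {node.base} keyed by {node.arg.name}"
    return f"static array of {node.base} with length {node.arg.name}"


node = parse_index("int[Foo]")          # same tree in both cases
print(analyse(node, {"Foo"}))           # Foo was declared as a type
print(analyse(node, set()))             # Foo is some constant expression
```

The parse of `int[Foo]` never consults imports or forward references; those only matter once `analyse` runs.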

> The language had better not require this, otherwise simple
> parsing is not possible without looking at all forward references and
> imported files.
July 11, 2012
Re: Let's stop parser Hell
On Tuesday, 10 July 2012 at 23:49:58 UTC, Timon Gehr wrote:
> On 07/11/2012 01:16 AM, deadalnix wrote:
>> On 09/07/2012 10:14, Christophe Travert wrote:
>>> deadalnix, in message (digitalmars.D:171330), wrote:
>>>> D isn't 100% CFG. But it is close.
>>> What makes D fail to be a CFG?
>> type[something] <= something can be a type or an expression.
>> typeid(something) <= same here
>> identifier!(something) <= again
>
> 'something' is context-free:
>
> something ::= type | expression.

I don't see how "type | expression" is context-free. The input 
"Foo" could be a type or expression, you can't tell which without 
looking at the context.