RFC: std.json sucessor (page 14) - D Programming Language Discussion Forum

August 26, 2014

Re: RFC: std.json sucessor

Posted by Ola Fosheim Grøstad
in reply to Don

On Tuesday, 26 August 2014 at 13:24:11 UTC, Don wrote:
> No, it's more subtle. On the original x87, signalling NaNs are triggered for 64 bits loads, but not for 80 bit loads. You have to read the fine print to discover this.

You are right, but it happens for loads from the FP-stack too: «Source operand is an SNaN. Does not occur if the source operand is in double extended-precision floating-point format (FLD m80fp or FLD ST(i)).»

> I don't think the behaviour was intentional.

It seems reasonable: don't you need to load and save NaNs without exceptions when you do a context switch? I don't think the extended format was meant for "end users".

Anyway, the x87 FP stack is history, even MOVSS is considered legacy by Intel…

August 26, 2014

Re: RFC: std.json sucessor

Posted by Sönke Ludwig
in reply to Don

On 26.08.2014 15:43, Don wrote:
> On Monday, 25 August 2014 at 14:04:12 UTC, Sönke Ludwig wrote:
>> On 25.08.2014 15:07, Don wrote:
>>> ie this should be parsable:
>>>
>>> {"foo": NaN, "bar": Infinity, "baz": -Infinity}
>>
>> This would probably best added as another (CT) optional feature. I
>> think the default should strictly adhere to the JSON specification,
>> though.
>
> Yes, it should be optional, but not a compile-time option.
> I think it should parse it, and based on a runtime flag, throw an error
> (perhaps an OutOfRange error or something, and use the same thing for
> values that exceed the representable range).
>
> An app may accept these non-standard values under certain circumstances
> and not others. In real-world code, you see a *lot* of these guys.

Why not a compile time option?

That sounds to me like such an app should simply enable parsing those values and manually test for NaN at places where it matters. For all other applications (the majority), encountering NaN/Infinity will simply mean that there is a bug, so it makes sense to not accept those at all by default.

Apart from that I don't think that it's a good idea for the lexer in general to accept non-standard input by default.

>
> Part of the reason these are important, is that NaN or Infinity
> generally means some Javascript code just has an uninitialized variable.
> Any other kind of invalid JSON typically means something very nasty has
> happened. It's important to distinguish these.

As far as I understood, JavaScript will output those special values as null (at least when not using external JSON libraries). But even if not, an uninitialized variable can also be very nasty, so it's hard to see why that kind of bug should be silently supported (by default).

August 26, 2014

Re: RFC: std.json sucessor

Posted by Ola Fosheim Grøstad
in reply to Ola Fosheim Grøstad

On Tuesday, 26 August 2014 at 13:43:56 UTC, Ola Fosheim Grøstad wrote:
> Anyway, the x87 FP stack is history, even MOVSS is considered legacy by Intel…

Sorry for being off-topic, but MOVSS and VMOVSS on AMD don't throw FP exceptions either, but calculations do. So it seems that AMD and Intel are sufficiently close for D to support NaNs, IMHO. Forget the legacy…

http://amd-dev.wpengine.netdna-cdn.com/wordpress/media/2012/10/26568_APM_v41.pdf

Ola.

August 26, 2014

Re: RFC: std.json sucessor

Posted by Don
in reply to Sönke Ludwig

On Tuesday, 26 August 2014 at 14:06:42 UTC, Sönke Ludwig wrote:
> On 26.08.2014 15:43, Don wrote:
>> On Monday, 25 August 2014 at 14:04:12 UTC, Sönke Ludwig wrote:
>>> On 25.08.2014 15:07, Don wrote:
>>>> ie this should be parsable:
>>>>
>>>> {"foo": NaN, "bar": Infinity, "baz": -Infinity}
>>>
>>> This would probably best added as another (CT) optional feature. I
>>> think the default should strictly adhere to the JSON specification,
>>> though.
>>
>> Yes, it should be optional, but not a compile-time option.
>> I think it should parse it, and based on a runtime flag, throw an error
>> (perhaps an OutOfRange error or something, and use the same thing for
>> values that exceed the representable range).
>>
>> An app may accept these non-standard values under certain circumstances
>> and not others. In real-world code, you see a *lot* of these guys.
>
> Why not a compile time option?
>
> That sounds to me like such an app should simply enable parsing those values and manually test for NaN at places where it matters.
> For all other (the majority) of applications, encountering NaN/Infinity will simply mean that there is a bug, so it makes sense to not accept those at all by default.
>
> Apart from that I don't think that it's a good idea for the lexer in general to accept non-standard input by default.

Please note, I've been talking about the lexer. I'm choosing my words very carefully.

>> Part of the reason these are important, is that NaN or Infinity
>> generally means some Javascript code just has an uninitialized variable.
>> Any other kind of invalid JSON typically means something very nasty has
>> happened. It's important to distinguish these.
>
> As far as I understood, JavaScript will output those special values as null (at least when not using external JSON libraries).

No. Javascript generates them directly. Naive JS code generates these guys. That's why they're so important.

> But even if not, an uninitialized variable can also be very nasty, so it's hard to see why that kind of bug should be silently supported (by default).

I never said it should be accepted by default. I said it is a situation which should be *lexed*. Ideally, by default it should give a different error from simply 'invalid JSON'. I believe it should ALWAYS be lexed, even if an error is ultimately generated.

This is the difference: if you get NaN or Infinity, there's probably a straightforward bug in the Javascript code, but your D code is fine. Any other kind of JSON parsing error means you've got a garbage string that isn't JSON at all. They are very different errors.
It's a diagnostics issue.
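
To make the diagnostics point concrete, here is a minimal sketch of a lexer step that always recognizes `NaN`, `Infinity` and `-Infinity` as their own token kind and only then decides, based on a runtime flag, whether to report them. It is written in JavaScript purely for illustration; `lexValueToken`, `describe` and `allowNonFinite` are hypothetical names, not part of std.json or of the proposed module.

```javascript
// Hypothetical helper: classify the value token that starts at `pos`.
// Returns { kind, text } where kind is "nonFinite", "number" or "invalid".
function lexValueToken(input, pos) {
  const rest = input.slice(pos);
  // Non-standard literals are lexed explicitly instead of falling through
  // to the generic "invalid" case.
  const nonFinite = /^(NaN|-?Infinity)/.exec(rest);
  if (nonFinite) return { kind: "nonFinite", text: nonFinite[0] };
  // Number grammar as defined by the JSON specification.
  const number = /^-?(0|[1-9][0-9]*)(\.[0-9]+)?([eE][+-]?[0-9]+)?/.exec(rest);
  if (number) return { kind: "number", text: number[0] };
  return { kind: "invalid", text: rest.slice(0, 1) };
}

// Hypothetical policy step: the runtime flag decides whether a lexed
// non-finite literal is an error, and the message differs from the one
// used for garbage input.
function describe(token, allowNonFinite) {
  if (token.kind === "nonFinite" && !allowNonFinite) {
    return "non-standard number literal: " + token.text;
  }
  if (token.kind === "invalid") {
    return "invalid token: " + token.text;
  }
  return "ok";
}

console.log(describe(lexValueToken('{"foo": NaN}', 8), false)); // non-standard number literal: NaN
console.log(describe(lexValueToken('{"foo": NaN}', 8), true));  // ok
console.log(describe(lexValueToken('{"foo": @}', 8), false));   // invalid token: @
```

The two error strings are the whole point of the sketch: the caller can tell "the producer emitted a non-finite number" apart from "this is not JSON at all".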

August 26, 2014

Re: RFC: std.json sucessor

Posted by Sönke Ludwig
in reply to Don

On 26.08.2014 16:40, Don wrote:
> On Tuesday, 26 August 2014 at 14:06:42 UTC, Sönke Ludwig wrote:
>> On 26.08.2014 15:43, Don wrote:
>>> On Monday, 25 August 2014 at 14:04:12 UTC, Sönke Ludwig wrote:
>>>> On 25.08.2014 15:07, Don wrote:
>>>>> ie this should be parsable:
>>>>>
>>>>> {"foo": NaN, "bar": Infinity, "baz": -Infinity}
>>>>
>>>> This would probably best added as another (CT) optional feature. I
>>>> think the default should strictly adhere to the JSON specification,
>>>> though.
>>>
>>> Yes, it should be optional, but not a compile-time option.
>>> I think it should parse it, and based on a runtime flag, throw an error
>>> (perhaps an OutOfRange error or something, and use the same thing for
>>> values that exceed the representable range).
>>>
>>> An app may accept these non-standard values under certain circumstances
>>> and not others. In real-world code, you see a *lot* of these guys.
>>
>> Why not a compile time option?
>>
>> That sounds to me like such an app should simply enable parsing those
>> values and manually test for NaN at places where it matters.
>> For all other (the majority) of applications, encountering
>> NaN/Infinity will simply mean that there is a bug, so it makes sense
>> to not accept those at all by default.
>>
>> Apart from that I don't think that it's a good idea for the lexer in
>> general to accept non-standard input by default.
>
> Please note, I've been talking about the lexer. I'm choosing my words
> very carefully.

I've been talking about the lexer, too. Sorry for the confusing use of the term "parsing" (after all, the lexer is also a parser, but anyway).

>
>>> Part of the reason these are important, is that NaN or Infinity
>>> generally means some Javascript code just has an uninitialized variable.
>>> Any other kind of invalid JSON typically means something very nasty has
>>> happened. It's important to distinguish these.
>>
>> As far as I understood, JavaScript will output those special values as
>> null (at least when not using external JSON libraries).
>
> No. Javascript generates them directly. Naive JS code generates these
> guys. That's why they're so important.

JSON.stringify(0/0) == "null"

Holds for all browsers that I've tested.
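
A small sketch of how both observations can be true at once: `JSON.stringify` maps non-finite numbers to `null`, while output assembled by plain string concatenation leaks `NaN` and `Infinity` into the text. The concatenation code is a hypothetical stand-in for "naive JS code", not taken from any real application.

```javascript
// Serializing with JSON.stringify: non-finite numbers become null.
const viaStringify = JSON.stringify({ foo: 0 / 0, bar: 1 / 0, baz: -1 / 0 });

// Hand-rolled serialization by string concatenation (hypothetical naive code):
// the implicit number-to-string conversion writes NaN/Infinity verbatim.
const value = 0 / 0;
const viaConcat = '{"foo": ' + value + ', "bar": ' + (1 / 0) + '}';

console.log(viaStringify); // {"foo":null,"bar":null,"baz":null}
console.log(viaConcat);    // {"foo": NaN, "bar": Infinity}

// A strict JSON parser rejects the hand-rolled output.
let rejected = false;
try {
  JSON.parse(viaConcat);
} catch (e) {
  rejected = e instanceof SyntaxError;
}
console.log(rejected); // true
```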

>
>> But even if not, an uninitialized variable can also be very nasty, so
>> it's hard to see why that kind of bug should be silently supported (by
>> default).
>
> I never said it should accepted by default. I said it is a situation
> which should be *lexed*. Ideally, by default it should give a different
> error from simply 'invalid JSON'. I believe it should ALWAYS be lexed,
> even if an error is ultimately generated.
>
> This is the difference: if you get NaN or Infinity, there's probably a
> straightforward bug in the Javascript code, but your D code is fine. Any
> other kind of JSON parsing error means you've got a garbage string that
> isn't JSON at all. They are very different errors.
> It's a diagnostics issue.

The error will be more like "filename(line:column): Invalid token" - possibly the text following the line/column could also be displayed. Wouldn't that be sufficient?

August 26, 2014

Re: RFC: std.json sucessor

Posted by Ola Fosheim Grøstad
in reply to Don

On Tuesday, 26 August 2014 at 14:40:02 UTC, Don wrote:
> This is the difference: if you get NaN or Infinity, there's probably a straightforward bug in the Javascript code, but your D code is fine. Any other kind of JSON parsing error means you've got a garbage string that isn't JSON at all. They are very different errors.

I don't care either way, but JSON.stringify() has the following support:

IE8 and up
Firefox 3.5 and up
Safari 4 and up
Chrome

So not using it is very much legacy…

August 26, 2014

Re: RFC: std.json sucessor

Posted by Sönke Ludwig
in reply to Sönke Ludwig

On 26.08.2014 16:51, Sönke Ludwig wrote:
> On 26.08.2014 16:40, Don wrote:
>> This is the difference: if you get NaN or Infinity, there's probably a
>> straightforward bug in the Javascript code, but your D code is fine. Any
>> other kind of JSON parsing error means you've got a garbage string that
>> isn't JSON at all. They are very different errors.
>> It's a diagnostics issue.
>
> The error will be more like "filename(line:column): Invalid token" -
> possibly the text following the line/column could also be displayed.
> Wouldn't that be sufficient?

One argument against supporting it in the parser is that the parser currently works without any configuration, but the user would then have to specify two sets of configuration options with this added.

August 27, 2014

Re: RFC: std.json sucessor

Posted by Walter Bright
in reply to Don

On 8/26/2014 12:24 AM, Don wrote:
> On Monday, 25 August 2014 at 23:29:21 UTC, Walter Bright wrote:
>> On 8/25/2014 4:15 PM, "Ola Fosheim Grøstad"
>> <ola.fosheim.grostad+dlang@gmail.com>" wrote:
>>> On Monday, 25 August 2014 at 21:24:11 UTC, Walter Bright wrote:
>>>> I didn't know that. But recall I did implement it in DMC++, and it turned out
>>>> to simply not be useful. I'd be surprised if the new C++ support for it does
>>>> anything worthwhile.
>>>
>>> Well, one should initialize with signaling NaN. Then you get an exception if you
>>> try to compute using uninitialized values.
>>
>>
>> That's the theory. The practice doesn't work out so well.
>
> To be more concrete:
>
> Processors from AMD have signalling NaN behaviour which is different from
> processors from Intel.
>
> And the situation is worst on most other architectures. It's a lost cause, I think.

The other issues were just when the snan => qnan conversion took place. This is quite unclear given the extensive constant folding, CTFE, etc., that D does.

It was also affected by how dmd generates code. Some code gen on floating point doesn't need the FPU, such as toggling the sign bit. But then what happens with snan => qnan?

The whole thing is an undefined, unmanageable mess.

August 28, 2014

Re: RFC: std.json sucessor

Posted by Don
in reply to Walter Bright

On Wednesday, 27 August 2014 at 23:51:54 UTC, Walter Bright wrote:
> On 8/26/2014 12:24 AM, Don wrote:
>> On Monday, 25 August 2014 at 23:29:21 UTC, Walter Bright wrote:
>>> On 8/25/2014 4:15 PM, "Ola Fosheim Grøstad"
>>> <ola.fosheim.grostad+dlang@gmail.com>" wrote:
>>>> On Monday, 25 August 2014 at 21:24:11 UTC, Walter Bright wrote:
>>>>> I didn't know that. But recall I did implement it in DMC++, and it turned out
>>>>> to simply not be useful. I'd be surprised if the new C++ support for it does
>>>>> anything worthwhile.
>>>>
>>>> Well, one should initialize with signaling NaN. Then you get an exception if you
>>>> try to compute using uninitialized values.
>>>
>>>
>>> That's the theory. The practice doesn't work out so well.
>>
>> To be more concrete:
>>
>> Processors from AMD have signalling NaN behaviour which is different from
>> processors from Intel.
>>
>> And the situation is worst on most other architectures. It's a lost cause, I think.
>
> The other issues were just when the snan => qnan conversion took place. This is quite unclear given the extensive constant folding, CTFE, etc., that D does.
>
> It was also affected by how dmd generates code. Some code gen on floating point doesn't need the FPU, such as toggling the sign bit. But then what happens with snan => qnan?
>
> The whole thing is an undefined, unmanageable mess.

I think the way to think of it is, to the programmer, there is *no such thing* as an snan value. It's an implementation detail that should be invisible.
Semantically, a signalling nan is a qnan value with a hardware breakpoint on it.

An SNAN should never enter the CPU. The CPU always converts them to QNAN if you try. You're kind of not supposed to know that SNAN exists.

Because of this, I think SNAN only ever makes sense for static variables. Setting local variables to snan doesn't make sense, since the snan has to enter the CPU. Making that work without triggering the snan is very painful. Making it trigger the snan on all forms of access is even worse.

If float.init exists, it cannot be an snan, since you are allowed to use float.init.

August 28, 2014

Re: RFC: std.json sucessor

Posted by Ola Fosheim Grøstad
in reply to Don

On Thursday, 28 August 2014 at 11:09:16 UTC, Don wrote:
> I think the way to think of it is, to the programmer, there is *no such thing* as an snan value. It's an implementation detail that should be invisible.
> Semantically, a signalling nan is a qnan value with a hardware breakpoint on it.

I disagree with this view.

QNAN: there is a value, but it does not result in a real number

SNAN: the value is missing for an unspecified reason

AFAIK some x86 ops such as ROUNDPD allow you to treat SNAN as QNAN or throw an exception. So there is a built-in test if needed.

Other ops such as reciprocals don't throw any FP exceptions and will treat SNAN as QNAN.

> An SNAN should never enter the CPU. The CPU always converts them to QNAN if you try. You're kind of not supposed to know that SNAN exists.

I'm not sure how you reached this interpretation?

The solution should be to emit a test for SNAN explicitly or implicitly if you cannot prove that SNAN is impossible.
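
As a rough illustration of what an explicit test could look like, here is a sketch that classifies a 64-bit double purely from its bit pattern, using integer operations only so the value never passes through floating-point arithmetic. It assumes the encoding used on x86, where a set most significant fraction bit marks a quiet NaN; some other architectures have used the opposite convention. `classifyDoubleBits` is a hypothetical name, and JavaScript with BigInt is used only for illustration.

```javascript
// Hypothetical helper: classify an IEEE 754 binary64 bit pattern given as a
// BigInt. Assumes the x86 convention: top fraction bit set means quiet NaN.
function classifyDoubleBits(bits) {
  const exponent = (bits >> 52n) & 0x7ffn;    // 11 exponent bits
  const fraction = bits & 0xfffffffffffffn;   // 52 fraction bits
  if (exponent !== 0x7ffn) return "finite";   // zero, subnormal or normal
  if (fraction === 0n) return "infinity";
  // Exponent all ones and non-zero fraction: some kind of NaN.
  return ((fraction >> 51n) & 1n) === 1n ? "qnan" : "snan";
}

console.log(classifyDoubleBits(0x7ff8000000000000n)); // qnan
console.log(classifyDoubleBits(0x7ff0000000000001n)); // snan
console.log(classifyDoubleBits(0x7ff0000000000000n)); // infinity
console.log(classifyDoubleBits(0x3ff0000000000000n)); // finite (this is 1.0)
```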
