Thread overview
Float rounding (in JSON)
Oct 13, 2022
Sergey
Oct 13, 2022
Sergey
Oct 14, 2022
Patrick Schluter
Oct 14, 2022
bauss
Dec 30, 2022
Sergey
October 13, 2022

I'm not an IEEE 754 expert, but I just noticed that D rounds floats differently from other languages. I suppose this happens because D parses floating-point numbers as double and prints them with the full precision of a double. But this just doesn't match the behavior of other languages.

I'm not sure whether this is somehow connected with the JSON implementations.

Is it possible to get the same results in D as in the other languages? Explicit formatting is not the answer, because the number of digits varies. That's why just specifying "%.XXf" will not resolve the issue: the last two numbers have 14 and 15 digits after the decimal point.

Code Python

import json

str = '{ "f1": 43.476379000000065, "f2": 43.499718999999987, "f3": 43.499718000000087, "f4": 43.418052999999986 }'
print(json.loads(str))

Result

{'f1': 43.476379000000065, 'f2': 43.499718999999985, 'f3': 43.49971800000009, 'f4': 43.418052999999986}

Code Crystal

require "json"

str = "{ \"f1\": 43.476379000000065, \"f2\": 43.499718999999987, \"f3\": 43.499718000000087, \"f4\": 43.418052999999986 }"

puts JSON.parse(str)

Result

{"f1" => 43.476379000000065, "f2" => 43.499718999999985, "f3" => 43.49971800000009, "f4" => 43.418052999999986}

Code D

import std;

void main() {
    string s = `{ "f1": 43.476379000000065, "f2": 43.499718999999987, "f3": 43.499718000000087, "f4": 43.418052999999986 }`;
    JSONValue j = parseJSON(s);
    writeln(j);
}

Result

{"f1":43.4763790000000654,"f2":43.4997189999999847,"f3":43.4997180000000867,"f4":43.4180529999999862}
October 13, 2022

On 10/13/22 3:00 PM, Sergey wrote:

> I'm not an IEEE 754 expert, but I just noticed that D rounds floats differently from other languages. I suppose this happens because D parses floating-point numbers as double and prints them with the full precision of a double. But this just doesn't match the behavior of other languages.
>
> I'm not sure whether this is somehow connected with the JSON implementations.

It doesn't look really that far off. You can't expect floating point parsing to be exact, as floating point does not perfectly represent decimal numbers, especially when you get down to the least significant bits.

> Is it possible to get the same results in D as in the other languages? Explicit formatting is not the answer, because the number of digits varies. That's why just specifying "%.XXf" will not resolve the issue: the last two numbers have 14 and 15 digits after the decimal point.

It seems like you are looking to output a certain number of digits. If you limit the digits, you can get the outcome you desire.

But I want to point out something you may have missed:

> Code Python
>
> import json
>
> str = '{ "f1": 43.476379000000065, "f2": 43.499718999999987, "f3": 43.499718000000087, "f4": 43.418052999999986 }'
> print(json.loads(str))
>
> Result
>
> {'f1': 43.476379000000065, 'f2': 43.499718999999985, 'f3': 43.49971800000009, 'f4': 43.418052999999986}

Let's line these up so they are easier to read:

f1 in:  43.476379000000065
f1 out: 43.476379000000065
f2 in:  43.499718999999987
f2 out: 43.499718999999985
f3 in:  43.499718000000087
f3 out: 43.49971800000009
f4 in:  43.418052999999986
f4 out: 43.418052999999986

Note how the f2 output visibly differs from the input: the trailing 7 became a 5. This is an artifact of floating-point parsing, and it affects only the least significant digits.

Also note that in f3 the trailing 87 was shortened to 9, so the output has one digit fewer than the input. If that last digit were anywhere close to significant, you would have expected it to appear.
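
A quick way to check this, sketched in Python since it is one of the languages compared above: both decimal strings for f2 convert to the same double, and `repr` prints the shortest string that converts back to that double. The variable names are only illustrative.

```python
# Input literal from the JSON and the value Python printed for f2.
f2_in = float("43.499718999999987")
f2_out = float("43.499718999999985")

# Both decimal strings map to the same IEEE 754 double.
same_double = f2_in == f2_out
print(same_double)

# repr() picks the shortest decimal string that round-trips to that double.
f2_repr = repr(f2_in)
print(f2_repr)

# For f3, 16 significant digits are already enough to identify the double.
f3_in = float("43.499718000000087")
f3_repr = repr(f3_in)
print(f3_repr, float(f3_repr) == f3_in)
```

So the changed digits do not mean the value was damaged; the printed text is just a different name for the same stored number.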

> Code Crystal
>
> require "json"
>
> str = "{ \"f1\": 43.476379000000065, \"f2\": 43.499718999999987, \"f3\": 43.499718000000087, \"f4\": 43.418052999999986 }"
>
> puts JSON.parse(str)
>
> Result
>
> {"f1" => 43.476379000000065, "f2" => 43.499718999999985, "f3" => 43.49971800000009, "f4" => 43.418052999999986}

Same here

> Code D
>
> import std;
>
> void main() {
>     string s = `{ "f1": 43.476379000000065, "f2": 43.499718999999987, "f3": 43.499718000000087, "f4": 43.418052999999986 }`;
>     JSONValue j = parseJSON(s);
>     writeln(j);
> }
>
> Result
>
> {"f1":43.4763790000000654,"f2":43.4997189999999847,"f3":43.4997180000000867,"f4":43.4180529999999862}

Let's look at D's representation:

f1 in:  43.476379000000065
f1 out: 43.4763790000000654
f2 in:  43.499718999999987
f2 out: 43.4997189999999847
f3 in:  43.499718000000087
f3 out: 43.4997180000000867
f4 in:  43.418052999999986
f4 out: 43.4180529999999862

Why does it print more digits than the other languages? That must be the default for writeln. You can change the number of digits printed, but probably not when printing an entire JSON structure.
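
Counting digits, every value in the D output has 18 significant digits, which is what a `%.18g`-style format would produce. The Python sketch below only illustrates that observation. It assumes D parsed each literal to the same double as Python, and it does not claim to show what std.json does internally.

```python
# Literals from the JSON input and the strings D printed for them.
inputs = ["43.476379000000065", "43.499718999999987",
          "43.499718000000087", "43.418052999999986"]
d_output = ["43.4763790000000654", "43.4997189999999847",
            "43.4997180000000867", "43.4180529999999862"]

def significant_digits(text):
    # Count digits, ignoring the decimal point (no leading zeros here).
    return len(text.replace(".", ""))

digit_counts = [significant_digits(s) for s in d_output]
print(digit_counts)

# Each D string still denotes the same double as the original literal,
# so the extra digits carry no additional information.
same_value = [float(a) == float(b) for a, b in zip(inputs, d_output)]
print(same_value)

# Formatting with 18 significant digits, for comparison with d_output.
for s in inputs:
    print(format(float(s), ".18g"))
```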

But also look at f3: D's output is actually closer to the expected value than the output of the other languages.

If you want exact representation of data, parse it as a string instead of a double.
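
As an illustration of keeping the original text, Python's `json.loads` accepts a `parse_float` hook; passing `str` or `decimal.Decimal` keeps every digit as written. This is a Python sketch only. Whether a given D JSON library offers an equivalent hook is not covered here.

```python
import json
from decimal import Decimal

text = '{ "f2": 43.499718999999987, "f3": 43.499718000000087 }'

# Default: numbers become doubles, so the last digits may change on output.
as_double = json.loads(text)

# parse_float receives the literal text of each float in the document.
as_text = json.loads(text, parse_float=str)
as_decimal = json.loads(text, parse_float=Decimal)

print(repr(as_double["f2"]))
print(as_text["f2"])
print(as_decimal["f3"])
```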

I'm assuming you are comparing for testing purposes? If you are, just realize you can never be accurate here. You just have to live with the difference. Typically, when comparing floating-point values, you use an epsilon to check that the value is "close enough"; you can't enforce an exact representation.
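
A minimal sketch of such a comparison in Python, using `math.isclose`. The expected value and the tolerance of 1e-12 are arbitrary choices for this example.

```python
import math

expected = 43.499719          # value a test might be written against
parsed = float("43.499718999999987")

# Exact comparison fails although the values agree to about 15 digits.
exact_match = parsed == expected

# Compare with a relative tolerance instead.
close_enough = math.isclose(parsed, expected, rel_tol=1e-12)

print(exact_match, close_enough)
```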

-Steve

October 13, 2022

On Thursday, 13 October 2022 at 19:27:22 UTC, Steven Schveighoffer wrote:

Thank you Steven, for your very detailed answer.

> It doesn't look really that far off. You can't expect floating point parsing to be exact, as floating point does not perfectly represent decimal numbers, especially when you get down to the least significant bits.

This is sad, because an "exact" match is what I need in this toy example.

> But I want to point out something you may have missed:

Actually, I meant those things too :)

> But also look at f3: D's output is actually closer to the expected value than the output of the other languages.
>
> If you want exact representation of data, parse it as a string instead of a double.

Unfortunately, that did not help me with this task (which is pretty awkward): it parses some GeoData from a JSON file, then creates a string representation of that data and computes a hash of that string. Because they use the hash, I need exactly the same string representation to match the answer.
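
To see why the digits matter for this task: a hash changes completely when the string differs by a single character, even if both strings denote the same double. The sketch below uses SHA-256 from Python's `hashlib` purely as a stand-in; the hash function and the string layout used by the benchmark are not assumed here.

```python
import hashlib

def digest(text):
    # Hash the UTF-8 bytes of a string representation.
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

short_form = '{"f3":43.49971800000009}'     # digits as Python and Crystal print them
long_form = '{"f3":43.4997180000000867}'    # digits as printed by the D program

# Both spellings denote the same double, but the hashes differ.
same_number = float("43.49971800000009") == float("43.4997180000000867")
same_hash = digest(short_form) == digest(long_form)

print(same_number, same_hash)
```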

> I'm assuming you are comparing for testing purposes? If you are, just realize you can never be accurate here. You just have to live with the difference. Typically, when comparing floating-point values, you use an epsilon to check that the value is "close enough"; you can't enforce an exact representation.
>
> -Steve

Actually, it was my attempt to implement this benchmark-game problem: https://programming-language-benchmarks.vercel.app/problem/json-serde

As you can see, many languages pass the tests, so I assume they produce exactly the same representation of those floating-point numbers.
Maybe I am wrong and misunderstood the code of the other implementations. But I tested at least Python and Crystal and found their results pretty confusing (as you pointed out, D is more accurate for f3). What surprised me even more is that they both produce exactly the same confusing results for those numbers.
That's why I concluded that there may be some special, documented rule for this, and that I just can't find how to replicate that "well-known behavior" in D.
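
The common rule appears to be shortest round-trip formatting: print the fewest significant digits that still convert back to the same double. Python documents this for `repr` of floats; that Crystal follows the same rule is an inference from its matching output, not something stated in this thread. A minimal Python sketch of the idea, with `shortest_roundtrip` as a hypothetical helper name:

```python
def shortest_roundtrip(x):
    # Try increasing precision until the text parses back to the same double.
    # 17 significant digits are always enough for an IEEE 754 double.
    for precision in range(1, 18):
        text = "%.*g" % (precision, x)
        if float(text) == x:
            return text
    return repr(x)  # not reached for finite doubles

values = [43.476379000000065, 43.499718999999987,
          43.499718000000087, 43.418052999999986]
results = [shortest_roundtrip(v) for v in values]
print(results)
```

For these four values the loop yields the same strings that Python and Crystal printed above. Real implementations use faster algorithms than this brute-force search.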

October 14, 2022

On Thursday, 13 October 2022 at 19:27:22 UTC, Steven Schveighoffer wrote:

> On 10/13/22 3:00 PM, Sergey wrote:
>> [...]
>
> It doesn't look really that far off. You can't expect floating point parsing to be exact, as floating point does not perfectly represent decimal numbers, especially when you get down to the least significant bits.
>
> [...]

To me it looks like there is a conversion to real (80-bit floats) somewhere in the D code and that the other languages stay in double mode everywhere. Maybe forcing double by disabling x87 on the D side would yield the same results as the other languages?

October 14, 2022

On Friday, 14 October 2022 at 09:00:11 UTC, Patrick Schluter wrote:

> On Thursday, 13 October 2022 at 19:27:22 UTC, Steven Schveighoffer wrote:
>> On 10/13/22 3:00 PM, Sergey wrote:
>>> [...]
>>
>> It doesn't look really that far off. You can't expect floating point parsing to be exact, as floating point does not perfectly represent decimal numbers, especially when you get down to the least significant bits.
>>
>> [...]
>
> To me it looks like there is a conversion to real (80-bit floats) somewhere in the D code and that the other languages stay in double mode everywhere. Maybe forcing double by disabling x87 on the D side would yield the same results as the other languages?

Looking through the source code, for floating-point values we call parse!double when parsing a JSON number.

I don't see real being used anywhere when parsing.

So if anything, it would have to happen internally in parse or in dmd. I haven't checked either yet.

December 30, 2022

On Thursday, 13 October 2022 at 19:00:30 UTC, Sergey wrote:

> I'm not an IEEE 754 expert, but I just noticed that D rounds floats differently from other languages. I suppose this happens because D parses floating-point numbers as double and prints them with the full precision of a double. But this just doesn't match the behavior of other languages.

So I had no luck with std.json. But when std is not the solution, third-party libraries can help. I tried ASDF. It is a more or less archived library, but it works well, and its documentation is short and clear (mir-ion really needs to improve its documentation).

With asdf we can just serialize the JSON, and it automatically rounds floating-point numbers with the same magic logic as the other languages.

One caveat: some numbers that are usually doubles may appear in the JSON as integers, and asdf automatically converts them to double too. If you need to process them exactly as integers, you can use Variant!(int, double) as the data type and provide a custom serializer/deserializer, as shown in the example in the asdf documentation:
http://asdf.libmir.org/asdf_serialization.html#.serializeToAsdf

PS: Thanks to Steven for his suggestions on Discord.