January 26, 2021
On Tuesday, 26 January 2021 at 19:47:47 UTC, John Colvin wrote:
> On Tuesday, 26 January 2021 at 19:02:53 UTC, Bruce Carneal wrote:
>> On Tuesday, 26 January 2021 at 16:31:35 UTC, Ola Fosheim Grøstad wrote:
>>> On Monday, 25 January 2021 at 21:40:01 UTC, Bruce Carneal wrote:
>>>> If I'm wrong, if it really is as you say "no problem", then the RYU author sure wasted a lot of time on the proof in his PLDI 2018 paper.
>>>
>>> Why is that? If the proof is for the algorithm then it has nothing to do with proving your implementation to be correct.
>>
>> If you can construct a tractable exhaustive proof of an implementation, which is what I believe you asserted in this case, then you don't need any other proof.
>>
>> I'm very happy to see the 32 bit floats done exhaustively BTW.
>>  It is very suggestive, but not definitive if I understand the problem correctly.
>>
>> The mapping from the domain to the co-domain is the entire problem IIUC so asserting that certain ranges within the domain can be skipped is tantamount to saying that you know the implementation performs correctly in those regions.  What am I missing?
>
> You can probably inductively prove - or at least become quite certain - that your implementation is identical to the original algorithm within a given set of inputs, given one good result within that set. What those sets look like depends on the algorithm of course.
>
> E.g. given an algorithm for a function with type integer -> integer, if you can prove your implementation of an algorithm is linear w.r.t. its input and you can demonstrate it's correct for 1, you can then infer it will be correct for any other integer input. I think? Off the top of my head.
>
> Of course <insert interesting algorithm here> isn't linear w.r.t. its inputs, but maybe it is within some fixed subset of its inputs, so you only have to test one of those to cover the whole set. Also, linearity isn't the only property one can consider.

Yes.  If we can define, with confidence, a representative set which spans the entire domain, then we're good to go: inductive, piecewise, whatever.  We trade one proof (the original) for another (that the coverage is exhaustive/sufficient).  That can be a great trade, but I'm not confident in my ability to define such a set in this case (floats wider than 32 bits).  Perhaps others are confident.

However it goes from this point on, I'm very happy to see the progress made towards repeatability, towards solid foundations. My thanks to those hammering on this for the rest of us.


January 26, 2021
On Tuesday, 26 January 2021 at 19:02:53 UTC, Bruce Carneal wrote:
> If you can construct a tractable exhaustive proof of an implementation, which is what I believe you asserted in this case, then you don't need any other proof.

Of course you do, you need proof of the tests being correct...

> can be skipped is tantamount to saying that you know the implementation performs correctly in those regions.

In general, if one understands the algorithm as implemented, then one can prove correctness over certain regions/ranges, either manually or automatically. Then you only need to test the regions/ranges that are hard to prove.

For instance, for a converter from int to string, the most likely failure points will be tied to the rollover from "9" to "0" as well as the ends of the base-2 input range. Meaning, if it works for 1, 2, 3, you know that it also works for 4, 5, 6, 7, 8, but maybe not for 9 or 10.

January 26, 2021
On Tuesday, 26 January 2021 at 20:27:25 UTC, Ola Fosheim Grøstad wrote:
> For instance, for a converter from int to string, the most likely failure points will be tied to the rollover from "9" to "0" as well as the ends of the base-2 input range. Meaning, if it works for 1, 2, 3, you know that it also works for 4, 5, 6, 7, 8, but maybe not for 9 or 10.

Or to put it another way. If you can prove that these implications hold:

0-1 works => 2-8 works
0-11 works => 12-98 works
0-101 works => 102-998 works
etc
then you only need to test 9-11, 99-101.
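
To make that concrete, a rough sketch in D of testing only those boundary neighbourhoods could look like this, where intToString is a hypothetical converter under test (std.conv stands in for it here) and the particular boundary list is illustrative rather than a proven covering set:

import std.conv : to;

// Hypothetical converter under test; std.conv.to!string stands in for it.
string intToString(int x)
{
    return x.to!string;
}

unittest
{
    // Only the digit-rollover neighbourhoods (9/10, 99/100, ...) and the
    // ends of the binary input range are checked; the values in between
    // are assumed covered by the implications above.
    static immutable int[] boundaries = [
        int.min, int.min + 1,
        -1001, -1000, -999, -101, -100, -99, -11, -10, -9,
        -1, 0, 1,
        9, 10, 11, 99, 100, 101, 999, 1000, 1001,
        int.max - 1, int.max
    ];

    foreach (x; boundaries)
        assert(intToString(x).to!int == x);   // round-trip as the oracle
}

The loop itself is trivial; it is the proof of the implications that carries the weight.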


January 26, 2021
On Tuesday, 26 January 2021 at 20:27:25 UTC, Ola Fosheim Grøstad wrote:
> On Tuesday, 26 January 2021 at 19:02:53 UTC, Bruce Carneal wrote:
>> If you can construct a tractable exhaustive proof of an implementation, which is what I believe you asserted in this case, then you don't need any other proof.
>
> Of course you do, you need proof of the tests being correct...

I believed that to be obvious.  There is no utility in exhaustively looping over tests that are incorrect.  The domain <==> co-domain properties in this case mean that the correctness check is direct.
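
To spell out what I mean by "direct": the round-trip property is its own oracle, so no table of expected outputs is needed.  A rough sketch of the exhaustive 32-bit check in D could look like the following, where formatShortest is only a stand-in that prints 9 significant digits (enough for any finite binary32 value to round-trip, though not minimally), non-finite values are skipped, and a full pass over all 2^32 inputs naturally takes a while:

import std.conv : to;
import std.format : format;
import std.math : isFinite;

// Stand-in for the formatter under test.  A Ryu-style implementation
// would emit the *shortest* string that round-trips; 9 significant
// digits always round-trip for binary32, just not minimally.
string formatShortest(float x)
{
    return format("%.9g", x);
}

void exhaustiveRoundTrip()
{
    // Walk every 32-bit pattern, format it, parse it back, and demand
    // bit-identical recovery.  The check is "direct": no table of
    // expected strings is required.
    foreach (ulong i; 0UL .. 0x1_0000_0000UL)
    {
        uint bits = cast(uint) i;
        float x = *cast(float*) &bits;
        if (!isFinite(x))
            continue;                      // skip NaNs and infinities here

        float back = formatShortest(x).to!float;
        uint backBits = *cast(uint*) &back;
        assert(backBits == bits, formatShortest(x));
    }
}

Whether the assert ever fires depends on both the formatter and the parser being correctly rounded, which is rather the point.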

>
>> can be skipped is tantamount to saying that you know the implementation performs correctly in those regions.
>
> In general, if one understands the algorithm as implemented, then one can prove correctness over certain regions/ranges, either manually or automatically. Then you only need to test the regions/ranges that are hard to prove.
>
> For instance, for a converter from int to string, the most likely failure points will be tied to the rollover from "9" to "0" as well as the ends of the base-2 input range. Meaning, if it works for 1, 2, 3, you know that it also works for 4, 5, 6, 7, 8, but maybe not for 9 or 10.

Yes.  So now we have to prove that we can subset the exhaustive test.

This is something like the algorithm development technique that starts with a backtracking solution (full enumeration) and then prunes away via memoization or other means to achieve a lower complexity.  If, OTOH, I start in the middle, asserting that I "know" what the shape of the problem is, then I'm on shaky ground.

As practicing programmers we live with "shaky ground" every day, but I'm hoping we can do better in this case.  It's already moving in a good direction.


January 26, 2021
On Tuesday, 26 January 2021 at 20:45:22 UTC, Bruce Carneal wrote:
> As practicing programmers we live with "shaky ground" every day, but I'm hoping we can do better in this case.  It's already moving in a good direction.

Yes, the real issue in practice is constantly changing requirements and evolving software over time.

We don't really have programming languages that support restructuring/modifying code very well. And well, if you have lots of "correct" tests, it becomes even more expensive to change...

So, breaking things up into smaller units with concepts like actors might be a way to go, perhaps. Dunno.


January 26, 2021
On Tuesday, 26 January 2021 at 20:34:51 UTC, Ola Fosheim Grøstad wrote:
> On Tuesday, 26 January 2021 at 20:27:25 UTC, Ola Fosheim Grøstad wrote:
>> For instance, for a converter from int to string, the most likely failure points will be tied to the rollover from "9" to "0" as well as the ends of the base-2 input range. Meaning, if it works for 1, 2, 3, you know that it also works for 4, 5, 6, 7, 8, but maybe not for 9 or 10.
>
> Or to put it another way. If you can prove that these implications hold:
>
> 0-1 works => 2-8 works
> 0-11 works => 12-98 works
> 0-101 works => 102-998 works
> etc
> then you only need to test 9-11, 99-101.

I think this is a great way to go for simple functions.  If the domain ==> co-domain mapping is truly obvious then we can exploit that to write an efficient unittest that we can take as the next best thing to the simpler exhaustive proof.

I don't see a mapping for the current problem that would allow for such efficient enumerative testing but it's not my field.  Still, if/when you prove that you've fully covered the domain I'd be very excited to read about it.  It could be a major advance.  Note: by "proof" I do not mean probabilistic evidence.  We already have that.
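
For what it's worth, if someone did establish such a set, the resulting unittest might look something like the sketch below.  The values are just the usual suspects (the subnormal boundary, adjacent representable values, classic decimal troublemakers); they are emphatically not a proven covering set, and the 17-significant-digit formatting is merely a stand-in for the formatter under test (17 digits always round-trip for binary64, just not minimally):

import std.conv : to;
import std.format : format;

unittest
{
    // Illustrative "hard" cases only; NOT a proven representative set.
    // Signed zero and non-finite values are omitted for brevity.
    static immutable double[] suspects = [
        0.0,
        double.min_normal,            // smallest normal
        double.min_normal / 2,        // a subnormal
        double.max,
        1.0, 1.0 + double.epsilon,    // adjacent representable values
        0.1, 0.3,                     // classic decimal troublemakers
        0x1p53, 0x1p53 + 2,           // boundary where doubles stop holding every integer
    ];

    foreach (x; suspects)
    {
        // Round-trip oracle again: format, parse back, compare bits.
        string s = format("%.17g", x);    // stand-in for the formatter under test
        double y = s.to!double;
        assert(*cast(const(ulong)*) &x == *cast(const(ulong)*) &y, s);
    }
}

Writing such a test is easy; proving that the chosen values actually cover the domain is the hard part.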

Even if you don't come up with proof, the attempt may be worth your time.  Good luck.

January 26, 2021
On Tuesday, 26 January 2021 at 21:10:20 UTC, Bruce Carneal wrote:
> Even if you don't come up with proof, the attempt may be worth your time.  Good luck.

Not sure what you mean... Why would anyone waste their time trying to prove something for code they have no control over? It could change any minute, and which implementation, by the way? *shrugs*



January 27, 2021
On Tuesday, 26 January 2021 at 16:25:54 UTC, Berni44 wrote:
> The paper states 104 kB for doubles (can be compressed at the
> expense of slowing everything down).

It also notes that "the size could be halved to 52 kB with no performance impact".

For comparison, a simple "Hello, World!", built with optimizations and stripped, comes to about 700 kB.  Even allowing that the 128-bit table (which will support 80-bit reals) is going to be larger, is this really an issue on modern machines?

The only circumstances in which I can see it mattering are very low-resource embedded use cases, where the D stdlib is unlikely to be viable anyway.

January 27, 2021
On Tuesday, 26 January 2021 at 22:23:34 UTC, Ola Fosheim Grøstad wrote:
> On Tuesday, 26 January 2021 at 21:10:20 UTC, Bruce Carneal wrote:
>> Even if you don't come up with proof, the attempt may be worth your time.  Good luck.
>
> Not sure what you mean... Why would anyone waste their time trying to prove something for code they have no control over? It could change any minute, and which implementation, by the way? *shrugs*

You made an assertion, in a proof discussion, that the tractable extension of the exhaustive proof to cover larger FP types was "no problem".

Either it really is "no problem", and a clarification, to the good of all concerned, would take little of your time, or you were mistaken and others should not expect proof-level enlightenment from you on the topic any time soon.

It is certainly "no problem" to extend probabilistic testing, but closing the gap between probable and proven appears to me to be quite a bit harder.  That's why I had hoped that you'd had a breakthrough.

Your inability and/or unwillingness to produce a "no problem" proof suggests that you have not had a breakthrough (Fermat at least had the excuse of dying :-) ).

Ah well. There are a lot of other things to work on.  I just hope that we can come to agree on what it means to actually prove something.

January 27, 2021
On Wednesday, 27 January 2021 at 00:15:41 UTC, Joseph Rushton Wakeling wrote:
> On Tuesday, 26 January 2021 at 16:25:54 UTC, Berni44 wrote:
>> The paper states 104 kB for doubles (can be compressed at the
>> expense of slowing everything down).
>
> It also notes that "the size could be halved to 52 kB with no performance impact".
>
> For comparison, a simple "Hello, World!", built with optimizations and stripped, comes to about 700 kB.  Even allowing that the 128-bit table (which will support 80-bit reals) is going to be larger, is this really an issue on modern machines?
>
> The only circumstances in which I can see it mattering are very low-resource embedded use cases, where the D stdlib is unlikely to be viable anyway.

I'd vote for "overweight and almost-certainly-correct" over "trim but iffy" if the extra poundage is reasonable, which, in non-embedded environments, it appears to be.

On another note, it looks like Ryu was submitted by MS in late 2019 for inclusion in libc++.  Despite a massive 3 minutes of google-fuing, I'm still not sure how far that has gotten, but the direction seems pretty clear.  A large body of others believe that Ryu is a good way to go and are moving to standardize on it.

It's not a live-or-die issue for me, but Ryu looks pretty good.