Replacement for snprintf
berni44
October 30, 2019
In PR 7222 [1] Robert Schadek suggested replacing the call to snprintf in std.format with a custom implementation written in D. Over the last few days I took a deeper look into this, and I now have a function that works for floats (and probably also for doubles, but I haven't tested that yet; it should also work with reals if ucent were available; without ucent I need a workaround for real or have to fall back to BigInt).

So far I have only implemented the f specifier, but it shouldn't be difficult to add the e and g specifiers and the uppercase versions. Some work also remains to implement the flags (-,+,0,<space>,#), but again, I think this will not be very difficult. Unfortunately I'll be busy with some other (non-D) stuff for some time. I'll probably continue work on this sometime in November.

I checked correctness for floats by comparing with the result of snprintf for about 1% of all numbers (I will do that for all of them before filing a PR, though). The only differences are rounding issues when the number is exactly halfway between two adjacent representations. The implementation of snprintf on my computer always rounds towards zero while mine rounds in the opposite direction. (E.g. 0.125 rounded to two digits is 0.13 in my implementation while it's 0.12 in snprintf's implementation.) I doubt that all implementations of the printf variants are identical in this regard.
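To make the tie-breaking question concrete, here is a minimal sketch of three possible rules, working on integers (thousandths rounded to hundredths) so that no floating-point subtleties interfere. The function names are hypothetical and this is not the implementation discussed in this thread. Note that 0.125 alone cannot tell "ties toward zero" apart from "ties to even", since both give 0.12; a value such as 0.375, which is also exactly representable in binary, would distinguish them.

```d
import std.stdio : writeln;

// Hypothetical helpers: round a non-negative value given in thousandths
// to hundredths. They differ only in how an exact tie (last digit 5) is broken.

/// Ties go away from zero: 0.125 -> 0.13.
ulong roundHalfAway(ulong thousandths)
{
    return (thousandths + 5) / 10;
}

/// Ties go toward zero: 0.125 -> 0.12.
ulong roundHalfTowardZero(ulong thousandths)
{
    return (thousandths + 4) / 10;
}

/// Ties go to the even neighbour: 0.125 -> 0.12, 0.375 -> 0.38.
ulong roundHalfEven(ulong thousandths)
{
    immutable q = thousandths / 10;
    immutable r = thousandths % 10;
    if (r > 5 || (r == 5 && q % 2 == 1))
        return q + 1;
    return q;
}

void main()
{
    // 0.125: "toward zero" and "to even" agree.
    writeln(roundHalfAway(125), " ", roundHalfTowardZero(125), " ", roundHalfEven(125)); // 13 12 12
    // 0.375: here the two rules differ.
    writeln(roundHalfAway(375), " ", roundHalfTowardZero(375), " ", roundHalfEven(375)); // 38 37 38
}
```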

I also compared the speed of both implementations. They are generally in the same order of magnitude (600-2800 ns per number, depending on precision and number). On average my implementation is slightly faster. For numbers close to 0 the snprintf implementation is faster (I wasn't able to follow the algorithm they use), especially if the desired precision is large (I'll try to improve this, because it might become a real problem for reals). For all other numbers my current implementation wins by a small margin.

[1] https://github.com/dlang/phobos/pull/7222#issuecomment-544909188
October 30, 2019
On 10/30/2019 06:44 AM, berni44 wrote:
> The only differences are rounding issues, when the number is
> exactly between two adjacent ways of displaying. The implementation of
> snprintf on my computer always rounds towards zero while mine rounds in
> the opposite direction. (E.g. 0.125 rounded to two digits is 0.13 in my
> implementation while it's 0.12 in snprintf's implementation)

The tie-breaker is to always round towards the even digit. So it should always produce 1.12, 1.14, etc.

Ali

October 30, 2019
On Wednesday, 30 October 2019 at 15:48:44 UTC, Ali Çehreli wrote:
> The tie-breaker is to always round towards the even digit. So it should always produce 1.12, 1.14, etc.

As far as I know that's for avoiding error propagation when intermediate results need to be rounded. If I'm not completely mistaken, Donald Knuth proved that rounding toward even avoids errors that might build up over several such steps.

But here there is little chance that the result will be used for new calculations. It's most often used for printing a result that humans have to read. This is different.

October 30, 2019
On Wed, Oct 30, 2019 at 01:44:52PM +0000, berni44 via Digitalmars-d wrote:
> In PR 7222 [1] Robert Schadek suggested replacing the call to snprintf in std.format with a custom implementation written in D. Over the last few days I took a deeper look into this, and I now have a function that works for floats (and probably also for doubles, but I haven't tested that yet; it should also work with reals if ucent were available; without ucent I need a workaround for real or have to fall back to BigInt).
> 
> So far I have only implemented the f specifier, but it shouldn't be difficult to add the e and g specifiers and the uppercase versions. Some work also remains to implement the flags (-,+,0,<space>,#), but again, I think this will not be very difficult. Unfortunately I'll be busy with some other (non-D) stuff for some time. I'll probably continue work on this sometime in November.

If you haven't already, please read:

	https://www.zverovich.net/2019/02/11/formatting-floating-point-numbers.html

especially the papers linked in the first paragraph.

Formatting floating-point numbers is not a trivial task. It's easy to write up something that works for common cases, but it's not so easy to get something that gives the best results in *all* cases. You probably should use the algorithms referenced above for your implementation, instead of coming up with your own that may have unexpected corner cases that don't produce the right output.


T

-- 
Valentine's Day: an occasion for florists to reach into the wallets of nominal lovers in dire need of being reminded to profess their hypothetical love for their long-forgotten.
October 30, 2019
On Wednesday, 30 October 2019 at 16:04:10 UTC, berni44 wrote:
> On Wednesday, 30 October 2019 at 15:48:44 UTC, Ali Çehreli wrote:
>> The tie-breaker is to always round towards the even digit. So it should always produce 1.12, 1.14, etc.
>
> As far as I know that's for avoiding error propagation when intermediate results need to be rounded. If I'm not completely mistaken, Donald Knuth proved that rounding toward even avoids errors that might build up over several such steps.
>
> But here there is little chance that the result will be used for new calculations. It's most often used for printing a result that humans have to read. This is different.

It's reasonably common to have numeric values written out in text format and read back in and used in subsequent computations. Not always a great idea, especially when done without much consideration for round-off errors. But it's not uncommon.
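As a hedged illustration of that round trip, the sketch below formats a double, parses the text back with std.conv.to, and compares. The helper name throughText is hypothetical. With only three decimals the original value is lost; 17 significant digits are enough to identify any double uniquely, provided both the formatter and the parser round correctly.

```d
import std.conv : to;
import std.format : format;
import std.stdio : writeln;

/// Hypothetical helper: formats x with the given format string
/// and parses the resulting text back into a double.
double throughText(double x, string fmt)
{
    return format(fmt, x).to!double;
}

void main()
{
    immutable x = 1.0 / 3.0;

    // Three decimals drop most of the digits, so the value changes.
    writeln(throughText(x, "%.3f") == x); // false

    // 17 significant digits: the text that a correctly rounding
    // parser can map back to the original double.
    writeln(format("%.17g", x));
}
```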

--Jon
October 30, 2019
On Wednesday, 30 October 2019 at 13:44:52 UTC, berni44 wrote:
> In PR 7222 [1] Robert Schadek suggested replacing the call to snprintf in std.format with a custom implementation written in D. Over the last few days I took a deeper look into this, and I now have a function that works for floats (and probably also for doubles, but I haven't tested that yet; it should also work with reals if ucent were available; without ucent I need a workaround for real or have to fall back to BigInt).
>
> [...]

According to IEEE 754-2008:

"5.12.2 External decimal character sequences representing finite numbers

[...]

For binary formats, all conversions of H significant digits or fewer round correctly according to the applicable rounding direction;"

Where H is 9 for single, 17 for double. IEEE 754 doesn't specify an H for reals.


That means that snprintf must use the current rounding mode that can be read using FloatingPointControl.rounding from std.math.
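For reference, a minimal sketch of how the current rounding mode can be queried and temporarily changed through FloatingPointControl. The helper isRoundToNearest is a hypothetical name, and the sketch assumes a platform on which std.math supports FloatingPointControl. Whether a given formatter actually honours the mode is a separate question that this code does not answer.

```d
import std.math : FloatingPointControl;
import std.stdio : writeln;

/// Hypothetical helper: true if the hardware currently rounds to nearest.
bool isRoundToNearest()
{
    return FloatingPointControl.rounding == FloatingPointControl.roundToNearest;
}

void main()
{
    writeln(isRoundToNearest()); // round-to-nearest is the usual default

    {
        FloatingPointControl fpctrl;
        fpctrl.rounding = FloatingPointControl.roundDown;
        writeln(isRoundToNearest()); // false inside this scope
    } // fpctrl goes out of scope and the previous mode is restored

    writeln(isRoundToNearest());
}
```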

October 30, 2019
On Wednesday, 30 October 2019 at 17:41:26 UTC, H. S. Teoh wrote:
> If you haven't already, please read:
>
> 	https://www.zverovich.net/2019/02/11/formatting-floating-point-numbers.html
>
> especially the papers linked in the first paragraph.

Thanks for that link. I haven't had a look at the Grisu algorithms yet. But I'll definitely do that.

> Formatting floating-point numbers is not a trivial task. It's easy to write up something that works for common cases, but it's not so easy to get something that gives the best results in *all* cases.

I know that this is something we all wish for. Anyway, my goal is set somewhat lower: I'd like to replace the existing call to snprintf with something that is programmed in D and which should be pure, @safe and CTFE-able. And ideally it should not be slower than snprintf.
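To illustrate what "pure, @safe and CTFE-able" buys, here is a toy sketch for unsigned integers only (the function name toDecimal is hypothetical and this is far simpler than floating-point formatting). Because it never calls into C, the compiler can evaluate it at compile time, which a call to snprintf does not allow.

```d
import std.stdio : writeln;

/// Hypothetical helper: converts an unsigned integer to decimal text
/// without calling any C function.
string toDecimal(ulong value) pure nothrow @safe
{
    char[20] buf; // ulong.max has 20 decimal digits
    size_t pos = buf.length;
    do
    {
        buf[--pos] = cast(char) ('0' + value % 10);
        value /= 10;
    } while (value != 0);
    return buf[pos .. $].idup;
}

// Evaluated by the compiler during compilation (CTFE).
static assert(toDecimal(2019) == "2019");

void main()
{
    writeln(toDecimal(0));         // 0
    writeln(toDecimal(ulong.max)); // 18446744073709551615
}
```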

> You probably should use the algorithms referenced above for your implementation,

I read through the paper on the Ryu algorithm and rejected it (at least for me; if someone else is going to implement it and file a PR, that's fine). My reasons for rejecting it are that the algorithm does not have exactly the same goal as printf, which IMHO means that it cannot be used here, and that it needs a lookup table that is too large (300K for 128-bit reals).

From what I read in the Ryu paper about the Grisu algorithms, I fear a little that they have the first of the above-mentioned problems too. But I can't tell for sure yet.

> instead of coming up with your own that may have
> unexpected corner cases that don't produce the right output.

Obviously I need to prove somehow that the algorithm is correct. While this can be done for floats by running it on all numbers and comparing the results with those of snprintf (or those calculated by bc), this is no longer possible for doubles and reals (a random sample can still be tested, but that's no proof). Anyway, I think that the proof isn't hard to give. The current algorithm is short and straightforward. (And: if I implement one of the mentioned algorithms, it can still contain bugs, because I may have made a mistake somewhere.)
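A rough sketch of such a comparison over float bit patterns could look like the following. All function names are hypothetical, std.format's format stands in for the function under test, and a stride is used so the loop finishes quickly; a stride of 1 would visit every float. Mismatches on exact ties are to be expected wherever the two implementations break ties differently.

```d
import core.stdc.stdio : snprintf;
import std.format : format;
import std.math : isFinite;
import std.stdio : writeln;

/// Reinterprets a 32-bit pattern as a float.
float fromBits(uint bits)
{
    union U { uint u; float f; }
    U v;
    v.u = bits;
    return v.f;
}

/// Reference output from the C library, two decimals.
string viaSnprintf(float x)
{
    char[512] buf; // float.max needs 39 digits before the decimal point
    immutable n = snprintf(buf.ptr, buf.length, "%.2f", cast(double) x);
    return buf[0 .. n].idup;
}

/// Counts finite floats (every stride-th bit pattern, stride > 0)
/// for which the two formatters disagree.
size_t countMismatches(uint stride)
{
    size_t mismatches;
    for (ulong b = 0; b <= uint.max; b += stride)
    {
        immutable x = fromBits(cast(uint) b);
        if (!isFinite(x))
            continue; // skip NaN and infinities
        if (format("%.2f", x) != viaSnprintf(x))
            ++mismatches;
    }
    return mismatches;
}

void main()
{
    writeln(countMismatches(1_000_003));
}
```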
October 30, 2019
On Wednesday, 30 October 2019 at 17:50:03 UTC, Jon Degenhardt wrote:
> It's reasonably common to have numeric values written out in text format and read back in and used in subsequent computations. Not always a great idea, especially when done without much consideration for round-off errors. But it's not uncommon.

But IMHO this is the fault of people who do this and not the fault of a printing routine.

But: when pondering how to fix the output of format for ranges of strings (it currently places quotes around each string, which is somewhat inconsistent because single strings are printed without quotes, and causes confusion), I came up with the idea of a new format specifier, maybe S as in "source", in addition to s, which prints the value in a form that can be used directly in D code (which is, as far as I know, the reason why the quotes are printed).

That could also be used to produce a representation of a float that, when read back in, yields the same float as before; this could be done with the Ryu or Grisu algorithm, because these algorithms have exactly this goal.

October 30, 2019
On 10/30/2019 12:19 PM, berni44 wrote:

> But: when pondering how to fix the output of format for ranges of
> strings (it currently places quotes around each string

Just to make sure, you are aware of the optional '-' before '(', right? "%-(%s%)" does not print the quotes.
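A small sketch of the difference, with hypothetical wrapper names and a ", " separator added for readability:

```d
import std.format : format;
import std.stdio : writeln;

/// Elements formatted with "%(": strings are quoted and escaped.
string joinQuoted(string[] items)
{
    return format("%(%s, %)", items);
}

/// Elements formatted with "%-(": strings are printed as they are.
string joinPlain(string[] items)
{
    return format("%-(%s, %)", items);
}

void main()
{
    writeln(joinQuoted(["a", "b"])); // "a", "b"
    writeln(joinPlain(["a", "b"]));  // a, b
}
```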

Ali

October 30, 2019
On Wednesday, 30 October 2019 at 18:16:56 UTC, Rumbu wrote:
> That means that snprintf must use the current rounding mode that can be read using FloatingPointControl.rounding from std.math.

Is it really a "must"? We are not completely bound by the IEEE standard and, if there are good reasons, might deviate from it. For example, comparing two floats with <= produces either "false" or "true" in D. According to IEEE there should be a third possible result, namely "not comparable". Having said this, it would be possible to implement it the way you describe, but probably at some cost (slower, and with more and less readable code). I'll think about it.
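To illustrate the comparison point: with a NaN operand both <= and > yield false in D, so the "not comparable" case has to be detected separately. A minimal sketch, where the Ordering enum and the compare function are hypothetical names:

```d
import std.math : isNaN;
import std.stdio : writeln;

/// Hypothetical four-way comparison result.
enum Ordering { less, equal, greater, unordered }

/// Compares two doubles, reporting NaN operands explicitly.
Ordering compare(double a, double b)
{
    if (isNaN(a) || isNaN(b))
        return Ordering.unordered;
    if (a < b)
        return Ordering.less;
    if (a > b)
        return Ordering.greater;
    return Ordering.equal;
}

void main()
{
    writeln(double.nan <= 1.0);       // false
    writeln(double.nan > 1.0);        // false
    writeln(compare(double.nan, 1.0)); // unordered
    writeln(compare(0.5, 1.0));        // less
}
```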