October 30, 2019 Replacement for snprintf | ||||
---|---|---|---|---|
| ||||
In PR 7222 [1] Robert Schadek suggested replacing the call to snprintf in std.format with a custom function written in D. Over the last few days I took a deeper look into this, and I now have a function that works for floats. It probably also works for doubles, but I haven't tested that yet. It should also work for reals if ucent were available; without ucent I need a workaround for real or have to fall back to BigInt.
So far I have only implemented the f qualifier, but it shouldn't be difficult to add the e and g qualifiers and the uppercase versions. Some work is also needed to implement the flags (-,+,0,<space>,#), but again, I don't think this will be very difficult. Unfortunately I'll be busy with some other (non-D) stuff for some time. I'll probably continue work on this sometime in November.
I checked correctness for floats by comparing to the result of snprintf for about 1% of all numbers (I will do that for all of them before filing a PR, though). The only differences are rounding issues when the number is exactly halfway between two adjacent representations. The implementation of snprintf on my computer always rounds towards zero, while mine rounds in the opposite direction. (E.g. 0.125 rounded to two digits is 0.13 in my implementation, while it's 0.12 in snprintf's implementation.) I doubt that different implementations of the printf variants are all identical in this regard.
I also compared the speed of both implementations. They are generally in the same order of magnitude (600-2800ns per number, depending on precision and number). On average my implementation is slightly faster. For numbers close to 0 the snprintf implementation is faster (I wasn't able to follow the algorithm they use), especially if the desired precision is large (I'll try to improve this, because it might become a real problem for reals). For all other numbers my current implementation wins by a more or less small margin.
[1] https://github.com/dlang/phobos/pull/7222#issuecomment-544909188
|
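The spot check described in the opening post (comparing a candidate formatter against snprintf over a sample of all float bit patterns) could be sketched roughly as follows. This is only an illustration under assumptions, not berni44's actual test code: `Formatter`, `cFormat` and `countMismatches` are hypothetical names, and NaN and infinity are simply skipped.

```d
import core.stdc.stdio : snprintf;
import std.math : isFinite;

/// Hypothetical signature of the formatter under test.
alias Formatter = string function(float value, int precision);

/// Reference result: the C library's "%.*f" output for `value`.
string cFormat(float value, int precision)
{
    char[512] buf;
    // C varargs take floating-point values as double.
    const len = snprintf(buf.ptr, buf.length, "%.*f", precision, cast(double) value);
    assert(len >= 0 && cast(size_t) len < buf.length,
           "snprintf failed or the output was truncated");
    return buf[0 .. cast(size_t) len].idup;
}

/// Counts sampled finite floats for which `candidate` and snprintf disagree.
/// Every `stride`-th bit pattern is tested; stride 1 would cover all floats.
size_t countMismatches(Formatter candidate, uint stride, int precision)
{
    assert(stride > 0);
    size_t mismatches;
    // A 64-bit counter lets the loop terminate after passing uint.max.
    for (ulong bits = 0; bits <= uint.max; bits += stride)
    {
        const pattern = cast(uint) bits;
        const value = *cast(const(float)*) &pattern; // reinterpret the bits
        if (!isFinite(value))
            continue;
        if (candidate(value, precision) != cFormat(value, precision))
            ++mismatches;
    }
    return mismatches;
}

unittest
{
    // Sanity check: the reference formatter trivially agrees with itself.
    assert(countMismatches(&cFormat, 1_000_003, 2) == 0);
}
```

A real candidate would be passed in place of `&cFormat`; any mismatches it reports on exact ties such as 0.125 depend on how the local C library breaks them.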
October 30, 2019 Re: Replacement for snprintf | ||||
---|---|---|---|---|
| ||||
Posted in reply to berni44 | On 10/30/2019 06:44 AM, berni44 wrote:
> The only difference are rounding issues, when the number is
> exactly between two adjacent ways of displaying. The implementation of
> snprintf on my computer always rounds towards zero while mine rounds in
> the opposite direction. (E.g. 0.125 rounded to two digits is 0.13 in my
> implementation while it's 0.12 in snprintf's implementation)
The tie-breaker is to always round towards the even digit. So it should always produce 1.12, 1.14, etc.
Ali
|
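The round-to-even tie-breaker can be observed with `std.math.rint`, which rounds according to the current rounding mode (round-to-nearest-even by default). A minimal sketch, assuming the default mode is active; `nearestHundredths` is a hypothetical helper name:

```d
import std.math : rint;

/// Number of hundredths nearest to `value`; exact ties go to the even neighbour
/// as long as the default rounding mode is in effect.
double nearestHundredths(double value)
{
    return rint(value * 100);
}

unittest
{
    // 0.125 and 0.375 are exactly representable in binary,
    // so scaling by 100 produces true ties (12.5 and 37.5).
    assert(nearestHundredths(0.125) == 12); // tie -> 12 (even), i.e. 0.12
    assert(nearestHundredths(0.375) == 38); // tie -> 38 (even), i.e. 0.38
}
```

Most decimal fractions that look like ties (0.135, for example) are not exactly representable in binary, so they are not ties at all once stored in a double.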
October 30, 2019 Re: Replacement for snprintf | ||||
---|---|---|---|---|
| ||||
Posted in reply to Ali Çehreli | On Wednesday, 30 October 2019 at 15:48:44 UTC, Ali Çehreli wrote:
> The tie-breaker is to always round towards the even digit. So it should always produce 1.12, 1.14, etc.
As far as I know, that's for avoiding error propagation when intermediate results need to be rounded. If I'm not completely mistaken, Donald Knuth proved that rounding toward even avoids errors that might build up over several such steps.
But here there is little chance that the result will be used for new calculations. It's most often used for printing a result that humans have to read. This is different.
|
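The build-up of errors mentioned above can be illustrated with plain integer arithmetic. This is a small sketch with hypothetical helper names, working on exact tenths so that no binary representation effects interfere:

```d
/// Rounds a non-negative number of tenths to whole units, ties upwards.
ulong roundHalfUp(ulong tenths)
{
    return (tenths + 5) / 10;
}

/// Rounds a non-negative number of tenths to whole units, ties to the even unit.
ulong roundHalfEven(ulong tenths)
{
    ulong units = tenths / 10;
    const rest = tenths % 10;
    if (rest > 5 || (rest == 5 && units % 2 == 1))
        ++units;
    return units;
}

unittest
{
    // Ten exact ties: 0.5, 1.5, ..., 9.5. Their true sum is 50.
    ulong sumUp, sumEven;
    foreach (n; 0 .. 10)
    {
        sumUp += roundHalfUp(10 * n + 5);
        sumEven += roundHalfEven(10 * n + 5);
    }
    assert(sumUp == 55);   // every tie was pushed upwards
    assert(sumEven == 50); // upward and downward ties cancel out
}
```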
October 30, 2019 Re: Replacement for snprintf | ||||
---|---|---|---|---|
| ||||
Posted in reply to berni44 | On Wed, Oct 30, 2019 at 01:44:52PM +0000, berni44 via Digitalmars-d wrote:
> In PR 7222 [1] Robert Schadek suggested replacing the call to snprintf in std.format with an own method written in D. During the last days I took a deeper look into this and meanwhile I've got a function that works for floats (and probably also doubles, but I haven't tested that yet and it should also work with reals if ucent would be available; without ucent I need a workaround for real or fall back to BigInt).
>
> I only implemented f qualifier yet, but it shouldn't be difficult to add e and g qualifiers and the uppercase versions. Also some work needs to be done, to implement the flags (-,+,0,<space>,#), but again, I think, this will not be very difficult. Unfortunately I'll be busy with some other (non-D) stuff for some time. I'll probably continue work on this someday in november.
If you haven't already, please read:
https://www.zverovich.net/2019/02/11/formatting-floating-point-numbers.html
especially the papers linked in the first paragraph.
Formatting floating-point numbers is not a trivial task. It's easy to write up something that works for common cases, but it's not so easy to get something that gives the best results in *all* cases.
You probably should use the algorithms referenced above for your implementation, instead of coming up with your own that may have unexpected corner cases that don't produce the right output.
T
--
Valentine's Day: an occasion for florists to reach into the wallets of nominal lovers in dire need of being reminded to profess their hypothetical love for their long-forgotten.
|
October 30, 2019 Re: Replacement for snprintf | ||||
---|---|---|---|---|
| ||||
Posted in reply to berni44 | On Wednesday, 30 October 2019 at 16:04:10 UTC, berni44 wrote:
> On Wednesday, 30 October 2019 at 15:48:44 UTC, Ali Çehreli wrote:
>> The tie-breaker is to always round towards the even digit. So it should always produce 1.12, 1.14, etc.
>
> As far as I know that's for avoiding error propagation, when intermediate results need to be rounded. When I'm not completely mistaken, Donald Knuth proved that rounding toward even avoids errors that might build up using several such steps.
>
> But here there is little chance, that the result will be used for new calculations. It's most often used for printing a result that humans have to read. This is different.
It's reasonably common to have numeric values written out in text format and read back in and used in subsequent computations. Not always a great idea, especially when done without much consideration for round-off errors. But it's not uncommon.
--Jon
|
October 30, 2019 Re: Replacement for snprintf | ||||
---|---|---|---|---|
| ||||
Posted in reply to berni44 | On Wednesday, 30 October 2019 at 13:44:52 UTC, berni44 wrote:
> In PR 7222 [1] Robert Schadek suggested replacing the call to snprintf in std.format with an own method written in D. During the last days I took a deeper look into this and meanwhile I've got a function that works for floats (and probably also doubles, but I haven't tested that yet and it should also work with reals if ucent would be available; without ucent I need a workaround for real or fall back to BigInt).
>
> [...]
According to IEEE 754-2008:
"5.12.2 External decimal character sequences representing finite numbers
[...]
For binary formats, all conversions of H significant digits or fewer round correctly according to the applicable rounding direction;"
Where H is 9 for single and 17 for double. IEEE 754 doesn't specify an H for reals.
That means that snprintf must use the current rounding mode that can be read using FloatingPointControl.rounding from std.math.
|
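For reference, this is roughly how the current rounding mode can be read and temporarily changed through `FloatingPointControl` from std.math. It is a sketch in the spirit of the Phobos documentation example, assuming a target where `FloatingPointControl` is supported; it shows the hardware mode only and says nothing about what a particular snprintf does with it.

```d
import std.math : FloatingPointControl, lrint;

unittest
{
    // Nothing has changed the mode yet, so the default is expected.
    assert(FloatingPointControl.rounding == FloatingPointControl.roundToNearest);

    double tie = 1.5;
    {
        FloatingPointControl fpctrl;
        fpctrl.rounding = FloatingPointControl.roundDown;
        assert(FloatingPointControl.rounding == FloatingPointControl.roundDown);
        assert(lrint(tie) == 1); // rounded towards negative infinity
    } // fpctrl goes out of scope and the previous mode is restored

    assert(lrint(tie) == 2); // back to round-to-nearest, ties to even
}
```

A formatter that wanted to honour the active mode would have to query `FloatingPointControl.rounding` at run time, which is one of the costs weighed later in the thread.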
October 30, 2019 Re: Replacement for snprintf | ||||
---|---|---|---|---|
| ||||
Posted in reply to H. S. Teoh | On Wednesday, 30 October 2019 at 17:41:26 UTC, H. S. Teoh wrote:
> If you haven't already, please read:
>
> https://www.zverovich.net/2019/02/11/formatting-floating-point-numbers.html
>
> especially the papers linked in the first paragraph.
Thanks for that link. I haven't had a look at the grisu algorithms yet, but I'll definitely do that.
> Formatting floating-point numbers is not a trivial task. It's easy to write up something that works for common cases, but it's not so easy to get something to gives the best results in *all* cases.
I know that this is something we all wish for. Anyway, my goal is set somewhat lower: I'd like to replace the existing call to snprintf with something that is programmed in D and which should be pure, @safe and CTFE-able. And ideally it should not be slower than snprintf.
> You probably should use the algorithms referenced above for your implementation,
I read through the paper for the ryu algorithm and rejected it (at least for me; if someone else is going to implement it and file a PR, that's fine). My reasons for rejecting it are that the algorithm does not have exactly the same goal as printf, which IMHO means that it cannot be used here, and that it needs a lookup table that is too large (300K for 128-bit reals). I fear a little, from what I read in the ryu paper about the grisu algorithms, that they have the first of the above-mentioned problems too. But I can't tell for sure yet.
> instead of coming up with your own that may have
> unexpected corner cases that don't produce the right output.
Obviously I need to prove somehow that the algorithm is correct. While this can be done for floats by running it on all numbers and comparing the results with the result of snprintf (or the result calculated by bc), for doubles and reals this isn't possible anymore (a random sample can still be tested, but that's no proof). Anyway, I think that the proof isn't hard to give. The current algorithm is short and straightforward. (And: even if I implement one of the mentioned algorithms, it can still contain bugs, because I made a mistake somewhere.)
|
October 30, 2019 Re: Replacement for snprintf | ||||
---|---|---|---|---|
| ||||
Posted in reply to Jon Degenhardt | On Wednesday, 30 October 2019 at 17:50:03 UTC, Jon Degenhardt wrote:
> It's reasonably common to have numeric values written out in text format and read back in and used in subsequent computations. Not always a great idea, especially when done without much consideration for round-off errors. But it's not uncommon.
But IMHO this is the fault of the people who do this and not the fault of a printing routine.
But: When pondering how to fix the results of format for ranges of strings (it currently places quotes around each string, which is somewhat inconsistent because single strings are printed without quotes, and causes confusion), I came up with the idea of having a new format qualifier, maybe S as in source, in addition to s, which prints the argument in a way that it can be used directly in D code (which is, as far as I know, the reason why the quotes are printed). That could also be used to produce a representation of a float that, when read back in, is still the same float as before; this could be done by the ryu or grisu algorithm, because these algorithms have exactly this goal.
|
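The "read back in and still the same float" property can be checked with existing Phobos functions. The sketch below relies on the fact that 9 significant decimal digits are enough to identify any single-precision value; unlike ryu or grisu it makes no attempt to find the shortest such string. `roundTrips` is a hypothetical helper name.

```d
import std.conv : to;
import std.format : format;

/// True if `value` survives being printed with `digits` significant
/// digits and parsed back as a float.
bool roundTrips(float value, int digits)
{
    return format("%.*g", digits, value).to!float == value;
}

unittest
{
    const float third = 1.0f / 3;
    assert(roundTrips(third, 9));  // 9 significant digits identify the float
    assert(!roundTrips(third, 6)); // "0.333333" parses to a different float
}
```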
October 30, 2019 Re: Replacement for snprintf | ||||
---|---|---|---|---|
| ||||
Posted in reply to berni44 | On 10/30/2019 12:19 PM, berni44 wrote:
> But: When pondering about how to fix the results of format for ranges of
> strings (it places currently quotes around each string
Just to make sure, you are aware of the optional '-' before '(', right? "%-(%s%)" does not print the quotes.
Ali
|
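The difference between the plain and the '-' form of the range specifier can be seen with std.format directly. A short sketch (the `", "` separator is added here only to make the output readable):

```d
import std.format : format;

unittest
{
    auto names = ["a", "b"];

    // A single string is printed without quotes ...
    assert(format("%s", "a") == "a");
    // ... but strings inside a range are quoted.
    assert(format("%s", names) == `["a", "b"]`);
    assert(format("%(%s, %)", names) == `"a", "b"`);
    // The '-' flag suppresses the quoting of the elements.
    assert(format("%-(%s, %)", names) == "a, b");
}
```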
October 30, 2019 Re: Replacement for snprintf | ||||
---|---|---|---|---|
| ||||
Posted in reply to Rumbu | On Wednesday, 30 October 2019 at 18:16:56 UTC, Rumbu wrote:
> That means that snprintf must use the current rounding mode that can be read using FloatingPointControl.rounding from std.math.
Is it really a "must"? We are not completely bound by the IEEE standard and, if there are good reasons, might deviate from it. For example, comparing two floats with <= produces either "false" or "true" in D. According to IEEE there should be a third possible result, namely "not comparable". Having said this, it would be possible to implement it the way you suggest, but probably at some cost (slower, with more and less readable lines of code). I'll think about it.
|
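The two-valued comparison mentioned in the last post looks like this in practice: every ordered comparison involving NaN yields false, and the "not comparable" case has to be detected separately. A minimal sketch; `unordered` is a hypothetical helper name.

```d
import std.math : isNaN;

/// True if `a` and `b` cannot be ordered, i.e. at least one of them is NaN.
bool unordered(double a, double b)
{
    return isNaN(a) || isNaN(b);
}

unittest
{
    double nan = double.nan;
    double one = 1.0;

    assert(!(nan <= one)); // "false" here does not mean nan > one ...
    assert(!(nan >= one)); // ... because this is false as well
    assert(nan != nan);    // NaN is not even equal to itself

    assert(unordered(nan, one));
    assert(!unordered(one, one));
}
```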