What are best practices around toString?

Sep 30, 2022

christian.koestlin


October 01, 2022

Re: What are best practices around toString?

Posted by tsbockman
in reply to christian.koestlin


On Friday, 30 September 2022 at 13:11:56 UTC, christian.koestlin wrote:

> Dear Dlang experts,
>
> up until now I was perfectly happy with implementing (override) string toString() const or something to get nicely formatted (mostly debug) output for my structs, classes and exceptions.

Human beings read extremely slowly compared to how quickly the GC can allocate and free strings as needed, so there is no need to complicate your code with more text formatting strategies unless you want to generate this debug output far faster than a human can actually read it.

> But recently I stumbled upon https://wiki.dlang.org/Defining_custom_print_format_specifiers and additionally https://github.com/dlang/dmd/blob/4ff1eec2ce7d990dcd58e5b641ef3d0a1676b9bb/druntime/src/object.d#L2637 which at first sight is great, because it provides the same customization of an object's representation with fewer memory allocations.
>
> When grepping through Phobos, there are a bunch of "different" signatures implemented for this, e.g.
>
> ...
> phobos/std/typecons.d:        void toString(DG)(scope DG sink) const
> ...
> phobos/std/typecons.d:        void toString(DG, Char)(scope DG sink,  scope const ref FormatSpec!Char fmt) const
> ...
> phobos/std/typecons.d:        void toString()(scope void delegate(const(char)[]) sink, scope const ref FormatSpec!char fmt)
> ...
> phobos/std/sumtype.d:        void toString(this This, Sink, Char)(ref Sink sink, const ref FormatSpec!Char fmt);
> ...
>
> to just show a few.

The FormatSpec parameter only belongs there if you're actually going to do something useful with it in your toString implementation. Even if you are going to use it, you should probably still provide a convenience overload with a default specifier.
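To illustrate the point about FormatSpec, here is a minimal sketch, assuming a hypothetical Meters type that does not appear anywhere in this thread. The two-argument overload forwards the specifier to the wrapped integer, and the one-argument convenience overload falls back to a default-initialized FormatSpec!char, whose specifier is 's'. Treat it as one possible shape, not as the recommended Phobos pattern.

```d
import std.array : appender;
import std.format : format, formatValue, FormatSpec;

// Hypothetical type, used only for illustration.
struct Meters {
    int value;

    // Does something useful with the specifier: forwards it to the integer,
    // so the width and radix requested by the caller are honoured.
    void toString(scope void delegate(const(char)[]) sink,
                  scope const ref FormatSpec!char fmt) const {
        formatValue(sink, value, fmt);
        sink(" m");
    }

    // Convenience overload for callers that have no specifier at hand.
    void toString(scope void delegate(const(char)[]) sink) const {
        const FormatSpec!char defaultSpec; // default-initialized: spec is 's'
        toString(sink, defaultSpec);
    }
}

unittest {
    const m = Meters(255);
    assert(format("%s", m) == "255 m");
    assert(format("%x", m) == "ff m");
    assert(format("%6d", m) == "   255 m");

    // Direct call through the convenience overload.
    auto app = appender!string();
    m.toString((const(char)[] s) { app.put(s); });
    assert(app.data == "255 m");
}
```

The checks can be run with something like `dmd -unittest -main -run example.d`, where example.d is just a placeholder file name.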

> Furthermore, when one works with struct instances, objects or exceptions, an aInstance.toString() call does not "work" when one only implements the sink interface (which is to be expected), whereas std.conv.to!string or a formatted write with %s always works (no matter what was used to implement toString).
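The behaviour described in the quote can be reproduced with a small sketch. Point is a hypothetical type that is assumed here to implement only the sink-based overload; as far as I know, both std.conv.to and std.format.format drive that overload internally, while a direct call without arguments simply has nothing to resolve to.

```d
import std.conv : to;
import std.format : format, formattedWrite;

// Hypothetical type that implements only the sink-based toString.
struct Point {
    int x, y;

    void toString(scope void delegate(const(char)[]) sink) const {
        formattedWrite(sink, "(%d, %d)", x, y);
    }
}

unittest {
    const p = Point(1, 2);

    // Both calls find and use the sink overload.
    assert(p.to!string == "(1, 2)");
    assert(format("%s", p) == "(1, 2)");

    // A direct call without arguments has no matching overload.
    static assert(!__traits(compiles, p.toString()));
}
```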

I generally do something like this:

struct A {
    string message;
    int enthusiasm;

    void toString(DG)(scope DG sink) scope const @safe
        if(is(DG : void delegate(scope const(char[])) @safe)
        || is(DG : void function(scope const(char[])) @safe))
    {
        import std.format : formattedWrite;
        sink(message);
        sink(" x ");
        formattedWrite!"%d"(sink, enthusiasm);
        sink("!");
    }
    string toString() scope const pure @safe {
        StringBuilder builder;
        toString(&(builder.opCall)); // Find the exact string length.
        builder.allocate();
        toString(&(builder.opCall)); // Actually write the chars.
        return builder.finish();
    }
}

So, the first toString overload defines how to format the value to text, while the second overload does memory management and forwards the formatting work to the first.

StringBuilder is a utility shared across the entire project:

struct StringBuilder {
private:
    char[] buffer;
    size_t next;

public:
    void opCall(scope const(char[]) str) scope pure @safe nothrow @nogc {
        const curr = next;
        next += str.length;
        if(buffer !is null)
            buffer[curr .. next] = str[];
    }
    void allocate() scope pure @safe nothrow {
        buffer = new char[next];
        next = 0;
    }
    void allocate(const(size_t) maxLength) scope pure @safe nothrow {
        buffer = new char[maxLength];
        next = 0;
    }
    string finish() pure @trusted nothrow @nogc {
        assert(buffer !is null);
        string ret = cast(immutable) buffer[0 .. next];
        buffer = null;
        next = 0;
        return ret;
    }
}
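Stripped of the attributes, the two-pass idea can be sketched as a free function. This is a simplified illustration, not the StringBuilder above: renderTwoPass and the Sink alias are hypothetical names, and the sketch assumes the formatting callback emits exactly the same text on both passes.

```d
import std.exception : assumeUnique;

alias Sink = void delegate(const(char)[]);

// Hypothetical helper: runs the formatting callback twice.
string renderTwoPass(scope void delegate(scope Sink) write) {
    // Pass 1: only measure how many chars would be written.
    size_t length;
    write((const(char)[] s) { length += s.length; });

    // Pass 2: one exact-size allocation, then copy the chars in.
    auto buffer = new char[length];
    size_t next;
    write((const(char)[] s) {
        buffer[next .. next + s.length] = s[];
        next += s.length;
    });

    // No other reference to buffer is kept, so treating it as immutable is fine.
    return assumeUnique(buffer);
}

unittest {
    auto text = renderTwoPass((scope Sink sink) {
        sink("abc");
        sink("-");
        sink("de");
    });
    assert(text == "abc-de");
}
```

The design relies on the formatting code being deterministic: if the second pass produced more text than the first, the slice assignment would run past the end of the buffer and fail its bounds check.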

The first formatting pass to find the required buffer length can be skipped if you can somehow pre-calculate the maximum possible length, or if you prefer the common strategy of repeatedly re-allocating the buffer with exponentially increasing size used by the likes of std.array.Appender. Since the API for toString remains the same regardless, you are free to choose the best strategy for each type.
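For comparison, here is a minimal sketch of the single-pass alternative, assuming a simplified variant of the struct A from above without the attributes and without the template constraint. It reserves an upper bound in a std.array.Appender up front, so the formatting runs once; the bound of 11 chars for the int covers "-2147483648".

```d
import std.array : appender;
import std.format : format, formattedWrite;

// Simplified variant of A, for illustration only.
struct A {
    string message;
    int enthusiasm;

    // Defines how the value is formatted.
    void toString(scope void delegate(const(char)[]) sink) const {
        sink(message);
        sink(" x ");
        formattedWrite(sink, "%d", enthusiasm);
        sink("!");
    }

    // Does the memory management with a pre-calculated maximum length.
    string toString() const {
        auto app = appender!string();
        // message + " x " + at most 11 chars for an int + "!"
        app.reserve(message.length + 3 + 11 + 1);
        toString((const(char)[] s) { app.put(s); });
        return app.data;
    }
}

unittest {
    assert(A("hello", 3).toString() == "hello x 3!");
    assert(format("%s", A("hi", -7)) == "hi x -7!");
}
```

The trade-off is the one described above: a single formatting pass, at the cost of a buffer that may be slightly larger than the final string.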

On Saturday, 1 October 2022 at 17:50:54 UTC, tsbockman wrote:

> but unless it is provided with a good estimate of the final length at the beginning, it will allocate several times for a longer string, and the final buffer will be, on average, 50% larger than needed.

I see, it's smart!

SDB@79

On Saturday, 1 October 2022 at 17:50:54 UTC, tsbockman wrote:

> On Saturday, 1 October 2022 at 10:02:34 UTC, Salih Dincer wrote:
>
>> On Saturday, 1 October 2022 at 08:26:43 UTC, tsbockman wrote:
>>
>>> StringBuilder is a utility shared across the entire project:
>>
>> Is Appender not good enough, at least in terms of allocating memory and accumulating a string?
>
> Appender is a legitimate option, but unless it is provided with a good estimate of the final length at the beginning, it will allocate several times for a longer string, and the final buffer will be, on average, 50% larger than needed.
>
> Neither of these things is a major problem, but StringBuilder is only a few lines of code to perfectly minimize allocation, so why not?

Thanks a lot. One needs to go through the serialization twice, but perhaps that's better than reallocating memory.

Kind regards,
Christian
