Thread overview
Detecting at compile time if a string is zero terminated
Jan 19, 2011
Johannes Pfau
Jan 19, 2011
Jesse Phillips
Jan 20, 2011
Johannes Pfau
Jan 20, 2011
Jesse Phillips
January 19, 2011
Hi,
I'm currently writing a small wrapper layer for gettext. My current
getText template looks like this:

----------------------------------------------------
string getText()(string msgid, string domain = null, Category cat =
Category.Messages) {
    auto cmsg = toStringz(msgid);
    auto nmsg = dcgettext(domain ? toStringz(domain) : null,
              cmsg, cat);
    if(cmsg == nmsg)
        return msgid;
    else
    {
        string nstr = cast(string)nmsg[0 .. strlen(nmsg)];
        return nstr;
    }
}

string getText(string msgid, string domain = null, Category cat =
Category.Messages)() {
    auto nmsg = dcgettext(domain ? domain.ptr : null,
              msgid.ptr, cat);
    if(msgid.ptr == nmsg)
        return msgid;
    else
    {
        string nstr = cast(string)nmsg[0 .. strlen(nmsg)];
        return nstr;
    }
}
----------------------------------------------------

As string literals in D are zero terminated, there's no need
for the toStringz overhead. The overload taking compile time
parameters takes advantage of that. The code works and can be used like
this:
----------------------------------------------------
    writeln(getText!"Hello World!"); //no toStringz
    writeln(getText("Hello World!")); //toStringz
----------------------------------------------------

But if at all possible, I'd like to merge the templates so that there is only one way to call getText and the fastest path is chosen automatically.

Does anyone know how to do that?
-- 
Johannes Pfau


January 19, 2011
First off, no. Second, is there really going to be a performance gain from this? I wouldn't expect static strings to be converted very often. And last, I will copy and paste a comment from the source code:

    /+ Unfortunately, this isn't reliable.
       We could make this work if string literals are put
       in read-only memory and we test if s[] is pointing into
       that.

       /* Peek past end of s[], if it's 0, no conversion necessary.
        * Note that the compiler will put a 0 past the end of static
        * strings, and the storage allocator will put a 0 past the end
        * of newly allocated char[]'s.
        */
       char* p = &s[0] + s.length;
       if (*p == 0)
           return s;
     +/
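
To illustrate one way the peek can go wrong (this is my reading of "isn't reliable"; the quoted comment does not spell out the failure case): a slice of a mutable buffer can have a 0 right behind it at the moment of the check, and the owner of the buffer can overwrite that byte later. A minimal sketch, with a made-up buffer:

```d
void main()
{
    // Hypothetical mutable buffer: "abc", a 0 byte, then more data.
    char[] buf = "abc\0def".dup;

    // Slice covering only "abc"; the byte just past its end is currently 0.
    const(char)[] s = buf[0 .. 3];

    // What a peeking toStringz would hand to C: the original pointer, no copy.
    const(char)* p = s.ptr;
    assert(p[3] == '\0');

    // The buffer's owner later writes to the byte that served as terminator.
    buf[3] = 'X';

    // The "C string" is no longer terminated after "abc".
    assert(p[3] == 'X');
}
```

With an immutable string the byte inside the slice cannot change, but the byte past the end is not part of the slice, so the same concern applies unless the data is known to live in read-only memory.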
January 20, 2011
Jesse Phillips wrote:
>First off, no. Second, is there really going to be a performance gain from this? I wouldn't expect static strings to be converted very often. And last, I will copy and paste a comment from the source code:

Thanks for your reply.
In case you don't know: gettext is used to translate strings.
You call gettext("english string") and it returns the translated
string. Gettext might be the only corner case, but the strings gettext
returns are usually not cached and big projects could translate many
strings, so I thought it could be an issue. But maybe I'm
overestimating that.
I had a look at the source code of toStringz and found the comment you
mentioned. That comment is in toStringz(const(char)[] s).
toStringz(string s) is more interesting in this case, as it does perform
that optimization in most cases. I think that's good enough ;-)
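
For anyone reading along, here is a rough sketch of the kind of check I mean. It is not a copy of the Phobos source: the name peekStringz is made up, and the alignment test is only my understanding of how the peek is kept from touching the start of a possibly unreadable block.

```d
import core.stdc.stdio : printf;
import std.string : toStringz;

// Hypothetical helper: return s.ptr if a 0 already follows the string,
// otherwise fall back to toStringz.
immutable(char)* peekStringz(string s)
{
    if (s.length == 0)
        return "".ptr; // string literals are zero terminated

    immutable(char)* p = s.ptr + s.length;

    // Assumption: only peek when p is not 4-byte aligned, so that p is
    // unlikely to be the first byte of a new, possibly unreadable block.
    if ((cast(size_t) p & 3) && *p == 0)
        return s.ptr;

    return toStringz(s);
}

void main()
{
    // A literal may take the fast path, depending on its address.
    printf("%s\n", peekStringz("Hello World!"));

    // A slice followed by a non-zero byte falls back to toStringz.
    string part = "Hello World!"[0 .. 5];
    printf("%s\n", peekStringz(part));
}
```

Either way the caller gets a zero-terminated pointer; the only difference is whether a copy was made.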

-- 
Johannes Pfau


January 20, 2011
I would bet that you'd end up spending more time translating the string than copying it.

Didn't think to look at what type the function accepted. I figured that any such optimization would exist inside of toStringz if it was possible.