Thread overview
#dbugfix 15136
Jan 18, 2019
Dennis
Jan 18, 2019
Dennis
Jan 19, 2019
Adam D. Ruppe
Jan 19, 2019
Jonathan Marler
January 18, 2019
I don't like how the hacky implementation of toStringz is @trusted.

Manifestation of problems: https://forum.dlang.org/thread/rbyetgbruyiohgudsbhg@forum.dlang.org
Issue: https://issues.dlang.org/show_bug.cgi?id=15136
Implementation: https://github.com/dlang/phobos/blob/master/std/string.d#L364-L365




January 18, 2019
On 1/18/19 3:21 PM, Dennis wrote:
> I don't like how the hacky implementation of toStringz is @trusted.
> 
> Manifestation of problems: https://forum.dlang.org/thread/rbyetgbruyiohgudsbhg@forum.dlang.org
> Issue: https://issues.dlang.org/show_bug.cgi?id=15136
> Implementation: https://github.com/dlang/phobos/blob/master/std/string.d#L364-L365
> 

TBH, fixing toStringz to not be hacky would make it so the manifestation happened every time :)

I do agree that it shouldn't be @trusted if it's done the way it is. I can think of ways to make it buffer overflow.

-Steve
January 18, 2019
On Friday, 18 January 2019 at 20:45:05 UTC, Steven Schveighoffer wrote:
> TBH, fixing toStringz to not be hacky would make it so the manifestation happened every time :)

In that case it would be consistent. Now the culprit appeared to be __FUNCTION__ and the test case couldn't be reduced further. It can be made clear that toStringz is an allocating function, and if @nogc is important some other options are available. In arsd/simpledisplay.d, toStringz is simply defined as:

```
const(char)* toStringz(string s) { return (s ~ '\0').ptr; }
```

You may lose the avoidance of appending 75% of the time, but it won't allow for buffer overflows. When performance is important, other facilities are useful:

- string literals can already safely be passed
- I don't know if std.file.readText is guaranteed null-terminated, but a zero-terminated version/option could be made if there isn't one already
- if a string is constructed by concatenating strings, a null byte can be appended if there's enough capacity without needing to reallocate

The hardest part is when the string origin is unknown, i.e. passed as parameter.
If I make a convenience function that takes a D string and passes it to a C library function, then even when I pass a string literal, the function only sees a slice and doesn't know whether the zero byte at the end belongs to the string or not. A special type for zero-terminated strings would be needed, or a way to recognize pointers in a readonly section.
January 19, 2019
On Friday, 18 January 2019 at 23:26:07 UTC, Dennis wrote:
> In arsd/simpledisplay.d, toStringz is simply defined as:
>
> ```
> const(char)* toStringz(string s) { return (s ~ '\0').ptr; }
> ```

That definition is still in my code, but there are no more uses - I've replaced every instance of it with either calls to C functions that take a pointer+length combo anyway, or by copying into local buffers made out of static arrays. (Notably on Windows, there is a WCharzBuffer struct that can also do wchar and cr/lf conversions too.)

But, yeah, the s ~ 0 .ptr is a really easy way to do it basically right most the time. Just as long as the C function doesn't store it! (but then the local buffer is broken too so separate issue)
January 19, 2019
On Friday, 18 January 2019 at 20:21:51 UTC, Dennis wrote:
> I don't like how the hacky implementation of toStringz is @trusted.
>
> Manifestation of problems: https://forum.dlang.org/thread/rbyetgbruyiohgudsbhg@forum.dlang.org
> Issue: https://issues.dlang.org/show_bug.cgi?id=15136
> Implementation: https://github.com/dlang/phobos/blob/master/std/string.d#L364-L365

I can't believe this has existed in phobos for so many years. Ridiculous.