How to test if a string is pointing into read-only memory?
5 days ago

std.string.toStringz always allocates a new string, but it has this note:

/+ Unfortunately, this isn't reliable.
 We could make this work if string literals are put
 in read-only memory and we test if s[] is pointing into
 that.

 /* Peek past end of s[], if it's 0, no conversion necessary.
 * Note that the compiler will put a 0 past the end of static
 * strings, and the storage allocator will put a 0 past the end
 * of newly allocated char[]'s.
 */
 char* p = &s[0] + s.length;
 if (*p == 0)
 return s;
 +/

and string literals weren't reliably in read-only memory as recently as early 2017: https://github.com/dlang/dmd/pull/6546#issuecomment-280612721

What's a reliable test that could be used in a toStringz that skips allocation when given a string in read-only memory?

As for whether it's necessarily a good idea to patch toStringz, I'd worry that

  1. someone will slice a string literal and pass the test while not having NUL where it's expected

  2. people are probably relying by now on toStringz always allocating, to e.g. safely cast immutable off the result.

5 days ago
On Tuesday, 12 October 2021 at 08:19:01 UTC, jfondren wrote:
> What's a reliable test that could be used in a toStringz that skips allocation when given a string in read-only memory?

There is no good way.

- You could peek in /proc, but that's not portable

- You could poke the data and catch the resulting fault; but that's: 1) horrible, 2) slow, 3) problematic wrt threading, 4) sensitive to user code mapping its own memory and then remapping as rw (or unmapping)

- You could make a global hash table into which are registered the addresses of all rodata; but that is difficult to get right across translation units, especially in the face of dynamic linking.  This is probably the most feasible, but is really not worth the hassle.
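For illustration, the first option (peeking in /proc) could look roughly like the sketch below. It is Linux-only and assumes the usual `start-end perms ...` line layout of /proc/self/maps; the helper name `inReadOnlyMapping` is made up for this example. It reopens and parses the file on every call and can race with another thread remapping memory, so it shows the idea rather than something a library routine should do.

```d
import std.algorithm.searching : findSplit;
import std.conv : to;
import std.stdio : File;

/// Hypothetical helper (not part of Phobos), Linux-only.
/// Returns true if `p` lies in a mapping that is currently not writable.
bool inReadOnlyMapping(const(void)* p)
{
    immutable addr = cast(size_t) p;
    foreach (line; File("/proc/self/maps").byLine)
    {
        // Assumed line layout: "start-end perms offset dev inode path"
        auto fields = line.findSplit(" ");      // [0] = "start-end", [2] = the rest
        auto bounds = fields[0].findSplit("-"); // [0] = start, [2] = end (hex)
        if (!fields[1].length || !bounds[1].length)
            continue;                           // skip anything malformed
        immutable lo = bounds[0].to!size_t(16);
        immutable hi = bounds[2].to!size_t(16);
        if (addr >= lo && addr < hi)
        {
            // perms look like "r--p" or "rw-p"; index 1 is the write bit
            return fields[2].length >= 2 && fields[2][1] != 'w';
        }
    }
    return false; // address not found in any mapping
}
```

A GC-allocated buffer or a stack variable should report `false` here; whether a given string literal reports `true` depends on where the compiler and linker placed it.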
5 days ago
On Tuesday, 12 October 2021 at 09:20:42 UTC, Elronnd wrote:
> problematic wrt threading

Not to mention signals.  Reentrancy's a bitch.
5 days ago
On 12.10.21 10:19, jfondren wrote:
> ```d
> /+ Unfortunately, this isn't reliable.
>   We could make this work if string literals are put
>   in read-only memory and we test if s[] is pointing into
>   that.
> 
>   /* Peek past end of s[], if it's 0, no conversion necessary.
>   * Note that the compiler will put a 0 past the end of static
>   * strings, and the storage allocator will put a 0 past the end
>   * of newly allocated char[]'s.
>   */
>   char* p = &s[0] + s.length;
>   if (*p == 0)
>   return s;
>   +/
> ```
[...]
> As for whether it's necessarily a good idea to patch toStringz, I'd worry that
> 
> 1. someone will slice a string literal and pass the test while not having NUL where it's expected

The (commented-out) code checks if the NUL is there. Just make sure that it's also read-only.
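A minimal sketch of that NUL check, using a hypothetical helper `nulFollows` that mirrors the commented-out code. Reading one byte past a slice is only safe when that byte is known to be readable, which is the case for slices of a string literal because the compiler stores a 0 after the literal.

```d
/// Hypothetical helper mirroring the commented-out peek in toStringz.
/// Reads the byte just past the end of `s`; the caller must know that
/// this byte is readable (true for slices of string literals).
bool nulFollows(string s) @system
{
    return s.length != 0 && *(s.ptr + s.length) == '\0';
}
```

With `string lit = "hello";`, `nulFollows(lit)` is true, while `nulFollows(lit[0 .. 3])` is false because the byte after that slice is `'l'`. So slicing a literal does not pass the check by accident.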

> 2. people are probably relying by now on toStringz always allocating, to e.g. safely cast immutable off the result.

It doesn't matter if the result is freshly allocated. Casting away immutable is only allowed as long as you don't use it to actually change the data (i.e. it remains de-facto immutable).
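As a sketch of what "remains de-facto immutable" means in practice: the function `countChars` below is a hypothetical stand-in for a C API that takes a mutable `char*` but only reads through it, and `lengthViaC` is likewise made up for this example.

```d
import std.string : toStringz;

/// Hypothetical stand-in for a C function whose signature takes char*
/// even though it never writes through the pointer.
extern (C) size_t countChars(char* p)
{
    size_t n;
    while (p[n] != '\0')
        ++n;
    return n;
}

size_t lengthViaC(string s)
{
    // Casting away immutable is acceptable here only because countChars
    // never modifies the data; whether toStringz allocated is irrelevant.
    return countChars(cast(char*) s.toStringz);
}
```

If `countChars` wrote through `p`, the cast would be invalid whether or not the buffer had been freshly allocated.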
5 days ago

On Tuesday, 12 October 2021 at 08:19:01 UTC, jfondren wrote:
> and string literals weren't reliably in read-only memory as recently as early 2017: https://github.com/dlang/dmd/pull/6546#issuecomment-280612721

Sometimes sections have defined symbols for their start and end, so you can check whether the string lies in the rdata section. On Windows you can test it generically with the IsBadWritePtr function.

5 days ago
On Tuesday, 12 October 2021 at 09:20:42 UTC, Elronnd wrote:
>
> There is no good way.

Can't it be done using function overloading?
4 days ago
On Tuesday, 12 October 2021 at 21:42:45 UTC, IGotD- wrote:
> On Tuesday, 12 October 2021 at 09:20:42 UTC, Elronnd wrote:
>>
>> There is no good way.
>
> Can't it be done using function overloading?

Function overloading lets you distinguish between arguments with different types, but strings in read-only memory and strings in read-write memory both have the same type: string.
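A small sketch of this, with a hypothetical overload set named `kind`: overload resolution only looks at the static type of the argument, so a literal and a heap-allocated copy of it pick the same overload.

```d
/// Hypothetical overload set: the compiler chooses by argument type only.
string kind(string s) { return "string"; }
string kind(char[] s) { return "char[]"; }
```

Given `string literal = "hello";` and `string copy = literal.idup;`, both `kind(literal)` and `kind(copy)` return `"string"`, because `typeof(literal)` and `typeof(copy)` are the same type. Only a genuinely different type, such as the `char[]` from `copy.dup`, selects the other overload.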