July 16
https://github.com/WalterBright/documents/blob/ed4f1b441e71b5ac5e23a54e7c93e68997981e9a/SafePrintf.md
July 16
On 7/16/2024 5:42 PM, Walter Bright wrote:
> https://github.com/WalterBright/documents/blob/ed4f1b441e71b5ac5e23a54e7c93e68997981e9a/SafePrintf.md

Paul Backus writes:

> What I find objectionable in this case is that (a) the better interface is implemented using a bunch of compiler-internal rewrites, rather than normal D code; and (b) it shadows the existing C printf function rather than existing alongside it.


It's a pretty thin piece of paper over printf. Consider:

```
printf("%s\n", 3);
```
That's going to crash a C program. Currently, for D an error will be given. Under this proposal, it will be rewritten as:

```
printf("%d\n", 3);
```

The rewrite will only happen for %s format specifiers.
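
To illustrate the kind of type-driven choice this rewrite makes, here is a minimal sketch in ordinary D. The template name `specFor` and the table it encodes are hypothetical and only cover a few unqualified types; the DIP's own mapping is authoritative and may differ.

```d
/// Hypothetical mapping from an argument type to a printf specifier.
/// Illustration only: the real table is whatever the DIP specifies.
template specFor(T)
{
    static if (is(T == int) || is(T == short) || is(T == byte))
        enum specFor = "%d";      // small integers are promoted to int
    else static if (is(T == uint))
        enum specFor = "%u";
    else static if (is(T == long))
        enum specFor = "%lld";
    else static if (is(T == ulong))
        enum specFor = "%llu";
    else static if (is(T == double) || is(T == float))
        enum specFor = "%g";      // float is promoted to double
    else static if (is(T : const(char)*))
        enum specFor = "%s";      // C strings keep %s
    else
        static assert(0, "no printf specifier for " ~ T.stringof);
}

unittest
{
    static assert(specFor!int == "%d");
    static assert(specFor!long == "%lld");
    static assert(specFor!(char*) == "%s");
    // Types without an entry are rejected at compile time.
    static assert(!__traits(compiles, specFor!(int[])));
}
```

The point of the sketch is only that the specifier can be derived from the static type of the argument, which is the information the compiler has at the call site.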

For the following:

```
char* s;
printf("%s\n", s);
```
there will be no rewrite, but that call will be considered unsafe. For:
```
char[] s;
printf("%s\n", s);
```
that is currently rejected by the compiler. Under this proposal, it will be rewritten as:
```
char[] s;
printf("%.*s\n", cast(int)s.length & 0x7FFF_FFFF, s.ptr);
```
which will make it safe.
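
A hand-written equivalent of that rewrite can be tried today. The sketch below assumes a hypothetical helper named `formatSlice` and uses `snprintf` into a buffer instead of `printf`, purely so the result can be inspected; it is not the compiler's implementation.

```d
import core.stdc.stdio : snprintf;

/// Hypothetical helper: formats `s` into `buf` the way the rewritten call
/// would print it, passing the clamped length as the precision and `s.ptr`
/// as the string. Returns snprintf's result.
int formatSlice(char[] buf, const(char)[] s)
{
    return snprintf(buf.ptr, buf.length, "%.*s\n",
                    cast(int) s.length & 0x7FFF_FFFF, s.ptr);
}

unittest
{
    char[32] buf;
    // A slice of a longer string: there is no zero terminator after "hello".
    const(char)[] s = "hello world"[0 .. 5];
    assert(formatSlice(buf[], s) == 6);
    assert(buf[0 .. 6] == "hello\n");
}
```

Because the precision bounds how many characters are read, the slice does not need to be zero-terminated.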

I can't think of a case where the proposal makes any existing uses of printf impossible. If they exist, there are workarounds:

1. use a variable rather than a string literal for the format:
```
const(char)* fmt = "hello %s!\n";
printf(fmt, "betty");
```
2. this behavior is triggered by the function being marked as `pragma(printf)`. Don't do that if you don't want it. Or declare printf yourself as:
```
extern (C) int printf(const(char)*, ...);
```

> If we need a safer printf for DMD that doesn't carry all the bloat and baggage of Phobos's writef, then by all means, let's write one. But let's write it in D and put it in a normal D module, instead of sneaking around and redefining printf behind our users' backs.

The printf argument checking code that was added has been an unblemished win for us. C and C++ compilers seem to be adding it, too. This is just a small improvement over that.
July 17
On 17/07/2024 12:42 PM, Walter Bright wrote:
> https://github.com/WalterBright/documents/blob/ed4f1b441e71b5ac5e23a54e7c93e68997981e9a/SafePrintf.md

I may have questioned the motives in the past, however this is a useful feature and the DIP looks fine.

I'm commenting that I cannot find anything wrong with it so this DIP can move into the queue sooner.
July 17

On Wednesday, 17 July 2024 at 00:42:03 UTC, Walter Bright wrote:
> https://github.com/WalterBright/documents/blob/ed4f1b441e71b5ac5e23a54e7c93e68997981e9a/SafePrintf.md

> This DIP applies to any function marked with pragma(printf) and @safe or @trusted.

So how does printf benefit from this then? It can't be marked @trusted. Would we add a @trusted overload taking a string format parameter and use non-C variadic arguments?

The overload would have to handle the cases in this DIP itself. Or we could use an enum parameter as the format string, if we had those.

July 17
On 7/17/2024 3:31 AM, Nick Treleaven wrote:
>> This DIP applies to any function marked with pragma(printf) and @safe or @trusted.
> 
> So how does `printf` benefit from this then? It can't be marked `@trusted`. Would we add a `@trusted` overload taking a string format parameter and use non-C variadic arguments?

If pragma(printf) is there, the user is asserting that if the format string and arguments are compatible, and the function is also marked @trusted or @safe, then that particular call is @safe. If the function is marked @safe, and the call checks determine that it is not safe, then that call is marked as not safe.

This is how functions like sprintf(), which cannot ever be safe, can still be marked as @system, and still get printf format checking. And calls to fprintf can be marked @safe.

Yes, it's a bit of special compiler magic, but it works. But it would be so useful, and we already apply compiler magic via pragma(printf).
July 17

On Wednesday, 17 July 2024 at 00:42:03 UTC, Walter Bright wrote:
> https://github.com/WalterBright/documents/blob/ed4f1b441e71b5ac5e23a54e7c93e68997981e9a/SafePrintf.md

printf always performs pointer arithmetic, and therefore should not be marked @safe:

> No pointer arithmetic (including pointer indexing & slicing).

Marking it as @trusted is fine, and historically other core.stdc functions that have similar behaviour have also been marked @trusted.

July 17

On Wednesday, 17 July 2024 at 17:45:12 UTC, IchorDev wrote:
> Marking it as @trusted is fine

Oops, I didn't re-read the whole section on safe interfaces:

> C's strlen and memcpy do not have safe interfaces:
>
>     extern (C) @system size_t strlen(char* s);
>     extern (C) @system void* memcpy(void* dst, void* src, size_t nbytes);
>
> because they iterate pointers based on unverified assumptions (strlen assumes that s is zero-terminated; memcpy assumes that the memory objects pointed to by dst and src are at least nbytes big). Any function that traverses a C string passed as an argument can only be @system. Any function that trusts a separate parameter for array bounds can only be @system.

So, printf must be @system. Even %.*s is @system!

July 18

On Wednesday, 17 July 2024 at 00:42:03 UTC, Walter Bright wrote:
> https://github.com/WalterBright/documents/blob/ed4f1b441e71b5ac5e23a54e7c93e68997981e9a/SafePrintf.md

Let’s say I have a @safe-annotated function. If I understand the DIP draft correctly, it’s proposed that I can call printf if

  • the format is a compile-time constant (possibly derived through CTFE) and

plus the compiler:

  • issues a hard error on incorrect use, e.g. number of arguments and specifiers mismatch,
  • silently changes the format specifier if it’s meaningful, e.g. %s to %d for integers, making %s essentially universal,
  • implicitly slices static array arguments, e.g. a char[10] argument becomes a char[] argument,
  • if a %s specifier lines up with a const(Char)[] argument, silently changes the format specifier %s to %.*s/%.*ls and replaces the corresponding argument xs with cast(int)(xs.length & int.max) and xs.ptr.

My only issue is the & int.max: that should only be the behavior when asserts are disabled. With asserts enabled, just assert(xs.length < int.max).
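
The difference between the two behaviors can be sketched with two small functions. Both names are hypothetical, and the checked variant uses `<=` rather than `<` as an assumption about the intended bound.

```d
/// Hypothetical: the masking from the draft. Lengths above int.max are
/// silently reduced, so fewer characters than expected would be printed.
int maskedLength(size_t len)
{
    return cast(int)(len & int.max);
}

/// Hypothetical: the suggested alternative. With asserts enabled an
/// oversized length is reported; without them it falls back to the mask.
int checkedLength(size_t len)
{
    assert(len <= int.max, "slice too long for a printf precision");
    return cast(int)(len & int.max);
}

unittest
{
    assert(maskedLength(5) == 5);
    // int.max + 4 is 0x8000_0003; masking keeps only the low 31 bits.
    assert(maskedLength(cast(size_t) int.max + 4) == 3);
    assert(checkedLength(5) == 5);
}
```

Either way the precision never exceeds the real length, so the read stays inside the slice; the assert only makes the truncation visible.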

Otherwise, it’s a great idea. I’d make it __printf, though, and ideally, __printf becomes a new core-language keyword so that all the compiler-magic and special casing is appropriately justified. It should also not require any imports then, which would make it even easier to use. Changing printf in any shape or form will make some people unhappy. I could imagine people being much happier having a keyword that is guaranteed to lower to a printf call, with some checks and convenience added.

If we’re at it, __printf could also support slices of non-character type: When a non-character array is an argument type that lines up with some specifier, cut the format in half, loop over elements and print them individually comma-separated and using that specifier, then continue with the rest of the format:

int a, b;
int[] xs;

int n = __printf("%d xyz %X abc %d", a, xs, b);
// lowers to:
int n = {
    int __result = __printf("%d xyz [", a);
    if (xs.length > 0)
    {
        __result += __printf("%X", xs[0]);
        foreach (__x; xs[1..$]) __result += __printf(", %X", __x);
    }
    return __result + __printf("] abc %d", b);
}();
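
Since `__printf` does not exist, the cut-and-loop idea can only be approximated by hand. The sketch below is one such approximation under stated assumptions: the helper name `formatHexArray` is hypothetical, it writes into a buffer with `snprintf` so the result can be checked, and it takes `uint` elements to match `%X`.

```d
import core.stdc.stdio : snprintf;

/// Hypothetical helper: writes `xs` as "[A, B, C]" in hexadecimal into
/// `buf`, one snprintf call per piece, mirroring the lowering above.
/// Returns the number of characters written, or -1 if `buf` is too small.
int formatHexArray(char[] buf, const(uint)[] xs)
{
    size_t pos = 0;

    // Accepts one snprintf result; fails on error or truncation.
    bool advance(int n)
    {
        if (n < 0 || pos + n >= buf.length)
            return false;
        pos += n;
        return true;
    }

    if (!advance(snprintf(buf.ptr, buf.length, "[")))
        return -1;
    foreach (i, x; xs)
    {
        // The first element has no separator; the rest are comma-separated.
        int n = i == 0
            ? snprintf(buf.ptr + pos, buf.length - pos, "%X", x)
            : snprintf(buf.ptr + pos, buf.length - pos, ", %X", x);
        if (!advance(n))
            return -1;
    }
    if (!advance(snprintf(buf.ptr + pos, buf.length - pos, "]")))
        return -1;
    return cast(int) pos;
}

unittest
{
    char[64] buf;
    assert(formatHexArray(buf[], [10u, 255u, 4096u]) == 13);
    assert(buf[0 .. 13] == "[A, FF, 1000]");
}
```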

A similar approach would work for associative arrays as well. What’s so cool about it is that it would work with nested arrays! The cutting-and-loop approach also works for struct types, printing some header (the type name) and then the comma-separated tupleof, provided the members are of printf-friendly types.

July 18
On 7/17/2024 11:05 AM, IchorDev wrote:
> So, `printf` must be `@system`. Even `%.*s` is `@system`!

This proposal puts a safe interface around %.*s.
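
As a rough library-level analogy (not the DIP's mechanism, which rewrites the call inside the compiler), a `@trusted` wrapper can give `%.*s` a safe interface by taking the pointer and length together as one slice. The names `printSlice` and `greet` are hypothetical.

```d
import core.stdc.stdio : printf;

/// Hypothetical wrapper: the length and pointer both come from the same
/// slice, so the caller cannot pass a bound that disagrees with the data.
int printSlice(const(char)[] s) @trusted
{
    if (s.length == 0)
        return 0; // avoid handing printf a possibly-null pointer
    return printf("%.*s", cast(int)(s.length & int.max), s.ptr);
}

/// @safe code can call the wrapper without touching raw pointers.
int greet(const(char)[] name) @safe
{
    int n = printSlice("hello, ");
    n += printSlice(name);
    n += printSlice("\n");
    return n;
}

unittest
{
    // 7 characters for "hello, ", 5 for "betty", 1 for the newline.
    assert(greet("betty") == 13);
}
```

The unsafe part of `%.*s` is trusting a separately supplied bound; tying the bound to the slice is what removes that trust from the caller.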
August 02

On Wednesday, 17 July 2024 at 00:42:03 UTC, Walter Bright wrote:
> https://github.com/WalterBright/documents/blob/ed4f1b441e71b5ac5e23a54e7c93e68997981e9a/SafePrintf.md

> This proposal will cause the format specifier to be rewritten to match the argument type, if the format specifier is %s.

Dennis has pointed out that this can corrupt memory (in a @system or @trusted function) just by simple refactoring:

> You would think it's safe to transform this:
>
>     int x;
>     ...
>     printf("x = %s\n", x);
>     printf("x = %s\n", x);
>
> Into this:
>
>     const(char)* fmt = "x = %s\n";
>     printf(fmt, x);
>     printf(fmt, x);

That's quite a pitfall and easy to overlook in code review. I suggest removing that feature for argument types other than character arrays.

> If the format specifier is %s and the corresponding argument is a D array of char or wchar_t, the format will be replaced with %.*s (or %.*ls) and the argument will be replaced with two arguments

I think that's fine, because D doesn't allow passing arrays to variadic arguments. So if those calls were refactored, they would cause a compile-time error.
