@trusted assumptions about @safe code (page 4)

May 27, 2020

Re: @trusted assumptions about @safe code

Posted by ag0aep6g
in reply to Paul Backus

Permalink

ag0aep6g

Posted in reply to Paul Backus

Permalink

On 27.05.20 18:04, Paul Backus wrote:
> On Wednesday, 27 May 2020 at 15:14:17 UTC, ag0aep6g wrote:
[...]
>> From your previous post I figured that this is your position towards calling @safe from @trusted:
>>
>>     @trusted code is not allowed to rely on the documented
>>     return value of an @safe function. The @trusted function
>>     must instead verify that the actually returned value is
>>     safe to use.
> 
> This is my position on *any* function calling *any other* function. Even in 100% @system code, I must (for example) check the return value of malloc if I want to rely on it not being null.
> 
>> I'm not sure if I'm representing you correctly, but that position makes sense to me. At the same time, it doesn't seem feasible, because I don't see how we're going to get users to adhere to that.
> 
> Why does it not seem feasible? Checking return values is defensive programming 101. People already do this sort of thing all the time.

I think we're on the same page for the most part, but not here.

I'm pretty sure that you agree with this: When we call C's strlen like so: `strlen("foo\0".ptr)`, we can assume that the result will be 3, because strlen is documented to behave that way.

I'm not sure where you stand on this: If an @safe function is documented to return 42, can we rely on that in the same way we can rely on strlen's documented behavior? Let's say that the author of the @safe is as trustworthy as the C standard library.

If we can assume 42, a bug in @safe code can lead to memory corruption by breaking an assumption that is in @trusted/@system code. Just like a bug in @system code can have that same effect. This might be obviously true to you. But I hadn't really made that connection until recently. So if you agree with this, then I think we're on the same page now. And we just accept that a mistake in @safe code can possibly kill safety.

On the other side, if we cannot assume that 42 being returned, but we need to check that 42 was indeed returned, then the weird scenario happens where an @safe `my_strlen` becomes more cumbersome to use than C's @system `strlen`. I don't think any existing @trusted code was written with that in mind.

If I'm not making any sense, maybe we can compare our answers to the question whether the following snippets (copied from earlier) can really be @trusted.

I think this one is fine:

----
void f() @trusted
{
    import core.stdc.string: strlen;
    import std.stdio: writeln;
    char[5] buf = "foo\0\0";
    char last_char = buf.ptr[strlen(buf.ptr) - 1];
    writeln(last_char);
}
----

And I think this one is fine, too:

----
size_t my_strlen(ref char[5] buf) @safe
{
    foreach (i; 0 .. buf.length) if (buf[i] == '\0') return i;
    return buf.length;
}
void f() @trusted
{
    import std.stdio: writeln;
    char[5] buf = "foo\0\0";
    char last_char = buf.ptr[my_strlen(buf) - 1];
    writeln(last_char);
}
----

On 27.05.20 19:02, ag0aep6g wrote: > Let's say that the author of the @safe is as trustworthy as the C standard library. Should read: Let's say that the author of the @safe **function** is as trustworthy as the C standard library. > On the other side, if we cannot assume that 42 being returned, Should read: On the other side, if we cannot assume that 42 **is** being returned,

On 27/5/20 19:02, ag0aep6g wrote: > I'm pretty sure that you agree with this: When we call C's strlen like so: `strlen("foo\0".ptr)`, we can assume that the result will be 3, because strlen is documented to behave that way. > There's one big difference: in @safe there are bound checks by default, so even if you as the programmer assume that `strlen` will return the right value, the compiler is still inserting checks at every access: ``` @safe void safeFunc() { string foo = "foo\0"); auto len = strlen(foo); auto bar = foo[0..len - 1]; // It'll be checked even in -release mode } @trusted void trustedFunc() { string foo = "foo\0"); auto len = strlen(foo); auto bar = foo[0..len - 1]; } ``` So it's not that you needn't checks in @safe code, it's that they are added automatically.

On Wednesday, 27 May 2020 at 17:02:15 UTC, ag0aep6g wrote: > I'm not sure where you stand on this: If an @safe function is documented to return 42, can we rely on that in the same way we can rely on strlen's documented behavior? Let's say that the author of the @safe is as trustworthy as the C standard library. It depends entirely on how far you are willing to extend your trust. If you're willing to trust the C library to conform to the C standard, and you stipulate that this other function and its documentation are equally trustworthy, then yes, you can trust it to return 42. I think the key insight here is that the memory-safety property established by D's @safe is *conditional*, not absolute. In other words, the D compiler (modulo bugs) establishes the following logical implication: my @trusted code is memory-safe -> my @safe code is memory-safe Proving the left-hand side is left as an exercise to the programmer. In practice, proving your @trusted code correct without making *any* assumptions about other code is too difficult, since at the very least you have to account for dependencies like the C library and the operating system. At some point, you have to trust that the code you're calling does what it says it does. So what you end up doing is establishing another logical implication: my dependencies behave as-documented -> my @trusted code is memory-safe By the rules of logical inference, it follows that my dependencies behave as-documented -> my @safe code is correct This is far from an absolute safety guarantee, but it can be good enough, as long as you stick to dependencies that are well-documented and well-tested. Of course, if you are willing to put in enough effort, you can tighten up the condition on the left as much as you want. For example, maybe you decide to only trust the C library and the operating system, and audit every other dependency yourself. Then you end up with my libc and OS behave as-documented -> my @safe code is correct This is more or less the broadest assurance of safety you can achieve if you are trying to write "portable" code, and I expect the vast majority of programmers would be satisfied with it. In princple, if you are doing bare-metal development, you could get all the way to my hardware behaves as-documented -> my @safe code is correct ...but that's unlikely to be worth the effort outside of very specialized and safety-critical fields.

On 27.05.20 19:30, Paul Backus wrote: > In practice, proving your @trusted code correct without making *any* assumptions about other code is too difficult, since at the very least you have to account for dependencies like the C library and the operating system. At some point, you have to trust that the code you're calling does what it says it does. So what you end up doing is establishing another logical implication: > > my dependencies behave as-documented -> my @trusted code is memory-safe So: my dependencies do not behave as-documented -> my @trusted code may be not memory-safe right? Let's say an @safe function `my_strlen` is part of my dependencies. Then: `my_strlen` does not behave as-documented -> my @trusted code may be not memory-safe Or in other words: A mistake in an @safe function can lead to memory corruption. Which is what Steven is saying. And I agree.

On Wednesday, 27 May 2020 at 18:46:59 UTC, ag0aep6g wrote: > > So: > > my dependencies do not behave as-documented -> my @trusted code may be not memory-safe > > right? [...] > Or in other words: A mistake in an @safe function can lead to memory corruption. > > Which is what Steven is saying. And I agree. Yes, that's correct.

On Tuesday, 26 May 2020 at 07:07:33 UTC, Timon Gehr wrote: > I don't think so. @trusted code can't rely on @safe code behaving a certain way to ensure memory safety, it has to be defensive. I'd say it's sound to rely on any postconditions in a function's `out` contract. Those can be mechanically enforced. I'd also say it's sound for @trusted code to rely on the behavior of a function whose source code the author of the @trusted code has reviewed and either has control over or can prevent from changing (such as by specifying a library version in a dependency manager). I'd also say it's within the spirit of @trusted to rely on the behavior of any function that allegedly adheres to a specification and has been blessed by some expensive certifying body or meticulous review process as meeting that specification. After all, all @trusted is (including the @trusted the author is applying to his own function) is some human's statement that he has carefully inspected some function and can vouch that calling it won't cause memory corruption / undefined behavior for a program that wasn't already in an invalid state.

Forums